Name: Michael Zhang

Contact Info: 832-726-2026

Date: 6/3/2024

Who’s meddling in your elections:

A minority of large donors have always had a disproportionate influence on the Houston elections. For example, the current mayor of Houston, John Whitmire, held a huge advantage during the recent mayoral race with his “war chest” campaign funding. With $9 million dollars from donations, his funding pool was 30 times the size of any of his competitors in the Houston mayoral race.

This trend doesn’t end with Whitmire either. Various incumbent Houston city council members, judges, and elected constables all have records showing large donations from mysterious, and often out-of-state donors. Yet little information exists online about who these donors are, and more importantly, what their motivations are to funding local Houston politicians.

Records: The data for this project comes from The City of Houston Open Data, which is updated and maintained by the City Secretary’s Office. The excel sheet lists a detailed history of campaign finance transactions, containing candidates, transaction amounts, and donor information from 2007-2019.

Using the records, we can hopefully shed some more light on the background and motivation of these local “big hitter” campaign donors. While the sheet has lacked sufficient upkeep, with its most recent update being in 2019, many listed in the document are currently in some kind of local political office. It would be interesting to see an expansive story on where these city politicans get all their campaign money, and how it could influence their policy decisions.

Surprising findings

  1. Non-Houstonian Donors often contributed large sums of money to the current political incumbent campaings. While some could be just friends of the politicians, a significant portion of the donors held high corporate positions or were some kind of investor.

  2. There were also a large number of non Houstonian real estate developers, it would be worth looking into whether some of these incumbents were unfairly proportioning out contracts to developers which donated to their campaigns

  3. Another story worth pursuing would be covering the lack of transparency for this data. The excel sheet I retrieved was not updated properly, and when I called the City of Houston Secretary’s Office, they didn’t even know the database listed on their website existed. Further, the website was being run by a contracted data company, with how badly this company is doing its job, it would be worthwhile to see how much the government pays this company for its badly kept data.

installing basic packages if needed

#install.packages("openxlsx")
#install.packages("tidyverse")
#install.packages("rio")
#install.packages("janitor")
#install.packages("dplyr")
#install.packages("scales")

R Markdown

loading in necessary packages

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.0     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(rio)
library(janitor)
## 
## Attaching package: 'janitor'
## 
## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test
library(dplyr)
library(openxlsx)
library(kableExtra)
## 
## Attaching package: 'kableExtra'
## 
## The following object is masked from 'package:dplyr':
## 
##     group_rows
library(scales)
## 
## Attaching package: 'scales'
## 
## The following object is masked from 'package:purrr':
## 
##     discard
## 
## The following object is masked from 'package:readr':
## 
##     col_factor

reading in the data

suppressWarnings({
show_col_types = FALSE
Campaign_data <- read_csv("campaign_finance.csv")
})
## Rows: 198231 Columns: 87
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (66): CandidateOrCommittee, CommitteeName, CandidateFirstName, Candidat...
## dbl  (13): ReportID, CampaignYear, MostRecentReportVersion, 30th day before ...
## lgl   (7): LoanGuarantorName, LoanGuarantorCity, LoanGuarantorState, LoanGua...
## dttm  (1): ReportFiledDate
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Transaction Type Analysis

Now, just to get a sense of the kind of transactions being tracked in this file, we do a quick count and arrange

# Perform the count and arrange operations
counted_data <- Campaign_data %>% 
  count(MasterTransactionType) %>% 
  rename(Count = n,`Transaction Type` = MasterTransactionType) %>% 
  arrange(desc(Count))
top_5_data <- head(counted_data, 5)

top_5_data %>%
  kable(format = "html", caption = "<b style='color:black;'>Top 5 Political Transactions by Count</b>") %>%
  kable_styling("striped")
Top 5 Political Transactions by Count
Transaction Type Count
Revenue 111851
Expense 72919
Political Contributions 4791
Political Expenditures 4509
Political Contributions Other Than Pledges or Loans 3059

After getting a sense of the data the next step was to filter the data set for current incumbents, creating a new spreadsheet for each individual.

Individual Incumbent Analysis

Sallie_Alcorn <- Campaign_data %>% filter(CandidateFullName %in% "Sallie Alcorn")
Willie_R_Davis <- Campaign_data %>% filter(CandidateFullName == "Willie R Davis")
Martha_Castex_Tatum <- Campaign_data %>% filter(CandidateFullName == "Martha Castex-Tatum")
Mr_Edward_J_Pollard <- Campaign_data %>% filter(CandidateFullName == "Mr. Edward J Pollard")
Adrian_Garcia <- Campaign_data %>% filter(CandidateFullName == "Adrian Garcia")
Dr_Carolyn_Evans_Shabazz <- Campaign_data %>% filter(CandidateFullName == "Dr. Carolyn Evans-Shabazz")
amy_peck_variants <- c("Ms. Amy Peck", "Ms. Amy L Peck", "Mrs. Amy Peck")
Amy_Peck <- Campaign_data %>% filter(CandidateFullName %in% amy_peck_variants)
edward_gonzalez_variants <- c("Mr. Edward Ed Gonzalez", "Edward Edward Gonzalez", "Edward Ed Gonzalez")
Edward_Gonzalez <- Campaign_data %>% filter(CandidateFullName %in% edward_gonzalez_variants)

From there, we can sort each of the incumbent dataframes by donation amount to see the most influential donors for each incumbent

suppressWarnings({
# Amy_Peck
Amy_Peck$TransactionAmount <- as.numeric(Amy_Peck$TransactionAmount)
Amy_Peck <- Amy_Peck %>% arrange(desc(TransactionAmount))

# Willie_R_Davis
Willie_R_Davis$TransactionAmount <- as.numeric(Willie_R_Davis$TransactionAmount)
Willie_R_Davis <- Willie_R_Davis %>% arrange(desc(TransactionAmount))

# Martha_Castex_Tatum
Martha_Castex_Tatum$TransactionAmount <- as.numeric(Martha_Castex_Tatum$TransactionAmount)
Martha_Castex_Tatum <- Martha_Castex_Tatum %>% arrange(desc(TransactionAmount))

# Mr_Edward_J_Pollard
Mr_Edward_J_Pollard$TransactionAmount <- as.numeric(Mr_Edward_J_Pollard$TransactionAmount)
Mr_Edward_J_Pollard <- Mr_Edward_J_Pollard %>% arrange(desc(TransactionAmount))

# Adrian_Garcia
Adrian_Garcia$TransactionAmount <- as.numeric(Adrian_Garcia$TransactionAmount)
Adrian_Garcia <- Adrian_Garcia %>% arrange(desc(TransactionAmount))

# Dr_Carolyn_Evans_Shabazz
Dr_Carolyn_Evans_Shabazz$TransactionAmount <- as.numeric(Dr_Carolyn_Evans_Shabazz$TransactionAmount)
Dr_Carolyn_Evans_Shabazz <- Dr_Carolyn_Evans_Shabazz %>% arrange(desc(TransactionAmount))

# Edward_Gonzalez
Edward_Gonzalez$TransactionAmount <- as.numeric(Edward_Gonzalez$TransactionAmount)
Edward_Gonzalez <- Edward_Gonzalez %>% arrange(desc(TransactionAmount))

# Sallie_Alcorn
Sallie_Alcorn$TransactionAmount <- as.numeric(Sallie_Alcorn$TransactionAmount)
Sallie_Alcorn <- Sallie_Alcorn %>% arrange(desc(TransactionAmount))
})

From there, we check that the candidate spreadsheet was properly sorted based on the transaction amount

Just as an example, we can use Sallie Alcorn’s transaction sheet to check this

result <- Sallie_Alcorn %>% 
  select(CampaignYear, TransactionAmount, TransactionPartyFullName, TransactionPartyEmployer, OfficeSought, TransactionAddressStreet, TransactionPartyCity, TransactionPartyOccupation, TransactionType) %>% 
  group_by(TransactionAmount)
result <- Sallie_Alcorn %>% filter(TransactionType %in% c("Political Contributions","Monetary Political Contributions"))

write.xlsx(result, file = "Sallie_Alcorn_Transactions.xlsx")

top_5_data <- result %>% 
  select(CampaignYear, TransactionAmount, TransactionPartyFullName, TransactionPartyCity, TransactionType) 
top_5_data <- head(top_5_data, 10)


# Render the table
top_5_data %>%
  kable(format = "html", caption = "<b style='color:black;'>Sallie Alcorn Transaction History</b>") %>%
  kable_styling("striped")
Sallie Alcorn Transaction History
CampaignYear TransactionAmount TransactionPartyFullName TransactionPartyCity TransactionType
2019 5000 Mr. Scott Sparvero Houston Political Contributions
2019 5000 Anthony Riggins Oakland Political Contributions
2019 5000 Melinda Hildebrand Houston Political Contributions
2019 5000 Diana Hawkins Houston Political Contributions
2019 5000 Jeff Hildebrand Houston Political Contributions
2019 5000 Mrs. Rosemary Comstock Houston Political Contributions
2019 5000 Mrs. Virginia Elkins Houston Political Contributions
2019 5000 Stephen Costello Houston Political Contributions
2019 2500 Ms. Jan Bres Houston Political Contributions
2019 2500 Kate Gibson Houston Political Contributions

Great, now we have seperated the dataset by incumbent, and can use that data for further analysis if needed.

To make it easier for reporters to play around with this data, we could use datasette lite. For proof of concept, feel free to load the following link in the load csv tab on datasette lite:

https://raw.githubusercontent.com/MichaelZhangReporter/Document-Chatbot/main/Alcorn_Csv.csv?token=GHSAT0AAAAAACTFGDPXE5VCMWYEMRALODTUZTEMRWA

All Incumbent Transaction Analysis

Going forward, if we want to look at in city/ outside city contributions, we need to create a incumbent transactions variable, and use a secondary software to clean the naming conventions

suppressWarnings({
incumbent_transactions <- read_csv("incumbent_transactions_cleaned.csv")
})
## Rows: 9512 Columns: 88
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (62): CandidateOrCommittee, ReportFiledDate, CandidateFirstName, Candida...
## dbl (16): Column, ReportID, CampaignYear, MostRecentReportVersion, 15th day ...
## lgl (10): CommitteeName, 15-Jul, BallotMeasureDescription, ExpenseCandidateB...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Let’s filter our data to only include contributions and get rid of NAs (For now at least, cleaning up the data more and figuring out what the NA means is necessary for later)

contributions_to_incumbents <- incumbent_transactions %>%
  filter(transaction_type %in% c("Political Contributions", "Monetary Political Contributions")) %>%
  filter(!is.na(transaction_amount))

A few candidates’ names show up as multiple values. We can use case_when() to standardize them.

contributions_to_incumbents <- contributions_to_incumbents %>%
  mutate(candidate_full_name = case_when(
    candidate_full_name == "Mr. Edward Ed Gonzalez" ~ "Edward Ed Gonzalez",
    candidate_full_name == "Mrs. Amy Peck" ~ "Ms. Amy Peck",
    candidate_full_name == "Ms. Amy L Peck" ~ "Ms. Amy Peck",
    .default = candidate_full_name
  ))

I wanted to also take a look at which kind of groups are funding these politicans, so let’s do a quick count of that as well.

Who’s funding our local politicans?

# Perform the count and arrange operations
counted_data <- contributions_to_incumbents %>% 
   count(TransactionPartyEmployer) %>% 
   arrange(desc(n))



top_5_data <- head(counted_data, 5)


# Render the table
top_5_data %>%
  kable(format = "html", caption = "<b style='color:black;'>Top 5 Donor Occupations</b>") %>%
  kable_styling("striped")
Top 5 Donor Occupations
TransactionPartyEmployer n
NA 3231
Self 402
Self-Employed 112
City of Houston 58
Information Requested 53

Seems like a lot of the contributors are either self employed, or did not fill in that section of the form, regardless this data could be interesting as those listed as “self employed” are likely some kind of lobbyist or investor with some sort of business intentions.

Now back to the original analysis, let’s group all contributions by person and get the sum for each person.

contr_sums_per_cand <- contributions_to_incumbents %>%
  group_by(candidate_full_name) %>%
  summarise(total_contributions = sum(transaction_amount)) %>%
  arrange(desc(total_contributions))

top_5_data <- head(contr_sums_per_cand, 10)
top_5_data <- top_5_data %>%
  mutate(total_contributions = dollar(total_contributions))

top_5_data %>%
  kable(format = "html", caption = "<b style='color:black;'>Total Incumbent Campaign Funds</b>") %>%
  kable_styling("striped")
Total Incumbent Campaign Funds
candidate_full_name total_contributions
Adrian Garcia $1,745,114
Edward Ed Gonzalez $659,938
Dr. Carolyn Evans-Shabazz $152,735
Sallie Alcorn $145,907
Martha Castex-Tatum $123,198
Willie R Davis $36,125
Ms. Amy Peck $14,099
Mr. Edward J Pollard $11,285

In-state vs Out-state Contributors

In_state_table <- contributions_to_incumbents %>%
  group_by(candidate_full_name) %>%
  filter(str_detect(transaction_party_state, "TX")) %>%
  summarise(In_state_contributions = sum(transaction_amount)) %>%
  arrange(desc(In_state_contributions))

top_5_data <- head(In_state_table, 8)
top_5_data <- top_5_data %>%
  mutate(In_state_contributions = dollar(In_state_contributions))

top_5_data %>%
  kable(format = "html", caption = "<b style='color:black;'> Incumbent In-State Campaign Funds</b>") %>%
  kable_styling("striped")
Incumbent In-State Campaign Funds
candidate_full_name In_state_contributions
Adrian Garcia $1,675,152
Edward Ed Gonzalez $627,471
Sallie Alcorn $135,177
Dr. Carolyn Evans-Shabazz $128,260
Martha Castex-Tatum $122,198
Willie R Davis $35,575
Ms. Amy Peck $12,600
Mr. Edward J Pollard $10,785
Instate_outstate_comparison <- left_join(In_state_table, contr_sums_per_cand, by = "candidate_full_name")
Instate_outstate_comparison <- Instate_outstate_comparison %>%
  mutate(in_state_percentage = In_state_contributions
 / total_contributions)

top_5_data <- head(Instate_outstate_comparison, 8)
top_5_data <- top_5_data %>%
  mutate(In_state_contributions = dollar(In_state_contributions)) %>%
  mutate(total_contributions = dollar(total_contributions)) %>%
  mutate(in_state_percentage = percent(in_state_percentage))


top_5_data %>%
  kable(format = "html", caption = "<b style='color:black;'>Incumbent Campaign Fund Breakdown</b>") %>%
  kable_styling("striped")
Incumbent Campaign Fund Breakdown
candidate_full_name In_state_contributions total_contributions in_state_percentage
Adrian Garcia $1,675,152 $1,745,114 95.99%
Edward Ed Gonzalez $627,471 $659,938 95.08%
Sallie Alcorn $135,177 $145,907 92.65%
Dr. Carolyn Evans-Shabazz $128,260 $152,735 83.98%
Martha Castex-Tatum $122,198 $123,198 99.19%
Willie R Davis $35,575 $36,125 98.48%
Ms. Amy Peck $12,600 $14,099 89.37%
Mr. Edward J Pollard $10,785 $11,285 95.57%

In-city vs Out-city Contributors

contr_sums_per_cand <- contributions_to_incumbents %>%
  group_by(candidate_full_name) %>%
  summarise(total_contributions = sum(transaction_amount)) %>%
  arrange(desc(total_contributions))

In_city_table <- contributions_to_incumbents %>%
  group_by(candidate_full_name) %>%
  filter(str_detect(transaction_party_city, "Houston")) %>%
  summarise(In_City_Contributions = sum(transaction_amount)) %>%
  arrange(desc(In_City_Contributions))

InCity_OutCity_comparison <- left_join(In_city_table, contr_sums_per_cand, by = "candidate_full_name")
InCity_OutCity_comparison <- InCity_OutCity_comparison %>%
  mutate(in_city_percentage = In_City_Contributions
 / total_contributions)

InCity_OutCity_comparison <- InCity_OutCity_comparison %>%
  mutate(In_City_Contributions = dollar(In_City_Contributions)) %>%
  mutate(total_contributions = dollar(total_contributions)) %>%
  mutate(in_city_percentage = percent(in_city_percentage))

InCity_OutCity_comparison %>%
  kable(format = "html", caption = "<b style='color:black;'>Incumbent Campaign Fund Breakdown</b>") %>%
  kable_styling("striped")
Incumbent Campaign Fund Breakdown
candidate_full_name In_City_Contributions total_contributions in_city_percentage
Adrian Garcia $1,135,151 $1,745,114 65.0%
Edward Ed Gonzalez $530,194 $659,938 80.3%
Sallie Alcorn $131,477 $145,907 90.1%
Dr. Carolyn Evans-Shabazz $105,710 $152,735 69.2%
Martha Castex-Tatum $92,585 $123,198 75.2%
Willie R Davis $27,675 $36,125 76.6%
Ms. Amy Peck $10,025 $14,099 71.1%
Mr. Edward J Pollard $9,235 $11,285 81.8%

now write the csv’s out

write.csv(Instate_outstate_comparison, "inState_outState_comparison.csv", row.names = FALSE)
write.csv(InCity_OutCity_comparison, "inCity_outCity_comparison.csv", row.names = FALSE)