Name: Michael Zhang

Contact Info: 832-726-2026

Date: 6/3/2024

Who’s meddling in your elections:

A minority of large donors have always had a disproportionate influence on the Houston elections. For example, the current mayor of Houston, John Whitmire, held a huge advantage during the recent mayoral race with his “war chest” campaign funding. With $9 million dollars from donations, his funding pool was 30 times the size of any of his competitors in the Houston mayoral race.

This trend doesn’t end with Whitmire either. Various incumbent Houston city council members, judges, and elected constables all have records showing large donations from mysterious, and often out-of-state donors. Yet little information exists online about who these donors are, and more importantly, what their motivations are to funding local Houston politicians.

Records: The data for this project comes from The City of Houston Open Data, which is updated and maintained by the City Secretary’s Office. The excel sheet lists a detailed history of campaign finance transactions, containing candidates, transaction amounts, and donor information from 2007-2019.

Using the records, we can hopefully shed some more light on the background and motivation of these local “big hitter” campaign donors. While the sheet has lacked sufficient upkeep, with its most recent update being in 2019, many listed in the document are currently in some kind of local political office. It would be interesting to see an expansive story on where these city politicans get all their campaign money, and how it could influence their policy decisions.

Surprising findings

Non-Houstonian Donors often contributed large sums of money to the current political incumbent campaings. While some could be just friends of the politicians, a significant portion of the donors held high corporate positions or were some kind of investor.
There were also a large number of non Houstonian real estate developers, it would be worth looking into whether some of these incumbents were unfairly proportioning out contracts to developers which donated to their campaigns
Another story worth pursuing would be covering the lack of transparency for this data. The excel sheet I retrieved was not updated properly, and when I called the City of Houston Secretary’s Office, they didn’t even know the database listed on their website existed. Further, the website was being run by a contracted data company, with how badly this company is doing its job, it would be worthwhile to see how much the government pays this company for its badly kept data.

installing basic packages if needed

#install.packages("openxlsx")
#install.packages("tidyverse")
#install.packages("rio")
#install.packages("janitor")
#install.packages("dplyr")
#install.packages("scales")

R Markdown

loading in necessary packages

library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.0     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(rio)
library(janitor)

## 
## Attaching package: 'janitor'
## 
## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test

library(dplyr)
library(openxlsx)
library(kableExtra)

## 
## Attaching package: 'kableExtra'
## 
## The following object is masked from 'package:dplyr':
## 
##     group_rows

library(scales)

## 
## Attaching package: 'scales'
## 
## The following object is masked from 'package:purrr':
## 
##     discard
## 
## The following object is masked from 'package:readr':
## 
##     col_factor

reading in the data

suppressWarnings({
show_col_types = FALSE
Campaign_data <- read_csv("campaign_finance.csv")
})

## Rows: 198231 Columns: 87
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (66): CandidateOrCommittee, CommitteeName, CandidateFirstName, Candidat...
## dbl  (13): ReportID, CampaignYear, MostRecentReportVersion, 30th day before ...
## lgl   (7): LoanGuarantorName, LoanGuarantorCity, LoanGuarantorState, LoanGua...
## dttm  (1): ReportFiledDate
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Transaction Type Analysis

Now, just to get a sense of the kind of transactions being tracked in this file, we do a quick count and arrange

# Perform the count and arrange operations
counted_data <- Campaign_data %>% 
  count(MasterTransactionType) %>% 
  rename(Count = n,`Transaction Type` = MasterTransactionType) %>% 
  arrange(desc(Count))
top_5_data <- head(counted_data, 5)

top_5_data %>%
  kable(format = "html", caption = "<b style='color:black;'>Top 5 Political Transactions by Count</b>") %>%
  kable_styling("striped")

**Top 5 Political Transactions by Count**
Transaction Type	Count
Revenue	111851
Expense	72919
Political Contributions	4791
Political Expenditures	4509
Political Contributions Other Than Pledges or Loans	3059

After getting a sense of the data the next step was to filter the data set for current incumbents, creating a new spreadsheet for each individual.

Individual Incumbent Analysis

Sallie_Alcorn <- Campaign_data %>% filter(CandidateFullName %in% "Sallie Alcorn")
Willie_R_Davis <- Campaign_data %>% filter(CandidateFullName == "Willie R Davis")
Martha_Castex_Tatum <- Campaign_data %>% filter(CandidateFullName == "Martha Castex-Tatum")
Mr_Edward_J_Pollard <- Campaign_data %>% filter(CandidateFullName == "Mr. Edward J Pollard")
Adrian_Garcia <- Campaign_data %>% filter(CandidateFullName == "Adrian Garcia")
Dr_Carolyn_Evans_Shabazz <- Campaign_data %>% filter(CandidateFullName == "Dr. Carolyn Evans-Shabazz")
amy_peck_variants <- c("Ms. Amy Peck", "Ms. Amy L Peck", "Mrs. Amy Peck")
Amy_Peck <- Campaign_data %>% filter(CandidateFullName %in% amy_peck_variants)
edward_gonzalez_variants <- c("Mr. Edward Ed Gonzalez", "Edward Edward Gonzalez", "Edward Ed Gonzalez")
Edward_Gonzalez <- Campaign_data %>% filter(CandidateFullName %in% edward_gonzalez_variants)

From there, we can sort each of the incumbent dataframes by donation amount to see the most influential donors for each incumbent

suppressWarnings({
# Amy_Peck
Amy_Peck$TransactionAmount <- as.numeric(Amy_Peck$TransactionAmount)
Amy_Peck <- Amy_Peck %>% arrange(desc(TransactionAmount))

# Willie_R_Davis
Willie_R_Davis$TransactionAmount <- as.numeric(Willie_R_Davis$TransactionAmount)
Willie_R_Davis <- Willie_R_Davis %>% arrange(desc(TransactionAmount))

# Martha_Castex_Tatum
Martha_Castex_Tatum$TransactionAmount <- as.numeric(Martha_Castex_Tatum$TransactionAmount)
Martha_Castex_Tatum <- Martha_Castex_Tatum %>% arrange(desc(TransactionAmount))

# Mr_Edward_J_Pollard
Mr_Edward_J_Pollard$TransactionAmount <- as.numeric(Mr_Edward_J_Pollard$TransactionAmount)
Mr_Edward_J_Pollard <- Mr_Edward_J_Pollard %>% arrange(desc(TransactionAmount))

# Adrian_Garcia
Adrian_Garcia$TransactionAmount <- as.numeric(Adrian_Garcia$TransactionAmount)
Adrian_Garcia <- Adrian_Garcia %>% arrange(desc(TransactionAmount))

# Dr_Carolyn_Evans_Shabazz
Dr_Carolyn_Evans_Shabazz$TransactionAmount <- as.numeric(Dr_Carolyn_Evans_Shabazz$TransactionAmount)
Dr_Carolyn_Evans_Shabazz <- Dr_Carolyn_Evans_Shabazz %>% arrange(desc(TransactionAmount))

# Edward_Gonzalez
Edward_Gonzalez$TransactionAmount <- as.numeric(Edward_Gonzalez$TransactionAmount)
Edward_Gonzalez <- Edward_Gonzalez %>% arrange(desc(TransactionAmount))

# Sallie_Alcorn
Sallie_Alcorn$TransactionAmount <- as.numeric(Sallie_Alcorn$TransactionAmount)
Sallie_Alcorn <- Sallie_Alcorn %>% arrange(desc(TransactionAmount))
})

From there, we check that the candidate spreadsheet was properly sorted based on the transaction amount

Just as an example, we can use Sallie Alcorn’s transaction sheet to check this

result <- Sallie_Alcorn %>% 
  select(CampaignYear, TransactionAmount, TransactionPartyFullName, TransactionPartyEmployer, OfficeSought, TransactionAddressStreet, TransactionPartyCity, TransactionPartyOccupation, TransactionType) %>% 
  group_by(TransactionAmount)
result <- Sallie_Alcorn %>% filter(TransactionType %in% c("Political Contributions","Monetary Political Contributions"))

write.xlsx(result, file = "Sallie_Alcorn_Transactions.xlsx")

top_5_data <- result %>% 
  select(CampaignYear, TransactionAmount, TransactionPartyFullName, TransactionPartyCity, TransactionType) 
top_5_data <- head(top_5_data, 10)


# Render the table
top_5_data %>%
  kable(format = "html", caption = "<b style='color:black;'>Sallie Alcorn Transaction History</b>") %>%
  kable_styling("striped")

**Sallie Alcorn Transaction History**
CampaignYear	TransactionAmount	TransactionPartyFullName	TransactionPartyCity	TransactionType
2019	5000	Mr. Scott Sparvero	Houston	Political Contributions
2019	5000	Anthony Riggins	Oakland	Political Contributions
2019	5000	Melinda Hildebrand	Houston	Political Contributions
2019	5000	Diana Hawkins	Houston	Political Contributions
2019	5000	Jeff Hildebrand	Houston	Political Contributions
2019	5000	Mrs. Rosemary Comstock	Houston	Political Contributions
2019	5000	Mrs. Virginia Elkins	Houston	Political Contributions
2019	5000	Stephen Costello	Houston	Political Contributions
2019	2500	Ms. Jan Bres	Houston	Political Contributions
2019	2500	Kate Gibson	Houston	Political Contributions

Great, now we have seperated the dataset by incumbent, and can use that data for further analysis if needed.

To make it easier for reporters to play around with this data, we could use datasette lite. For proof of concept, feel free to load the following link in the load csv tab on datasette lite:

https://raw.githubusercontent.com/MichaelZhangReporter/Document-Chatbot/main/Alcorn_Csv.csv?token=GHSAT0AAAAAACTFGDPXE5VCMWYEMRALODTUZTEMRWA

All Incumbent Transaction Analysis

Going forward, if we want to look at in city/ outside city contributions, we need to create a incumbent transactions variable, and use a secondary software to clean the naming conventions

suppressWarnings({
incumbent_transactions <- read_csv("incumbent_transactions_cleaned.csv")
})

## Rows: 9512 Columns: 88
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (62): CandidateOrCommittee, ReportFiledDate, CandidateFirstName, Candida...
## dbl (16): Column, ReportID, CampaignYear, MostRecentReportVersion, 15th day ...
## lgl (10): CommitteeName, 15-Jul, BallotMeasureDescription, ExpenseCandidateB...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Let’s filter our data to only include contributions and get rid of NAs (For now at least, cleaning up the data more and figuring out what the NA means is necessary for later)

contributions_to_incumbents <- incumbent_transactions %>%
  filter(transaction_type %in% c("Political Contributions", "Monetary Political Contributions")) %>%
  filter(!is.na(transaction_amount))

A few candidates’ names show up as multiple values. We can use case_when() to standardize them.

contributions_to_incumbents <- contributions_to_incumbents %>%
  mutate(candidate_full_name = case_when(
    candidate_full_name == "Mr. Edward Ed Gonzalez" ~ "Edward Ed Gonzalez",
    candidate_full_name == "Mrs. Amy Peck" ~ "Ms. Amy Peck",
    candidate_full_name == "Ms. Amy L Peck" ~ "Ms. Amy Peck",
    .default = candidate_full_name
  ))

I wanted to also take a look at which kind of groups are funding these politicans, so let’s do a quick count of that as well.

Who’s funding our local politicans?

# Perform the count and arrange operations
counted_data <- contributions_to_incumbents %>% 
   count(TransactionPartyEmployer) %>% 
   arrange(desc(n))



top_5_data <- head(counted_data, 5)


# Render the table
top_5_data %>%
  kable(format = "html", caption = "<b style='color:black;'>Top 5 Donor Occupations</b>") %>%
  kable_styling("striped")

**Top 5 Donor Occupations**
TransactionPartyEmployer	n
NA	3231
Self	402
Self-Employed	112
City of Houston	58
Information Requested	53

Seems like a lot of the contributors are either self employed, or did not fill in that section of the form, regardless this data could be interesting as those listed as “self employed” are likely some kind of lobbyist or investor with some sort of business intentions.

Now back to the original analysis, let’s group all contributions by person and get the sum for each person.

contr_sums_per_cand <- contributions_to_incumbents %>%
  group_by(candidate_full_name) %>%
  summarise(total_contributions = sum(transaction_amount)) %>%
  arrange(desc(total_contributions))

top_5_data <- head(contr_sums_per_cand, 10)
top_5_data <- top_5_data %>%
  mutate(total_contributions = dollar(total_contributions))

top_5_data %>%
  kable(format = "html", caption = "<b style='color:black;'>Total Incumbent Campaign Funds</b>") %>%
  kable_styling("striped")

**Total Incumbent Campaign Funds**
candidate_full_name	total_contributions
Adrian Garcia	$1,745,114
Edward Ed Gonzalez	$659,938
Dr. Carolyn Evans-Shabazz	$152,735
Sallie Alcorn	$145,907
Martha Castex-Tatum	$123,198
Willie R Davis	$36,125
Ms. Amy Peck	$14,099
Mr. Edward J Pollard	$11,285

In-state vs Out-state Contributors

In_state_table <- contributions_to_incumbents %>%
  group_by(candidate_full_name) %>%
  filter(str_detect(transaction_party_state, "TX")) %>%
  summarise(In_state_contributions = sum(transaction_amount)) %>%
  arrange(desc(In_state_contributions))

top_5_data <- head(In_state_table, 8)
top_5_data <- top_5_data %>%
  mutate(In_state_contributions = dollar(In_state_contributions))

top_5_data %>%
  kable(format = "html", caption = "<b style='color:black;'> Incumbent In-State Campaign Funds</b>") %>%
  kable_styling("striped")

**Incumbent In-State Campaign Funds**
candidate_full_name	In_state_contributions
Adrian Garcia	$1,675,152
Edward Ed Gonzalez	$627,471
Sallie Alcorn	$135,177
Dr. Carolyn Evans-Shabazz	$128,260
Martha Castex-Tatum	$122,198
Willie R Davis	$35,575
Ms. Amy Peck	$12,600
Mr. Edward J Pollard	$10,785

Instate_outstate_comparison <- left_join(In_state_table, contr_sums_per_cand, by = "candidate_full_name")
Instate_outstate_comparison <- Instate_outstate_comparison %>%
  mutate(in_state_percentage = In_state_contributions
 / total_contributions)

top_5_data <- head(Instate_outstate_comparison, 8)
top_5_data <- top_5_data %>%
  mutate(In_state_contributions = dollar(In_state_contributions)) %>%
  mutate(total_contributions = dollar(total_contributions)) %>%
  mutate(in_state_percentage = percent(in_state_percentage))


top_5_data %>%
  kable(format = "html", caption = "<b style='color:black;'>Incumbent Campaign Fund Breakdown</b>") %>%
  kable_styling("striped")

**Incumbent Campaign Fund Breakdown**
candidate_full_name	In_state_contributions	total_contributions	in_state_percentage
Adrian Garcia	$1,675,152	$1,745,114	95.99%
Edward Ed Gonzalez	$627,471	$659,938	95.08%
Sallie Alcorn	$135,177	$145,907	92.65%
Dr. Carolyn Evans-Shabazz	$128,260	$152,735	83.98%
Martha Castex-Tatum	$122,198	$123,198	99.19%
Willie R Davis	$35,575	$36,125	98.48%
Ms. Amy Peck	$12,600	$14,099	89.37%
Mr. Edward J Pollard	$10,785	$11,285	95.57%

In-city vs Out-city Contributors

contr_sums_per_cand <- contributions_to_incumbents %>%
  group_by(candidate_full_name) %>%
  summarise(total_contributions = sum(transaction_amount)) %>%
  arrange(desc(total_contributions))

In_city_table <- contributions_to_incumbents %>%
  group_by(candidate_full_name) %>%
  filter(str_detect(transaction_party_city, "Houston")) %>%
  summarise(In_City_Contributions = sum(transaction_amount)) %>%
  arrange(desc(In_City_Contributions))

InCity_OutCity_comparison <- left_join(In_city_table, contr_sums_per_cand, by = "candidate_full_name")
InCity_OutCity_comparison <- InCity_OutCity_comparison %>%
  mutate(in_city_percentage = In_City_Contributions
 / total_contributions)

InCity_OutCity_comparison <- InCity_OutCity_comparison %>%
  mutate(In_City_Contributions = dollar(In_City_Contributions)) %>%
  mutate(total_contributions = dollar(total_contributions)) %>%
  mutate(in_city_percentage = percent(in_city_percentage))

InCity_OutCity_comparison %>%
  kable(format = "html", caption = "<b style='color:black;'>Incumbent Campaign Fund Breakdown</b>") %>%
  kable_styling("striped")

**Incumbent Campaign Fund Breakdown**
candidate_full_name	In_City_Contributions	total_contributions	in_city_percentage
Adrian Garcia	$1,135,151	$1,745,114	65.0%
Edward Ed Gonzalez	$530,194	$659,938	80.3%
Sallie Alcorn	$131,477	$145,907	90.1%
Dr. Carolyn Evans-Shabazz	$105,710	$152,735	69.2%
Martha Castex-Tatum	$92,585	$123,198	75.2%
Willie R Davis	$27,675	$36,125	76.6%
Ms. Amy Peck	$10,025	$14,099	71.1%
Mr. Edward J Pollard	$9,235	$11,285	81.8%

now write the csv’s out

write.csv(Instate_outstate_comparison, "inState_outState_comparison.csv", row.names = FALSE)
write.csv(InCity_OutCity_comparison, "inCity_outCity_comparison.csv", row.names = FALSE)

Houston_Finance

2024-06-03