class: center, middle, inverse, title-slide

# Mini-Lecture 19
## FEC
### Ben Baumer
### SDS 192, March 9, 2020 (http://beanumber.github.io/sds192/lectures/19-fec.html)

---

## The SDS Major

.pull-left[
]

.pull-right[

- SDS Presentation of the [Major](https://www.smith.edu/academics/statistics#academics-statistical-data-sciences-the-major): Tues, Mar 31, 12--1, Ford Atrium (free food!)

]

---
class: center, middle, inverse

# Exam prep

---

## Sample question

- Recall Cleveland and McGill's assessment of human ability to accurately perceive differences in visual cues. This study led to the *Perceptual Hierarchy*.

---

<iframe src="https://embed.polleverywhere.com/multiple_choice_polls/ANRxa0IfgCs9e6AuN8R3N?controls=none&short_poll=true" width="800" height="600" frameBorder="0"></iframe>

---

## Sample question (harder)

```r
ggplot(mtcars, aes(x = disp, y = mpg), color = am) +
  geom_point()
```

---

<iframe src="https://embed.polleverywhere.com/multiple_choice_polls/09SYWcR8xEiLOLrqmJLfF?controls=none&short_poll=true" width="800" height="600" frameBorder="0"></iframe>

---
class: center, middle, inverse

# FEC

---

## Overwhelming

.pull-left[
![](https://media.giphy.com/media/1L5YuA6wpKkNO/giphy.gif)
]

--

.pull-right[
![](https://media.giphy.com/media/NQRRqqkImJ3Da/giphy.gif)
]

--

![](https://media.giphy.com/media/8GoeF2PXOoF2w/giphy.gif)

---

## `fec16`: the elections

.footnote[https://www.fec.gov/data/browse-data/?tab=bulk-data]

- `candidates`: master table of ~7,000 candidates
    - e.g., `CLINTON, HILLARY RODHAM / TIMOTHY MICHAEL KAINE`

--

- `committees`: master table of ~17,000 committees
    - e.g., `DONALD J. TRUMP FOR PRESIDENT, INC.`

--

    - e.g., ` KILLARY CLINTON`

--

    - May or may not be affiliated with a candidate
    - Lots of [different types](https://www.fec.gov/campaign-finance-data/committee-type-code-descriptions/) (e.g., PACs)

--

- Election results
    - `house_results`: more than 435 congressional elections
    - `senate_results`: 33 Senatorial elections
    - `president_results`: state-by-state results

---

## `fec16`: the money

- `contributions`: Contributions from committees to candidates
    - ~500,000 transactions
    - Can be **for or against** candidate!
      (foreshadowing)

--

- `pac`: **Summary** of PAC activity
    - ~12,000 rows

--

- Sampled data: 100,000 records each
    - `individuals`
    - `expenditures`
    - `transactions`
    - **You should probably just ignore these!!**

.footnote[https://www.fec.gov/campaign-finance-data/contributions-committees-candidates-file-description/]

---

## Ex: Who "gave" to Hillary?

```r
library(tidyverse)  # supplies %>%, filter(), and str_detect()
library(fec16)
hillary_id <- candidates %>%
  filter(cand_election_yr == 2016,
         cand_pty_affiliation == "DEM",
         str_detect(cand_name, "CLINTON")) %>%
  pull(cand_id)
```

```r
contributions %>%
  filter(cand_id == hillary_id) %>%
  group_by(cmte_id) %>%
  summarize(num_transactions = n(), total = sum(transaction_amt)) %>%
  arrange(desc(total)) %>%
  left_join(committees, by = "cmte_id") %>%
  select(num_transactions, total, cmte_nm)
```

```
## # A tibble: 634 x 3
##    num_transactions    total cmte_nm
##               <int>    <dbl> <chr>
##  1              208 24195670 FUTURE45
##  2               59 22816861 DNC SERVICES CORP./DEM. NAT'L COMMITTEE
##  3               50 17182458 REBUILDING AMERICA NOW
##  4               30 12307924 NATIONAL RIFLE ASSOCIATION INSTITUTE FOR LEGISLATI…
##  5              104  7448422 NATIONAL RIFLE ASSOCIATION OF AMERICA POLITICAL VI…
##  6               93  6473727 PRIORITIES USA ACTION
##  7               22  6283002 LCV VICTORY FUND
##  8               40  5728857 RGA RIGHT DIRECTION PAC
##  9               67  5651780 WOMEN VOTE!
## 10               97  4830230 UNITED WE CAN
## # … with 624 more rows
```

---

## Huh?

```r
contributions %>%
  filter(cand_id == hillary_id) %>%
* group_by(cmte_id, transaction_tp) %>%
  summarize(num_transactions = n(), total = sum(transaction_amt)) %>%
  arrange(desc(total)) %>%
  left_join(committees, by = "cmte_id") %>%
  select(transaction_tp, num_transactions, total, cmte_nm)
```

```
## Adding missing grouping variables: `cmte_id`
```

```
## # A tibble: 680 x 5
## # Groups:   cmte_id [634]
##    cmte_id  transaction_tp num_transactions  total cmte_nm
##    <chr>    <chr>                     <int>  <dbl> <chr>
##  1 C005745… 24A                         208 2.42e7 FUTURE45
##  2 C000106… 24C                          59 2.28e7 DNC SERVICES CORP./DEM. NAT…
##  3 C006188… 24A                          48 1.71e7 REBUILDING AMERICA NOW
##  4 C900133… 24A                          30 1.23e7 NATIONAL RIFLE ASSOCIATION …
##  5 C000535… 24A                         101 7.45e6 NATIONAL RIFLE ASSOCIATION …
##  6 C004958… 24E                          93 6.47e6 PRIORITIES USA ACTION
##  7 C004868… 24E                          22 6.28e6 LCV VICTORY FUND
##  8 C004907… 24A                          40 5.73e6 RGA RIGHT DIRECTION PAC
##  9 C004739… 24E                          67 5.65e6 WOMEN VOTE!
## 10 C005236… 24E                          96 4.83e6 UNITED WE CAN
## # … with 670 more rows
```

.footnote[https://www.fec.gov/campaign-finance-data/transaction-type-code-descriptions/]

---

## The FEC

> How do they have so much information about all these candidates and where did the information get stored?

> How did they get the addresses of all those people??

--

- Federal election law requires disclosure
- Addresses are probably self-reported

---

## Candidates

> Candidate election year spans 1986-2916. What’s the deal with that?

```r
library(fec16)
candidates %>%
  filter(cand_election_yr == 2016, cand_office == "P", cand_status == "C")
```

```
## # A tibble: 72 x 15
##    cand_id cand_name cand_pty_affili… cand_election_yr cand_office_st
##    <chr>   <chr>     <chr>                       <dbl> <chr>
##  1 P00003… CLINTON,… DEM                          2016 US
##  2 P00003… SCHRINER… UNK                          2016 US
##  3 P00004… BROWN, H… NNE                          2016 US
##  4 P00004… BICKELME… REP                          2016 US
##  5 P20002… "JOHNSON… LIB                          2016 US
##  6 P20002… SANTORUM… REP                          2016 US
##  7 P20002… HILL, CH… REP                          2016 US
##  8 P20003… PERRY, J… REP                          2016 US
##  9 P20003… STEIN, J… GRE                          2016 US
## 10 P20004… WELLS, R… DEM                          2016 US
## # … with 62 more rows, and 10 more variables: cand_office <chr>,
## #   cand_office_district <chr>, cand_ici <chr>, cand_status <chr>,
## #   cand_pcc <chr>, cand_st1 <chr>, cand_st2 <chr>, cand_city <chr>,
## #   cand_st <chr>, cand_zip <chr>
```

---

## Corporate contributions

> Is there a way to view the monetary values of contributions not made by individuals?

--

- I don't know.
  You may be referring to ["dark money"](https://en.wikipedia.org/wiki/Dark_money)
- Corporations can't donate "hard money"
- [Citizen's Brochure](https://transition.fec.gov/pages/brochures/citizens.shtml)
- [Who can and can't contribute](https://www.fec.gov/help-candidates-and-committees/candidate-taking-receipts/who-can-and-cannot-contribute/)

---

## Contribution limits

> I am a little confused by the context of the data. I thought that **committees have a cap** ($5000) for their donations but, from the data, the contributions look much more than that. (https://en.wikipedia.org/wiki/Campaign_finance_in_the_United_States#Sources_of_campaign_funding)

---

## Codebook

> I don't know what some variables in contribution and committees datasets.

> What do all the numbers and symbols mean?

> the abbreviations of column and row names are confusing me. Is there a key somewhere?

--

- Use the help documentation in R
- Yes: (https://classic.fec.gov/finance/disclosure/ftpdet.shtml)

--

- [Ballotpedia](https://ballotpedia.org/)

---

## Joining

> Is it normal that the variable `candidate_name` in `house_elections` data doesn't match the `cand_name` variable in `candidates` data?
>
> - Yes -- authoritative data is in `candidates`

---

## Missing data

> Why are employer and occupation all **empty** in contributions dataset? It happens in `cand_id` in committees dataset as well.

> Lots of **data is missing** from the dataset and the labels are defiantly more confusing than other datasets in terms of what the categories are labeled.

--

- That's what real data is like 🤷‍♂️

---

## Committees?

> It doesn't say to whom the various individuals made the donations to, is there a way we can find that out?

--

- individuals don't give to candidates, they give to committees
- committees spend on behalf of or against candidates
- use `cmte_id` and `cand_id` to link tables
    - note that `contributions` table has **both**
- [different types of committees](https://www.fec.gov/campaign-finance-data/committee-type-code-descriptions/)

---

## Negative amounts?

> In the individuals and transactions data, how were some of the transaction amounts negative or zero?

--

- donations can be [returned](https://www.fec.gov/help-candidates-and-committees/taking-receipts-political-party/refunds-contributions/)
- Pay attention to [`transaction_tp` codes](https://www.fec.gov/campaign-finance-data/transaction-type-code-descriptions/):
    - `24A`: Independent expenditure **opposing** election of candidate
    - `24E`: Independent expenditure **advocating** election of candidate

<!--
## Making connections

> Is it normal that the variable candidate_name in house_elections data doesn't match the cand_name variable in candidates data?
>
> - Yes -- authoritative data is in `candidates`

> I was wondering if the **fec_id** variables in the house_elections dataset were the same as the **cand_id** variables in the candidates and contributions dataset.
>
> - Should be the same
-->

---

## Run-offs

> Under house results what is a run-off vote?

--

- Some states hold a run-off when no candidate clears the required share of the vote (often a majority) in the first round
- This is not common, but when it does occur, the run-off decides that contest

```r
house_results %>%
  filter(state == "TX", district_id == 15)
```

```
## # A tibble: 11 x 15
##    state district_id cand_id incumbent name_first name_last party primary_votes
##    <chr> <chr>       <chr>   <lgl>     <chr>      <chr>     <chr>         <dbl>
##  1 TX    15          H6TX15… FALSE     Vicente    Gonzalez  D             22151
##  2 TX    15          H6TX15… FALSE     "Juan \"S… Palacios  D              9913
##  3 TX    15          H6TX15… FALSE     Dolly      Elizondo  D              8888
##  4 TX    15          H6TX15… FALSE     Joel       Quintani… D              6152
##  5 TX    15          H6TX15… FALSE     Rubén      Ramírez   D              3149
##  6 TX    15          H6TX15… FALSE     "Rance G … Sweeten   D              2224
##  7 TX    15          H6TX15… FALSE     Tim        Westley   R             13164
##  8 TX    15          H6TX15… FALSE     Ruben O.   Villarre… R              9349
##  9 TX    15          H6TX15… FALSE     Xavier     Salinas   R              6734
## 10 TX    15          H6TX15… FALSE     Vanessa S. Tijerina  GRE              NA
## 11 TX    15          H8TX28… FALSE     Ross Lynn  Leone     LIB              NA
## # … with 7 more variables: primary_percent <dbl>, runoff_votes <dbl>,
## #   runoff_percent <dbl>, general_votes <dbl>, general_percent <dbl>,
## #   won <lgl>, footnotes <chr>
```

.footnote[https://en.wikipedia.org/wiki/Texas%27s_15th_congressional_district#2016]

---

## What will happen?

.pull-left[
> I'm curious about the **connection** between the committees and contributions dataset and I'm wondering what will happen if I try to join the two.
]

.pull-right[
![](https://media.giphy.com/media/5nFShZWwq3fdm/giphy.gif)
]

---

## Excitement

.pull-left[
> I thought it was really interesting and encouraging that this dataset was made by students! This dataset is **really interesting and I can’t wait to work with it**!
]

.pull-right[
![](https://media.giphy.com/media/ikMSB7JPYY1jy/giphy.gif)
]

---

## Generalization

> What does "generalize your analysis" mean? This is very vague. Can you give us an example?

--

- Instead of just doing the analysis:
    - for one candidate,
    - for one party,
    - for one state,

--

- Can you do it for all of them?
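
---

## Generalization: a sketch

A minimal sketch of what generalizing can look like, assuming a small made-up table called `toy_contributions` (hypothetical data, not part of `fec16`). Instead of filtering down to one candidate and summarizing, group by candidate and summarize everyone at once.

```r
library(dplyr)

# Hypothetical toy data standing in for a contributions-style table
toy_contributions <- tibble(
  cand_id = c("A", "A", "B", "B", "C"),
  transaction_amt = c(100, 250, 50, 75, 500)
)

# One candidate at a time: only answers the question for "A"
one_candidate <- toy_contributions %>%
  filter(cand_id == "A") %>%
  summarize(num_transactions = n(), total = sum(transaction_amt))

# Generalized: the same summary for every candidate at once
all_candidates <- toy_contributions %>%
  group_by(cand_id) %>%
  summarize(num_transactions = n(), total = sum(transaction_amt)) %>%
  arrange(desc(total))

all_candidates
```

The same pattern should extend to parties or states by changing the variable passed to `group_by()`, provided your table has such a column.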
---

## Recoding factors

- Turn all `REP`s into `R`s and all `DEM`s into `D`s (see also `fct_recode()`)
- Using `gsub()`:

```r
# replaces the pattern 'REP' wherever it appears in var
data <- data %>%
  mutate(var = gsub('REP', 'R', var))
```

- Using `ifelse()`:

```r
# recodes exact matches of 'REP'; all other values are left alone
data <- data %>%
  mutate(var = ifelse(var == 'REP', 'R', var))
```

- Using `case_when()`:

```r
# handles both parties; TRUE ~ var keeps every other value unchanged
data <- data %>%
  mutate(var = case_when(
    var == "REP" ~ "R",
    var == "DEM" ~ "D",
    TRUE ~ var
  ))
```

---
class: inverse

## Work on...

- [Mini-project \#2](../mod_data.html)
    - In-class presentations Friday
    - Write-ups due by 11:55 pm on Sunday night
- Up next: importing data...