CalMatters/ca-form-700-data

California financial disclosure data

Structured and cleaned financial disclosure data for California legislators from Form 700 filings.

Methodology

Journalists extract the data from the PDF versions of the forms that legislators submit, which are downloaded from the FPPC. We match the information as it is reported on the form to the extent possible, and we use amended filings where they exist.

The names of people and organizations are then normalized so that the data can be easily compared and grouped. Each schedule has its own two-column lookup table, managed as a Google spreadsheet and linked below next to the cleaned data column.
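As a rough illustration of how a two-column lookup table can normalize names, here is a minimal Python sketch. The header names `raw` and `clean`, the sample rows, and the function names are all hypothetical; the actual spreadsheets may be laid out differently, and this is not necessarily how CalMatters applies them.

```python
import csv
import io

# Hypothetical two-column lookup table: the name as written on the form,
# and the normalized name. The real spreadsheets may use other headers.
LOOKUP_CSV = """raw,clean
Acme Corp.,Acme Corporation
ACME Corporation Inc,Acme Corporation
Example Assn.,Example Association
"""


def load_lookup(text):
    """Map each trimmed, case-folded raw name to its clean name."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["raw"].strip().casefold(): row["clean"].strip() for row in reader}


def normalize(name, lookup):
    """Return the clean name if the raw name is listed, else the trimmed input."""
    trimmed = name.strip()
    return lookup.get(trimmed.casefold(), trimmed)


lookup = load_lookup(LOOKUP_CSV)
print(normalize("  acme corp. ", lookup))
print(normalize("Unlisted Donor LLC", lookup))
```

Names that are missing from the lookup pass through unchanged, which makes it easy to spot entries that still need a row in the table.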

Data files

There are four CSV files generated, and each is based on a section of the FPPC form called a "schedule".

Investments - Schedule A1

Here is the data dictionary for the schedule-a1.csv file.

Column name Description and caveats
filer The legislator who filed the form
filingYear The year in which the investment was owned, bought, or sold
name The name of the investment, normalized with this spreadsheet
description Description supplied on the form
fmv "Fair market value" of the asset, categorized as one of the following: $2k - $10k, $10k - $100k, $100k - $1m, $1m+
nature Nature of the investment, as reported on the form
acquired Date acquired, null when not in the same filing year
disposed Date sold, null when not in the same filing year
formUrl A URL for the PDF version of the form where the source data is located
legislatorDigitalDemocracyUrl A URL for the legislator's profile page on CalMatters' Digital Democracy
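
Because fmv is a category rather than a number, sorting or filtering by value needs a conversion step. The following Python sketch turns a category into lower and upper dollar bounds. It assumes the strings in schedule-a1.csv match the four categories listed above exactly; the function name `fmv_bounds` is hypothetical.

```python
import re

_MULTIPLIERS = {"k": 1_000, "m": 1_000_000}
_AMOUNT = re.compile(r"\$(\d+)([km])")


def fmv_bounds(category):
    """Return (low, high) in dollars; high is None for the open-ended '$1m+'."""
    amounts = [
        int(number) * _MULTIPLIERS[unit]
        for number, unit in _AMOUNT.findall(category.lower())
    ]
    if len(amounts) == 2:
        return amounts[0], amounts[1]
    if len(amounts) == 1 and category.strip().endswith("+"):
        return amounts[0], None
    raise ValueError(f"Unrecognized fmv category: {category!r}")


for label in ["$2k - $10k", "$10k - $100k", "$100k - $1m", "$1m+"]:
    print(label, fmv_bounds(label))
```

The bounds are ranges only. The form does not report an exact value, so totals built from them should be presented as minimums or ranges.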

Income received - Schedule C

Here is the data dictionary for the schedule-c-income.csv file.

Column name Description and caveats
filer The legislator who filed the form
filingYear The year in which the income was received
sourceName The name of the source of the income
address The address of the source
businessActivity A general description of the business activity if the source is a business entity
position The position held with the entity
grossIncome Total amount of income before deducting expenses, losses, or taxes and includes loans other than loans from a commercial lending institution
consideration The reason the income was received
saleOf The item that was sold if the income was from a sale
commissionOrRentalIncomeDescription A description of what was commissioned or rented
otherDescription A description of "other" considerations
formUrl A URL for the PDF version of the form where the source data is located
legislatorDigitalDemocracyUrl A URL for the legislator's profile page on CalMatters' Digital Democracy

Gifts - Schedule D

Here is the data dictionary for the schedule-d.csv file.

Column name Description and caveats
filer The legislator who filed the form
filingYear The year in which the gift was given
sourceName The name of the gift giver, normalized with this spreadsheet
amount Dollar value of gift
reimbursedAmount Dollars returned to the source
date Date the gift was given
description A description of the gift
formUrl A URL for the PDF version of the form where the source data is located
legislatorDigitalDemocracyUrl A URL for the legislator's profile page on CalMatters' Digital Democracy

Things to look out for when working with gift data

Some legislators reported accepting gifts that exceeded the FPPC's annual per-source limit ($520 in 2022) and included a note saying they returned the portion above the limit. That note isn't captured anywhere in our data, so if you notice anyone reporting a total over the legal limit, check the submitted form to see whether the legislator returned part of the gift. The PDF is available in the formUrl column for each gift.
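
One way to find filings worth checking by hand is to total gifts per filer, year, and source and flag the totals above the limit. The Python sketch below does that on made-up rows that use the schedule-d.csv column names. It assumes amounts are plain numbers or dollar strings and that blank cells mean zero; the helper names and the sample data are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows with the schedule-d.csv column names; values are made up.
SAMPLE = """filer,filingYear,sourceName,amount,reimbursedAmount,formUrl
Example Filer A,2022,Example Sponsor,400,,https://example.com/a.pdf
Example Filer A,2022,Example Sponsor,300,,https://example.com/a.pdf
Example Filer B,2022,Example Sponsor,600,80,https://example.com/b.pdf
"""

LIMIT_2022 = 520  # per-source annual limit cited above for 2022


def to_dollars(value):
    """Parse a dollar string such as '$1,200.50'; blank cells count as 0."""
    cleaned = value.replace("$", "").replace(",", "").strip()
    return float(cleaned) if cleaned else 0.0


def over_limit(rows, limit):
    """Sum net gifts per (filer, year, source) and keep the groups above the limit."""
    totals = defaultdict(float)
    urls = defaultdict(set)
    for row in rows:
        key = (row["filer"], row["filingYear"], row["sourceName"])
        totals[key] += to_dollars(row["amount"]) - to_dollars(row["reimbursedAmount"])
        urls[key].add(row["formUrl"])
    return {
        key: (total, sorted(urls[key]))
        for key, total in totals.items()
        if total > limit
    }


flagged = over_limit(csv.DictReader(io.StringIO(SAMPLE)), LIMIT_2022)
for (filer, year, source), (total, form_urls) in flagged.items():
    print(filer, year, source, total, form_urls)
```

A flagged group is only a prompt to open the PDF. The limit shown applies to 2022, so other filing years need their own value.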

Asm. Marie Waldron reported a gift total of over $3k in her 2022 form but it appears to be travel that should have been reported in Schedule E instead. CalMatters reached out to her office for clarification but has yet to hear back.

Sponsored trips - Schedule E

Here is the data dictionary for the schedule-e.csv file.

Column name Description and caveats
filer The legislator who filed the form
filingYear The year in which the trip took place
sourceName The name of the travel sponsor, normalized with this spreadsheet
address Address of the source from the filing
cityAndState City and state of the source from the filing
amount Dollars spent on travel
reimbursedAmount Dollars returned to the source
onDate Starting date of sponsored travel
throughDate End date of sponsored travel
giftOrIncome From the original filing
madeASpeechParticipatedInPanel From the original filing
giftTravelDestination The place the legislator traveled to
otherDescription
formUrl A URL for the PDF version of the form where the source data is located
legislatorDigitalDemocracyUrl A URL for the legislator's profile page on CalMatters' Digital Democracy
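
The onDate and throughDate columns can be combined to estimate how long a trip lasted. The Python sketch below assumes both are stored as ISO dates (YYYY-MM-DD), which this README does not confirm, so check the file and adjust the parsing if needed. The function name `trip_days` is hypothetical.

```python
from datetime import date


def trip_days(on_date, through_date):
    """Inclusive trip length in days, or None if either date is blank."""
    if not on_date or not through_date:
        return None
    start = date.fromisoformat(on_date)
    end = date.fromisoformat(through_date)
    if end < start:
        raise ValueError(f"throughDate {through_date} is before onDate {on_date}")
    return (end - start).days + 1


print(trip_days("2022-03-04", "2022-03-06"))
print(trip_days("2022-03-04", ""))
```

Counting both the start and end day is a design choice; drop the `+ 1` to count nights instead.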

Data use

If you use this dataset, please mention it was collected and cleaned by CalMatters. If you have any questions about this dataset, feel free to contact us.

CalMatters is a nonpartisan, nonprofit journalism venture committed to explaining how California’s state Capitol works and why it matters.

Stories and projects that use this data

Credits

The FPPC publishes only the PDF versions of each filing, though many of them are submitted electronically. People created this dataset by going through all of the forms and entering the information as structured data. The contributors are:

  • Jeremia Kimelman
  • John Osborn D'Agostino
  • Erica Yee
  • Alesha Riani Blaauw
  • Chris Woodard
  • Cristian Gonzalez
  • Emma Hall
  • Hailey Valdivia
  • Jack Freeman
  • Julianna Rodriguez
  • Katelyn Marano
  • Mercy Sosa
  • Nancy Rodriguez