data codes through eyeglasses

Harnessing the US Census API for Demographic Data Retrieval

The US Census API exposes a wide range of demographic data, from population characteristics to economic indicators. This post walks through a Python-based approach to fetching detailed demographic data for the locations listed in a CSV file. Each row of that file carries latitude and longitude coordinates along with the state, county, and tract FIPS codes that the API request uses.

Skip ahead to the GitHub repo!


Initial Set-Up

Install the `census` Python package. This package is a lightweight, easy-to-use wrapper for the Census API. For table formatting, we’ll also use the `tabulate` package. If not already installed, you can add them to your environment using the following command:

%pip install census tabulate

Next, we import the necessary libraries:

import csv
from census import Census
from tabulate import tabulate

Setting Up the API

To use the Census API, you need an API key, which can be obtained for free from the US Census website. Once you have your key, replace `"xyz123"` with it to set up your API connection:

api_key = "xyz123"  # Replace with your actual API key
census_client = Census(api_key)
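
Hard-coding the key is fine for a quick test, but it is easy to leak when you share a notebook. Here is a minimal sketch of one alternative, assuming you have stored the key in an environment variable. The variable name `CENSUS_API_KEY` and the helper `load_api_key` are illustrative choices for this post, not part of the `census` package:

```python
import os


def load_api_key(env_var="CENSUS_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name is an assumption for this sketch; use whatever
    name you have actually set in your environment.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Census API key."
        )
    return key
```

With this in place, you could write `census_client = Census(load_api_key())` instead of pasting the key into the notebook.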

Storing the Data

For this task, we’re going to store the retrieved census data in a list:

census_data_list = []

Specifying CSV File Path

This is where you specify the path to your CSV file. Make sure to replace `"/content/YOUR.csv"` with your actual file path:

csv_file_path = "/content/YOUR.csv"  # Replace with the actual path to your CSV file

Reading the CSV File

We’re using Python’s `csv.DictReader` to read the file and treat each row as a dictionary:

with open(csv_file_path, "r") as csv_file:
    csv_reader = csv.DictReader(csv_file)
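
To see what `csv.DictReader` hands back, here is a small sketch that reads an in-memory sample instead of a real file. The column names match the ones used in this post; the values are illustrative placeholders, so swap in your own data:

```python
import csv
import io

# Illustrative sample standing in for your CSV file (values are placeholders)
sample_csv = io.StringIO(
    "STATEFIPS,COUNTYFIPS,TRACT,latitude,longitude\n"
    "06,037,207400,34.0522,-118.2437\n"
)

rows = list(csv.DictReader(sample_csv))
first_row = rows[0]

# Every value comes back as a string, so leading zeros in FIPS codes survive
print(first_row["STATEFIPS"])
print(first_row["latitude"])
```

Because every value is a string, convert the coordinates with `float()` yourself if you need them as numbers.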

Extracting Census Data

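This is where the magic happens. For each row in the CSV file, we read the location fields and make a request to the Census API. The request identifies each census tract by its state, county, and tract FIPS codes. The latitude and longitude columns are read for reference only, because the API selects geography by FIPS code rather than by coordinates: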

    for row in csv_reader:
        # Extract location data
        state = row["STATEFIPS"]
        county = row["COUNTYFIPS"]
        tract = row["TRACT"]
        latitude = row["latitude"]  # Replace with the correct column name for latitude
        longitude = row["longitude"]  # Replace with the correct column name for longitude

        # Request data from Census API.
        # The tract is selected by its FIPS codes; the coordinates are not sent to the API.
        census_data = census_client.acs5.state_county_tract(
            ("NAME", "B01003_001E", "B01001_002E", "B01001_026E", "B01001B_001E", "B01001I_001E", "B06012_004E"),
            state,
            county,
            tract,
            year=2020,
        )
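
FIPS codes have fixed widths: two digits for the state, three for the county, and six for the tract. Spreadsheet tools often strip leading zeros when saving a CSV, which can make a request come back empty. A minimal sketch of a clean-up step is shown here. The helper name `normalize_fips` is made up for this post, and it assumes your columns hold the full numeric codes with only the leading zeros missing:

```python
def normalize_fips(state, county, tract):
    """Zero-pad FIPS codes to 2 (state), 3 (county), and 6 (tract) digits.

    Assumes each value is the complete code that has merely lost its
    leading zeros; it does not handle tract numbers written with a
    decimal point.
    """
    state = str(state).strip().zfill(2)
    county = str(county).strip().zfill(3)
    tract = str(tract).strip().zfill(6)
    return state, county, tract


# A row whose leading zeros were stripped by a spreadsheet tool
print(normalize_fips(6, 37, 10100))
```

You could call this on the three values read from each row before passing them to `state_county_tract`.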

Processing the Data

Having fetched our data, we then need to process it. This involves iterating over the results and extracting the information we need:

        for result in census_data:
            # Extract data
            name = result["NAME"]
            total_population = result["B01003_001E"]
            male_population = result["B01001_002E"]
            female_population = result["B01001_026E"]
            poverty = result["B06012_004E"]

            # Store data in a dictionary
            census_row = {
                "name": name,
                "total_population": total_population,
                "male_population": male_population,
                "female_population": female_population,
                "poverty": poverty,
            }

            # Append the dictionary to the data list
            census_data_list.append(census_row)
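
If you end up requesting more variables, copying each one by hand gets tedious. One possible alternative is to drive the renaming from a mapping, sketched below. The names `FIELD_LABELS`, `to_census_row`, and `female_share` are illustrative, and the sample result uses made-up numbers rather than real census figures:

```python
# Map the ACS variable codes used in this post to readable column names
FIELD_LABELS = {
    "NAME": "name",
    "B01003_001E": "total_population",
    "B01001_002E": "male_population",
    "B01001_026E": "female_population",
    "B06012_004E": "poverty",
}


def to_census_row(result):
    """Build one output row from a single API result dictionary."""
    return {label: result.get(code) for code, label in FIELD_LABELS.items()}


def female_share(census_row):
    """Return the female share of the population, or None if unavailable."""
    total = census_row["total_population"]
    female = census_row["female_population"]
    if not total or female is None:
        return None
    return float(female) / float(total)


# Made-up result in the same shape as one entry of census_data
sample_result = {
    "NAME": "Example Tract",
    "B01003_001E": 4000,
    "B01001_002E": 2000,
    "B01001_026E": 2000,
    "B06012_004E": 3000,
}

row = to_census_row(sample_result)
print(row)
print(female_share(row))
```

The mapping keeps the variable codes and their readable names in one place, so adding a field means adding one line.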

Saving the Data

Finally, we save our data to a new CSV file, ensuring it’s neatly organized for future use:

csv_output_path = "/content/output.csv"  # Replace with the desired output file path

# Write the census data to the CSV file
with open(csv_output_path, "w", newline="") as csv_file:
    fieldnames = census_data_list[0].keys()
    writer = csv.DictWriter(csv_file, fieldnames=fieldnames)

    # Write the header row
    writer.writeheader()

    # Write each row of census data
    for row in census_data_list:
        writer.writerow(row)

print("CSV file saved successfully!")
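
One thing to watch: if no request returned data, `census_data_list` is empty and `census_data_list[0]` raises an `IndexError`. Here is a small sketch of a guarded version, written against an in-memory buffer so it runs without touching the disk. The helper name `write_rows` and the sample row are made up for illustration:

```python
import csv
import io


def write_rows(rows, out_file):
    """Write a list of dictionaries as CSV; return the number of data rows written."""
    if not rows:
        # Nothing to write: rows[0] would raise IndexError on an empty list
        return 0
    writer = csv.DictWriter(out_file, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return len(rows)


# Demo with an in-memory buffer and a made-up row
buffer = io.StringIO()
count = write_rows([{"name": "Example Tract", "total_population": 4000}], buffer)
print(count)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with `open(csv_output_path, "w", newline="")` in place of the buffer.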

Now you have a Python script that uses the US Census API to retrieve demographic data for the census tracts listed in your CSV file. You can adjust this script as needed to fit your specific data and research requirements. Enjoy diving into the wealth of information that the US Census data provides!