For those who want to “see for themselves”.

Downloading the data manually

The simplest way to view the data is to download the raw data directly and view it in Excel or OpenOffice (or some similar tool).

You can download the full CSV data directly at this link or others like it in the archive.org snapshots: https://web.archive.org/web/20201115001813if_/https://data.pa.gov/api/views/mcba-yywm/rows.csv?accessType=DOWNLOAD
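If you would rather poke at the file with Pandas than with a spreadsheet, something like the following minimal sketch should work. It assumes you have already saved the CSV to disk. The helper names `load_requests` and `summarize` are placeholders chosen for illustration, not anything that ships with the dataset. Every field is read as plain text so that dates and numeric-looking codes are not silently reinterpreted.

```python
import pandas as pd


def load_requests(source):
    """Read the mail-ballot CSV from a file path or file-like object."""
    # dtype=str keeps every field as text, exactly as published.
    # keep_default_na=False leaves empty cells as "" instead of NaN.
    return pd.read_csv(source, dtype=str, keep_default_na=False)


def summarize(df):
    """Return the row count and the column names of the loaded table."""
    return {"rows": len(df), "columns": list(df.columns)}
```

For example, `summarize(load_requests("rows.csv"))` (where `rows.csv` is whatever name you saved the download under) lists the columns, which you can then compare against the published field descriptions.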

What the data looks like

…in a text editor like Notepad.

Be sure to read the descriptions of the data fields

For example, the date of birth January 1, 1800 appears in the dataset. That may look alarming at first, but the field descriptions explain that it is a “protection” for certain voters (such as victims of domestic violence). In all, I counted only about 60 of these.
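A count like that can be reproduced with a few lines of Pandas. This is a rough sketch rather than the exact code from the notebook: the column name `Date of Birth` and the `MM/DD/YYYY` date format are assumptions, so check them against the CSV header and the field descriptions before trusting the number.

```python
import pandas as pd

# The placeholder birth date described in the field descriptions.
PLACEHOLDER_DOB = pd.Timestamp("1800-01-01")


def count_placeholder_dob(df, column="Date of Birth", date_format="%m/%d/%Y"):
    """Count rows whose birth date equals the 1800-01-01 placeholder.

    Both `column` and `date_format` are assumptions about the CSV layout.
    """
    # errors="coerce" turns blank or malformed dates into NaT
    # instead of raising, so they simply do not match.
    parsed = pd.to_datetime(df[column], format=date_format, errors="coerce")
    return int((parsed == PLACEHOLDER_DOB).sum())
```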

Using the PAOpenData viz in your browser

The second-easiest way to view the data is with the PAOpenData visualization tool in your web browser. It offers some user-friendly sorting, querying, and graphing functions. The original PAOpenData link is here: https://data.pa.gov/Government-Efficiency-Citizen-Engagement/2020-General-Election-Mail-Ballot-Requests-Departm/mcba-yywm

You can see all of Archive.org’s snapshots of the PAOpenData site here:

https://web.archive.org/web/20240000000000*/https://data.pa.gov/Government-Efficiency-Citizen-Engagement/2020-General-Election-Mail-Ballot-Requests-Departm/mcba-yywm
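The archive links in this post all follow the same pattern: the Wayback Machine prefix, a 14-digit timestamp (YYYYMMDDhhmmss), and then the original address. The sketch below builds such links. The function name `wayback_url` is made up for illustration. The `if_` suffix is the one that appears in the CSV download link. As far as I know, it asks for the archived content without the Wayback toolbar wrapped around it, but treat that as an assumption.

```python
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(original_url, timestamp, raw=False):
    """Build a Wayback Machine link for `original_url` at `timestamp`.

    `timestamp` is a YYYYMMDDhhmmss string. With raw=True the "if_"
    modifier is appended, as in the CSV download link in this post.
    """
    modifier = "if_" if raw else ""
    return f"{WAYBACK_PREFIX}/{timestamp}{modifier}/{original_url}"
```

The Wayback Machine serves the closest snapshot it has to the timestamp you ask for, so the timestamp does not need to match a capture exactly.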

Here is a specific snapshot:

https://web.archive.org/web/20201118054516/https://data.pa.gov/Government-Efficiency-Citizen-Engagement/2020-General-Election-Mail-Ballot-Requests-Departm/mcba-yywm

Click “View Data” near the top.

NOTE: keep in mind this is Archive.org’s archive of the site, so it may take a long time to load in your browser.

Once loaded, it will look like this:

Running the Docker container:

The Docker container contains the code I wrote to explore the dataset. It is self-contained and has all the dependencies already installed. This will allow you to immediately get to work tinkering with the data and writing your own code if you are so inclined. It is the same code that was used to generate the screenshots in the main writeup.

It also contains a sample of the data, but you can also download the data yourself and use that instead.

Tools Used:

Jupyter Notebooks
Python
Pandas
PyTorch
Matplotlib (only for heatmaps and scatterplots not shared in this post)

How to run the Docker container

docker run -it -p 8888:8888 -p 6006:6006 sa7ori/pa2020 bash

Docker will automatically download the container.

Once the download is complete, the container will run and drop you into a root shell.

Run the run_jupyter.sh shell script.

Open your browser to http://localhost:8888 and click on the PA 2020 Notebook.

You are now in control.
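If the page does not come up in your browser, it can help to first confirm that something is actually listening on the published port. The snippet below is a small, generic check run from the host machine. `port_is_open` is a hypothetical helper, and 8888 is simply the port mapped in the docker run command above.

```python
import socket


def port_is_open(host="localhost", port=8888, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError (refused, timed out, ...)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A result of False usually means either the container is not running or run_jupyter.sh has not been started yet inside it.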