diversity traces

an interactive lens on multi-racial
families in the United States · 1860-2020

When looking at diversity from a racial perspective, homogeneous communities are still the norm: they remain siloed not only locally but within their own households as well. This visualization project is a celebration of the couples and families on the fringe who have a multi-racial identity, embodying the intermingling of races and dissolving the systemic barriers placed on their very existence.

According to the census, there are only vestiges of these multi-racial families until 1960. Prior to that, the census enumerator was responsible for categorizing persons, while after 1970 race was reported by someone in the household. In 1967, Loving v. Virginia ended restrictions on multi-racial marriage. Only since 2000 have people been able to identify as being of multiple races. More recently, there has been a surge of multi-racial families in the data, but they are still a rarity, still mere traces of diversity in America.




In this visualization you can see every registered multi-racial couple in America, drawn from 1-5% samples of the population for recent periods and from 100% samples for older periods. Each couple is represented as a colorful chromosome, enabling you to see the races within each family, their ages, sexes, and children. For each year, these couples are organized by rarest multi-racial group first, then by ascending average age of the couple, and then by number of children. This means that in each group you will first see couples with no children, but as you navigate towards the end of a group, you will see couples with more children. In 2015, Obergefell v. Hodges ended restrictions on same-sex marriages. Only in recent years will you be able to see same-sex couples. In addition to race, individuals who identify themselves as Latino/a are also marked with an L.


Project by Pedro M. Cruz at the Co-Lab for Data Impact and the Center for Design at Northeastern University. This project was funded by a National Geographic Explorer Grant for “Visualizing the Evolution of Household Diversity in America.”

Contributors
John Wihbey @ Co-Lab for Data Impact, Associate Professor, School of Journalism and Media Innovation, Northeastern University
Kathleen Foley, visual identity and typography
Leah Welch, functional prototyping and ideation
Ryan Morrill, research, ideation, and visual prototyping
Arushi Singh, data frameworks and statistics
Eunice Esomonu, data querying
Yuqing Liu, research and visual prototyping
Anuj Golesar, database setup and statistics
Dishali Sonawane, database setup and statistics



This project uses the Nunito typeface served by Google Fonts and is hosted on GitHub. It is coded in plain JavaScript and p5.js.

The application uses two canvas elements: one that is updated when a year is selected to render all the families, and a second one that is redrawn according to the mouse position in order to render a zoomed-in version of the hovered families. To avoid computing distances from the mouse pointer to every family, a hash map of families and coordinates is kept in memory, so mouse coordinates can be matched to a family simply by rounding them to integers. The chromosomes are drawn using the built-in Catmull-Rom splines in p5.js, then rendered with a thick stroke and rounded joints. The motion of the chromosomes was created by horizontally shifting the middle coordinate of each of the arms using the Perlin noise function. Age maps to height through a square root rather than linearly: making the height of each chromosome directly proportional to the age of the spouses creates such a wide variation in heights that it becomes hard to condense so much information into a constrained space.
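The hover lookup and the age scale described above can be sketched in a few lines of plain JavaScript. This is only an illustration under stated assumptions, not the project's actual code: the cell size `CELL`, the family shape `{ id, x, y }`, the function names, and the `maxAge` and `maxHeight` parameters are all hypothetical.

```javascript
// Hypothetical cell size in pixels; the project's real value is not stated.
const CELL = 10;

// Key of the grid cell that contains pixel (x, y).
function cellKey(x, y) {
  return Math.floor(x / CELL) + "," + Math.floor(y / CELL);
}

// Build the lookup table once, for example whenever a year is selected.
// Each family is assumed to have the shape { id, x, y }.
function buildIndex(families) {
  const index = new Map();
  for (const family of families) {
    index.set(cellKey(family.x, family.y), family);
  }
  return index;
}

// Hover test with a single map lookup instead of a distance
// computation against every family.
function familyAt(index, mouseX, mouseY) {
  return index.get(cellKey(mouseX, mouseY)) || null;
}

// Square-root age scale: compresses the spread between young and old
// spouses. maxAge and maxHeight are illustrative parameters.
function ageToHeight(age, maxAge = 100, maxHeight = 40) {
  return Math.sqrt(age / maxAge) * maxHeight;
}

const demoIndex = buildIndex([
  { id: "a", x: 12, y: 7 },
  { id: "b", x: 45, y: 33 },
]);
const hovered = familyAt(demoIndex, 15, 3); // same cell as family "a"
```

With a square-root scale, a spouse aged 25 is drawn at half the height of one aged 100 rather than a quarter of it, which keeps young couples legible without letting older ones dominate the layout.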

The micro-census data used for this project were obtained from the IPUMS USA database (IPUMS USA, University of Minnesota, www.ipums.org). Nine sample years were used:

Year    Sample size
2020    1%
2000    1%
1980    5%
1960    100%
1940    100%
1920    100%
1900    100%
1880    100%
1860    100%

The data from IPUMS is anonymized. For each sample, the following variables were extracted from IPUMS in a rectangular format:

Variable    Label
YEAR        Census year
SAMPLE      IPUMS sample identifier
SERIAL      Household serial number
HHWT        Household weight
PERNUM      Person number in sample unit
SPLOC       Spouse's location in household
RELATE      Relationship to household head
SEX         Sex
AGE         Age
RACE        Race
HISPAN      Hispanic origin

Each sample was then inserted as a document of persons in a MongoDB database. The database's engine was then used to parse households, extract nuclear families from within those households, detect multi-racial families, and store them as hierarchical objects: each family comprises two parents and a list of children. The results of the data processing were exported as static JSON files, which are loaded by the browser and used to render the visualization.
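As a rough illustration of the couple-detection step, the sketch below pairs spouses within each household using SPLOC, which in IPUMS holds the PERNUM of a person's spouse (0 when there is none). It is written in plain JavaScript rather than as MongoDB queries, and it rests on assumptions: the function name, the flat record shape, the sample values, and the rule "the two spouses have different RACE codes" are illustrative, and the project's actual criteria may differ. Children are left empty because the text does not describe how they were attached.

```javascript
// Assumed input: flat person records carrying the IPUMS variables listed above.
function findMultiRacialCouples(persons) {
  // Group persons by household, keyed on SAMPLE and SERIAL.
  const households = new Map();
  for (const p of persons) {
    const key = p.SAMPLE + ":" + p.SERIAL;
    if (!households.has(key)) households.set(key, new Map());
    households.get(key).set(p.PERNUM, p);
  }

  const couples = [];
  for (const members of households.values()) {
    for (const p of members.values()) {
      // SPLOC of 0 means no spouse present, so the lookup returns undefined.
      const spouse = members.get(p.SPLOC);
      // Keep only the lower PERNUM of a pair so each couple is emitted once.
      if (!spouse || p.PERNUM >= spouse.PERNUM) continue;
      // Illustrative rule: spouses with different RACE codes.
      if (p.RACE !== spouse.RACE) {
        couples.push({ parents: [p, spouse], children: [] });
      }
    }
  }
  return couples;
}

// Made-up records; the SAMPLE and RACE values are placeholders.
const samplePersons = [
  { SAMPLE: 1, SERIAL: 10, PERNUM: 1, SPLOC: 2, RACE: 1 },
  { SAMPLE: 1, SERIAL: 10, PERNUM: 2, SPLOC: 1, RACE: 2 },
  { SAMPLE: 1, SERIAL: 11, PERNUM: 1, SPLOC: 2, RACE: 1 },
  { SAMPLE: 1, SERIAL: 11, PERNUM: 2, SPLOC: 1, RACE: 1 },
  { SAMPLE: 1, SERIAL: 12, PERNUM: 1, SPLOC: 0, RACE: 2 },
];
const sampleCouples = findMultiRacialCouples(samplePersons);
```

The resulting objects mirror the hierarchical shape described above, with two parents and a list of children, so they could be serialized to JSON directly.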