Popularity of Biblical Names in America 1880–2017

  • toldham2
  • Sep 21, 2021
  • 3 min read

Tyler Oldham

9/21/2021


Introduction

This report visualizes and contextualizes the popularity of first names appearing in the Revised English Bible relative to all first names given to newborns in America from 1880 to 2017. According to Pew Research, religiosity in America is at an all-time low. This report hypothesizes that the proportion of biblical names in America has decreased over time, reflecting that decline in religiosity.

While the Christian Bible is used to create the dataset, this report includes names found in all Abrahamic religions and makes no effort to analyze the popularity of any specific religion.


Libraries

Code

library(tidyverse)
library(rvest)
library(dplyr)
library(lubridate)
library(scales)
library(ggthemes)

Datasets

This report uses two primary datasets: first names of babies born in America and first names mentioned in the Bible.


Babynames

The babynames dataset is distributed as an R package created by Hadley Wickham using data from the Social Security Administration. The package contains a table of ~2,000,000 rows, each giving a name, sex, year, the number of births with that name (n), and the proportion of that year's births of that sex with that name (prop).

Code

library(babynames)

Names in the Christian Bible

The biblical names dataset is generated by web scraping a comprehensive table of names appearing in the Bible, published on behindthename.com. The source contains variables for name, gender, verse, and alternative names. The script uses the rvest package to scrape and compile the data.


To start, I created a data frame from each of the four pages of the source and merged them into one.

Code

# Collect all four pages of the list
bibleUrls <- c(
  "https://www.behindthename.com/namesakes/list/biblical/alpha",
  "https://www.behindthename.com/namesakes/list/biblical/alpha/2",
  "https://www.behindthename.com/namesakes/list/biblical/alpha/3",
  "https://www.behindthename.com/namesakes/list/biblical/alpha/4"
)

# Read one page and return its fourth <table> as a data frame
scrapeBiblePage <- function(url) {
  read_html(url) %>%
    html_nodes("table") %>%
    .[4] %>%
    html_table(fill = TRUE) %>%
    as.data.frame()
}

biblePages <- lapply(bibleUrls, scrapeBiblePage)

# Merge the four
bibleMerged <- do.call(rbind, biblePages)

# Set column names
colnames(bibleMerged)[1] <- "Name"
colnames(bibleMerged)[2] <- "Gender"
colnames(bibleMerged)[3] <- "Verse"
colnames(bibleMerged)[4] <- "Other Names"
colnames(bibleMerged)[5] <- "Testament"

# Adjust data frame: drop the first row
bibleMerged <- bibleMerged[-1, ]
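
To see what the table-selection step does without hitting the network, here is a minimal sketch that runs the same rvest calls on an inline HTML snippet. The snippet, its two tables, and the names toyPage, toyTables, and toyNames are assumptions made up for illustration; the real page layout may differ, which is why the table index in the scraping script should be checked if the site changes.

```r
library(rvest)

# Hypothetical stand-in for a scraped page: the second table holds the names
toyPage <- minimal_html('
  <table><tr><td>navigation</td></tr></table>
  <table>
    <tr><th>Name</th><th>Gender</th><th>Verse</th></tr>
    <tr><td>Aaron</td><td>m</td><td>Exodus 4:14</td></tr>
    <tr><td>Abigail</td><td>f</td><td>1 Samuel 25:3</td></tr>
  </table>
')

# html_nodes() returns every <table> on the page as a node set
toyTables <- html_nodes(toyPage, "table")

# Pick one table by position and convert it to a data frame
toyNames <- html_table(toyTables[[2]])

nrow(toyNames)
names(toyNames)
```

Selecting by position is fragile, so printing length(toyTables) and inspecting each candidate table before hard-coding an index is a reasonable precaution.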

Then, I created another variable for the Testament origin of the name.

Code

# Set the value to "New" if an abbreviated New Testament book name appears in the verse; otherwise, set the value to "Old".
bibleMerged$Testament <- ifelse(grepl("Chr|Tim|John|Acts|Col|Eph|Jude|Luke|Mark|Matt|Phlm|Phlp|Rev|Rom|Titus", bibleMerged$Verse), "New", "Old")
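
One thing to watch with a substring pattern like this: it matches anywhere in the verse string, and Chronicles is an Old Testament book, so if the Verse column abbreviates it as "Chr" those references would be labelled "New". The sketch below shows one possible way to make the classification easier to audit, using an explicit list of abbreviations matched as whole words. The abbreviation list, the sample verse strings, and the names ntBooks, ntPattern, and classifyTestament are all assumptions for illustration and would need to be adapted to the abbreviations the source actually uses.

```r
# Hypothetical list of New Testament book abbreviations (an assumption;
# adjust it to the abbreviations that appear in the scraped Verse column)
ntBooks <- c("Matt", "Mark", "Luke", "John", "Acts", "Rom", "Col", "Eph",
             "Tim", "Titus", "Phlm", "Phlp", "Jude", "Rev")

# Match each abbreviation only as a whole word
ntPattern <- paste0("\\b(", paste(ntBooks, collapse = "|"), ")\\b")

# Return "New" when a listed abbreviation appears in the verse, else "Old"
classifyTestament <- function(verse) {
  ifelse(grepl(ntPattern, verse, perl = TRUE), "New", "Old")
}

# Made-up verse strings, used only to exercise the function
sampleVerses <- c("Matt 1:2", "Gen 4:1", "1 Chr 2:3", "2 Tim 1:5")
classifyTestament(sampleVerses)
```

Tabulating the result against the distinct book abbreviations in the data, for example with table(), is a quick way to spot books that landed in the wrong testament.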

Method

Create a base subset containing birth data for biblical names only.

Code

babynamesBible <- merge(bibleMerged, babynames, by.x = "Name", by.y = "name", all.x = FALSE)
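
The merge keeps only rows whose name appears in both tables, and it matches on the name alone. A small sketch on made-up data shows how this behaves, including what would happen if the biblical table listed the same name more than once. Whether the scraped table actually contains repeated names is not stated here, so treat the duplicate row, and the names bibleToy, babyToy, joined, and deduped, as assumptions for illustration.

```r
# Made-up biblical names table; "Mary" is listed twice on purpose
bibleToy <- data.frame(
  Name = c("Mary", "John", "Mary"),
  Gender = c("f", "m", "f"),
  stringsAsFactors = FALSE
)

# Made-up birth counts; "Emma" has no match in the biblical table
babyToy <- data.frame(
  name = c("Mary", "John", "Emma"),
  year = c(1880, 1880, 1880),
  n = c(100, 90, 50),
  stringsAsFactors = FALSE
)

# Inner join on the name columns: "Emma" is dropped, "Mary" appears twice
joined <- merge(bibleToy, babyToy, by.x = "Name", by.y = "name")
nrow(joined)
sum(joined$n)

# Removing repeated rows first keeps each birth count from being counted twice
deduped <- merge(unique(bibleToy), babyToy, by.x = "Name", by.y = "name")
sum(deduped$n)
```

Comparing nrow() before and after such a deduplication on the real data would show whether repeated names affect the yearly totals.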

Aggregate the merged data into the data frames to be plotted.

Code

# Create a separate dataframe for the sum of biblical names by year.
babynamesBible %>% 
  group_by(year) %>% 
  summarise(n = sum(n)) -> yearTotals
lubridate::ymd(yearTotals$year, truncated = 2L) -> yearTotals$year

# Create a separate data frame for the mean proportion (prop) per biblical name by year.
babynamesBible %>% 
  group_by(year) %>% 
  summarise(prop = mean(prop)) -> yearProps
lubridate::ymd(yearProps$year, truncated = 2L) -> yearProps$year

# Create a separate data frame for the sum of biblical names by year segregated by testament origin.
babynamesBible %>% 
  group_by(year, Testament) %>% 
  summarise(n = sum(n)) %>% 
  arrange(year, Testament) -> yearTotalsTestament
lubridate::ymd(yearTotalsTestament$year, truncated = 2L) -> yearTotalsTestament$year

# Create a separate data frame for the mean proportion (prop) per biblical name by year, split by testament origin.
babynamesBible %>% 
  group_by(year, Testament) %>% 
  summarise(prop = mean(prop)) %>% 
  arrange(year, Testament) -> yearPropsTestament
lubridate::ymd(yearPropsTestament$year, truncated = 2L) -> yearPropsTestament$year
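
To make the aggregation step concrete, here is a minimal sketch on made-up rows. It computes the yearly sum of n, the yearly mean of prop as in the code above, and, for comparison, the yearly sum of prop. The mean averages over the matching name rows, so it can fall simply because more low-frequency names match in a given year, while the sum reflects the combined share. Which measure suits the figures better is a judgment call; the toy values and the names toyBible and toySummary are assumptions for illustration only.

```r
library(dplyr)
library(lubridate)

# Made-up rows for illustration only
toyBible <- tibble(
  year = c(1880, 1880, 1881, 1881, 1881),
  n    = c(800, 200, 600, 300, 100),
  prop = c(0.08, 0.02, 0.06, 0.03, 0.01)
)

toySummary <- toyBible %>%
  group_by(year) %>%
  summarise(
    totalN   = sum(n),     # total births per year
    meanProp = mean(prop), # average proportion per name row
    sumProp  = sum(prop)   # combined proportion of all rows
  ) %>%
  # Turn the numeric year into a Date on 1 January, as in the script above
  mutate(yearDate = ymd(year, truncated = 2L))

toySummary
```

In this toy data both years have the same combined share, yet the mean is lower in 1881 because three names share it instead of two.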

Results

Annotations have been added to show historical context. There is no proven correlation between the events and the data. Source: Association of Religion Data Archives: Prominent Religious Events and People in American History.


All Biblical Names

Sum of All Biblical Names by Year


The frequency of births with biblical names has generally risen since 1880, though this is likely due to population growth. Note the significant drop in frequency after the end of the Cold War, possibly due to religious or cultural disillusionment.


The Proportion of All Biblical Names by Year


The proportion of births with biblical names relative to all births has decreased over time. This graph seems to follow the inverse of the trend in the sum graph. The line follows a steady trend, except for a dip from 1910 to 1950. This dip and the spike in the sum graph seem to coincide, which suggests that the spike is very likely due to population growth.


Separate Data by Testament

Now that we've established that biblical names have decreased over time as a proportion of births, and more recently in the number of births as well, let's find out whether that applies to New Testament names, Old Testament names, or both.


These graphs are similar to those in the previous section, with an added variable for testament origin. New Testament names are the red line, while Old Testament names are the blue line. Annotations have been removed for readability.
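
As a rough sketch of how a two-line chart like the one described could be drawn with ggplot2, the example below maps testament origin to line colour. It is not the report's own plotting code: the values in toyTestament are made up, and the names toyTestament and testamentPlot, the axis labels, and the exact colours are assumptions for illustration.

```r
library(ggplot2)

# Made-up values purely for illustration; not taken from the report's data
toyTestament <- data.frame(
  year = as.Date(rep(c("1880-01-01", "1950-01-01", "2017-01-01"), times = 2)),
  Testament = rep(c("New", "Old"), each = 3),
  n = c(100, 400, 150, 90, 250, 140)
)

# One line per testament, red for New and blue for Old
testamentPlot <- ggplot(toyTestament, aes(x = year, y = n, colour = Testament)) +
  geom_line() +
  scale_colour_manual(values = c(New = "red", Old = "blue")) +
  labs(x = "Year", y = "Births", colour = "Testament")

testamentPlot
```

The same structure would apply to the proportion data by swapping the y variable.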

Sum of Biblical Names by Year by Testament


Both lines begin close together and remain so from 1880 to 1910, when a spike in New Testament names causes them to diverge. The two then follow similar trends until the 1990s, when they start to converge, finally meeting in the 2000s.


Proportion of Biblical Names by Year by Testament


Unlike in the previous graph, the two lines begin with a significant difference between their proportions. They then follow their respective trends until they converge in the 2000s. From 1880 to 2017, the proportion of New Testament names falls by ~0.5%, while the proportion of Old Testament names falls by only ~0.05%.


Most Popular Names by Gender

Compile two data frames, one by frequency and one by proportion, each containing the ten most popular names for each gender across all years combined.

Code

# Frequency
babynamesBible %>% 
  group_by(Name, Gender) %>%
  summarize(total = sum(n)) %>%
  filter(Gender == "m") %>% 
  arrange(desc(total)) %>% 
  head(10) -> popNamesSumM

babynamesBible %>% 
  group_by(Name, Gender) %>%
  summarize(total = sum(n)) %>%
  filter(Gender == "f") %>% 
  arrange(desc(total)) %>% 
  head(10) -> popNamesSumF

rbind(popNamesSumM, popNamesSumF) -> popNamesSumMerge

# Proportion
babynamesBible %>% 
  group_by(Name, Gender) %>%
  summarize(total = mean(prop)) %>%
  filter(Gender == "m") %>% 
  arrange(desc(total)) %>% 
  head(10) -> popNamesPropM

babynamesBible %>% 
  group_by(Name, Gender) %>%
  summarize(total = mean(prop)) %>%
  filter(Gender == "f") %>% 
  arrange(desc(total)) %>% 
  head(10) -> popNamesPropF

rbind(popNamesPropM, popNamesPropF) -> popNamesPropMerge
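
The four near-identical pipelines above could also be collapsed into one per measure by grouping on gender and taking the top rows within each group. The sketch below shows this alternative on made-up data using dplyr's slice_max(); it is a swapped-in technique rather than the code used for the report, and the toy values and the names toyNames and topNames are assumptions for illustration. It takes the top two names per gender instead of ten to keep the example small.

```r
library(dplyr)

# Made-up counts for illustration only; some names span two years
toyNames <- tibble(
  Name   = c("John", "John", "James", "Peter", "Mary", "Mary", "Sarah", "Ruth"),
  Gender = c("m", "m", "m", "m", "f", "f", "f", "f"),
  n      = c(30, 20, 40, 10, 35, 25, 30, 20)
)

topNames <- toyNames %>%
  group_by(Name, Gender) %>%
  summarise(total = sum(n), .groups = "drop") %>% # total per name across years
  group_by(Gender) %>%
  slice_max(total, n = 2, with_ties = FALSE) %>%  # top two names per gender
  ungroup()

topNames
```

Replacing sum(n) with mean(prop) gives the proportion version, and n = 10 reproduces the top-ten cut.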

Most Popular Names by Frequency


Most Popular Names by Proportion


Conclusion

The data support the hypothesis that the proportion of biblical names in America has decreased over time.


