Visualizing the Difficulty of Wordle through Twitter Data
- toldham2
- Feb 7, 2022
Updated: May 14, 2023
Collecting and Cleaning Twitter Data to Showcase the Difficulty of this Week's Wordle


For those of you living under rocks, Wordle is the universal word puzzle that's sweeping the internet. The game, recently purchased by the New York Times, gives players 6 chances to guess a 5-letter word, with hints coming only in the form of Battleship-esque colored blocks. The kicker? Everyone in the world gets the same word each day, so you can compare your score directly with your friends and family.

My favorite part, however, is the game's sharing feature. When you share your score, the game generates a pre-formatted message containing the game ID, your score, and an emoji grid of colored blocks that represents your guesses without revealing the answer.
As soon as I saw this, I knew the consistent formatting would make for a great dataset, and I was right!
I used the Twitter API along with the rtweet and tidyverse R packages to collect and clean the data. I visualized the data using Adobe Illustrator's built-in graph tool.
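To make the "consistent formatting" concrete, here is a minimal sketch of pulling the game ID and score out of a share message with a regular expression. The sample strings are made up for illustration (the emoji grid is left out), and the pattern assumes the message starts with a header like `Wordle 229 4/6`.

```r
# Hypothetical share texts; the emoji grid is omitted for brevity.
share_text <- c("Wordle 229 4/6", "Wordle 230 X/6", "Just finished today's puzzle!")

# Capture the game ID (digits) and the score (1-6, or X for a failed game).
pattern <- "^Wordle ([0-9]+) ([1-6X])/6"
parts <- regmatches(share_text, regexec(pattern, share_text))

# Texts that do not match yield zero-length results; keep only full matches.
matched <- parts[lengths(parts) == 3]

parsed <- data.frame(
  game_id = vapply(matched, function(p) p[2], character(1)),
  score   = vapply(matched, function(p) p[3], character(1)),
  stringsAsFactors = FALSE
)
print(parsed)
```

A pattern like this does not depend on the game ID having exactly three digits, which fixed character positions do.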
Code
#### Library ####
library(rtweet)
library(tidyverse)
#### Get Data ####
master <- data.frame()
Sys.sleep(900) # Wait 15 minutes so the API rate-limit window resets before collecting
for (i in 223:231) { # Collect and combine tweets for the past 9 games
  query <- paste0('"Wordle ', i, '"')
  message(paste0("\n", Sys.time(), ": Getting scores for game ", i, "..."))
  tweets <- search_tweets(query,
                          n = 1999,
                          include_rts = FALSE,
                          type = "recent")
  # Share messages start with "Wordle 223 4/6", so each field sits at a fixed position
  score <- str_sub(tweets$text, 12, 14)        # e.g. "4/6"
  actual_score <- str_sub(tweets$text, 12, 12) # e.g. "4" or "X"
  game_id <- str_sub(tweets$text, 8, 10)       # e.g. "223", without the trailing space
  date <- str_sub(tweets$created_at, 1, 10)    # YYYY-MM-DD, without the trailing space
  scores <- data.frame(tweets$screen_name, score, actual_score, date, game_id, tweets$lang)
  scores <- filter(scores, grepl("/", score))        # Keep only tweets with a score
  scores <- filter(scores, !grepl(" ", score))       # Drop malformed scores
  scores <- filter(scores, grepl("en", tweets.lang)) # Keep English-language tweets
  master <- rbind(master, scores)
}
#### Clean Data ####
master <- filter(master, grepl("1/6|2/6|3/6|4/6|5/6|6/6|X/6", score))
master$actual_score <- gsub("X", "7", master$actual_score)
master <- filter(master, grepl("223|224|225|226|227|228|229|230|231", game_id))
write.csv(master, "~/wordle_scores_223-231.csv")
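Before charting, the cleaned scores need to be tallied per game. The following is a rough sketch of one way to do that in base R, not necessarily how the figures for this post were prepared. The `master_example` data frame is a hypothetical stand-in with made-up rows, and treating a failed game as 7 guesses follows the `X` to `7` substitution above.

```r
# Hypothetical cleaned rows standing in for `master`; not real tweet data.
master_example <- data.frame(
  game_id      = c("223", "223", "223", "224", "224"),
  actual_score = c("3", "4", "7", "4", "4"),
  stringsAsFactors = FALSE
)

# Count how many players finished each game in each number of guesses (7 = failed).
score_counts <- as.data.frame(
  table(game_id = master_example$game_id, score = master_example$actual_score),
  stringsAsFactors = FALSE
)

# Average number of guesses per game, as one rough proxy for difficulty.
mean_score <- tapply(as.numeric(master_example$actual_score),
                     master_example$game_id,
                     mean)

print(score_counts)
print(mean_score)
```

The `score_counts` table has one row per game and score combination, which is a convenient shape to paste into a charting tool.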