BeyonTay - rhyming and combining Beyoncé and Taylor Swift lyrics

Luke Shaw

2020/12/31

Intro

For TidyTuesday 2020 week 40 I created code to randomly pair Beyoncé and Taylor Swift rhyming lyrics.

If you want to make your own BeyonTay verse, you need the following things:

  1. be willing to code in R

If the above criteria are satisfied, head over to the GitHub repo. Feel free to reach out if you have any issues or questions.

In the rest of this blog post I’ll talk about how I went about getting to this output, with some musings on the way. Each section will end with a BeyonTay verse to break up the text.

Wrangling

The raw data needed some playing with to make the different sets of Beyoncé and Taylor Swift lyrics comparable. I used the tidyverse flavour of R for this (code here), in keeping with TidyTuesday, but I won’t say much more than that as it’s not that interesting.

I was a bit slapdash with my data prep, but I think it’s important to note that that doesn’t matter because this is a silly for-fun project. The consequences of getting this wrong are… nothing? Maybe mild embarrassment at worst. The point being, it is OK to be a bit fast and carefree if the project is inconsequential, but our code can, and often does, have consequences so it is important to really think about the potential harm down the line once your code is deployed. Tom Scott explains this point far more eloquently.

The idea

I had this thought ‘how can I combine the data sets?’, which led to the thought ‘could I make a Beyoncé / Taylor Swift lyric set?’ which led to ‘could I find rhymes to make a BeyonTay verse?’

Not documented are the many other failed ideas I had. One example was asking ‘who uses words which don’t exist in the formal English language?’ but it didn’t really work. There’s always a lot of the iceberg you don’t see.

Finding rhymes

How can you tell if two words rhyme?

My first thought was to try matching the end of words, but then the English language isn’t that straightforward - we would miss that “time” and “rhyme” do rhyme, and incorrectly assert that “food” and “good” rhyme. As an aside, I wonder if the reason the Co-Op dropped their slogan “good with food” is because it reads like it should rhyme, but doesn’t.
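To see why spelling alone falls short, here is a minimal Python sketch (not the R code from the repo) that treats two words as rhyming when their last three letters match. The function name `spelling_rhyme` and the choice of three letters are assumptions made purely for illustration.

```python
def spelling_rhyme(word_a: str, word_b: str, n: int = 3) -> bool:
    """Naive check: do the last n letters of the two words match?"""
    return word_a.lower()[-n:] == word_b.lower()[-n:]


# "time" and "rhyme" rhyme when spoken, but their spellings disagree.
print(spelling_rhyme("time", "rhyme"))  # False
# "food" and "good" share an ending on paper, but not out loud.
print(spelling_rhyme("food", "good"))   # True
```

Both answers are the wrong way round, which is what pushes the search towards how words sound rather than how they are spelled.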

Any other ideas?

Aha! The International Phonetic Alphabet (IPA)! It tells you how words sound, which is what we’re after really. It turns out that if you want a good API, where you send an English word and it returns the IPA, you have to spend money. It’s one of the services the big dictionary companies provide now. Ah well. Good for them, I guess.

As an alternative to an IPA API, I found this site phonetizer which has a user interface for inputting English and outputting IPA.

I did some real code bodging at this step (for a bodging explanation see this Tom Scott video - yes I am aware I have referenced him twice now).

My Bodge Solution:

Visualising the answers

I decided that rhyming couplets were the way forward, and that it should always be a match across artists. The idea of a BeyonTay verse came quite naturally as a pair of rhyming couplets, so I used ggplot to visualise such a thing. I also limited the length of each line to be at most 40 characters, to stop them running off the right-hand side of the image.
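The selection rules above (lines of at most 40 characters, couplets that always cross artists) can be sketched in Python as follows. This is an illustration rather than the R code from the repo: the tuple layout, the function name `cross_artist_couplets`, the rhyme keys and the placeholder lines are all invented for the example and are not real lyrics.

```python
from itertools import product

MAX_LEN = 40  # character limit per line, as described above


def cross_artist_couplets(lines, max_len=MAX_LEN):
    """Return (text_a, text_b) pairs that share a rhyme key across artists.

    `lines` is a list of (artist, text, rhyme_key) tuples; this layout is
    an assumption for the sketch, not the structure used in the repo.
    """
    short = [row for row in lines if len(row[1]) <= max_len]
    couplets = []
    for (artist_a, text_a, key_a), (artist_b, text_b, key_b) in product(short, short):
        # Keep each cross-artist pair once, with a fixed artist order.
        if artist_a < artist_b and key_a == key_b:
            couplets.append((text_a, text_b))
    return couplets


# Invented placeholder lines and rhyme keys, not real lyrics.
lines = [
    ("Beyoncé", "placeholder line that ends with time", "aɪm"),
    ("Taylor Swift", "another placeholder ending in rhyme", "aɪm"),
    ("Taylor Swift", "a placeholder line that is far too long to fit on the image at all", "aɪm"),
    ("Beyoncé", "placeholder ending in good", "ʊd"),
]

couplets = cross_artist_couplets(lines)
print(couplets)
```

Only the first two lines survive: the third is over the length limit and the fourth has no partner from the other artist. A verse would then be two couplets drawn at random from such a list.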

Issues with the results

There are a few problems with my approach.

Firstly, the last 3 letters of IPA aren’t always sufficient. For example, “don’t” and “amount” are given as valid rhymes via this method because their IPAs are “dəʊnt” and “əˈmaʊnt”. Damn. I can’t see a sensible way round this problem (as you probably have worked out by now, I ain’t no linguist). Another example is in the above image; “running” and “drowning”.
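The false positive is easy to reproduce. Below is a small Python sketch of the last-3-characters rule, using the two IPA strings quoted above; the function name `ipa_rhyme` is an assumption, and the original project implements the idea in R.

```python
def ipa_rhyme(ipa_a: str, ipa_b: str, n: int = 3) -> bool:
    """Treat two words as rhyming when their last n IPA characters match."""
    return ipa_a[-n:] == ipa_b[-n:]


dont = "dəʊnt"      # IPA for "don't", as quoted in the text
amount = "əˈmaʊnt"  # IPA for "amount", as quoted in the text

print(dont[-3:], amount[-3:])   # ʊnt ʊnt
print(ipa_rhyme(dont, amount))  # True, although the words don't rhyme
```

The vowel sounds differ ("əʊ" versus "aʊ"), but the three-character window only sees the second half of each, so the pair slips through.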

Secondly, a vast number of words are totally ignored. For example, the two most common line-ending words are “you” and “me”, and they account for 7.6% of all the line enders (2260 out of 29571). Yet neither of them has a rhyme pair in the data set. Why? Because they are short words made up of fewer than 4 IPA characters, so the entire word is taken as the match for the rhyme. So “me” and “be” don’t rhyme under my definition of matching the last 3 IPA characters. Dangit.
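A short Python sketch makes the short-word problem concrete. The IPA strings for “me” and “be” below are assumed transcriptions rather than values taken from the project's data, and `rhyme_key` is a hypothetical helper name.

```python
def rhyme_key(ipa: str, n: int = 3) -> str:
    """Last n IPA characters; for shorter words this is the whole word."""
    return ipa[-n:]


# Assumed IPA transcriptions, not taken from the post's data.
me, be = "miː", "biː"

print(rhyme_key(me), rhyme_key(be))    # miː biː
print(rhyme_key(me) == rhyme_key(be))  # False: the first consonant is included

# Share of line endings that are "you" or "me", using the counts quoted above.
share = 2260 / 29571
print(f"{share:.1%}")  # 7.6%
```

Because the key for a three-character word is the word itself, it can only ever match an identical word, which is why such common endings never find a partner.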

Conclusion

This is stupid, but fun.

My favourite rhyming couplet to come out of it was this:

And you don’t know what you don’t know

I threw myself into a volcano

Pure poetry from BeyonTay.