The Data Science of Cars and Rap Lyrics

Cars in Rap Lyrics – A Sweet Data Visualization

Rap artists are notorious for rapping about which cars are the most beloved. However, they often do this without the research to back it up. As a result, an author at Cuepoint decided to analyze all of the lyrics on Rap Genius – a resource for crowd-sourced annotations of rap lyrics – and presented his findings in several ways. He showed how many cars are mentioned in tracks, which kinds, how often, and other interesting trends.

Using a horizontal bar graph, the author showed that the most frequently mentioned car make is Mercedes-Benz. The data was then broken down by model: the most mentioned model is the 1964 Chevy Impala.
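To give a feel for how a count like this might be produced, here is a minimal Python sketch. It is not the Cuepoint author's method: the alias table, the `count_makes` helper, and the sample lines are all made up for illustration, and it assumes the tally is "songs that mention a make at least once."

```python
import re
from collections import Counter

# Hypothetical alias table: each make maps to words that signal it.
# (Model names such as "impala" are folded into their make.)
MAKE_ALIASES = {
    "Mercedes-Benz": ["mercedes", "benz"],
    "Chevrolet": ["chevrolet", "chevy", "impala"],
    "Cadillac": ["cadillac", "caddy"],
}

def count_makes(songs):
    """Count how many songs mention each make at least once."""
    counts = Counter()
    for lyrics in songs:
        # Lowercase and split into words so "Benz," matches "benz".
        words = set(re.findall(r"[a-z]+", lyrics.lower()))
        for make, aliases in MAKE_ALIASES.items():
            if words & set(aliases):
                counts[make] += 1
    return counts

# Placeholder lines written for this sketch, not real lyrics.
sample_songs = [
    "rolling in my benz all day",
    "six four impala on switches",
    "chevy in the driveway benz in the garage",
]

# Print a crude text version of a horizontal bar graph.
for make, n in count_makes(sample_songs).most_common():
    print(f"{make:15} {'#' * n} {n}")
```

A real analysis would need a much longer alias list and some care with slang and ambiguous words, which is where most of the work would likely go.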

A time series chart shows how the popularity of certain makes and models changed over time. The chart led to some noteworthy observations. For example, during the economic downturn in 2008, the popularity of less expensive brands increased.

Finally, a sorted table showed which artists referenced cars most frequently. The top artists were The Game, with 279 car songs (64% of all of his songs), and Gucci Mane, with 309 car songs (50% of all of his songs).
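Those two figures show why the sort column matters: one artist leads on raw count, the other on share. The short Python sketch below uses only the numbers quoted above. The helper names are made up for illustration, and the implied song totals are rough estimates because the percentages are rounded.

```python
# Figures quoted in the article: (car songs, share of the artist's songs).
artists = {
    "The Game": (279, 0.64),
    "Gucci Mane": (309, 0.50),
}

def implied_total(car_songs, share):
    """Estimate total songs from a car-song count and a rounded share."""
    return round(car_songs / share)

def rank(by):
    """Return artist names sorted from highest to lowest on one metric."""
    index = {"count": 0, "share": 1}[by]
    return sorted(artists, key=lambda name: artists[name][index], reverse=True)

for name, (car_songs, share) in artists.items():
    print(name, "has roughly", implied_total(car_songs, share), "songs in total")

print("By count:", rank("count"))
print("By share:", rank("share"))
```

Sorting by count puts Gucci Mane first, while sorting by share puts The Game first, so "top artist" depends on which question is being asked.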

Cars and rap lyrics are like peanut butter and jelly – they just go together. By counting and charting the references, we can see that relationship in the data instead of guessing at it. This kind of analysis lets us look further into rap and identify the valuable, less obvious trends.

What is Data Science?

Data science refers to the tools and methods used to analyze large amounts of data. It is also known as knowledge discovery and data mining. Many academics and journalists see no distinction between data science and statistics.

Data science employs techniques and theories drawn from many fields within the broad areas of physics, robotics, mathematics, statistics, information theory and information technology, including signal processing, probability models, machine learning, statistical learning, data mining, data engineering, pattern recognition and learning, visualization, predictive analytics, uncertainty modeling, data warehousing, data compression, computer programming, and high performance computing.

Data science techniques affect research in many domains, including the biological sciences, health care, social sciences and the humanities. They also heavily influence economics, business and finance.

Visual representations of data, such as the ones showing the relationship between cars and rap, reveal trends in a clear and easy-to-understand manner. Visualizing data is a great way to make numbers, values, and what they mean easier to process for those without a deep understanding of statistics. Data science does not need to be extremely advanced or applied to something “boring.” In fact, looking at unconventional variables often provides the most interesting insight into unexpected matters.