Music Metadata: A Case Study Through Genre
Keeping metadata accurate and consistent is one of the major challenges of publishing and maintaining a substantial music catalog. Here we look at common metadata issues in classical music, hip hop, and EDM, three genres that may have more in common than one might think.
Guest Post by MediaNet Content Manager Amy Vandergon on Medium
Here in MediaNet’s Content department, we spend a lot of time staring at metadata tags. We have become metadata whisperers, noticing genre-wide trends, peculiarities, and common mistakes. We nurture any problem data, and once it’s fixed we release it into our catalog. Over time we have noticed three genres in particular that often need data intervention: classical, hip hop, and electronic dance music (EDM).
Beyond the metadata similarities, these three genres are all quite different. However, they do share some additional common characteristics, including a heavy reliance on patterns, multi-movement works, and the integration of dance.
Patterns are important in every genre, but perhaps more so for these three. Classical music laid the groundwork for tonality, or the patterns of pitches our ears expect. Baroque fugues, for example, contain tightly repeated harmonic and melodic sequences. In both hip hop and EDM, patterns manifest through repetitive samples and beat structures.
All three genres have many multi-movement works. Just as any classical symphony should be listened to in its entirety, so too should Kanye West’s The College Dropout or Daft Punk’s Random Access Memories (or any live DJ set, for that matter).
Unique styles of dance, such as the minuet and waltz, were popularized through the classical music tradition. Hip hop has spawned a wide variety of dance styles, including breaking and krumping. An entire subculture of dance has been created through the popularization of EDM.
Even more important than the musical similarities of these genres is the ability of each to enact social change, becoming a voice and distraction for the oppressed.
Olivier Messiaen wrote and premiered his Quatuor pour la fin du temps, inspired by the Book of Revelation, while imprisoned in a German POW camp. N.W.A.’s Straight Outta Compton highlighted the poverty, drug abuse, and police brutality that continue to run rampant in that city. EDM is a direct descendant of disco, which began in jazz halls in Occupied France that were only allowed to play recorded music. Disco and early EDM developed largely through the work of homosexual, black, female, and Latino communities — groups which have a history of devalued cultural contributions.
Now let’s look at the practical problems these three genres pose when it comes to metadata tags.
- Classical music has several contributor options (composer, conductor, performer, etc.) and multiple accepted spellings of composers’ names (like Stravinsky/Strawinski/Strawinsky or Schoenberg/Schönberg).
- Artists in hip hop and EDM often change spellings or have multiple variants (like Jay Z/Jay-Z or Puffy/P. Diddy/Puff Daddy).
- EDM artists often get credited on their own or as part of collaborations (e.g. Axwell, Sebastian Ingrosso, and Axwell Λ Ingrosso).
When it comes to artist names, accuracy is important in all genres. Should the album be attributed to “Drake” or “Nick Drake”? Should the artist name be “Sammy” or “DJ Sammy”? For classical music, the preferred format is typically “First Last.” “André Previn” is correct, rather than “A. Previn,” “Previn, André,” or “Previn.” In some cases, special characters are necessary (e.g. “Béla Bartók” instead of “Bela Bartok”).
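To make the formatting rules above concrete, here is a minimal Python sketch of how name variants might be flagged as likely duplicates. It is an illustration only, not a description of MediaNet’s actual tooling. The function name `match_key` is hypothetical, and the key it returns is meant for comparison, never for display: the displayed name should keep its accents and spacing.

```python
import unicodedata


def match_key(name: str) -> str:
    """Return a loose comparison key for an artist name (hypothetical helper)."""
    # Crude heuristic: treat a single comma as "Last, First" and reorder it.
    # This would misfire on group names that contain one comma, so matches
    # should be treated as candidates for review, not as automatic merges.
    if name.count(",") == 1:
        last, first = (part.strip() for part in name.split(","))
        name = f"{first} {last}"
    # Decompose accented characters, then drop the combining accent marks.
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Ignore letter case and all whitespace.
    return "".join(stripped.casefold().split())
```

Under these assumptions, “Previn, André,” “AndréPrevin,” and “André Previn” all produce the same key, as do “Bela Bartok” and “Béla Bartók.” “A. Previn” and “Previn” do not, and neither do “Drake” and “Nick Drake,” which is why a person still has to research the harder cases.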
What would happen if we didn’t intervene? Every time an album is submitted with a mistake in the artist name, that album will not show up in a search for the correct artist name. Let’s use the example of André Previn. Here are some of the name variations that have made their way into our system:
To fix this issue, we look at which spelling has the most data associated with it in our system and compare it with additional research. Our research confirmed that the proper, accepted spelling is “André Previn.” The other records were created automatically from incorrect metadata. As you can see, using the proper é character is important here, as are spelling, formatting, and spacing (poor “AndréPrevin” is afflicted with a missing space).
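The first half of that process, finding the spelling with the most data behind it, could be sketched as follows. This assumes album counts per spelling are available as a simple mapping. The function name `suggest_canonical` and the shape of the input are hypothetical, and the result is only a suggestion to check against research, not a final answer.

```python
def suggest_canonical(album_counts: dict[str, int]) -> str:
    """Return the spelling with the most associated albums (hypothetical helper).

    If several spellings tie, the one listed first in the mapping is returned.
    """
    if not album_counts:
        raise ValueError("no spellings to compare")
    return max(album_counts, key=album_counts.get)
```

The most common spelling is not always the correct one, so the research step described above remains the deciding factor.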
If a label were to submit an André Previn album under the name “Andri Previn,” it would prove difficult for a listener to find. The album would not show up under the accepted “André Previn” name. We merge these entries so the associated albums show up under “André Previn.” After we correct and merge this metadata, our database automatically reassigns anything further submitted under the names above to “André Previn.”
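The merge and the automatic reassignment that follows can be pictured as an alias table. The sketch below is a simplified illustration under that assumption. The class `ArtistAliases` and its methods are hypothetical names, not MediaNet’s actual database design.

```python
class ArtistAliases:
    """Maps known misspellings to an accepted artist name (hypothetical sketch)."""

    def __init__(self) -> None:
        self._canonical_for: dict[str, str] = {}

    def merge(self, variant: str, canonical: str) -> None:
        """Record that `variant` should be filed under `canonical`."""
        # Repoint any names previously merged into `variant`, so that
        # lookups never have to follow a chain of aliases.
        for alias, target in list(self._canonical_for.items()):
            if target == variant:
                self._canonical_for[alias] = canonical
        self._canonical_for[variant] = canonical

    def resolve(self, submitted: str) -> str:
        """Return the accepted name for a submission, or the name unchanged."""
        return self._canonical_for.get(submitted, submitted)


aliases = ArtistAliases()
aliases.merge("Andri Previn", "André Previn")
aliases.merge("AndréPrevin", "André Previn")
```

With the two merges above in place, `aliases.resolve("Andri Previn")` returns “André Previn,” while a name that has never been merged is passed through unchanged and would still need a person to look at it.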
Our Content Operations Team repairs these inconsistencies every day. With more than 25,000 new artists submitted to our catalog each month, you can see how valuable proper submission is to database integrity. Every incorrectly submitted new artist must be repaired by hand. As Benjamin Franklin said, “An ounce of prevention is worth a pound of cure.”
Accurate data is paramount to good systems and great user experience. We know how much work goes into preparing, recording, and distributing an album. We all care greatly that it is represented accurately in our system. Accurate metadata submission means users will find the albums they want to hear, and that means more earnings for rights holders. Data submitted to us without the need for intervention makes this process go much more quickly. In short, proper submission = better, faster returns.