Setting Genre en masse (an idea)

Tell me if there's a better way to go about this...

I've already put together a little code to pull genres from two sources: iTunes and Last.fm. The plan is to merge the results and keep the most popular two (or three, or whatever), though this part hasn't been done yet.

Initially, I dumped my song list to a file and wrote something to read that file and query the iTunes API for each artist/song. I wrote the results to an output file with the following pipe-delimited columns:

Artist|Title|PathFilename|Year|NewYear|Genre|NewGenre

The idea being, I could then use MP3Tag (or some other tool) to read that file, match on PathFilename, and determine whether to use NewGenre, keep the original (Genre), or somehow blend the two.
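For what it's worth, here is a minimal Python sketch of writing and reading that pipe-delimited file. The function names (`write_rows`, `read_rows`, `choose_genre`) are made up for illustration, and the "prefer NewGenre if present" rule is just one assumed policy, not necessarily the one to settle on.

```python
import csv
import io

# Column order taken from the header line in the post.
FIELDS = ["Artist", "Title", "PathFilename", "Year", "NewYear", "Genre", "NewGenre"]

def write_rows(handle, rows):
    """Write dict rows as pipe-delimited text with a header line."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS, delimiter="|", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)

def read_rows(handle):
    """Read the file back into a dict keyed on PathFilename."""
    return {row["PathFilename"]: row for row in csv.DictReader(handle, delimiter="|")}

def choose_genre(row):
    """Assumed policy: use NewGenre when a source supplied one, else keep Genre."""
    return row["NewGenre"] or row["Genre"]
```

Using the `csv` module rather than joining strings by hand means a title that happens to contain a `|` gets quoted instead of breaking the columns.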

I'm aware of Convert > Text file - Tag and considered using that to accomplish this, but before spending much more time on it, I wanted to hear what other ideas people had.

The key points here are:

  • iTunes doesn't have every song
  • Last.fm doesn't have every song
  • I'd like to do this at an artist/track level, not an artist level
  • Some "filtering" will need to be done, as Last.fm sometimes provides an artist's name as a genre
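
To make the merging and filtering concrete, here is a rough Python sketch. It assumes each source has already been reduced to a plain list of genre strings for one track; `merge_genres` and its parameters are hypothetical names, and the only filter applied is dropping a tag that equals the artist's name.

```python
from collections import Counter

def merge_genres(artist, sources, keep=2):
    """Merge per-source genre lists and return the `keep` most common.

    `sources` is a list of lists of genre strings, one list per source.
    Tags equal to the artist's name are dropped (case-insensitive), and
    each source gets at most one vote per genre.
    """
    artist_key = artist.strip().casefold()
    counts = Counter()
    display = {}  # first spelling seen for each normalized genre
    for tags in sources:
        seen = set()
        for tag in tags:
            key = tag.strip().casefold()
            if not key or key == artist_key or key in seen:
                continue
            seen.add(key)
            counts[key] += 1
            display.setdefault(key, tag.strip())
    return [display[key] for key, _ in counts.most_common(keep)]
```

A source that has no data for a track simply contributes an empty list, so missing songs on iTunes or Last.fm do not need special handling here.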

I would think that if a source supplies no data for a file, the record would either be empty or keep the data it had before, so no harm is done.
Any comparison between fields can be done in Mp3tag. If you write the new data to a user-defined field, the original data stays in the original field until you decide to overwrite it.
I wonder why the whole effort revolves around GENRE, as it is one of the least reliable pieces of data and depends a lot on taste.
What kind of suggestions were you hoping to hear?

I was hoping to get suggestions on ways to update the tags once I have the data, and of course, suggestions on apps/scripts/etc. that may already be available to do what I'm after.

My thought is that it would be beneficial to hit 3 or 4 sources, and assume the most popular data is the most accurate. So, with four sources, if 3 say a song is rap, and one says hip-hop, go with rap. I realize that's simplistic, and it could be two & two, but the point being, assume the popular opinion is the right one.
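
The "popular opinion wins" rule could be sketched like this in Python. The function name `vote` is made up, and returning `None` on a tie (the two & two case) is an assumption so that those tracks can be reviewed by hand rather than guessed.

```python
from collections import Counter

def vote(genres):
    """Return the most common genre, or None on a tie or when there are no votes."""
    counts = Counter(g.strip().casefold() for g in genres if g and g.strip())
    ranked = counts.most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie at the top, e.g. two sources each way
    return ranked[0][0]
```

With four sources where three say rap and one says hip-hop, `vote(["Rap", "rap", "Rap", "Hip-Hop"])` gives `"rap"`; a two-and-two split gives `None`.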

One thought I have is to write something that will:

  • navigate each directory (Artist\Album[CDx]\Track - Artist - Title.ext)
  • make the API calls (to multiple sources) for each track
  • write a file to that directory that could be used with MP3Tag Text File -> Tag
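
A bare-bones Python sketch of those three steps might look like the following. Everything named here is an assumption for illustration: `lookup_genre` is a placeholder for the real API calls, `genres.txt` is an arbitrary file name, and the `filename|genre` line layout would need a matching format string in Mp3tag. How Mp3tag pairs lines in the text file with files in its list should be checked against its documentation before relying on this.

```python
import os

AUDIO_EXTENSIONS = {".mp3", ".flac", ".m4a"}  # assumption: adjust to your library

def lookup_genre(path):
    """Hypothetical placeholder: replace with the real API calls and merging."""
    return ""

def write_genre_files(root, lookup=lookup_genre, out_name="genres.txt"):
    """Write one `filename|genre` file into each directory that holds audio files."""
    written = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Sort so the line order is predictable within each folder.
        tracks = sorted(f for f in filenames
                        if os.path.splitext(f)[1].lower() in AUDIO_EXTENSIONS)
        if not tracks:
            continue
        out_path = os.path.join(dirpath, out_name)
        with open(out_path, "w", encoding="utf-8") as handle:
            for name in tracks:
                genre = lookup(os.path.join(dirpath, name))
                handle.write(f"{name}|{genre}\n")
        written.append(out_path)
    return written
```

Passing the lookup in as a parameter keeps the directory walk testable without hitting any API.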

The assumption being that I could drag a lot of folders into MP3Tag, highlight them, run that operation, they would pull from the file in their folder, and I could undo things if they looked strange.

Or is there a better way to do this? It might make sense to say that I'm really only interested in a handful of genres (country, rap, rock, classical, holiday, etc.). I'd rather have 7 or 8 unique genres than 30 or 40.
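
Collapsing whatever the sources return into a short list could be as simple as a lookup table. This is only a sketch: the entries in `CANONICAL` are illustrative guesses (for example, folding hip-hop into Rap), and the `"Other"` fallback is an assumption.

```python
# Illustrative mapping from source tags (lowercased) to a small genre set.
CANONICAL = {
    "rap": "Rap", "hip-hop": "Rap", "hip hop": "Rap",
    "country": "Country", "alt-country": "Country",
    "rock": "Rock", "classic rock": "Rock", "hard rock": "Rock",
    "classical": "Classical",
    "holiday": "Holiday", "christmas": "Holiday",
}

def canonical_genre(raw, default="Other"):
    """Map a raw genre tag to one of a handful of genres, else `default`."""
    return CANONICAL.get(raw.strip().casefold(), default)
```

Running this before counting votes means "rap" and "hip-hop" from different sources would count as the same answer.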