Advice: Massive Grateful Dead Metadata Project

MP3TAG Community,
Hey now! New member. My first post. I want to outline a massive project I am embarking upon.

Background/Context:
In their 30+ year career, the Grateful Dead played over 2,300 live concerts. They permitted live recording and tape trading, which eventually proved to be a massive factor in driving their popularity. Those recordings have now, of course, been digitized and continue to be traded among fans. As one might expect, it creates a huge opportunity for inconsistency in file structure, naming, and tagging.

My collection of live Grateful Dead music:

  • 3.7 TB
  • 4,743 folders
  • 123,807 files

My Project:
To 'normalize' the filename, directory name, and tags in the collection so my player software (ROON, JRiver, etc.) can process and interpret the files consistently.

Step #1 -
The first thing I did was to get each concert directory (i.e. - album) out of various sub-directories (E:\Grateful Dead\GD 78-85\1982) and loaded into the same root directory (E:\Grateful Dead)

Step #2 -
I've done a 'rough' job of then separating those folders into directories based on the directory naming convention. For example...

Note: 89% of files are in that top folder, using the following directory naming convention:
gdYY-MM-DD.notes.notes.filetype

Examples:

So what you see here is something like:
gdYY-MM-DD.notes.extension

Example #1:
"gd89-10-15.aud.senn441s+nak300.alabamabob-currier.keo.120533.flac2496f" could translate into:

Artist: Grateful Dead
Year: 1989
Month: 10
Day: 15
Source Notes: "Audience recording. Senn441s+nak300 (recording equipment). Recorded by Alabama Bob-Currier. Keo.120533 (unsure of meaning)"
File Type: .flac2496f

Example #2:
"gd80-05-xx.sbd.12259.shnf" could translate into:
Artist: Grateful Dead
Year: 1980
Month: 05
Day: unknown
Source Notes: Soundboard recording (=sbd). 12259 (unsure of meaning)
File Type: .shnf

Notes:

  1. Sometimes (rarely) the dates are not complete, like "gd80.05.xx....flac" (Unknown day in May, 1980)

  2. Those "source notes" usually start with one piece of important information: is it an audience microphone recording (aud) or a soundboard recording (sbd)? After that there can be one to four additional notes, separated by periods, which usually refer to the provenance (equipment used, taper name, etc.). It is not critical that this "Notes" information be separated out into different tags. It could all live in a single "notes" field.

BOTTOM LINE:
Important tag information is in the directory name.
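
As an illustration of the convention described above, here is a minimal Python sketch of how such a directory name could be split into fields outside of Mp3tag. It is not anything Mp3tag provides. The function name `parse_show_dir`, the field names, and the choice to treat the last period-separated segment as the file type are assumptions made for this example, and the century is assumed to be 19xx because the band's touring years fall in the 1900s.

```python
import re
from typing import Optional

# Assumed convention: gdYY-MM-DD.<notes...>.<filetype>
# "xx" stands for an unknown month or day; "-" or "." may separate date parts.
DIR_PATTERN = re.compile(
    r"^gd(?P<yy>\d{2})[-.](?P<mm>\d{2}|xx)[-.](?P<dd>\d{2}|xx)\.(?P<rest>.+)$",
    re.IGNORECASE,
)

SOURCES = {"aud": "audience", "sbd": "soundboard"}


def _known(value: str) -> Optional[str]:
    """Return None for the 'xx' placeholder, otherwise the value itself."""
    return None if value.lower() == "xx" else value


def parse_show_dir(name: str) -> Optional[dict]:
    """Split a directory name into fields, or return None if it does not match."""
    m = DIR_PATTERN.match(name)
    if m is None:
        return None
    # Assumption: the last segment is the file type, everything before it is notes.
    *notes, filetype = m.group("rest").split(".")
    first_note = notes[0].lower() if notes else ""
    return {
        "artist": "Grateful Dead",
        "year": 1900 + int(m.group("yy")),  # assumption: all shows are 19xx
        "month": _known(m.group("mm")),
        "day": _known(m.group("dd")),
        "source": SOURCES.get(first_note, "unknown"),
        "notes": ".".join(notes),  # kept together in a single field
        "filetype": filetype,
    }
```

Names that do not follow the convention return `None`, so they can be collected separately and handled in a later batch.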

Next...
TRACKS:
Things get more interesting. Within the folders, the file naming conventions are inconsistent. See screenshot:

BOTTOM LINE:
Important tag information is in the filename, but those filename conventions are wildly inconsistent.

QUESTIONS:

  1. When browsing files, I can display the Path, but cannot seem to get a column to display the Parent. See screenshot...

  2. Workflow Advice?
    Obviously, this is a big project. I am really not sure how to approach it or get started. I am new to mp3tag and have a lot to learn about the software. I ASSUME that I am going to have to approach this in batches, based on various naming conventions in the library. That raises another question...

  3. ...How do I discover, separate, and organize this work, based on the underlying file naming conventions used?

  4. Other Advice?
    Help! I've never tackled anything like this before. My hope is that some of you out there enjoy these challenges, recognize the enormity of this one, and will give me a hand (regardless of whether you like the Grateful Dead!)

Thanks,
Bart

You can retrieve data from the filename (and that includes the path) with Converter>Filename-Tag.
The Converter only works if the pattern that you supply matches the pattern in the (complete) filename. Otherwise no data is transferred.

There is a fairly sophisticated filter function in MP3tag that lets you use plain words or more complicated patterns.
I would use that to narrow down the files that have a common pattern in the filename and that can be treated by the same converter mask.

You would definitely have to work in steps to treat all the variations.
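
To get a first overview of which patterns exist before filtering in MP3tag, one option is to reduce every name to its rough "shape" and count how often each shape occurs. The Python sketch below is only an illustration and not part of MP3tag: the helper names `name_shape` and `group_by_shape` are made up for this example, and the sample filenames in the usage note are hypothetical, not taken from the collection.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List


def name_shape(name: str) -> str:
    """Collapse a name to its shape: digit runs become '9', letter runs become 'a'."""
    shape = re.sub(r"\d+", "9", name)
    shape = re.sub(r"[A-Za-z]+", "a", shape)
    return shape


def group_by_shape(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group names that share the same shape, keeping the input order."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        groups[name_shape(name)].append(name)
    return dict(groups)


def print_summary(groups: Dict[str, List[str]]) -> None:
    """Print shapes from most to least common, with one sample name each."""
    for shape, members in sorted(groups.items(), key=lambda kv: -len(kv[1])):
        print(f"{len(members):6d}  {shape}  e.g. {members[0]}")
```

For example, the hypothetical names `gd89-10-15d1t01.flac` and `gd77-05-08d1t02.shn` both collapse to the shape `a9-9-9a9a9.a`, while `01 Bertha.flac` collapses to `9 a.a`. Each large group is a candidate for one filter expression and one converter mask.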

I would also first get the raw data into clearly identifiable fields, even if the data is not yet quite up to standard, and then clean up these fields with dedicated actions. I think that is easier with structured data in fields than with unstructured data in the filename string.
I strongly recommend a look at the FAQs on the converter function.
Once you get some data into the fields or fail to get any at all, come back here and ask specific questions.

With respect to working in "batches"...

"...narrow down the files that have a common pattern in the filename and that can be treated by the same converter mask."

So, different patterns will require different batches. I understand this.

But what about batch SIZE? If I can identify 100,000 file names that match a pattern, can the program work at that scale? Or would that invite crashes and corruption? Do I need to batch the work for the sake of processing efficiency?

There is the library function in Tools>Options>Library which should be switched on if you deal with so many files. The library is only for internal purposes.

In general, I would always advise to use a backup, just in case.
So the procedure would be that you load all files and then filter them in MP3tag. I am optimistic that you don't run out of memory with 100,000 files.
Every function and action then treats each file individually - so you could interrupt the process. There is also an undo function for the current session. Don't know though, how many steps you can undo and how safe that is. It is better if you know what you are doing.

I don't have much help to offer, just wanted to say Hello from another Deadhead. My collection is smaller, but it's also MP3, not FLAC. About 70GB, maybe 250 different shows/folders. I have a few dozen boots, but mostly stick to official releases, since that's enough to keep my ears busy.

I did have to come up with my own TAGGING Rules, which I detailed in a text file since it can get complicated. Rules vary based on official releases vs bootlegs vs sub-groups (Jerry Garcia Band) vs Dicks & Dave's Picks, etc. I'm kinda OCD about tagging, as it sounds like you are.

Good luck to you!!

@astrohip Well maybe since I have the comprehensive collection of Hi Res,....and you have the experience with this software and are "OCD about tagging"...we can work out a partnership? Do I need to ship you a hard-drive!?!?! :wink:

Great idea, except I don't have enough years left to tag your massive collection! :rofl:

Any help on this question?

Create / Modify the column for the folder with
Value: %_directory%

Leave Field empty.

:raised_hands:
:clap:
:+1:

That worked. Thank you!

You may find this interesting...

I wish I would have noticed this earlier. I just completed a huge GD project containing over 12k songs at the end of 2021. Sucks, I could have given you a huge leg up. Unfortunately I am working on another huge project and cannot edit my whole style guide, but I can offer you the section relating to GD. I have attached the file. I hope it helps. When I get more of a chance I will read everything here and offer any additional assistance. Good luck, it is a labor of love! It took me almost a year to complete my project. Over 500 shows. I did a lot of research to find what were considered at least the top 10 shows per year, with many years containing over 20 shows.
GD Style Guide (Incomplete)

I would like to add the update that since 3.16 there is also a 64bit-version of MP3tag so that the 4-GB-Limitation should not apply any more and it should now be possible to treat the collection in 1 go.
For resource reasons, it is still useful to use the library function.

Wow, that is... comprehensive. And I thought my Tagging Rules were OCD. Yours is a work of art. I love it!

And my previously reported, just this past January, collection of 60GB/250 shows, is now up to 72GB and 331 shows (or more accurately, Folders).

PS: Y'all have Dave's Picks 42 yet, just released last month? Superb show.

It most certainly is very helpful! And YAY! for 64bit!!!!!

Not yet. But I will in the next month. Going to Virginia, Philly and the 2 NYC shows!

And you don't know the half of my OCD. That comes from a 48-page .doc just on tagging, which I have to revise yet again!!!
