Hello, I'm new to AI but have been a software developer for about 10 years.
I'm an older man, and I don't fully understand the capabilities of AI in its current state. My hard drive contains a massive collection of audiobooks, sorted in folders as "Author Name/Book Title.mp3", and a decent number of them are already tagged in some fashion. I want to enrich the metadata and transfer the files to a server with a browser-based UI (Plex, Jellyfin, etc.) for easier navigation and for displaying synopses. Getting to that point by hand is very labor-intensive.
In terms of how I do it manually: I've found a Plex Mp3tag workflow outlined on GitHub that involves running a search on Audible for the selected file, then renaming and moving the file to a specified location.
The first part of the author's method is to run a search type once, then use the hotkey CTRL + SHIFT + I on selected files, which reuses it as the default search method. Thanks to this forum, I'm using "Audible API -> Audible.com Search by Author + Title" as my default search method. I took that a step further and bound the hotkey to a macro key on my keyboard, G2.
The second part is "Rename and Move Files", which the author wrote a script for and mapped to ALT + A + 1. I changed the script to replace colons with spaces, then mapped it to a second macro key, G3.
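To make the colon substitution concrete, here is a rough Python sketch of the same idea. This is not the Mp3tag script itself; the function name `safe_filename` is made up for illustration, and collapsing the doubled spaces afterwards is my own addition.

```python
import re


def safe_filename(title: str) -> str:
    """Replace colons with spaces so the title is usable as a file name."""
    no_colons = title.replace(":", " ")
    # "Title: Subtitle" becomes "Title  Subtitle", so squeeze repeated spaces.
    return re.sub(r"\s+", " ", no_colons).strip()
```

For example, `safe_filename("Dune: Messiah")` gives `"Dune Messiah"`.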
My whiteboard concept for this is as follows: I have a structured folder of n size with child folders containing distinct mp3 files. Each file is one book.
- Perform the G2 search for each file, recursing through all child folders.
- If the search returns more than one result, try to narrow it down by filename. If more than one result still matches, try to narrow it down by duration. If more than one result still matches after that, do nothing.
- If no author or title is present in the metadata, search on Folder + Filename instead.
- If any of these steps ends with exactly one match, perform the G3 step on the file.
- Files that cannot be processed via automation are simply ignored, and I will manually match them. I'm not going to attempt to program audio analysis, not even sure if there'd be a reference library to match data against.
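The decision steps above can be sketched in plain Python, with no machine learning involved. This is only a sketch under assumptions: the `Candidate` shape, its field names, the 60-second duration tolerance, and the choice to skip a file when no result matches the filename are all mine, and the actual Audible search is left out entirely.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional
import re


@dataclass
class Candidate:
    """One search result. The field names are hypothetical."""
    title: str
    author: str
    duration_s: int


def normalize(text: str) -> str:
    """Lowercase and replace punctuation runs with spaces for loose comparison."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def build_query(path: Path, tag_author: Optional[str],
                tag_title: Optional[str]) -> tuple[str, str]:
    """Use the tags when both exist, otherwise fall back to Folder + Filename."""
    if tag_author and tag_title:
        return tag_author, tag_title
    return path.parent.name, path.stem


def pick_match(candidates: list[Candidate], path: Path,
               file_duration_s: int, tolerance_s: int = 60) -> Optional[Candidate]:
    """Return the single surviving candidate, or None to leave the file alone."""
    if len(candidates) == 1:
        return candidates[0]
    # Step 1: narrow by filename.
    wanted = normalize(path.stem)
    by_name = [c for c in candidates if normalize(c.title) == wanted]
    if len(by_name) <= 1:
        # Assumption: no filename match at all means "skip", not "try duration".
        return by_name[0] if by_name else None
    # Step 2: still ambiguous, so narrow by duration within a tolerance.
    by_duration = [c for c in by_name
                   if abs(c.duration_s - file_duration_s) <= tolerance_s]
    return by_duration[0] if len(by_duration) == 1 else None
```

Driving this over the whole collection would be a loop over `Path(root).rglob("*.mp3")`, calling `build_query`, running the search, and applying the G3 step only when `pick_match` returns something other than `None`. Everything else falls through to the manual pile.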
These are large datasets. Assume at least 10,000 files that would otherwise need manual processing (I'm in a preservation community). Is this level of decision-making possible, and should I specifically be looking at something like PyTorch when developing it? Is there a CLI for Mp3tag capable of performing searches and renaming/moving files, or would this need RPA to visually parse the screen? I need a sanity check and thought this would be the best place to ask. TIA