Fill ACOUSTID_ID field with acoustic fingerprint using an AI-generated script

realizing I’m a couple of years late…

I had the same (or similar) desire - I wanted to write AcoustID values directly to each MP3 file. It turns out there is a lot going on there, but after working on this for a day or two, I’ve arrived at a solution that I think is kinda slick.

Part of the problem is that the ACOUSTID_ID field is actually stored as a custom ID3 frame (TXXX). I ended up having to use Mutagen to write that frame into the MP3's ID3 tag properly.

The solution also requires the use of:
a) Python, which acts as the engine/orchestrator, managing the logic, API calls, and binary data handling.
b) fpcalc.exe, the specialized tool that "listens" to the audio to create the unique fingerprint.
c) exiftool.exe, the industry-standard utility used to write those specific custom TXXX frames (like ACOUSTID_ID) into the file headers where other tools often fail.
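For readers who want to see the shape of the fingerprint-and-lookup step, here is a minimal sketch. It is not the script from the zip file. It assumes fpcalc is on the PATH and supports the `-json` option, and it uses the public AcoustID lookup endpoint; the function names (`parse_fpcalc_json`, `fingerprint_file`, `build_lookup_url`, `best_acoustid`) are made up for illustration.

```python
import json
import subprocess
import urllib.parse
from typing import Optional, Tuple

ACOUSTID_LOOKUP_URL = "https://api.acoustid.org/v2/lookup"


def parse_fpcalc_json(stdout: str) -> Tuple[int, str]:
    """Extract (duration in whole seconds, fingerprint) from `fpcalc -json` output."""
    data = json.loads(stdout)
    return int(round(data["duration"])), data["fingerprint"]


def fingerprint_file(path: str, fpcalc: str = "fpcalc") -> Tuple[int, str]:
    """Run fpcalc on one file; raises CalledProcessError if fpcalc fails."""
    result = subprocess.run(
        [fpcalc, "-json", path], capture_output=True, text=True, check=True
    )
    return parse_fpcalc_json(result.stdout)


def build_lookup_url(api_key: str, duration: int, fingerprint: str) -> str:
    """Build a lookup URL; the service answers with JSON containing a `results` list."""
    query = urllib.parse.urlencode(
        {
            "client": api_key,  # your own AcoustID application key
            "duration": duration,
            "fingerprint": fingerprint,
            "meta": "recordingids",
        }
    )
    return f"{ACOUSTID_LOOKUP_URL}?{query}"


def best_acoustid(response: dict) -> Optional[str]:
    """Return the id of the highest-scoring result, or None if nothing matched."""
    results = response.get("results", [])
    if response.get("status") != "ok" or not results:
        return None
    return max(results, key=lambda r: r.get("score", 0.0))["id"]
```

Fetching the URL (for example with `urllib.request.urlopen`) and decoding the JSON body is left out so the sketch stays free of network calls; `best_acoustid` takes the decoded response as a plain dict.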

At any rate, to make a long story boring, it works as envisioned. I can now add AcoustID values to the metadata for each .mp3 in my stupidly huge music collection.

The tool has been designed to scale for large libraries, although I strongly recommend “batches” of files - I tagged 752 files (my Rolling Stones directory) at one time and it took about 5 minutes. Running thousands of files against AcoustID’s API MAY get you blacklisted - I didn’t want to try. :)

Wanna try it? https://drive.google.com/file/d/1pj9BnndNPFgWmNtUBVmPu8I05lsPpmIy/
Extract the zip file to C:\
Check the ReadMe file to learn how to finish the installation (you need an AcoustID API key) and how to integrate this into MP3Tag.

PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

USE THIS TOOL AT YOUR OWN RISK. ALWAYS BACK UP YOUR MEDIA LIBRARY BEFORE PERFORMING BULK TAGGING OPERATIONS.

Just a small note to

AFAIK exiftool.exe can still NOT write any tag into .mp3 or .flac files.
Please check the current Read & Write capabilities for your own music format:

Full Disclosure: I spent a couple of hours using AI to build what I wanted.

What can I say except that it works. It's a critical dependency to the app, and the .exe is bundled into the zip file along with the other required .exe. Maybe it's working with the Mutagen module (another dependency)?

Did you have a look at your code to understand how it works?
Of course it's Mutagen that writes the TXXX tag into the music files.
(The same library is used by MusicBrainz Picard, the official tagger from MusicBrainz).

# Apply tags via Mutagen
audio = MP3(str(f), ID3=ID3)
if audio.tags is None: audio.add_tags()

I have no idea why you add the field ACOUSTID_ID twice:

audio.tags.add(TXXX(encoding=3, desc='ACOUSTID_ID', text=[aid]))
audio.tags.add(TXXX(encoding=3, desc='AcoustID Id', text=[aid]))

You should really make the prerequisites clear to people: in your point a), which Python version has to be installed, and, perhaps as an additional point d), the need for an individual AcoustID API key. I would even add a point e): follow the instructions in the included ReadMe file carefully.
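One way to make such prerequisites hard to miss is a preflight check that runs before any file is touched. The following is only a sketch: the minimum Python version and the environment variable name `ACOUSTID_API_KEY` are assumptions, not values taken from the shared script or its ReadMe.

```python
import importlib.util
import os
import shutil
import sys

MIN_PYTHON = (3, 8)  # assumption: replace with the version the script was tested on
API_KEY_ENV = "ACOUSTID_API_KEY"  # hypothetical variable name


def _has_module(name):
    """True if the named module can be imported in this interpreter."""
    return importlib.util.find_spec(name) is not None


def check_prerequisites(env=None, which=shutil.which, version=None,
                        has_module=_has_module):
    """Return a list of human-readable problems; an empty list means ready to run."""
    env = os.environ if env is None else env
    version = tuple(sys.version_info[:2]) if version is None else version
    problems = []
    if version < MIN_PYTHON:
        problems.append("Python %d.%d or newer is required" % MIN_PYTHON)
    if which("fpcalc") is None:
        problems.append("fpcalc was not found on PATH")
    if not has_module("mutagen"):
        problems.append("the mutagen package is not installed")
    if not env.get(API_KEY_ENV):
        problems.append("no AcoustID API key set in %s" % API_KEY_ENV)
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing prerequisite:", problem)
```

The extra parameters exist only so the check can be exercised with fake values; normal use is `check_prerequisites()` with no arguments.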

BTW:
exiftool.exe is not used at all, you only set a
EXIFTOOL_PATH =
and never use it in your code. And exiftool is not mentioned in your ReadMe.rtf


Could you please explain why you exclude exactly these two MBIDs:

if m.lower() not in ["8e27c4ed-86fb-4f87-aff1-14b485b4fc1d", "1657488d-327c-4734-92e1-4c125d070b3b"]:
    mbid = m.lower()
    break

The first MBID belongs to the release "Blurring the Edges" by Meredith Brooks on MB.
The second MBID does not exist as a release or as any other entity (as of 2025-12-27).

@johnfoliot I understand that you shared your script with the best intentions, but in its current state I don't think it's very good. In some cases, it may not even do what you intend it to do.

Also, completely ignoring the current rate limit of 3 requests per second for AcoustID is not good practice and something I don't encourage. Lukáš running AcoustID is also just a developer who tries to offer a good service.

To be clear, everyone should be free to use LLMs to generate solutions to their own problems and even to share them. However, once something is shared with others (such as this community), the author (or originator) should be able to understand, maintain, and support it. This is my personal opinion.

If that isn't the case, I don't see much value in sharing it. Anyone can now prompt their favourite LLM to generate a solution for them.

Personally, I don't appreciate low-quality AI-generated content, and I hope to keep this community mostly free of it.


Fair enough. You are correct on both counts: my intent was to share a solution that ‘works for me’, and I understand your concern over low-quality AI-generated content.
My bad - sorry.

FWIW, the current delay of 0.4 seconds already meets the technical requirement: it caps the tool at approximately 2.5 requests per second. However, bumping that value to 0.5 seconds lowers the cap to 2 requests per second.

Fixed.
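As a rough illustration of the pacing arithmetic above, here is a small throttle sketch. It is not the code from the shared script; the class name `Throttle` and the helper `max_requests_per_second` are hypothetical. Instead of sleeping a fixed amount after every call, it measures the time since the previous request and sleeps only for the remainder, so time already spent fingerprinting counts toward the interval.

```python
import time


def max_requests_per_second(min_interval):
    """Upper bound on the request rate for a given minimum spacing in seconds."""
    return 1.0 / min_interval


class Throttle:
    """Ensure at least `min_interval` seconds between the starts of successive calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first one

    def wait(self):
        """Block until the next request is allowed, then record its start time."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


if __name__ == "__main__":
    throttle = Throttle(0.5)  # at most 2 requests per second
    for _ in range(3):
        throttle.wait()
        print("request would be sent at", round(time.monotonic(), 2))
```

With an interval of 0.5 seconds the loop can never exceed 2 requests per second, which stays under the 3 requests per second limit mentioned earlier in the thread.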