Find Duplicate Files in Different Music Directories

Could you please explain this a bit more?
What kind of "order that will identify the duplicates"?
What "function capabilities" of Excel?

Maybe you could tell us some real examples?

What kind of "order that will identify the duplicates"?

That would be whatever you use to determine a duplicate. I use tags like Artist and Title, but you could use the filename. I then use the sort capability of the tool, in my case MS Excel, to sort the list in Artist, Title order.

What "function capabilities" of Excel?

There are various functions that you can use to program, kind of like the scripting in MP3Tag, that allow you to compare 2 rows in a list to see if the fields in each match. If they do, you can flag the duplicate row (record) for deletion.
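A minimal sketch of that sort-and-compare idea, written in Python rather than as Excel formulas. The field names `artist`, `title` and `path` and the sample rows are assumptions for illustration; a real list would come from an export of your tag data.

```python
def flag_adjacent_duplicates(rows):
    """Sort rows by (artist, title) and flag each row that matches the row before it."""
    def key(row):
        # Case-insensitive, whitespace-trimmed key. This normalization is an
        # assumption; a strict comparison would use the raw field values.
        return (row["artist"].strip().lower(), row["title"].strip().lower())

    ordered = sorted(rows, key=key)
    flagged = []
    previous_key = None
    for row in ordered:
        current_key = key(row)
        # The first row of each group stays unflagged; later ones are flagged.
        flagged.append({**row, "duplicate": current_key == previous_key})
        previous_key = current_key
    return flagged


# Hypothetical sample data
rows = [
    {"artist": "Queen", "title": "Bohemian Rhapsody", "path": "A/01.mp3"},
    {"artist": "ABBA", "title": "Waterloo", "path": "A/02.mp3"},
    {"artist": "queen ", "title": "Bohemian Rhapsody", "path": "B/07.mp3"},
]

for row in flag_adjacent_duplicates(rows):
    print(row["duplicate"], row["artist"], "-", row["title"], row["path"])
```

Only the second and later rows of each matching group are flagged, so one copy always survives. Flagged rows are candidates for review, not an automatic delete list.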

I will try to put together an example in a word doc and post it here later if anyone is interested.


I would still like to add that a strict comparison between 2 lines in an Excel table has the weakness of showing only exact matches, leaving out all those files with spelling mistakes in any of the compared data fields. In this case duplicates would be left in the collection.
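One way to soften the exact-match limitation is approximate string matching. The sketch below uses Python's standard `difflib.SequenceMatcher`; the threshold of 0.85 and the sample titles are assumptions chosen for illustration, not tuned values.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a ratio between 0.0 and 1.0 for two strings, ignoring case."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def near_duplicates(titles, threshold=0.85):
    """Return pairs of titles whose similarity is at or above the threshold."""
    pairs = []
    # Compares every title with every later title, so the cost grows
    # quadratically with the length of the list.
    for i in range(len(titles)):
        for j in range(i + 1, len(titles)):
            if similarity(titles[i], titles[j]) >= threshold:
                pairs.append((titles[i], titles[j]))
    return pairs


# Hypothetical sample data with one spelling mistake
titles = ["Bohemian Rhapsody", "Bohemian Rapsody", "Waterloo"]
print(near_duplicates(titles))
```

A lower threshold catches more spelling variants but also produces more false matches, so the result is still a list to check by hand.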

Or, the other way round: the Excel method could also flag duplicates that are not duplicates at all, e.g. if the tag data is incomplete and misses vital information. In this case files would be deleted that really should be kept and have their tag data updated so that they can be distinguished from the other files.

I don't know if an exported md5 hash together with the other tag data would lead to more accuracy.
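For reference, a sketch of what hashing whole files looks like in Python with the standard `hashlib` module. It assumes the hash is taken over the complete file; because tags are stored inside the file, two copies of the same song with different tags would get different hashes, so this only finds byte-identical files. The directory paths in the usage comment are hypothetical.

```python
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file's full contents, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_identical_files(directories):
    """Map each MD5 digest to the files sharing it, keeping only groups of 2 or more."""
    groups = defaultdict(list)
    for directory in directories:
        for root, _dirs, files in os.walk(directory):
            for name in files:
                path = os.path.join(root, name)
                groups[file_md5(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}


# Usage with hypothetical paths:
# for digest, paths in group_identical_files(["D:/Music", "E:/Backup/Music"]).items():
#     print(digest, paths)
```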

But putting all this together is - and that is just my opinion - too much of an effort to gain a rather small benefit. But I think I made that point.

I agree with ohrenkino. There is no perfect way to find all pure duplicates from within a list of imperfect data. That being said, preparation and standardization of the tags in all your files will go a long way toward making it possible to do a reasonable job, or at least get you a list. That is what I do, and the attached file describes the steps I go through to do that. It works for me but may not be enough for others. If you can use it, great; if not, I tried. Good luck to all.
How to find Duplicates with MP3Tag and MS Excel.zip (1.4 MB)

Same question again:
why use MP3Tag for this? Again: it's a really cool tool, but not designed for doing that.
There are many specialized tools for that which do fuzzy search and audio comparison. (scroll up)

I will mention the excellent software "Similarity" http://www.similarityapp.com/ with its "group" function

@oliverq02:
You should mention that "Groups" and "acoustic fingerprint" are only available in the "premium version", for which you have to pay.

Curiously, the group function seems to work in the free version (tested).

i guess there is a test period / trial limited by file count or time.
there are some alternatives that are really free...

No time or trial limit.
On the other hand, the search cannot be saved.
If you have the name of a free software for Windows, I'm interested. (My Google search only gives paid software.)

strange but ok if it's working in free version.

we already discussed, scroll up
https://community.mp3tag.de/t/find-duplicate-files-in-different-music-directories/47952/15?u=mp3freak_peter

well the OP is well old now; but i see it was updated in 22 and perhaps later

what i was surprised to see in all the comments/replies about the issue of 'duplicate files' (surely an issue that all of us must relate to i think, certainly in my case) is that no-one seemed to suggest other such specialised tools/apps

certainly you can do it 'visually' by merging dirs and sorting by name. one advantage that MP3Tag has is that it displays all the relevant Tag values, eg size, length, bit rate etc, to make a visual comparison more accurate.

i have actually found sorting by SIZE useful, because then you get files of the same or similar size AND names listed together
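A rough sketch of the sort-by-size idea in Python. The `(name, size)` tuples, the sample sizes and the 1% tolerance are all assumptions for illustration; in practice the sizes would come from the file system or an export.

```python
def similar_size_runs(files, tolerance=0.01):
    """Sort (name, size) pairs by size and return runs of 2+ neighbours
    whose sizes differ by at most `tolerance` (a fraction of the smaller size)."""
    ordered = sorted(files, key=lambda item: (item[1], item[0].lower()))
    runs, current = [], []
    for name, size in ordered:
        # Start a new run when the gap to the previous file is too large.
        if current and size - current[-1][1] > tolerance * current[-1][1]:
            if len(current) > 1:
                runs.append(current)
            current = []
        current.append((name, size))
    if len(current) > 1:
        runs.append(current)
    return runs


# Hypothetical sample data: sizes in bytes
files = [
    ("Queen - Bohemian Rhapsody.mp3", 8_500_000),
    ("ABBA - Waterloo.mp3", 3_900_000),
    ("01 Bohemian Rhapsody.mp3", 8_510_000),
]

for run in similar_size_runs(files):
    print(run)
```

Similar size alone proves nothing, so each run is only a shortlist for the visual comparison of names and tag values described above.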

this issue is of burning importance to me since i started trying to collect and curate this whole damn thing, and something i am arm-wrestling with AS WE SPEAK.

after a huge amount of time checking and trialling and using various tools, these are the ones i now use, bearing in mind, with my very meagre budget, i had to keep any cost free or minimal

  1. Beyond Compare 4 is a sync app, and is a wonderful tool for comparing files in different directories (and it has an often very useful option to list files on both sides w/o regard to the sub-dir structure, and many other useful features for filters and more). this is an app i use almost every day. btw, i find the way it displays the dirs/files side by side FAR easier to understand/check than the many other such tools that use a vertical list.

  2. recently i finally forked out ($US30) for Audio Dedup by MindGems - and it gets the most amazing results. it found all these dups with quite different names/artists that were actually the same. and it will display all the Tag field values for closer inspection and decisions about what to keep and what to 'delete'; but be aware, i have, unexpectedly, found some groups of false positives, ie dups that were not dups, so some caution is advised before just using Delete All - moving (instead of deleting) into a Deleted Folder is often good advice. this is the BEST app i have found for all sorts of audio files. i would have to check again if it does MP4 too.

  3. and surprisingly, i found that another MindGems app i already had but was not using - Fast Duplicate File Finder (FDFF) - i have just found it actually does a really great job on audio (and photos) as well as normal files (with the same caveat above)

  4. and finally a specialised Dup app that can be used for files of all sorts, has options for images, audio, and video (?), and that i used constantly for both audio (until i got Audio Dedup) and images (and 'normal' files) - AllDup! a wonderful all-purpose dup app - and free if i remember correctly. but now maybe i will use the previous two apps in preference.

fyi.


There are a number of deduplication utilities that compare audio fingerprints across various bitrates and filetypes, along with other complex ways of comparing media (such as ignoring metadata and comparing only the audio/video stream). Based on the size and complexity of such programs, it doesn't seem an ideal fit for a metadata editor.

More good news? This sounds like an ideal job for neural network/AI. So, I'm sorting out the easy stuff with existing tools and will try again when the next-gen tools arrive.

AI can do a lot of things, but be aware, it is not "the answer to life, the universe and everything".
AI often fails, and explains wrong results with wrong facts.
Right now nobody knows how to fix that.

Another problem is that an AI network just does what it was trained for, with no intelligence. That may cause plausible but fatally wrong results if something unexpected happens, for example unexpected input data where every human would say "hey, there's something strange here".

I wouldn't say exploring AI is bad, but right now most people believe all that clickbait stuff about AI "wonders" that isn't true yet.

Wish I had seen your post earlier. I wasted so much time testing duplicate finders until I found the MindGems Duplicate Audio Finder. I just recommended it here:

from where I was pointed to this thread.
I totally love how good this thing is at finding similar songs without ID3 or any titles in file names!
Finding similar ID3 tags is very good too - it doesn't just do partial matching: even if words in the ID3 tag are rearranged, it still recognizes that the title is similar.
My audio library has never been so organized :yum: :smile:
I got the PRO version - it is worth every cent and has so many features.

Someone proposed the Similarity app above - I tried that tool too. It looks fancy, but it compares frequency histograms or something and produces way worse results and a lot of false detections.