Help building correct Regular Expressions

Hoping I'm not doing anything amiss by recommending this really terrific website one can use to build regular expressions and test them against a known string:

Just today I used it to work out how to fill the Discnumber tag of some mp3 files from their folder names (the folders are named after the Vol#CD# of the Bach Complete Edition from Teldec/Warner Music, and each name contains the disc number X out of the total of 153). I copied one of the folder names into the field on the website, then built the RegEx with capture groups to isolate just the disc number.

Folder: Bach 2000 v03CD15 045 - Cantatas BWV 147-149
RegEx: (^\w+\s\w+\s\w+\s)(\d{3})(.*)

Match 1: 0-45; Bach 2000 v03CD15 045 - Cantatas BWV 147-149/
Match 2: 0-18; Bach 2000 v03CD15
Match 3: 18-21; 045
Match 4: 21-45; - Cantatas BWV 147-149/

Which got me to, in mp3tag:

Format: Discnumber: $regexp(%_directory%,^(\w+\s\w+\s\w+\s)(\d{3})(.*),$2)
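For anyone who wants to check the capture groups outside the website, here is a minimal sketch in Python. It assumes Python's `re` module treats `\w`, `\s` and `\d{3}` the same way Mp3tag's regex engine does for this simple pattern, which should hold here but is not guaranteed for every expression. The variable names are made up for the illustration.

```python
import re

# Folder name taken from the post above (without the trailing slash).
folder = "Bach 2000 v03CD15 045 - Cantatas BWV 147-149"

# Same pattern as in the Mp3tag format string:
#   group 1 = three words, each followed by whitespace
#   group 2 = exactly three digits (the disc number)
#   group 3 = the rest of the folder name
pattern = re.compile(r"^(\w+\s\w+\s\w+\s)(\d{3})(.*)")

m = pattern.match(folder)
prefix = m.group(1)       # "Bach 2000 v03CD15 "
disc_number = m.group(2)  # "045", what $2 refers to in $regexp
rest = m.group(3)         # " - Cantatas BWV 147-149"

print(disc_number)
```

Group 2 here corresponds to `$2` in the `$regexp` call, which is the entry listed as "Match 3" in the output above, because the first entry there is the full match.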

This really is magic! With 153 folders, that would have been a lot of hand editing.

The site explains how the RegEx is parsed, whether it works or not, shows what each group contains if grouping is used, and offers an on-screen reference for the various RegEx tokens.


Have you used the forum search before and searched for regex101?
I just found 45 results mentioning this site. :innocent:

But yes, you are right, it is a great site for testing new regular expressions and having existing ones explained.

Just in case you are looking for a tool to visualize (and explain) your regular expressions:

Thanks for sharing! Although the site has already been mentioned a few times here in the community, I find your description of how you arrived at a solution a valuable contribution.

Thanks for sharing your excitement! :raised_hands:

Please be aware that this regular expression only works if your fourth value (\d{3}) always has exactly 3 digits. As you can see in these regex101 examples, some possible variants would not match.
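To make that limitation concrete, here is a small Python sketch. Only the first folder name comes from this thread; the other three are hypothetical variants invented for the illustration, and the helper name `disc_from_folder` is made up as well. It again assumes Python's `re` behaves like Mp3tag's engine for this pattern.

```python
import re

pattern = re.compile(r"^(\w+\s\w+\s\w+\s)(\d{3})(.*)")

def disc_from_folder(name):
    """Return capture group 2, or None if the pattern does not match."""
    m = pattern.match(name)
    return m.group(2) if m else None

samples = [
    "Bach 2000 v03CD15 045 - Cantatas BWV 147-149",   # real name from the thread
    "Bach 2000 v03CD15 45 - Cantatas BWV 147-149",    # hypothetical: only 2 digits
    "Bach 2000 v03CD15 0045 - Cantatas BWV 147-149",  # hypothetical: 4 digits
    "Bach v03CD15 045 - Cantatas BWV 147-149",        # hypothetical: one word fewer
]

results = [disc_from_folder(s) for s in samples]
for name, disc in zip(samples, results):
    print(repr(disc), "<-", name)
```

The two-digit and the shorter-prefix variants do not match at all, and the four-digit variant matches but silently returns only the first three digits, which is arguably worse than no match.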

IMHO, if you really only need the DISCNUMBER from the directory name, it would be easier to use Convert Filename -> Tag with this format string:
%dummy% v03CD%DISCNUMBER% %dummy%\%dummy%

The Convert function also has the advantage that you can immediately see the result.
For more details about Convert Filename -> Tag, please see the FAQ.

This does not at all mean that your contribution is not valuable. It is just important to know that regular expressions only work for exactly the situation they were built for.

Thanks! That was not on my radar as an option. But it will be in future.


P.S. I am aware that (\d{3}) gets exactly 3 digits, as the "disc# of total" value was a 3-digit number. I had issues trying to use (\d+), which was too greedy. That said, your reference to Convert Filename seems to me a much, much better solution, as it does not involve RegEx, which I find (even though I am a retired 30-year IT guy) so arcane as to be the equivalent of Minoan Linear A. I've never before needed to use a part of a folder/filename for tagging and had never explored this option. Now I know. So, thanks again.
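A side note on the greediness: the thread does not show which (\d+) pattern misbehaved, so the following Python sketch is only a guess at one common cause. When (\d+) sits between two unanchored (.*) groups, it is usually the leading (.*) that swallows too much, leaving (\d+) with a single trailing digit. Anchoring the digits between the known prefix and the following whitespace tends to avoid that. Both patterns below are illustrative assumptions, not the ones actually tried.

```python
import re

folder = "Bach 2000 v03CD15 045 - Cantatas BWV 147-149"

# Hypothetical loose attempt: the leading (.*) is greedy and backs off only
# as far as needed, so (\d+) ends up with just the last digit of the name.
loose = re.match(r"(.*)(\d+)(.*)", folder)
loose_digits = loose.group(2)

# Anchored variant: three words, then digits of any length, then whitespace.
anchored = re.match(r"^\w+\s\w+\s\w+\s(\d+)\s", folder)
anchored_digits = anchored.group(1)

print(loose_digits, anchored_digits)
```

With the anchored form, the disc number no longer has to be exactly three digits long, though it still depends on the folder name starting with three space-separated words.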

You're welcome.

Feel free to let us know if you need some specific regular expression or another way to adjust collections like yours with 153 CDs.