Regular Expressions alternations in find AND replace

I’m attempting to cobble together a set of regular expressions for use in MP3Tag to convert between different date string formats, such as:

June 15, 1974 >>> 15 Jun 1974
15 Jun 1974 >>> 1974-06-15
6/15/1974 >>> 1974-06-15

…and other variations. Because some of the input and output formats use full or abbreviated month names, I’d like to use some form of alternation in both the find and replace expressions. I already have suitable find/replace expressions that achieve various date string conversions in Notepad++, which uses the Boost regular expression library v1.70 with a Perl-compatible (PCRE-style) syntax. One example:

Find: \b(?:(?<M01>January)|(?<M02>February)|(?<M03>March)|(?<M04>April)|(?<M05>May)|(?<M06>June)|(?<M07>July)|(?<M08>August)|(?<M09>September)|(?<M10>October)|(?<M11>November)|(?<M12>December)) (?:(?<D>\d{1})|(?<DD>\d{2})), (?<YYYY>\d{4})\b

Replace: (?{D}0$+{D})(?{DD}$+{DD}) (?{M01}Jan)(?{M02}Feb)(?{M03}Mar)(?{M04}Apr)(?{M05}May)(?{M06}Jun)(?{M07}Jul)(?{M08}Aug)(?{M09}Sep)(?{M10}Oct)(?{M11}Nov)(?{M12}Dec) $+{YYYY}

When I try using these expressions in MP3Tag’s action ‘Replace with regular expression’, it replaces June 15, 1974 with (?{D}0)(?{DD}15) (?{M01}Jan)(?{M02}Feb)(?{M03}Mar)(?{M04}Apr)(?{M05}May)(?{M06}Jun)(?{M07}Jul)(?{M08}Aug)(?{M09}Sep)(?{M10}Oct)(?{M11}Nov)(?{M12}Dec) 1974
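
As a point of comparison, here is a minimal Python sketch of what the find/replace pair above is meant to do: match any full month name through an alternation, then pick the abbreviation and zero-pad the day in the replacement step. It is not Mp3tag or Boost code. The names `MONTH_ABBR`, `PATTERN`, and `to_day_mon_year` are made up for illustration, and a lookup table plus a replacement function stands in for the conditional `(?{name}...)` format expressions.

```python
import re

# Full month name -> three-letter abbreviation (illustrative lookup table).
MONTH_ABBR = {
    "January": "Jan", "February": "Feb", "March": "Mar", "April": "Apr",
    "May": "May", "June": "Jun", "July": "Jul", "August": "Aug",
    "September": "Sep", "October": "Oct", "November": "Nov", "December": "Dec",
}

# Alternation over the month names, a one- or two-digit day, a four-digit year.
PATTERN = re.compile(
    r"\b(?P<month>" + "|".join(MONTH_ABBR) + r") (?P<day>\d{1,2}), (?P<year>\d{4})\b"
)


def to_day_mon_year(text):
    """Rewrite 'June 15, 1974' style dates as '15 Jun 1974'."""
    def repl(match):
        day = int(match.group("day"))            # zero-padded below, like (?{D}0$+{D})
        abbr = MONTH_ABBR[match.group("month")]  # plays the role of (?{M06}Jun) etc.
        return f"{day:02d} {abbr} {match.group('year')}"
    return PATTERN.sub(repl, text)
```

With this sketch, `to_day_mon_year("June 15, 1974")` gives `15 Jun 1974`, and a single-digit day such as `May 5, 2001` comes out as `05 May 2001`.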

It’s been asked here and here what RegEx flavor is used by MP3Tag, though the answer by @dano in the former thread is “perl flavor to a large extent”, which doesn’t seem like a definitive answer to me. Anybody know if what I want to do is possible in MP3Tag, and what I need to do to get there?

But this is the only answer that can be found.

You can test your regular expressions in Convert > Tag - Tag.
The supported operators are described in the help.

If you click on the Help - About menu and there on "Credits" you get a list of components/libraries used by Mp3tag. One of them is called "boost::regex"

Maybe that gives you a hint?


Thanks, all, for the enlightening replies. Didn’t think to look in Help > About > Credits, and my forum search terms apparently weren’t sufficient to locate that authoritative post by @Florian.

So now I’ve learned that MP3Tag uses the Boost::Regex library, which is also used by Notepad++, which is where I successfully tested the regex code as shown in my first post. But, again, that code doesn’t work for me in MP3Tag. I’ve consulted the help for ‘Replace with regular expressions’, which really only covers basics, and I’ve consulted the Boost::Regex docs. As far as I can tell, my code should work. Has anyone done this sort of operation in MP3Tag before, or have any other suggestions?

If I see that correctly, then ...
the first date has a comma,
the second has only blanks .. and is the transformation of the first one.
The transformation of 1 and 2 should already lead to the final form.
And the third has the slash as separator.
So I would first handle the dates with the month-string in it and transform them and finally the date with the slashes.

Some explanation is in order: I don’t use these RegEx actions as steps to go from A to B to C. I have various uses for dates as they relate to audio files, depending on what a particular date represents and what tag field(s) I see fit to store it in. For example, I like to retain the date when something was recorded, if known. When an audio file lands on my desk, the date it was recorded may already be represented within the filename and/or the TITLE tag and/or ALBUM tag and/or COMMENT tag, in SOME form (any of those I listed in my first post, or others not listed). But my preference is to have it stored as ‘yyyy-mm-dd’ in a custom field for recording date, and as ‘d MMM yyyy’, along with recording location, in COMMENT. So, if I have the recording date but in a non-preferred format, I copy it to my custom field for recording date, and to COMMENT, and run separate actions to convert the string in each field to my preferred format for that respective one. If I happen to already have the date recorded in my preferred format for one of the fields, I just copy it as-is to both fields, then run the appropriate action to convert the other one.

Anyway, thanks for the suggestions, though still hoping for specific pointers to get my RegEx code working in MP3Tag.

I think it's currently not possible. You're using features that are part of the Boost-Extended Format String Syntax that are not enabled in Mp3tag's use of Boost::Regex.

Ok, thanks, that explains it then. In that case, please consider this a formal request to include those features, if there’s no compelling reason not to.

Also, I’d like to suggest adding some text to the help section for the action ‘Replace with regular expressions’ specifying that Boost::Regex is the engine used, what limitations (features not enabled) there are to MP3Tag’s implementation, and maybe providing that same link to the Boost docs you included in your 30 Jan 2020 post, or to whichever version of the engine is used by the current version of MP3Tag.

In the meantime (while your regular expression doesn't work) you could simplify your modifications with some
$replace(%TITLE%,January,Jan,February,Feb,...,June,Jun,December,Dec)
commands for the tags in which you expect these month names.
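
To illustrate the idea outside of Mp3tag, here is a rough Python sketch of this two-step workaround: first shorten the month names with a chain of plain pairwise replacements, then reorder the parts with a regex that only needs ordinary backreferences. `replace_pairs` is a hypothetical stand-in written for this example, not Mp3tag's implementation of `$replace`, and all other names are invented as well.

```python
import re

# Flat from/to list, mirroring the argument order of the $replace call above.
MONTH_PAIRS = [
    "January", "Jan", "February", "Feb", "March", "Mar", "April", "Apr",
    "May", "May", "June", "Jun", "July", "Jul", "August", "Aug",
    "September", "Sep", "October", "Oct", "November", "Nov", "December", "Dec",
]


def replace_pairs(text, *pairs):
    """Hypothetical stand-in for a pairwise replace: apply each (from, to) in order."""
    for old, new in zip(pairs[0::2], pairs[1::2]):
        text = text.replace(old, new)
    return text


def month_day_year_to_day_mon_year(text):
    shortened = replace_pairs(text, *MONTH_PAIRS)
    # Backreferences only: turn "Mon d, yyyy" into "d Mon yyyy".
    return re.sub(r"\b([A-Z][a-z]{2}) (\d{1,2}), (\d{4})\b", r"\2 \1 \3", shortened)
```

Two caveats of this sketch: the plain substring replacement also shortens month names that appear elsewhere in the text, and the day is not zero-padded, so that would need a further step if wanted.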

From the help:

There are several existing threads about this. Including how to change the position of parts like the year, the month or the day.

That’s an idea, and probably makes more sense than the fallback option I’d thought of, which was to create an action group with a separate regex replacement action for each potential month name. Thanks.
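
For what it's worth, the fallback with one regex replacement per month can be sketched like this in Python, here for the `15 Jun 1974` to `1974-06-15` direction. Each rule uses only plain backreferences, so no conditional format syntax is needed. The rule list and the names `MONTH_RULES`, `PAD_DAY_RULE`, and `to_iso_date` are assumptions made for the example, not an existing Mp3tag action group.

```python
import re

MONTH_ABBRS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# One (pattern, replacement) rule per month, mimicking one regex action each.
# For June this is: find \b(\d{1,2}) Jun (\d{4})\b, replace with \2-06-\1
MONTH_RULES = [
    (re.compile(rf"\b(\d{{1,2}}) {abbr} (\d{{4}})\b"), rf"\2-{number:02d}-\1")
    for number, abbr in enumerate(MONTH_ABBRS, start=1)
]

# Final rule: zero-pad a single-digit day, again with backreferences only.
PAD_DAY_RULE = (re.compile(r"\b(\d{4}-\d{2})-(\d)\b"), r"\1-0\2")


def to_iso_date(text):
    """Run every rule in sequence, the way an action group runs its actions."""
    for pattern, replacement in MONTH_RULES + [PAD_DAY_RULE]:
        text = pattern.sub(replacement, text)
    return text
```

The cost of this approach is thirteen rules instead of one, but each rule stays simple enough to work with basic find/replace syntax.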

Unfortunately, there is one compelling reason: the extended syntax has some special characters '$', '\', '(', ')', '?', and ':' which would break existing replacements that use those characters.

Hmmm… I know all of those characters as being special even in basic regex syntax, and pretty sure I’ve used every one in MP3Tag regex replacement ops before, successfully.

Maybe I'm missing the tree because of all the special characters involved: I'm talking about the substitution / replacement / format string, and I think that, e.g., parentheses are just copied over to the result string with the current implementation.

Having the boost extended syntax for the format string would require escaping of the special characters, e.g., \( to make the parentheses appear in the result string.

Ok, I think I see what you’re saying. For example:

Existing COMMENT: ExactAudioCopy v0.99pb4

Menu > Convert > Tag - Tag
Field: COMMENT
Format string: $regexp(%comment%,(.+?)Copy(.+?),\1(Copy)\2)

Updated COMMENT: ExactAudio(Copy) v0.99pb4

Although, it looks to me like there is some confusing or conflicting information in the help:

[]$% You have to put a single quote around these reserved characters if you want to use them unparsed.
,() These characters must only be escaped when they are inside a scripting function.

As you can see, I’ve used () inside the scripting function $regexp(), and did not escape them, but the function accomplished what I intended. Am I misinterpreting the help?

In any case, for what it’s worth, I had always been escaping special RegEx characters such as ()[]{}$\?: in my replace expressions up until now, having gotten myself into the habit years ago to be on the safe side, rather than trying to remember when I do and do not need to escape them. I don’t know how other users feel, but I’d rather be required to escape them and get the additional RegEx functionality implemented.

You're just lucky. Any imbalanced number of parentheses would produce an error, so it's better to stick with the recommendations. The notes on escaping refer to Mp3tag's scripting language which you're essentially also using with $regexp and the like.

I can understand your wish to have this additional RegEx functionality, but it's not an easy switch. Everyone who is using the special characters unescaped would see their replacements broken.
