Regular Expressions

If you want to add TEXT to the existing tag:

Regular expression: (.*)
Replace with: TEXT \1 \0

I found that if you don't put in the \0 at the end it will repeat TEXT twice. For example:

Title: Texty text
Becomes: TEXT Texty textTEXT

I'm using v2.48. This might have been fixed in later versions, but it's the version I've been using, and I refuse to update since I've never had any other issues with it except for this one bit of annoyance.

What seems to be happening is the "end of line" is being matched a second time for some reason.

Using 2.48, I duplicated your bug but also tweaked the RegEx to work properly.

Regular expression: ^(.*)$
Replace with: TEXT $1

That explicitly tells the engine not to include the newline in the match.

It's not a bug.
It happens because global matching is activated by default.
That means the engine tries to match the regex pattern as many times as possible.

With (.*) the whole text is matched at first. Then the engine stands at the end of the text and tries to match the pattern again.
And it succeeds because .* also matches "nothing".
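
The same effect can be reproduced outside Mp3tag. Here is a minimal sketch in Python (3.7 or later), whose re.sub also replaces every match, including the empty one left at the end of the text. It is only an analogy, since Mp3tag's engine is not Python's:

```python
import re

title = "Texty text"

# (.*) first matches the whole title, then the empty string at the end.
matches = [m.group(0) for m in re.finditer(r"(.*)", title)]

# Replacing every match therefore inserts TEXT twice.
doubled = re.sub(r"(.*)", r"TEXT \1", title)

# Anchoring with ^ rules out the second, empty match at the end.
anchored = re.sub(r"^(.*)$", r"TEXT \1", title)

print(matches)
print(repr(doubled))
print(repr(anchored))
```

The second entry in matches is the empty match that causes the repeated TEXT.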

But I'd prefer a simple "Format value" action for this task anyway.

Can you help me convert 1 Feb 2009 HH:MM to 2009-02-01 HH:MM?

Read there ...
Regular Expressions

DD.20140310.2238.CET
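
One way to sketch the requested conversion, written here in Python rather than in Mp3tag's own syntax. The month table, the function name to_iso and the assumption of English three-letter month names are mine, not from the thread:

```python
import re

# Assumed month abbreviations (English, three letters).
MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
          "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}

# day, month name, four-digit year, HH:MM
DATE = re.compile(r"\b(\d{1,2}) ([A-Z][a-z]{2}) (\d{4}) (\d{2}:\d{2})\b")

def to_iso(text):
    def rewrite(m):
        day, month, year, time = m.groups()
        if month not in MONTHS:
            return m.group(0)  # leave unknown month names untouched
        return f"{year}-{MONTHS[month]:02d}-{int(day):02d} {time}"
    return DATE.sub(rewrite, text)

print(to_iso("1 Feb 2009 14:30"))
```

A plain regex replacement cannot map "Feb" to "02" by itself, which is why the sketch uses a lookup table inside the replacement function.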

A very nice regex. I changed it a little bit, to this one:

Regex:

((\s*0*)*(\d+)(\s*0*)*)*(.*(\s*0*)*\d+(\s*0*)*)*

Replace with:

$3

because these extreme examples:

01/100
0 01/100 0
0 01/ 100
0 01 / 0 0100
0 0 00001 / 0 0 0 100 0 0 0
0 0 20 0 / 0 200 0
0 0 2 0 0 / 0 200 0
/ 0 200 0
/ 0 0 20 0 0

didn't work for me.


I use these regular expressions for my albums:

field: Album

((\s?\bv(ol)?(ume)?)(?=(\s*\.*\s*[0-9])+))(\s*\.*\s*([0-9]+)\s*)
Replace with:
spacev\7space

\s+\(*\[*\s*((dis)?c(d)?)(?=\s*\d)\s*([0-9]*)\s*\]*\)*
Replace with:
spacecd\4space

(v)([0-9](?![0-9]))
Replace with:
v0\2

(\sv\d+)(\s*\-+\s*)*(cd\d+)
Replace with:
\1space\3

to turn this:

Rave Mission Vol. 9 - Cd 2
Rave Mission vol. 9 [cd 2]
Rave Mission Vol. 9 (CD2)
Rave Mission Vol 9 Disc 2
Rave Mission Volume 09 (DISC 2)
Rave Mission Volume 9 CD2
Rave Mission Volume9 (Disc 2)

into this:

Rave Mission v09 cd2

The expressions must be applied in this order to work. Wherever you see the word
space in a replacement, delete it and press the spacebar instead.
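
To check the chain outside Mp3tag, the four steps can be sketched in Python. This is only an approximation of what the actions do: I am assuming case-insensitive matching and that the trailing space left by the second replacement gets trimmed; the helper name normalize_album is made up for the example.

```python
import re

# (pattern, replacement) pairs from the post, in the required order.
# The word "space" in the replacements is written as a real space.
STEPS = [
    (r"((\s?\bv(ol)?(ume)?)(?=(\s*\.*\s*[0-9])+))(\s*\.*\s*([0-9]+)\s*)", r" v\7 "),
    (r"\s+\(*\[*\s*((dis)?c(d)?)(?=\s*\d)\s*([0-9]*)\s*\]*\)*", r" cd\4 "),
    (r"(v)([0-9](?![0-9]))", r"v0\2"),
    (r"(\sv\d+)(\s*\-+\s*)*(cd\d+)", r"\1 \3"),
]

def normalize_album(album):
    for pattern, replacement in STEPS:
        # Assumption: the actions run case-insensitively.
        album = re.sub(pattern, replacement, album, flags=re.IGNORECASE)
    return album.strip()  # assumption: trailing whitespace is trimmed

samples = [
    "Rave Mission Vol. 9 - Cd 2",
    "Rave Mission vol. 9 [cd 2]",
    "Rave Mission Vol. 9 (CD2)",
    "Rave Mission Vol 9 Disc 2",
    "Rave Mission Volume 09 (DISC 2)",
    "Rave Mission Volume 9 CD2",
    "Rave Mission Volume9 (Disc 2)",
]
for sample in samples:
    print(normalize_album(sample))
```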


Here's an all in one expression:
Replace: (\s)(?i)vol[^\d]*\s*0*(\d+).+?(?:\(|\[)*(?:cd|disc)[^\d]*?\s*(\d+)(?:\)|\])*
With: $1v0$2 cd$3
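
A quick way to try this expression is a small Python sketch. Two things differ from the post and are my own adaptation: the (?i) switch is passed as a flag (recent Python versions only accept it at the very start of a pattern), and $1 is written as \1.

```python
import re

ALL_IN_ONE = re.compile(
    r"(\s)vol[^\d]*\s*0*(\d+).+?(?:\(|\[)*(?:cd|disc)[^\d]*?\s*(\d+)(?:\)|\])*",
    re.IGNORECASE,  # stands in for the inline (?i) of the original
)

def normalize(album):
    # Same replacement as "$1v0$2 cd$3" in the post.
    return ALL_IN_ONE.sub(r"\1v0\2 cd\3", album)

print(normalize("Rave Mission Vol. 9 - Cd 2"))
print(normalize("Rave Mission Volume 09 (DISC 2)"))
print(normalize("Rave Mission Vol. 10 CD1"))
```

Note that the replacement always inserts a 0, so in this sketch a two-digit volume such as "Vol. 10" comes out as v010.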


This was very nice and inspiring.

I tried it out and studied it, but with my poor knowledge of regex I couldn't accomplish what I was looking for in one line.
So here is an update of mine.
If you can make it better (I couldn't), I will be very happy.
I hope you succeed, because smaller is better.

This is an updated version for the ( album tag ) volume and disc renaming.
I hope you find it much better!!!

ALBUM
\s*\(*\[*\s*(v|ol|ume)+(\s*\.*\s*(?=0))?\s*\.*\s*0*([0-9]+)\s*\]*\)*((\s)+|$)
replace with:
spacev\3\5

ALBUM
\s*\(*\[*\s*(dis|c|d)+(\s*\.*\s*(?=0))?\s*\.*\s*0*([0-9]+)\s*\]*\)*((\s)+|$)
replace with:
spacecd\3\5

ALBUM
(\s*)(v)([0-9](?![0-9]))
replace with:
\1v0\3

ALBUM
(\sv\d+)(\s*\-+\s*)*(cd\d+)
replace with:
\1 \3

You can turn this:

Rave Mission V9 cd9
Rave Mission V 9 cd 9
Rave Mission V.9 cd.9
Rave Mission V. 9 cd. 9
Rave Mission Vol9 (cd9)
Rave Mission Vol 9 (cd 9)
Rave Mission Vol.9 (cd.9)
Rave Mission Vol. 9 (cd. 9)
Rave Mission Volume9 disc9
Rave Mission Volume 9 disc 9
Rave Mission Volume.9 disc.9
Rave Mission Volume. 9 disc. 9
Rave Mission Volan9 (disc9)
Rave Mission Volan 9 (disc 9)
Rave Mission Volan.9 (disc.9)
Rave Mission Volum. 9 (disc. 9)
Rave Mission V0123 [cd9]
Rave Mission V 0123 [cd 9]
Rave Mission V.0123 [disc123]
Rave Mission V. 0123 [disc 0123]
Rave Mission Vol 0123 (cd0123)
Rave Mission Vol. 0123 (cd 0123)
Rave Mission Volume0123 (disc0123)
Rave Mission Volume 0123 (disc 0123)
Rave Mission Volume.0123 ( disc 0123 )
Rave Mission V0123 ( discopolis 0123 )
Rave Mission V0123 discopolis 0123
Rave Mission ( discopolis 0123 ) volume 0123
Rave Mission disco 9 volume 09
Rave Mission disc 9 volume 9

Into this:

Rave Mission v09 cd9
Rave Mission Volan9 cd9
Rave Mission Volan 9 cd9
Rave Mission Volan.9 cd9
Rave Mission Volum. 9 cd9
Rave Mission v123 cd9
Rave Mission v123 cd123
Rave Mission v123 ( discopolis 0123 )
Rave Mission v123 discopolis 0123
Rave Mission ( discopolis 0123 ) v123
Rave Mission disco 9 v09
Rave Mission cd9 v09

The expressions must be applied in this order to work. Wherever you see the word
space in a replacement, delete it and press the spacebar instead.

samples.zip (2.3 MB)



NON ASCII detection

A simple expression to place in the Filter:
[^ \t-~]
Use it with a MATCHES clause.

All tags containing a non-ASCII character will be revealed, even invisible ones (CR, LF!)
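
To see which characters the class actually catches, here is a small Python check (an illustration only; Mp3tag's filter engine may differ in details, and the helper name flagged is made up). In Python's re, the range \t-~ runs from TAB to ~ and therefore includes LF and CR, so those two are not flagged by this exact class, while characters below TAB and above ~ are. It is worth verifying how Mp3tag treats them.

```python
import re

NON_ASCII = re.compile(r"[^ \t-~]")

def flagged(value):
    """Return the characters in value that the class matches."""
    return NON_ASCII.findall(value)

print(flagged("plain text"))       # nothing flagged
print(flagged("Motörhead"))        # the ö is flagged
print(flagged("bell\x07"))         # control character below TAB is flagged
print(flagged("line1\r\nline2"))   # CR and LF fall inside \t-~
```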

Online regex tool
https://regexr.com/

An online tool useful for testing a regex against text you can import.
It is full of example regexes (working and non-working; it's a community thing).




Here is one to use only the last name of a composer
$regexp(%composer%,(.+)(?=\W),)
So "Johann Sebastian Bach" in the composer field could become " Bach" in the filename.

This is only true if the composer field is filled as "first name last name". This regex doesn't give you the last name; it gives you the last word in the composer field.
So "Bach, Johann Sebastian" would result in Sebastian (last word of his first name).

Yes, that's right, the last word of the composer. And as I've got it, it includes the space before that last word as well, which isn't ideal. But as my composer fields are formatted as "First Middle Last" it worked for my purpose, which was to put the composer's last name in the filename. It would fail if there were a suffix, as in "Johann Strauss Jr."

A regex to grab the first word should be even simpler and could key on the presence of a comma.
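
The behaviour discussed above can be reproduced with a short Python sketch. The first function uses the pattern from the $regexp call; the second is a hypothetical variant of my own (not from this thread) that also drops the leading space.

```python
import re

def last_word(composer):
    # Same pattern as in $regexp above: delete the longest run that is
    # followed by a non-word character. What remains is the last word
    # together with the separator in front of it.
    return re.sub(r"(.+)(?=\W)", "", composer)

def last_word_no_space(composer):
    # Hypothetical variant: delete everything up to and including the
    # last whitespace character.
    return re.sub(r"^.*\s", "", composer)

print(repr(last_word("Johann Sebastian Bach")))
print(repr(last_word("Bach, Johann Sebastian")))
print(repr(last_word_no_space("Johann Sebastian Bach")))
print(repr(last_word_no_space("Johann Strauss Jr.")))
```

Neither version knows what a surname is; both simply keep whatever comes last in the field.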


An addition to my earlier post (which I can no longer edit).

Be aware: check your data first! Pay attention to multiple surnames; to paternal + maternal surnames (for example, in Spanish: paternal then maternal, or the reverse); and to languages that properly put the family name first (for example, Hungarian or Japanese).

Fixing/renaming files with invalid (invisible) characters.

I came across some MP3s with invisible characters (like Ø ) in their names because of some codepage hiccups. It's quite easy to fix with a simple regex search and replace.

Actions -> Replace with regular expression
Field: _ALL
regex: \xc3\x98
Replace matches with: Ø

\xc3 = hex c3 = Ã
\x98 = hex 98 = invisible control character (on my PC)

A full list of possible hiccups can be found here:
UTF-8 Encoding Debugging Chart
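
Where the \xc3\x98 pair comes from can be shown in a few lines of Python. This is a sketch under the assumption that the UTF-8 bytes were misread as Latin-1; the actual codepage on the affected PC may have been a different one, and the file name is made up.

```python
import re

original = "Ø"

# Ø is stored in UTF-8 as the two bytes C3 98.
utf8_bytes = original.encode("utf-8")

# Reading those bytes with a single-byte codepage (Latin-1 is assumed here)
# yields two characters: Ã and the invisible control character U+0098.
garbled = utf8_bytes.decode("latin-1")

# The replacement from the post, applied to a hypothetical file name.
fixed = re.sub("\xc3\x98", "Ø", garbled + "rsted.mp3")

# General repair for this kind of damage: reverse the wrong decoding.
roundtrip = garbled.encode("latin-1").decode("utf-8")

print(utf8_bytes, [hex(ord(c)) for c in garbled], fixed, roundtrip)
```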



A few tips can help reduce the complexity of regular expressions and likely make them run faster (see point 5).
Mp3Tag uses Perl Regular Expressions.

  1. $ vs \ in the replacement section.
    E.g. swap an artist name: Mercury, Freddy --> Freddy Mercury
    RegExp: (\w+), (\w+)
    ReplaceWith: \2 \1
    \2 \1 is equivalent to $2 $1 and ${2} ${1}
    The latter is very useful if a fixed digit must be added after a group, e.g. ${1}3 adds the digit 3 after group 1.
    \w+ captures a series of word characters (letters, digits and underscore).

  2. Named groups.
    Ever built a complex regex and then had to start counting the groups (1, 2, 3)? What if you add or remove a group? Use ?<...> to give your group a human-readable name. Here we create two groups: first and last.
    The regexp above would become:
    RegExp: (?<last>\w+), (?<first>\w+)
    ReplaceWith: $+{first} $+{last}
    Note the additional + in $+{first}, which is not used with numbered groups ($1, ${1}).

  3. Reuse a pattern.
    A named group can be reused for recurring patterns, which can make your regex more readable.
    If we create a named group for our first/last name pattern, (?<name>\w+), we can reuse it.
    RegExp: (?<last>(?<name>\w+)), (?<first>(?&name))
    ReplaceWith: $+{first} $+{last}
    Note that (?&name) only reuses the pattern of the group name, not its captured value; the text each one matches can differ. There are three groups here: last, name and first.

  4. Reuse a pattern by relative position.
    RegExp: (?<last>(?<name>\w+)), (?<first>(?-2))
    ReplaceWith: $+{first} $+{last}
    (?-2) here goes two groups back (the 1st is its own wrapper, first, and the 2nd is the group name).

  5. Reuse a value: \k{name}
    To match the same captured value again, use \k{name} instead of (?&name). This is similar to what we normally do with \1.

  6. Lookbehind and the magic switch \K.
    Perl supports only fixed-length lookbehind (a pattern we want to check before the current position), but newer versions include the magic switch \K. This switch tells the engine to keep only the part matched after it.
    Example: delete the trailing non-word characters after our pattern above.
    RegExp: (?<first>(?<name>\w+))\s(?<last>(?&name))\K[^\w]+
    ReplaceWith: ""
    Anything before the \K is not touched by the replacement.
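
For readers who want to experiment outside Mp3tag, here is a rough Python counterpart of some of these tips. Python's standard re module spells things differently and is not the engine Mp3tag uses: named groups are written (?P<name>...), replacements use \g<name>, and (?&name), (?-2) and \K are not available there, so tip 6 is only imitated with an ordinary capturing group. The sample strings are mine.

```python
import re

name = "Mercury, Freddy"

# Tip 1: numbered groups in the replacement.
swapped = re.sub(r"(\w+), (\w+)", r"\2 \1", name)

# \g<1>3 is Python's counterpart of ${1}3: group 1 followed by the digit 3.
with_digit = re.sub(r"(\w+), (\w+)", r"\g<1>3", name)

# Tip 2: named groups.
swapped_named = re.sub(
    r"(?P<last>\w+), (?P<first>\w+)", r"\g<first> \g<last>", name
)

# Tip 5: match the same captured value again, like \k{name}.
repeated = re.search(r"(?P<word>\w+) (?P=word)", "Bye Bye Love").group(0)

# Tip 6, imitated: keep the part before the unwanted tail in a group
# and put it back, since \K does not exist in the re module.
trimmed = re.sub(r"(\w+\s\w+)\W+", r"\1", "Freddy Mercury!!!")

print(swapped, with_digit, swapped_named, repeated, trimmed, sep=" | ")
```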

All above are 100% tested in:

  • Actions (replace with regular expression)
    Enter source tag (e.g. artist) in Field:.
    Enter text RegExp into Regular Expression:.
    Enter text ReplaceWith into Replace matches with:.

  • Function $regexp(Text,'Find','Replace')
    Enter source tag (e.g. %artist%) in Text.
    Enter text RegExp into Find.
    Enter text ReplaceWith into Replace.

  • WebSources regexpreplace "Find" "Replace"
    Enter text RegExp into Find.
    Enter text ReplaceWith into Replace.

Happy coding & testing.
