Regular Expressions

A few tips can help reduce the complexity of regular expressions and will likely make them run faster (see point 5).
Mp3Tag uses Perl Regular Expressions.

  1. $ vs \ in replacement section.
    e.g. swap an artist name: Mercury, Freddy --> Freddy Mercury
    RegExp: (\w+), (\w+)
    ReplaceWith: \2 \1
    \2 \1 is equivalent to $2 $1 and ${2} ${1}
    The latter is very useful if a fixed digit must be added right after a group, e.g. ${1}3 adds the digit 3 after group 1, whereas $13 could be read as group 13.
    \w+ matches a run of word characters (letters, digits and underscore).
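
For anyone who wants to try the numbered-group swap outside Mp3tag, here is a minimal sketch in Python. It assumes Python's standard `re` module, whose replacement syntax differs from Mp3tag's: `\g<1>` plays the role of `${1}`, and `$1` is not supported.

```python
import re

# Swap "Last, First" into "First Last" using numbered groups.
swapped = re.sub(r"(\w+), (\w+)", r"\2 \1", "Mercury, Freddy")
print(swapped)

# A digit directly after a group reference is ambiguous (r"\13" means group 13),
# so Python uses \g<1> where Mp3tag would use ${1}.
with_digit = re.sub(r"(\w+), (\w+)", r"\g<1>3", "Mercury, Freddy")
print(with_digit)
```

The first call prints the swapped name; the second keeps only group 1 and appends the digit 3.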

  2. Named group.
    Ever built a complex regex and then had to count the groups (1, 2, 3)? What happens if you add or remove a group? Use ?<...> to give your group a human-readable name. Here we create two groups: first and last.
    The regexp above would become:
    RegExp: (?<last>\w+), (?<first>\w+)
    ReplaceWith: $+{first} $+{last}
    Note the additional + in $+{first}, which is not used with numbered groups ($1, ${1}).
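
The same idea can be sketched in Python, assuming the standard `re` module. Python spells a named group `(?P<name>...)` and refers to it in the replacement as `\g<name>`, so this is an analogue of the Mp3tag syntax rather than a copy of it. The optional `title` group is a hypothetical addition, included only to show that the replacement does not need renumbering.

```python
import re

# Named groups: the replacement refers to names, not positions.
pattern = re.compile(r"(?P<last>\w+), (?P<first>\w+)")
result = pattern.sub(r"\g<first> \g<last>", "Mercury, Freddy")
print(result)

# Adding a (hypothetical) optional group in front shifts the group numbers,
# but the name-based replacement string stays unchanged.
pattern_extra = re.compile(r"(?P<title>Mr\. )?(?P<last>\w+), (?P<first>\w+)")
result_extra = pattern_extra.sub(r"\g<first> \g<last>", "Mercury, Freddy")
print(result_extra)
```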

  3. Reuse pattern.
    A named group can be reused for recurring patterns, which can make your regex more readable.
    If we create a group called name for our first/last name pattern, (?<name>\w+), we can reuse it with (?&name).
    RegExp: (?<last>(?<name>\w+)), (?<first>(?&name))
    ReplaceWith: $+{first} $+{last}
    Note that (?&name) only reuses the pattern of the group name, not its captured value, so the text it matches can be anything that fits the pattern. There are three groups here: last, name and first.

  4. Reuse pattern by relative position.
    RegExp: (?<last>(?<name>\w+)), (?<first>(?-2))
    ReplaceWith: $+{first} $+{last}
    (?-2) here goes two groups back (the 1st is its wrapper, first, and the 2nd is the group name).

  5. Reuse value. \k{name}
    To match the same value again, use \k{name} instead of (?&name). This is similar to what we normally do with \1.
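
The difference between reusing a pattern (points 3 and 4) and reusing a value (point 5) can be sketched in Python. Two assumptions apply: Python's standard `re` module has no `(?&name)`, so pattern reuse is emulated here by interpolating a shared string (the hypothetical `NAME` constant), and `(?P=name)` is Python's spelling of `\k{name}`.

```python
import re

NAME = r"\w+"  # hypothetical shared sub-pattern, standing in for (?&name)

# Reusing the pattern: both words must look like a name, but may differ.
same_pattern = re.compile(rf"(?P<last>{NAME}), (?P<first>{NAME})")

# Reusing the value: the second word must repeat the first capture exactly.
same_value = re.compile(rf"(?P<name>{NAME}), (?P=name)")

pattern_hit = same_pattern.fullmatch("Mercury, Freddy") is not None
value_miss = same_value.fullmatch("Mercury, Freddy") is None
value_hit = same_value.fullmatch("Duran, Duran") is not None
print(pattern_hit, value_miss, value_hit)
```

The pattern-reuse regex accepts two different names, while the value-reuse regex only accepts a repeated one.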

  6. LookBehind. Magic switch \K.
    Perl supports only fixed-length lookbehind (a pattern that must match just before the current position), but recent versions include the magic switch \K. This switch tells the engine to use only the part matched after it.
    Example: delete trailing non-word characters after our pattern above.
    RegExp: (?<first>(?<name>\w+))\s(?<last>(?&name))\K[^\w]+
    ReplaceWith: "" (empty)
    Anything matched before the \K is not touched by the replacement.
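
As a rough illustration of what \K buys you, here is a sketch in Python. Python's standard `re` module has no \K, so the effect is emulated by capturing the prefix and writing it back. The input string and the `Mercury` literal are made up for the example. The last part shows the fixed-length lookbehind limitation, which Python's `re` shares.

```python
import re

text = "Freddy Mercury !!!"

# Emulating \K: capture the part to keep and put it back in the replacement.
cleaned = re.sub(r"(\w+\s\w+)\W+", r"\1", text)
print(cleaned)

# A fixed-length lookbehind works, but only for a prefix of known length.
cleaned_lookbehind = re.sub(r"(?<=Mercury)\W+", "", text)
print(cleaned_lookbehind)

# A variable-length lookbehind is rejected by Python's re module.
try:
    re.compile(r"(?<=\w+)\W+")
    variable_ok = True
except re.error:
    variable_ok = False
print(variable_ok)
```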

All of the above has been 100% tested in:

  • Actions (replace with regular expression)
    Enter source tag (e.g. artist) in Field:.
    Enter text RegExp into Regular Expression:.
    Enter text ReplaceWith into Replace matches with:.

  • Function $regexp(Text,'Find','Replace')
    Enter source tag (e.g. %artist%) in Text.
    Enter text RegExp into Find.
    Enter text ReplaceWith into Replace.

  • WebSources regexpreplace "Find" "Replace"
    Enter text RegExp into Find.
    Enter text ReplaceWith into Replace.

Happy coding & testing.
