Improve Regex Documentation

Took me a while to figure out why ..

$regexp('[abc]',\],)

.. throws a syntax error.

It's not documented at all that you should wrap the regex pattern in quotes xD (Mp3tag Help - Configuration / Actions)

I highly recommend doing this!

This would also prevent rough disputes like this one I saw :face_with_raised_eyebrow: Regular Expression To Replace '['

Hope this helps

I'm not sure what you want to achieve with your Regular Expression.

The syntax for $regexp is:
$regexp(x,expr,repl) replaces the pattern specified by the regular expression expr in the string x by repl. The fourth optional parameter enables ignore case (1) or disables the ignore case setting (0).
Please note that you have to escape comma and other special characters in expr.

If you want to replace the characters a or b or c with a closing square bracket ], you also have to add the field/tag to which you want to apply it. For ARTIST it should look like:
$regexp(%artist%,'[abc]',\']')
or
$regexp(%artist%,'[abc]',\']',1)

This would change the content in the tag ARTIST from
Articst
to
]rti]st

If you try such expressions with the Converter Tag -> Tag, you can see immediately whether you get the desired result or a [ SYNTAX ERROR IN FORMATTING STRING ]
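
A quick way to sanity-check what the regular expression itself does, independent of Mp3tag's quoting rules, is to run it in another regex engine. The following is a minimal sketch in Python. The helper name replace_abc is made up for illustration, and Python's re module is not the engine Mp3tag uses, so it only illustrates the character class and the ignore-case switch, not the $regexp syntax.

```python
import re


def replace_abc(text: str, ignore_case: bool = False) -> str:
    """Replace each a, b or c in text with a closing square bracket."""
    flags = re.IGNORECASE if ignore_case else 0
    # In Python the pattern is an ordinary string, so no Mp3tag-style
    # single-quote escaping of [ and ] is involved here.
    return re.sub(r"[abc]", "]", text, flags=flags)


print(replace_abc("Articst", ignore_case=True))   # ]rti]st
print(replace_abc("Articst", ignore_case=False))  # Arti]st
```

In this sketch, the ]rti]st result corresponds to the ignore-case run. A case-sensitive match leaves the capital A untouched.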

Please don't pollute this thread.
I mean exactly what I said.
I'm not looking for a solution to some personal problem, my goal is to improve the current documentation.


This throws a syntax error:

$regexp('[abc]',\],)

The solution is to escape the regex pattern with single quotes.

$regexp('[abc]','\]',)

This is not obvious, in my opinion, because the regex syntax is part of the scripting language, which normally is the stuff that is NOT escaped.
This thread is about how to improve the documentation in that regard.

Not exactly. The solution is to take the special characters of the scripting language into account. Because you can also escape like this:

$regexp('['abc']',\']',) or
$regexp('[''abc]',\']',) ...

Escaping the pattern in single quotes might work most of the time, but if the pattern contains a single quote it will fail. It would also reflect a wrong understanding of the syntax.
The regexp pattern and the other strings must conform to the overlying scripting syntax.

So first build your regex pattern, then check whether it contains special characters from the scripting language, and then escape these.
Of course, starting and ending with a single quote is the easiest and most readable way, but we need to understand why it works.
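
The "build the pattern first, then escape for the scripting language" procedure can be sketched as a small helper. This is a rough illustration in Python, not anything Mp3tag provides: the names SCRIPT_SPECIALS, quote_for_script and build_regexp_call are hypothetical, the list of special characters is the one mentioned in this thread (%[]$(,)), and the generated strings have not been run through Mp3tag. Strings that contain a single quote are rejected, because their handling is a separate question.

```python
# Characters this thread names as special in the scripting syntax (assumption).
SCRIPT_SPECIALS = set("%$[](),")


def quote_for_script(text: str) -> str:
    """Wrap text in single quotes if it contains a scripting special character."""
    if "'" in text:
        # Single quotes need their own handling, which this sketch does not cover.
        raise ValueError("text contains a single quote; not handled in this sketch")
    if any(ch in SCRIPT_SPECIALS for ch in text):
        return "'" + text + "'"
    return text


def build_regexp_call(field: str, pattern: str, replacement: str) -> str:
    """Assemble a $regexp(...) format string from an already finished regex."""
    return "$regexp(%{}%,{},{})".format(
        field, quote_for_script(pattern), quote_for_script(replacement)
    )


print(build_regexp_call("artist", "[abc]", r"\]"))  # $regexp(%artist%,'[abc]','\]')
```

The point of the two-step design is that the regex stays a normal regex. The quoting is applied afterwards and only exists to get the string past the scripting parser.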

Thanks for the correct and detailed explanation.

My goal is to give you a cue to improve your documentation, in your interest.
I think it would be beneficial to provide the explanation you gave me to the users right away in the documentation, without requiring them to resort to the forum.

Please try to get into the perspective of a documentation reader.
I would argue that it is not clear whether the square brackets have to be escaped in regex patterns or not.

The docs say that . | * ? + ( ) { } [ ] ^ $ are special regex symbols.

My initial understanding after reading this was that, just like the symbols for scripting functions $(,) and conditional strings [], the regex symbols are part of the scripting language, and that escaping them with quotes would actually disable their functionality and turn them into plain strings. I assumed that the code automatically determines from the commas where the regex starts and ends, and then processes the regex pattern like a normal regex pattern in programming (without additional string escaping of square brackets).

It's quite peculiar that the program expects the beginning or end of a conditional string (with [ or ]) INSIDE a regex pattern and requires the user to escape it (there isn't even a scenario where you would NOT escape it). Maybe you disagree, because you are used to different programming languages, but surely you can acknowledge that a documentation reader who is unsure how to escape square brackets in regex patterns would not get enough information from the docs to tell.

You can do with my cue what you want, in the end. But maybe it matters to you.

I suggest adding one sentence to the introductory paragraph of the regular expressions' usage details (here: Mp3tag Help - Configuration / Actions), informing the user that each "Regular expression" must conform to the global string escape rules concerning %[]$(,), which have to be escaped with single quotes.

I'm not sure if this is the right place to explain this in the documentation. The action Replace with regular expression does not require escaping the characters of the scripting language. It simply takes a regular expression as input.

You've stumbled on this while using $regexp, which is documented at Mp3tag Help — Scripting functions. The documentation of this function states

Please note that you have to escape comma and other special characters in expr.

The special characters are listed on the same page at Characters with special functionality.

So I think the correct place to improve the documentation would be on this page. I'm also not really sure how to improve that bit of the documentation, as I understand it as already fairly complete.

Please correct me if I'm missing something and please do send suggestions (ideally ones that are compatible with the compactness of the documentation).

Okay, thanks for pointing this out.

You're right. Technically, it is documented.
Still, it's not that easy to truly grasp and understand this without an example.
I know, in the current compact style of your docs, there is no space for examples.
However, adding examples to all of the scripting functions (e.g. in an expandable/collapsible container so that the documentation stays compact) would definitely be a valuable improvement to the documentation (especially for less tech-savvy users) and would have prevented my misunderstanding right from the beginning.

As this would mean a lot of work, I promise to not be disappointed if you don't have time for it :wink:



Side Note:
I would actually prefer a different scripting language syntax that is more intuitive and needs way less escaping. This is just for inspiration; I'm not advocating that you abruptly change the scripting language that is currently in use xD

For instance, I really like JavaScript's "Template Literal" syntax (Template literals (Template strings) - JavaScript | MDN).
The default mode is plain text, and expressions start with ${ and end with }. So the only character you would ever have to escape is the $ symbol, which is done with a backslash: \$.
Inside the ${...} you can put ANY code (including arbitrary whitespace) without having to care about escaping whatsoever, because the script engine knows it is in "code mode" now ! ;D

You could implement it like that:

${%artist} - ${%album} - ${num(%track, 2)} - ${ regexp(%title, /[^\w]{1,}/, '�') }
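
To make the "plain text by default, code inside ${...}" idea a bit more concrete, here is a rough sketch of how such a string could be split into text and code parts. It is written in Python purely for illustration: split_template is a hypothetical name, nothing like it exists in Mp3tag, and the brace counting is an assumption added so that a regex quantifier such as {1,} inside the code part does not end the expression early.

```python
def split_template(template: str) -> list:
    """Split a template into ("text", ...) and ("code", ...) parts."""
    parts = []
    text = []
    i, n = 0, len(template)
    while i < n:
        ch = template[i]
        if ch == "\\" and i + 1 < n and template[i + 1] == "$":
            text.append("$")  # "\$" stands for a literal dollar sign
            i += 2
        elif template.startswith("${", i):
            if text:
                parts.append(("text", "".join(text)))
                text = []
            depth, j = 1, i + 2
            # Count nested braces so that "{1,}" inside the code is kept whole.
            while j < n and depth > 0:
                if template[j] == "{":
                    depth += 1
                elif template[j] == "}":
                    depth -= 1
                j += 1
            if depth != 0:
                raise ValueError("unterminated ${...} expression")
            parts.append(("code", template[i + 2 : j - 1].strip()))
            i = j
        else:
            text.append(ch)
            i += 1
    if text:
        parts.append(("text", "".join(text)))
    return parts


print(split_template(r"${%artist} - ${num(%track, 2)} costs \$5"))
# [('code', '%artist'), ('text', ' - '), ('code', 'num(%track, 2)'), ('text', ' costs $5')]
```

Even in this small sketch, an unbalanced } inside the code part would still need some form of escaping, so "no escaping whatsoever" only holds as long as the braces in the code are balanced.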



Side Question:
Is your code open source, or is there an API in any programming language for the Mp3tag core functionality? Because sometimes, the ability to write more advanced scripts outside the UI and its custom scripting language would be nice. I am not aware of any open source library that supports meta tags as well, broadly, robustly and easily as your project. It's really great. Why not share this core code?


Anyway. I think we're done here. Thanks for your open ear.