The idea was to look for single letters \w followed by an optional period (capture group 1). After that an optional white space is allowed. The whole thing is embedded in \b anchors.
This works for the example above, but unfortunately \b also triggers right after a single apostrophe ('): \b matches at every transition between a \w and a non-\w character, so a letter that follows an apostrophe counts as the start of a word.
So "Rock 'n Roll" is converted to "Rock 'N. Roll" and "Mother's Finest" -> "Mother'S. Finest"...
Any idea how to improve this?
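For anyone who wants to reproduce the problem outside Mp3tag, here is a minimal Python sketch. It assumes the pattern is the "\b(\w)\.?\s?\b" quoted further down in this thread, and that the replacement is the uppercased letter plus ". " (an assumption inferred from the outputs above). The function name is made up for illustration, and Python's regex engine is not the one Mp3tag uses, so treat it as an approximation.

```python
import re

# Pattern as quoted in the thread; the replacement (uppercase letter + ". ")
# is an assumption inferred from the quoted results.
PATTERN = re.compile(r"\b(\w)\.?\s?\b")

def add_initial_dots(artist: str) -> str:
    """Hypothetical helper: uppercase single letters and append '. '."""
    return PATTERN.sub(lambda m: m.group(1).upper() + ". ", artist)

for name in ("B J Thomas", "Rock 'n Roll", "Mother's Finest"):
    print(f"{name!r} -> {add_initial_dots(name)!r}")
```

The apostrophe is not a \w character, so the position between ' and the following letter is a word boundary, which is why the "n" and the "s" get caught.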
I don't think that replacing "a single letter followed by a space" with "same letter in uppercase followed by a dot and a space" will work as expected.
Just think about ARTIST names like
Cardi B --> without dot at all
B. B. Gabor --> with spaces after the dot
B.B. King --> without spaces after the dot
BB Bronx --> without dot at all
You could work around the problem with letters after an apostrophe if you filter them out (manually with the F3 filter in Mp3tag) or expand the regular expression. Maybe with some kind of negative lookbehind.
Finding a regular expression for all thinkable cases seems to be overly complicated (and quite dangerous), if possible at all.
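A rough sketch of what the negative lookbehind idea could look like, written in Python for easy testing. The pattern "\b(\w)\.?\s?\b" and the ". " replacement are assumptions taken from the rest of this thread, the function name is hypothetical, and whether the same syntax behaves identically in Mp3tag would need to be checked there.

```python
import re

# (?<!') rejects a match whose single letter directly follows an apostrophe.
PATTERN = re.compile(r"(?<!')\b(\w)\.?\s?\b")

def add_initial_dots(artist: str) -> str:
    """Hypothetical helper: like the plain version, but skips letters after '."""
    return PATTERN.sub(lambda m: m.group(1).upper() + ". ", artist)

for name in ("B J Thomas", "Rock 'n Roll", "Mother's Finest"):
    print(f"{name!r} -> {add_initial_dots(name)!r}")
```

This only covers the plain ASCII apostrophe; a typographic apostrophe or other punctuation before a single letter would still slip through.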
You're right - you'll never catch every thinkable combination, but that was not my objective anyway.
Even if you refer to "official/reliable" sources like Discogs, MusicBrainz or AllMusic, you'll find spellings different from what the artist wants it to be:
"B. B. Gabor" is "B.B. Gabor" @ AllMusic, "BB Gabor" @ Discogs and "B. B. Gabor" @ MusicBrainz and Wikipedia.
Therefore I decided to consistently use "X.X. Name" (like B.B. King) if I find the track tagged like "X X Name" or "X. X. Name" and leave the rest as it is (for manual editing).
I've never used lookbehind/lookahead expressions in a regex, maybe it's about time to start...
Btw. the above regex has another flaw: It replaces "X." with "X.."
P.S. Ugly workaround (until I've done my homework on advanced regex):
a) replace ' (single apostrophe) with "000" (Rock 'n Roll -> Rock 000n Roll)
b) regex replace "\b(\w)\.?\s?\b" with $upper($1)
c) replace back "000" with '
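Steps a) to c) can be sketched in Python roughly like this. The "000" placeholder and the pattern are taken from the steps above; appending ". " to the uppercased letter is my assumption about the intended replacement, and the function name is made up.

```python
import re

PLACEHOLDER = "000"  # same stand-in for the apostrophe as in step a)
PATTERN = re.compile(r"\b(\w)\.?\s?\b")

def fix_initials(artist: str) -> str:
    """Hypothetical helper that mirrors steps a) to c)."""
    # a) hide apostrophes so the following letter no longer starts a "word"
    masked = artist.replace("'", PLACEHOLDER)
    # b) uppercase single letters and add ". " (replacement text is assumed)
    fixed = PATTERN.sub(lambda m: m.group(1).upper() + ". ", masked)
    # c) restore the apostrophes
    return fixed.replace(PLACEHOLDER, "'")

for name in ("B J Thomas", "Rock 'n Roll", "Mother's Finest"):
    print(f"{name!r} -> {fix_initials(name)!r}")
```

The trick works because "0" is itself a \w character, so there is no word boundary between the placeholder and the letter after it. The obvious catch: a name that already contains "000" would get an apostrophe in step c).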
Back with a less ugly solution, although not quite the one-liner I wanted to write.
This one does it in two steps, which could be combined into a nested $regexp(), but I refrained from doing so for better readability:
which finds a single letter with no non-whitespace to the left, an optional "." followed by a space. That gives me e.g. "B. J. Thomas". I was not able to eliminate the space between B. and J. while leaving the space before "Thomas" alone. That is done in the next step:
regex 2: "(\w\.)\s+(?=\w\.)"
which finds a single letter followed by a period, one or more whitespace characters and another single letter with a period (the lookahead only checks for that second initial without consuming it).
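To see what regex 2 does on its own, here is a small Python sketch. The pattern is the one given above; replacing the match with capture group 1 is my assumption about the intended replacement, and the function name is hypothetical.

```python
import re

# Regex 2 from above: "X." + whitespace, only if another "X." follows.
INITIAL_GAP = re.compile(r"(\w\.)\s+(?=\w\.)")

def close_initial_gaps(artist: str) -> str:
    """Hypothetical helper: keep the captured 'X.' and drop the whitespace after it."""
    return INITIAL_GAP.sub(r"\1", artist)

for name in ("B. J. Thomas", "B. B. King", "Booker T. & The MG's", "feat. B. King"):
    print(f"{name!r} -> {close_initial_gaps(name)!r}")
```

Because the lookahead does not consume the second initial, a run like "J. R. R. Tolkien" collapses in a single pass. One thing to watch: the first letter is not anchored to a word start, so any word ending in a period before an initial is affected too, e.g. "feat. B. King" becomes "feat.B. King".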
But anyway, like you already noticed, this works for a lot of cases, but you'll never know when some line surfaces where it doesn't. And since my action group "Correct Artist Spellings" already contains a lot of actions to add a missing "The" or remove a wrong "The", correct upper/lowercase, correct incorrect notations (I was surprised how many different ways there are to spell "Booker T. & The MG's") etc., it doesn't really hurt to add some more lines for artists with initials. And that comes with the bonus of taking various notations into account.