Selecting terms between interchangeable (, [, { brackets

I'm actually revisiting the problem from my last post in a different way:
I am attempting to have this Format string: $regexp(%title%,(.*)\((.*?[m|M]ix|.*?[e|E]dit|.*?[v|V]ersion|.*?[a|A]coustic)\),$1'['$2']')
not only find text inside of this:
"SongName (Feat. Artist2) (MixArtist Remix)"
but also either of these:
"SongName (Feat. Artist2) {MixArtist Remix}"
"SongName (Feat. Artist2) [MixArtist Remix]"

I know that given the context of the format string, the selection in the third seems redundant and unnecessary, but I'm choosing to include it to avoid confusion later on. As always, any and all help is appreciated - Thanks!

I thought that perhaps I had found something that would work:
$regexp(%title%,(.*)[\[|\(|\{](.*?[m|M]ix|.*?[e|E]dit|.*?[v|V]ersion|.*?[a|A]coustic)[\)|\]|\}],$1 [$2])
I also attempted this:
$regexp(%title%,(.*)[[{(](.*?[m|M]ix|.*?[e|E]dit|.*?[v|V]ersion|.*?[a|A]coustic)[])}],$1 [$2])
but it doesn't seem that the grouping of the bracket types worked properly. Maybe somebody can show me what went wrong.

You want to find conditionally matching brackets of three different types. That is either impossible or very verbose within a single regular expression.

If you wanted to find pairs of matching simple quotation marks ' and ", that is possible in many engines: (["'])[^"']*\1 or ("|').*?\1, or some other combination of matching the start character, optionally advancing and then matching that start character again as the end character via a backreference.
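As a rough illustration of the backreference idea, here is a minimal sketch using Python's `re` module. Python is only a stand-in here; Mp3tag's engine may differ in details, and the helper name `quoted_spans` is made up for this example.

```python
import re

# Capture the opening quote in group 1, then demand the same character
# again as the closing quote via the backreference \1.
QUOTED = re.compile(r"""(["'])[^"']*\1""")

def quoted_spans(text):
    """Return every quoted substring whose opening and closing quotes match."""
    return [m.group(0) for m in QUOTED.finditer(text)]

# The last pair opens with " but closes with ', so it is not reported.
samples = quoted_spans("""say "hello" or 'bye' but not "oops'""")
print(samples)
```

With only two quote characters this stays short. The bracket case is harder because the closing character differs from the opening one, so a plain backreference cannot express it.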

Your current start character is a left parenthesis ( which needs to be escaped \( but is not yet in a matching group (\(). Likewise, your current end character is a right parenthesis (\)). You could add square brackets and curly braces to these groups: either (\(\[\{) and (\)\]\}) or ([([{]) and ([)\[}]). This would, however, also match unequal pairs like (], {) and [}. If that is good enough because you will never encounter nested brackets, your solution (with non-capturing groups – otherwise $2 must become $3) could be:

$regexp(%title%,(.*)(?:[([{])(.*?[mM]ix|.*?[eE]dit|.*?[vV]ersion|.*?[aA]coustic)(?:[)\[}]),$1'['$2']')
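To see what "any opener, any closer" means in practice, here is a small sketch in Python's `re` rather than Mp3tag. It is an approximation of the idea, not a drop-in format string: the names `TERMS`, `PATTERN` and `normalize` are invented for the example, and the closing class is written as `[)\]}]`.

```python
import re

# The four terms from the thread, factored into one non-capturing group.
TERMS = r"(?:[Mm]ix|[Ee]dit|[Vv]ersion|[Aa]coustic)"

# Any one opening bracket, a lazy run ending in one of the terms,
# then any one closing bracket. The pairs are not required to match.
PATTERN = re.compile(r"(.*)[(\[{](.*?" + TERMS + r")[)\]}]")

def normalize(title):
    """Rewrite the bracketed mix/edit/version part with square brackets."""
    return PATTERN.sub(r"\1[\2]", title)

print(normalize("SongName (Feat. Artist2) {MixArtist Remix}"))
print(normalize("SongName (MixArtist Remix]"))  # unequal pair is accepted too
```

Titles without one of the terms inside brackets are returned unchanged, because the pattern never matches.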


PS: You had vertical bars | inside character classes like [a|A]. Inside a class the bar is just another literal character; it only separates alternatives ("or") within groups such as (a|A). A plain [aA] is enough.
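A quick way to convince yourself of this, sketched in Python's `re` (used here only as a convenient test bench, not as a model of Mp3tag's engine):

```python
import re

with_bar = re.compile(r"[m|M]ix")    # class of three characters: m, | and M
without_bar = re.compile(r"[mM]ix")  # class of two characters: m and M

# The stray bar makes the odd string "|ix" a valid match.
bar_matches = bool(with_bar.fullmatch("|ix"))
clean_matches = bool(without_bar.fullmatch("|ix"))

print(bar_matches, clean_matches)
```

Both patterns still match "mix" and "Mix", so the extra bar rarely breaks anything; it just widens the class by one unintended character.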


This should work well, as none of the song titles I have encountered have nested brackets of any form. As far as matching mixed brackets, like ( ] and [ }, that was part of the goal too (I should have mentioned) due to an error I've made with another format some time ago.

You're right; I suppose I completely forgot that the purpose of a character class is to 'choose one of these' by default.

You've helped me greatly, and I had a couple of questions, but I've narrowed it down to just one: did you place a character class inside the non-capturing groups? I believe that's how I should be reading that. If so, now I know that's possible. I just thought that the ] would need to be escaped like \].

Actually, perhaps this doesn't work. Using this
$regexp(%title%,(.*)(?:[([{])(.*?[mM]ix|.*?[eE]dit|.*?[vV]ersion|.*?[aA]coustic)(?:[)\[}]),$1'['$2']')
resulted in [ SYNTAX ERROR IN FORMATTING STRING ].
I thought perhaps the error was within the second non-capturing group, here:
(?:[) >> \[ << }])
but after changing that to what I expected to solve the issue:
$regexp(%title%,(.*)(?:[([{])(.*?[mM]ix|.*?[eE]dit|.*?[vV]ersion|.*?[aA]coustic)(?:[)\]}]),$1'['$2']')
I got this as a result:
SongName (Ft. Artist2) (MixArtist Remix) [][]
The same result is found if using:
$regexp(%title%,(.*)(?:\(\{\[)(.*?[mM]ix|.*?[eE]dit|.*?[vV]ersion|.*?[aA]coustic)(?:\)\}\]),$1'['$2']')
or
$regexp(%title%,(.*)([({[])(.*?[mM]ix|.*?[eE]dit|.*?[vV]ersion|.*?[aA]coustic)([)}\]]),$1'['$3']')
I must be missing something. Maybe I misunderstood something you said, but I'm not sure.

Oddly enough, using the same exact format here:


shows (.*)(?:[({[])(.*?[mM]ix|.*?[eE]dit|.*?[vV]ersion|.*?[aA]coustic)(?:[)}\]]) to be a functioning and proper solution.

Parsers subtly differ in their requirements to escape special characters. Additional backslashes usually don't hurt: (?:[\(\{\[]) and (?:[\)\}\]]).
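For what it is worth, the "extra backslashes are harmless" point can be checked in one particular engine. The sketch below uses Python's `re`, so it says nothing definite about how Mp3tag's own parser treats the same text.

```python
import re

escaped = re.compile(r"[\(\{\[]")  # every bracket escaped
minimal = re.compile(r"[({\[]")    # only the square bracket escaped

# Both classes should accept and reject exactly the same characters.
probe = "([{)]}aZ "
same = all(
    bool(escaped.fullmatch(ch)) == bool(minimal.fullmatch(ch))
    for ch in probe
)
print(same)
```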


Using $regexp(%TITLE%,(.*)(?:[\(\{\[])(.*?[mM]ix|.*?[eE]dit|.*?[vV]ersion|.*?[aA]coustic)(?:[\)\}\]]),$1 [$2])
on these:
SongName (Feat. Artist2) (MixArtist Remix)
Summer Days (feat. Macklemore & Patrick Stump of Fall Out Boy) (Junior Sanchez Remix)
unfortunately did not work. I noticed that there WAS a small change, though: the titles now had two spaces at the end (which I could not see at first glance).
I also noticed that by simplifying down to this:
$regexp(%TITLE%,(.*)\((.*?[mM]ix|.*?[eE]dit|.*?[vV]ersion|.*?[aA]coustic)\),$1 [$2])
I got this result:
SongName (Feat. Artist2)>2_spaces_here<
Could I be capturing just spaces and/or be misusing the replace part at the end of the function?
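One way to look at the capturing half of that question outside Mp3tag is to print the groups directly. This is only a sketch in Python's `re`, which may not behave exactly like Mp3tag's engine, and it cannot say anything about how Mp3tag reads the replacement part.

```python
import re

# Same expression as the simplified format string, parentheses only.
PATTERN = re.compile(r"(.*)\((.*?[mM]ix|.*?[eE]dit|.*?[vV]ersion|.*?[aA]coustic)\)")

match = PATTERN.search("SongName (Feat. Artist2) (MixArtist Remix)")
group1, group2 = match.group(1), match.group(2)

# repr() makes a trailing space in group 1 visible.
print(repr(group1))
print(repr(group2))
```

In this engine the first group ends with a space and the second group holds the remix text, so the captures themselves look fine. A trailing space from the first group plus the literal space in the replacement would account for two spaces at the end if the bracketed part of the replacement came out empty.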

I can confirm that this works for just parentheses:
$regexp(%title%,(.*)\((.*?[Aa]coustic|.*?[Ee]dit|.*?[Mm]ix|.*?[Vv]ersion)\),$1'['$2']')
and has helped me with this:


Before Format: SongName (Feat. Artist2) (MixArtist Remix)
Field: TITLE
Format string: $regexp(%title%,(.*)\((.*?[Aa]coustic|.*?[Ee]dit|.*?[Mm]ix|.*?[Vv]ersion)\),$1'['$2']')
<<<<<<<<<<Resulting In>>>>>>>>>>
After Format: SongName (Feat. Artist2) [MixArtist Remix]

However, attempting to use the workaround for other bracket formats didn't go so well:


Before Format: SongName (Feat. Artist2) (MixArtist Remix)
Field: TITLE
Format string: $regexp(%title%,(.*)(?:[\(\{\[])(.*?[Aa]coustic|.*?[Ee]dit|.*?[Mm]ix|.*?[Vv]ersion)(?:[\)\}\]]),$1'['$2']')
<<<<<<<<<<Resulting In>>>>>>>>>>
After Format: SongName (Feat. Artist2) (MixArtist Remix)[][]

Perhaps this could help. I currently use this Format string to place my Feat. Artist directly into my artist tag field including within multi-artist songs:
$meta_sep(artist,\\)\\$regexp(%TITLE%,'^(.+?)\s+[[({]?(?:ft\.?|feat\.?|featuring)\s+([^][(){}]+)[])}]?(\s+.+)?$','$2',1)
It can also be used this way:


Title: SongName {Feat. Artist2] (MixArtist Remix)
Field: TITLE
Format string: $regexp(%TITLE%,'^(.+?)\s+[[({]?(?:ft\.?|feat\.?|featuring)\s+(?:[^][(){}]+)[])}]?(\s+.+)?$','$1$2',1)
<<<<<<<<<<Results In>>>>>>>>>>
Title: SongName (MixArtist Remix)
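For anyone who wants to poke at this expression outside Mp3tag, here is a rough Python equivalent. It is a sketch under a few assumptions: `re.IGNORECASE` stands in for the trailing `1` in the `$regexp` call (which I take to be a case-insensitivity switch), the brackets inside the classes are escaped to suit Python, and the names `featured_artist` and `strip_feat` are made up.

```python
import re

FEAT = re.compile(
    r"^(.+?)\s+[\[({]?(?:ft\.?|feat\.?|featuring)\s+([^\]\[(){}]+)[\])}]?(\s+.+)?$",
    re.IGNORECASE,  # assumed equivalent of the final 1 argument
)

def featured_artist(title):
    """Return the featured artist, or None when there is no feat. part."""
    match = FEAT.match(title)
    return match.group(2) if match else None

def strip_feat(title):
    """Drop the feat. part and keep the title plus whatever follows it."""
    return FEAT.sub(lambda m: m.group(1) + (m.group(3) or ""), title)

print(featured_artist("SongName {Feat. Artist2] (MixArtist Remix)"))
print(strip_feat("SongName {Feat. Artist2] (MixArtist Remix)"))
```

Note that the whole expression sits between single quotes in the Mp3tag version, and the bracket classes there accept unequal pairs such as `{` with `]`.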


I think everything we need is there, I just still can't figure it out.

I think I've figured it out :smiley:
$regexp(%title%,'^(.+)\s[[({](.+?[Mm]ix|.+?[Aa]coustic|.+?[Ee]dit|.+?[Vv]ersion|.+?[Dd]ub)[])}]?$',$1 '['$2']')
This seems to work, so I'll likely stick with it! It's pretty modular, so if anybody needs to add/subtract a term, they can just add to/remove from the list :slight_smile:
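Here is the same expression tried against all three bracket styles in Python's `re`, as a sanity check rather than a statement about Mp3tag itself. The name `to_square` is hypothetical and the bracket classes are escaped for Python.

```python
import re

WORKING = re.compile(
    r"^(.+)\s[\[({](.+?[Mm]ix|.+?[Aa]coustic|.+?[Ee]dit|.+?[Vv]ersion|.+?[Dd]ub)[\])}]?$"
)

def to_square(title):
    """Rewrite the trailing bracketed term using square brackets."""
    return WORKING.sub(r"\1 [\2]", title)

titles = [
    "SongName (Feat. Artist2) (MixArtist Remix)",
    "SongName (Feat. Artist2) {MixArtist Remix}",
    "SongName (Feat. Artist2) [MixArtist Remix]",
]
results = [to_square(t) for t in titles]
for before, after in zip(titles, results):
    print(before, "->", after)
```

Adding or removing a term is a matter of editing one alternative in the second group, which matches the "modular" point above.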

Slightly revised:

$regexp(%title%,
'^(.+)\s+[[({](.+?(?:[Mm]ix|[Aa]coustic|[Ee]dit|[Vv]ersion|[Dd]ub))[])}]?\s*$'
,
$1 '['$2']'
)

This only supports the last pair of brackets, though.
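A short Python sketch of both properties of the revised expression, namely tolerance for trailing whitespace and the last-pair limitation. Python's `re` is an assumption standing in for Mp3tag's engine, and `to_square` is an invented helper name.

```python
import re

REVISED = re.compile(
    r"^(.+)\s+[\[({](.+?(?:[Mm]ix|[Aa]coustic|[Ee]dit|[Vv]ersion|[Dd]ub))[\])}]?\s*$"
)

def to_square(title):
    """Apply the revised expression; unmatched titles come back unchanged."""
    return REVISED.sub(r"\1 [\2]", title)

# Trailing spaces after the closing bracket are absorbed by \s*.
trailing_space = to_square("SongName {MixArtist Remix}  ")

# A further bracket pair after the remix part prevents any match.
not_last = to_square("SongName (MixArtist Remix) (Live)")

print(repr(trailing_space))
print(repr(not_last))
```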


While I believe I will only see these as the last brackets, I should put in a countermeasure so that it works whether or not it's the last pair. Thanks for pointing that out. I see what you did with the cleanup work in the Mix/acoustic/etc. matching. That looks good!
The addition of the greedy quantifier is a good touch too. I can probably just do the same at the end followed by a third capture group to make this more of a cherry-picking tool:
Expression: '^(.+)\s+[[({](.+?(?:[Mm]ix|[Aa]coustic|[Ee]dit|[Vv]ersion|[Dd]ub))[])}]?(.+)?$'
Replacement: $1 '['$2']'$3
This way it will work even if followed by anything. I used this as a rather silly example/test:


Title before: Song1 Has Long Name (Feat. Artist2) {MixArtist Remix) (Other stuff) [Even more] wow [this is ridiculous}
Field: TITLE
Format string: $regexp(%title%,'^(.+)\s+[[({](.+?(?:[Mm]ix|[Aa]coustic|[Ee]dit|[Vv]ersion|[Dd]ub))[])}]?(.+)?$',$1 '['$2']'$3)
<<<<<<<<<<Resulting In>>>>>>>>>>
Song1 Has Long Name (Feat. Artist2) [MixArtist Remix] (Other stuff) [Even more] wow [this is ridiculous}
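The three-group variant can be replayed in Python's `re` as a rough check. This is a sketch that assumes Python's matching is close enough to Mp3tag's for this pattern; `to_square` is a made-up name, and the lambda is there only because the third group may be absent.

```python
import re

CHERRY_PICK = re.compile(
    r"^(.+)\s+[\[({](.+?(?:[Mm]ix|[Aa]coustic|[Ee]dit|[Vv]ersion|[Dd]ub))[\])}]?(.+)?$"
)

def to_square(title):
    """Rewrite the matching bracket pair and keep any text that follows it."""
    return CHERRY_PICK.sub(
        # Group 3 is optional, so fall back to "" when nothing follows.
        lambda m: f"{m.group(1)} [{m.group(2)}]{m.group(3) or ''}",
        title,
    )

silly = (
    "Song1 Has Long Name (Feat. Artist2) {MixArtist Remix) "
    "(Other stuff) [Even more] wow [this is ridiculous}"
)
print(to_square(silly))
print(to_square("SongName (MixArtist Remix)"))
```

Because the first group is greedy, the expression picks the last bracket that contains one of the terms, so a title with two such brackets would only have the later one rewritten.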
