Count spaces before a "("

Hi people.
I'm working on a way to address common issues with renaming artists and songs with "The" as the first word.
I realise some people do not like "Artist, The", but I will campaign for this nomenclature: when you have Artist = Eagles and Artist = Eagles, The, the two are easier to spot in your "sorted by artist" list.

So far I've come up with a method of renaming some of the TITLES, e.g.
Artist - The Song Title (Some Text) ... becomes
Artist - Song Title, The (Some Text)

This is part way there as my made up "rules" for artist and title total about four.

I'm also having an issue with regex being greedy, so I moved to Format Value.

Can I incorporate / mix regex and format value?

The ideal outcomes / rules are:

  1. The Artist ==> Artist, The
  2. The Artist word word word ==> Artist word word word (drops "The")
  3. The Artist feat. Artist Two ==> Artist, The feat. Artist Two
  4. Artist One feat. The Artist ==> Artist One feat. Artist, The
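As a way to check the four artist rules outside Mp3tag, here is a minimal Python sketch. It is not Mp3tag syntax, and the function names `move_the` and `fix_artist` are made up for illustration. It assumes that "Artist" in rules 1, 3 and 4 means a single word, and that " feat. " is the only separator between artists.

```python
import re

def move_the(name: str) -> str:
    """Apply rules 1 and 2 to a single artist name (hypothetical helper)."""
    match = re.match(r"^The\s+(.+)$", name)
    if not match:
        return name              # does not start with the word "The"
    rest = match.group(1).strip()
    if " " in rest:
        return rest              # rule 2: several words follow, drop "The"
    return f"{rest}, The"        # rule 1: one word follows, move "The" behind it

def fix_artist(artist: str) -> str:
    """Rules 3 and 4: treat each side of ' feat. ' as its own artist name."""
    return " feat. ".join(move_the(part) for part in artist.split(" feat. "))

print(fix_artist("The Eagles"))
print(fix_artist("Artist One feat. The Artist"))
```

Because the pattern requires whitespace after "The", names such as "Then Jerico" are left alone.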

Title is the same as rules 1 and 2 for Artist SO
The word word word (text in brackets) ==> word word word (text in brackets)
BUT
The word (text in brackets) ==> Word, The (text in brackets) *1

*1 I have done this successfully with regex. However, it fails when the Title is:
The word (text) (in) (brackets) as I'm using a greedy operator.

So that's when I switched to Format Value.

It would be great if I could count the number of spaces before a ( so I can use the $if function.

I'll keep working on it but any tips on how to count spaces before a bracket would help lots. Cheers
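To make the goal concrete, here is a small Python sketch of "count the spaces before the first bracket". It only illustrates the logic and is not an Mp3tag expression; the function name `spaces_before_bracket` is hypothetical, and it assumes words are separated by single spaces.

```python
def spaces_before_bracket(title: str) -> int:
    """Count the spaces in the part of the title before the first "("."""
    head = title.split("(", 1)[0]    # whole title if there is no "("
    return head.strip().count(" ")   # ignore the space right before "("

print(spaces_before_bracket("The Panama (Intro Clean)"))
print(spaces_before_bracket("The Owner Of A Lonely Heart (Remix) (Intro Clean)"))
```

A result of 1 means exactly one word follows "The", which is the case where ", The" would be appended; anything higher is the "drop The" case.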

Please show us the actions and the real strings that you have.
Also, the attempts with regular expressions that allegedly failed would be nice to see.

I doubt that this is a title. That looks much more like a filename. So, what are we talking about?
If it is the filename, then also please show us the contents of the corresponding tag fields.
It is much simpler to work on structured data in fields than the amorphous single string of a filename.

Just an example for a regular expression that moves the first The behind the second word:
$regexp('The Good, The Bad and the Ugly (text in brackets)',The (.*?) (.*),'$1, The $2')
leads to
Good,, The The Bad and the Ugly (text in brackets)
Does not look right to me in respect to meaning and punctuation, actually.

Hey, yes. That indeed does look weird, but the "rules" that I have will drop the first "The", leaving my Title in this case as "Good, The Bad & The Ugly"... not ideal, but manageable.

In the meantime, I have come up with my TITLE "Fix THE", first draft... I've just copied the text from the script file. (not sure how to put it in a text box)

NB: I have created customised columns but don't know if that's necessary when using %inventedVariables%. The ones I "invented" are:
%brackets%, !stores text in brackets
%tmpfield% !used for checking values and string building. Not present in this code... and
%text_ee% !Everything Else that's text EXCEPT the first word.

!GUESS VALUE - Move EVERYthing after the first "(" to the variable %brackets%. DevNote: Test for "(" before moving in case of NO "(".
! For this code to work, replace [SPACE] with one actual space
[#0]
T=7
F=%title%
1=%title%(%brackets%

!Add the lost "(" back to the brackets text
[#1]
T=5
1=(%brackets%
F=BRACKETS

!Move everything after the first space to variable %text_ee%. DevNote: Not tested for ONLY one word title in case of accidental run.
[#2]
T=7
F=%title%
1=%title%[SPACE]%text_ee%

!Remove trailing space from %text_ee%
[#3]
T=5
1=$trim(%text_ee%)
F=TEXT_EE

! #1TEST if the remaining word in TITLE is "The", (AKA Accidental code run)
!if YES,...Code run good
! #2TEST more than 1 word in EE
! if YES,
! reconfigure TITLE >> Title Words With No THE (Some Text)
! if NO
! reconfigure TITLE >> Title, The (Some Text)
! #2TEST END
!if NO...aka accidental code run
! rebuild the %title%
! populate title with desired configuration
[#4]
T=5
1=$if($stricmp('The',$left(%title%,3)),$if($strchr(%text_ee%,'[SPACE]'),%text_ee%'[SPACE]'%brackets%,%text_ee%',[SPACE]THE[SPACE]'%brackets%),%title%[SPACE]%text_ee%'[SPACE]'%brackets%)
F=TITLE

!Tidy up
[#5]
T=9
F=TMPFIELD;BRACKETS;TEXT_EE

!EOF
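For anyone who wants to follow the action group step by step, here is a rough Python mirror of it. This is a sketch for checking the logic, not something Mp3tag can run. The function name `fix_title_stepwise` is hypothetical, the local names `brackets` and `text_ee` copy the invented fields above, and two details differ on purpose: it compares the whole first word with "The" instead of the first three characters, and it writes ", The" rather than ", THE".

```python
def fix_title_stepwise(title: str) -> str:
    # Steps #0 and #1: everything from the first "(" onwards goes to `brackets`.
    head, sep, tail = title.partition("(")
    brackets = sep + tail

    # Steps #2 and #3: split the remainder at its first space, then trim.
    first, _, text_ee = head.strip().partition(" ")
    text_ee = text_ee.strip()

    # Step #4: rebuild only if the first word is "The" and something follows it.
    if first.lower() != "the" or not text_ee:
        return title                              # accidental run, leave as is
    if " " in text_ee:
        rebuilt = f"{text_ee} {brackets}"         # several words: drop "The"
    else:
        rebuilt = f"{text_ee}, The {brackets}"    # one word: move "The" behind
    return rebuilt.strip()                        # no trailing space without brackets

print(fix_title_stepwise("The Panama (Intro Clean)"))
```

Working on local copies also means the original title is never destroyed, so no tidy-up step like #5 is needed in this sketch.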

===
I will run some more tests. This works, but can it be faster by running it in one line of code?

Thanks for your prompt reply and moving this to a new topic

You people rock!

You're right. Just ignore the "Artist - " part. My error, sorry

Real strings... (or close proximity of)
The Panama (Intro Clean)
The Owner Of A Lonely Heart (Remix) (Intro Clean)

Regex use (i think one or both of these)
(.*)\s
!Finds everything between ( and )... eg. XXX (YYY) finds (YYY)
OR
(.+?\()
OR
(.+\()

Resulted in
The Panama (
The Owner Of A Lonely Heart (Remix) (

Hope this helps

EDIT:
Found my code for the whole regex to apply to TITLE
(The )((?<=\s).+(?=\s(?=\())) (\(.+)

And what should it look like?
Panama, The (Intro Clean)
or
Panama (Intro Clean), The

The example
The Owner Of A Lonely Heart (Remix) (Intro Clean)
is called
Owner of a lonely heart
https://www.discogs.com/de/master/35851-Yes-Owner-Of-A-Lonely-Heart

anyway, so deleting the "The" would come closer to the original.

$regexp('The Panama (Intro Clean)',The (.*?) (.*),'$1, The $2')
leads to
Panama, The (Intro Clean)
$regexp('The Panama (Intro Clean)',The (.*),'$1, The')
leads to
Panama (Intro Clean), The

Thanks.
The Titles of songs were examples I made... Running my code of:
(The )((?<=\s).+(?=\s(?=\())) (\(.+)
Resulted in (using these examples):
Desired result : Panama, The (Intro Clean)
Undesired result: Owner Of A Lonely Heart (Remix), The (Intro Clean)
Or something pretty close...
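The greedy behaviour can be reproduced outside Mp3tag with Python's `re` module, which accepts the same pattern. This is only a sketch: the replacement `\2, The \3` is an assumption (the thread does not show the replacement string that was used), and the `BOUNDED` pattern is one possible alternative, not a tested Mp3tag solution.

```python
import re

# The pattern as posted: ".+" is greedy, so it runs up to the LAST " (".
GREEDY = re.compile(r"(The )((?<=\s).+(?=\s(?=\())) (\(.+)")

# Alternative sketch: "[^(]+?" cannot cross a "(", so it stops at the FIRST one.
BOUNDED = re.compile(r"(The )([^(]+?) (\(.+)")

title = "The Owner Of A Lonely Heart (Remix) (Intro Clean)"
print(GREEDY.sub(r"\2, The \3", title))
print(BOUNDED.sub(r"\2, The \3", title))
```

With the bounded pattern ", The" lands before the first bracket group instead of the last one. It still does not implement the "drop The when several words follow" rule; it only shows where the match boundary ends up.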

I think my new expressions using GUESS and FORMAT are the way for me to go. I understand those much better and constructed those in the space of a couple hours whereas I've been working on REGEX for 2 complete days!

What do you think?
Thanks again

For that result there is a simple expression:

You have not said yet what the "Owner ..." example should look like.

Hello again
Owner Of A Lonely Heart (Remix), The (Intro Clean)
should look like
Owner Of A Lonely Heart (Remix) (Intro Clean)
"The" has been dropped because the title has more than 2 words...
The Joker ==> Joker, The
The Joker (Live at Glenrock) ==> Joker, The (Live at Glenrock)
The Joker & The Thief ==> Joker & The Thief
The Joker & The Thief (Remix) ==> Joker & The Thief (Remix)
The Joker & The Thief (Radio Edit) (Clean Intro) ==> Joker & The Thief (Radio Edit) (Clean Intro)
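The Joker examples above can be covered with two regular expressions instead of a space count. The following Python sketch shows the idea; it is not Mp3tag syntax, and the names `ONE_WORD` and `fix_title` are made up for illustration. It assumes a case-sensitive leading "The" and that brackets only appear after the title words.

```python
import re

# "The", exactly one word, then either the end or a bracketed remainder.
ONE_WORD = re.compile(r"^The\s+([^\s(]+)(\s+\(.*)?$")

def fix_title(title: str) -> str:
    match = ONE_WORD.match(title)
    if match:
        # One word after "The": move "The" behind it, keep any brackets.
        return f"{match.group(1)}, The{match.group(2) or ''}"
    # Otherwise drop a leading "The " (no change if the title has none).
    return re.sub(r"^The\s+", "", title)

for example in ["The Joker", "The Joker (Live at Glenrock)", "The Joker & The Thief (Remix)"]:
    print(example, "==>", fix_title(example))
```

Since both patterns require whitespace after "The", titles such as "There She Goes" pass through unchanged.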

Your code:
$regexp('The Panama (Intro Clean)',The (.*?) (.*),'$1, The $2')
Would work perfectly if :
you have a 2 word title before the brackets, and
if I knew which input box to place that in

If the title has more than 2 words before the brackets, (eg. The Young Panama Girls), it changes to
Young, The Panama Girls (Intro Clean)

3 questions.

  1. Can I change 'The Panama (Intro Clean)' into %title% so it reads
    $regexp(%title%,The (.*?) (.*),'$1, The $2')
  2. Which input box do I put this into and how's it broken up if at all??
  3. Can I use regular expressions in format value input box, or the other way around, and if so, how?

Thanks again. I'm still working on my GUESS VALUES and FORMAT VALUES

EDIT: I worked out I use
$regexp(%title%,The (.*?) (.*),'$1, The $2') in the FORMAT VALUES input box, which answers Questions 2 AND 3.

Yes.
Any generating format string allows scripting functions

I understand that for sorting purposes it may be of advantage to leave out articles. E.g. The Beatles called themselves The Beatles and just Beatles. Getting all variations together would be OK.
For this purpose there is a field called ARTISTSORT (and there are further ~SORT fields).
For your problem with data in TITLE I would recommend ...
to leave TITLE as it is
to fill TITLESORT with the data stripped of the leading "The" - for the correct sorting, the missing "The" is not relevant anymore as I bet that the correct order is found after the first 10 characters.
This has the advantage that you do not lose any data.
With your rule of throwing away the "The" sometimes and sometimes not, it is never obvious whether there was no "The" in the first place or whether it is missing right now because of the (strange) rule - which leads to titles like "The owner of a lonely heart" which was never called like that.

To get rid of the leading "the" in TITLESORT
Create an action of the type "Format value" for TITLESORT
Format string: $regexp(%title%,^The\s+(.*),$1)
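For comparison, the same pattern behaves as follows in Python's `re` module. This is a sketch to show what the expression matches, not an Mp3tag action, and the function name `title_sort` is hypothetical.

```python
import re

def title_sort(title: str) -> str:
    """Return the title without a leading "The " (same pattern as above)."""
    return re.sub(r"^The\s+(.*)", r"\1", title)

print(title_sort("The Joker & The Thief (Remix)"))
print(title_sort("There She Goes"))
```

Only a "The" at the very start that is followed by whitespace is removed, so a later "The" and words like "There" are untouched.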

Here an expression that counts the spaces in a given string:
$add($len($regexp(%title%,'[^\s]',)),1)
You could now condense the part of TITLE to that which you want to check for spaces and then create an expression with an $if() condition to treat the cases.
This expression will become long and complicated.
So I would favour the alternative with TITLESORT

Hey, that's a great idea IF this is the program I use to access and play my files... But, I'm not sure that SERATO or other DJ software will allow me to have those columns

There are players that support the ~sort fields, e.g. iTunes, foobar.

What you could still do, if the ~sort field is not evaluated:
Put the data without "The" into the field TITLE and the original data in TITLESORT.
If you find a better program that can deal with the TITLESORT field, swap the fields, see the FAQs for that

Yep, I know! You should see the ones I have in my current Format Values script!

$if($stricmp('The',$left(%title%,3)),$ifgreater($strchr(%text_ee%,' '),1,$trim(%text_ee%' '%brackets%),$trim(%text_ee%', THE '%brackets%)),$trim(%title%' '%text_ee%%brackets%))

It checks that the title starts with "The", then checks whether it needs a ", The", and then rebuilds the title in my preferred naming convention, or otherwise returns the title to its initial state if I've accidentally run the script on a title that doesn't start with "the"....

And I still have more error-checking to do, such as "does the song title start with T H E but not the word "The"", like There, or Then.
Also, does my song start with brackets, e.g. (Just like) Starting Over? But I will have that sorted by the weekend.

That could save me many more hours. Sincere Thanks

So to check, this code identifies the characters in the specified string (%title%) that are NOT space, replaces them with NOTHING, thereby leaving only spaces, then asks the length of that string? Why do we need to add 1 to the number?
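That reading of the expression can be checked with a Python equivalent. This is only a sketch of the same steps, with a hypothetical function name `expression_value`. The thread does not say why 1 is added; one plausible reading (an assumption) is that spaces plus one gives the number of words when words are separated by single spaces.

```python
import re

def expression_value(title: str) -> int:
    """Mirror $add($len($regexp(%title%,'[^\\s]',)),1) step by step."""
    only_spaces = re.sub(r"[^\s]", "", title)   # delete every non-space character
    return len(only_spaces) + 1                 # number of spaces, plus one

title = "The Joker & The Thief"
print(title.count(" "), expression_value(title), len(title.split()))
```

For a pure space count, the same steps without the final addition would do.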

You could rule that out if you used "The " (with a space) instead of the short "The".
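The difference between the two checks is easy to see in a small Python sketch. The function names are hypothetical, and the comparison is done case-insensitively on the assumption that this is how $stricmp is being used here.

```python
def starts_with_the_letters(title: str) -> bool:
    """Three-character check: also true for "There", "Then", "Theme"."""
    return title[:3].lower() == "the"

def starts_with_the_word(title: str) -> bool:
    """Four-character check including the space: only the word "The"."""
    return title[:4].lower() == "the "

for title in ["The Joker", "There She Goes", "Then He Kissed Me"]:
    print(title, starts_with_the_letters(title), starts_with_the_word(title))
```

Note that the four-character check needs the space to still be present in the string, so it has to run before the title is cut down to its first word.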

I could. I did before, but the code I use is destructive. It removes the (brackets text) from the TITLE, then removes everything else after the first word, including the space.
Solutions I see are either: (in order of easy to hard)

  1. I'll need to look up a way of COPYING the TITLE to a TEMP FIELD and then be as destructive as I like with the info in TEMPFIELD. Copying will leave me the original title to check for "The ",
  2. Add a space to the destroyed TITLE field and then check if there's a "The ",
  3. Add an AND condition to check the length of the Title :face_with_spiral_eyes: