Create new field by trimming existing tags

phso · June 3, 2022, 7:30am

Hi everyone-

I have %unsyncedlyrics% populated for my songs. They always begin in either one of the following formats:

(A)

eng||ComposerA: Abc/Def
LyricistA: Ghi/Jkl
ArrangerA: Mno/Pqr
ProducerA: Stu/Vwx

(Lyrics blah blah blah)

Or, (B)

eng||ComposerB: Abc/Def
LyricistB: Ghi/Jkl
ArrangerB: Mno/Pqr
ProducerB: Stu/Vwx

(Lyrics blah blah blah)

I want to feed this data into the fields: %composer%, %lyricist%, %arranger%, and %producer%.

Think of ComposerA and ComposerB as "composer" written in different languages, e.g. "Composer" for an English song, but "Compositeur" for a French song. (Idk, I'm just making stuff up - I don't think it'd matter much as it'll just be a search term.)

Below is as far as I got:

$mid(%unsyncedlyrics%,$add($strchr(%unsyncedlyrics%,Composer: ),10),$sub($strchr(%unsyncedlyrics%,Lyricist: ),$add($strchr(%unsyncedlyrics%,Composer: ),10)))

The "10" represents the length of "Composer: ", so it will return the text after "...ser: " and not "Com...".

This does not take into account the A OR B option. However, the issue I have now is that the line breaks seem to be causing errors. When running this code, in the %composer% field, I see:

Abc/Def...

Or

Abc/Def
(blank line)

That is, the composer field is also capturing the line break, which I do not want. I tried trimming it by one character, but that doesn't work.

Could anyone please help? Much appreciated.
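
For readers following along, the offset arithmetic described above can be sketched in Python. This is only an illustration of the idea, not Mp3tag scripting: `between` is a hypothetical helper, and the sample text assumes `\r\n` line endings. It shows why the slice between the two labels still carries the line break, and that stripping trailing whitespace removes it.

```python
def between(text: str, start_label: str, end_label: str) -> str:
    """Return the raw text between two labels (hypothetical helper)."""
    start = text.find(start_label)
    if start == -1:
        return ""                      # start label not present
    start += len(start_label)          # skip past "Composer: " (10 characters)
    end = text.find(end_label, start)  # position where "Lyricist: " begins
    if end == -1:
        end = len(text)                # no end label: take the rest
    return text[start:end]


# Assumed sample; real files may use different line endings.
sample = "eng||Composer: Abc/Def\r\nLyricist: Ghi/Jkl\r\n"

raw = between(sample, "Composer: ", "Lyricist: ")
composer = raw.rstrip()  # drop the trailing line break, however long it is
```

Trimming "by one character" would not be enough if the line break is two characters long (`\r\n`), which is one reason to strip all trailing whitespace instead.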

ohrenkino · June 3, 2022, 8:12am

Try an action of the type "Guess value" (Import tag fields)
Source: $regexp(%unsyncedlyrics%,'.*ComposerA: (.*?)\s*LyricistA: (.*?)\r\n\s*ArrangerA: (.*?)\r\n\s*ProducerA: (.*?)\r\n\s*.*',$1==$2==$3==$4)
Target format string: %composer%==%lyricist%==%arranger%==%producer%

You would have to replace the ComposerA etc. text with the real text.
It will not work for different spellings in different languages. You would have to create an individual action for each string variation.
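
As a rough model of what the suggested action does, here is a Python sketch. Several things are assumptions: Python's `re` module stands in for Mp3tag's regex engine (the two may differ), `re.DOTALL` is added so that `.` can span lines, the sample uses `\r\n` line endings, and `guess_values` is a hypothetical name. The expression collapses the whole field to `a==b==c==d`, and the target format string then maps each part to a field.

```python
import re

# Same expression as above, with DOTALL so the leading and trailing ".*"
# can swallow everything outside the four labelled lines.
PATTERN = re.compile(
    r".*ComposerA: (.*?)\s*LyricistA: (.*?)\r\n\s*"
    r"ArrangerA: (.*?)\r\n\s*ProducerA: (.*?)\r\n\s*.*",
    re.DOTALL,
)


def guess_values(lyrics, target="%composer%==%lyricist%==%arranger%==%producer%"):
    """Return the joined source string and a field -> value mapping."""
    joined = PATTERN.sub(r"\1==\2==\3==\4", lyrics)
    names = [part.strip("%") for part in target.split("==")]
    return joined, dict(zip(names, joined.split("==")))


sample = (
    "eng||ComposerA: Abc/Def\r\n"
    "LyricistA: Ghi/Jkl\r\n"
    "ArrangerA: Mno/Pqr\r\n"
    "ProducerA: Stu/Vwx\r\n"
    "\r\n"
    "(Lyrics blah blah blah)"
)

joined, fields = guess_values(sample)
```

If the pattern does not match, `sub` returns the input unchanged, so the "joined" string is simply the full lyrics text.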

phso · June 3, 2022, 8:28am

For now I will aim for just 1 language - and I got some progress extracting Composer, Lyricist, and Arranger.

$mid(%unsyncedlyrics%,$add($strchr(%unsyncedlyrics%,Composer: ),10),$sub($strchr(%unsyncedlyrics%,Lyricist: ),$add($strchr(%unsyncedlyrics%,Composer: ),10)))

As mentioned earlier, for composer, I'm finding everything to the right of "Composer: ", and before the left of "Lyricist: ".

The outstanding question now is for Producer. I can find everything to the right of "Producer: ", but I need to tell it where to stop (at the line break). Any idea on how to search for that line break?
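
The "where to stop" question can be illustrated in Python (again, not Mp3tag scripting; `value_after` is a hypothetical helper). The sketch looks for the first line-break character after the label, whichever kind it is, and falls back to the end of the text if there is none.

```python
def value_after(text: str, label: str) -> str:
    """Return the text from just after `label` up to the end of that line."""
    start = text.find(label)
    if start == -1:
        return ""  # label not present
    start += len(label)
    # Positions of the next "\r" and "\n"; -1 means "not found".
    candidates = [text.find("\r", start), text.find("\n", start)]
    ends = [pos for pos in candidates if pos != -1]
    end = min(ends) if ends else len(text)
    return text[start:end]


unix_style = "Producer: Stu/Vwx\n\n(Lyrics blah blah blah)"
windows_style = "Producer: Stu/Vwx\r\n\r\n(Lyrics blah blah blah)"
```

Calling `value_after(unix_style, "Producer: ")` and `value_after(windows_style, "Producer: ")` both yield the producer without any line-break characters.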

ohrenkino · June 3, 2022, 8:31am

Have you tried my example?
You can test the regular expression in Convert > Tag - Tag.
If that gives you
Abc/Def==Ghi/Jkl==Mno/Pqr==Stu/Vwx
then you should be able to extract all the fields with the suggested action.

ohrenkino · June 3, 2022, 8:35am

This does not assign the strings to any field.
The "Guess value" action that I suggest does just that.

phso · June 3, 2022, 8:49am

I tried but that didn't populate the fields. I am simplifying my ask to this: Instead of composer lyricist arranger producer, in the %unsyncedlyrics% the field will be as follows:

A:
B:
C:
D:

So I added the action to import tag fields, and I used the same target format string you provided.

$regexp(%unsyncedlyrics%,'.*A: (.*?)\s*B: (.*?)\r\n\s*C: (.*?)\r\n\s*D: (.*?)\r\n\s*.*',$1==$2==$3==$4)

I also tried the following in case it's a typo, but that didn't work either.

$regexp(%unsyncedlyrics%,'.*A: (.*?)\r\n\s*B: (.*?)\r\n\s*C: (.*?)\r\n\s*D: (.*?)\r\n\s*.*',$1==$2==$3==$4)

ohrenkino · June 3, 2022, 10:03am

Did you try the regular expression in Convert > Tag - Tag?

Alternatively: could you supply the contents of a real field instead of these made-up examples?
Because, naturally, I succeed with your dummy data in a lyrics field:

(This was for testing the regular expression - so it does not matter that I selected the field COMMENT as target)

phso · June 3, 2022, 3:49pm

Thanks so much for your help. Please see below

eng||Composer: Johnny YimC
Lyricist: -
Arranger: Johnny YimA
Producer: Gary Chan/Johnny Yim

La la la

See below for results:

This is a music-only track (no lyrics in the song), which is why there's a dash for the lyricist. I'd like to feed the - into the lyricist field as well.

ohrenkino · June 3, 2022, 4:20pm

Oh, you are on a Mac!
Please try enclosing the last argument, the result string, in apostrophes as well.
(I cannot test this as I haven't got a Mac).
But that is why I asked you to test the regular expression first.
Perhaps someone else can chime in if the expected result cannot be achieved by you on your own.

phso · June 3, 2022, 4:51pm

Thanks for your prompt response... I tried adding apostrophes/quotes to the result string and that didn't work. I also tried both single and double quotes, and that didn't work either.

ohrenkino · June 3, 2022, 5:09pm

I would suggest that you build the regular expression from scratch and watch the preview to see how the result changes, e.g. start with
$regexp(%unsyncedlyrics%,'.*ComposerA: (.*?)\s*\r\n.*','$1')
and see if that returns the string following "Composer: "
As soon as you got that working, you can add the other parts.

phso · June 3, 2022, 6:22pm

This is what I got:

$regexp(%unsyncedlyrics%,'.*Composer: (.*?)\n*Lyricist: (.*?)\n*Arranger: (.*?)\n*Producer: (.*?)\n*',$1==$2==$3==$4)

Composer/Lyricist/Arranger work perfectly, and now I'm back to the initial trouble with the producer line - It doesn't know where to stop because it's not reading the line break character...

Thanks again ohrenkino for your guidance thus far.

ohrenkino · June 3, 2022, 6:23pm

the expression should be
$regexp(%unsyncedlyrics%,'.*Composer: (.*?)\n*Lyricist: (.*?)\n*Arranger: (.*?)\n*Producer: (.*?)\n.*',$1==$2==$3==$4)

phso · June 3, 2022, 6:43pm

Negative...

This is the %unsyncedlyrics%:

eng||Composer: Mark Lui
Lyricist: Aarif Rahman
Arranger: Mark Lui
Producer: Mark Lui

When I hear...

Using this code-
$regexp(%unsyncedlyrics%,'.*Composer: (.*?)\n*Lyricist: (.*?)\n*Arranger: (.*?)\n*Producer: (.*?)\n*',$1==$2==$3==$4)

I get this-

So it's able to identify the first 3 fields, and it just doesn't know where to stop.

Using the other code-
$regexp(%unsyncedlyrics%,'.*Composer: (.*?)\n*Lyricist: (.*?)\n*Arranger: (.*?)\n*Producer: (.*?)\n.*',$1==$2==$3==$4)
where the difference is the period in the end \n.* (vs \n*)...

I get this-

Here, it seems like it is unable to extract any data, so it just spits out the full %unsyncedlyrics%.
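
One possible reading of the first result, sketched with Python's `re` module. This is an assumption-laden illustration, not a confirmed diagnosis: Mp3tag's engine may behave differently, and the sample assumes plain `\n` line endings. In Python, a lazy group `(.*?)` followed only by the optional `\n*` is satisfied by matching nothing, so the fourth group is empty and everything after "Producer: " is left over and simply trails the replacement.

```python
import re

# Sample from the post above, with assumed "\n" line endings.
lyrics = (
    "eng||Composer: Mark Lui\n"
    "Lyricist: Aarif Rahman\n"
    "Arranger: Mark Lui\n"
    "Producer: Mark Lui\n"
    "\n"
    "When I hear..."
)

pattern = (
    r".*Composer: (.*?)\n*Lyricist: (.*?)\n*Arranger: (.*?)\n*"
    r"Producer: (.*?)\n*"
)

match = re.search(pattern, lyrics)
producer = match.group(4)        # lazy group with nothing required after it
leftover = lyrics[match.end():]  # text the pattern never consumed
result = re.sub(pattern, r"\1==\2==\3==\4", lyrics)
```

Under these assumptions the producer's name only appears in the output because it was never consumed by the match, which would explain why the fourth value seems to run on into the lyrics.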

ohrenkino · June 3, 2022, 6:50pm

In the Windows world the lines are terminated by \r\n.
So perhaps you add the \r to each field ...
The .* means "any number of characters" so that should describe the (unwanted) rest.

In the original suggestion I had \s* to match any number of white space characters.
So, sorry, I can't help you from a distance. You would need to experiment a little more with the dots and asterisks.
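
As a starting point for that experimentation, here is a Python sketch of a pattern that does not depend on which line ending the file uses. It is an assumption that the same idea carries over to Mp3tag's regex engine on a Mac; `extract_credits` and `LABELS` are hypothetical names. Instead of describing the line break itself, each value is captured as "everything up to the first `\r` or `\n`" with the character class `[^\r\n]*`.

```python
import re

LABELS = ("Composer", "Lyricist", "Arranger", "Producer")


def extract_credits(lyrics: str) -> dict:
    """Map each label (lower-cased) to the rest of its line."""
    found = {}
    for label in LABELS:
        # [^\r\n]* stops at the first line-break character of any kind.
        m = re.search(re.escape(label) + r": ([^\r\n]*)", lyrics)
        if m:
            found[label.lower()] = m.group(1).strip()
    return found


lines = [
    "eng||Composer: Johnny YimC",
    "Lyricist: -",
    "Arranger: Johnny YimA",
    "Producer: Gary Chan/Johnny Yim",
    "",
    "La la la",
]

# The same text joined with three different line endings.
results = [extract_credits(sep.join(lines)) for sep in ("\n", "\r\n", "\r")]
```

All three variants give the same mapping, including the lone dash for the lyricist.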