Split COMMENT to multiple fields


#1

Hello!

All my music has a COMMENT tag, which is formatted like this:

YEAROFRELEASE LABEL - CATALOGNO - DG DISCOGS_ID
(example: 1991 Columbia - 468214 2 - DG 1799201)

But now I want to split that information into multiple tags. I was wondering if someone could help me do that, because I'm not that great at using regexp...

Thanks in advance!


#2

Hi, welcome to the forum,

The "Guess values" action is the right tool for this task. If you also want to empty the COMMENT field afterwards, add a simple "Replace with regular expression" action.

Make sure action #2 is below the guess values one.

Do not use action #2 if you want to retain the information within the comment field.

Action #1:
Action type: Guess values
Source format: %comment%
Guessing pattern: %year% %publisher% - %catalog #% - %discogs_release_id%

Action #2:
Action type: Replace with regular expression
Field: COMMENT
Regular expression: .+
Replace matches with:

[ ] case-sensitive comparison
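
For anyone who wants to check the split outside Mp3tag first, here is a rough Python sketch of what the guessing pattern above does with the example comment. It is not how Mp3tag works internally: the regular expression is a hand-written stand-in for the guessing pattern, and it assumes each placeholder takes the shortest text up to the next literal separator. The names PATTERN and guess_values are made up for this example.

```python
import re

# Hand-written stand-in (an assumption, not Mp3tag's own logic) for
#   %year% %publisher% - %catalog #% - %discogs_release_id%
PATTERN = re.compile(
    r"^(?P<year>\S+) (?P<publisher>.+?) - (?P<catalog>.+?) - (?P<discogs_release_id>.+)$"
)


def guess_values(comment: str) -> dict:
    """Return the captured fields, or an empty dict if the comment does not fit."""
    match = PATTERN.match(comment)
    return match.groupdict() if match else {}


fields = guess_values("1991 Columbia - 468214 2 - DG 1799201")
for name, value in fields.items():
    print(f"{name}: {value}")
```

In this sketch the last field keeps the "DG " prefix, because the pattern contains no literal "DG " before the final group.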

#3

Thanks, that worked just fine! Just one question: is it possible to use regexp syntax in Guess values? (e.g. "20[0-9][0-9]")


#4

I think the answer might be "no".
But if you post a real-world example, we can see what the problem is and may be able to help solve it.

DD.20111103.1900.CET


#5

You can use $regexp in the Sourceformat line, but not in the Guessing Pattern.

e.g. you can use:
Sourceformat: $regexp(%comment%,'20[0-9][0-9]',20xx)
Then use the literal 20xx in the Guessing Pattern to match any year from 2000 to 2099. The year itself cannot be written to a field this way, though.

Sourceformat: $regexp(%comment%,'(20[0-9][0-9])',<YEAR:$1>)
Then refer to it with <YEAR:%year%> in the Guessing Pattern. This way the year can be written to a field.

Or you skip $regexp and just use "20%dummy% " in the Guessing Pattern, if you don't want to write the year anywhere. (Without the quotation marks; they are only there to show the trailing space.)

As DelevD said, it depends on what you are going to do.
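
To illustrate the second variant (wrapping the year in a marker before guessing), here is a small Python sketch. Python's re.sub stands in for Mp3tag's $regexp, so the replacement is written \1 instead of $1, and the sample comment is a made-up value rather than one from this thread.

```python
import re

# Made-up sample comment for illustration only.
comment = "2005 Columbia - 468214 2 - DG 1799201"

# Equivalent in spirit to $regexp(%comment%,'(20[0-9][0-9])',<YEAR:$1>)
marked = re.sub(r"(20[0-9][0-9])", r"<YEAR:\1>", comment)
print(marked)  # <YEAR:2005> Columbia - 468214 2 - DG 1799201

# The guessing pattern <YEAR:%year%> then only has to pick up the marked part.
year = re.search(r"<YEAR:(.+?)>", marked).group(1)
print(year)  # 2005
```

Note that the substitution marks every run of four characters matching 20[0-9][0-9], so a comment that contains such digits elsewhere (for example inside a catalog number) would get more than one marker.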


#6

A real world example would be: "24/96 Vinyl Rip - aksman 2011 - 1999 Speakers Corner SP-3647 - DG 2513725" in COMMENT.

From this I wanted to extract "24/96 Vinyl Rip" to FORMAT, aksman to RIPPER, 2011 to RIPYEAR, 1999 to RELEASEYEAR, Speakers Corner to PUBLISHER, SP-3647 to CATALOG # and 2513725 to DISCOGS_RELEASE_ID.

I can manually introduce a " - " to separate the label from the catalog # if that simplifies the process.


#7

Yes, this may help, but first check out the following proposal; it may already handle your real-world example.

We use an action "Guess values".
To let Mp3tag guess values for these seven tag-fields ...

FORMAT, RIPPER, RIPYEAR, RELEASEYEAR, PUBLISHER, CATALOG #, DISCOGS_RELEASE_ID

... from this input string ...

24/96 Vinyl Rip - aksman 2011 - 1999 Speakers Corner SP-3647 - DG 2513725

... we need to prepare the input string so that it fits a splitting rule from which the values can be guessed.

There are several ways to modify the input string so that it fits a splitting rule; here is one ...

$regexp(%COMMENT%,'^(.+?) - (.+?) (\d{4}) - (\d{4}) (.+) (.+?\d+) - (.+?)$','$1_$2_$3_$4_$5_$6_$7')

This will change the input string to ...

24/96 Vinyl Rip_aksman_2011_1999_Speakers Corner_SP-3647_DG 2513725

The corresponding format string looks like ...

%FORMAT%_%RIPPER%_%RIPYEAR%_%RELEASEYEAR%_%PUBLISHER%_%CATALOG #%_%DISCOGS_RELEASE_ID%

The result will be ... seven tag-fields filled this way ...

Tag-Field: CATALOG #
Value: SP-3647
Tag-Field: DISCOGS_RELEASE_ID
Value: DG 2513725
Tag-Field: FORMAT
Value: 24/96 Vinyl Rip
Tag-Field: PUBLISHER
Value: Speakers Corner
Tag-Field: RELEASEYEAR
Value: 1999
Tag-Field: RIPPER
Value: aksman
Tag-Field: RIPYEAR
Value: 2011
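
The preprocessing step and the resulting field values can be double-checked with a short Python sketch. This only mirrors the regular expression from this post; Python's re module stands in for Mp3tag's $regexp (so the replacement uses \1 instead of $1), and splitting on "_" is a simple substitute for the Guess values action.

```python
import re

COMMENT = "24/96 Vinyl Rip - aksman 2011 - 1999 Speakers Corner SP-3647 - DG 2513725"

# Same expression as in the $regexp call above.
PATTERN = r"^(.+?) - (.+?) (\d{4}) - (\d{4}) (.+) (.+?\d+) - (.+?)$"

FIELDS = [
    "FORMAT", "RIPPER", "RIPYEAR", "RELEASEYEAR",
    "PUBLISHER", "CATALOG #", "DISCOGS_RELEASE_ID",
]

# Step 1: rewrite the comment into an underscore-separated string.
joined = re.sub(PATTERN, r"\1_\2_\3_\4_\5_\6_\7", COMMENT)
print(joined)

# Step 2: split on "_" and pair each part with its field name.
tags = dict(zip(FIELDS, joined.split("_")))
for name in sorted(tags):
    print(f"{name}: {tags[name]}")
```

The sketch relies on none of the values containing an underscore themselves; a comment with "_" in the label or catalog number would need a different separator.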

Put it all together ...
Action: Guess values
Source format:

$regexp(%COMMENT%,'^(.+?) - (.+?) (\d{4}) - (\d{4}) (.+) (.+?\d+) - (.+?)$','$1_$2_$3_$4_$5_$6_$7')

Format string:

%FORMAT%_%RIPPER%_%RIPYEAR%_%RELEASEYEAR%_%PUBLISHER%_%CATALOG #%_%DISCOGS_RELEASE_ID%

You do not have to do the splitting with this preprocessing method and its complicated regular expression.
As a first step, you can simply split the given input string into its four obvious components, filling regular and temporary tag-fields.
In the following steps, you can split the temporary tag-fields to get the seven tag-fields as wanted.

Example for an Action Group:

Begin Action Group Test_2011#20111105.GuessValues

Action #1
Actiontype 7: Import tag fields (guess values)
Source format __: %COMMENT%
Guessing pattern: %FORMAT% - %TMP_RIP% - %TMP_PUB% - %DISCOGS_RELEASE_ID%

Action #2
Actiontype 4: Replace with regular expression
Field ______________: TMP_RIP
Regular expression _: ^(.+)\s(\d{4})$
Replace matches with: $1 - $2

[_] Case sensitive comparison

Action #3
Actiontype 4: Replace with regular expression
Field ______________: TMP_PUB
Regular expression _: ^(\d{4})\s(.+)\s(.+)$
Replace matches with: $1 - $2 - $3

[_] Case sensitive comparison

Action #4
Actiontype 7: Import tag fields (guess values)
Source format __: %TMP_RIP%
Guessing pattern: %RIPPER% - %RIPYEAR%

Action #5
Actiontype 7: Import tag fields (guess values)
Source format __: %TMP_PUB%
Guessing pattern: %RELEASEYEAR% - %PUBLISHER% - %CATALOG #%

Action #6
Actiontype 9: Remove fields
Fields to remove (semicolon separated): TMP_RIP;TMP_PUB

End Action Group Test_2011#20111105.GuessValues (6 Actions)
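
The logic of this action group can be followed step by step in a Python sketch. It is only an approximation for checking the idea: str.split stands in for the Guess values actions, re.sub for the two regular expression actions, and split_comment is a name made up for this example.

```python
import re


def split_comment(comment: str) -> dict:
    # Action #1: split the comment into its four obvious components.
    fmt, tmp_rip, tmp_pub, discogs = comment.split(" - ", 3)

    # Action #2: insert " - " between ripper name and rip year.
    tmp_rip = re.sub(r"^(.+)\s(\d{4})$", r"\1 - \2", tmp_rip)

    # Action #3: insert " - " after the release year and before the last word.
    tmp_pub = re.sub(r"^(\d{4})\s(.+)\s(.+)$", r"\1 - \2 - \3", tmp_pub)

    # Actions #4 and #5: split the temporary values again.
    ripper, ripyear = tmp_rip.split(" - ")
    releaseyear, publisher, catalog = tmp_pub.split(" - ")

    # Action #6: the temporary values are simply not returned.
    return {
        "FORMAT": fmt,
        "RIPPER": ripper,
        "RIPYEAR": ripyear,
        "RELEASEYEAR": releaseyear,
        "PUBLISHER": publisher,
        "CATALOG #": catalog,
        "DISCOGS_RELEASE_ID": discogs,
    }


result = split_comment(
    "24/96 Vinyl Rip - aksman 2011 - 1999 Speakers Corner SP-3647 - DG 2513725"
)
for name, value in result.items():
    print(f"{name}: {value}")
```

As in the action group, the catalog number is taken to be the last space-separated word of the third component, so a catalog number that itself contains a space would end up partly in PUBLISHER.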

DD.20111105.0840.CET