Removing duplicate fields

Hi!
I'm using multiple genre fields on my mp3 files, and since a lot of players don't handle that well, I also keep a generic genre, stored in "grouping", that I want to add as the first genre.
(Players that don't deal with multiple genre fields usually just use the first one available, which is perfect.)

Sooo...
Basically, if the first genre is already the same as grouping, there's nothing to do.
But if it's different, I want my genre field to be grouping\\genre1\\genre2\\...

To do so, I used format value on genre:

$if($neql($meta(genre,0),%contentgroup%),%contentgroup%\\\\$meta_sep(genre,\\\\),%genre%)

(I had to use %contentgroup%\\$meta_sep(genre,\\) instead of %contentgroup%\%genre%, because only the first genre field was copied if using %genre% without meta_sep... Is it the way it's supposed to work?)
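For readers who want to check the intended logic outside Mp3tag, here is a minimal Python sketch of the rule described above: put grouping first unless it is already the first genre. The function name and sample values are made up for illustration, and the comparison is a plain exact match, which may not behave exactly like $neql.

```python
def with_grouping_first(genres, grouping):
    """Return the genre list with `grouping` prepended, unless it is already first."""
    genres = list(genres)
    if not grouping:
        return genres                      # nothing to prepend
    if genres and genres[0] == grouping:
        return genres                      # first genre already matches grouping
    return [grouping] + genres

# Two literal backslashes, the separator used in the post (grouping\\genre1\\genre2)
SEP = "\\\\"

print(SEP.join(with_grouping_first(["Alternative", "Indie"], "Rock")))
print(SEP.join(with_grouping_first(["Rock", "Indie"], "Rock")))
```

The first call prepends Rock; the second leaves the list alone because Rock is already the first genre.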

The problem is, I don't really know why, but I got a lot of duplicate fields, like Rock\\Alternative\\Alternative\\Indie\\Indie.

I was expecting it for the first genre, and it's not really a huge problem, but...
Is there a way to remove those real duplicates?
"Merge duplicates" would keep the duplicated values, and "delete duplicates" would lose all the first genres, including the one I just added...

Thanks a lot!

PS: I just found out why I have duplicate fields:
I was splitting the genre field on the characters "; " (with a space) instead of just ";".
So basically the field was split on ";" AND on " "...
I'm still interested if you have an idea on how to remove those duplicated fields though! :)
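To illustrate what the PS describes, here is a small Python sketch that contrasts splitting on any of the separator characters with splitting on ";" alone. The sample value is hypothetical, and this only mimics the behaviour reported in the post; it is not a statement about how Mp3tag implements the split.

```python
import re

raw = "Alternative Rock; Indie Rock"   # hypothetical tag value

# Separator treated as a set of characters: split on ';' OR ' ', dropping empty pieces
by_any_char = [part for part in re.split(r"[; ]", raw) if part]

# Separator treated as the single character ';', with surrounding spaces trimmed
by_semicolon = [part.strip() for part in raw.split(";")]

print(by_any_char)    # multi-word genres fall apart, so "Rock" shows up twice
print(by_semicolon)   # each genre stays in one piece
```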

I've found a regex on regular-expressions.info that can do that:

Action type: Format value
Field: GENRE
Format string: $regexp($meta_sep(genre,\\\\),'(?:^|(?<=\\\\))([^\\\\]*)(\\\\\1)+(?=\\\\|$)',$1)

The original regex is (?<=,|^)([^,]*)(,\1)+(?=,|$)
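As a sanity check outside Mp3tag, the comma-delimited form of this pattern can be tried in Python. Python's re module also requires fixed-width lookbehinds, so the sketch moves ^ out of the lookbehind in the same way as the format string above. Mp3tag's regex engine may differ in other details, so treat this only as an illustration; the helper names are made up.

```python
import re

# Comma-delimited pattern with '^' moved out of the lookbehind (fixed width)
FIXED = re.compile(r"(?:^|(?<=,))([^,]*)(,\1)+(?=,|$)")

def collapse_adjacent(value):
    """Collapse runs of identical, adjacent comma-separated items into one."""
    return FIXED.sub(r"\1", value)

def compiles(pattern):
    """Report whether Python's re accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(collapse_adjacent("Rock,Alternative,Alternative,Indie,Indie"))
print(compiles(r"(?<=,|^)([^,]*)(,\1)+(?=,|$)"))   # the original, variable-width form
```

The lookahead at the end keeps "Indie,Indie Rock" intact, because the repeated text must be followed by a separator or the end of the value.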

dano's regex didn't work for me. The following did:
([^|]+)(\|[ ]*\1)+ --> \1
I found the original regex here and modified it:

It only works for consecutive genres (pipe delimited), so:
blues|jazz|blues|blues --> blues|jazz|blues

At least it's something.
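If the values can be processed outside Mp3tag, non-consecutive duplicates are easier to remove by splitting the value than with a regex. The Python sketch below is only an illustration under that assumption: the function name is made up, the comparison is case-sensitive, and the regex shown for contrast is the pattern above with the pipe escaped.

```python
import re

def dedupe_keep_order(value, sep="|"):
    """Drop every repeated item, keeping the first occurrence and the original order."""
    parts = [part.strip() for part in value.split(sep)]
    # dict.fromkeys keeps insertion order, so the first occurrence of each item wins
    return sep.join(dict.fromkeys(part for part in parts if part))

sample = "blues|jazz|blues|blues"

# Regex approach: only adjacent repeats are collapsed
print(re.sub(r"([^|]+)(\|[ ]*\1)+", r"\1", sample))

# Split-based approach: the later "blues" entries are dropped as well
print(dedupe_keep_order(sample))
```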

Further experimentation shows the following regex works better:

(?:^|\|)([^\|]+)(\|[ ]*\1)+ --> \2
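One reason anchoring helps is that an unanchored pattern can match in the middle of a genre name. The Python sketch below shows that effect and a variant that is anchored at both ends. The variant is not the pattern from this post: it captures the leading separator so it can be put back, and adds a lookahead so the repeated text must end at a separator. It was only checked against Python's re module, so Mp3tag may behave differently.

```python
import re

UNANCHORED = r"([^|]+)(\|[ ]*\1)+"
# Hypothetical variant: group 1 is the start of the value or a pipe, group 2 the genre
BOTH_ENDS = r"(^|\|)([^|]+)(?:\|[ ]*\2)+(?=\||$)"

def collapse(value):
    """Collapse adjacent repeats of whole pipe-delimited items only."""
    return re.sub(BOTH_ENDS, r"\1\2", value)

# Unanchored: the "rock" inside "hard rock" pairs up with the following "rock"
print(re.sub(UNANCHORED, r"\1", "hard rock|rock"))

# Anchored at both ends: partial names are left alone, whole repeats are collapsed
print(collapse("hard rock|rock"))
print(collapse("rock|rock and roll"))
print(collapse("blues|blues|jazz"))
```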

I've fixed dano's regexp from above (made the lookbehind fixed width) and changed the formatting to enclose the regexp in backticks ` so that important characters are not removed by the forum software.
