Regular Expressions

Zoofield · June 10, 2011, 7:17pm

I could not get "DetlevD - Splitting an Upper Camel Case string" to work, although I like the idea!

These together:

RE:Adds a space between a capital letter and the lowercase letter or digit before it.
Field:_Tag
re:([^A-Z\W\_])([A-Z])(?=[^A-Z])
($1) ($2)

RE:Adds a space between a digit and the lowercase letter before it.
Field:_Tag
re:([^\W\d\_])(\d)
($1) ($2)

Will...

Example.
From:
"ThisIsThe2ndSongFromD.D.'sFirstAlbum30YearsAgo."
To:
"This Is The 2nd Song From D.D.'s First Album 30 Years Ago."

This is the two above combined:

RE:Adds a space between a capital letter and the lowercase letter or digit before it, and between a digit and the lowercase letter before it.
Field:_Tag
re:([^A-Z\W\_])([A-Z])(?=[^A-Z])|([^\W\d\_])(\d)
$1$3 $2$4

Does the same except in the case of...

CapitalWord9CapitalWord2ndSong30Years

Example.
First pass will: Capital Word9 Capital Word 2nd Song 30 Years
Second pass will: Capital Word 9 Capital Word 2nd Song 30 Years
Third pass will: Reveal A Latent O.C. Disorder

This happens because the single digit '9' in the example can only be captured once per pass.

I would use the first set for completeness, through "Action Groups".
I would use the second for brevity, through "Actions (Quick)".
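
For anyone who wants to check the behaviour outside Mp3tag, here is a minimal Python sketch of both approaches. It assumes Python's re module treats these patterns the same way as Mp3tag's engine, which may not hold in every detail, and the function names are made up for illustration. The string after a single combined pass can look different from the one quoted above, but repeating the pass until nothing changes ends at the same result.

```python
import re

# Patterns from the post, ported to Python's re syntax (assumption:
# Mp3tag's engine may differ in details).
CAPITAL = re.compile(r"([^A-Z\W_])([A-Z])(?=[^A-Z])")
DIGIT = re.compile(r"([^\W\d_])(\d)")
COMBINED = re.compile(r"([^A-Z\W_])([A-Z])(?=[^A-Z])|([^\W\d_])(\d)")


def split_two_step(text):
    """Apply the two separate replacements, one after the other."""
    text = CAPITAL.sub(r"\1 \2", text)
    return DIGIT.sub(r"\1 \2", text)


def split_combined(text, max_passes=5):
    """Repeat the combined replacement until the text stops changing."""
    for _ in range(max_passes):
        # Groups that did not take part in a match expand to "".
        new_text = COMBINED.sub(r"\1\3 \2\4", text)
        if new_text == text:
            break
        text = new_text
    return text


print(split_two_step("ThisIsThe2ndSongFromD.D.'sFirstAlbum30YearsAgo."))
print(split_combined("CapitalWord9CapitalWord2ndSong30Years"))
```

The loop in split_combined plays the role of running the quick action more than once.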

tobi06 · June 18, 2011, 8:01am

To remove any website in the filename

Example: artist - title[www.whateverwebsite.com].mp3

becomes: artist - title.mp3

Regular expression: \[.{3}\.\w*\..{3}\]
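
A small Python sketch of the same idea, for testing away from Mp3tag. It assumes the pattern is meant to have its square brackets and dots escaped (otherwise the whole thing reads as one character class), and strip_site is just an illustrative name.

```python
import re

# Assumed intent of the posted pattern: a literal "[", three characters,
# a dot, a run of word characters, a dot, three characters, a literal "]".
SITE_TAG = re.compile(r"\[.{3}\.\w*\..{3}\]")


def strip_site(filename):
    """Remove a bracketed website such as [www.example.com] from a name."""
    return SITE_TAG.sub("", filename)


print(strip_site("artist - title[www.whateverwebsite.com].mp3"))
```

Since \w does not match a hyphen, a site name like my-site would be left in place by this pattern.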

gingernob · June 19, 2011, 3:06pm

OK, I know I'm being dumb, but this regex stuff is over my head. You might just as well speak Japanese to me.
All the examples I find talk of changing "## - [track title]" to "[track title]".
I just want "## [track title]" to "[track title]", with no dash.
I'd appreciate a 'simple' answer for a simpleton.

dano · June 19, 2011, 5:14pm

I've added an example without dash to that post.

gingernob · June 19, 2011, 10:39pm

Thank you so much...

DetlevD · August 1, 2011, 2:03am

Convert from RFC822/1123 Date string to ISO-8601 Date string.

Examples:
RFC822/RFC 1123 ==> ISO-8601
1 Feb 2009 ==> 2009-02-01
30 Sep 2010 ==> 2010-09-30

$replace($regexp($right('0'%YEAR%,11), '(\d{1,2}) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) (\d{2,4})','$3-$2-$1'), 'Jan','01','Feb','02','Mar','03','Apr','04','May','05','Jun','06', 'Jul','07','Aug','08','Sep','09','Oct','10','Nov','11','Dec','12')

See also:
http://www.w3.org/Protocols/rfc822/
http://www.freesoft.org/CIE/RFC/1123/99.htm

DD.20110801.0623.CEST

Example:
1 Feb 2009 HH:MM ==> 2009-02-01 HH:MM

$replace($regexp($right('0'$cutRight(%YEAR%,6),11), '(\d{1,2}) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) (\d{2,4})','$3-$2-$1'), 'Jan','01','Feb','02','Mar','03','Apr','04','May','05','Jun','06', 'Jul','07','Aug','08','Sep','09','Oct','10','Nov','11','Dec','12')$right(%YEAR%,6)

DD.20140310.2236.CET
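
For comparison, the same conversion as a minimal Python sketch. It follows the idea of the format strings above (reorder the parts with a regular expression, map the month name to a number, pad the day to two digits); the names MONTHS, DATE and rfc822_to_iso are invented for this example and are not part of Mp3tag.

```python
import re

MONTHS = {
    "Jan": "01", "Feb": "02", "Mar": "03", "Apr": "04",
    "May": "05", "Jun": "06", "Jul": "07", "Aug": "08",
    "Sep": "09", "Oct": "10", "Nov": "11", "Dec": "12",
}
# day, month abbreviation, year; anything after the year is left alone
DATE = re.compile(r"(\d{1,2}) (" + "|".join(MONTHS) + r") (\d{2,4})")


def rfc822_to_iso(value):
    """Turn 'D Mon YYYY' into 'YYYY-MM-DD', keeping any trailing text."""
    def reorder(match):
        day, month, year = match.groups()
        return f"{year}-{MONTHS[month]}-{int(day):02d}"
    return DATE.sub(reorder, value)


print(rfc822_to_iso("1 Feb 2009"))
print(rfc822_to_iso("1 Feb 2009 12:34"))
```

As in the format strings, a two-digit year is passed through unchanged.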

DetlevD · August 24, 2011, 12:55pm

How to copy a list of artists and their roles ...
... from tag-field COMMENT
... to tag-field INVOLVEDPEOPLE

Example 1
From: COMMENT
Person1:Role1
Person2:Role2
Person3:Role3
To: INVOLVEDPEOPLE
Role1:Person1;Role2:Person2;Role3:Person3;

Action: Format value
Field: INVOLVEDPEOPLE
Formatstring:

$regexp(%COMMENT%$char(13),'(.+?):(.+?)[\r\n]+','$2:$1;')

Example 2
From: COMMENT
Person1:Role1
Person2: Role2a,Role2b
Person3 : Role3a, Role3b
Person4: Role4a & Role4b, Role4c
To: INVOLVEDPEOPLE
Role1:Person1;Role2a,Role2b:Person2;Role3a,Role3b:Person3;Role4a & Role4b,Role4c:Person4;

Action: Format value
Field: INVOLVEDPEOPLE
Formatstring:

$regexp($regexp(%COMMENT%$char(13),'(.+?)\s*:\s*(.+?)[\r\n]+','$2:$1;'),'\s*,\s*',',')

DD.20110824.1757.CEST
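
Example 2 can be tried in Python with the same two patterns. This is only a sketch under the assumption that Python's re handles them like Mp3tag does; comment_to_involved_people is a made-up helper name.

```python
import re


def comment_to_involved_people(comment):
    """Swap 'Person:Role' lines into 'Role:Person;' entries."""
    text = comment + "\r"  # same helper as $char(13) in the format string
    swapped = re.sub(r"(.+?)\s*:\s*(.+?)[\r\n]+", r"\2:\1;", text)
    # tidy the blanks around commas inside the role lists
    return re.sub(r"\s*,\s*", ",", swapped)


comment = "\r\n".join([
    "Person1:Role1",
    "Person2: Role2a,Role2b",
    "Person3 : Role3a, Role3b",
    "Person4: Role4a & Role4b, Role4c",
])
print(comment_to_involved_people(comment))
```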

pone · August 24, 2011, 1:29pm

What is $char(13) ? And why is it needed here?

DetlevD · August 24, 2011, 1:48pm

$char(13) is the "CarriageReturn" control character.
It is appended on the fly to the COMMENT string as a helper, to make sure that there is at least one CarriageReturn character at the end of the string. This lets the RegExp work correctly even when the original COMMENT string has no trailing CarriageReturn/LineFeed sequence.

DD.20110824.1750.CEST
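
The effect of the appended character can be seen in a short Python check (a sketch only, assuming Python's re behaves like Mp3tag's engine here). Without a trailing line break the last line is never matched, because the pattern insists on at least one CR or LF after the role.

```python
import re

PAIR = re.compile(r"(.+?):(.+?)[\r\n]+")

comment = "Person1:Role1\r\nPerson2:Role2"  # no line break after the last line

without_cr = PAIR.sub(r"\2:\1;", comment)
with_cr = PAIR.sub(r"\2:\1;", comment + "\r")  # "\r" plays the role of $char(13)

print(without_cr)  # the last line is left as it was
print(with_cr)     # every line is converted
```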

naisanza · February 7, 2013, 10:55pm

If you want to add TEXT to the existing tag:

Regular expression: (.*)
Replace with: TEXT \1 \0

I found that if you don't put in the \0 at the end it will repeat TEXT twice. For example:

Title: Texty text
Becomes: TEXT Texty textTEXT

I'm using v2.48. This might have been fixed in later versions, but it's the version I've been using, and I refuse to update because I've never had any other issue with it except for this one annoyance.

RevRagnarok · April 5, 2013, 7:41am

What seems to be happening is the "end of line" is being matched a second time for some reason.

Using 2.48, I duplicated your bug but also tweaked the RegEx to work properly.

Regular expression: ^(.*)$
Replace with: TEXT $1

That explicitly tells the engine not to include the newline in the match.

dano · April 5, 2013, 10:39am

It's not a bug.
It happens because global matching is activated by default.
That means the engine tries to match the regex pattern as many times as possible.

With (.*) the whole text is matched at first. Then the engine stands at the end of the text and tries to match the pattern again.
And it succeeds, because .* also matches "nothing".

But I'd prefer a simple "Format value" action for this task anyway.
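
The same effect shows up in other engines. Here is a minimal Python sketch (assuming Python 3.7 or later, where an empty match right after a non-empty one is also replaced); it is an analogy and not a statement about how Mp3tag is implemented.

```python
import re

title = "Texty text"

# Global matching finds the whole text and then an empty match at the end.
matches = re.findall(r"(.*)", title)

# Both matches get replaced, so TEXT appears twice.
doubled = re.sub(r"(.*)", r"TEXT \1", title)

# Anchoring the pattern, or limiting it to one replacement, avoids that.
anchored = re.sub(r"^(.*)$", r"TEXT \1", title)
first_only = re.sub(r"(.*)", r"TEXT \1", title, count=1)

print(matches)
print(repr(doubled))
print(repr(anchored))
```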

Luctus · March 10, 2014, 4:36pm

Can you help me to convert 1 Feb 2009 HH:MM to 2009-02-01 HH:MM?

DetlevD · March 10, 2014, 8:38pm

Read there ...
Regular Expressions

DD.20140310.2238.CET

Stem75 · March 11, 2018, 3:54pm

A very nice regex. I changed it a little bit, to this one:

Regex:

((\s*0*)*(\d+)(\s*0*)*)*(.*(\s*0*)*\d+(\s*0*)*)*

Replace with:

$3

because it didn't work for me with these extreme examples:

01/100
0 01/100 0
0 01/ 100
0 01 / 0 0100
0 0 00001 / 0 0 0 100 0 0 0
0 0 20 0 / 0 200 0
0 0 2 0 0 / 0 200 0
/ 0 200 0
/ 0 0 20 0 0

Stem75 · March 12, 2018, 11:45am

I use these regular expressions for my albums:

field: Album

((\s?\bv(ol)?(ume)?)(?=(\s*\.*\s*[0-9])+))(\s*\.*\s*([0-9]+)\s*)
Replace with:
spacev\7space

\s+\(*\[*\s*((dis)?c(d)?)(?=\s*\d)\s*([0-9]*)\s*\]*\)*
Replace with:
spacecd\4space

(v)([0-9](?![0-9]))
Replace with:
v0\2

(\sv\d+)(\s*\-+\s*)*(cd\d+)
Replace with:
\1space\3

to turn this:

Rave Mission Vol. 9 - Cd 2
Rave Mission vol. 9 [cd 2]
Rave Mission Vol. 9 (CD2)
Rave Mission Vol 9 Disc 2
Rave Mission Volume 09 (DISC 2)
Rave Mission Volume 9 CD2
Rave Mission Volume9 (Disc 2)

into this:

Rave Mission v09 cd2

The expressions must be applied in this order to work. Wherever you see the word "space" in a replacement, delete it and press the spacebar instead.

stevehero · May 16, 2018, 3:54pm

Here's an all in one expression:
Replace: (\s)(?i)vol[^\d]*\s*0*(\d+).+?(?:\(|\[)*(?:cd|disc)[^\d]*?\s*(\d+)(?:\)|\])*
With: $1v0$2 cd$3
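
A rough Python port of this expression, for trying it on the album names listed earlier. Two assumptions are made: the bracket groups are read as (?:\(|\[)* and (?:\)|\])*, and (?i) is moved into a flag because recent Python versions reject it in the middle of a pattern. VOL_DISC and normalise_album are illustrative names only.

```python
import re

VOL_DISC = re.compile(
    r"(\s)vol[^\d]*\s*0*(\d+).+?(?:\(|\[)*(?:cd|disc)[^\d]*?\s*(\d+)(?:\)|\])*",
    re.IGNORECASE,
)


def normalise_album(album):
    """Rewrite '... Vol. 9 (CD2)' style endings as '... v09 cd2'."""
    # \g<n> keeps the group numbers apart from the literal "v0" and "cd"
    return VOL_DISC.sub(r"\g<1>v0\g<2> cd\g<3>", album)


samples = [
    "Rave Mission Vol. 9 - Cd 2",
    "Rave Mission vol. 9 [cd 2]",
    "Rave Mission Vol. 9 (CD2)",
    "Rave Mission Vol 9 Disc 2",
    "Rave Mission Volume 09 (DISC 2)",
    "Rave Mission Volume 9 CD2",
    "Rave Mission Volume9 (Disc 2)",
]
results = [normalise_album(name) for name in samples]
print(results)
```

The replacement always writes a 0 after the v, so in this port a two-digit volume such as "Vol 10" would come out as "v010".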

Stem75 · May 17, 2018, 8:20pm

This was very nice and inspiring.

I tried it out and studied it, but with my poor knowledge of regex I couldn't accomplish what I was looking for in one line.
So here is an update of mine.
If you can make it better (I couldn't), I will be very happy.
I hope you succeed, because smaller is better.

This is an updated version for volume and disc renaming in the album tag.
I hope you find it much better!

ALBUM
\s*\(*\[*\s*(v|ol|ume)+(\s*\.*\s*(?=0))?\s*\.*\s*0*([0-9]+)\s*\]*\)*((\s)+|$)
replace with:
spacev\3\5

ALBUM
\s*\(*\[*\s*(dis|c|d)+(\s*\.*\s*(?=0))?\s*\.*\s*0*([0-9]+)\s*\]*\)*((\s)+|$)
replace with:
spacecd\3\5

ALBUM
(\s*)(v)([0-9](?![0-9]))
replace with:
\1v0\3

ALBUM
(\sv\d+)(\s*\-+\s*)*(cd\d+)
replace with:
\1 \3

You can turn this:

Rave Mission V9 cd9
Rave Mission V 9 cd 9
Rave Mission V.9 cd.9
Rave Mission V. 9 cd. 9
Rave Mission Vol9 (cd9)
Rave Mission Vol 9 (cd 9)
Rave Mission Vol.9 (cd.9)
Rave Mission Vol. 9 (cd. 9)
Rave Mission Volume9 disc9
Rave Mission Volume 9 disc 9
Rave Mission Volume.9 disc.9
Rave Mission Volume. 9 disc. 9
Rave Mission Volan9 (disc9)
Rave Mission Volan 9 (disc 9)
Rave Mission Volan.9 (disc.9)
Rave Mission Volum. 9 (disc. 9)
Rave Mission V0123 [cd9]
Rave Mission V 0123 [cd 9]
Rave Mission V.0123 [disc123]
Rave Mission V. 0123 [disc 0123]
Rave Mission Vol 0123 (cd0123)
Rave Mission Vol. 0123 (cd 0123)
Rave Mission Volume0123 (disc0123)
Rave Mission Volume 0123 (disc 0123)
Rave Mission Volume.0123 ( disc 0123 )
Rave Mission V0123 ( discopolis 0123 )
Rave Mission V0123 discopolis 0123
Rave Mission ( discopolis 0123 ) volume 0123
Rave Mission disco 9 volume 09
Rave Mission disc 9 volume 9

Into this:

Rave Mission v09 cd9
Rave Mission Volan9 cd9
Rave Mission Volan 9 cd9
Rave Mission Volan.9 cd9
Rave Mission Volum. 9 cd9
Rave Mission v123 cd9
Rave Mission v123 cd123
Rave Mission v123 ( discopolis 0123 )
Rave Mission v123 discopolis 0123
Rave Mission ( discopolis 0123 ) v123
Rave Mission disco 9 v09
Rave Mission cd9 v09

The expressions must be applied in this order to work. Wherever you see the word "space" in a replacement, delete it and press the spacebar instead.

samples.zip (2.3 MB)

Florian · June 29, 2018, 8:04am

A post was split to a new topic: Extract remix information

guimms · March 24, 2019, 9:30am

NON ASCII detection

A simple expression to place in the Filter:
[^ \t-~]
to be used with a MATCHES clause.

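All tags containing non-ASCII characters will be revealed. Invisible characters such as CR and LF are not flagged by this class, because the range \t-~ includes them; use [^ -~] to reveal those too.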
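
A quick Python check of what the class covers, assuming Python's re reads it the same way as Mp3tag's engine. In this reading the range \t-~ runs from tab to tilde and therefore includes CR and LF; the narrower class [^ -~] is shown for comparison. The names NON_ASCII and STRICT are made up for the example.

```python
import re

NON_ASCII = re.compile(r"[^ \t-~]")  # the class from the post
STRICT = re.compile(r"[^ -~]")       # printable ASCII only, space to tilde

plain = "Plain title"
accented = "Caf\u00e9"               # contains a non-ASCII letter
multiline = "line1\r\nline2"         # ASCII text with CR and LF

print(bool(NON_ASCII.search(plain)))      # nothing outside the class
print(bool(NON_ASCII.search(accented)))   # the accented letter is found
print(bool(NON_ASCII.search(multiline)))  # CR and LF are inside \t-~
print(bool(STRICT.search(multiline)))     # the narrower class finds them
```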

Online regex tool
https://regexr.com/

An online tool useful for testing regexes on text you can import.
It is full of example regexes (working and non-working; it's a community thing).