Case conversion...

yog-sothoth · March 28, 2011, 11:16am

Hi dano, I can't seem to figure the correct syntax out. Could you show me the way, please?

dano · March 28, 2011, 11:31am

$regexp(%_filename%,'(.)(?![\s.'''':;)]"])(?)(?!$)(?!\d)(?!(co|net|org|gov|edu|mil))',$0 )

yog-sothoth · March 28, 2011, 11:51am

Thanks dano.

yog-sothoth · March 28, 2011, 3:05pm

I've just created my first regex script from scratch. It's very basic, but it seems to do its job ok. I just want some advice on whether or not there is a better method.

The purpose of the script is to remove white space between initials and an ampersand (&). For instance, "R & B" to "R&B".

Action type: Replace with regular expression
Field: _TAG
Regular expression: \b([a-zA-Z])\s(&)\s([a-zA-Z])\b
Replace matches with: $1$2$3
[ ] case-sensitive comparison

Did I do alright?

DetlevD · March 28, 2011, 4:11pm

It looks like a working regular expression. Congratulations!

If you want such matches, then it would be the right regular expression for you.

Note:
If you leave 'case-sensitive comparison' unchecked, the character class '[a-zA-Z]' is redundant; '[a-z]' matches the same letters.
Be aware that the pseudo tag-field '_TAG' applies the action to every tag field.
You can also think about how to avoid false positives. A simple $replace could be safer.
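
For anyone who wants to try the pattern outside Mp3tag, here is a minimal Python sketch. It assumes that Python's re.IGNORECASE behaves like leaving 'case-sensitive comparison' unchecked, and the sample strings are made up for illustration. The last one shows the kind of match that may or may not be wanted.

```python
import re

# The pattern from the post above, reduced to [a-z] because matching is
# case-insensitive. re.IGNORECASE is an assumed stand-in for the unchecked
# "case-sensitive comparison" box in Mp3tag.
PATTERN = re.compile(r"\b([a-z])\s(&)\s([a-z])\b", re.IGNORECASE)


def join_initials(text: str) -> str:
    """Remove the whitespace around '&' when it sits between two single letters."""
    return PATTERN.sub(r"\1\2\3", text)


print(join_initials("R & B Classics"))     # initials are joined
print(join_initials("Simon & Garfunkel"))  # full words are left alone
print(join_initials("Side A & B"))         # also joined, possibly unwanted
```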

DD.20110328.2012.CEST

yog-sothoth · March 28, 2011, 5:08pm

Thanks Detlev, I've taken your advice and changed it to "\b([a-z])\s(&)\s([a-z])\b". Works fine. Cheers.

yog-sothoth · March 29, 2011, 12:29pm

Good afternoon, chaps. I've been tinkering with this script for around three weeks now, but finally the first stable version is finished. You can find it, together with instructions and accreditations, on my updated first post on this thread. Thanks to everyone who helped out, I really couldn't have done it without you all. I'm looking forward to some critical feedback, and any suggestions/bug reports would be appreciated. Cheers!

Doug_Mackie · March 29, 2011, 7:16pm

Nicely done, sir, and thanks for sharing!

One note of caution about adding back apostrophes. I think that the reason that people remove them is that some popular burning programs have problems with them.

I encountered this myself. Nero 9 and earlier would crash after dropping a playlist containing apostrophes onto its audio CD burn window. Nero 10 doesn't crash in that scenario, but it does strip them out of CD-Text. I have to add them back by hand before executing the burn. I was unable to determine exactly why apostrophes are problematic.

yog-sothoth · March 29, 2011, 8:14pm

Ha, such modesty! At least 1/3 of it was lifted wholesale from your script. The honour is mine, but the credit must go largely to you. Thanks again.

To be honest, I very rarely burn CDs nowadays, but I can imagine it being a problem for lots of people. I'll update the zip file to include an info file in due course, within which a current bug/compatibility list shall be maintained. Thanks for the tip.

DJBoS · March 31, 2011, 11:51am

Is the "directory names" script actually supposed to change the names of the artist and/or album folders the files are contained within? Or am I misunderstanding the purpose of this one? I didn't notice the actual folder titles changing. Other than that, great scripts, work perfectly. I'm no English major, but it looks like the grammar and everything is coming out right.

yog-sothoth · March 31, 2011, 2:25pm

Yes, exactly that. Try testing it on a containing folder that has lowercase lettering.

yog-sothoth · March 31, 2011, 2:58pm

Script Updated

v.0.3.1 beta

FIXED: Roman Numerals action causing unwanted deletion of "$" symbols.
FIXED: Replay Gain formatting error.

DJBoS · April 4, 2011, 3:10am

I know you have a section in your script that changes things like MacDonald, or McDonald, etc.

I'm noticing that it's also changing words that it shouldn't, for example Machine = MacHine or Creamcheese = CreaMcHeese.

Doug_Mackie · April 4, 2011, 12:21pm

Hello DjBos,

Sorry, the problem with words like "creamcheese" was fixed quite a while ago but I neglected to update that section in the Filenames action of the posted version, and Yog's script uses my Scottish names element as well. It is now fixed so please download the latest version of my script (22 April 2016).

I don't see how to prevent errors with words like "machine". When words are lowercase to begin with (and all of my scripts assume that), a script cannot distinguish lastnames from ordinary nouns beginning with "mac". So the decision is whether you prefer to fix those nouns by hand afterwards, or to correct Scottish names by hand or with a separate action. You can delete or copy the Scottish names element from the actions. It appears once in the file names script and three times in the tag script (for artist, album, and title).
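
To illustrate the ambiguity, here is a rough Python sketch. The rule below is hypothetical and deliberately naive; it is not the Scottish names element from either script. It only shows that a pattern working on lowercase input has nothing to tell "macdonald" apart from "machine".

```python
import re

# Hypothetical naive rule (NOT the rule used in the scripts): at the start of
# a word, capitalize the prefix "mac"/"mc" and the letter that follows it.
NAIVE_MAC = re.compile(r"\b(ma?c)([a-z])([a-z]+)", re.IGNORECASE)


def naive_scottish(text: str) -> str:
    """Apply the naive Mac/Mc capitalization to every matching word."""
    return NAIVE_MAC.sub(
        lambda m: m.group(1).capitalize() + m.group(2).upper() + m.group(3).lower(),
        text,
    )


print(naive_scottish("macdonald"))  # the intended case
print(naive_scottish("machine"))    # an ordinary noun gets the same treatment
```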

DJBoS · April 4, 2011, 12:41pm

After I saw what was happening, I would run the scripts, then filter the directory for "mc" and just manually fix the few files that had changed in the wrong way. Thanks for updating the files, I'll make sure to get the latest rev.

-edit-

Just to clarify: this update is in your personal script, not the ones yog has in this thread?

Doug_Mackie · April 4, 2011, 12:58pm

Correct, only in my scripts. The zip name is the same as before so any link to my script will take you to the new version.

yog-sothoth · April 5, 2011, 10:57am

You're quite right, I hadn't thought of that. Considering how many words beginning with mac there are, making a list of corrections is infeasible. I think the only logical thing to do is to remove support for Mac names, or perhaps replace it with a small corrections list of common ones, like MacDonald. What do you think, Doug?

Doug_Mackie · April 5, 2011, 2:10pm

It depends on the material that you work with. On the one hand, if you collect Celtic music, then surely it would be helpful. There are not too many common words in English that begin with "mac" and have three or more letters after the "c". The ones that do are often imports (like macabre, macaroni, macaroon, and machete) and are rarely found in song titles that I see.

On the other hand, many Americans with Scottish last names (including me) only capitalize the first letter because that became the usage of their ancestors over here. Not all names beginning with "mac" are Scottish (like Hawaiian singer Lena Machado), and usage may be inconsistent (Shakespeare's Macbeth, the Scotch whisky Macallan, Apple Macintosh but MacBook).

To assess your needs or to look for errors, you can use the Mp3tag filter to look for words that begin with "mac" followed by three or more characters (these expressions may look inadequate but they work as described):

* MATCHES \bmac\l{3}
* MATCHES "(?-i:\b(Mac|Mc)\l{3})" [case-sensitive and also detects prefix Mc]
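
For readers who want to run a similar check on a plain list of titles outside Mp3tag, here is a rough Python sketch of the two filters. It assumes that \l (a lowercase letter in Mp3tag's regex flavour) can be written as [a-z] in Python; the sample titles are made up, and one is deliberately miscased.

```python
import re

# Assumed Python equivalents of the two filter expressions above,
# with \l written as [a-z].
CASE_INSENSITIVE = re.compile(r"\bmac[a-z]{3}", re.IGNORECASE)
CASE_SENSITIVE = re.compile(r"\b(Mac|Mc)[a-z]{3}")

# Hypothetical sample data, not taken from anyone's library.
TITLES = [
    "Welcome to the Machine",
    "Lena Machado",
    "Amy Macdonald",
    "Paul Mccartney",  # deliberately miscased
    "Macbeth",
    "Big Mac",
]


def count_hits(pattern: re.Pattern, titles: list[str]) -> int:
    """Count the titles that contain at least one match for the pattern."""
    return sum(1 for title in titles if pattern.search(title))


print(count_hits(CASE_INSENSITIVE, TITLES))
print(count_hits(CASE_SENSITIVE, TITLES))
```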

I found surprisingly few matches, only 33 out of about 10,000 files. Of the 33, 12 were not Scottish names and "machine" and "Machado" accounted for most of those. For now, I am keeping Scottish names in my actions, but I can see a case for removing them.

yog-sothoth · April 5, 2011, 3:48pm

I found 251 matches from ~70,000. That's a comparable ratio to yours. However, the majority of them were not Scottish names.

As my script ignores uppercase letters during case conversion, and assuming that many of the unformatted tags will already have the correct casing for such words, I'm going to remove the Mac comparison and replace it with a corrections list. Cheers.
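
As a rough idea of what a corrections list could look like, here is a minimal Python sketch. The entries and the function name are hypothetical and only illustrate the approach; the actual list in the script may differ.

```python
import re

# Hypothetical corrections list: lowercase spelling -> preferred spelling.
CORRECTIONS = {
    "macdonald": "MacDonald",
    "mcdonald": "McDonald",
    "maccoll": "MacColl",
}

WORD = re.compile(r"\b[A-Za-z]+\b")


def apply_corrections(text: str) -> str:
    """Replace only the words that appear in the list; leave the rest alone."""
    return WORD.sub(
        lambda m: CORRECTIONS.get(m.group(0).lower(), m.group(0)),
        text,
    )


print(apply_corrections("amy macdonald and the machine"))
```

Because only listed words are touched, ordinary nouns such as "machine" pass through unchanged, at the cost of having to maintain the list by hand.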

yog-sothoth · April 5, 2011, 5:50pm

Here's a fix to un-camelcase words beginning with Mac.

Action type: Replace with regular expression
Field: _ALL
Regular expression: \b(Mac)((?:[a-z]|-){3,})
Replace matches with: $1$lower($2,-)
[ ] case-sensitive comparison

Action type: Replace with regular expression
Field: _DIRECTORY
Regular expression: \b(Mac)((?:[a-z]|-){3,})
Replace matches with: $1$lower($2,-)
[ ] case-sensitive comparison
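
The same idea as a Python sketch, for anyone who wants to test it away from Mp3tag. It assumes re.IGNORECASE corresponds to the unchecked 'case-sensitive comparison' box and that lowercasing the second group is what the replacement is meant to do; the function name is made up.

```python
import re

# Same pattern as in the two actions above.
MAC_WORD = re.compile(r"\b(Mac)((?:[a-z]|-){3,})", re.IGNORECASE)


def uncamel_mac(text: str) -> str:
    """Keep the 'Mac' prefix as found and lowercase the rest of the word."""
    return MAC_WORD.sub(lambda m: m.group(1) + m.group(2).lower(), text)


print(uncamel_mac("MacHine Head"))
print(uncamel_mac("Big Mac"))       # fewer than three letters after "Mac"
print(uncamel_mac("Amy MacDonald"))
```

Note that in this sketch a correctly cased "MacDonald" is flattened as well, so a corrections list would have to run after this step rather than before it.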

I'll update the script tomorrow if I have time.