Add a space after a period if none exists in %comment% while filtering terms like U.S.A

Hello,
I searched and was surprised I could not find a solution for this. I am trying to figure out how to add a space after a period in my %comment% field.

For Example: "Test is what I like to use.Using test is good.However, there is a problem with certain abbreviations like O.K. and U.S.A. "
Would be transformed to: "Test is what I like to use. Using test is good. However, there is a problem with certain abbreviations like O.K. and U.S.A"

As my example mentions, there needs to be a way to filter words such as U.S.A or anything with a ".com," ".org" etc. I assume that would require the use of the "|" to separate the terms. I have looked at the scripts I have to capitalize all words except prepositions in titles, but cannot figure out how to adapt them.

You could try:
$regexp('Test is what I like to use.Using test is good.However, there is a problem with certain abbreviations like O.K. and U.S.A.',(\l)\.(\u),$1. $2)

Thanks! Perhaps I was not clear: I was not talking about just this specific example, but about any text contained in the %comment% field. I have a lot of stuff I have amalgamated in the comments section, and sometimes there is a bunch that has just been separated by a period, like:
"Tracks 4-11 & 4-12 - from sessions for "Sailin' Shoes".Tracks 4-11 & 4-12 recorded at Sunwest Recording Studios, Hollywood, CA (4/7/71-4/14/71).EAC FLAC -8"

I would prefer that it would display like this:
"Tracks 4-11 & 4-12 - from sessions for "Sailin' Shoes". Tracks 4-11 & 4-12 recorded at Sunwest Recording Studios, Hollywood, CA (4/7/71-4/14/71). EAC FLAC -8"

I would like to add a step to my clean-up action that will add a space after a period at the end of a sentence, but not after every period, such as those used in abbreviations like O.K., U.S.A. and so forth. I assume one would have to compile a list of exceptions to add to the action, indicating that these terms be skipped and therefore not have a space added after the period. One approach could instruct the action to skip words that are all capitalized, but that would still miss abbreviations like Rmx. or Rmst. for remixed or remastered.

Perhaps that is not clear to you: I can only test my idea with your data.
You, of course, replace that string with the field variable.

And - as it is almost always the case - the first string shows a different pattern than the one that you added now.
My idea was that it is OK to add the space if you have a lower case character, the dot and then an upper case character.
Now there are also the inverted commas.
But such characters could easily be replaced with a simple replace action: replace the inverted commas with inverted commas space.
The same applies to parentheses and such things.
The problematic cases were real letters, and those are catered for by my suggestion.

OK, let me try to figure this out. I am easily confused.

I think I understand now. Thank you! I set up an action using Replace with regular expression. I set the field to COMMENT, the regular expression to "(\l)\.(\u)" and replace matches with to "$1. $2".

If that is the way it is supposed to be set up, it did not work, these are my results:
Before action: EAC FLAC -8.Test if this is O.K.Sentence2. Sentence 3.
After action: EAC FLAC -8.Test if this is O. K.Sentence2. Sentence 3.
Running 2nd time: EAC FLAC -8.Test if this is O. K. Sentence2. Sentence 3.

As you can see, I did not get the desired result. In the first run, there was no space added after the "8" or "K" (I am not sure why), but there was a space added after the "O", which was not the intention. I believe this script would have to be more complex to skip certain cases.

The script should add a space after a period unless one of these conditions applies:

  • the period is followed by a letter and another period (which would indicate an abbreviation or acronym)
  • the period is followed by 2 more periods (which would indicate an ellipsis) or by a comma, closed bracket, parenthesis, or curly bracket, or other punctuation
  • the period is already followed by a space (to avoid adding extra white space)

I cannot think of more exceptions, though I am sure more exist and will present themselves while testing the action. Of course, if others want to offer some, I would appreciate it.

Anyhow, is there a way to incorporate these conditions into the script?
Is there a way to get the script to add a space to a period following a number, as in the "8" in my example? Any idea why it did not add the space?
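
One way the conditions listed above might be combined into a single pattern is with lookaheads. This is only a sketch, written in Python because it is easy to test outside Mp3tag. The function name add_space_after_period and the exception list (com, org, net) are assumptions made up for illustration, and the pattern would still need to be checked in a "Replace with regular expression" action.

```python
import re

# Hypothetical exception list; extend it with whatever endings should be skipped.
SKIPPED_ENDINGS = ("com", "org", "net")

PERIOD_NEEDING_SPACE = re.compile(
    r"\."                # a literal period
    r"(?=[A-Za-z])"      # directly followed by a letter (not a space, digit or punctuation)
    r"(?![A-Za-z]\b)"    # but not by a single letter, as in O.K. or U.S.A
    r"(?!(?:" + "|".join(SKIPPED_ENDINGS) + r")\b)"  # and not by com, org, net
)

def add_space_after_period(text: str) -> str:
    """Insert a space after sentence-ending periods that lack one."""
    return PERIOD_NEEDING_SPACE.sub(". ", text)

print(add_space_after_period("EAC FLAC -8.Test if this is O.K.Sentence2. Sentence 3."))
# EAC FLAC -8. Test if this is O.K. Sentence2. Sentence 3.
```

The trade-off is that a sentence starting with a one-letter word directly after a period (for example "end.A new take") is treated like an abbreviation and left alone, and abbreviations such as Rmx. are not recognised at all.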

I cannot reproduce the result of the added space following the "O.".
$regexp('EAC FLAC -8.Test if this is O.K.Sentence2. Sentence 3.',(\l)\.(\u),$1. $2)
works as expected: it leaves more or less all the dots as they are, because the condition "lower case letter followed by dot followed by upper case letter" is not met.
The String "O.K.Sentence" looks a lot like "U.S.A" - which was the initial example.
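
One possible explanation for the extra space after the "O", offered as a guess rather than a confirmed cause: if the pattern is applied without case-sensitive comparison, \l and \u no longer distinguish lower from upper case. The sketch below imitates this in Python, where [a-z] and [A-Z] stand in for \l and \u (Python's re module does not have those escapes) and re.IGNORECASE stands in for the unchecked option. The function name is made up for illustration.

```python
import re

def space_between_lower_and_upper(text: str, case_sensitive: bool = True) -> str:
    """Python stand-in for the pattern (\\l)\\.(\\u) replaced with '$1. $2'."""
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.sub(r"([a-z])\.([A-Z])", r"\1. \2", text, flags=flags)

before = "EAC FLAC -8.Test if this is O.K.Sentence2. Sentence 3."

# Case-sensitive: nothing matches, so the text is returned unchanged.
print(space_between_lower_and_upper(before))

# Case-insensitive: "O.K" matches on the first run, "K.S" only on the second,
# because the "K" was already consumed by the first match.
once = space_between_lower_and_upper(before, case_sensitive=False)
twice = space_between_lower_and_upper(once, case_sensitive=False)
print(once)   # EAC FLAC -8.Test if this is O. K.Sentence2. Sentence 3.
print(twice)  # EAC FLAC -8.Test if this is O. K. Sentence2. Sentence 3.
```

The case-insensitive runs give the same strings as the reported "After action" and "Running 2nd time" results, so the case-sensitivity setting of the action may be worth checking.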

This also does not match "U.S.A" as "8" is a number.
If you want to replace those as well, try:
$regexp(%comment%,(\d*)\.,$1. )

This could easily be remedied by an action that replaces all double spaces with a single one. Much easier than testing all possibilities for when to add a space or not.
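
The two-step idea (add a space after every period, then collapse doubled spaces) can be tried out in a rough Python sketch. The function names are made up for illustration, and Python's re module only approximates what the Mp3tag actions do.

```python
import re

def space_after_every_period(text: str) -> str:
    # Same pattern as above: optional digits, then a literal period.
    # Because the digits are optional, this matches every period.
    return re.sub(r"(\d*)\.", r"\1. ", text)

def collapse_spaces(text: str) -> str:
    # Replace any run of two or more spaces with a single space.
    return re.sub(r" {2,}", " ", text)

step1 = space_after_every_period("EAC FLAC -8.Test. Done.")
print(repr(step1))                   # 'EAC FLAC -8. Test.  Done. '
print(repr(collapse_spaces(step1)))  # 'EAC FLAC -8. Test. Done. '
```

Note that this leaves a trailing space at the end of the field and also splits abbreviations: "O.K." becomes "O. K. ".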

This is also something for an action of the type "Replace" as you replace string constants with each other.
The original problem you had was that you wanted to replace a variable thingy (like number or letter) with more space in between but keeping the variable parts.
So the action group that you need has a set of replaces with regular expression (those for the patterns) and others where you treat the constant strings.

Thank you! I will try to figure this out and report back.

So I have done some thinking and I agree that the white spaces and bracket information can be addressed by "Replace" actions. The trick is to put them after the action to add a space after a period. I already have some of these built in, so I will have to place the period action prior to them.

I am going to look at this action which may have become a bit unruly. I will attach it, perhaps you can take a look. I pieced it together from other actions I got here, some I was able to figure out myself, and some I created with your help. It has worked well for me, but is not perfect.

In the action "steps," starting 19 actions from the bottom, I have 8 "replace with regular expression" actions that replace a set of spaces with one space, but I am pretty sure that is not how it works and perhaps they should be just "replace" actions. I do have a set of those (simple "replace" actions) earlier in the action, so perhaps I got confused, which seems to happen often to me with all script related topics, and just repeated them with the wrong type of action.

You may note that 11 actions up from the bottom I do have some regular expression actions that deal with multiple cases (separated by a "|") in one action. Maybe I can make one of those for all the special cases to skip (comma, closed bracket, parenthesis, or curly bracket, or other punctuation)? It would be neater than making a "Replace" for each, I believe? I just don't know if you can place it in a "replace with regular expression" action, where you instruct the action NOT to add a space after a period if followed by one of those special cases.
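
A "do NOT add a space if followed by one of these" rule is usually written as a negative lookahead, and a character class can list the special cases without separating each one by "|". Here is a rough sketch in Python; the function name and the exact set of skipped characters are assumptions for illustration, and the pattern has not been tried in Mp3tag itself.

```python
import re

# A period that is NOT followed by white space, another period, common
# punctuation, a closing bracket, or the end of the text.
PERIOD_WITHOUT_SPECIAL_CASE = re.compile(r"\.(?![\s.,;:!?)\]}]|$)")

def add_space_unless_special(text: str) -> str:
    return PERIOD_WITHOUT_SPECIAL_CASE.sub(". ", text)

print(add_space_unless_special("Inc., and (see notes).Next...end."))
# Inc., and (see notes). Next... end.
```

On its own this still splits abbreviations ("O.K." becomes "O. K.") and decimals ("1.5" becomes "1. 5"), so those would need their own exceptions.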

Here is the action I was hoping to modify to serve this purpose:
Action: Replace with regular expression
Field: Comment
Regular expression:
(?<!&|:|;|-|/|!|,|>|/|(?<![A-Z].[A-Z]).|)|?|+)(\s+\b(A|An|And|As|At|But|By|De|Et|For|From|In|Into|Le|Nor|Of|Off|On|Onto|Or|So|Than|The|To|Upon|Von|With)(?=\s)(?!\s[-()[]{}]))
Replace matches with:
$lower($1)
*using case sensitive comparison

Anyhow, I added your suggested replace action:

Field: COMMENT
Regular expression: "(\d*)."
Replace matches with: "$1. "

and this is what happened:

"Test the 8.Now test U.S.A.Now Test O.K.Now test the ellipse...All songs recorded at Muscle Shoals Sound Studios, Sheffield, Alabama.

Produced for Muscle Shoals Sound productions, Lynyrd Skynyrd, Inc., and Sir Productions.
Audio restoration, assembly and CD digital remastering at Audio Mechanics, Los Angeles, CA.

Photography: cover, MCA Archives; page 5, courtesy Judy Van Zant Jenness; page 7, courtesy Judy Van Zant.

Track 1 - Recorded June 28-July 2, 1971. Previously unreleased.

Converted to:

"Test the 8.Now test U.S.A.Now Test O.K.Now test the ellipse...All songs recorded at Muscle Shoals Sound Studios, Sheffield, Alabama..
Produced for Muscle Shoals Sound productions, Lynyrd Skynyrd, Inc., and Sir Productions..
Audio restoration, assembly and CD digital remastering at Audio Mechanics, Los Angeles, CA..
Photography: cover, MCA Archives; page 5, courtesy Judy Van Zant Jenness; page 7, courtesy Judy Van Zant..
Track 1 - Recorded June 28-July 2, 1971. Previously unreleased..
Track 2 - Recorded June 28-July 2, 1971. Originally released on "Lynyrd Skynyrd" boxed set, MCAD3-10390, November 12, 1991.."

Unfortunately, it does not seem to work. There was no space added after the 8, the U.S.A., or the ellipsis. What is weird is that when I paste it here, there are new lines or carriage returns that do not display in the comment. There is no space between "Alabama.." and "Produced for Muscle" in Mp3Tag, but when pasted here it exists. I am not sure why.

I also figure my actions with Char(10) and char(13) are adding the extra period after a sentence punctuated by a period and followed by a new line or carriage return, such as "Alabama.."

Hopefully I can get this script sorted out as it would be very useful if I got it to serve its purpose. As always, many thanks for your help!
Here is the action:
Delete all whitespaces & Case Corrections-2;Length from _Legth)-improved Mixed Caps only on Comment, Album &Title Fields, Barcode format (Edit 2 second suggestion by ohrenkino)).mta (8.7 KB)