If you're trying to find long lines of subtitle text and then find a 
grammatically appropriate place to break each line within the first 23 
characters (or 24 if the break follows one of the punctuation characters 
you've specified), then this should do the job:

(?=^.{42,}$)(.{,23}\b[,.:;"!?]?) \b(.*)

I'm not an expert in written English grammar, but I'm pretty sure that 
within a single line of text, whole words are always separated by a space 
character. The first word of a two-word pair may be followed by a 
punctuation character, but if so, the punctuation character has to be 
followed by a space character. This is part of why your regular expression 
patterns weren't working for you: they put the second \b directly after the 
optional punctuation, and there is no word boundary between a comma and the 
space that follows it, so the comma could only end up in the second group.
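
To see the difference outside BBEdit, here is a minimal sketch using 
Python's re module. I'm assuming it treats these particular constructs the 
same way BBEdit's grep does; the names ORIGINAL, FIXED and split_line are 
made up for illustration.

```python
import re

# Otto's first pattern: the second \b sits directly after the optional punctuation.
ORIGINAL = re.compile(r'(?=^.{42,}$)(.{,23}\b[,.:;"!?]?\b)(.*)')
# Suggested pattern: a literal space separates the two groups and is not captured.
FIXED = re.compile(r'(?=^.{42,}$)(.{,23}\b[,.:;"!?]?) \b(.*)')

def split_line(pattern, line):
    """Return (first part, second part), or None if the line does not match."""
    m = pattern.match(line)
    return m.groups() if m else None

line = "They get rid of things, very simple clothing."
print(split_line(ORIGINAL, line))  # ('They get rid of things', ', very simple clothing.')
print(split_line(FIXED, line))     # ('They get rid of things,', 'very simple clothing.')
```

Lines shorter than 42 characters fail the lookahead, so split_line returns 
None for them.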

The space between the two words where you're separating the line parts 
doesn't need to be captured when reformatting one line into two. It isn't 
proper formatting to carry that space character over to the start of the 
second part of the line. Keeping it at the end of the first part isn't 
necessary either, and a trailing space could complicate any later text 
manipulations.
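
As a rough sketch of the replacement step, again in Python rather than 
BBEdit (there you would reference the two groups in the Replace field with 
a line break between them), the substitution below joins the groups with a 
newline and so drops the separating space. break_long_lines is a 
hypothetical helper name, and re.MULTILINE is my assumption for making ^ 
and $ work per line.

```python
import re

# MULTILINE lets ^ and $ in the lookahead anchor to each line of the text.
PATTERN = re.compile(r'(?=^.{42,}$)(.{,23}\b[,.:;"!?]?) \b(.*)', re.MULTILINE)

def break_long_lines(text):
    """Split each line of 42+ characters in two; the space at the break is dropped."""
    return PATTERN.sub(r"\1\n\2", text)

text = "Short line.\nThey get rid of things, very simple clothing."
print(break_long_lines(text))
```

The short first line is left alone, and neither resulting part begins or 
ends with the space that used to sit at the break.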

The one thing the grep pattern doesn't handle is the double space after 
punctuation marks, a habit carried over from the typewriter days. If the 
text you're dealing with has that, change the " \b" to " {1,2}\b" (all 
without the " characters).
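
Here is a small sketch of what that change does, using Python's re module 
as a stand-in for BBEdit's grep. The sample sentence and the names SINGLE 
and DOUBLE are invented for illustration.

```python
import re

# Single-space version of the pattern.
SINGLE = re.compile(r'(?=^.{42,}$)(.{,23}\b[,.:;"!?]?) \b(.*)')
# Variant that accepts one or two spaces at the break.
DOUBLE = re.compile(r'(?=^.{42,}$)(.{,23}\b[,.:;"!?]?) {1,2}\b(.*)')

line = "He stopped at the door.  Then he turned around and left quickly."
print(SINGLE.match(line).groups())
# ('He stopped at the', 'door.  Then he turned around and left quickly.')
print(DOUBLE.match(line).groups())
# ('He stopped at the door.', 'Then he turned around and left quickly.')
```

With the single-space pattern the match still succeeds, but it backs up to 
an earlier space and misses the break after the period.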

On Tuesday, April 8, 2025 at 2:09:20 AM UTC-7 Otto Munters wrote:

> Error in regex for BBedit:
> What is wrong in this regex?
> (?=^.{42,}$)(.{,23}\b[,.:;"!?]?\b)(.*)
>
> problem: the comma is also moved with capture group 2, it should stay with 
> capture group 1
>
> Example, whole sentence to be split in two parts:
> They get rid of things, very simple clothing.
>
> regex with error returns:
> 1st line: They get rid of things
> 2nd line: , very simple clothing.
>
> should be:
> 1st line: They get rid of things,
> 2nd line: very simple clothing.
>
> punctuation mark is not included in correct capture group
>
> I tried different patterns, like:
> (?=^.{42,}$)(.{0,23}\b[\w,.:;"!?]*\b)(.*)     also not working right
>
> (?=^.{42,}$)(.{1,23}\b(?:[,.:;"!?]?)\b)(.*)   also not working right
>
> (?=^.{42,}$)(?<Group1>.{1,23}\b[,.:;"!?]?)\b(?<Group2>.*)   also not 
> working right
>
>
> Thanks a lot for your help!
> Otto
>
