If you're trying to find long lines of subtitle text and then find a
grammatically appropriate place to break each line within the first 23
characters (or 24 if one of the punctuation characters you've specified
follows), then this should do the job:
(?=^.{42,}$)(.{,23}\b[,.:;"!?]?) \b(.*)
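I'm not an expert in written English grammar, but I'm pretty sure that
within a single line of text whole words are always separated by a space
character. The first word of a two-word pair may be followed by a
punctuation character, but if so, the punctuation character has to be
followed by a space character. This is part of why your regular expression
patterns weren't working for you: with \b directly after the optional
punctuation, a comma followed by a space can't satisfy the word boundary
(neither one is a word character), so the match backs off to the boundary
before the comma and the comma ends up in the second group.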
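To see the difference outside BBEdit, here is a minimal sketch in Python. It
assumes Python's re module treats these particular constructs the same way
BBEdit's grep engine does, which should hold for \b, lookahead and the
quantifiers used here, but it is not BBEdit itself. The helper name
split_groups is made up for the illustration.

```python
import re

SAMPLE = "They get rid of things, very simple clothing."

# Pattern from the question: \b sits directly after the optional punctuation.
ORIGINAL = re.compile(r'(?=^.{42,}$)(.{,23}\b[,.:;"!?]?\b)(.*)', re.MULTILINE)

# Pattern from this reply: a literal space separates the two groups.
FIXED = re.compile(r'(?=^.{42,}$)(.{,23}\b[,.:;"!?]?) \b(.*)', re.MULTILINE)


def split_groups(pattern, line):
    """Return the two captured parts of a line, or None if it doesn't match."""
    match = pattern.search(line)
    return match.groups() if match else None


# The original pattern leaves the comma at the start of the second part.
print(split_groups(ORIGINAL, SAMPLE))
# The fixed pattern keeps the comma with the first part.
print(split_groups(FIXED, SAMPLE))
```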
The space between the two words where you're separating the line parts
doesn't need to be captured when reformatting one line into two. It isn't
proper formatting to carry that space character over to the second part of
the line. Keeping the space character with the first part isn't necessary
either, and if kept it could complicate any later text manipulations.
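As a rough illustration of dropping that space, here is a Python sketch of
the replacement step. The replacement string (group 1, a line break, group
2) and the function name break_long_lines are assumptions made for this
example; only the search pattern comes from this reply.

```python
import re

PATTERN = re.compile(r'(?=^.{42,}$)(.{,23}\b[,.:;"!?]?) \b(.*)', re.MULTILINE)


def break_long_lines(text):
    """Split each line of 42+ characters into two lines.

    The space at the break point is matched outside both groups,
    so it is dropped rather than carried onto either new line.
    """
    return PATTERN.sub(r'\1\n\2', text)


text = "They get rid of things, very simple clothing.\nShort line stays."
result = break_long_lines(text)
print(result)
```

Lines shorter than 42 characters fail the lookahead and pass through
unchanged.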
The one thing the grep pattern doesn't handle is double spaces after
punctuation marks, a habit carried over from the typewriter days. If the
text you're dealing with has that, change the " \b" to " {1,2}\b" (all
without the " characters).
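A small Python sketch of what that change does, under the same assumption
that Python's re behaves like BBEdit's grep for these constructs. The sample
sentence is invented for the illustration.

```python
import re

# Single-space version from this reply.
SINGLE = re.compile(r'(?=^.{42,}$)(.{,23}\b[,.:;"!?]?) \b(.*)', re.MULTILINE)

# Variant that also accepts two spaces at the break point.
DOUBLE = re.compile(r'(?=^.{42,}$)(.{,23}\b[,.:;"!?]?) {1,2}\b(.*)', re.MULTILINE)

# Invented sample line with two spaces after the full stop.
line = "He left the house.  Nobody stayed behind that day."

# The single-space pattern can't break at the double space,
# so it falls back to an earlier, less natural break.
single_parts = SINGLE.search(line).groups()

# The {1,2} variant breaks after the full stop and drops both spaces.
double_parts = DOUBLE.search(line).groups()

print(single_parts)
print(double_parts)
```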
On Tuesday, April 8, 2025 at 2:09:20 AM UTC-7 Otto Munters wrote:
> Error in regex for BBedit:
> What is wrong in this regex?
> (?=^.{42,}$)(.{,23}\b[,.:;"!?]?\b)(.*)
>
> problem: the comma is also moved with capture group 2, it should stay with
> capture group 1
>
> Example, whole sentence to be split in two parts:
> They get rid of things, very simple clothing.
>
> regex with error returns:
> 1st line: They get rid of things
> 2nd line: , very simple clothing.
>
> should be:
> 1st line: They get rid of things,
> 2nd line: very simple clothing.
>
> punctuation mark is not included in correct capture group
>
> I tried different patterns, like:
> (?=^.{42,}$)(.{0,23}\b[\w,.:;"!?]*\b)(.*) also not working right
>
> (?=^.{42,}$)(.{1,23}\b(?:[,.:;"!?]?)\b)(.*) also not working right
>
> (?=^.{42,}$)(?<Group1>.{1,23}\b[,.:;"!?]?)\b(?<Group2>.*) also not
> working right
>
>
> Thanks a lot for your help!
> Otto
>