In your original example:

  print "match1='$1' '$2'\n" if ($T=~/^((mr|mrs|miss|dr|prof|sir) .{5,}?)\n/smi);
  print "match2='$1' '$2'\n" if ($T=~/^(mr|mrs|miss|dr|prof|sir .{5,}?)\n/smi);

the interior parentheses in example one terminate the alternation, so the last
alternative is 'sir'.

In example two, the alternation is not terminated until the first ')', so the
last alternative is 'sir .{5,}?'. Every alternative, including 'miss', must
then be followed directly by the "\n" in the regular expression. Since in $T
'miss' is followed by a space rather than a "\n", the match fails. Vlado has
explained how to group and terminate the alternation without capturing the
match result.
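
To see the difference side by side, here is a small sketch that runs both patterns above, plus the non-capturing form from Vlado's reply, against one string. The contents of $T were never posted, so the sample below is my assumption, and the sub names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample; the real contents of $T were not posted.
my $T = "Dear customer\nMiss Jayne Doe\n1 High Street\n";

# Each sub returns the captures as a list, or an empty list on failure.
sub grouped    { return $T =~ /^((mr|mrs|miss|dr|prof|sir) .{5,}?)\n/smi }
sub ungrouped  { return $T =~ /^(mr|mrs|miss|dr|prof|sir .{5,}?)\n/smi }
sub noncapture { return $T =~ /^((?:mr|mrs|miss|dr|prof|sir) .{5,}?)\n/smi }

for my $case ([grouped    => \&grouped],
              [ungrouped  => \&ungrouped],
              [noncapture => \&noncapture]) {
    my ($label, $code) = @$case;
    my @captures = $code->();    # list context: returns ($1, $2, ...)
    print "$label: ",
        (@captures ? join(', ', map { "'$_'" } @captures) : 'no match'),
        "\n";
}
```

With that sample, this should report two captures for the grouped pattern ('Miss Jayne Doe' and 'Miss'), no match for the ungrouped one, and a single capture for the non-capturing one.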
> On Dec 2, 2020, at 6:08 AM, Gary Stainburn <[email protected]>
> wrote:
>
> On 02/12/2020 13:56, Vlado Keselj wrote:
>> Well, it seems that the first one is what you want, but you just need to
>> use $1 and ignore $2.
>>
>> You do need parentheses in '(mr|mrs|miss|dr|prof|sir)' but if you do not
>> want for them to be captured in $2, you can use:
>> '(?:mr|mrs|miss|dr|prof|sir)'. For example:
>>
>> print "match3='$1' '$2'\n" if
>> ($T=~/^((?:mr|mrs|miss|dr|prof|sir) .{5,}?)\n/smi);
>>
>> would give output:
>>
>> match3='Miss Jayne Doe' ''
> Perfect, thank you.
>
> I can't ignore $2 as it's in a loop with other regex that genuinely returns
> multiple matches. The amendment to the REGEX worked perfectly.
It is always best to save the results of a match with capturing in another
variable. The capturing variables $1, $2, etc. are not reassigned if a match
fails, so if you use them after a failed match, they will still hold the
values left over from the previous successful match. So check that the match
succeeded, then do this:

  my $salutation = $1;
  my $name = $2;

If you don't want a possible undefined value, do this instead:

  my $name = $2 || '';

(On Perl 5.10 or later, $2 // '' is more precise, because || also replaces a
captured '0'.)
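
As a rough sketch of that advice, the code below copies the captures only inside the branch where the match succeeded, and also shows how a failed match leaves an old $1 behind. The sub names, the pattern with an optional name part, and the sample strings are my own illustrations, not from your loop. The // operator needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Returns (salutation, name) on success and an empty list on failure.
# The name part is optional here so that $2 can really be undefined.
sub split_name {
    my ($line) = @_;
    if ($line =~ /^(mr|mrs|miss|dr|prof|sir)(?: (.+))?$/i) {
        my $salutation = $1;          # copy at once, inside the success branch
        my $name       = $2 // '';    # empty string instead of undef
        return ($salutation, $name);
    }
    return;                           # empty list: nothing stale leaks out
}

# Shows the trap: the second match fails, so $1 still holds the
# value captured by the first one.
sub stale_capture {
    my ($first, $second) = @_;
    $first  =~ /^(miss|sir)/i;        # assumed to succeed
    $second =~ /^(miss|sir)/i;        # assumed to fail
    my $leftover = $1;
    return $leftover;
}

for my $line ('Miss Jayne Doe', 'Sir', '1 High Street') {
    if (my ($salutation, $name) = split_name($line)) {
        print "salutation='$salutation' name='$name'\n";
    }
    else {
        print "no match for '$line'\n";
    }
}
print "leftover \$1: '", stale_capture('Miss Jayne Doe', '1 High Street'), "'\n";
```

Returning an empty list on failure lets the caller test the assignment itself, so the capture variables never need to be read outside the sub.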
Jim Gibson
[email protected]