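Your first regex already does what you want: just use $1 and ignore $2.
You do need the parentheses in '(mr|mrs|miss|dr|prof|sir)'. Without them
the alternation applies to the whole pattern, so the alternatives become
'mr', 'mrs', 'miss', 'dr', 'prof' and 'sir .{5,}?', and a bare 'miss'
followed directly by \n never matches your text. If you do not want the
title captured in $2, use a non-capturing group:
'(?:mr|mrs|miss|dr|prof|sir)'. For example: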
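  print "match3='$1' '$2'\n" if
    ($T=~/^((?:mr|mrs|miss|dr|prof|sir) .{5,}?)\n/smi);
would give output:
  match3='Miss Jayne Doe' ''
Note that $2 is now undefined because there is no second capture group, so
under 'use warnings' Perl will also warn about an uninitialized value; in
real code, print only $1.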
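
Here is a minimal, self-contained sketch of the same idea, in case it helps
to see it run. The sample text is copied from your post; the helper name
find_titled_name is my own invention for illustration and is not part of
your script. It also shows that the version without the grouping
parentheses finds nothing.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample text copied from the original post.
my $T = <<'EOF';
Customer name and address
Miss Jayne Doe
19 Their Street
Somewehere
In Yorkshire
IN1 3YY
EOF

# Hypothetical helper (name chosen for this example only).
# (?:...) groups the titles for the alternation without creating $2,
# so the only capture is the whole "title + name" line in $1.
sub find_titled_name {
    my ($text) = @_;
    if ($text =~ /^((?:mr|mrs|miss|dr|prof|sir) .{5,}?)\n/smi) {
        return $1;
    }
    return;    # undef in scalar context when nothing matches
}

my $name = find_titled_name($T);
print "match3='$name'\n" if defined $name;

# Without the inner group the alternatives are 'mr', 'mrs', 'miss', 'dr',
# 'prof' and 'sir .{5,}?', each of which must be followed directly by \n.
print "match2 found nothing\n"
    unless $T =~ /^(mr|mrs|miss|dr|prof|sir .{5,}?)\n/smi;
```

Returning the capture from a small sub also avoids reading $1 after a
failed match, where it would still hold the value from an earlier
successful match.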
On Wed, 2 Dec 2020, Gary Stainburn wrote:
> I have an array of regex expressions that I apply to text returned from
> tesseract.
>
> Each match that I get then gets stored for future processing. However, I'm
> struggling with one regex.
>
> The problem is that:
>
> 1) with brackets round the titles it returns two matches.
> 2) without brackets, it returns nothing.
>
> Can anyone point me at the correct syntax please.
>
> Gary
>
> [root@dev dev]# ./t
> match1='Miss Jayne Doe' 'Miss'
> [root@dev dev]# cat t
> #!/usr/bin/perl
>
> use strict;
> use warnings;
>
> my $T=<<EOF;
> Customer name and address
> Miss Jayne Doe
> 19 Their Street
> Somewehere
> In Yorkshire
> IN1 3YY
> EOF
>
> print "match1='$1' '$2'\n" if ($T=~/^((mr|mrs|miss|dr|prof|sir) .{5,}?)\n/smi);
> print "match2='$1' '$2'\n" if ($T=~/^(mr|mrs|miss|dr|prof|sir .{5,}?)\n/smi);
> [root@dev dev]#
>
> --
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
> http://learn.perl.org/
>
>