Brano Gerzo wrote:
> hello all!
>
> I'd like to ask for help with this regexp. I want to match
> these examples:
>
> word word
> 3 word word
> 3 word word en
> 3 word word en,pt
> 3 word word en,pt 1cd
>
> ok, here is the regexp I wrote:
> ^\s*(\d{1,2}\s+)?([\w\s\+:]+)
> (sq|hy|ay|bs|bg|hr|cs|da|nl|en|et|fi|fr|de|gr|he|hu|zh|it|ja|kk|lv|
> pl|pt|pb|ro|ru|sr|sk|sl|es|sv|th|tr|uk|al|\s*,\s*)*\s*(?:(\d)cd)?$
>
> The problem is that I can't get "en,pt" captured together; "en" is
> matched into $2. Can anyone help me with this, please?
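
I don't understand what you are trying to match with "[\w\s\+:]+". It
matches any run of characters from the class containing [[:word:]],
[[:space:]], a plus and a colon, so "a b :c" would match. Since that class
also covers letters and whitespace, and "+" is greedy, in
"3 word word en,pt" it runs on past the words and takes "en" as well,
stopping only at the comma. That is why "en" ends up in $2.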
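
To see where the "en" goes, here is a minimal sketch that runs a cut-down copy of the pattern on one of the sample lines. The language alternation is shortened to three codes purely for readability (my assumption, not the original list), and the variable names $orig, $line and @groups are just illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Cut-down copy of the pattern from the question. Only three language
# codes are kept (an assumption made to keep the example short).
my $orig = qr/^\s*(\d{1,2}\s+)?([\w\s\+:]+)(en|pt|de|\s*,\s*)*\s*(?:(\d)cd)?$/;

my $line = '3 word word en,pt';

# In list context the match returns the four capture groups.
my @groups = $line =~ $orig;

# Show each group in brackets; an undefined group is shown as [].
print join( ' | ', map { defined $_ ? "[$_]" : '[]' } @groups ), "\n";
# prints: [3 ] | [word word en] | [pt] | []
```

The second group has already taken "word word en" by the time the language group gets a turn, and a repeated capture group such as (...)* only keeps its last iteration, so $3 ends up holding just "pt".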
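#!/usr/bin/perl
# A different approach: build the regexp from small, named pieces.
use warnings ;
use strict ;

# Helpers that wrap a sub-pattern.
sub sp           { '[[:blank:]]+' }   # one or more spaces or tabs
sub capture      { "(@_)" }           # capturing group
sub optional     { "(?:@_)?" }        # non-capturing, zero or one time
sub zero_or_more { "(?:@_)*" }        # non-capturing, zero or more times

# Building blocks.
sub REnumber { '\d+' }
sub REword   { '\w+' }
sub RElang   { '
  (?:
    a[ly]|b[gs]|cs|d[ae]|e[nst]|
    f[ir]|gr|h[eruy]|it|ja|kk|lv|nl|
    p[blt]|r[ou]|s[klqrv]|t[hr]|uk|zh)
' }
sub REwordlist { REword() . sp() . REword() }                  # exactly two words
sub RElanglist { RElang() . zero_or_more( ',' . RElang() ) }   # en or en,pt,...

my $re = optional( capture( REnumber() ) . sp() )      # $1: leading number
       . capture( REwordlist() )                       # $2: the words
       . optional( sp() . capture( RElanglist() ) )    # $3: language list
       . optional( sp() . capture('\d+') . 'cd' ) ;    # $4: number of CDs

print "re/$re/\n\n\n" ;

# /x ignores the whitespace inside $re; ^ and $ anchor the whole line.
my $qr = qr/ ^ \s* $re \s* $ /x ;

while ( <DATA> )
{
    # $1, $3 and $4 are undefined when their optional part is absent.
    no warnings 'uninitialized' ;
    print "\n" ;
    print ;
    /$qr/ and print "($1) ($2) ($3) ($4)\n" ;
}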
__DATA__
word word
3 word word
3 word word en
3 word word en,pt
3 word word en,pt 1cd
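
Once the whole list such as "en,pt" is captured in one group, the individual codes can be recovered with split. The following is only a sketch of that idea written as a single commented /x pattern. The names parse_line, $line_re and $lang are hypothetical, the language list is shortened to five codes for brevity, and the word part accepts any number of words, which is an assumption on my side.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Shortened language list (assumption: the real list is longer).
my $lang = qr/(?:en|pt|de|fr|nl)/;

my $line_re = qr/
    ^ \s*
    (?: (\d{1,2}) \s+ )?                  # group 1: optional leading number
    ( \w+ (?: \s+ \w+ )*? )               # group 2: the words, as few as needed
    (?: \s+ ( $lang (?: , $lang )* ) )?   # group 3: comma-separated languages
    (?: \s+ (\d) cd )?                    # group 4: number of CDs
    \s* $
/x;

# Hypothetical helper: returns a hash reference, or nothing if the
# line does not match.
sub parse_line {
    my ($line) = @_;
    my ( $count, $title, $langs, $cds ) = $line =~ $line_re
        or return;
    return {
        count => $count,
        title => $title,
        langs => [ defined $langs ? split( /,/, $langs ) : () ],
        cds   => $cds,
    };
}

for my $line ( '3 word word en,pt 1cd', 'word word' ) {
    my $rec = parse_line($line) or next;
    printf "%s: [%s]\n", $rec->{title}, join( ' ', @{ $rec->{langs} } );
}
```

The word group is made non-greedy so that it gives the trailing language list and CD count a chance to match first. A title whose last word happens to be a language code would still be read as a language, which is an ambiguity of the input format itself.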
--
Affijn, Ruud
"Gewoon is een tijger."