Re: reusing parts of a string in RE matches?

2006-05-10 Thread mpeters42
From the Python 2.4 docs:

  findall( pattern, string[, flags])
  Return a list of all ***non-overlapping*** matches of pattern in
string

By design, the regex functions return non-overlapping matches.

Without doing some kind of looping, I think you are out of luck.

If your pattern is fixed, then a solution might be:
>>> import re
>>> string = 'abababababababab'
>>> pat = 'aba'
>>> # the lookahead matches an empty string wherever pat starts
>>> [pat for s in re.compile('(?='+pat+')').findall(string)]
['aba', 'aba', 'aba', 'aba', 'aba', 'aba', 'aba']

If the pattern is not fixed (e.g. 'a.a'), then this method as written
can still get a count of the overlapping matches, but not the
individual match strings themselves: the lookahead consumes no
characters, so findall returns only empty strings.

A simple loop should do in this case, though:

>>> for i in range(len(string)):
...     r = re.match(pat, string[i:])
...     if r: print(r.group())
...
aba
aba
aba
aba
aba
aba
aba
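
The individual matches for a wildcard pattern such as 'a.a' can
apparently also be collected without an explicit loop, by putting a
capturing group inside the lookahead. This is only a sketch; it relies
on findall returning the group's contents rather than the (empty)
overall match when the pattern contains exactly one group. The test
string is the one used above.

```python
import re

string = 'abababababababab'

# Zero-width lookahead wrapping a capturing group: the overall match is
# empty, so the scan moves forward one character at a time, while the
# group records the text the pattern would match at that position.
overlapping = re.findall(r'(?=(a.a))', string)
print(overlapping)
```

On a string with varying middle characters, e.g. 'abaca', the same
call should return both 'aba' and 'aca'.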

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: reusing parts of a string in RE matches?

2006-05-10 Thread mpeters42
Exactly,

Now, this will work only as long as there are no wildcards in the
pattern, that is, only with fixed strings.  But if you have a fixed
string, there is really no need to use a regex; it will complicate your
life for no real reason when simple string methods will do.
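
As a rough sketch of the string-method route for a fixed substring:
str.find takes an optional start index, so restarting the search one
character after each hit picks up overlapping occurrences too. The
function name find_overlapping is made up for this example.

```python
def find_overlapping(text, sub):
    """Return the start index of every occurrence of sub, overlaps included."""
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Resume one character past the previous hit, not past its end,
        # so that overlapping occurrences are not skipped.
        start = text.find(sub, start + 1)
    return positions

print(find_overlapping('abababababababab', 'aba'))
```

The length of the returned list gives the count of overlapping
occurrences, and each index can be used to slice the original string.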

With a more complex pattern (like 'a.a': match any character between
two 'a' characters) this will get the number of matches, but not which
character is between the a's.

One way to actually do that is to iterate through the string and apply
re.match (which matches only at the beginning of a string) to an
indexed slice of the original (see the example in the previous post).
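
A minimal sketch of that iteration, wrapped in a function. The name
overlapping_matches is made up for illustration. Instead of slicing the
string, it assumes a compiled pattern, whose match() method accepts a
start position; that is not fully equivalent to slicing (for instance,
'^' still refers to the real start of the string), so treat it as one
possible variant.

```python
import re

def overlapping_matches(pattern, text):
    """Collect the text matched at every start position, overlaps included."""
    compiled = re.compile(pattern)
    results = []
    for i in range(len(text)):
        # match() on a compiled pattern takes a start position, which
        # avoids building a new slice of the text on every iteration.
        m = compiled.match(text, i)
        if m:
            results.append(m.group())
    return results

print(overlapping_matches('a.a', 'abacadae'))
```

With a wildcard pattern this keeps the actual characters found between
the a's, which the lookahead-count approach does not.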
