subject:"[Tutor] regex: matching unicode"

Re: [Tutor] regex: matching unicode

2012-12-24 Thread eryksun

On Mon, Dec 24, 2012 at 2:51 AM, Albert-Jan Roskam wrote:
>
> First, check if the first character is a (unicode) letter

You can use unicode.isalpha, with a caveat. On a narrow build isalpha fails for supplementary planes. That's about 50% of all alphabetic characters, +/- depending on the version
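For illustration, here is a minimal sketch of the "first character is a letter" check. It assumes Python 3.3 or later, where there are no narrow builds any more and the type is str rather than unicode; the helper name starts_with_letter is hypothetical and not from the thread.

```python
def starts_with_letter(text):
    """Return True if the first code point of `text` is alphabetic."""
    # Guard against the empty string, where text[0] would raise IndexError.
    return bool(text) and text[0].isalpha()


# U+10400 (DESERET CAPITAL LETTER LONG I) lies outside the Basic Multilingual Plane.
supplementary = "\U00010400"

# On Python 3.3+ this is a single code point; an old narrow build would
# have stored it as a surrogate pair of length 2.
print(len(supplementary))
print(starts_with_letter(supplementary + "x"))
print(starts_with_letter("9lives"))
```

On an old narrow build, text[0] would pick up only the first half of the surrogate pair, which is presumably why isalpha fails there.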

Re: [Tutor] regex: matching unicode

2012-12-23 Thread Albert-Jan Roskam

>> Is the code below the only/shortest way to match unicode characters? I would
>> like to match whatever is defined as a character in the unicode reference
>> database. So letters in the broadest sense of the word, but not digits,
>> underscore or whitespace. Until just now, I was convinced that th

Re: [Tutor] regex: matching unicode

2012-12-23 Thread eryksun

On Sat, Dec 22, 2012 at 11:12 PM, Steven D'Aprano wrote:
>
> No. You could install a more Unicode-aware regex engine, and use it instead
> of Python's re module, where Unicode support is at best only partial.
>
> Try this one:
>
> http://pypi.python.org/pypi/regex

Looking over the old docs, I cou

Re: [Tutor] regex: matching unicode

2012-12-22 Thread Steven D'Aprano

On 23/12/12 07:53, Albert-Jan Roskam wrote:
> Hi,
>
> Is the code below the only/shortest way to match unicode characters?

No. You could install a more Unicode-aware regex engine, and use it instead of Python's re module, where Unicode support is at best only partial.

Try this one:

http://pypi.py

Re: [Tutor] regex: matching unicode

2012-12-22 Thread Hugo Arts

On Sat, Dec 22, 2012 at 9:53 PM, Albert-Jan Roskam wrote:
> Hi,
>
> Is the code below the only/shortest way to match unicode characters? I
> would like to match whatever is defined as a character in the unicode
> reference database. So letters in the broadest sense of the word, but not
> digits,

[Tutor] regex: matching unicode

2012-12-22 Thread Albert-Jan Roskam

Hi,

Is the code below the only/shortest way to match unicode characters? I would like to match whatever is defined as a character in the unicode reference database. So letters in the broadest sense of the word, but not digits, underscore or whitespace. Until just now, I was convinced that the r
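The code the question refers to is not included in this snippet. As a rough sketch of one common way to match "letters, but not digits, underscore or whitespace" with the standard re module, the character class below takes \w and subtracts \d and the underscore. It assumes Python 3, where str patterns are Unicode-aware by default; the names LETTERS and find_letter_runs are hypothetical.

```python
import re

# [^\W\d_] reads as: not a non-word character, not a decimal digit, and not
# an underscore. What remains is roughly "word characters that are letters".
LETTERS = re.compile(r"[^\W\d_]+")


def find_letter_runs(text):
    """Return every maximal run of letter-like characters in `text`."""
    return LETTERS.findall(text)


print(find_letter_runs("Привет_мир 123 αβγ"))
```

This is an approximation, not an exact match for the Unicode letter categories: \w is wider than "letters plus decimal digits", so the class may also admit some numeric characters that \d does not cover.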


6 matches
