Thank you all. I got it. :) I need to read more between the lines.
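A minimal sketch of the point made in the replies quoted below: a raw string only stops Python from processing backslash escapes, so the "*" still has to be escaped for the regex engine. The variable names are made up for illustration, and re.escape() is assumed to be the kind of "regex escaping" helper mentioned in the thread.

```python
import re

string = "<H*>test<H*>"

# A raw string keeps the backslash as-is; it does not add any escaping.
# The raw literal below is the same string as the one with a doubled backslash.
same_pattern = (r'<H\*>' == '<H\\*>')

# Unescaped, "H*" means "zero or more H", so the literal "*" is never matched.
unescaped = re.match(r'<H*>', string)

# With the "*" escaped, the pattern matches the literal text "<H*>".
escaped = re.match(r'<H\*>', string)

# re.escape() builds an escaped pattern from the literal text.
auto = re.match(re.escape('<H*>'), string)

print(same_pattern)
print(unescaped)
print(escaped.group())
print(auto.group())
```

Here the unescaped pattern gives None, which is why calling .group() on it raised the AttributeError, while the other two patterns match the leading "<H*>".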
On Wed, Feb 19, 2014 at 4:25 AM, spir <denis.s...@gmail.com> wrote:

> On 02/18/2014 08:39 PM, Zachary Ware wrote:
>
>> Hi Santosh,
>>
>> On Tue, Feb 18, 2014 at 9:52 AM, Santosh Kumar <rhce....@gmail.com>
>> wrote:
>>
>>> Hi All,
>>>
>>> If you look at the example below, case I is working as expected.
>>>
>>> Case I:
>>> In [41]: string = "<H*>test<H*>"
>>>
>>> In [42]: re.match('<H\*>',string).group()
>>> Out[42]: '<H*>'
>>>
>>> But why is the raw string 'r' not working as expected?
>>>
>>> Case II:
>>>
>>> In [43]: re.match(r'<H*>',string).group()
>>> ---------------------------------------------------------------------------
>>> AttributeError                            Traceback (most recent call last)
>>> <ipython-input-43-d66b47f01f1c> in <module>()
>>> ----> 1 re.match(r'<H*>',string).group()
>>>
>>> AttributeError: 'NoneType' object has no attribute 'group'
>>>
>>> In [44]: re.match(r'<H*>',string)
>>>
>>
>> It is working as expected, but you're not expecting the right thing
>> ;). Raw strings don't escape anything, they just prevent backslash
>> escapes from expanding. Case I works because "\*" is not a special
>> character to Python (like "\n" or "\t"), so it leaves the backslash in
>> place:
>>
>> >>> '<H\*>'
>> '<H\*>'
>>
>> The equivalent raw string is exactly the same in this case:
>>
>> >>> r'<H\*>'
>> '<H\*>'
>>
>> The raw string you provided doesn't have the backslash, and Python
>> will not add backslashes for you:
>>
>> >>> r'<H*>'
>> '<H*>'
>>
>> The purpose of raw strings is to prevent Python from recognizing
>> backslash escapes. For example:
>>
>> >>> path = 'C:\temp\new\dir' # Windows paths are notorious...
>> >>> path # it looks mostly ok... [1]
>> 'C:\temp\new\\dir'
>> >>> print(path) # until you try to use it
>> C:      emp
>> ew\dir
>> >>> path = r'C:\temp\new\dir' # now try a raw string
>> >>> path # Now it looks like it's stuffed full of backslashes [2]
>> 'C:\\temp\\new\\dir'
>> >>> print(path) # but it works properly!
>> C:\temp\new\dir
>>
>> [1] Count the backslashes in the repr of 'path'. Notice that there is
>> only one before the 't' and the 'n', but two before the 'd'. "\d" is
>> not a special character, so Python didn't do anything to it. There
>> are two backslashes in the repr of "\d", because that's the only way
>> to distinguish a real backslash; the "\t" and "\n" are actually the
>> TAB and LINE FEED characters, as seen when printing 'path'.
>>
>> [2] Because they are all real backslashes now, so they have to be
>> shown escaped ("\\") in the repr.
>>
>> In your regex, since you're looking for, literally, "<H*>", you'll
>> need to backslash escape the "*" since it is a special character *in
>> regular expressions*. To avoid having to keep track of what's special
>> to Python as well as regular expressions, you'll need to make sure the
>> backslash itself is escaped, to make sure the regex sees "\*", and the
>> easiest way to do that is a raw string:
>>
>> >>> re.match(r'<H\*>', string).group()
>> '<H*>'
>>
>> I hope this makes some amount of sense; I've had to write it up
>> piecemeal and will never get it posted at all if I don't go ahead and
>> post :). If you still have questions, I'm happy to try again. You
>> may also want to have a look at the Regex HowTo in the Python docs:
>> http://docs.python.org/3/howto/regex.html
>>
>
> In addition to all this:
> * You may be confusing raw strings with "regex escaping" (a tool func that
> escapes special regex characters for you).
> * For simplicity, always use raw strings for regex formats (as in your
> second example); this does not spare you from escaping special characters,
> but you only have to do it once!
>
>
> d
> _______________________________________________
> Tutor maillist - Tutor@python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor
>

--
D. Santosh Kumar
RHCE | SCSA
+91-9703206361
Every task has an unpleasant side ..
But you must focus on the end result you are producing.