wesley chun wrote:
>> so it's guaranteed that 'Writing Message to'
>> will always be followed by 'TRANSPORT_STREAM_ID'
>> before the next occurrence of 'Writing Message to'
>> and all text between can be ignored,
>> and we increment the counter if and only if
>> there is a newline immediately after 'TRANSPORT_STREAM_ID'
>> yes?
>
> just throwing this out there... would anyone do something like a
> open('log.txt', 'w').write(str(len(re.split(r'Writing Message to([\w\d\s:/\.]+?)TRANSPORT_STREAM_ID Parameter value: 0160\r?\n'))),
> or is this unseemly due to the fact that the file may be very large?

If the log file can be read into memory then a regex-based solution
might work well, though your code looks a bit scrambled to me
(re.split() is given a pattern but no string to search). Rather than
re.split() I would use re.findall().

To solve this line-by-line I would make a simple state machine that
looks for lines of interest and moves through the states Begin,
Found_Transport_Stream_Id and Found_Writing_Message.

Kent

> advantages i see here include: no counter to maintain since you get
> the one answer at the end, your python code is not iterating thru the
> file one line at a time (the faster C code in 're' is), you
> automatically skip the TRANSPORT_STREAM_IDs that are *not* followed
> by a NEWLINE, etc.
>
> just wondering,
> -- wesley
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> "Core Python Programming", Prentice Hall, (c)2007,2001
> http://corepython.com
>
> wesley.j.chun :: wescpy-at-gmail.com
> python training and technical consulting
> cyberweb.consulting : silicon valley, ca
> http://cyberwebconsulting.com
> _______________________________________________
> Tutor maillist - Tutor@python.org
> http://mail.python.org/mailman/listinfo/tutor
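
For illustration, the two approaches mentioned in the reply (re.findall() on the whole text, and a line-by-line state machine) might be sketched roughly as below. This is only a sketch under assumptions: the real log layout is not shown in the thread, so the marker strings are taken from the regex in the quoted message, the sample log is made up, and the function names are hypothetical. The character class in the quoted pattern is also swapped for a plain non-greedy `.*?` with re.DOTALL.

```python
import re

# Assumed marker strings, copied from the regex in the quoted message.
START = "Writing Message to"
TARGET = "TRANSPORT_STREAM_ID Parameter value: 0160"

# Non-greedy match from a start marker to a target that is
# immediately followed by a newline.
BLOCK_RE = re.compile(
    re.escape(START) + r".*?" + re.escape(TARGET) + r"\r?\n",
    re.DOTALL,
)


def count_with_findall(text):
    """Count matching blocks in a log already read into memory."""
    return len(BLOCK_RE.findall(text))


# State names follow the ones suggested in the reply.
BEGIN, FOUND_WRITING_MESSAGE, FOUND_TRANSPORT_STREAM_ID = range(3)


def count_with_state_machine(lines):
    """Count matching blocks one line at a time.

    `lines` is any iterable of lines that keep their line endings,
    for example an open file object.
    """
    state = BEGIN
    count = 0
    for line in lines:
        if START in line:
            state = FOUND_WRITING_MESSAGE
        # Count only when the target is the last thing on the line.
        if state == FOUND_WRITING_MESSAGE and line.endswith(
            (TARGET + "\n", TARGET + "\r\n")
        ):
            count += 1
            state = FOUND_TRANSPORT_STREAM_ID
    return count


# Made-up sample: the second block has trailing text after the
# target, so it should not be counted on its own.
SAMPLE_LOG = (
    "Writing Message to /tmp/a\n"
    "some detail\n"
    "TRANSPORT_STREAM_ID Parameter value: 0160\n"
    "Writing Message to /tmp/b\n"
    "TRANSPORT_STREAM_ID Parameter value: 0160 extra\n"
    "Writing Message to /tmp/c\n"
    "TRANSPORT_STREAM_ID Parameter value: 0160\n"
)
```

With a real file, count_with_findall(open('log.txt').read()) needs the whole log in memory, while count_with_state_machine(open('log.txt')) reads one line at a time and so should cope better with a very large file. If the file is opened with newline='' so that '\r\n' endings are preserved, both functions still treat them as a newline.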