Emad wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Since I'm learning Pyparsing, this was a nice exercise. I've written this elementary script, which does the job well in light of the data we have:

from pyparsing import *

ID_TAG = Literal("<ID>")
FULL_NAME_TAG1 = Literal("<Full")
FULL_NAME_TAG2 = Literal("name>")
END_TAG = Literal("</")
word = Word(alphas)
pattern1 = ID_TAG + word + END_TAG
pattern2 = FULL_NAME_TAG1 + FULL_NAME_TAG2 + OneOrMore(word) + END_TAG
result = pattern1 | pattern2

lines = open("lines.txt")  # This is your file name
for line in lines:
    myresult = result.searchString(line)
    if myresult:
        print(myresult[0])

# This prints out
# ['<ID>', 'Joseph', '</']
# ['<Full', 'name>', 'Joseph', 'Smith', '</']
# You can access the individual elements of the lists to pick whatever you want

Emad -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Welcome to the world of pyparsing! Your program is a very good first cut at this problem. Let me add some suggestions (more like hints toward more advanced concepts in your pyparsing learning):

- Look into Group, as in Group(OneOrMore(word)). This will add organization and structure to the returned results.
- Results names will make it easier to access the separate parsed fields.
- Check out the makeHTMLTags and makeXMLTags helper methods. These do more than just wrap angle brackets around a tag name: they also handle attributes in varying order, case variability, and (of course) varying whitespace. The OP didn't explicitly say this is XML data, but the sample does look suspiciously like it.

If you only easy_install'ed pyparsing or used the binary Windows installer, please go back to SourceForge and download the source .ZIP or tarball package. These have the full examples and htmldoc directories that the auto-installers omit.

Good luck in your continued studies!

-- Paul

_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
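
As a rough illustration of the Group and results-name hints in Paul's reply, here is one way Emad's script might be reworked. This is only a sketch under some assumptions: the sample lines are hypothetical (the original lines.txt is not shown in the thread), the names parse_line, sample_lines, "id" and "full_name" are made up for the example, and Suppress is an extra pyparsing class not mentioned above, used here just to keep the tag markup out of the results. searchString is spelled as in the thread; newer pyparsing releases also accept search_string.

```python
from pyparsing import Group, Literal, OneOrMore, Suppress, Word, alphas

# Suppress (an addition, not from the thread) drops the tag text from the results.
ID_TAG = Suppress(Literal("<ID>"))
FULL_NAME_TAG = Suppress(Literal("<Full") + Literal("name>"))
END_TAG = Suppress(Literal("</"))

word = Word(alphas)

# Results names label each field; Group keeps the words of the name together.
id_pattern = ID_TAG + word("id") + END_TAG
full_name_pattern = FULL_NAME_TAG + Group(OneOrMore(word))("full_name") + END_TAG
record = id_pattern | full_name_pattern

# Hypothetical input, modelled on the printed output quoted in the thread.
sample_lines = [
    "<ID>Joseph</ID>",
    "<Full name>Joseph Smith</Full name>",
]


def parse_line(line):
    """Return a dict with the named field found in the line, or None."""
    matches = record.searchString(line)
    if not matches:
        return None
    first = matches[0]
    if "id" in first:
        return {"id": first["id"]}
    return {"full_name": list(first["full_name"])}


for line in sample_lines:
    print(parse_line(line))
```

With results names, each field is looked up by name (first["id"], first["full_name"]) instead of by counting positions in the token list, and the grouped name comes back as its own sub-list of words.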