Bernard Lebel wrote:
> The file size is 112 Kb. Most lines look this way:
>
> <parameter name="roty" type="Parameter" sourceclassname="nosource">
>
> I'll give a try to ElementTree.

To get you started:

    from elementtree import ElementTree

    doc = ElementTree.parse('myfile.xml')
    for sceneobject in doc.findall('//sceneobject'):
        if sceneobject.get('type') == 'CameraRoot':
            # this is a sceneobject that you want
            print sceneobject.get('name')

One gotcha - if your XML uses namespaces, you have to prefix the namespace to the tag name in findall(). It will look something like

    doc.findall('//{http://www.imsproject.org/xsd/imscp_rootv1p1p2}resource')

Let us know how long that takes...

Kent

>
> Bernard
>
> On 9/14/05, Kent Johnson <[EMAIL PROTECTED]> wrote:
>
>>Bernard Lebel wrote:
>>
>>>Thanks for that pointer Kent, I'll check it out. Also thanks for
>>>letting me know I'm not nuts! :-)
>>>
>>>Alan's suggestion about BeautifulSoup is actually excellent. The
>>>documentation is nice and the tool is very easy to use.
>>>
>>>However is it normal that to parse a 2618-line xml file it takes
>>>20-30 seconds or so?
>>
>>That seems slow to me unless the lines are really long! How many bytes is the
>>file? But I don't have much experience with BeautifulSoup.
>>
>>ElementTree is fast and cElementTree (the C implementation) is really fast. I
>>have used it to read, process and write a 28 MB XML file, it took about 10
>>seconds.
>>
>>Kent
>>
>>>
>>>Thanks
>>>Bernard
>>>
>>>On 9/14/05, Kent Johnson <[EMAIL PROTECTED]> wrote:
>>>
>>>>Bernard Lebel wrote:
>>>>
>>>>>Thanks Alan,
>>>>>
>>>>>I'll check BeautifulSoup asap.
>>>>>
>>>>>I'm using regex simply because I have no clue where to start to parse
>>>>>XML. I have read the various xml tools available in the Python
>>>>>library, however I'm at a complete loss at what to make out of them. Many
>>>>>of them seem to use some programming standards, which I am completely
>>>>>unfamiliar with (this is the first time that I dig into XML writing
>>>>>and parsing).
>>>>>
>>>>>I don't know where to start to learn about all these standards, and as
>>>>>usual with new programming things, the documentation is hard to
>>>>>swallow (it usually is written more as a reference than a proper user
>>>>>guide/tutorial). I have to admit this is very frustrating, so if I'm
>>>>>looking at things from a wrong perspective please advise me, I need
>>>>>it.
>>>>
>>>>I agree that the Python XML story is confusing even for the files in the
>>>>standard library. Worse, the (IMO) best solutions are not to be found in
>>>>the standard lib or PyXML at all.
>>>>
>>>>The std lib and PyXML are based on the DOM and SAX standards. These
>>>>standards were designed to be "language-neutral" - there are
>>>>implementations in Python, Java and other languages. The good side of this
>>>>is, if you learn how to use them, the knowledge is pretty portable to other
>>>>languages. The bad side is, the APIs defined by the standard are IMO clunky
>>>>and painful to use, especially in Python.
>>>>
>>>>There is a current thread on comp.lang.python discussing this with good
>>>>suggestions and pointers to more info:
>>>>http://groups.google.com/group/comp.lang.python/browse_frm/thread/a48891aa645ead13/dcd8fdc20b4b191b?hl=en#dcd8fdc20b4b191b
>>>>
>>>>My personal preference is ElementTree. Beautiful Soup is good too though I
>>>>have only tried it with HTML. If I was running on Linux I would try lxml
>>>>which uses the ElementTree API and adds full XPath support. Amara looks
>>>>like the Cadillac solution - big and cushy. I haven't tried it. Uche's
>>>>articles (referenced in the thread above) have pointers to many other
>>>>choices but these seem to be the most popular.
>>>>
>>>>My favorite XML lib is actually dom4j which is in Java. It works great with
>>>>Jython.
>>>>
>>>>Kent
>>>>
>>>>>So right now I'm just taking a shortcut and using ultra-simple
>>>>>re-based parser to retrieve the tags I'm looking for.
>>>>>I know it will probably be slow, but hopefully I'll get familiar with
>>>>>sophisticated parsing in the future and improve my code. As it stands
>>>>>right now, even the re syntax is not super easy to learn.
>>>>
>>>>For what you are doing re seems fine to me. You can get in trouble using
>>>>re's with XML because of nested tags, variations in spelling and order,
>>>>probably a bunch of other things. But for simple stuff it can work fine.
>>>>
>>>>Kent
>>>>
>>>>>Kent: That works (of course!). Thanks a bunch once again!
>>>>>
>>>>>Thanks
>>>>>Bernard
>>>>>
>>>>>On 9/14/05, Alan G <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>>Hi Bernard,
>>>>>>
>>>>>>>Hello, yet another regular expression question :-)
>>>>>>>
>>>>>>>So I have this xml file that I'm trying to find a
>>>>>>>specific tag in.
>>>>>>
>>>>>>I'm always suspicious when I see regular expression
>>>>>>and xml/html in the same context. regex are not good
>>>>>>for parsing xml/html files and it's usually much easier
>>>>>>to use a proper parser - such as beautiful soup.
>>>>>>
>>>>>>http://www.crummy.com/software/BeautifulSoup/
>>>>>>
>>>>>>Is there any special reason why you are using a regex
>>>>>>sledgehammer to crack this particular nut? Or is it
>>>>>>just to gain experience using regex?
>>>>>>
>>>>>>Alan G.
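
For readers on a current Python 3, here is a minimal sketch of the ElementTree approach from the top of this message, using the standard library's xml.etree.ElementTree rather than the separate elementtree package. The sample XML, the nesting of the sceneobject elements, the manifest/resources wrapper and the helper name camera_root_names are all assumptions made up for illustration; only the tag names, the attribute names and the namespace URI come from the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the real file's layout is only guessed
# from the <parameter ...> line quoted in the thread.
SAMPLE = """\
<scene>
  <sceneobject name="Camera_Root" type="CameraRoot">
    <parameter name="roty" type="Parameter" sourceclassname="nosource"/>
  </sceneobject>
  <sceneobject name="light" type="Light"/>
  <group>
    <sceneobject name="Camera_Root1" type="CameraRoot"/>
  </group>
</scene>
"""


def camera_root_names(root):
    """Return the name of every sceneobject whose type is CameraRoot."""
    # './/sceneobject' searches all descendants of root, at any depth.
    return [
        obj.get("name")
        for obj in root.findall(".//sceneobject")
        if obj.get("type") == "CameraRoot"
    ]


root = ET.fromstring(SAMPLE)
names = camera_root_names(root)
print(names)

# Namespace gotcha: the tag must be written as '{namespace-uri}tag'.
# The manifest/resources structure here is invented for the example.
NS_SAMPLE = """\
<manifest xmlns="http://www.imsproject.org/xsd/imscp_rootv1p1p2">
  <resources>
    <resource identifier="r1"/>
  </resources>
</manifest>
"""
NS = "{http://www.imsproject.org/xsd/imscp_rootv1p1p2}"

ns_root = ET.fromstring(NS_SAMPLE)
resources = ns_root.findall(".//" + NS + "resource")
unqualified = ns_root.findall(".//resource")  # finds nothing: no namespace given
print(len(resources), len(unqualified))
```

To read from disk instead of a string, ET.parse('myfile.xml').getroot() gives the same kind of root element to pass to the helper.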

_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor