I had planned to parse it myself, but am not sure how to go about it. I assume regular expressions, but I couldn't even find the number of units in the file by using:

    unitReg=re.compile(r"\<unit\>(*)\</unit\>")
    unitCount=unitReg.search(fileContents)
    print "number of units: "+unitCount.len(groups())

I just get an exception that "None type object has no attribute groups", meaning that the search was unsuccessful. What I was hoping to do was to grab everything between the opening and closing unit tags, then read the units one at a time and parse further. There is a tag inside a unit tag called AttackTable which also terminates, so I would need to pull that out and work with it separately. I have probably just misunderstood how regular expressions and groups work...

On 1/7/12, Chris Fuller <cfuller...@thinkingplanet.net> wrote:
>
> If it's unambiguous as to which tags are closed and which are not, then it's
> pretty easy to preprocess the file into valid XML. Scan for the naughty bits
> (single quotes) and insert escape characters, replace with something else,
> etc., then scan for the unterminated tags and throw in a "/" at the end.
>
> Anyhow, if there's no tree structure, or it's only one level deep, using
> ElementTree is probably overkill and just gives you lots of leaking
> abstractions to plug for little benefit. Why not just scan the file
> directly?
>
> Cheers
>
> On Saturday 07 January 2012, Alex Hall wrote:
>> Hello all,
>> I have a file with xml-ish code in it, the definitions for units in a
>> real-time strategy game. I say xml-ish because the tags are like xml,
>> but no quotes are used and most tags do not have to end. Also,
>> comments in this file are prefaced by an apostrophe, and there is no
>> multi-line commenting syntax. For example:
>>
>> <unit>
>> <number=1>
>> <name=my unit>
>> <canMove=True>
>> <canCarry=unit2, unit3, unit4>
>> 'this line is a comment
>> </unit>
>>
>> The game is not mine, but I would like to put together a python
>> interface to more easily manage custom units for it. To do that, I
>> have to be able to parse these files, but elementtree does not seem to
>> like them very much. I imagine it is due to the lack of quotes, the
>> non-standard commenting method, and the lack of closing tags.
>> I think my only recourse here is to create my own parser and tell
>> elementtree to use that. The docs say this is possible, but they also
>> seem to indicate that the parser has to already exist in the
>> elementtree package and there is no mention of making one's own method
>> for parsing. Even if this were possible, though, I am not sure how to
>> go about it. I can of course strip comments, but that is as far as I
>> have gotten.
>>
>> Bottom line: can I create a method and tell elementtree to parse using
>> it, and what would such a function look like (generally) if I can?
>> Thanks!
>
>

--
Have a great day,
Alex (msg sent from GMail website)
mehg...@gmail.com; http://www.facebook.com/mehgcap
_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
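
For reference, here is a minimal sketch of the "scan the file directly" idea from the thread, offered as one possible approach rather than a definitive parser. Two facts about the `re` module matter here: `search` returns at most one match (or `None`), so counting units needs something like `findall`, and `.` does not match newlines unless `re.DOTALL` is given. The names `UNIT_RE`, `TAG_RE`, and `parse_units` are made up for illustration. The sketch assumes every tag inside a unit has the form `<key=value>` on its own line and that a comment is a whole line starting with an apostrophe, as in the sample above. It is written for Python 3, and it does not handle the nested AttackTable tag, whose layout is not shown in the thread.

```python
import re

# Non-greedy match of everything between <unit> and </unit>.
# re.DOTALL lets "." match newlines so a unit can span several lines.
UNIT_RE = re.compile(r"<unit>(.*?)</unit>", re.DOTALL)

# A single "<key=value>" tag, such as <name=my unit>.
TAG_RE = re.compile(r"<(\w+)=([^>]*)>")


def parse_units(text):
    """Return one dict per <unit>...</unit> block found in text."""
    units = []
    for body in UNIT_RE.findall(text):
        unit = {}
        for line in body.splitlines():
            line = line.strip()
            # Assumption: a comment is a whole line starting with an apostrophe.
            if not line or line.startswith("'"):
                continue
            match = TAG_RE.match(line)
            if match:
                # Values are kept as raw strings; converting them is left out.
                unit[match.group(1)] = match.group(2).strip()
        units.append(unit)
    return units


sample = """<unit>
<number=1>
<name=my unit>
<canMove=True>
<canCarry=unit2, unit3, unit4>
'this line is a comment
</unit>"""

units = parse_units(sample)
print("number of units: " + str(len(units)))
print(units[0]["name"])
```

All values come back as strings, so a list such as the canCarry entry would still need to be split on commas by the caller.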