lxml, comparing nodes

2008-07-23 Thread code_berzerker
I'd like to know if there is any built-in mechanism in lxml that lets
you check the equality of two nodes from separate documents. I'd like it
to ignore attribute order and so on. It would be even better if there
were a built-in method for checking the equality of whole documents
(ignoring document order). Please let me know if you know of such a
method or an existing script. I don't like reinventing the wheel :)
--
http://mail.python.org/mailman/listinfo/python-list


Re: lxml, comparing nodes

2008-07-23 Thread code_berzerker
On Jul 23, 6:29 pm, Stefan Behnel <[EMAIL PROTECTED]> wrote:

> Your requirements for a single Element are simple enough to write it in three
> to five lines of Python code (depending on your definition of equality).
> Checking this equality recursively is another two to three lines. Not complex
> enough to be considered a wheel in the first place.

Forgive my ignorance as I am new to both Python and lxml ;)


Re: lxml, comparing nodes

2008-07-24 Thread code_berzerker
> off the top of my head (untested):
>
>  >>> def equal(a, b):
> ...     if a.tag != b.tag or a.attrib != b.attrib:
> ...         return False
> ...     if a.text != b.text or a.tail != b.tail:
> ...         return False
> ...     if len(a) != len(b):
> ...         return False
> ...     if any(not equal(a, b) for a, b in zip(a, b)):
> ...         return False
> ...     return True
>
> this should work for arbitrary ET implementations (lxml, xml.etree, ET,
> etc).  tweak as necessary.
>
> 
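
For anyone who wants to run the quoted function, here is a minimal sketch
of it outside the interactive prompt. It is not from the thread: the name
elements_equal and the sample documents are made up for illustration, and
it uses the standard-library xml.etree.ElementTree so that it runs without
lxml (lxml.etree elements expose the same tag/attrib/text/tail interface).
Attribute order is ignored because the attributes are compared as
dictionaries, but the order of children still matters.

```python
import xml.etree.ElementTree as ET

def elements_equal(a, b):
    """Recursively compare two elements; attribute order is ignored, child order is not."""
    if a.tag != b.tag or dict(a.attrib) != dict(b.attrib):
        return False
    if a.text != b.text or a.tail != b.tail:
        return False
    if len(a) != len(b):
        return False
    # Children are compared pairwise, in document order.
    return all(elements_equal(x, y) for x, y in zip(a, b))

# Hypothetical sample documents, only for demonstration.
doc1 = ET.fromstring('<root><item id="1" kind="x">text</item></root>')
doc2 = ET.fromstring('<root><item kind="x" id="1">text</item></root>')
doc3 = ET.fromstring('<root><item id="2" kind="x">text</item></root>')

print(elements_equal(doc1, doc2))  # attributes reordered only
print(elements_equal(doc1, doc3))  # attribute value differs
```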

Thanks for the help. That's inspiring, though not exactly what I need,
because ignoring document order is a requirement (ignoring changes in the
order of different siblings of the same type, etc.). I plan to try
something like this:

def xmlCmp(xmlStr1, xmlStr2):
  et1 = etree.XML(xmlStr1)
  et2 = etree.XML(xmlStr2)

  queue = []
  tmpq = deque([et1])
  tmpq2 = deque([et2])

  while tmpq:
    el = tmpq.popleft()
    tmpq.extend(el)
    queue.append(el.tag)

  while queue:
    el = queue.pop()
    foundEl = findMatchingElem(el, et2)
    if foundEl:
      et1.remove(el)
      tmpq2.remove(foundEl)
    else:
      return False

  if len(tmpq2) == 0:
    return True
  else:
    return False


def findMatchingElem(el, eTree):
  for elem in eTree:
    if elemCmp(el, elem):
      return elem
  return None


def elemCmp(el1, el2):
  pass # yet to be implemented ;)



Re: lxml, comparing nodes

2008-07-25 Thread code_berzerker
> If document order doesn't matter, try sorting the elements of each level in
> the two documents by some arbitrary deterministic key, such as (tag name,
> text, attr count, whatever), and then compare them in order, instead of trying
> to find matches in multiple passes. itertools.groupby() might be your friend 
> here.
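
The suggestion above can be sketched as follows. This is only one possible
reading of it, not code from the thread: the helper names canonical_key and
trees_equal_unordered are made up for illustration, text is compared after
stripping surrounding whitespace, tail text is ignored, and the
standard-library xml.etree.ElementTree is used so the example runs without
lxml. Instead of itertools.groupby(), each element is reduced to a sortable
nested tuple and the children are sorted by it.

```python
import xml.etree.ElementTree as ET

def canonical_key(elem):
    """Build a nested tuple that does not depend on attribute or sibling order."""
    attrs = tuple(sorted(elem.attrib.items()))
    text = (elem.text or "").strip()
    # Sort the children's keys so that sibling order no longer matters.
    children = tuple(sorted(canonical_key(child) for child in elem))
    return (elem.tag, attrs, text, children)

def trees_equal_unordered(xml_str1, xml_str2):
    """Compare two XML strings, ignoring attribute order and sibling order."""
    return canonical_key(ET.fromstring(xml_str1)) == canonical_key(ET.fromstring(xml_str2))

# Hypothetical sample documents, only for demonstration.
a = '<r><x id="1" k="v"/><y>text</y></r>'
b = '<r><y>text</y><x k="v" id="1"/></r>'
c = '<r><y>other</y><x k="v" id="1"/></r>'

print(trees_equal_unordered(a, b))  # same content, different order
print(trees_equal_unordered(a, c))  # text of <y> differs
```

Each level is sorted once, so the work is dominated by the sorts rather
than by repeated searches for a matching element.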

I think that sorting multiple times by each attribute will cost more
than what I've managed to write:

from lxml import etree
from collections import deque
import string, re, time

def xmlEqual(xmlStr1, xmlStr2):
  et1 = etree.XML(xmlStr1)
  et2 = etree.XML(xmlStr2)

  let1 = [x for x in et1.iter()]
  let2 = [x for x in et2.iter()]

  if len(let1) != len(let2):
    return False

  while let1:
    el = let1.pop(0)
    foundEl = findMatchingElem(el, let2)
    if foundEl is None:
      return False
    let2.remove(foundEl)
  return True


def findMatchingElem(el, eList):
  for elem in eList:
    if elemsEqual(el, elem):
      return elem
  return None


def elemsEqual(el1, el2):
  if el1.tag != el2.tag or el1.attrib != el2.attrib:
    return False
  # no requirement for text checking for now
  #if el1.text != el2.text or el1.tail != el2.tail:
  #  return False
  path1 = el1.getroottree().getpath(el1)
  path2 = el2.getroottree().getpath(el2)
  idxRE = re.compile(r"(\[\d*\])")
  path1 = idxRE.sub("", path1)
  path2 = idxRE.sub("", path2)
  if path1 != path2:
    return False

  return True

Notice that if documents are in exact same order, each element is
compared only once!


Re: lxml, comparing nodes

2008-07-25 Thread code_berzerker
> Not in your code.
>
> Stefan

Not sure what you mean, but I tested it, and so far for every pair of
documents with the same order of elements the number of comparisons was
equal to the number of nodes.
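
One way to check a claim like this is to count the comparisons directly.
The following is a simplified sketch, not the code from the thread: the
function name count_comparisons is made up, elements are matched on tag and
attributes only (the path check from elemsEqual is left out because
getpath() is specific to lxml), and the standard-library
xml.etree.ElementTree is used so that it runs without lxml.

```python
import xml.etree.ElementTree as ET

def count_comparisons(xml_str1, xml_str2):
    """Greedy matching; returns (documents_match, number_of_element_comparisons)."""
    left = list(ET.fromstring(xml_str1).iter())
    right = list(ET.fromstring(xml_str2).iter())
    comparisons = 0
    for el in left:
        for i, candidate in enumerate(right):
            comparisons += 1
            if el.tag == candidate.tag and el.attrib == candidate.attrib:
                del right[i]  # each candidate may be matched only once
                break
        else:
            # No candidate matched this element.
            return False, comparisons
    return not right, comparisons

# Hypothetical sample documents, only for demonstration.
same_order = '<r><a/><b/><c/></r>'
reversed_order = '<r><c/><b/><a/></r>'

print(count_comparisons(same_order, same_order))
print(count_comparisons(same_order, reversed_order))
```

With identical ordering every element matches the first remaining
candidate, so the count equals the number of elements. Once the order
differs, the inner search has to skip candidates and the count grows.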


SWIG and char* newb questions :)

2008-07-29 Thread code_berzerker
Hi, I'm relatively new to Python and my C/C++ knowledge is close to
None. Having said that, I feel justified in asking stupid questions :)

Ok, now more seriously. I have a question about char* function
parameters that are used to return values. I have read the SWIG manual to
find the best way to handle them, but there are many warnings about
memory leaks and such, so I feel confused.

To put it more simply: how do I safely define a variable in Python and
have it modified by a C/C++ function?
Even better would be a way to return a tuple of the return value and the
out parameters, but that's probably a lot more work.

Any hint will be appreciated!


Re: SWIG and char* newb questions :)

2008-07-29 Thread code_berzerker
Ok I think I got it:


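PyObject* myFuncXXX(char* p_1, int p_2, char* p_3, int p_4)
{
  int res;
  char _host[255] = "";  /* out buffer filled by funcXXX */
  int _port = 0;         /* out parameter filled by funcXXX */
  res = funcXXX(p_1, p_2, p_3, p_4, _host, &_port);

  /* Convert the return value and both out parameters to Python objects. */
  PyObject* res1 = PyInt_FromLong(res);
  PyObject* res2 = PyString_FromStringAndSize(_host, strlen(_host));
  PyObject* res3 = PyInt_FromLong(_port);

  /* Pack them as (res, host, port). */
  PyObject* resTuple = PyTuple_New(3);

  /* PyTuple_SetItem steals the reference to each item. */
  PyTuple_SetItem(resTuple, 0, res1);
  PyTuple_SetItem(resTuple, 1, res2);
  PyTuple_SetItem(resTuple, 2, res3);

  return resTuple;
}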

It seems to work when I put it into swig's "*.i" file.

me proud of me.self :D