Re: DIH: Limited xpath syntax unable to parse all xml elements

2009-07-02 Thread Jay Hill
Thanks Fergus, setting the field to multivalued did work: gets all the elements as multivalue fields in the body field. The only thing is, the body field is used by some other content sources, so I have to look at the implications setting it to multi-valued will have on the other data sour

Re: DIH: Limited xpath syntax unable to parse all xml elements

2009-07-02 Thread Jay Hill
I'm on the trunk, built on July 2: 1.4-dev 789506
Thanks,
-Jay
On Thu, Jul 2, 2009 at 11:33 AM, Shalin Shekhar Mangar <shalinman...@gmail.com> wrote:
> On Thu, Jul 2, 2009 at 11:38 PM, Mark Miller wrote:
> > Shalin Shekhar Mangar wrote:
> >> It selects all matching nodes. But if t

Re: DIH: Limited xpath syntax unable to parse all xml elements

2009-07-02 Thread Mark Miller
Shalin Shekhar Mangar wrote:
> On Thu, Jul 2, 2009 at 11:38 PM, Mark Miller wrote:
>> Shalin Shekhar Mangar wrote:
>>> It selects all matching nodes. But if the field is not multi-valued, it will store only the last value. I guess this is what is happening here.
>> So do you think it

Re: DIH: Limited xpath syntax unable to parse all xml elements

2009-07-02 Thread Fergus McMenemie
>Shalin Shekhar Mangar wrote:
>> On Thu, Jul 2, 2009 at 11:08 PM, Mark Miller wrote:
>>> It looks like DIH implements its own subset of the Xpath spec.
>> Right, DIH has a streaming implementation supporting a subset of XPath only.
>> The supported things are in the wiki ex

Re: DIH: Limited xpath syntax unable to parse all xml elements

2009-07-02 Thread Shalin Shekhar Mangar
On Thu, Jul 2, 2009 at 11:38 PM, Mark Miller wrote:
> Shalin Shekhar Mangar wrote:
>> It selects all matching nodes. But if the field is not multi-valued, it will
>> store only the last value. I guess this is what is happening here.
> So do you think it should match them all and
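The behaviour described in this message (every matching node is selected, but a single-valued field ends up holding only the last value) can be illustrated with a small Python sketch. This is not DIH code: the sample document, the `collect_field` helper, and its `multi_valued` flag are all hypothetical, and Python's `xml.etree.ElementTree` merely stands in for DIH's streaming XPath reader.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; only the /book/body/chapter/p shape comes from the thread.
SAMPLE_BOOK = (
    "<book><body><chapter>"
    "<p>first</p><p>second</p><p>third</p>"
    "</chapter></body></book>"
)


def collect_field(xml_text, path, multi_valued):
    """Mimic the reported outcome: all matches are seen, but a
    single-valued field keeps only the last one."""
    root = ET.fromstring(xml_text)
    field = [] if multi_valued else None
    for node in root.findall(path):
        if multi_valued:
            field.append(node.text)   # every match is kept
        else:
            field = node.text         # each match overwrites the previous one
    return field


all_values = collect_field(SAMPLE_BOOK, "./body/chapter/p", multi_valued=True)
last_value = collect_field(SAMPLE_BOOK, "./body/chapter/p", multi_valued=False)
```

With the field treated as multi-valued, all three paragraphs survive; otherwise only the final one does, which matches what Jay reported seeing.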

Re: DIH: Limited xpath syntax unable to parse all xml elements

2009-07-02 Thread Mark Miller
Shalin Shekhar Mangar wrote:
> On Thu, Jul 2, 2009 at 11:08 PM, Mark Miller wrote:
>> It looks like DIH implements its own subset of the Xpath spec.
> Right, DIH has a streaming implementation supporting a subset of XPath only. The supported things are in the wiki examples.
>> I don't s

Re: DIH: Limited xpath syntax unable to parse all xml elements

2009-07-02 Thread Shalin Shekhar Mangar
On Thu, Jul 2, 2009 at 11:08 PM, Mark Miller wrote:
> It looks like DIH implements its own subset of the Xpath spec.
Right, DIH has a streaming implementation supporting a subset of XPath only. The supported things are in the wiki examples.
> I don't see any tests with multiple matching sub n

Re: DIH: Limited xpath syntax unable to parse all xml elements

2009-07-02 Thread Mark Miller
It looks like DIH implements its own subset of the Xpath spec. I don't see any tests with multiple matching sub nodes, so perhaps DIH Xpath does not properly support that and just selects the last matching node? Also, I don't think the double / matters. That would just allow more nodes in betw
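To make the single-slash versus double-slash point concrete, here is a minimal sketch using Python's `xml.etree.ElementTree`, which also implements only a subset of XPath. It says nothing about what DIH itself supports; the sample document (including the `section` wrapper) is a hypothetical illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one <p> directly under <chapter>, one nested deeper.
NESTED_BOOK = (
    "<book><body><chapter>"
    "<p>direct</p>"
    "<section><p>nested</p></section>"
    "</chapter></body></book>"
)

book_root = ET.fromstring(NESTED_BOOK)

# Single slash: only <p> elements that are direct children of <chapter>.
direct_only = [p.text for p in book_root.findall("./body/chapter/p")]

# Double slash: <p> elements at any depth below <chapter>.
any_depth = [p.text for p in book_root.findall("./body/chapter//p")]
```

When every `p` sits directly under `chapter`, the two paths select the same nodes, which is consistent with the double slash not mattering for a document shaped like that.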

Re: DIH: Limited xpath syntax unable to parse all xml elements

2009-07-02 Thread Jay Hill
It is not multivalued. The intention is to get all text under the element into one "body" field in the index that is not multivalued. Essentially everything within the element minus the markup.
Thanks,
-Jay
On Thu, Jul 2, 2009 at 8:55 AM, Fergus McMenemie wrote:
> >Thanks Noble, I gave thos
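The goal stated here (all text under an element, minus the markup, as one single value) can be sketched outside Solr as a preprocessing step. This is only an illustration in Python, not a DIH feature: the `flatten_text` helper and the sample document are hypothetical, and joining the fragments with a space is an assumption about how the text should be separated.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample with inline markup inside the paragraphs.
MARKED_UP_BOOK = (
    "<book><body><chapter>"
    "<p>One <i>two</i></p><p>three</p>"
    "</chapter></body></book>"
)


def flatten_text(xml_text, path):
    """Return all text under the first element matching `path` as one
    whitespace-normalised string, with the tags dropped."""
    root = ET.fromstring(xml_text)
    node = root.find(path)
    if node is None:
        return ""
    # itertext() yields every text fragment at any depth, in document order.
    joined = " ".join(node.itertext())
    return " ".join(joined.split())


body_text = flatten_text(MARKED_UP_BOOK, "./body")
```

The result is a single string suitable for a single-valued field, so the schema would not need to change for the other content sources that share it.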

Re: DIH: Limited xpath syntax unable to parse all xml elements

2009-07-02 Thread Fergus McMenemie
>Thanks Noble, I gave those examples a try.
>
>If I use I only get
>the text from the last element, not from all elements.
Hm, I am sure I have done this. In your schema.xml is the field "body" multiValued or not?
>
>If I use
>or I don't
>get back anything for the body column.
>
>So the

Re: DIH: Limited xpath syntax unable to parse all xml elements

2009-07-02 Thread Jay Hill
Thanks Noble, I gave those examples a try. If I use I only get the text from the last element, not from all elements. If I use or I don't get back anything for the body column. So the first example is close, but it only gets the text for the last element. If I could get all elements at th

Re: DIH: Limited xpath syntax unable to parse all xml elements

2009-07-01 Thread Noble Paul നോബിള്‍ नोब्ळ्
complete xpath is not supported
/book/body/chapter/p should work.
if you wish all the text under irrespective of nesting, tag names use this
On Thu, Jul 2, 2009 at 5:31 AM, Jay Hill wrote:
> I'm using the XPathEntityProcessor to parse an xml structure that looks like
> this:
>
>    J

Re: DIH: Limited xpath syntax unable to parse all xml elements

2009-07-01 Thread Mark Miller
Hmmm - my very limited understanding of xpath says that /book/body/chapter/p should work. Some quick testing with XPath Expression Testbed shows both /book/body/chapter/p and /book/body/chapter//p selecting the right nodes. I'm not sure what's up. Are you actually looking for /book/body/chapter/