XPath query support in Solr Cell

Eric Pugh Wed, 20 May 2009 13:46:05 -0700

So I am trying to filter down what I am indexing, and the basic XPath queries don't work. For example, working with tutorial.pdf, this indexes all the <div/> elements:

curl http://localhost:8983/solr/update/extract?ext.idx.attr=true\&ext.def.fl=text\&ext.map.div=foo_t\&ext.capture=div\&ext.literal.id=126\&ext.xpath=\/xhtml:html\/xhtml:body\/descendant:node -F "tutori...@tutorial.pdf"


However, if I want to only index the first div, I expect to do this:

budapest:site epugh$ curl http://localhost:8983/solr/update/extract?ext.idx.attr=true\&ext.def.fl=text\&ext.map.div=foo_t\&ext.capture=div\&ext.literal.id=126\&ext.xpath=\/xhtml:html\/xhtml:body\/xhtml:div[1] -F "tutori...@tutorial.pdf"

But I keep getting an error back from curl. My attempts to escape the [1] have failed. Any suggestions?


curl: (3) [globbing] error: bad range specification after pos 174
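
The "[globbing]" tag suggests the error comes from curl's own URL globbing, which reads [ ] as a range such as [1-10], rather than from the shell or from Solr. Two things that may be worth trying are sketched below: percent-encoding the brackets, or switching globbing off with curl's -g (--globoff) option. This is only a sketch; it assumes Solr decodes %5B and %5D in the ext.xpath parameter back to brackets, which has not been verified here. The variable names are made up for illustration, and the curl calls are left commented out so the snippet runs without a Solr instance.

```shell
# Sketch only: same host and parameters as the failing command above.
xpath='/xhtml:html/xhtml:body/xhtml:div[1]'
base='http://localhost:8983/solr/update/extract?ext.idx.attr=true&ext.def.fl=text&ext.map.div=foo_t&ext.capture=div&ext.literal.id=126'

# Option 1: percent-encode the brackets so curl never sees a glob range.
encoded=$(printf '%s' "$xpath" | sed -e 's/\[/%5B/g' -e 's/]/%5D/g')
encoded_url="${base}&ext.xpath=${encoded}"
echo "$encoded_url"
# curl "$encoded_url" -F "tutori...@tutorial.pdf"

# Option 2: keep the brackets and turn off curl's URL globbing with -g.
raw_url="${base}&ext.xpath=${xpath}"
echo "$raw_url"
# curl -g "$raw_url" -F "tutori...@tutorial.pdf"
```

Quoting the whole URL also removes the need for the \& escapes, since the shell no longer interprets the ampersands.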

Eric

PS,

Also, this site seems to be okay as a place to upload your HTML and practice XPath:


http://www.whitebeam.org/library/guide/TechNotes/xpathtestbed.rhtm

I did have to strip out the namespace stuff though.




-----------------------------------------------------
Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | 
http://www.opensourceconnections.com
Free/Busy: http://tinyurl.com/eric-cal
