Re: indexing xml document with literals

Chris Hostetter Wed, 07 Jul 2010 15:09:18 -0700

: Does anyone know how to read in data from one or more of the example xml docs
: and ALSO store the filename and path from which it came?


Solr has no knowledge that your "xml docs" are actually files ... the XML 
syntax ("<add><doc>...") is just a serialization mechanism for streaming 
data to solr about documents containing fields (with some optional 
boost data).  Most people should never have any need to actually create 
files on disk following that 
XML format -- the example XML files exist purely as a simplified way of 
demonstrating the syntax with example data that's easy to view.

In 99% of all "real" applications you should just generate the XML on the 
fly from your "real" data.
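
As a rough illustration of generating the XML on the fly, here is a minimal 
Python sketch that serializes in-memory records into an <add> message. The 
function name build_add_xml and the sample field names (id, name, cat) are 
assumptions made up for this example; use whatever fields your schema 
actually defines.

```python
import xml.etree.ElementTree as ET

def build_add_xml(records):
    """Serialize an iterable of dicts into a Solr <add> message (hypothetical helper)."""
    add = ET.Element("add")
    for record in records:
        doc = ET.SubElement(add, "doc")
        for name, value in record.items():
            # A list or tuple becomes one <field> per item (multi-valued field).
            values = value if isinstance(value, (list, tuple)) else [value]
            for item in values:
                field = ET.SubElement(doc, "field", name=name)
                field.text = str(item)
    # ElementTree escapes &, < and > in field text for us.
    return ET.tostring(add, encoding="unicode")

# Sample record; the field names are assumptions, not a required schema.
records = [
    {"id": "vidcard-1", "name": "Example video card", "cat": ["electronics", "graphics card"]},
]
xml_body = build_add_xml(records)
```

The resulting string is what you would POST to the update handler, so no 
intermediate file on disk is needed.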

: Two questions:
: once the data gets indexed by solr, is there anything we can use to know that
: this data came from that file? ie, what was the name and location of the file
: that holds the data. I need access to the path and filename of the xml file
: containing the entries when searching.

No, that data needs to be in the XML data when it's sent over the wire -- 
nothing about the raw HTTP POST has any notion of where it came from on 
disk (because it may not have come from disk at all).
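
If you do already have files on disk, one way to get the path into the index 
is to add it to each <doc> yourself before posting. The sketch below assumes 
two fields named source_path and source_filename, which are hypothetical and 
would have to be declared in your schema; the helper name add_source_fields 
is likewise made up for this example.

```python
import os
import xml.etree.ElementTree as ET

def add_source_fields(xml_text, path):
    """Return the <add> message with the file's path and name added to every <doc>."""
    root = ET.fromstring(xml_text)
    extra = (
        ("source_path", os.path.abspath(path)),       # hypothetical field name
        ("source_filename", os.path.basename(path)),  # hypothetical field name
    )
    # findall returns a list, so appending children while looping is safe.
    for doc in root.findall("doc"):
        for name, value in extra:
            field = ET.SubElement(doc, "field", name=name)
            field.text = value
    return ET.tostring(root, encoding="unicode")
```

In use, you would read the file's contents, pass them through 
add_source_fields along with the path you read them from, and POST the 
returned string to Solr instead of the original file.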

: and is there anyway to append information to xml data being indexed through
: the query parameters like there is with the ExtractingRequestHandler.
: like literal.id=x;literal.filename=vidcard..xml  or does all this information
: have to be in the particular <doc> in question.

It all needs to be in the <doc> ... ExtractingRequestHandler supports the 
"literal" params precisely because there is no "serialization" mechanism 
for the data -- it's sent over the wire un-encapsulated.

-Hoss
