bq: How will the original document X be returned? Should I store the location of X in Tx? Is there a generic way of doing it?
A couple of choices here:

1> create a stored-only field (i.e. stored="true" indexed="false"
docValues="false") and stuff the original in that. It'll chew up some
disk space, but not affect searching much.

2> store a pointer to the original XML and return _that_, probably
doing that in the app layer.

Best,
Erick

On Wed, Mar 15, 2017 at 11:06 AM, Walter Underwood <wun...@wunderwood.org> wrote:
> Solr does not index XML. Period.
>
> Solr uses an XML protocol for indexing. It can also use JSON or binary
> protocols for indexing.
>
> You need to convert your XML document into fields, then send those fields to
> Solr using one of the indexing protocols.
>
> If you need an XML database and search engine, I recommend MarkLogic.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
>
>> On Mar 15, 2017, at 11:02 AM, rangeli nepal <rangeli.ne...@gmail.com> wrote:
>>
>> Thank you Erick for such a prompt reply. I am a bit confused.
>> Suppose I have a document X, and I transformed it into document Tx. Tx matches the
>> format that you have described. I post Tx and I assume it gets indexed.
>>
>> Now I query. How will the original document X be returned? Should I store
>> the location of X in Tx? Is there a generic way of doing it?
>>
>> Thank you
>> Regards,
>> rn
>>
>>
>> On Wed, Mar 15, 2017 at 1:11 PM, Erick Erickson <erickerick...@gmail.com>
>> wrote:
>>
>>> Solr does _not_ index arbitrary XML, it will index XML in a very
>>> specific format, i.e.
>>> <add>
>>>   <doc>
>>>     <field name="whatever">value</field>
>>>     .
>>>     .
>>>   </doc>
>>> </add>
>>>
>>> So if you're sending arbitrary XML to Solr I'm actually surprised it's
>>> indexing.
>>>
>>> You might be able to do something with sending docs through Tika
>>> (ExtractingRequestHandler).
>>>
>>> Best,
>>> Erick
>>>
>>> On Wed, Mar 15, 2017 at 9:50 AM, rangeli nepal <rangeli.ne...@gmail.com>
>>> wrote:
>>>> Good Afternoon,
>>>>
>>>> I am trying to index xml documents and query them. Once a query
>>>> successfully matches, I am hoping to download the uploaded and indexed xml
>>>> document.
>>>>
>>>> Initially I thought solr supports xml. Thus I did not make any change to
>>>> my default installation. However, I was not able to query with a keyword
>>>> that is there in the document.
>>>>
>>>> Since most of the sensible tokens are stored in the attribute “name”, I
>>>> changed managed-schema and added an attribute “name”. But to no avail.
>>>> I believe I am missing something. Your feedback will be a great help.
>>>>
>>>> Thank you.
>>>> Regards,
>>>> r.n.
>>>>
>>>>
>>>> <nestedClassifier xmi:type='uml:CommunicationPath'
>>>> xmi:id='_18_2_2ff0127_1452978628060_399984_6195' name='TLS/DNS/etc'>
>>>>   <memberEnd xmi:idref='_18_2_2ff0127_1452978628060_557499_6196'/>
>>>>   <memberEnd xmi:idref='_18_2_2ff0127_1452978628061_485164_6197'/>
>>>> </nestedClassifier>
>>>> <nestedNode xmi:type='uml:Node'
>>>> xmi:id='_18_2_2ff0127_1452882456655_449194_4228' name='Stealth Master DNS'>
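
For what it's worth, the first option in the reply at the top of this thread (a stored-only field holding the original), together with the advice to convert the XML into fields first, might look roughly like the Python sketch below. Everything named here is an assumption for illustration and does not come from the thread: the helper build_solr_add, the field names "id", "name" and "original_xml", the sample document, and its placeholder xmi namespace URI. It also assumes the schema defines "name" as a multiValued indexed field and "original_xml" as stored="true" indexed="false" docValues="false". The sketch only builds the <add> message; posting it to Solr is left out.

```python
import xml.etree.ElementTree as ET


def build_solr_add(doc_id, original_xml):
    """Turn an arbitrary XML document X into a Solr <add> message Tx.

    Hypothetical helper: the field names used here are illustrative and
    must match whatever the schema actually defines.
    """
    source = ET.fromstring(original_xml)

    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")

    ET.SubElement(doc, "field", name="id").text = doc_id

    # Emit one "name" field per unqualified name attribute found in X,
    # so the values become searchable (assumes "name" is multiValued).
    for element in source.iter():
        value = element.get("name")
        if value is not None:
            ET.SubElement(doc, "field", name="name").text = value

    # Carry the untouched original along in a stored-only field.
    # ElementTree escapes the markup when the message is serialized.
    ET.SubElement(doc, "field", name="original_xml").text = original_xml

    return ET.tostring(add, encoding="unicode")


# Made-up sample; the xmi namespace URI is a placeholder.
SAMPLE = (
    "<root xmlns:xmi='http://example.com/xmi'>"
    "<nestedClassifier xmi:type='uml:CommunicationPath' name='TLS/DNS/etc'/>"
    "<nestedNode xmi:type='uml:Node' name='Stealth Master DNS'/>"
    "</root>"
)

if __name__ == "__main__":
    print(build_solr_add("doc-1", SAMPLE))
```

With a document indexed this way, a query that matches on "name" can return "original_xml" in its field list and the app hands that back as X. The second option (a pointer) would be the same shape, with a path or URL in the stored field instead of the full document.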