You might be interested in trying Lux, a Solr extension that indexes XML
documents by element and attribute names, together with the contents of
those nodes. It also lets you define XPath indexes (like DIH, I think,
but with the full XPath 2.0 syntax) and query your document collection
using XQuery 1.0, in combination with standard Lucene searches at the
document level. See
http://luxdb.org/
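
As a rough illustration of the kind of path expression an XPath index or query might target in the sample entry quoted below, here is a minimal sketch. It uses the JDK's built-in XPath 1.0 engine, not Lux itself, and the class name `XPathSketch`, the `select` helper, and the abbreviated sample string are all hypothetical names introduced for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSketch {
    // Abbreviated copy of the sample entry from the question (assumption:
    // only a few of its cross references are reproduced here).
    static final String SAMPLE =
        "<entry id=\"REACT_142474\" acc=\"REACT_142474.5\">"
        + "<cross_references>"
        + "<ref dbname=\"ChEBI\" dbkey=\"17925\"/>"
        + "<ref dbname=\"UniProt\" dbkey=\"Q06625\"/>"
        + "<ref dbname=\"UniProt\" dbkey=\"P47011\"/>"
        + "<ref dbname=\"GO\" dbkey=\"GO:0004135\"/>"
        + "</cross_references>"
        + "</entry>";

    // Evaluates an XPath 1.0 expression against the XML text and returns
    // the string value of every matching node, in document order.
    static List<String> select(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
            expression,
            new InputSource(new StringReader(xml)),
            XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Select the dbkey attribute of every UniProt cross reference.
        String expr = "/entry/cross_references/ref[@dbname='UniProt']/@dbkey";
        System.out.println(select(SAMPLE, expr));
    }
}
```

The same expression style (a path plus an attribute predicate) is what distinguishes an XPath-based index from a flat field mapping; how Lux itself registers or evaluates such paths is not shown in this thread.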
-Mike Sokolov
On 8/16/2013 8:55 AM, Abhiroop wrote:
I am very new to Solr. I am looking to index an XML file and search its
contents. Its structure resembles something like this:
<entry id="REACT_142474" acc="REACT_142474.5">
<name>((1,6)-alpha-glucosyl)poly((1,4)-alpha-glucosyl)glycogenin =>
poly{(1,4)-alpha- glucosyl} glycogenin + alpha-D-glucose</name>
<description>This event has been computationally inferred from an event
that has been demonstrated in another species. The inference is based on
the homology mapping in Ensembl Compara. Briefly, reactions for which all
involved PhysicalEntities (in input, output and catalyst) have a mapped
orthologue/paralogue (for complexes at least 75% of components must have
a mapping) are inferred to the other species. High level events are also
inferred for these events to allow for easier navigation. More details
and caveats of the event inference in Reactome. For details on the
Ensembl Compara system see also: Gene orthology/paralogy prediction
method.</description>
<dates>
<date type="creation" value="06-JUN-2013"/>
<date type="last_modification" value="06-JUN-2013"/>
</dates>
<cross_references>
<ref dbname="ChEBI" dbkey="17925"/>
<ref dbname="UniProt" dbkey="Q06625"/>
<ref dbname="ChEBI" dbkey="18291"/>
<ref dbname="UniProt" dbkey="P47011"/>
<ref dbname="UniProt" dbkey="P36143"/>
<ref dbname="GO" dbkey="GO:0004135"/>
<ref dbname="taxonomy" dbkey="4932"/>
</cross_references>
<additional_fields>
<field name="organism">Saccharomyces cerevisiae</field>
</additional_fields>
</entry>
Is it essential to use the DIH to import this data into Solr? Isn't
there a simpler way to accomplish the task? Can it be done through
SolrJ? I am fine with printing the results to the console, too. It would
be really helpful if someone could point me to some useful examples or
resources on this, apart from the official documentation.
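
For the SolrJ route asked about above, the XML first has to be turned into named field values. The following is a minimal sketch of that step only, using the JDK's DOM parser. The class `EntryFields`, its helpers, the abbreviated sample, and the `ref_<dbname>` field naming are assumptions made for illustration. The SolrJ calls are deliberately left out: each name/value pair could then be copied into a SolrJ `SolrInputDocument` with `addField(name, value)`, provided the Solr schema defines matching fields.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class EntryFields {
    // Abbreviated copy of the sample entry (assumption: name shortened,
    // description and dates omitted, only some cross references kept).
    static final String SAMPLE =
        "<entry id=\"REACT_142474\" acc=\"REACT_142474.5\">"
        + "<name>(abbreviated reaction name)</name>"
        + "<cross_references>"
        + "<ref dbname=\"ChEBI\" dbkey=\"17925\"/>"
        + "<ref dbname=\"UniProt\" dbkey=\"Q06625\"/>"
        + "<ref dbname=\"UniProt\" dbkey=\"P47011\"/>"
        + "</cross_references>"
        + "<additional_fields>"
        + "<field name=\"organism\">Saccharomyces cerevisiae</field>"
        + "</additional_fields>"
        + "</entry>";

    // Flattens one <entry> element into field name -> list of values.
    static Map<String, List<String>> extract(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        Element entry = doc.getDocumentElement();
        Map<String, List<String>> fields = new LinkedHashMap<>();
        add(fields, "id", entry.getAttribute("id"));
        add(fields, "acc", entry.getAttribute("acc"));
        add(fields, "name", textOf(entry, "name"));
        add(fields, "description", textOf(entry, "description"));
        // Hypothetical naming: one multi-valued field per cross-reference database.
        NodeList refs = entry.getElementsByTagName("ref");
        for (int i = 0; i < refs.getLength(); i++) {
            Element ref = (Element) refs.item(i);
            add(fields, "ref_" + ref.getAttribute("dbname"), ref.getAttribute("dbkey"));
        }
        // Each <field name="..."> element becomes a field of that name.
        NodeList extras = entry.getElementsByTagName("field");
        for (int i = 0; i < extras.getLength(); i++) {
            Element extra = (Element) extras.item(i);
            add(fields, extra.getAttribute("name"), extra.getTextContent().trim());
        }
        return fields;
    }

    // Text of the first descendant element with the given tag, or "" if absent.
    static String textOf(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    // Appends a value to a field, skipping empty values.
    static void add(Map<String, List<String>> fields, String name, String value) {
        if (!value.isEmpty()) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
    }

    public static void main(String[] args) throws Exception {
        // Console output only, as the question allows.
        extract(SAMPLE).forEach((name, values) -> System.out.println(name + " = " + values));
    }
}
```

Keeping the parsing separate from the indexing call makes it easy to check the extracted fields on the console before any Solr server is involved.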
--
View this message in context:
http://lucene.472066.n3.nabble.com/Indexing-an-XML-file-in-Apache-Solr-tp4085053.html
Sent from the Solr - User mailing list archive at Nabble.com.