Here's an idea: index the full text of your XML document using XmlCharFilter (available as a patch) or HtmlCharFilter, then highlight the entire document with some tag like <match>. You will need to fiddle with the highlighter parameters a bit to make sure you get one fragment that covers the entire file. You can then take the highlighted result, parse it as an XML document into a tree model like JDOM or DOM, and execute an XPath expression like name(/descendant::match[1]/..) to find out the context in which your (first) hit appears.
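
A minimal sketch of that last step, assuming the highlighter has already returned the whole document as one fragment with hits wrapped in <match>. The sample document is made up, and Python's standard ElementTree stands in for JDOM/DOM. ElementTree does not implement the XPath name() function, so the parent lookup is done by hand instead of with the exact expression above.

```python
import xml.etree.ElementTree as ET

# Hypothetical highlighter output: the full document as a single fragment,
# with each hit wrapped in a <match> element.
highlighted = (
    "<doc><gPart><TitlePart>Intro</TitlePart>"
    "<TextPart>La <match>zona</match> norte</TextPart></gPart></doc>"
)


def first_hit_context(xml_text, tag="match"):
    """Return the name of the element that encloses the first <match>."""
    root = ET.fromstring(xml_text)
    # ElementTree nodes do not know their parent, so build the mapping once.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    # iter() walks the tree in document order, so the first result is the
    # first hit in the file.
    for elem in root.iter(tag):
        parent = parent_of.get(elem)
        return parent.tag if parent is not None else None
    return None  # no hit in this document


context = first_hit_context(highlighted)
print(context)
```

With a full XPath 1.0 engine (for example the one that ships with Java's DOM support), the single expression from the message should do the same job.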

-Mike

On 7/26/2011 10:48 AM, Lucas Miguez wrote:
Hi, I finally have all the field names of each document using the
Luke Request Handler (http://wiki.apache.org/solr/LukeRequestHandler),
and by making HTTP requests to Solr I can get all the fields that
contain the word I am searching for.
I'll keep looking for a better solution.
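
A rough sketch of that per-field approach, under stated assumptions: the field names are taken as already fetched (for example from the Luke Request Handler), and `count_hits` is a hypothetical callable that runs one query and returns its hit count. The fake index below only stands in for Solr so the logic can be run offline.

```python
def fields_containing(term, field_names, count_hits):
    """Return the fields in which `term` matches, issuing one query per field.

    `count_hits` is a hypothetical callable: it takes a query string such as
    "TitlePart:zona" and returns the number of matching documents.
    """
    return [name for name in field_names if count_hits(f"{name}:{term}") > 0]


# Stand-in for Solr: a made-up mapping of field name to field text.
fake_index = {
    "TitlePart": "zona norte",
    "TextPart": "otra cosa",
    "TextSubPart": "la zona sur",
}


def fake_count_hits(query):
    # Split "field:term" and do a naive whole-word lookup in the fake index.
    field, _, term = query.partition(":")
    return 1 if term in fake_index.get(field, "").split() else 0


hits = fields_containing("zona", fake_index, fake_count_hits)
print(hits)
```

The cost is one request per field, which is one reason to keep looking for a better solution.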

Thanks!

Regards

2011/7/15 Gora Mohanty
On Thu, Jul 14, 2011 at 8:43 PM, Lucas Miguez<lucas.mig...@gmail.com>  wrote:
Thanks for your help!

DIH XPathEntityProcessor helps me to index the XML files, but does it
help me to know where a node comes from? Following the example in
my previous post:

Example: imagine that the user searches for the word "zona"; then I
have to show the TitleP, the TextP, the TitlePart, the TextPart and all
the TextSubPart that are children of gSubPart.
Well, I tried to create TextPart, TitlePart, etc. with the XPath
expression of the location in the original XML, using dynamic fields,
for example:
<dynamic field="TextPart *" multivalued="true" indexed="true" ... />
There should not be a space between "TextPart" and "*".

to have the XPath associated with the field, but I don't know how to
search in all "TextPart *" fields...
[...]

You can search in individual fields, e.g., with ?q=TitlePart:myterm.
For searching in all "TextPart*" fields, the easiest way probably is
to copy the fields into a full-text search field. With the default Solr
schema, this can be done by adding a directive like
   <copyField source="TextPart*"  dest="text" />
This copies all fields matching "TextPart*" into the field "text",
which is searched by default. Thus, ?q=myterm will find "myterm" in
all "TextPart*" fields.
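
As a small illustration of the two query forms, here is a sketch that builds the request URLs. The host, port and /solr/select path are assumptions about a default local install, not something taken from this thread.

```python
from urllib.parse import urlencode

# Assumed location of the Solr select handler; adjust for your install.
SOLR_SELECT = "http://localhost:8983/solr/select"


def select_url(term, field=None):
    """Build a select URL, optionally restricting the search to one field."""
    q = f"{field}:{term}" if field else term
    # urlencode escapes the ":" in a fielded query for use in a URL.
    return f"{SOLR_SELECT}?{urlencode({'q': q})}"


single_field_url = select_url("myterm", field="TitlePart")  # q=TitlePart:myterm
default_field_url = select_url("myterm")  # searches the default field
print(single_field_url)
print(default_field_url)
```

The second form only reaches the "TextPart*" content if the copyField directive is in the schema and the documents have been reindexed since it was added.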

Regards,
Gora

