I started this thread back in November. Recall that I'm indexing XML and
storing the XPath as a payload on each token. I am not encoding or mapping
the XPath, just storing the text directly via String.getBytes(). We're not
using this to query in any way, only to add context to our search results.
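
For anyone following along, here is a minimal sketch of the byte round trip described above. It is not our actual indexing code: XPathPayloadCodec and its methods are hypothetical names, no Lucene classes are involved, and the explicit UTF-8 charset is an assumption on my part (a bare String.getBytes() uses the platform default charset, which can differ between the machine that indexes and the one that reads the payload back).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Lucene: turns an XPath string into the raw
// bytes one might attach to a token as a payload, and back into a string.
final class XPathPayloadCodec {
    private XPathPayloadCodec() {}

    // Encode with an explicit charset (assumption: UTF-8) instead of relying
    // on the platform default that a bare String.getBytes() would use.
    static byte[] encode(String xpath) {
        return xpath.getBytes(StandardCharsets.UTF_8);
    }

    // Decode a slice of a buffer, assuming the stored bytes come back as
    // (buffer, offset, length) rather than as an exact-size array.
    static String decode(byte[] buffer, int offset, int length) {
        return new String(buffer, offset, length, StandardCharsets.UTF_8);
    }
}
```

Calling XPathPayloadCodec.encode("/book/chapter[2]/para[1]") gives the bytes to store, and decode(bytes, 0, bytes.length) recovers the same string when building the search result context.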

Hi All,
The Structured (or Multi-Page, Multi-Part) document problem is one I've
been thinking about for a while. A couple of years ago, when the project
I was working on used Lucene only (no Solr), we solved this problem in
several steps. At the point of ingestion, we created a custom