A bug was introduced between Solr 3.1 and 3.2.
With Solr 3.2 we are now getting the following error when querying
several PDF and Word documents:
SEVERE: org.apache.solr.common.SolrException:
org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token
17 exceeds length of provided
This works with NEITHER HtmlEncoder NOR DefaultEncoder.
1. Special characters like öäüß are simply returned as question marks.
This goes for ALL document types.
2. The index is built in a way that randomly concatenates words and
puts them into the highlighting section in a way that does no
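For reference, the encoder mentioned above is selected in the highlighting section of solrconfig.xml. A minimal sketch of a typical 3.x setup (names taken from the stock example config; treat the exact attributes as illustrative):

```xml
<highlighting>
  <!-- Switching between DefaultEncoder and HtmlEncoder happens here;
       marking one entry default="true" makes it apply when no
       hl.encoder parameter is given. -->
  <encoder name="html" default="true"
           class="solr.highlight.HtmlEncoder"/>
</highlighting>
```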
After updating from 1.4.1 to 3.1.0, some documents are no longer parsed
correctly:
1. Both the result's id field and the highlighting's header no longer
display special characters such as German umlauts.
2. The highlighting section is messed up as words appear in random order
instead of r
Hi,
when I query Solr (trunk) I get numeric character references instead
of regular UTF-8 strings for special characters in the highlighting
section; in the result section the characters are presented fine.
e.g. instead of the German umlaut ä I get the numeric character reference &#228;
Example:
Vielfachmessgerät
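As a client-side stopgap, numeric character references like the one above can be decoded back to regular strings. A minimal sketch in plain Java (the class and method names are made up for illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NcrDecode {
    // Replaces decimal numeric character references such as &#228;
    // with the corresponding character (here: ä).
    static String decode(String s) {
        Matcher m = Pattern.compile("&#(\\d+);").matcher(s);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            int codePoint = Integer.parseInt(m.group(1));
            m.appendReplacement(sb, Matcher.quoteReplacement(
                new String(Character.toChars(codePoint))));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("Vielfachmessger&#228;t"));
    }
}
```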
the "stream.url" parameter. (Also stream.file.) Note
that there is no outbound authentication supported; your web server
has to be open (at least to the Solr instance).
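Remote streaming via stream.url / stream.file has to be switched on explicitly in solrconfig.xml. A minimal sketch (attribute values are illustrative; the upload limit is an example):

```xml
<requestDispatcher handleSelect="true">
  <!-- enableRemoteStreaming="true" is what makes stream.url and
       stream.file work. As noted above, Solr performs no outbound
       authentication when fetching the remote URL. -->
  <requestParsers enableRemoteStreaming="true"
                  multipartUploadLimitInKB="2048"/>
</requestDispatcher>
```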
On Sun, Oct 31, 2010 at 4:06 PM, getagrip wrote:
Hi,
I've got some basic usage / design questions.
1. The SolrJ wiki proposes to use the same CommonsHttpSolrServer
instance for all requests to avoid connection leaks.
So if I create a singleton instance upon application startup, I can
safely use this instance for ALL queries/updates t
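That is the usual pattern. A minimal sketch of such a holder in plain Java (SolrClientHolder is a made-up name, and the Object field stands in for the shared CommonsHttpSolrServer, which SolrJ documents as safe to share across threads):

```java
// Application-wide holder for a single shared server instance,
// created once at startup and reused for all queries/updates.
public class SolrClientHolder {
    private static final SolrClientHolder INSTANCE = new SolrClientHolder();

    // In real code this would be, e.g.:
    //   private final CommonsHttpSolrServer server =
    //       new CommonsHttpSolrServer("http://localhost:8983/solr");
    private final Object server = new Object();

    private SolrClientHolder() {}

    public static SolrClientHolder getInstance() {
        return INSTANCE;
    }

    public Object getServer() {
        return server;
    }
}
```

Every caller that asks for the holder gets the same instance, so only one underlying HTTP connection pool is ever created.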