I also have a real-world document that doesn't work (from our Nutch crawls):
wget http://variogr.am/badfile.txt
./post.sh badfile.txt


A Solr rock star advised me to try SOLR-214, which fixes the problem. Perhaps he'll enlighten us as to the reasons! But for now, be careful with Resin.



I don't know about this "rock star" business!

Brian's setup worked running trunk from ~1 month ago... the major character encoding change since then is to use the servlet container's getReader() rather than constructing a reader from the stream.

The javadocs are clear that the servlet container needs to handle the conversion... but if that is causing problems in the newest Resin and Tomcat, maybe Solr should take care of it.
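
For illustration, here is a minimal sketch of what "taking care of it" could look like: build the reader from the raw stream with an explicit charset instead of relying on whatever the container picks. This is not Solr's code. The class and method names are made up, UTF-8 is assumed for the posted body, and a ByteArrayInputStream stands in for the request's input stream.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Solr or the servlet API.
public class ReaderFromStreamSketch {

    // Decode the whole stream with an explicitly chosen charset.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, charset))) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed input: a posted body that is really UTF-8.
        byte[] body = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        String asUtf8 = readAll(new ByteArrayInputStream(body),
                                StandardCharsets.UTF_8);
        String asLatin1 = readAll(new ByteArrayInputStream(body),
                                  StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 round trip intact:      "
                           + asUtf8.equals("caf\u00e9"));
        System.out.println("ISO-8859-1 round trip intact: "
                           + asLatin1.equals("caf\u00e9"));
    }
}
```

As far as I know, the servlet spec has getReader() fall back to ISO-8859-1 when the request names no charset, so a UTF-8 body posted without one would be decoded like the second case above, with each multi-byte character turned into two wrong ones. Whether that is what bites badfile.txt is only a guess.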

