Yeah, changing the field to "text_en" or "text_en_splitting" did make my
indexer index all my files. The only problem is, I don't think it's
indexing them well.

I have two cores that I'm working with, and both have indexed the same set
of files. For the first core, which I will refer to as Testcore, I used a
DIH configuration that indexed the files with their metadata. (It indexed
everything fine, but it almost killed Solr with 280 files; I would hate to
see what would happen with, say, 10,000 files.) When I query Testcore on a
random common word like "a", it returns around 279 files. That's a good
margin, and I can accept it.

For the second core, which I will refer to as Testcore2, I used my own
indexer, which uses SolrJ as the client. It indexes everything. However,
when I query on the same word "a", it only returns 208 of the 281 files,
which is weird because I'm using the exact same query handler for both. So
I don't think the complete text of each file is being sent to Solr.
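
One way to narrow this down would be to list the document IDs each core
returns for the same query and diff the two lists, so the files that match
in Testcore but not in Testcore2 can be inspected directly. A minimal
sketch in plain Java, assuming the IDs have already been fetched from each
core (for example by running the same query with fl=id). The class name,
method name, and sample IDs are hypothetical, not part of my indexer:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class CoreDiff {
    // Returns the IDs present in `reference` but absent from `candidate`, in sorted order.
    static Set<String> missingFrom(Collection<String> reference, Collection<String> candidate) {
        Set<String> missing = new TreeSet<>(reference);
        missing.removeAll(candidate);
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical IDs; in practice these would come from each core's query results.
        List<String> testcoreIds = Arrays.asList("doc1.pdf", "doc2.pdf", "doc3.pdf");
        List<String> testcore2Ids = Arrays.asList("doc1.pdf", "doc3.pdf");

        // Files that matched in Testcore but not in Testcore2.
        System.out.println(missingFrom(testcoreIds, testcore2Ids)); // prints [doc2.pdf]
    }
}
```

Looking at the stored text of the files that show up in this difference
should show whether their content actually reached Solr.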





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Error-when-submitting-PDF-to-Solr-w-text-fields-using-SolrJ-tp4212704p4212933.html
Sent from the Solr - User mailing list archive at Nabble.com.
