Re: Error when submitting PDF to Solr w/text fields using SolrJ

2015-06-19 Thread Erick Erickson
You really, really, really want to get friendly with the admin/analysis page for questions like: bq: You're probably right though. I probably have to create a better analyzer really ;). It shows you exactly what each link in your analysis chain does to the input. Perhaps 75% of the questions abo
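To make the "each link in your analysis chain" idea concrete, here is a toy sketch in plain Java. It is not Solr's or Lucene's code: the class and method names (ToyAnalysisChain, tokenize, lowercase, removeStopWords) and the stop-word list are made up for illustration. It only mimics what the analysis page shows, which is the token list after every stage.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy model of an analysis chain: tokenizer -> lowercase filter -> stop filter.
// Hypothetical names throughout; real Solr chains are configured in the schema.
class ToyAnalysisChain {

    // Stage 1: split the input on runs of whitespace.
    static List<String> tokenize(String input) {
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // Stage 2: lowercase every token.
    static List<String> lowercase(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            out.add(t.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // Stage 3: drop tokens that appear in the stop-word set.
    static List<String> removeStopWords(List<String> tokens, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            if (!stopWords.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "a", "of"));
        String input = "The Contents of a PDF";

        // Print the token list after each stage, as the analysis page does.
        List<String> stage1 = tokenize(input);
        System.out.println("tokenize:        " + stage1);
        List<String> stage2 = lowercase(stage1);
        System.out.println("lowercase:       " + stage2);
        List<String> stage3 = removeStopWords(stage2, stopWords);
        System.out.println("removeStopWords: " + stage3);
    }
}
```

Seeing the intermediate lists side by side is the point: if a query term never matches, the stage where the indexed token and the query token diverge is usually visible at a glance.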

Re: Error when submitting PDF to Solr w/text fields using SolrJ

2015-06-19 Thread Paden
Yes, the number of indexed documents is correct. But the queries I perform fall short of what they should be. You're probably right though. I probably have to create a better analyzer. And I'm not really worried about the other fields. I've already checked to see if it's storing them correctly and i

Re: Error when submitting PDF to Solr w/text fields using SolrJ

2015-06-19 Thread Erick Erickson
This may be another forehead-slapper (man, you don't know how often I've injured myself that way). Did you commit at the end of the SolrJ indexing to Testcore2? DIH automatically commits at the end of the run, and depending on how your SolrJ program is written it may not have. Or just set autoComm

Re: Error when submitting PDF to Solr w/text fields using SolrJ

2015-06-19 Thread Alessandro Benedetti
So, the first thing I can say is that if this is true: "it almost killed Solr with 280 files", you are doing something wrong for sure. At least if you are not trying to index full 4K movies xD. Joking apart: 1) You should carefully design your analyser. 2) You should store your fields initially to verify you

Re: Error when submitting PDF to Solr w/text fields using SolrJ

2015-06-19 Thread Paden
Yeah, changing the field to "text_en" or "text_en_splitting" actually made it so my indexer indexed all my files. The only problem is, I don't think it's doing it well. I have two cores that I'm working with. Both of them have indexed the same set of files. The first core, which I will r

Re: Error when submitting PDF to Solr w/text fields using SolrJ

2015-06-19 Thread Alessandro Benedetti
Silly thing … Maybe the immense token was generated because you were trying to set "string" as the field type for your text? Could that be? Can you wipe out the index, set a proper type for your text, and index again? No worries about the incomplete stack trace, we learn and do wrong things every day :) Errare humanu
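Some background on the "immense token": Lucene refuses to index any single term whose UTF-8 encoding is longer than 32766 bytes, and a "string" field indexes the whole field value as one term. The plain-Java sketch below illustrates that arithmetic only. It does not call Solr or Lucene, and the class and method names (ImmenseTermCheck, fitsInSingleTerm, repeat) are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Illustration of why whole-document text overflows a single-term field.
// Hypothetical helper, not part of Solr or Lucene.
class ImmenseTermCheck {

    // Lucene's limit on the UTF-8 length of one indexed term, in bytes.
    static final int MAX_TERM_BYTES = 32766;

    // True if the value could be indexed as one term (the "string" field case).
    static boolean fitsInSingleTerm(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length <= MAX_TERM_BYTES;
    }

    // Small helper so the example does not depend on String.repeat (Java 11+).
    static String repeat(String s, int times) {
        StringBuilder sb = new StringBuilder(s.length() * times);
        for (int i = 0; i < times; i++) {
            sb.append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for text extracted from a PDF: 10,000 x "word " = 50,000 bytes.
        String extractedText = repeat("word ", 10000);

        // As one term (untokenized), the value is over the limit.
        System.out.println(fitsInSingleTerm(extractedText)); // false

        // After tokenizing on whitespace, each term is tiny.
        boolean allFit = true;
        for (String token : extractedText.trim().split("\\s+")) {
            allFit &= fitsInSingleTerm(token);
        }
        System.out.println(allFit); // true

        // The limit is in bytes, not characters: 20,000 x "é" is 40,000 bytes.
        System.out.println(fitsInSingleTerm(repeat("\u00e9", 20000))); // false
    }
}
```

This also fits the observation that the same documents fail on every run: with an untokenized field, whether a file is rejected depends only on the size of its extracted text.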

Re: Error when submitting PDF to Solr w/text fields using SolrJ

2015-06-19 Thread Paden
Yeah I'm just gonna say hands down this was a totally bad question. My fault, mea culpa. I'm pretty new to working in an IDE environment and using a stack trace (I just finished my first year of CS at University and now I'm interning). I'm actually kind of embarrassed by how long it took me to real

Re: Error when submitting PDF to Solr w/text fields using SolrJ

2015-06-19 Thread Alessandro Benedetti
I definitely agree with Erick: the stack trace you posted is, again, not complete. This is an example of the same problem you got, with a complete, meaningful stack trace: " Stacktrace you provided: org.apache.solr.common.SolrException: Exception writing document id 12345 to the index; possible a

Re: Error when submitting PDF to Solr w/text fields using SolrJ

2015-06-18 Thread Erick Erickson
The stack trace is what gets returned to the client, right? It's often much more informative to see the Solr log output; the error message is often much more helpful there. By the time the exception bubbles up through the various layers, vital information is sometimes not returned to the client in t

Re: Error when submitting PDF to Solr w/text fields using SolrJ

2015-06-18 Thread Paden
Just rolling out a little bit more information as it is coming. I changed the field type in the schema to text_general and that didn't change a thing. Another thing is that it's consistently submitting/not submitting the same documents. I will run over it one time and it won't index a set of docu

Re: Error when submitting PDF to Solr w/text fields using SolrJ

2015-06-18 Thread Paden
Using Solr 5.1.0. This is the schema file filepath

Re: Error when submitting PDF to Solr w/text fields using SolrJ

2015-06-18 Thread Alessandro Benedetti
We would like more information, but the first thing I notice is that it would hardly make any sense to use a "string" type for file content. Can you give more details about the exception? Have you debugged a little bit? How does the Solr input document look before it is sent to Solr? Furthermor