Re: Question about indexing PDFs

2016-08-26 Thread Betsey Benagh
Erick, I’m not sure of anything. I’m new to Solr and find the documentation extremely confusing. I’ve searched the web and found tutorials/advice, but they generally refer to older versions of Solr, and refer to methods/settings/whatever that no longer exist. That’s why I’m asking for help here.

RE: Question about indexing PDFs

2016-08-26 Thread Srinivasa Meenavalli
on&indent=true Regards Srinivas Meenavalli -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Friday, August 26, 2016 3:09 AM To: solr-user Subject: Re: Question about indexing PDFs That is always a dangero

Re: Question about indexing PDFs

2016-08-25 Thread Erick Erickson
That is always a dangerous assumption. Are you sure you're searching on the proper field? Are you sure it's indexed? Are you sure it's The schema browser I indicated above will give you some idea what's actually in the field. You can not only see the fields Solr (actually Lucene) see in your i

Re: Question about indexing PDFs

2016-08-25 Thread Betsey Benagh
Right, that's where I looked. No 'content'. Which is what confused me. On 8/25/16, 1:56 PM, "Erick Erickson" wrote: >when you say "I don't see it in the schema for that collection" are you >talking schema.xml? managed_schema? Or actual documents in the index? >Often >these are defined by dyna

Re: Question about indexing PDFs

2016-08-25 Thread Betsey Benagh
It looks like the metadata of the PDFs was indexed, but not the content (which is what I was interested in). Searches on terms I know exist in the content come up empty. On 8/25/16, 2:16 PM, "Betsey Benagh" wrote: >Right, that's where I looked. No 'content'. Which is what confused me. > > >On

Re: Question about indexing PDFs

2016-08-25 Thread Erick Erickson
when you say "I don't see it in the schema for that collection" are you talking schema.xml? managed_schema? Or actual documents in the index? Often these are defined by dynamic fields and the like in the schema files. Take a look at the admin UI>>schema browser>>drop down and you'll see all the ac
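
[Editor's sketch, not part of the original message] The check Erick describes, seeing which fields actually exist in the index rather than in the schema files, can also be done from a script. The sketch below assumes Solr's Luke request handler is reachable at `/admin/luke` under the collection and that its JSON response has a top-level `fields` object keyed by field name. The helper names, the base URL, and the collection name `gettingstarted` are illustrative assumptions and do not come from the thread.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def luke_url(base_url, collection):
    """Build the Luke handler URL for a collection.

    numTerms=0 skips the per-field top-terms lists, which keeps the
    response small when only the field names are of interest.
    """
    params = urlencode({"numTerms": 0, "wt": "json"})
    return f"{base_url.rstrip('/')}/{collection}/admin/luke?{params}"


def field_names(luke_response):
    """Return the sorted field names from a parsed Luke JSON response.

    Assumes the response carries a top-level "fields" object keyed by
    field name; a response without it yields an empty list.
    """
    return sorted(luke_response.get("fields", {}))


def fetch_field_names(base_url, collection):
    """Query a running Solr instance and list the fields in its index."""
    with urlopen(luke_url(base_url, collection)) as resp:
        return field_names(json.load(resp))
```

Against a running instance, `fetch_field_names("http://localhost:8983/solr", "gettingstarted")` (both values are assumptions) would return the field names Lucene actually holds. If `content` is missing from that list, the text was never indexed into a field of that name, which would be consistent with what Betsey reports.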

Question about indexing PDFs

2016-08-25 Thread Betsey Benagh
Following the instructions in the quick start guide, I imported a bunch of PDF documents into my Solr 6.0 instance. As far as I can tell from the documentation, there should be a 'content' field indexing, well, the content, but I don't see it in the schema for that collection. Is there somethi