Re: Question about indexing PDFs

2016-08-26 Thread Betsey Benagh
Erick, I’m not sure of anything. I’m new to Solr and find the documentation extremely confusing. I’ve searched the web and found tutorials/advice, but they generally refer to older versions of Solr, and refer to methods/settings/whatever that no longer exist. That’s why I’m asking for help here.

RE: Question about indexing PDFs

2016-08-26 Thread Srinivasa Meenavalli
on&indent=true Regards Srinivas Meenavalli -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Friday, August 26, 2016 3:09 AM To: solr-user Subject: Re: Question about indexing PDFs That is always a dangero

Re: Question about indexing PDFs

2016-08-25 Thread Erick Erickson
That is always a dangerous assumption. Are you sure you're searching on the proper field? Are you sure it's indexed? Are you sure it's The schema browser I indicated above will give you some idea what's actually in the field. You can not only see the fields Solr (actually Lucene) sees in your i

Re: Question about indexing PDFs

2016-08-25 Thread Betsey Benagh
Right, that’s where I looked. No ‘content’. Which is what confused me. On 8/25/16, 1:56 PM, "Erick Erickson" wrote: >when you say "I don't see it in the schema for that collection" are you >talking schema.xml? managed_schema? Or actual documents in the index? >Often >these are defined by dyna

Re: Question about indexing PDFs

2016-08-25 Thread Betsey Benagh
It looks like the metadata of the PDFs was indexed, but not the content (which is what I was interested in). Searches on terms I know exist in the content come up empty. On 8/25/16, 2:16 PM, "Betsey Benagh" wrote: >Right, that’s where I looked. No ‘content’. Which is what confused me. > > >On

Re: Question about indexing PDFs

2016-08-25 Thread Erick Erickson
when you say "I don't see it in the schema for that collection" are you talking schema.xml? managed_schema? Or actual documents in the index? Often these are defined by dynamic fields and the like in the schema files. Take a look at the admin UI>>schema browser>>drop down and you'll see all the ac

Re: Question about Indexing Updated Documents

2016-07-01 Thread Chris Hostetter
If you are already using DIH, then you can use a deltaQuery to find "updated" documents and index only them. https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler Some people just parameterize their main DIH query and use request par
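A delta import is usually triggered through a request to the Data Import Handler. The following is only a minimal sketch of building such a request URL: the base URL, the core name `mycore`, and the `/dataimport` handler path are assumptions (the path depends on how the handler is registered in solrconfig.xml), and the `deltaQuery` itself still has to be defined in the DIH configuration.

```python
from urllib.parse import urlencode


def delta_import_url(base_url, core):
    """Build a URL that asks the Data Import Handler to run a delta import.

    base_url and core are hypothetical placeholders; the handler path
    "/dataimport" is assumed and depends on solrconfig.xml.
    """
    params = {
        "command": "delta-import",  # index only documents found by deltaQuery
        "clean": "false",           # keep the documents already in the index
        "commit": "true",           # commit once the import finishes
    }
    return f"{base_url}/{core}/dataimport?{urlencode(params)}"


url = delta_import_url("http://localhost:8983/solr", "mycore")
print(url)
```

The URL can then be requested with any HTTP client, for example from a cron job.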

Re: question about indexing...

2010-05-26 Thread Erik Hatcher
On May 26, 2010, at 3:49 AM, Jörg Agatz wrote: is the Textfield Single instance? how can i make it? I'm not sure what you're asking. You can have as many "text" fields as you like, or as many of any other type as well. In textfield indext the Word : "Hallo" if i search "Hallo" i found "

Re: question about indexing...

2010-05-26 Thread Jörg Agatz
OK, done. I rebooted the server. Now it works. Is the text field single instance? How can I make it so? In the text field I indexed the word "Hallo". If I search "Hallo", I find it; "hallo", I find it; "Hall*", I don't; "hall*", I find it. But some users will search "Hall*". One more little question I have... The Diff

Re: question about indexing...

2010-05-26 Thread Jörg Agatz
Sorry, i mean: The XML like This:

Re: question about indexing...

2010-05-26 Thread Jörg Agatz
OK, done... But no changes! I made the following in schema.xml: all The XML looks like this: I search for "hallo", "Hallo", "leute" or "name", but I can't find anything. "*:*" returns the one indexed file. What is happening?

Re: question about indexing...

2010-05-25 Thread Erick Erickson
Don't forget to re-index after you make the change Lance suggested... Erick On Tue, May 25, 2010 at 4:51 PM, Lance Norskog wrote: > Change type="string" to type="text". This causes the field to be > analyzed and then searching on words finds the document. > > > > On Tue, May 25, 2010 at 8:34 AM

Re: question about indexing...

2010-05-25 Thread Lance Norskog
Change type="string" to type="text". This causes the field to be analyzed and then searching on words finds the document. On Tue, May 25, 2010 at 8:34 AM, Jörg Agatz wrote: > i create a new Index, but nothing Change. > >   multiValued="true"/> > > > > > > > > > I search for : > > " *:* " > I fo
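Why the type change matters can be shown with a toy model. This is a rough sketch, not Solr's actual analysis chain: it assumes a "string" field keeps the whole value as one exact token, while a "text" field is split into words and lowercased. The function names are made up for illustration.

```python
import re


def index_string(value):
    # Toy stand-in for a "string" field: the whole value is one exact token.
    return {value}


def index_text(value):
    # Toy stand-in for an analyzed "text" field: split into words, lowercase.
    return {token.lower() for token in re.findall(r"\w+", value)}


def matches(query, tokens, analyzed):
    # An analyzed field lowercases the query term too; a string field does not.
    term = query.lower() if analyzed else query
    return term in tokens


value = "Hallo Leute"
string_tokens = index_string(value)
text_tokens = index_text(value)

print(matches("hallo", string_tokens, analyzed=False))       # word search on string field
print(matches("Hallo Leute", string_tokens, analyzed=False))  # exact full value
print(matches("hallo", text_tokens, analyzed=True))           # word search on text field
```

In this model a single word only matches the analyzed field, while the string field matches nothing but the complete, exact value.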

Re: question about indexing...

2010-05-25 Thread Jörg Agatz
I created a new index, but nothing changed. I search for: " *:* " and I find it. I search for "hallo", "Hallo", "hallo*", "Hallo*" or some other content from the CDATA field and I don't.

Re: question about indexing...

2010-05-25 Thread Erik Hatcher
You have to provide more details than that. We need to know the field definition for that named field, the corresponding field type definition, and the exact request you're making to Solr that you think should find this document. And most importantly, did you :) Erik On May 25,

Re: question about indexing...

2010-05-25 Thread Jörg Agatz
OK, done. But now I don't find any word in the CDATA field. I made: it is a string field, multivalued. King

Re: question about indexing...

2010-05-25 Thread Erik Hatcher
Well, you'll just have to create valid XML, either encoding some characters or using CDATA sections. Erik On May 25, 2010, at 10:06 AM, Jörg Agatz wrote: I have a work!, i musst indexing a lot of E-Mails, so i will create a Script to generate me a xml of the Mails. Now is the que
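One way to get valid XML from a script is to let an XML library do the escaping instead of concatenating strings. A minimal sketch, assuming Solr's `<add><doc><field name="...">` update format; the field names `id`, `subject` and `body` and the sample mail are hypothetical and would have to match the real schema.

```python
import xml.etree.ElementTree as ET


def build_add_xml(docs):
    """Build a Solr <add> update message from a list of {field: value} dicts."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            # ElementTree escapes &, < and > when the tree is serialized,
            # so no manual encoding or CDATA section is needed here.
            field.text = value
    return ET.tostring(add, encoding="unicode")


# Hypothetical mail with characters that would break hand-written XML.
mail = {
    "id": "mail-1",
    "subject": "Q&A <urgent>",
    "body": "Hallo Leute, 1 < 2 && 3 > 2",
}
xml_message = build_add_xml([mail])
print(xml_message)
```

Parsing the result back with the same library is a cheap check that the message is well formed before it is posted to Solr.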