RE: PDF search functionality using Solr Schema.xml and SolrConfig.xml question

2015-01-06 Thread Ganesh.Yadav
1:56 AM To: solr-user@lucene.apache.org Subject: Re: PDF search functionality using Solr Hello, no matter which search platform you will use, this will pose two challenges: - The size of the documents will render search less and less useful as the likelihood of matches increases with documen

Re: PDF search functionality using Solr

2015-01-06 Thread Erick Erickson
Seconding Jürgen's comment. 4G docs are almost, but not quite, totally useless to search. How many JIRAs each? That's _one_ document unless you do some fancy dancing. Pulling the data directly using the JIRA API sounds far superior. If you _must_ use the JIRA->PDF->Solr option, consider the followi

RE: PDF search functionality using Solr Schema.xml and SolrConfig.xml question

2015-01-06 Thread Ganesh.Yadav
.com] Sent: Tuesday, January 06, 2015 11:56 AM To: solr-user@lucene.apache.org Subject: Re: PDF search functionality using Solr Hello, no matter which search platform you will use, this will pose two challenges: - The size of the documents will render search less and less useful as the likelihood

Re: PDF search functionality using Solr

2015-01-06 Thread Jürgen Wagner (DVT)
Hello, no matter which search platform you will use, this will pose two challenges: - The size of the documents will render search less and less useful as the likelihood of matches increases with document size. So, without a proper semantic extraction (e.g., using decent NER or relationship extr

PDF search functionality using Solr

2015-01-06 Thread Ganesh.Yadav
Hello Solr users and developers, Can you please suggest: 1. What should I do to index PDF content information column-wise? 2. Do I need to extract the contents using one of the Analyzer, Tokenizer, and Filter combinations and then add it to the index? How can I test the results on command pr