Re: Solr 4.3.0 Cloud Issue indexing pdf documents

2013-06-10 Thread Michael Della Bitta
Glad that helped. I'm going to go buy a lottery ticket now! :) Michael Della Bitta Applications Developer o: +1 646 532 3062 | c: +1 917 477 7906 appinions inc. “The Science of Influence Marketing” 18 East 41st Street New York, NY 10017 t: @appinions | g+: p

Re: Solr 4.3.0 Cloud Issue indexing pdf documents

2013-06-10 Thread Mark Wilson
Hi Michael, Thanks very much for that; it did indeed solve the problem. I had it set up on my internal servers, as I have a separate script for Tomcat startup, but forgot all about it on the Amazon Cloud servers. For info, I added CATALINA_OPTS="-Djava.awt.headless=true" export CATALINA_OPTS to

Re: Solr 4.3.0 Cloud Issue indexing pdf documents

2013-06-07 Thread Michael Della Bitta
Hi Mark, This is a total shot in the dark, but does passing -Djava.awt.headless=true when you run the server help at all? More on AWT headless mode: http://www.oracle.com/technetwork/articles/javase/headless-136834.html

Solr 4.3.0 Cloud Issue indexing pdf documents

2013-06-07 Thread Mark Wilson
Hi, I am having an issue with adding PDF documents to a SolrCloud index I have set up. I can index PDF documents fine using 4.3.0 on my local box, but I have a SolrCloud instance set up on the Amazon Cloud (using 2 servers) and I get an error. It seems that it is not loading org.apache.pdfbox.pdmodel.

Re: SolrJ indexing pdf documents

2012-06-16 Thread Sami Siren
On Sat, Jun 16, 2012 at 5:59 PM, 12rad wrote: > Hi, > > I'm new to SolrJ. Hi and welcome! > Here I are the steps I followed to write an application to index pdf > documents to fresh solr3.6 > > 1 -In Schema.xml: > I added the fields I wanted indexed and changed stored = true. > > 2 - Started Sol

SolrJ indexing pdf documents

2012-06-16 Thread 12rad
t see anything. The numDocs that have been indexed is still 0. What am I doing incorrectly? Any help would be greatly appreciated. Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/SolrJ-indexing-pdf-documents-tp3989965.html Sent from the Solr - User mailing list archive at Nabble.com.

Indexing PDF documents with no UniqueKey

2011-07-15 Thread sabman
ant to use it through the DataImportHandler. -- View this message in context: http://lucene.472066.n3.nabble.com/Indexing-PDF-documents-with-no-UniqueKey-tp3173272p3173272.html

Re: indexing pdf documents

2008-05-14 Thread Brian Carmalt
Hello Cam, The wiki for RichDocuments explains how you can add metadata to the RDUpdater. http://wiki.apache.org/solr/UpdateRichDocuments I have used the patch to index docs and their metadata, but it was not exactly what we needed. Brian. On Wednesday, 14.05.2008, 12:38 +0300, wrote

Re: indexing pdf documents

2008-05-14 Thread Cam Bazz
Hello Elizabeth; Yes, I have PDF files, and metadata about them already extracted. So I need something like: someone content of my pdf file. It seems that the updaterichdocument patch can only accept PDFs in raw form, so it is not possible to feed metadata. Have you found a solution other th

Re: indexing pdf documents

2008-05-13 Thread Bess Sadler
C.B., are you saying you have metadata about your PDF files (i.e., title, author, etc) separate from the PDF file itself, or are you saying you want to extract that information from the PDF file? The first of these is pretty easy, the second of these can be difficult or impossible, dependin

Re: indexing pdf documents

2008-05-13 Thread Cam Bazz
Yes, I have seen the documentation on RichDocumentRequestHandler at the http://wiki.apache.org/solr/UpdateRichDocuments page. However, from what I understand this just feeds documents to Solr. How can I construct something like: document_id, document_name, document_text and feed it in. (i.e. my doc

Re: indexing pdf documents

2008-05-12 Thread Chris Harris
Solr does not have this support built in, but there's a patch for it: https://issues.apache.org/jira/browse/SOLR-284 On Mon, May 12, 2008 at 2:02 PM, Cam Bazz <[EMAIL PROTECTED]> wrote: > Hello, > > Before making a little program to extract the txt from my pdfs and feed it > into solr with xml,

indexing pdf documents

2008-05-12 Thread Cam Bazz
Hello, Before making a little program to extract the txt from my pdfs and feed it into solr with xml, I just wanted to check if solr has capability to digest pdf files apart from xml? Best Regards, -C.B.
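
For readers following the "extract the text and feed it into Solr with XML" route mentioned in this thread, here is a minimal sketch of building the XML update message. It assumes the text has already been extracted from the PDF by some other tool, and that the index schema defines the fields document_id, document_name and document_text (names taken from the thread; your schema may differ). The helper name build_add_message and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs):
    """Build a Solr XML <add> message from a list of {field name: value} dicts.

    build_add_message is a hypothetical helper; the field names passed in are
    assumed to exist in the target schema.
    """
    add = ET.Element("add")
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            # One <field name="..."> element per value; ElementTree escapes
            # characters such as & and < in the text for us.
            field = ET.SubElement(doc, "field", name=name)
            field.text = value
    return ET.tostring(add, encoding="unicode")


# Sample values are illustrative only.
message = build_add_message([
    {
        "document_id": "doc-1",
        "document_name": "example.pdf",
        "document_text": "Text extracted from the PDF & escaped by ElementTree.",
    }
])
print(message)
```

The resulting string would then be sent to the Solr update handler in an HTTP POST, followed by a commit. That step is left out here because the URL and core name depend on the installation.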