user:~/solr/example/exampledocs$ java -jar post.jar test.pdf
This does not work.

Index binary documents such as Word and PDF with Solr Cell 
(ExtractingRequestHandler).

How do I do this?

http://lucene.apache.org/solr/api-4_0_0-BETA/doc-files/tutorial.html


http://wiki.apache.org/solr/ExtractingRequestHandler

That page mentions Solr 1.4?

curl is not normally installed, so how do we do this with something like post.jar?
Also, the docs directory does not exist; this seems very outdated?

"using "curl" or other command line tools to post documents to Solr is nice for 
testing, but not the recommended update method for best performance."

What is the recommended method, then?


Further down on that page:

java -Durl=http://localhost:8983/solr/update/extract -Dparams=literal.id=doc5 -Dtype=text/html -jar post.jar tutorial.html


Is this the right way?

java -Dauto -jar post.jar tutorial.html
java -Dauto -Drecursive -jar post.jar .

"NOTE: The post.jar utility is not meant for production use"
So how do we normally do this, or how should we do it?
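
For reference, the post.jar command with -Durl, -Dparams and -Dtype quoted above amounts to a plain HTTP POST of the file body to the /update/extract endpoint, so it can be reproduced without curl. Below is a minimal sketch in Python using only the standard library. The function names are hypothetical, the URL and literal.id value are taken from the command above, and the commit=true parameter is an assumption added so the document becomes searchable right away. It requires a running Solr with the extracting handler enabled and is not meant as the recommended production method.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_extract_request(data, doc_id, content_type,
                          base_url="http://localhost:8983/solr/update/extract"):
    """Build (but do not send) a POST request for Solr's extract handler.

    data         -- raw bytes of the file to index
    doc_id       -- value for the literal.id parameter (the document id)
    content_type -- MIME type of the file, e.g. "text/html" or "application/pdf"
    """
    # commit=true is an assumption: it asks Solr to commit after this update.
    params = urlencode({"literal.id": doc_id, "commit": "true"})
    return Request(
        url=f"{base_url}?{params}",
        data=data,
        headers={"Content-Type": content_type},
        method="POST",
    )


def post_file(path, doc_id, content_type):
    """Read a file and send it to Solr; returns the HTTP status code."""
    with open(path, "rb") as handle:
        request = build_extract_request(handle.read(), doc_id, content_type)
    with urlopen(request) as response:
        return response.status
```

With Solr running, post_file("tutorial.html", "doc5", "text/html") would send the same kind of request as the post.jar command above.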
-------- Original Message --------
> Date: Wed, 19 Sep 2012 10:51:29 -0400
> From: Erik Hatcher <erik.hatc...@gmail.com>
> To: solr-user@lucene.apache.org
> Subject: Re: missing a directory, can not process pdf files

> There's nothing in that tutorial that mentions an update "directory". 
> /update is a URL endpoint that requires Solr be up and running.
> 
> Please post the entire set of steps that you're trying and the exact
> (copy/pasted) error messages you're receiving.
> 
> And once you index a PDF file, you don't retrieve the file back from Solr,
> you retrieve search results.  The original file is where it was indexed
> from, not inside Solr.  What you'll get back is the file name (if you stored
> it, that is).
> 
>       Erik
> 
> On Sep 19, 2012, at 10:40 , xxxx xxxx wrote:
> 
> > I want to process a PDF file; see "Indexing Data" in
> > http://lucene.apache.org/solr/api-4_0_0-BETA/doc-files/tutorial.html
> > 
> > The directory "update" doesn't even exist:
> > SimplePostTool: POSTing files to http://localhost:8983/solr/update..
> > 
> > It fails because the /update directory is not there and also has no
> > contents (and it is missing in the repos on GitHub and so on).
> > 
> > How can we retrieve the files that contain the searched query when we
> > run a query?
> > -------- Original Message --------
> >> Date: Wed, 19 Sep 2012 08:33:57 -0400
> >> From: Erick Erickson <erickerick...@gmail.com>
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: missing a directory, can not process pdf files
> > 
> >> Please review:
> >> 
> >> http://wiki.apache.org/solr/UsingMailingLists
> >> 
> >> There's nothing in your problem statement that's diagnosable. What did
> >> you try? What were the results? Details matter.
> >> 
> >> 4.0 is in the process of being prepped for release. 30 days was a
> >> straw-man proposal.
> >> 
> >> Best
> >> Erick
> >> 
> >> On Wed, Sep 19, 2012 at 3:46 AM, xxxx xxxx <team.gam...@gmx.de> wrote:
> >>> It seems the /update directory is missing? I use Solr 4.0.0 beta and
> >>> cannot process PDF files because of it.
> >>> 
> >>> Also, when will the final version be released? I thought it was 30 days
> >>> after beta?
> >>> 
> >>> How can we get the files which contain the searched queries / content?
> >>> 
> >>> 
> 
