RE: Problem with pdf, upgrading Cell

2010-05-11 Thread Marc Ghorayeb
Great news, thanks :) Marc

Re: Problem with pdf, upgrading Cell

2010-05-10 Thread Grant Ingersoll
pository.html > > Thanks, > Sandhya > > -Original Message- > From: Praveen Agrawal [mailto:pkal...@gmail.com] > Sent: Wednesday, May 05, 2010 10:49 PM > To: solr-user@lucene.apache.org > Subject: Re: Problem with pdf, upgrading Cell > > It reports that Ju

RE: Problem with pdf, upgrading Cell

2010-05-06 Thread Sandhya Agarwal
: Re: Problem with pdf, upgrading Cell It reports that Jukka has resolved the issue (TIKA-419), and it is now waiting for Grant to verify (SOLR-1902). But it seems the resolution will be available in version 0.8 of Tika. If it solves the problem, is there a way to get it now? Any SVN trunk access etc

Re: Problem with pdf, upgrading Cell

2010-05-05 Thread Praveen Agrawal
trib/clustering/lib/commons-lang-2.4.jar' to > classloader > > > > May 4, 2010 12:51:52 PM org.apache.solr.core.SolrResourceLoader > replaceClassLoader > > > > INFO: Adding > 'file:/C:/apache-solr-1.4.0/contrib/clustering/lib/ehcache-1.6.2.jar' to > classl

RE: Problem with pdf, upgrading Cell

2010-05-05 Thread Sandhya Agarwal
ded to it the extraction library (apache solr > cell jar), though you might not need it specifically inside the war file. > Marc > > From: sagar...@opentext.com > > To: solr-user@lucene.apache.org > > Date: Wed, 5 May 2010 10:21:36 +0530 > > Subject: RE: Problem with

RE: Problem with pdf, upgrading Cell

2010-05-05 Thread Marc Ghorayeb
Praveen, I am indeed using a trunk version from last week's SVN, I think. You could always try a version from the Hudson builds. I did not try this procedure with Solr's 1.4 release, though. Marc

Re: Problem with pdf, upgrading Cell

2010-05-05 Thread Praveen Agrawal
he extraction library (apache solr > cell jar), though you might not need it specifically inside the war file. > Marc > > From: sagar...@opentext.com > > To: solr-user@lucene.apache.org > > Date: Wed, 5 May 2010 10:21:36 +0530 > > Subject: RE: Problem with pdf, upgrading Cell

RE: Problem with pdf, upgrading Cell

2010-05-05 Thread Marc Ghorayeb
core-0.7.jar > tika-parsers-0.7.jar > xml-apis-1.0.b2.jar > xmlbeans-2.3.0.jar > > Thanks, > Sandhya > > > > -Original Message- > From: Sandhya Agarwal [mailto:sagar...@opentext.com] > Sent: Wednesday, May 05, 2010 10:06 AM > To: solr-user@lucene.apache

RE: Problem with pdf, upgrading Cell

2010-05-04 Thread Sandhya Agarwal
[mailto:sagar...@opentext.com] Sent: Wednesday, May 05, 2010 10:06 AM To: solr-user@lucene.apache.org Subject: RE: Problem with pdf, upgrading Cell Praveen, I only have the highlighted jars copied. Not sure, if we need the other jars. Also, I copied the jars directly into solr\WEB-INF\lib

RE: Problem with pdf, upgrading Cell

2010-05-04 Thread Sandhya Agarwal
: solr-user@lucene.apache.org Subject: Re: Problem with pdf, upgrading Cell Hi Sandhya, I must be missing something. I copied all the dependency jars to both the contrib/extraction/lib and WEB-INF/lib folders. Here is the list of jars copied: asm-3.1.jar bcmail-jdk15-1.45.jar bcprov-jdk15-1.45

Re: Problem with pdf, upgrading Cell

2010-05-04 Thread Praveen Agrawal
. > > Thanks, > Sandhya > > From: Praveen Agrawal [mailto:pkal...@gmail.com] > Sent: Tuesday, May 04, 2010 5:22 PM > To: solr-user@lucene.apache.org > Subject: Re: Problem with pdf, upgrading Cell > > another one here.. > On Tue, May 4, 2010 at 5:20 PM, Praveen Agr

RE: Problem with pdf, upgrading Cell

2010-05-04 Thread Marc Ghorayeb
ls, they just get appended. Someone correct me if i am wrong here :) Marc > Date: Tue, 4 May 2010 11:58:56 + > From: pkal...@gmail.com > To: solr-user@lucene.apache.org > Subject: Re: Problem with pdf, upgrading Cell > > This email contained a .zip file attachment. Raytheon do

Re: Problem with pdf, upgrading Cell

2010-05-04 Thread Praveen Agrawal
>> what you were asking. >>> Thanks. >>> >>> >>> >>> On Tue, May 4, 2010 at 5:01 PM, Sandhya Agarwal >>> wrote: >>> >>>> Praveen, >>>> >>>> Along with the tika core and parser jars, did you run "mvn >

RE: Problem with pdf, upgrading Cell

2010-05-04 Thread Sandhya Agarwal
Both the files work for me, Praveen. Thanks, Sandhya From: Praveen Agrawal [mailto:pkal...@gmail.com] Sent: Tuesday, May 04, 2010 5:22 PM To: solr-user@lucene.apache.org Subject: Re: Problem with pdf, upgrading Cell another one here.. On Tue, May 4, 2010 at 5:20 PM, Praveen Agrawal mailto:pkal

Re: Problem with pdf, upgrading Cell

2010-05-04 Thread Praveen Agrawal
"mvn >>> dependency:copy-dependencies", to generate all the dependencies too. >>> >>> Thanks, >>> Sandhya >>> >>> -Original Message- >>> From: Praveen Agrawal [mailto:pkal...@gmail.com] >>> Sent: Tuesday, May 04, 201

RE: Problem with pdf, upgrading Cell

2010-05-04 Thread Sandhya Agarwal
-user@lucene.apache.org Subject: Re: Problem with pdf, upgrading Cell Yes Sandhya, I copied the new poi/jempbox/pdfbox/fontbox etc. jars too. I believe this is what you were asking. Thanks. On Tue, May 4, 2010 at 5:01 PM, Sandhya Agarwal wrote: > Praveen, > > Along with the tika core and parser

Re: Problem with pdf, upgrading Cell

2010-05-04 Thread Praveen Agrawal
", to generate all the dependencies too. > > Thanks, > Sandhya > > -Original Message- > From: Praveen Agrawal [mailto:pkal...@gmail.com] > Sent: Tuesday, May 04, 2010 4:52 PM > To: solr-user@lucene.apache.org > Subject: Re: Problem with pdf, upgrading Cell >

RE: Problem with pdf, upgrading Cell

2010-05-04 Thread Sandhya Agarwal
solr-user@lucene.apache.org Subject: Re: Problem with pdf, upgrading Cell I seem to have mixed results. Here is what I did: copied the new Tika/poi/jempbox/pdfbox/fontbox/log4j jars etc. into contrib/extraction/lib (of course, removing the old ones), as well as into WEB-INF/lib of the Solr web app in Tomcat. Now it extract

Re: Problem with pdf, upgrading Cell

2010-05-04 Thread Praveen Agrawal
I seem to have mixed results. Here is what I did: copied the new Tika/poi/jempbox/pdfbox/fontbox/log4j jars etc. into contrib/extraction/lib (of course, removing the old ones), as well as into WEB-INF/lib of the Solr web app in Tomcat. Now it extracts contents from some PDFs, but either no content from others, or

RE: Problem with pdf, upgrading Cell

2010-05-04 Thread Marc Ghorayeb
Hey, I got it to work. I just redid my steps; I had forgotten several libraries that were imported through the XML. PDF extraction seems to work once again, and I have yet to find a PDF that raises an exception! Thanks for the investigation, at least we now have a fix :) Marc
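
The workaround described in this thread (copying the upgraded Tika jars and their dependencies into both contrib/extraction/lib and the Solr webapp's WEB-INF/lib, after removing the old Tika jars) could be scripted roughly as follows. This is only a sketch, not something posted by anyone in the thread: the function name `sync_jars`, the `stale_prefixes` parameter, and all paths in the usage comment are hypothetical, and it assumes the new jars have already been gathered into a single directory, for example by running "mvn dependency:copy-dependencies" in the Tika build.

```python
import shutil
from pathlib import Path


def sync_jars(source_dir, target_dirs, stale_prefixes=("tika-",)):
    """Copy every .jar found in source_dir into each directory in target_dirs.

    Before copying, jars already present in a target whose file name starts
    with one of stale_prefixes are deleted, so that an old and a new version
    of the same library do not end up side by side on the classpath.
    Returns the list of destination paths that were written.
    """
    jars = sorted(Path(source_dir).glob("*.jar"))
    prefixes = tuple(stale_prefixes)
    copied = []
    for target in map(Path, target_dirs):
        target.mkdir(parents=True, exist_ok=True)
        # Remove outdated jars (by default only the Tika ones).
        for old in target.glob("*.jar"):
            if old.name.startswith(prefixes):
                old.unlink()
        # Copy the new jars, overwriting files with the same name.
        for jar in jars:
            destination = target / jar.name
            shutil.copy2(jar, destination)
            copied.append(destination)
    return copied


# Hypothetical usage; every path below is an assumption, adjust to your layout:
# sync_jars(
#     "tika/tika-parsers/target/dependency",
#     ["apache-solr-1.4.0/contrib/extraction/lib",
#      "tomcat/webapps/solr/WEB-INF/lib"],
# )
```

Only jars matching `stale_prefixes` are deleted, so older poi/pdfbox/fontbox jars with different version numbers in their names would have to be added to that tuple or removed by hand. Restart the servlet container afterwards so the webapp classloader picks up the new jars.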

RE: Problem with pdf, upgrading Cell

2010-05-04 Thread Sandhya Agarwal
On Behalf Of Grant Ingersoll Sent: Tuesday, May 04, 2010 4:00 PM To: solr-user@lucene.apache.org Subject: Re: Problem with pdf, upgrading Cell Yes, it is loading the libraries, but they are in a different classloader that apparently the new way Tika loads doesn't have access to. -Gra

Re: Problem with pdf, upgrading Cell

2010-05-04 Thread Grant Ingersoll
he-1.6.2.jar' to > classloader > > May 4, 2010 12:51:52 PM org.apache.solr.core.SolrResourceLoader > replaceClassLoader > > INFO: Adding > 'file:/C:/apache-solr-1.4.0/contrib/clustering/lib/google-collections-1.0-rc2.jar' > to classloader > > May 4,

Re: Problem with pdf, upgrading Cell

2010-05-04 Thread Praveen Agrawal
s.java:152) > Thanks Grant for investigating the problem! > Marc > > > From: sagar...@opentext.com > > To: solr-user@lucene.apache.org > > Date: Tue, 4 May 2010 13:10:25 +0530 > > Subject: RE: Problem with pdf, upgrading Cell > > > > Yes, Grant. You are right.

RE: Problem with pdf, upgrading Cell

2010-05-04 Thread Sandhya Agarwal
-Original Message- From: Sandhya Agarwal [mailto:sagar...@opentext.com] Sent: Tuesday, May 04, 2010 1:10 PM To: solr-user@lucene.apache.org Subject: RE: Problem with pdf, upgrading Cell Yes, Grant. You are right. Copying the Tika libraries to the Solr webapp solved the issue and the content

RE: Problem with pdf, upgrading Cell

2010-05-04 Thread Marc Ghorayeb
pache.org > Date: Tue, 4 May 2010 13:10:25 +0530 > Subject: RE: Problem with pdf, upgrading Cell > > Yes, Grant. You are right. Copying the tika libraries to solr webapp, solved > the issue and the content extraction works fine now. > > Thanks, > Sandhya > >

RE: Problem with pdf, upgrading Cell

2010-05-04 Thread Sandhya Agarwal
Subject: RE: Problem with pdf, upgrading Cell Hello, But I see that the libraries are being loaded : INFO: Adding specified lib dirs to ClassLoader May 4, 2010 12:49:59 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader INFO: Adding 'file:/C:/apache-solr-1.4.0/co

RE: Problem with pdf, upgrading Cell

2010-05-04 Thread Sandhya Agarwal
:/apache-solr-1.4.0/contrib/clustering/lib/jackson-mapper-asl-0.9.9-6.jar' to classloader May 4, 2010 12:51:52 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader INFO: Adding 'file:/C:/apache-solr-1.4.0/contrib/clustering/lib/log4j-1.2.14.jar' to classloader Thanks, Sandhya

Re: Problem with pdf, upgrading Cell

2010-05-03 Thread Grant Ingersoll
rote: >> >>> >>> Hi, >>> Grant, i confirm what Praveen has said, any PDF i try does not work with >>> the new Tika and SVN versions. :( >>> Marc >>> >>>> From: sagar...@opentext.com >>>> To: solr-user@lucene.apach

Re: Problem with pdf, upgrading Cell

2010-05-03 Thread Grant Ingersoll
d SVN versions. :( >> Marc >> >>> From: sagar...@opentext.com >>> To: solr-user@lucene.apache.org >>> Date: Mon, 3 May 2010 13:05:24 +0530 >>> Subject: RE: Problem with pdf, upgrading Cell >>> >>> Hello, >>> >>>

Re: Problem with pdf, upgrading Cell

2010-05-03 Thread Grant Ingersoll
.org >> Date: Mon, 3 May 2010 13:05:24 +0530 >> Subject: RE: Problem with pdf, upgrading Cell >> >> Hello, >> >> Please let me know if anybody figured out a way out of this issue. >> >> Thanks, >> Sandhya >> >> -Original Messa

RE: Problem with pdf, upgrading Cell

2010-05-03 Thread Marc Ghorayeb
Hi Grant, I confirm what Praveen has said: any PDF I try does not work with the new Tika and SVN versions. :( Marc > From: sagar...@opentext.com > To: solr-user@lucene.apache.org > Date: Mon, 3 May 2010 13:05:24 +0530 > Subject: RE: Problem with pdf, upgrading Cell > > Hell

RE: Problem with pdf, upgrading Cell

2010-05-03 Thread Sandhya Agarwal
Hello, Please let me know if anybody figured out a way out of this issue. Thanks, Sandhya -Original Message- From: Praveen Agrawal [mailto:pkal...@gmail.com] Sent: Friday, April 30, 2010 11:14 PM To: solr-user@lucene.apache.org Subject: Re: Problem with pdf, upgrading Cell Grant, You

Re: Problem with pdf, upgrading Cell

2010-04-30 Thread Praveen Agrawal
Grant, You can try any of the sample PDFs that come in the /docs folder of the Solr 1.4 distribution. I tried 'Installing Solr in Tomcat.pdf', 'index.pdf', etc. Only metadata (i.e. stream_size and content_type, apart from my own literals) is indexed, and the content is missing. On Fri, Apr 30, 2010 at 8:52 PM, Gran

Re: Problem with pdf, upgrading Cell

2010-04-30 Thread Grant Ingersoll
Praveen and Marc, Can you share the PDF (feel free to email my private email) that fails in Solr? Thanks, Grant On Apr 30, 2010, at 7:55 AM, Marc Ghorayeb wrote: > > Hi > Nope i didn't get it to work... Just like you, command line version of tika > extracts correctly the content, but once in

Re: Problem with pdf, upgrading Cell

2010-04-30 Thread Marc Ghorayeb
Hi, Nope, I didn't get it to work... Just like you, the command-line version of Tika extracts the content correctly, but once included in Solr, no content is extracted. What I have tried so far: - Updating the Tika libraries inside the Solr 1.4 public version, no luck there. - Downloading the latest SVN v

Re: Problem with pdf, upgrading Cell

2010-04-30 Thread Praveen Agrawal
nside the ExtractingDocumentLoader class > does not receive the ContentStream (it is set to null...). Maybe I should > send this to the developer mailing list? > > Marc > > > >> From: dekay...@hotmail.com > >> To: solr-user@lucene.apache.org > >> Subject: RE:

Re: Problem with pdf, upgrading Cell

2010-04-30 Thread Grant Ingersoll
> the developer mailing list? > Marc > >> From: dekay...@hotmail.com >> To: solr-user@lucene.apache.org >> Subject: RE: Problem with pdf, upgrading Cell >> Date: Fri, 23 Apr 2010 16:03:28 +0200 >> >> >> Seems like i'm not the only o

RE: Problem with pdf, upgrading Cell

2010-04-30 Thread Sandhya Agarwal
: RE: Problem with pdf, upgrading Cell Marc, did you manage to get it to work? I tried the latest Tika (0.7) from the command line and it successfully parsed the earlier problematic PDF. Then I replaced the Tika-related jars in the Solr 1.4 contrib/extraction/lib folder with the new ones. Now it doesn't throw any exception, but

RE: Problem with pdf, upgrading Cell

2010-04-30 Thread pk
Marc, did you manage to get it to work? I tried the latest Tika (0.7) from the command line and it successfully parsed the earlier problematic PDF. Then I replaced the Tika-related jars in the Solr 1.4 contrib/extraction/lib folder with the new ones. Now it doesn't throw any exception, but there is no content extraction, only metadata!

RE: Problem with pdf, upgrading Cell

2010-04-26 Thread Marc Ghorayeb
mail.com > To: solr-user@lucene.apache.org > Subject: RE: Problem with pdf, upgrading Cell > Date: Fri, 23 Apr 2010 16:03:28 +0200 > > > Seems like i'm not the only one with this "no extraction" > problem: http://www.mail-archive.com/solr-user@lucene.apache.org/

Re: Problem with pdf, upgrading Cell

2010-04-23 Thread Paul Borgermans
On Fri, Apr 23, 2010 at 5:48 PM, Marc Ghorayeb wrote: > > Yes, the only log i can actually get is the one in the command console from > windows and there are no errors there ... > Here are the last lines when i upload a pdf to the update/extract url: I am pretty sure it is the tika itself that

RE: Problem with pdf, upgrading Cell

2010-04-23 Thread Marc Ghorayeb
ive_evictions=0} Apr 23, 2010 5:47:14 PM org.apache.solr.update.processor.LogUpdateProcessor finish INFO: {optimize=} 0 46 Apr 23, 2010 5:47:14 PM org.apache.solr.core.SolrCore execute INFO: [] webapp=/solr path=/update params={optimize=true&waitSearcher=true&maxSegments=1&waitFlush=true&a

Re: Problem with pdf, upgrading Cell

2010-04-23 Thread Otis Gospodnetic
Sent: Fri, April 23, 2010 9:12:39 AM > Subject: RE: Problem with pdf, upgrading Cell > > I'm launching it with the start.jar utility, and there doesn't seem to be > anything weird inside the console when i upload a pdf. Is there a way to > output > the console to a l

RE: Problem with pdf, upgrading Cell

2010-04-23 Thread Marc Ghorayeb
Seems like I'm not the only one with this "no extraction" problem: http://www.mail-archive.com/solr-user@lucene.apache.org/msg33609.html Apparently he tried the same thing, building from the trunk and indexing a PDF, and no extraction occurred... Strange. Marc G.

RE: Problem with pdf, upgrading Cell

2010-04-23 Thread Marc Ghorayeb
kay...@hotmail.com > To: solr-user@lucene.apache.org > Subject: RE: Problem with pdf, upgrading Cell > Date: Fri, 23 Apr 2010 15:12:39 +0200 > > > I'm launching it with the start.jar utility, and there doesn't seem to be > anything weird inside the console when i uplo

RE: Problem with pdf, upgrading Cell

2010-04-23 Thread Marc Ghorayeb
.1" 200 41 127.0.0.1 - - [23/Apr/2010:13:07:05 +] "GET /solr/core0/admin/schema.jsp HTTP/1.1" 200 26395 127.0.0.1 - - [23/Apr/2010:13:07:05 +] "GET /solr/core0/admin/jquery-1.2.3.min.js HTTP/1.1" 304 0 I don't think that's going to help much :) >

Re: Problem with pdf, upgrading Cell

2010-04-23 Thread Otis Gospodnetic
Marc, got anything in your logs? Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message > From: Marc Ghorayeb > To: solr-user@lucene.apache.org > Sent: Fri, April 23, 2010 8:42:53 AM > Subject: Probl