Fwd: request about snippets

2012-04-07 Thread alessio crisantemi
Dear all, I configured Nutch (1.4) to work with Solr (1.4.1), and I can crawl and index my website successfully. My only problem is with the search results: in every result, the snippet contains a row with a string listing all the categories of my website. I attached a screen s

Re: nutch log

2012-03-04 Thread alessio crisantemi
Thanks Koji, but I don't understand what I should do. On 4 March 2012 06:31, Koji Sekiguchi wrote: > It is not solr error. Consult nutch/hadoop mailing list. > > > koji > -- > Query Log Visualizer for Apache Solr > http://soleami.com/ > > (12/03/04

Re: nutch log

2012-03-03 Thread alessio crisantemi
lRunner.java:65) at org.apache.nutch.crawl.Crawl.main(Crawl.java:55) Why, in your opinion? Thanks again, alessio On 3 March 2012 16:43, Koji Sekiguchi wrote: > (12/03/04 0:09), alessio crisantemi wrote: > >> That is true. >> This is the Solr problem: >> mar 03, 2012 12:08:04 PM org.apa

Re: nutch log

2012-03-03 Thread alessio crisantemi
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) What does this mean? Thanks, a. On 3 March 2012 14:40, Koji Sekiguchi wrote: > (12/03/03 20:32), alessio crisantemi wr

nutch log

2012-03-03 Thread alessio crisantemi
This is my Nutch log after configuring it for Solr indexing: 2012-03-03 12:20:25,520 INFO solr.SolrMappingReader - source: content dest: content 2012-03-03 12:20:25,520 INFO solr.SolrMappingReader - source: site dest: site 2012-03-03 12:20:25,520 INFO solr.SolrMappingReader - source: title dest: ti

Re: nutch and solr

2012-02-27 Thread alessio crisantemi
? Thanks, alessio On 25 February 2012 10:52, alessio crisantemi < alessio.crisant...@gmail.com> wrote: > This is the problem! > Because in my root there is a url! > > Here is my step-by-step configuration of Nutch: > (I use Cygwin because I work on Windows) >

Re: nutch and solr

2012-02-25 Thread alessio crisantemi
This is the problem! Because in my root there is a url! Here is my step-by-step configuration of Nutch: (I use Cygwin because I work on Windows) *1. Extract the Nutch package* *2. Configure Solr* (*Copy the provided Nutch schema from directory apache-nutch-1.0/conf to directory apache-solr-1.3

Re: nutch and solr

2012-02-22 Thread alessio crisantemi
Thanks for your reply, but it doesn't work. I get the same message: can't convert empty path. In addition: cannot find class org.apache.nutch.crawl.injector .. On 22 February 2012 06:14, tamanjit.bin...@yahoo.co.in < tamanjit.bin...@yahoo.co.in> wrote: > Try this command. > > bin/nutch crawl

nutch and solr

2012-02-21 Thread alessio crisantemi
I am trying to configure Nutch (1.4) with my Solr 3.2. But when I run the command "bin/nutch inject crawl/crawldb urls" it doesn't work, and it replies with "can't convert a empty path". Why, in your opinion? Thanks, a.
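The preview does not establish the cause of the error. As a rough pre-flight sketch only, assuming the seed directory is the relative path "urls" from the quoted command and that an empty or missing argument could be involved, one could check the inputs before invoking inject. The helper names `seed_urls` and `inject_command` are hypothetical and not part of Nutch.

```python
from pathlib import Path


def seed_urls(urls_dir):
    """Return non-empty, non-comment lines from every file in urls_dir.

    Returns an empty list when the directory does not exist.
    """
    root = Path(urls_dir)
    if not root.is_dir():
        return []
    seeds = []
    for path in sorted(root.iterdir()):
        if not path.is_file():
            continue
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            if line and not line.startswith("#"):
                seeds.append(line)
    return seeds


def inject_command(crawldb="crawl/crawldb", urls_dir="urls"):
    """Build the argument list for the inject command quoted in the message.

    Raises ValueError if either path argument is empty, or if the seed
    directory holds no URLs (both checks are assumptions, not Nutch rules).
    """
    if not crawldb or not urls_dir:
        raise ValueError("crawldb and urls_dir must be non-empty paths")
    if not seed_urls(urls_dir):
        raise ValueError(f"no seed URLs found under {urls_dir!r}")
    return ["bin/nutch", "inject", crawldb, urls_dir]
```

The returned list can be passed to `subprocess.run` from the Nutch installation directory; the check only rules out the simplest input mistakes.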

solr and tika

2012-02-20 Thread alessio crisantemi
Hi all, In a new installation of Solr (1.4) I configured Tika for indexing rich documents. I commit my files, and after indexing I can find them with an HTTP query "http://localhost:8983/solr/select?q=attr_content:parola" (to search for the word 'parola'), and I find the committed text. But if I
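The query in the message can be assembled programmatically. This is a minimal sketch, assuming the base URL and the field name `attr_content` from the message; the helper name `select_url` and the `rows` default are illustrative choices.

```python
from urllib.parse import urlencode


def select_url(base, field, term, rows=10):
    """Build a Solr /select URL that searches one field for one term."""
    params = {"q": f"{field}:{term}", "rows": rows}
    # urlencode percent-encodes the colon in the q parameter (":" -> "%3A").
    return f"{base.rstrip('/')}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(select_url("http://localhost:8983/solr", "attr_content", "parola"))
```

Encoding the parameters this way keeps the URL valid when the search term contains spaces or other reserved characters.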

Re: problem to indexing pdf directory

2012-02-17 Thread alessio crisantemi
k, in your opinion, or do you see an error in this code? Thanks, alessio On 17 February 2012 21:29, Erick Erickson wrote: > Sorry, my error! In that case you *do* have to do some fiddling to get > it all to work. > > Good Luck! > Erick > > On Fri, Feb 17, 2012 at 3:27

Re: problem to indexing pdf directory

2012-02-17 Thread alessio crisantemi
FileListEntityProcessor" recursive="true" > rootEntity="false"> > processor="TikaEntityProcessor" url="${sd.fileAbsolutePath}"> > > > > > > > > > > >
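The archive stripped the XML tags from the fragment above, so the original configuration cannot be recovered. As a hedged sketch only, the following generates a DIH configuration of the same general shape: an outer FileListEntityProcessor entity named "sd" feeding a nested TikaEntityProcessor entity. The base directory, the file-name pattern, the inner entity name "tika", and the target Solr field "text" are assumptions for illustration, not values from the thread.

```python
import xml.etree.ElementTree as ET


def build_data_config(base_dir, file_pattern=r".*\.pdf"):
    """Return a DIH data-config document as a string.

    The outer entity lists files under base_dir; the inner entity
    parses each listed file with Tika and maps its text to a field.
    """
    root = ET.Element("dataConfig")
    # Binary data source so the Tika entity can read file contents.
    ET.SubElement(root, "dataSource", type="BinFileDataSource")
    document = ET.SubElement(root, "document")
    files = ET.SubElement(
        document, "entity",
        name="sd",
        processor="FileListEntityProcessor",
        baseDir=base_dir,
        fileName=file_pattern,   # regular expression matched against file names
        recursive="true",
        rootEntity="false",      # the nested entity produces the documents
    )
    tika = ET.SubElement(
        files, "entity",
        name="tika",             # hypothetical entity name
        processor="TikaEntityProcessor",
        url="${sd.fileAbsolutePath}",
        format="text",
    )
    # Target field name "text" is an assumption; it must exist in the schema.
    ET.SubElement(tika, "field", column="text", name="text")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_data_config("c:/myfile"))
```

Generating the file from code keeps the nesting well-formed; the output still has to be saved as data-config.xml and referenced from the request handler configuration.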

Re: problem to indexing pdf directory

2012-02-17 Thread alessio crisantemi
ry 2012 21:37, alessio crisantemi > wrote: > > here the log: > > > > > > org.apache.solr.handler.dataimport.DataImporter doFullImport > > Grave: Full Import failed > > org.apache.solr.handler.dataimport.DataImportHandlerException: 'baseDir' > is

Re: problem to indexing pdf directory

2012-02-16 Thread alessio crisantemi
Yes, I read it, but I don't know the cause. Also, I work on Windows, so I configured Tika and Solr manually because I don't have Maven... 2012/2/16 Gora Mohanty > On 16 February 2012 21:37, alessio crisantemi > wrote:

Re: problem to indexing pdf directory

2012-02-16 Thread alessio crisantemi
e.AbstractProtocol destroy Informazioni: Destroying ProtocolHandler ["ajp-bio-8009"] 2012/2/16 alessio crisantemi > Yes, but if I use TikaEntityProcessor, the result of my full-import is > > 0 > 1 > > 0 > > Indexing failed. Rolled back all changes. > > >

Re: problem to indexing pdf directory

2012-02-16 Thread alessio crisantemi
Yes, but if I use TikaEntityProcessor, the result of my full-import is 0 1 0 Indexing failed. Rolled back all changes. 2012/2/16 alessio crisantemi > Hi all, > I have a problem configuring PDF indexing from a directory in my Solr > with DIH: > > with t

problem to indexing pdf directory

2012-02-16 Thread alessio crisantemi
Hi all, I have a problem configuring PDF indexing from a directory in my Solr with DIH: with this data-config I obtain this result: full-import idle - 0:0:2.44 0 43 0 2012-02-12 19:06:00 Indexing failed. Rolled

Re: indexing with DIH (and with problems)

2012-02-12 Thread alessio crisantemi
*2012-02-12 19:06:00* *Indexing failed. Rolled back all changes.* *2012-02-12 19:06:00* Suggestions? Thank you, a. 2012/2/12 alessio crisantemi > Sorry for the confusion: > > I forgot a part of the code: > url="${f.fileAbsolutePath}" format="text"> >

Re: indexing with DIH (and with problems)

2012-02-12 Thread alessio crisantemi
* *1* *0* *0* *2012-02-12 18:20:49* *Indexing failed. Rolled back all changes.* *2012-02-12 18:20:49* Help! Thank you, alessio 2012/2/12 alessio crisantemi > Hi, > Now my DIH runs, but maybe only partly. > > I am indexing a directory containing 43 PDF files. > Below is the reply of m

Re: indexing with DIH (and with problems)

2012-02-12 Thread alessio crisantemi
lease, help me! Thank you, alessio PS: my data-config.xml file follows; maybe the problem is here.. 2012/2/12 alessio crisantemi > Dear Shawn, > thanks for your reply. > But the contrib directory of my Solr 3.5 do

Re: indexing with DIH (and with problems)

2012-02-12 Thread alessio crisantemi
ndler-extras-3.5.jar, so WITHOUT 'snapshot'. Why? Where can I download these jar files? a. 2012/2/12 Shawn Heisey > On 2/11/2012 4:33 AM, alessio crisantemi wrote: > >> Dear all, >> I updated my Solr to version 3.5, but now I have this problem: >> >> Grave:

Re: indexing with DIH (and with problems)

2012-02-11 Thread alessio crisantemi
Informazioni: end_rollback I don't know.. Suggestions? Best, a. 2012/2/10 Gora Mohanty > On 10 February 2012 04:15, alessio crisantemi > wrote: > > Hi all, > > I would like to index in Solr the PDF files in my directory > c:\myfile\ > > > > So, I add

Re: indexing with DIH (and with problems)

2012-02-10 Thread alessio crisantemi
un(Unknown Source) . Why? Thank you 2012/2/10 Gora Mohanty > On 10 February 2012 04:15, alessio crisantemi > wrote: > > Hi all, > > I would like to index in Solr the PDF files in my directory > c:\myfile\ > > > > So, I add on my solr/conf directory the

Re: indexing with DIH (and with problems)

2012-02-10 Thread alessio crisantemi
With rootEntity="false" it's the same.. Help! 2012/2/10 Chantal Ackermann > > > On Thu, 2012-02-09 at 23:45 +0100, alessio crisantemi wrote: > > Hi all, > > I would like to index in Solr the PDF files in my directory > c:\myfile\ > > > >

Re: indexing with DIH (and with problems)

2012-02-10 Thread alessio crisantemi
I have problems with the full-import query: no results. I will search the log files and then write again.. Thanks, a. 2012/2/9 alessio crisantemi > Hi all, > I would like to index in Solr the PDF files in my directory > c:\myfile\ > > So, I added to my solr/conf directory the file data-

indexing with DIH (and with problems)

2012-02-09 Thread alessio crisantemi
Hi all, I would like to index in Solr the PDF files in my directory c:\myfile\ So, I added to my solr/conf directory the file data-config.xml like the following: Before that, I added this part to solr-config.xml: c:\solr\conf\data-config.xml But this is the r

Re: indexing data on solr

2012-02-07 Thread alessio crisantemi
OK, I will try. But I think: if I index a zip archive containing some PDF files and then run a query on Solr, I see only the list of the PDF titles in my archive, and it can't search inside the single documents.. I read in the Tika documentation that "Package formats can contain multiple separate docu

Re: nutch in solr

2012-02-05 Thread alessio crisantemi
geek4377/nutch/commit/c66bf35ff4f86393413621b3b889b1c78281df4d > > to see how to upgrade the solr version in nutch, the above example > replaces solr 1.4.0 with 3.1.0. > > > > > On Sun, Feb 5, 2012 at 11:02 PM, alessio crisantemi > wrote: > > If I look at the Solr and Nutch libs, I find: > >

Re: nutch in solr

2012-02-05 Thread alessio crisantemi
> version on server, > can you post the versions for both nutch and solr? > > > On Sun, Feb 5, 2012 at 10:24 PM, alessio crisantemi > wrote: > > No, everything runs on port 8983. > > .. > > > > 2012/2/5 Matthew Parker > > > >> Doesn't tomcat run on

Re: nutch in solr

2012-02-05 Thread alessio crisantemi
No, everything runs on port 8983. .. 2012/2/5 Matthew Parker > Doesn't tomcat run on port 8080, and not port 8983? Or did you change the > tomcat's default port to 8983? > On Feb 5, 2012 5:17 AM, "alessio crisantemi" > > wrote: > > > Hi All, > > I ha

nutch in solr

2012-02-05 Thread alessio crisantemi
Hi All, I have some problems integrating Nutch with Solr and Tomcat. I followed the Nutch tutorial for the integration, and now I can crawl a website: everything works. But if I try the Solr integration, I can't index into Solr. Below is the Nutch output after the command: bin/nutch crawl urls -solr htt

problem to add Solr data with Nutch

2012-01-29 Thread alessio crisantemi
Hi all, I built Nutch on Solr (versions 1.4 and 1.4.1) on Windows. I can parse and crawl a website, but when I try to index this data with Solr, I receive an error.. This is my command: bin/nutch crawl urls -solr http://localhost:8983/solr/ -depth 3 -topN 5 and this is (the final part of) t

Re: Tika0.10 language identifier in Solr3.5.0

2012-01-19 Thread Alessio Crisantemi
Dear all, how can I index a complete directory with many PDF files in Solr? Alessio Crisantemi Direttore Responsabile Gioconews.it www.gioconews.it t: (+39)0744461296 f: (+39)0744461362 bb: (+39)3477939054 e: alessio.crisant...@gioconews.it

Re: Solr - Tomcat new versions

2012-01-17 Thread Alessio Crisantemi
Can you help me? Best, a. --- -Original Message- From: Luca Cavanna Sent: Tuesday, January 17, 2012 10:16 AM To: solr-user@lucene.apache.org ; Alessio Crisantemi Subject: Re: Solr - Tomc

Re: Solr - Tomcat new versions

2012-01-17 Thread Alessio Crisantemi
Hi, I installed Apache Tomcat on Windows (Vista) and Solr. But I have a problem between Tomcat 7.0.23 and Solr 3.5. There is no problem if I install Solr 1.4.1 with the same version of Tomcat. (I checked it with both the binary and the source code installation of Tomcat, but the result is the same.) It's a bug, I think