Dear all,
I configured my Nutch (1.4) to work with Solr (1.4.1) and I can crawl and
index my website successfully.
I only have a problem with the search results.
In every result, the snippet contains a row with a string where I can read
all the categories of my website. I attached a screenshot.
Thanks Koji, but I don't understand how I can do that..
On 4 March 2012 at 06:31, Koji Sekiguchi wrote:
> It is not a Solr error. Consult the Nutch/Hadoop mailing list.
>
>
> koji
> --
> Query Log Visualizer for Apache Solr
> http://soleami.com/
>
> (12/03/04
lRunner.java:65)
at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)
why, in your opinion?
thanks again
alessio
On 3 March 2012 at 16:43, Koji Sekiguchi wrote:
> (12/03/04 0:09), alessio crisantemi wrote:
>
>> It is true.
>> This is the Solr problem:
>> mar 03, 2012 12:08:04 PM org.apa
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
What does it mean?
thanks
a.
On 3 March 2012 at 14:40, Koji Sekiguchi wrote:
> (12/03/03 20:32), alessio crisantemi wr
this is my Nutch log after configuring it for Solr indexing:
2012-03-03 12:20:25,520 INFO solr.SolrMappingReader - source: content
dest: content
2012-03-03 12:20:25,520 INFO solr.SolrMappingReader - source: site dest:
site
2012-03-03 12:20:25,520 INFO solr.SolrMappingReader - source: title dest:
ti
?
thanks,
alessio
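For reference, those source/dest pairs are the field mappings that Nutch reads from
conf/solrindex-mapping.xml. A minimal sketch of that file, assuming only the fields
visible in the log above (the real file maps a few more), looks roughly like this:

<mapping>
  <fields>
    <!-- Nutch document field (source) mapped to the Solr schema field (dest) -->
    <field dest="content" source="content"/>
    <field dest="site" source="site"/>
    <field dest="title" source="title"/>
  </fields>
  <uniqueKey>id</uniqueKey>
</mapping>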
On 25 February 2012 at 10:52, alessio crisantemi <alessio.crisant...@gmail.com> wrote:
> this is the problem!
> Because in my root there is a URL!
>
> Here is my step-by-step configuration of Nutch:
> (I use Cygwin because I work on Windows)
>
this is the problem!
Because in my root there is a URL!
Here is my step-by-step configuration of Nutch:
(I use Cygwin because I work on Windows)
*1. Extract the Nutch package*
*2. Configure Solr*
(*Copy the provided Nutch schema from directory apache-nutch-1.0/conf to
directory apache-solr-1.3
thanks for your reply, but it doesn't work.
I get the same message: can't convert empty path
and also: cannot find class org.apache.nutch.crawl.injector
..
On 22 February 2012 at 06:14, tamanjit.bin...@yahoo.co.in <tamanjit.bin...@yahoo.co.in> wrote:
> Try this command.
>
> bin/nutch crawl
I tried to configure Nutch (1.4) with my Solr 3.2,
but when I try the crawl command
"bin/nutch inject crawl/crawldb urls"
it doesn't work and replies with "can't convert a empty path".
Why, in your opinion?
tx
a.
Hi all,
In a new installation of Solr (1.4) I configured Tika for indexing rich
documents.
So I commit my files, and after indexing I can find them with an HTTP query
"http://localhost:8983/solr/select?q=attr_content:parola" (to search for the
word 'parola'), and I find the committed text.
but if I
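For context, the attr_content field in the query above normally comes from the
ExtractingRequestHandler (Solr Cell), which can prefix fields extracted by Tika
that are unknown to the schema. A minimal solrconfig.xml sketch, with the
uprefix value assumed rather than taken from this setup:

<requestHandler name="/update/extract"
                class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <!-- fields extracted by Tika but not in the schema get this prefix, e.g. attr_content -->
    <str name="uprefix">attr_</str>
  </lst>
</requestHandler>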
k, in your opinion, or do you see an error in this code?
thanks,
alessio
On 17 February 2012 at 21:29, Erick Erickson wrote:
> Sorry, my error! In that case you *do* have to do some fiddling to get
> it all to work.
>
> Good Luck!
> Erick
>
> On Fri, Feb 17, 2012 at 3:27
FileListEntityProcessor" recursive="true"
> rootEntity="false">
> processor="TikaEntityProcessor" url="${sd.fileAbsolutePath}">
>
> On 16 February 2012 21:37, alessio crisantemi
> wrote:
> > here is the log:
> >
> >
> > org.apache.solr.handler.dataimport.DataImporter doFullImport
> > SEVERE: Full Import failed
> > org.apache.solr.handler.dataimport.DataImportHandlerException: 'baseDir'
> is
Yes, I read it, but I don't know the cause.
What's more, I work on Windows, so I configured Tika and Solr manually
because I don't have Maven...
2012/2/16 Gora Mohanty
> On 16 February 2012 21:37, alessio crisantemi
> wrote:
e.AbstractProtocol destroy
INFO: Destroying ProtocolHandler ["ajp-bio-8009"]
2012/2/16 alessio crisantemi
> yes, but if I use TikaEntityProcessor the result of my full-import is
>
> 0
> 1
>
> 0
>
> Indexing failed. Rolled back all changes.
>
>
>
yes, but if I use TikaEntityProcessor the result of my full-import is
0
1
0
Indexing failed. Rolled back all changes.
2012/2/16 alessio crisantemi
> Hi all,
> I have a problem configuring PDF indexing from a directory in my Solr
> with DIH:
>
> with t
Hi all,
I have a problem configuring PDF indexing from a directory in my Solr
with DIH:
with this data-config
I obtain this result:
full-import
idle
-
0:0:2.44
0
43
0
2012-02-12 19:06:00
Indexing failed. Rolled
*2012-02-12 19:06:00*
*Indexing failed. Rolled back all changes.*
*2012-02-12 19:06:00*
suggestions?
thank you
a.
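For comparison, a minimal data-config.xml sketch for this kind of setup, assuming
the PDFs live under c:\myfile\ and that the schema has text and title fields (the
entity name "f" matches the ${f.fileAbsolutePath} reference that appears elsewhere
in the thread; everything else here is an example, not the original file):

<dataConfig>
  <dataSource type="BinFileDataSource" name="bin"/>
  <document>
    <!-- the outer entity lists the files, the inner entity parses each one with Tika -->
    <entity name="f" processor="FileListEntityProcessor"
            baseDir="c:/myfile" fileName=".*\.pdf"
            recursive="true" rootEntity="false" dataSource="null">
      <entity name="tika" processor="TikaEntityProcessor"
              url="${f.fileAbsolutePath}" format="text" dataSource="bin">
        <field column="text" name="text"/>
        <field column="title" name="title"/>
      </entity>
    </entity>
  </document>
</dataConfig>

Note that baseDir is mandatory for FileListEntityProcessor, which matches the
'baseDir' error reported in the thread.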
2012/2/12 alessio crisantemi
> sorry for the confusion:
>
> I forgot a part of the code:
> url="${f.fileAbsolutePath}" format="text">
>
*
*1*
*0*
*0*
*2012-02-12 18:20:49*
*Indexing failed. Rolled back all changes.*
*2012-02-12 18:20:49*
help!
ty
alessio
2012/2/12 alessio crisantemi
> Hi,
> Now my DIH runs, but maybe only partly.
>
> I am indexing a directory containing 43 PDF files.
> Here follows the reply of m
Please, help me!
thank you
alessio
PS: my data-config.xml file follows; maybe the problem is there..
2012/2/12 alessio crisantemi
> Dear Shawn,
> thanks for your reply.
> but my contrib directory of Solr 3.5 do
ndler-extras-3.5.jar, so, WITHOUT 'snapshot'.
Why? Where can I download these jar files?
a.
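For what it's worth, in a standard Solr 3.5 layout the DataImportHandler jars
(the extras jar is the one containing TikaEntityProcessor) and the Tika libraries
are usually loaded with <lib> directives in solrconfig.xml rather than downloaded
separately. A sketch, assuming the default example directory structure:

<!-- DataImportHandler core and extras jars from the dist directory -->
<lib dir="../../dist/" regex="apache-solr-dataimporthandler-.*\.jar" />
<!-- Tika and its dependencies shipped with the extraction contrib -->
<lib dir="../../contrib/extraction/lib" regex=".*\.jar" />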
2012/2/12 Shawn Heisey
> On 2/11/2012 4:33 AM, alessio crisantemi wrote:
>
>> dear all,
>> I updated my Solr to version 3.5 but now I have this problem:
>>
>> SEVERE:
INFO: end_rollback
I don't know..
suggestions?
best
a.
2012/2/10 Gora Mohanty
> On 10 February 2012 04:15, alessio crisantemi
> wrote:
> > hi all,
> > I would like to index on Solr my PDF files, which are included in my directory
> > c:\myfile\
> >
> > so, I add
un(Unknown Source)
.
why?
Tu
2012/2/10 Gora Mohanty
> On 10 February 2012 04:15, alessio crisantemi
> wrote:
> > hi all,
> > I would like to index on Solr my PDF files, which are included in my directory
> > c:\myfile\
> >
> > so, I add on my solr/conf directory the
with rootEntity="false" it's the same..
help!
2012/2/10 Chantal Ackermann
>
>
> On Thu, 2012-02-09 at 23:45 +0100, alessio crisantemi wrote:
> > hi all,
> > I would like to index on Solr my PDF files, which are included in my directory
> > c:\myfile\
> >
> >
I have problems with the full-import query.
No results.
I will search in the log files and write again afterwards..
tx
a.
2012/2/9 alessio crisantemi
> hi all,
> I would like to index on Solr my PDF files, which are included in my directory
> c:\myfile\
>
> so, I add on my solr/conf directory the file data-
hi all,
I would like to index on Solr my PDF files, which are included in my directory c:\myfile\
so, I added in my solr/conf directory the file data-config.xml like the
following:
before that, I added this part into solrconfig.xml:
c:\solr\conf\data-config.xml
but this is the r
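The part added to solrconfig.xml is normally the DataImportHandler registration
pointing at that file; a sketch, assuming the path shown above:

<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">c:\solr\conf\data-config.xml</str>
  </lst>
</requestHandler>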
ok, I'll try.
But I think:
if I index a zip archive containing some PDF files and afterwards I run a
query on Solr, I only see the list of the PDF titles inside my archive, but it
can't search inside the single documents..
I read in the Tika documentation that "Package formats can contain multiple
separate docu
geek4377/nutch/commit/c66bf35ff4f86393413621b3b889b1c78281df4d
>
> to see how to upgrade the Solr version in Nutch; the above example
> replaces Solr 1.4.0 with 3.1.0.
>
>
>
>
> On Sun, Feb 5, 2012 at 11:02 PM, alessio crisantemi
> wrote:
> > if I look at the Solr and Nutch libs I find:
> >
> version on server,
> can you post the versions for both nutch and solr?
>
>
> On Sun, Feb 5, 2012 at 10:24 PM, alessio crisantemi
> wrote:
> > no, all run on port 8983.
> > ..
> >
> > 2012/2/5 Matthew Parker
> >
> >> Doesn't tomcat run on
no, all run on port 8983.
..
2012/2/5 Matthew Parker
> Doesn't tomcat run on port 8080, and not port 8983? Or did you change the
> tomcat's default port to 8983?
> On Feb 5, 2012 5:17 AM, "alessio crisantemi" wrote:
>
> > Hi All,
> > I ha
Hi All,
I have some problems with the integration of Nutch with Solr and Tomcat.
I followed the Nutch tutorial for the integration and now I can crawl a
website: everything works right.
But when I try the Solr integration, I can't index into Solr.
Below is the Nutch output after the command:
bin/nutch crawl urls -solr htt
Hi all,
I built Nutch on Solr (versions 1.4 and 1.4.1) on Windows.
I can parse and crawl a website, but when I try to index this data with
Solr, I receive an error..
this is my command:
bin/nutch crawl urls -solr http://localhost:8983/solr/ -depth 3 -topN 5
and
this is (the final part of) t
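One thing worth double-checking before running the crawl command: Nutch 1.x
refuses to fetch until http.agent.name is set in conf/nutch-site.xml. A minimal
sketch (the agent name value here is only an example):

<configuration>
  <property>
    <name>http.agent.name</name>
    <!-- any non-empty identifier for your crawler -->
    <value>MyNutchCrawler</value>
  </property>
</configuration>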
Dear all,
What can I do to index a complete directory with many PDF files on Solr?
Alessio Crisantemi
Direttore Responsabile Gioconews.it
www.gioconews.it
t: (+39)0744461296
f: (+39)0744461362
bb: (+39)3477939054
e: alessio.crisant...@gioconews.it
can you help me?
best,
a.
---
-Original Message-
From: Luca Cavanna
Sent: Tuesday, January 17, 2012 10:16 AM
To: solr-user@lucene.apache.org ; Alessio Crisantemi
Subject: Re: Solr - Tomc
Hi,
I installed Apache Tomcat on Windows (Vista) and Solr,
but I have a problem between Tomcat 7.0.23 and Solr 3.5.
There is no problem if I install Solr 1.4.1 with the same version of Tomcat.
(I checked it with both the binary and the source-code installation of Tomcat,
but the result is the same.)
It's a bug, I think