: What's a GA release?
http://en.wikipedia.org/wiki/Software_release_life_cycle#General_availability
-Hoss
--
http://lucenerevolution.org/ ... October 7-8, Boston
http://bit.ly/stump-hoss ... Stump The Chump!
What's a GA release?
Dennis Gearon
Signature Warning
EARTH has a Right To Life,
otherwise we all die.
Read 'Hot, Flat, and Crowded'
Laugh at http://www.yert.com/film.php
--- On Fri, 9/24/10, Lance Norskog wrote:
> From: Lance Norskog
> Subject: Re
The TikaEntityProcessor is the class in the DIH that calls the Tika
libraries.
TikaEntityProcessor is not in Solr 1.4 or 1.4.1. It is in the trunk and
the 3.x branch.
I have set it up from the 3.x branch. I discovered that the
"DefaultParser" does not work, and you have to explicitly name the
On 9/23/2010 6:52 AM, mehdi.es...@gmail.com wrote:
Hi,
I have exactly the same problem as the one you described at this link
http://lucene.472066.n3.nabble.com/Data-Import-Handler-Rich-Format-Documents-td905478.html
and I would like to ask whether you found a solution for it.
I started to have
On 6/28/2010 8:28 AM, Alexey Serba wrote:
Ok, I'm trying to integrate the TikaEntityProcessor as suggested. I'm using
Solr Version: 1.4.0 and getting the following error:
java.lang.ClassNotFoundException: Unable to load BinURLDataSource or
org.apache.solr.handler.dataimport.BinURLDataSource
It
> Ok, I'm trying to integrate the TikaEntityProcessor as suggested. I'm using
> Solr Version: 1.4.0 and getting the following error:
>
> java.lang.ClassNotFoundException: Unable to load BinURLDataSource or
> org.apache.solr.handler.dataimport.BinURLDataSource
It seems that DIH-Tika integration is
On 6/18/2010 2:42 PM, Chris Hostetter wrote:
: > I don't think DIH can do that, but who knows, let's see what others say.
: Looks like the ExtractingRequestHandler uses Tika as well. I might just use
: this but I'm wondering if there will be a large performance difference between
: using it to
You are right. It seems TikaEntityProcessor is exactly the tool you
need in this case.
Alex
On Sat, Jun 19, 2010 at 2:59 AM, Chris Hostetter
wrote:
> : I think you can use existing ExtractingRequestHandler to do the job,
> : i.e. add child entity to your DIH metadata
>
> why would you do this in
: I think you can use existing ExtractingRequestHandler to do the job,
: i.e. add child entity to your DIH metadata
why would you do this instead of using the TikaEntityProcessor as i
already suggested in my earlier mail?
-Hoss
I think you can use existing ExtractingRequestHandler to do the job,
i.e. add child entity to your DIH metadata
http://localhost:8983/solr/update/extract?extractOnly=true&wt=xml&indent=on&stream.url=${metadata.url}";
dataSource="solr">
That's not a working example, just basic
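
For reference, here is a minimal Python sketch of how the extract-only request URL quoted above could be assembled for a single document. The host, port, and query parameters are taken from the URL in the message; the function name and the example document URL are hypothetical, and the sketch only builds the URL without sending anything. It also assumes the Solr instance accepts remote streams via stream.url, which depends on the server configuration.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Endpoint as quoted in the message above; adjust for your own install.
SOLR_EXTRACT = "http://localhost:8983/solr/update/extract"


def extract_only_url(document_url, base=SOLR_EXTRACT):
    """Build an extractOnly request URL for one remote document.

    Hypothetical helper: it only assembles the query string, so the
    document URL is percent-encoded instead of being pasted in raw.
    """
    params = {
        "extractOnly": "true",   # return the extracted text, do not index it
        "wt": "xml",
        "indent": "on",
        "stream.url": document_url,
    }
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder document URL, for illustration only.
    print(extract_only_url("http://example.com/docs/report.pdf"))
```

Encoding the document URL matters once it contains spaces, ampersands, or its own query string, which a raw ${metadata.url} substitution would not handle.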
On 6/18/2010 2:42 PM, Chris Hostetter wrote:
: > I don't think DIH can do that, but who knows, let's see what others say.
: Looks like the ExtractingRequestHandler uses Tika as well. I might just use
: this but I'm wondering if there will be a large performance difference between
: using it to
On Fri, Jun 18, 2010 at 2:42 PM, Chris Hostetter
wrote:
> I'm confused ... You're using DIH, and some of your fields are URLs to
> documents that you want to parse with Tika?
>
> Why would you need a custom Transformer?
Yeah, I can definitely vouch that DIH can handle this without
additional codi
: > I don't think DIH can do that, but who knows, let's see what others say.
: Looks like the ExtractingRequestHandler uses Tika as well. I might just use
: this but I'm wondering if there will be a large performance difference between
: using it to batch content in over rolling my own Transform
On 6/18/2010 11:24 AM, Otis Gospodnetic wrote:
Tod,
I don't think DIH can do that, but who knows, let's see what others say.
Yes, Nutch uses TIKA, too.
Otis
Looks like the ExtractingRequestHandler uses Tika as well. I might just
use this but I'm wondering if there will be a large performan
> To: solr-user@lucene.apache.org
> Sent: Fri, June 18, 2010 10:20:34 AM
> Subject: Re: Data Import Handler Rich Format Documents
>
> On 6/18/2010 9:12 AM, Otis Gospodnetic wrote:
> Tod,
>
> You
> didn't mention Tika, which makes me think you are not aware of it...
> You
On 6/18/2010 9:12 AM, Otis Gospodnetic wrote:
Tod,
You didn't mention Tika, which makes me think you are not aware of it...
You could implement a custom Transformer that uses Tika to perform rich doc
text extraction, just like ExtractingRequestHandler does it (see
http://wiki.apache.org/solr/E
Tod,
You didn't mention Tika, which makes me think you are not aware of it...
You could implement a custom Transformer that uses Tika to perform rich doc
text extraction, just like ExtractingRequestHandler does it (see
http://wiki.apache.org/solr/ExtractingRequestHandler ). Maybe you could even