Afternoon,
I've got a large collection of documents which I'm attempting to add to
a Solr index using Tika via the ExtractingRequestHandler, but there are
a large number that it has problems with (PDFs, PPTX and XLS documents
mainly).
I've tried them with the most recent stand alone version of
Joe, can you open a Lucene JIRA issue for this?
I just glanced at the code and it looks like a bug to me.
On Sun, Feb 14, 2010 at 12:07 AM, Joe Calderon wrote:
> i ran into a problem while using the edgengramtokenfilter, it seems to
> report incorrect offsets when generating tokens, more specifi
I ran into a problem while using the EdgeNGramTokenFilter: it seems to
report incorrect offsets when generating tokens. More specifically, all
the tokens have offset 0 and the term length as start and end, which leads
to goofy highlighting behavior when creating edge grams for tokens
beyond the first one
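To make the offset complaint above concrete, here is a minimal Python sketch of
one plausible expectation: each edge gram's offsets are computed relative to the
start of its source token in the original text, not from 0. This is not Lucene's
code; the names whitespace_tokens and edge_ngrams and the tuple layout are
hypothetical and only serve the illustration.

```python
from typing import Iterator, List, Tuple

# (term, start_offset, end_offset), offsets relative to the original text
Token = Tuple[str, int, int]


def whitespace_tokens(text: str) -> Iterator[Token]:
    """Split on whitespace and remember where each word sits in the text."""
    pos = 0
    for word in text.split():
        start = text.index(word, pos)
        end = start + len(word)
        pos = end
        yield (word, start, end)


def edge_ngrams(text: str, min_gram: int = 1, max_gram: int = 3) -> List[Token]:
    """Front edge grams whose offsets are shifted by the token's start offset."""
    grams: List[Token] = []
    for term, start, _end in whitespace_tokens(text):
        for n in range(min_gram, min(max_gram, len(term)) + 1):
            # A gram of length n covers [start, start + n) in the original text.
            grams.append((term[:n], start, start + n))
    return grams
```

With offsets of this shape, grams from the second and later tokens point at
their own position in the text, so a highlighter would mark the right
characters instead of the beginning of the field.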
Hi All,
I'm still not clear on the concept of facet search.
I am trying to search using a facet query. I am indexing data from three
tables; the details of the tables are as follows:
table name: news
news_id
news_details
table name : article
article_id
article_details
table name: blog
blog_id
blog_details
I
take a look at PositionFilter
On Feb 13, 2010 1:14 AM, "Kevin Osborn" wrote:
Right now if I have the query model:(Nokia BH-212V), the parser turns this
into +(model:nokia model:"bh 212 v"). The problem is that I might have a
model called Nokia BH-212, so this is completely missed. In my case, I
I don't see a good way to fix this without some heuristic you'd have to
implement to munge your query. There's no good way for SOLR to intuit that
what you want is a partial match in this case. If you can create some
rules like "remove any single letters after numbers in the query"
that would be "good
It's really hard to help unless you include the analysis and query
schema for the field in question since so much of how things work
is dependent upon those choices. Also include the query you fire
at SOLR
I suspect that omitTermFreqAndPositions is irrelevant
Erick
On Fri, Feb 12, 2010 a
>
> I mean solr's Http caching. When testing from browser I usually disable it
> with
> in solrconfig.xml.
I have enabled , changed the title field to be the
same "text" type, cleared the index, restarted and re-indexed and - it works!
I then changed the title field back to the "text" type
> The title and long_description are almost the same. The
> only difference is that I've taken out the
> "solr.SnowballPorterFilterFactory" filter, so that the
> titles are not stemmed. I actually as part of the test
> before writing to this list I tried creating the type that
> has only the standa
>
> Interesting there is a parameter (hl.requireFieldMatch) about this but
> default
> value is false.
Interesting indeed! I have tried setting hl.requireFieldMatch manually to false
before - but no luck.
>
> Are you using some default highlighting parameters defined in solrconfig.xml?
> Y
Hi Jan,
I believe Yonik already answered a similar question; it is simply a matter of
schema configuration / a specific query type, but I can't find that post...
Something like [product:word1 -product:"word1"], but I am under liters of
beer :) and can't think...
-F
> -Original Message-
> From: Jan Høyda
> > What is numFound when you execute this query?
> >
> > http://localhost:8983/solr/select?q=long_description:terminator&fl=sku,title,long_description&hl=true&hl.fl=long_description&fq=sku:10699058
>
> The response is: start="0">
>
> The odd thing is that if I do a search in the
> long_descri
> Then we should confirm that long_description really contains term terminator.
>
> What is numFound when you execute this query?
>
> http://localhost:8983/solr/select?q=long_description:terminator&fl=sku,title,long_description&hl=true&hl.fl=long_description&fq=sku:10699058
The response is:
Hi,
This is probably due to length normalization. Normally this is wanted, as you
want to penalize partial match vs a more exact match.
Try specifying omitNorms=true on your field.
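
As a rough illustration of why length normalization favors short fields, here
is a small Python sketch. It assumes the classic Lucene default of
1/sqrt(number of terms in the field) and ignores boosts and the lossy one-byte
encoding of norms; the function name length_norm is made up for this example.

```python
import math


def length_norm(num_terms: int) -> float:
    """Assumed default length norm: 1 / sqrt(number of terms in the field)."""
    return 1.0 / math.sqrt(num_terms)


# A one-term field gets the full weight, a 100-term field only a tenth of it.
for n in (1, 4, 100):
    print(n, length_norm(n))
```

With omitNorms=true this factor is not stored, so field length no longer
influences the score.
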
You should ask yourself what kind of relevancy or sorting you really need in
your project. If you search short fie
How do I SWAP the old_core with the new_core? Is it to be done manually, or
does Solr provide a command for doing so? What if I don't make a new
core, but instead make changes to the existing core, reindex it, and then use
the RELOAD command? Would this be a bad approach?
On Sat, Feb 13, 2010 at 10:1
> The contents of the "long_description" field are actually
> pretty short - max. 2000 characters. But I've tried setting
> it to -1 as well, and still the same results.
Then we should confirm that long_description really contains the term "terminator".
What is numFound when you execute this query?
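For anyone who wants to run this check from a script, here is a minimal Python
sketch. It rebuilds the select URL that is quoted in the replies in this thread
and reads numFound from the response. The wt=json parameter is an addition made
here so the result can be parsed as JSON, and the helper names build_url and
num_found are hypothetical.

```python
import json
from urllib.parse import urlencode

SOLR_SELECT = "http://localhost:8983/solr/select"  # host and port as quoted in the thread


def build_url() -> str:
    """Build the highlighting query from the thread, restricted to one sku."""
    params = {
        "q": "long_description:terminator",
        "fl": "sku,title,long_description",
        "hl": "true",
        "hl.fl": "long_description",
        "fq": "sku:10699058",
        "wt": "json",  # assumption: ask for JSON instead of the default XML
    }
    return SOLR_SELECT + "?" + urlencode(params)


def num_found(body: str) -> int:
    """Extract numFound from a JSON response body."""
    return json.loads(body)["response"]["numFound"]
```

Fetching build_url() with urllib.request.urlopen and passing the decoded body
to num_found would then give the count; 0 means the filter plus the term query
match nothing, so there is nothing to highlight.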
Hi,
I execute the query "word1", and it returns 100k results where the top 10k are
just "word1".
How can I filter these so that "word1 word2" shows up in the top 10?
Thanks
> > I am
> > having some issues with making the highlighting work
> > properly. If I search for a word in a "title"
> > field and request a highlighted summary from another
> > "long_description" field, this works on some
> > documents, but on some doesn't. Have you seen anything
> > like this bef
Ryan Kennedy wrote:
> Have you tried using the RELOAD command in the core admin?
>
> http://wiki.apache.org/solr/CoreAdmin#RELOAD
>
> I'm not sure if the sharedlib classloader is global or if it's local
> to each core, but if it's local to each core then a RELOAD may do the
> trick.
>
> Ryan
>
Have you tried using the RELOAD command in the core admin?
http://wiki.apache.org/solr/CoreAdmin#RELOAD
I'm not sure if the sharedlib classloader is global or if it's local
to each core, but if it's local to each core then a RELOAD may do the
trick.
Ryan
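A minimal Python sketch of how the CoreAdmin requests could be built, for
reference. It assumes the default admin path /solr/admin/cores on
localhost:8983; the core names new_core and old_core and the helper name
core_admin_url are placeholders, so adjust them to your own solr.xml.

```python
from urllib.parse import urlencode

CORE_ADMIN = "http://localhost:8983/solr/admin/cores"  # assumed default location


def core_admin_url(action: str, **params: str) -> str:
    """Build a CoreAdmin request URL for the given action."""
    query = {"action": action}
    query.update(params)
    return CORE_ADMIN + "?" + urlencode(query)


# RELOAD re-reads the configuration of a single core.
reload_url = core_admin_url("RELOAD", core="new_core")

# SWAP exchanges the names of two cores in one step.
swap_url = core_admin_url("SWAP", core="new_core", other="old_core")
```

Requesting either URL (from a browser, curl, or urllib.request.urlopen) issues
the command; whether a RELOAD also picks up changed jars depends on the
classloader question raised above.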
> I am
> having some issues with making the highlighting work
> properly. If I search for a word in a "title"
> field and request a highlighted summary from another
> "long_description" field, this works on some
> documents, but on some doesn't. Have you seen anything
> like this before?
Default v
Can we confirm that the user does not have multiple DIH instances configured?
Any request for an import while an import is going on is rejected.
On Sat, Feb 13, 2010 at 11:40 AM, Chris Hostetter
wrote:
>
> : concurrent imports are not allowed in DIH, unless u setup multiple DIH
> instances
>
> Right, bu
On Fri, Feb 12, 2010 at 19:46, Sachin Sebastian
wrote:
> Hi there,
>
> I'm trying to migrate from solr 1.3 to solr 1.4 and I've few issues.
> Initially my localsolr was throwing NullPointer exception and I fixed it by
> changing type of lat and lng to 'tdouble'. But now I'm not able to updat
Hi gurus,
I am having some issues with making the highlighting work properly. If I search
for a word in a "title" field and request a highlighted summary from another
"long_description" field, this works on some documents, but not on others.
Have you seen anything like this before?
Exampl
Most of the other files have about 1000-3000 thousand rows.
What happens if I'm writing that file and Solr then tries to read it? Is
there some kind of timeout?
Again, if I recall correctly, it loads up the file at startup or after commit
(the latter depends on the location of the fi
On Feb 11, 2010, at 5:34 AM, Julian Hille wrote:
> Hi,
>
> we're trying to implement another sort-by algorithm which is calculated outside
> of our Solr server.
> Is there a limit on the number of lines in that outside file? We sometimes have
> 1.5 million lines in some situations.
> Also is this a
On Feb 10, 2010, at 8:46 PM, Jason Chaffee wrote:
> Is it possible to do query elevation based on field?
>
> Basically, I would like to search the same term on three different
> fields:
>
> q=field1:term OR field2:term OR field3:term
>
> and I would like to sort the results
Thanks Antonio for sharing this.
I believe this could be one of the interesting case studies for Solr In
Action, if you are interested in sharing a bit more - I am sure the
authors would be more interested for upcoming revisions.
--
K K.
On 02/12/2010 06:02 PM, Antonio Lobato wrote:
Hey ev