Hello,
I'm using Lucene/Solr 4.10.4 for keyword-match functionality, and I found an
issue with the fuzzy-distance rule.
I added a search keyword with an edit distance of 2: "Bridgewater~2".
When I search, it does not return "bridwater" in the results, though it should.
If I change the placing of 'ge' to any other place it
Hi Upaya,
thanks for the explanation. I actually already did some investigation
about it (my first source was:
http://cephas.net/blog/2008/03/30/how-morelikethis-works-in-lucene/ ) and
then I took a look at the code.
Was just wondering what the community was thinking about
including/providi
Happy to read that! Regarding the spellcheck, that is a different thing, so
let us know if you need further details!
Cheers
2015-09-27 18:59 GMT+01:00 Mark Fenbers :
> I am delighted to announce that I have it all working again! Well, not
> all, just the searching!
>
> I deleted my core and created a new o
Huh, strange - I didn't even notice that you could create cores through the UI.
I suppose it depends on what order you read the documentation in and what you infer from it.
See "Create a Core":
https://cwiki.apache.org/confluence/display/solr/Running+Solr
I followed the "solr create -help" option to work out how
Hi,
I am using Apache Nutch 1.7 to crawl and Apache Solr 4.7.2 for indexing. In
my tests there is a gap between the number of results fetched by Nutch and the
number of documents indexed in Solr. For example, one of the crawls
fetched 23343 pages and 1146 images successfully, while Solr shows only 19250
docs
On Sun, 2015-09-27 at 14:47 +0200, Uwe Reh wrote:
> Like Walter Underwood wrote, in technical sense faceting on authors
> isn't a good idea.
In a technical sense, there is no good or bad about faceting on
high-cardinality fields in Solr. The faceting code is fairly efficient
(modulo the newly dis
Maybe it's a silly observation...
But are you lowercasing at indexing/querying time ?
Can you show us the schema analysis config for the field type you use ?
Because, strictly speaking about Levenshtein distance, "bridwater" is 3 edits
from "Bridgewater".
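That edit count can be checked with a short sketch (nothing Solr-specific, just a plain dynamic-programming Levenshtein shown as an illustration): with the original case, "bridwater" really is 3 edits from "Bridgewater" (substitute B/b, delete 'g', delete 'e'), and only 2 once both sides are lowercased, which is within the ~2 of the fuzzy query.

```python
# Plain (case-sensitive) Levenshtein edit distance, for checking the claim.
def levenshtein(a, b):
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

# With the capital B it is 3 edits; lowercased, only 2.
print(levenshtein("Bridgewater", "bridwater"))   # 3
print(levenshtein("bridgewater", "bridwater"))   # 2
```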
Cheers
2015-09-28 8:26 GMT+01:00 anil.vadhavane
So, based on my knowledge, it is not possible (except if you customise the
component).
Read here :
http://lucene.472066.n3.nabble.com/How-do-I-recover-the-position-and-offset-a-highlight-for-solr-4-1-4-2-td4051763.html
Another data structure that you may find useful to store is the Term
Vect
Erick, Walter and all,
as I wrote, I am aware of the firstSearcher event; we tried it manually before
we chose to enhance the QuerySenderListener.
I think our usage scenario (which I didn't write about, for simplicity) is a bit
different from yours, which is what makes this necessary. We are implementing
How does facet_count work with a facet field whose type uses
solr.PathHierarchyTokenizerFactory?
I have multiple records that contain a field Parameter which is of a type
using PathHierarchyTokenizerFactory.
E.g.
"Parameter": [
"EARTH SCIENCE>OCEANS>OCEAN TEMPERATURE>WATER TEMPERATUR
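As an illustration (a sketch of the tokenizer's behaviour, not actual Solr code), PathHierarchyTokenizerFactory with delimiter ">" emits one token per ancestor path, and each of those tokens becomes its own facet bucket, so every level of the hierarchy shows up in facet counts:

```python
# Sketch of what a path-hierarchy tokenizer emits for one field value:
# a token for each prefix of the path, joined by the delimiter.
def path_hierarchy_tokens(value, delimiter=">"):
    parts = value.split(delimiter)
    return [delimiter.join(parts[:i]) for i in range(1, len(parts) + 1)]

for token in path_hierarchy_tokens("EARTH SCIENCE>OCEANS>OCEAN TEMPERATURE"):
    print(token)
# EARTH SCIENCE
# EARTH SCIENCE>OCEANS
# EARTH SCIENCE>OCEANS>OCEAN TEMPERATURE
```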
I suspect you may be better off asking this on the Nutch user list. The
decisions you are describing will be within the Nutch codebase, not
Solr. Someone here may know (hopefully) but you may get more support
over on the Nutch list.
One suggestion - start with a clean, empty index. Run a crawl. Loo
This is a major release supporting lucene / solr 5.3.0. Download the zip
here:
https://github.com/DmitryKey/luke/releases/tag/luke-5.3.0
This release runs on Java 8 and does not run on Java 7.
The release includes a number of pull requests and GitHub issues. Worth
mentioning:
https://github.com/Dm
There is also facet.limit, which controls how many facet entries are returned.
Is that catching you out?
The document either matches your query or it doesn't. If it does, then all
values of the Parameter field should be included in your faceting. But
perhaps not all facet buckets are being returned to you - he
Hi,
I want to register multiple, identical search handlers to have multiple
buckets in which to measure performance for our different APIs and consumers
(and to find out who is actually using Solr).
Are there costs associated with having multiple search handlers? Are
they negligible?
Cheers,
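For what it's worth, registering the extra handlers is just duplicating the entry in solrconfig.xml under different paths (a hedged sketch; the handler names and defaults below are made up for illustration):

```xml
<!-- Hypothetical: identical handlers on separate paths, one per API -->
<requestHandler name="/select_api1" class="solr.SearchHandler">
  <lst name="defaults"><str name="df">text</str></lst>
</requestHandler>
<requestHandler name="/select_api2" class="solr.SearchHandler">
  <lst name="defaults"><str name="df">text</str></lst>
</requestHandler>
```

Each client then targets its own path, and per-path request metrics fall out of whatever monitoring you attach.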
This looks similar to SOLR-4489, which is marked fixed for version 4.5. If
you're using an older version, the fix is to upgrade.
Also see SOLR-3608, which is similar but here it seems as if the user's query
is more than spellcheck was designed to handle. This should still be looked at
and p
Yes, that solved my problem. There must be an implicit facet.limit set, because
I tried the same URL query with facet.limit=-1 and got back records with
"EARTH SCIENCE>GEOGRAPHIC REGION>ARCTIC"
Cheers!
Endre
-Original Message-
From: Upayavira [mailto:u...@odoko.co.uk]
Sent: 28. sept
You could use the MLT query parser, and combine that with other queries,
whether as filters or boosts.
You can't use stream.body yet, so you would need to use the handler if
you need that.
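For example (a sketch only; the field names are invented and the exact {!mlt} local parameters vary by Solr version), something along these lines combines an MLT query with a filter:

```
q={!mlt qf=title,description mintf=1}THE_DOC_ID&fq=inStock:true
```

The braces select the MLT query parser, the text after them is the id of the seed document, and the fq restricts the "similar" results like any other query.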
Upayavira
On Mon, Sep 28, 2015, at 09:53 AM, Alessandro Benedetti wrote:
> Hi Upaya,
> thanks for the expla
I would expect this to be negligible.
Upayavira
On Mon, Sep 28, 2015, at 01:30 PM, Oliver Schrenk wrote:
> Hi,
>
> I want to register multiple but identical search handler to have multiple
> buckets to measure performance for our different apis and consumers (and
> to find out who is actually us
On 9/28/2015 6:30 AM, Oliver Schrenk wrote:
> I want to register multiple but identical search handler to have multiple
> buckets to measure performance for our different apis and consumers (and to
> find out who is actually using Solr).
>
> What are there some costs associated with having multi
Were all of the shard replicas in an active state (green in the admin UI) before
starting?
It sounds like they were; otherwise you wouldn't hit the replica that is out of
sync.
Replicas can get out of sync, and then report being in sync, after a sequence of
stop/starts without a chance to complete syncing.
See if it might have hap
Hello,
I am importing 2 entities into Solr, coming from 2 different tables, and
I have defined an update request processor chain with two custom processor
factories:
- the first processor factory needs to be executed first for one type
of entity and then for the other (I differentiate the
From the Solr wiki, the default facet.limit should be 100!
Anyway, I find the way field faceting is displayed for path-hierarchy-tokenized
fields to be not very user friendly.
Ideally, for those fields, we should show a facet representation similar to
facet pivoting.
It would be nice to think of an idea
A different solution to the same need: I'm measuring response times of
different collections, measuring online/batch queries separately, using New
Relic. I've added a servlet filter that analyses the request and makes this
info available to New Relic via a request argument.
The built-in New Relic Solr
We did the same thing, but reporting performance metrics to Graphite.
But we won’t be able to add servlet filters in 6.x, because it won’t be a
webapp.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 28, 2015, at 11:32 AM, Gili Nachum wrote:
>
Hi,
I am trying to retrieve all the documents from a Solr index in a batched
manner.
I have 100M documents. I am retrieving them using the method proposed here:
https://nowontap.wordpress.com/2014/04/04/solr-exporting-an-index-to-an-external-file/
I am dumping 10M-document splits into each file. I ge
Gili, I was constantly checking the cloud admin UI and it always stayed
green; that is why I initially overlooked sync issues... finally, when all
options dried up, I went to each node individually and queried it, and that is
when I found the out-of-sync issue. The way I resolved my issue was to shut
down t
Hi - you need to use the CursorMark feature for larger sets:
https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results
M.
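The cursorMark protocol itself is just "resend the query with the returned mark until the mark stops changing". A minimal sketch, with a stand-in for the HTTP call (fetch_page below fakes a Solr sorted by its unique key; a real request would pass sort=id asc and cursorMark=<mark> and read nextCursorMark from the response):

```python
# fetch_page is a fake: cursor encodes the position of the next doc, and
# the returned mark stops advancing once the result set is exhausted,
# mirroring how Solr's nextCursorMark behaves.
def fetch_page(docs, cursor, rows):
    start = 0 if cursor == "*" else int(cursor)
    page = docs[start:start + rows]
    return page, str(start + len(page))

def export_all(docs, rows=3):
    results, cursor = [], "*"          # "*" starts a cursor in Solr too
    while True:
        page, next_cursor = fetch_page(docs, cursor, rows)
        results.extend(page)
        if next_cursor == cursor:      # unchanged mark => done
            break
        cursor = next_cursor
    return results
```

Unlike deep paging with start=N, each page costs the same regardless of how far into the result set you are.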
-Original message-
> From:Ajinkya Kale
> Sent: Monday 28th September 2015 20:46
> To: solr-user@lucene.apache.org; java-u...@lucene.apache.org
> Subj
If I am not wrong this works only with Solr version > 4.7.0 ?
On Mon, Sep 28, 2015 at 12:23 PM Markus Jelsma
wrote:
> Hi - you need to use the CursorMark feature for larger sets:
> https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results
> M.
>
>
>
> -Original message-
> > F
Greetings!
I have highlighting turned on in my Solr searches, but what I get back
is the found term surrounded by tags. Since I use an SWT StyledText
widget to display my search results, what I really want is the offset
and length of each found term, so that I can highlight it in my own way
wi
If you can't use CursorMark, then I suggest not using the start parameter;
instead, sort ascending by a unique field and range the query to records with
a field value larger than the last doc you read. Then set rows to
whatever you found can fit in memory.
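That advice can be sketched as follows (fake_query is a stand-in for a request like q=*:*&fq=id:{LAST TO *]&sort=id asc&rows=N; nothing here is real SolrJ):

```python
# Fake Solr: return up to `rows` docs with id greater than `last_id`,
# in ascending id order, as the fq range + sort would.
def fake_query(index, last_id, rows):
    hits = sorted(d for d in index if last_id is None or d > last_id)
    return hits[:rows]

def read_all(index, rows=2):
    out, last = [], None
    while True:
        page = fake_query(index, last, rows)
        if not page:               # empty page => no more docs
            break
        out.extend(page)
        last = page[-1]            # next page starts after the last id seen
    return out
```

The exclusive lower bound ({LAST TO *]) is what prevents re-reading the boundary document on the next page.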
On Mon, Sep 28, 2015 at 10:59 PM, Ajinkya
Hi,
I'm using HttpSolrClient to connect to Solr. Everything works until when I
enabled basic authentication in Jetty. My question is, how do I pass to
SolrJ the basic auth info. so that I don't get a 401 error?
Thanks in advance
Steve
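In that era of SolrJ, one common route is to build the underlying Apache HttpClient with the credentials set and pass it to HttpSolrClient's constructor (check your SolrJ version's API for the exact calls). The wire-level mechanism is ordinary HTTP Basic auth; as an illustration of the header itself (Python, not SolrJ, and the credentials are made up):

```python
import base64

# HTTP Basic auth is just "Basic " + base64("user:password") in the
# Authorization header; any client that can set a header can send it.
def basic_auth_header(user, password):
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return {"Authorization": f"Basic {token}"}

print(basic_auth_header("steve", "s3cret"))
```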
Hi,
if I need fine-grained error reporting, I use the Http Solr server and send
1 doc per request using the add method.
I report errors on exceptions from the add method.
I'm using autocommit, so I'm not seeing errors related to commits.
Am I losing some errors? Is there a better way?
Thanks
CloudSolrClient has zkClientTimeout/zkConnectTimeout for access to
zookeeper.
It would be handy to also be able to set something like
soTimeout/connectTimeout for accessing the Solr nodes, similarly to the old
non-cloud client.
Currently, in order to set a timeout for the client to
One would hope that https://issues.apache.org/jira/browse/SOLR-4735 will
be done by then.
On 9/28/15, 11:39 AM, "Walter Underwood" wrote:
>We did the same thing, but reporting performance metrics to Graphite.
>
>But we won’t be able to add servlet filters in 6.x, because it won’t be a
>webapp
http://opensourceconnections.com/blog/2014/07/13/reindexing-collections-with-solrs-cursor-support/
-Original Message-
From: Ajinkya Kale [mailto:kaleajin...@gmail.com]
Sent: Monday, September 28, 2015 2:46 PM
To: solr-user@lucene.apache.org; java-u...@lucene.apache.org
Subject: Solr jav
You shouldn't be losing errors with HttpSolrServer. Are you
seeing evidence that you are, or is this mostly a curiosity question?
Do note it's better to batch up docs; your throughput will increase
a LOT. That said, when you do batch (e.g. send 500 docs per update
or whatever) and you get an error b
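The batch-then-fall-back pattern being described can be sketched like this (client here is a stand-in with an add method that raises on failure; real SolrJ naming and error handling will differ):

```python
# Send docs in batches for throughput; on a failed batch, re-send its
# docs one at a time so the bad document(s) can be isolated and reported.
def index_in_batches(client, docs, batch_size=500):
    failed = []
    for i in range(0, len(docs), batch_size):
        batch = docs[i:i + batch_size]
        try:
            client.add(batch)
        except Exception:
            for doc in batch:
                try:
                    client.add([doc])
                except Exception:
                    failed.append(doc)   # fine-grained error report
    return failed
```

You keep the throughput of batching in the common case and only pay the one-doc-per-request cost when a batch actually fails.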
On 9/28/2015 4:04 PM, Arcadius Ahouansou wrote:
> CloudSolrClient has zkClientTimeout/zkConnectTimeout for access to
> zookeeper.
>
> It would be handy to also have the possibility to set something like
> soTimeout/connectTimeout for accessing the solr nodes similarly to the old
> non-cloud clien
We built our own because there was no movement on that. Don’t hold your breath.
Glad to contribute it. We’ve been running it in production for a year, but the
config is pretty manual.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 28, 2015, at