Hi Bill.
So it seems you want an exact match to be first even if it is outside the
spatial region, right? Your proposed implementation suggests as much. And
apparently you want to sort by distance, notwithstanding the exact match
being first. Although you don't have to do this as two queries, I t
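For illustration only, here is one way such a query could be put together as a single request with SolrJ; this is a hedged sketch, not necessarily the approach being proposed. The field names ("name" for the exact-match field, "store" for the location), the point, and the radius are all made up:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class ExactThenDistance {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        // match either the exact name or anything inside the radius
        SolrQuery q = new SolrQuery("name:\"Main Street Cafe\" OR _query_:\"{!geofilt}\"");
        q.set("sfield", "store");      // spatial field used by geofilt/geodist
        q.set("pt", "45.15,-93.85");   // center point
        q.set("d", "50");              // radius in km
        q.set("exact", "name:\"Main Street Cafe\"");
        // exact matches sort first (even outside the region), ties broken by distance
        q.set("sort", "query($exact,0) desc, geodist() asc");
        System.out.println(server.query(q).getResults());
        server.shutdown();
    }
}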
I'd like to run a repeatable test of having Solr ingest a corpus of
docs on disk, to measure the speed of some alternative things plugged
in.
Anyone have some advice to share? One approach would be a quick SolrJ
program that pushed the entire stack as one giant collection with a
commit at the end.
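As a very rough sketch of that approach (assuming plain-text files in one directory, and a hypothetical core URL and field names), something along these lines would do for a first timing run:

import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BulkIngest {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
        long start = System.currentTimeMillis();
        for (File f : new File(args[0]).listFiles()) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", f.getName());
            doc.addField("text", new String(Files.readAllBytes(f.toPath()), StandardCharsets.UTF_8));
            batch.add(doc);
        }
        server.add(batch);   // one big add
        server.commit();     // single commit at the end
        System.out.println("Indexed " + batch.size() + " docs in "
            + (System.currentTimeMillis() - start) + " ms");
        server.shutdown();
    }
}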
I'm still not quite getting the issue. Each separate request (i.e. each
addition of a SolrInputDocument) is treated as a separate document.
There's no notion of "append the contents of one doc to another based
on ID", unless you're doing atomic updates.
And Tika takes some care to index separate files
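Since atomic updates came up, here is a minimal SolrJ sketch of one (Solr 4.x; the core URL, the stored multiValued field "content", and the existing document id "doc1" are all assumptions for the example):

import java.util.Collections;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class AtomicAppend {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "doc1");
        // "add" appends a value to a multiValued field; "set" would replace it.
        // A plain re-add of "doc1" without this map would replace the whole document.
        doc.addField("content", Collections.singletonMap("add", "more text"));
        server.add(doc);
        server.commit();
        server.shutdown();
    }
}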
Unfortunately I don't quite know the internals of this code well. I
vaguely remember
a problem with ensuring that deletes were handled correctly, so this may be a
manifestation of that fix. As I remember, optimistic locking is mixed
up in this too.
But all that means is that I really can't answer y
SolrMeter?
Upayavira
On Sun, May 26, 2013, at 03:38 PM, Benson Margulies wrote:
> I'd like to run a repeatable test of having Solr ingest a corpus of
> docs on disk, to measure the speed of some alternative things plugged
> in.
>
> Anyone have some advice to share? One approach would be a quick
bq: What is the difference between q and fq other than cache
Very little from a functional standpoint. I.e.
q=abc AND def
and
q=abc&fq=def
return the same set of results. The differences are
1> the fq clause can be cached efficiently
2> the terms in the fq clause don't contribute to the score of the documents
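The same pair of queries in SolrJ, with made-up terms, just to make the two forms concrete:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class QVsFq {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

        // everything in q: both clauses are scored together, nothing cached separately
        SolrQuery all = new SolrQuery("abc AND def");

        // same result set, but "def" is cached in the filterCache and ignored for scoring
        SolrQuery filtered = new SolrQuery("abc");
        filtered.addFilterQuery("def");

        System.out.println(server.query(all).getResults().getNumFound());
        System.out.println(server.query(filtered).getResults().getNumFound());
        server.shutdown();
    }
}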
I'm beginning to hate solr.xml
That stuff should definitely be persisted, please raise a JIRA and
assign it to me.
Thanks,
Erick
On Thu, May 23, 2013 at 5:10 PM, André Widhani wrote:
> When I create a core with Core admin handler using these request parameters:
>
> action=CREATE
> &name=cor
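(For what it's worth, the equivalent CREATE issued through SolrJ's CoreAdminRequest looks roughly like the sketch below; the core name and instanceDir are placeholders.)

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.CoreAdminRequest;

public class CreateCore {
    public static void main(String[] args) throws Exception {
        // points at the Solr root, not at a particular core
        HttpSolrServer admin = new HttpSolrServer("http://localhost:8983/solr");
        // name, instanceDir, server - same as action=CREATE&name=...&instanceDir=...
        CoreAdminRequest.createCore("core1", "core1", admin);
        admin.shutdown();
    }
}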
Jack:
Kudos for carrying on! Having a contract canceled after putting a lot
of work into it must be a bummer...
Personally, I'm not buying many paper books any more, so the e-book
version is preferable for me; take this with a grain of salt... but
make the paper version spiral bound, _please_. I
Valery:
I share your puzzlement. _If_ you are letting Solr do the document
routing, and not doing any custom routing, then the same unique
key should be going to the same shard and replacing the previous doc
with that key.
But, if you're using custom routing, if you've been experimenting w
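A tiny sketch of the default (Solr-routed) behavior, with a hypothetical collection name and ZooKeeper address: adding a second document with the same unique key simply replaces the first.

import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class ReplaceById {
    public static void main(String[] args) throws Exception {
        CloudSolrServer server = new CloudSolrServer("localhost:2181");
        server.setDefaultCollection("collection1");

        for (int i = 0; i < 2; i++) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "same-key");
            doc.addField("title", "version " + i);
            server.add(doc);   // the second add overwrites the first
        }
        server.commit();
        // a query for id:same-key should now return exactly one document
        server.shutdown();
    }
}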
Thanks, Erick. I could do the experiment of publishing both spiral and
perfect bound and see which "wins". Spiral does have the one downside of not
standing out on a shelf. But, for now, I'll focus on getting the (rough
draft) e-book available ASAP.
-- Jack Krupansky
In addition to Alexandre's comment:
bq: ...I’d like to import in my index all metadata
Be a little careful here; this isn't actually very useful in my experience.
Sure
it's nice to have all that data in the index, but... how do you search it
meaningfully?
Consider that some doc may have an "aut
SOLR-3221 added the ability to configure the shard handler in Solr. In
particular, increasing maxConnectionsPerHost is important for
scalability, and many people might want to enable fairnessPolicy.
http://wiki.apache.org/solr/SolrConfigXml#Configuration_of_Shard_Handlers_for_Distributed_searche
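As a sketch (the values are arbitrary illustrations, not recommendations), the configuration goes inside the SearchHandler in solrconfig.xml roughly like this:

<requestHandler name="/select" class="solr.SearchHandler">
  <!-- shard handler used for distributed requests sent by this handler -->
  <shardHandlerFactory class="HttpShardHandlerFactory">
    <int name="socketTimeout">1000</int>
    <int name="connTimeout">5000</int>
    <int name="maxConnectionsPerHost">100</int>
    <bool name="fairnessPolicy">true</bool>
  </shardHandlerFactory>
</requestHandler>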
On Fri, May 24, 2013 at 4:04 PM, Jack Krupansky wrote:
> The primary purpose of this filter is in conjunction with the
> KeywordRepeatFilterFactory and a stemmer, to remove the tokens that did not
> produce a stem from the original token, so the keyword duplicate is no
> longer needed. The goal is
The only comment I was trying to make here is the relationship between the
RemoveDuplicatesTokenFilterFactory and the KeywordRepeatFilterFactory.
No, stemmed terms are not considered the same text as the original word. By
definition, they are a new value for the term text.
-- Jack Krupansky
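To make the relationship concrete, a typical field type in schema.xml chains them like this (the type name and choice of stemmer are just for illustration). KeywordRepeatFilterFactory emits every token twice, one copy marked as a keyword so the stemmer leaves it alone; when stemming doesn't change a token, the two copies end up identical and RemoveDuplicatesTokenFilterFactory drops the redundant one.

<fieldType name="text_stem_keep_original" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- emit each token twice: one copy protected from stemming -->
    <filter class="solr.KeywordRepeatFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <!-- drop the duplicate when the stemmer didn't change the token -->
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>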
Hi,
I use Solr 4.3.0 and created 3 collections with the Collections API.
I reloaded one of them a few times.
The cluster has been running for 2 weeks now.
Today I tried creating a new collection using the Collections API and I get
an error:
"reloadcollection the collection time out: 60s"
I then tried reloading a co
I have a document divided into paragraphs. What is the better way to add it to Solr?
As a single str field:
paragraph1
paragraph2
paragraph3
Or as a multivalued field:
paragraph1
paragraph2
paragraph3
That depends on what you are trying to search. Start your schema
design from your _search_ requirements, not your document
requirements.
See the presentation by Gilt on how they went through different
iterations on their document schema design:
http://www.slideshare.net/trenaman/lucene-revolution-
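For concreteness, the two options look like this from the SolrJ side (the field names "body" and "paragraphs" are hypothetical and would need matching schema entries; a multiValued field keeps paragraph boundaries for things like highlighting, while a single field is simpler if you only ever search the document as a whole):

import org.apache.solr.common.SolrInputDocument;

public class ParagraphModels {
    public static void main(String[] args) {
        // option 1: one single-valued field holding the whole text
        SolrInputDocument single = new SolrInputDocument();
        single.addField("id", "doc-as-one-field");
        single.addField("body", "paragraph1\n\nparagraph2\n\nparagraph3");

        // option 2: one multiValued field, one value per paragraph
        SolrInputDocument multi = new SolrInputDocument();
        multi.addField("id", "doc-as-multivalued");
        multi.addField("paragraphs", "paragraph1");
        multi.addField("paragraphs", "paragraph2");
        multi.addField("paragraphs", "paragraph3");

        System.out.println(single);
        System.out.println(multi);
    }
}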
On Sun, May 26, 2013, at 10:41 PM, Oleksiy Druzhynin wrote:
> I have a document divided into paragraphs. What is the better way to add it to Solr?
> As a single str field:
>
>
> paragraph1
> paragraph2
> paragraph3
>
>
> Or as a multivalued field:
> paragraph1
> paragraph2
> paragraph3
Depends what
Thanks, David!
On Sun, May 26, 2013 at 8:02 AM, David Smiley (@MITRE.org) <
dsmi...@mitre.org> wrote:
> Hi Bill.
>
> So it seems you want an exact match to be first even if it is outside the
> spatial region, right? Your proposed implementation suggests as much. And
> apparently you want to
Thank you, Jack, for the response.
>> Fuzzy search is the syntax for a term, not a handler. For example: alpha~1
>> will match terms that have an editing distance of 0 or 1 from "alpha".
So the search query string will be like /term?q=alpha~1
>> But, are you sure you really mean "fuzzy search"
Fuzzy query is invoked just like any other query:
.../select?q=alpha~1
-- Jack Krupansky
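The same thing from SolrJ, with a hypothetical core URL; the ~1 suffix asks for terms within an edit distance of 1 of "alpha":

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class FuzzyExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        // fuzzy query: same syntax as on the URL, just passed as the q parameter
        System.out.println(server.query(new SolrQuery("alpha~1")).getResults().getNumFound());
        server.shutdown();
    }
}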
-Original Message-
From: Sagar Chaturvedi
Sent: Sunday, May 26, 2013 11:27 PM
To: solr-user@lucene.apache.org
Subject: RE: Fuzzy search in solr
Thank you, Jack, for the response.
Fuzzy search is t
Hi, Erick!
That's it! I'm using a custom implementation of a SolrServer with
distributed behavior that routes queries and updates using an in-house
Round Robin method. But the thing is that I'm doing this myself because
I've noticed that duplicated documents appear using the LBHttpSolrServer
implemen