Is there a recommended ZooKeeper topology for production Solr environments?
I was planning: 3 ZK nodes, each on its own dedicated machine.
Thinking that dedicated machines, separate from Solr servers, would keep ZK
isolated from resource contention spikes that may occur on Solr. Also, if a
Solr m
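For what it's worth, the arithmetic behind the 3-node plan can be sketched as below. This only illustrates ZooKeeper's majority-quorum rule; the function names are made up for the example and are not part of any ZooKeeper or Solr API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble with the given number of nodes."""
    if ensemble_size < 1:
        raise ValueError("an ensemble needs at least one node")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many nodes can be lost while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for nodes in (1, 3, 4, 5):
        print(nodes, quorum_size(nodes), tolerated_failures(nodes))
```

With 3 nodes the ensemble survives the loss of one; a 4th node adds no extra tolerance, which is why odd ensemble sizes are the usual choice.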
As Otis mentioned, it's obviously good to run an optimize once in a while,
or when you are done with most of your heavy indexing. The concern is not
disk capacity but rather IO and seeking across segments: when there are
comparatively fewer segments to query, there will be fewer IO operations
Then are there other alternatives that would let us achieve the goal?
Querying with a set of foreign ids this way is going to make the query
very large, and the response also takes a long time (previously tested
with a standalone Solr core in a master-slave architecture).
Thanks!
Hi Aman,
Yeah, we are thinking the same. Using UIMA is better. And thanks to
everyone. You guys really showed us the way (UIMA).
We'll work on it.
Thanks,
Vivek
On Fri, Jun 6, 2014 at 5:54 PM, Aman Tandon wrote:
> Hi Vivek,
>
> As everybody in the mail list mentioned to use UIMA you shou
hi all,
I need help simplifying my query. The doc structure is as follows.
docStructure
id A
cat : p, q, r
id B
cat : m, n ,o
id C
cat: l,b, o
Now given this structure my job is to find documents which have cat ids
belonging to a list. Right now this is achieved in this fashion using OR of
mu
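For reference, the OR-style filter described above might be assembled along these lines. A rough sketch only: the helper names are hypothetical, the field name cat is taken from the structure shown, and each value is sent as a quoted phrase so that special characters do not break the query.

```python
from typing import Iterable


def quote_term(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_cat_filter(values: Iterable[str], field: str = "cat") -> str:
    """Build a single filter clause matching any of the given category ids."""
    terms = [quote_term(v) for v in values]
    if not terms:
        raise ValueError("at least one category id is required")
    return f"{field}:(" + " OR ".join(terms) + ")"


if __name__ == "__main__":
    # Would match docs A and B in the example structure above.
    print(build_cat_filter(["p", "m"]))
```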
Thanks for the response Jack
--
View this message in context:
http://lucene.472066.n3.nabble.com/accessing-individual-elements-of-a-multivalued-field-tp4140862p4140911.html
Sent from the Solr - User mailing list archive at Nabble.com.
Not currently.
You could have separate explicit fields for the categories such as "cat_1",
"cat_2", etc. The data would need to be replicated (possibly using a
), but redundancy to facilitate access is a reasonable approach.
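One way to picture the explicit-field idea is to expand the multivalued field on the client before the document is sent to Solr. This is only a sketch under that assumption; the helper name is hypothetical, and the cat_1, cat_2 naming follows the suggestion above.

```python
def add_positional_fields(doc: dict, source: str = "cat") -> dict:
    """Return a copy of doc with source_1, source_2, ... added for each value."""
    out = dict(doc)
    for position, value in enumerate(doc.get(source, []), start=1):
        out[f"{source}_{position}"] = value
    return out


if __name__ == "__main__":
    print(add_positional_fields({"id": "A", "cat": ["p", "q", "r"]}))
```

The original multivalued field is kept, so existing queries against cat keep working while the positional copies allow access to an individual element.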
-- Jack Krupansky
-Original Message-
From: kritarth.anand
Check out the patch on the issue below. We hit the same issue and posted a
patch. None of the committers have picked it up yet, but it would be good to
get some feedback on it and get this into the next dot release. If it works
for you, please vote it up.
https://issues.apache.org/jira/browse/SOLR-59
> the incoming document rate could be as high as 20k/second...
That sounds like a lot of CPU-hungry indexing work. Given the 128 CPU cores
available, from an indexing speed perspective, would you recommend having a
similar number of solr cores created, or Solr does just a when several with
a small numb
Hi,
I don't remember the last time I ran optimize. Sure, yes, things will work
faster if you optimize an index and reduce the number of segments, but if
you are regularly writing to that index and performance is OK, leave it to
Lucene segment merges to purge deletes.
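If deletes ever do need a manual nudge, the two usual requests against the update handler look roughly like the URLs built below. A sketch only: the base URL is a placeholder, and the helper is hypothetical; the commit, expungeDeletes, optimize and maxSegments parameters are the standard update request parameters.

```python
from typing import Optional
from urllib.parse import urlencode


def update_url(base: str, *, expunge_deletes: bool = False,
               optimize: bool = False,
               max_segments: Optional[int] = None) -> str:
    """Build an update-handler URL for a commit or an explicit optimize."""
    params = {}
    if optimize:
        params["optimize"] = "true"
        if max_segments is not None:
            params["maxSegments"] = str(max_segments)
    else:
        params["commit"] = "true"
        if expunge_deletes:
            params["expungeDeletes"] = "true"
    return f"{base.rstrip('/')}/update?{urlencode(params)}"


if __name__ == "__main__":
    base = "http://localhost:8983/solr/collection1"  # placeholder core URL
    print(update_url(base, expunge_deletes=True))
    print(update_url(base, optimize=True, max_segments=1))
```

The commit with expungeDeletes is the lighter of the two, since it only merges segments that contain deletions rather than rewriting the whole index.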
Otis
--
Performance Monitoring *
Hey guys,
I'm simply trying to create collection foo in SolrCloud (a collection
that failed to create once due to a badly formatted schema).
I try the following:
createCollection foo -> could not create a new core
solr/foo_shard1_replica1 as another core is already defined there
deleteColle
Hi,
We have a SolrCloud cluster (5 shards and 2 replicas) on 10 boxes. On some of the
boxes we have about 5 million deleted docs, and we have never run an optimize
since the beginning. Does the number of deleted docs have anything to do with
query performance? Should we consider optimization at all i
hi,
prod: p
cat : catA,catB,catC
prod :q
cat : catB, catC,catD
My schema consists of documents with a uid 'prod', each of which can belong
to multiple categories called 'cat', represented as a
multivalued field. For a particular kind of query I need to access
individual elements se
Hi Chris,
Created ticket https://issues.apache.org/jira/browse/SOLR-6154
The ticket includes the data.xml and a PDF with instructions on how to
replicate.
Sending different updates to different ports was just how the Confluence
tutorial laid out the steps; it does not affect the result of the te
On 6/8/2014 12:09 PM, rashi gandhi wrote:
> I am using SolrMeter for performance benchmarking. I am able to
> successfully test my solr setup up to 1000 queries per min while
> searching.
> But when I exceed this limit, say 1500 search queries per min, I am
> facing "Server Refused Connection" in S
On 6/8/2014 4:17 PM, shushuai zhu wrote:
> I would like to get some advice to setup a Solr Cloud on a set of powerful
> machines. The average size of the documents handled by the Solr Cloud is
> about 0.5 KB, and the number of documents stored in Solr Cloud could reach
> billions. When indexing,
Dear Solr experts,
I have 2 problems and need your help.
1) I have to group a list with group.limit=1&group.main=true&group.sort=Date
desc (many groups, each containing only its newest element). Then, from that
list of groups (each with 1 element), I want to filter in order to remove items
(in groups) not matches
My first answer is "don't do it that way" :).
Solr works best with flattened (de-normalized) data. If at all
possible, you _really_ would be better off combining the two
collections and flattening the data even though there would be more
data.
Whenever I see a question like this, I wonder if you'r
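To make the flattening idea concrete, here is a small sketch of combining two record sets into one denormalized set of documents before indexing. Everything in it is illustrative: the field names (id, parent_id, name, title) and the helper are assumptions, not anything from the collections under discussion.

```python
from typing import Dict, List


def flatten(parents: List[Dict], children: List[Dict],
            key: str = "parent_id") -> List[Dict]:
    """Copy each parent's fields (prefixed with parent_) onto its children."""
    by_id = {parent["id"]: parent for parent in parents}
    flat = []
    for child in children:
        parent = by_id.get(child[key])
        if parent is None:
            continue  # orphaned child: skipped in this sketch
        doc = {f"parent_{name}": value for name, value in parent.items()}
        doc.update(child)
        flat.append(doc)
    return flat


if __name__ == "__main__":
    parents = [{"id": 1, "name": "x"}]
    children = [{"id": "c1", "parent_id": 1, "title": "t"}]
    print(flatten(parents, children))
```

The parent data is repeated on every child document, which costs index size but lets a single query do the work of the two-step lookup.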
thanks
--
View this message in context:
http://lucene.472066.n3.nabble.com/Deepy-nested-structure-tp4140397p4140803.html
Thanks
--
View this message in context:
http://lucene.472066.n3.nabble.com/Deepy-nested-structure-tp4140397p4140802.html
Yeah, just got it, thanks François :)
With Regards
Aman Tandon
On Mon, Jun 9, 2014 at 8:20 PM, François Schiettecatte <
fschietteca...@gmail.com> wrote:
> Just click the 'Releases' link:
>
> https://github.com/DmitryKey/luke/releases
>
> François
>
> On Jun 9, 2014, at 10:43 AM, Aman Tando
Well, you've omitted information about the most precious resource for
Solr, memory.
That said, this question is impossible to answer in the abstract, see:
http://searchhub.org/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
Best,
Erick
On Sun, Jun 8, 2014 at 3:1
Just click the 'Releases' link:
https://github.com/DmitryKey/luke/releases
François
On Jun 9, 2014, at 10:43 AM, Aman Tandon wrote:
> No, Anyways thanks Alex, but where is the luke jar?
>
> With Regards
> Aman Tandon
>
>
> On Mon, Jun 9, 2014 at 6:54 AM, Alexandre Rafalovitch
> wro
On Tue, Jan 7, 2014 at 1:53 PM, Yonik Seeley wrote:
[...]
> Next major feature: Native Code Optimizations.
> In addition to moving more large data structures off-heap(like
> UnInvertedField?), I am planning to implement native code
> optimizations for certain hotspots. Native code faceting would
No, Anyways thanks Alex, but where is the luke jar?
With Regards
Aman Tandon
On Mon, Jun 9, 2014 at 6:54 AM, Alexandre Rafalovitch
wrote:
> Have you looked at:
> https://github.com/DmitryKey/luke
>
> Regards,
>Alex.
> Personal website: http://www.outerthoughts.com/
> Current project: http:
Hi All,
I was curious to know whether communication between multiple collections can be
achieved. If yes, then by what means?
The use case is that, having multiple collections, I need to query the first
collection and use the unique ids from it to query the second
one (foreign key relation). Now if the no.
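One thing that may be worth a look for this kind of foreign-key lookup is Solr's join query parser, which takes from, to and fromIndex local params. The sketch below only builds the query string; the helper and the field and core names in the example are hypothetical. As far as I know, in 4.x fromIndex refers to another core on the same node, so this may not fit a sharded SolrCloud layout.

```python
def join_query(from_field: str, to_field: str, from_index: str,
               inner_query: str) -> str:
    """Build a {!join ...} query string that runs inner_query on from_index."""
    return (
        f"{{!join from={from_field} to={to_field} "
        f"fromIndex={from_index}}}{inner_query}"
    )


if __name__ == "__main__":
    # Hypothetical names: match docs whose parent_id equals the id of
    # docs in core "first" that satisfy type:a.
    print(join_query("id", "parent_id", "first", "type:a"))
```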
I believe it will return the terms that are most similar to the queried terms
but have a greater term frequency than the queried terms. It doesn't actually
care what the term frequencies are, only that they are greater than the
frequencies of the terms you queried on.
I do not know your use ca
Thanks, Tim. Worked like a charm. Appreciate your timely assistance.
On Sat, Jun 7, 2014 at 9:13 PM, Timothy Potter wrote:
> Hi Mark,
>
> Sorry for the trouble! I've now made the ami-1e6b9d76 AMI public;
> total oversight on my part :-(. Please try again. Thanks Hoss for
> trying to help out o
I’d certainly go for the 2nd option. Depending on what you need, you won’t need
to modify Solr itself but can extend it using different plugins.
You’ll need to write different components depending on your specific
requirements. I definitely recommend the talks from Trey Grainger, f
Can you make a custom Component? They are pluggable.
Regards,
Alex
On 09/06/2014 6:24 pm, "Vishnu Mishra" wrote:
> I am using Solr 4.6 with sharding (distributed search). I have a
> situation where I would like to modify the Solr search result (DocList and
> DocSet)
> inside solr Q
Are they expecting relevancy ranking or merely seeking to do a bulk read of
those documents? Please detail what the user is trying to accomplish with
such a monster list of IDs.
Generally, queries of more than a few dozen terms are a bad idea. If for no
other reason than that if you need to debug
I'm wondering what the best practice for large disjunct queries in Solr is.
A user wants to submit a query for several hundred thousand terms, like:
(term1 OR term2 OR ... term500,000)
I know it might be better to break this up into multiple queries that can
be merged on the user's end, but I'm w
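The break-it-up approach could look something like the sketch below. It is only an outline under stated assumptions: the field name text and the batch size are placeholders, the terms are assumed to be plain tokens that need no escaping, and the batch size is kept under 1024 on the assumption that the default maxBooleanClauses has not been raised.

```python
from typing import Iterable, Iterator, List, Sequence


def chunk(terms: Sequence[str], size: int) -> Iterator[List[str]]:
    """Split terms into consecutive batches of at most size items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(terms), size):
        yield list(terms[start:start + size])


def or_query(field: str, terms: Iterable[str]) -> str:
    """Build one disjunctive clause for a batch of terms."""
    return f"{field}:(" + " OR ".join(terms) + ")"


def merge_unique(batches: Iterable[Iterable[str]]) -> List[str]:
    """Merge per-batch id lists, dropping duplicates but keeping first-seen order."""
    seen = set()
    merged = []
    for batch in batches:
        for doc_id in batch:
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged


if __name__ == "__main__":
    terms = [f"term{i}" for i in range(1, 6)]
    for batch in chunk(terms, 2):
        print(or_query("text", batch))
```

Note that merging on the client only works cleanly for a bulk read; relevancy scores from separate batches are not directly comparable.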
To be of any help we'd need to know what your documents look like, what
your queries look like, and what the specifications of your server are. How much
heap is dedicated to Solr, and how much free memory is available for the OS
file cache? You have to figure out the bottleneck. Is it CPU or RAM or
Disk? Ma
I am using Solr 4.6 with sharding (distributed search). I have a
situation where I would like to modify the Solr search result (DocList and DocSet)
inside Solr's QueryComponent right after the following method is called from
the process() method.
searcher.search(result, cmd);
Hi
I am using SimplePostTool to post the XML files to Solr like:
java -Durl=http://localhost:8080/solr/collection1/update -jar
/var/lib/tomcat6/solr/collection1/dump/xmlinput/post.jar
/var/lib/tomcat6/solr/collection1/dump/xmlinput/solr.xml
When there are certain errors, the response fro
I'm really at a dead end.
My index is 5.6 GB with about 8 million documents.
The field I'm using for the filter is simple as hell.
Can other fields affect my search if I only do a filter query?
solr/puls-objects-prod/select?q=*%3A*&fq=class_name:License
My results:
831
*:*
class_name:
Hello all,
I was wondering what the "onlyMorePopular" option for spellchecking uses
as its threshold. Will it always pick the suggestion that returns the most
queries, or does it base its result on some threshold that can be
configured?
Thanks!
Ali.
--
View this message in conte
Thanks, it is working fine but I had to change the following line
to
On Mon, Jun 9, 2014 at 9:29 AM, Shalin Shekhar Mangar [via Lucene] <
ml-node+s472066n4140715...@n3.nabble.com> wrote:
> You can specify the file name as the id by adding a TemplateTransformer on
> the entity "x" and specif
I think this may be the same bug as LUCENE-5289 which was fixed in 4.5.1.
Can you upgrade to 4.5.1 and see if that solves the problem?
On Fri, Jun 6, 2014 at 7:17 PM, Justin Sweeney
wrote:
> Hi,
>
> An application I am working on indexes documents to a Solr index. This Solr
> index is setup a