Hi Erick,
The issue was with ZooKeeper: when we tried to force a full replication by
cleaning the datadir in ZooKeeper, that caused the index removal.
Our index always replicated in full, even after a short outage or restart. I think
"too far out of date" could be the reason. We felt zookeeper was to blame
here.
Hi,
I think things will work for Hassan as he described them. The key is not
to shard in his case, that's all.
Hassan, yes, 1-2M docs is small. But beware of creating a crazy
number (e.g. thousands) of collections per server, as each collection has
some cost.
Otis
--
Solr & ElasticSearch Suppor
Hi Varun,
I don't think this exists in Solr...
But have a look at http://sematext.com/products/dym-researcher/index.html .
Look at the screenshot and you will spot something labeled as "Relaxer" in
the blue area. This (Query) Relaxer is DYM ReSearcher's cousin and can be
seen in action on http:
Not at this time. That is something you would do at your app level -
re-query with a looser query if zero results for the original query.
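
A minimal sketch of that app-level fallback, in Python. The `search` callable and the relaxation strategy (dropping the last term on each retry) are assumptions made for illustration; Solr itself provides neither:

```python
from typing import Callable, List, Tuple


def search_with_relaxation(
    terms: List[str],
    search: Callable[[List[str]], List[str]],
) -> Tuple[List[str], List[str]]:
    """Run the query; on zero results, retry with progressively fewer terms.

    `search` is a hypothetical function that sends the terms to Solr and
    returns the matching document ids.
    Returns the terms that finally matched together with their results.
    """
    remaining = list(terms)
    while remaining:
        results = search(remaining)
        if results:
            return remaining, results
        # Assumed relaxation rule: drop the last term and try again.
        remaining = remaining[:-1]
    return [], []
```

Which term to drop first is an application decision; dropping the rarest or the last-typed term are both plausible choices.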
-- Jack Krupansky
-Original Message-
From: Varun Thacker
Sent: Friday, January 04, 2013 7:50 AM
To: solr-user@lucene.apache.org
Subject: Removing t
If you index from the outside (i.e. not using DIH) you have more control:
* how many threads you use
* how you batch documents
* how much you wait between indexing batches
...
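
A rough Python sketch of two of those knobs, batch size and the pause between batches. The `send` callable is a hypothetical stand-in for whatever client call posts a batch to Solr, and the default values are arbitrary; threading is left out to keep the example short:

```python
import time
from typing import Callable, Iterable, Iterator, List


def batches(docs: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Group documents into lists of at most `size` items."""
    batch: List[dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def index_throttled(
    docs: Iterable[dict],
    send: Callable[[List[dict]], None],
    batch_size: int = 500,
    pause_seconds: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send documents batch by batch, pausing after each batch.

    `send` is a hypothetical function that posts one batch to Solr.
    Returns the number of batches sent.
    """
    sent = 0
    for batch in batches(docs, batch_size):
        send(batch)
        sent += 1
        sleep(pause_seconds)
    return sent
```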
Otis
--
Solr & ElasticSearch Support
http://sematext.com/
On Fri, Jan 4, 2013 at 6:25 PM, Marcin Rzewucki wrote:
>
That's probably as official as anything ever gets around here.
-- Jack Krupansky
-Original Message-
From: Mark Miller
Sent: Friday, January 04, 2013 11:47 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 4 (CloudSolrServer and LBHttpSolrServer question)
I'm going to push *hard* fo
Tried Wireshark yet to see what host/port it is trying to connect and why
it fails? It is a complex tool, but well worth learning.
Regards,
Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps ev
DIH won't make any real difference, I'd say. The work to write terms to
your index still happens in either case.
Upayavira
On Fri, Jan 4, 2013, at 11:25 PM, Marcin Rzewucki wrote:
> Thanks. I guess you're right - it's normal behaviour. Are there some
> guidelines how to use ramBufferSizeMB or onl
Thanks! I had a different version of httpclient in the classpath. So the 2nd
exception is gone but now I am back to the first one
" org.apache.solr.client.solrj.SolrServerException: No live SolrServers
available to handle this request"
-Original Message-
From: Alexandre Rafalovitch [m
Thanks. I guess you're right - it's normal behaviour. Are there some
guidelines how to use ramBufferSizeMB or only by testing ? Do you know if
DIH is "gentler" than indexing via REST or solrj API ?
Kind regards.
On 4 January 2013 23:14, Otis Gospodnetic wrote:
> Hi,
>
> I think what you are seein
For the second one:
Wrong version of library on a classpath or multiple versions of library on
the classpath which causes wrong classes with missing fields/variables? Or
library interface baked in and the implementation is newer. Some sort of
mismatch basically. Most probably in Apache http librar
Hi,
I think what you are seeing is a general thing. Regular search is slower
while there is indexing, too, of course.
So maybe it's best to mentally decouple indexing part here and simply make
your calls as fast as possible without indexing. Then you can add indexing
and play with things like ra
Hi All,
I am getting exceptions on trying to create a collection. Any help is
appreciated.
While trying to create a collection, I got this error
Caused by: org.apache.solr.client.solrj.SolrServerException: No live
SolrServers available to handle this request
at
org.apache.solr.client.sol
Hello Solr-Users,
I thought you, or someone you know, might be interested in a very important
role here at Simply Hired. The Staff Search Engineer will own the
responsibility of writing the search engine of SimplyHired. You will work
on cutting edge machine learning, search and big data tools
On Jan 4, 2013, at 3:41 PM, "Dyer, James" wrote:
> 4. Dynamic Business Rules.
There is an open JIRA issue around biz rules and drools integration. Not sure
if there is any work done there, but at least some notes about it last I looked.
- Mark
Sachin,
You might get more response on this list if you can describe in a little more detail
what your application needs to do. A lot of us haven't used Endeca and won't
understand exactly what you mean here.
With that said, I migrated a few apps from Endeca to Solr a few years back and
will try to he
I think the problem is that you have to interpret the user query (Solr has
one syntax, other sources have a different one) and then combine results
(how?). All of those are non-trivial.
Have you looked at something like
http://www.comcepta.com/en/enterprise-metasearch.html which builds on top
of C
Sounds like you may have a corrupt index. Try running the CheckIndex tool.
Otis
Solr & ElasticSearch Support
http://sematext.com/
On Jan 3, 2013 8:59 AM, "Karan jindal" wrote:
> Hi everyone,
>
> I have a solr index which is built using solr 3.2.
>
> I am facing two problem with that solr index.
I agree with the 'more mature' analysis, but surely you can use 4.0 in a
3.x style without greater difficulty, no?
Upayavira
On Fri, Jan 4, 2013, at 07:35 PM, Otis Gospodnetic wrote:
> Hi,
>
> If you don't need to shard your index and don't need NRT search Solr 3.x
> is
> much simpler to operate
Yes, it would be great to start a discussion of this topic.
I am looking for some kick-start information to begin a more detailed
investigation. And of course, maybe someone has already faced this problem,
so please share your ideas and experience.
Thanks
Oleg.
On Fri, Jan 4, 2013 at 2:15 PM,
Hi,
If you don't need to shard your index and don't need NRT search Solr 3.x is
much simpler to operate and is more mature.
Otis
Solr & ElasticSearch Support
http://sematext.com/
On Jan 4, 2013 7:08 AM, "Dikchant Sahi" wrote:
> As someone in the forum correctly said, if all Solr releases were
>
On Jan 4, 2013, at 2:14 PM, Per Steffensen wrote:
>> I'm not sure what the node tells Zookeeper and who does shard assignment. I
>> mean, does a node explicitly say what shard it wants to be, or is that
>> assigned by Zookeeper, or is that a node's choice/option?
It's basically both. If you
Would this be a reasonable (if very rough) attempt at cake diagram?
https://docs.google.com/drawings/d/1XxLjds0OOm44zOVCMR-cwCJXnTs3C2x257KpCTxI1Ec/edit
Not sure if I managed to get logical/physical separation clearly enough,
but it could be a start.
Regards,
Alex.
Personal blog: http://blog
We're not gonna have documentation to explain it. I guess it is more a
question of starting a discussion here about how to do it.
My thought would be to write an adapter in front of your APIs to make it
look like a Solr instance, and fake distributed search. But, to get that
to work, you'd need to
It was a very good explanation, Jack!
I believe I have heard most of it before, so it is really not new for
me. I DO understand that the name "replica" and "replication-factor" CAN
be justified, but it requires a long and thorough explanation. And that's
the point. A good name for a concept mea
Yes. In that case, core should best be described as a logical solr
entity with various "managed" attributes
and qualities above the physical layer (sorry, not trying to perpetuate
this thread so much).
On 01/04/2013 01:55 PM, Mark Miller wrote:
Currently a SolrCore is 1:1 with a low level Luce
Currently a SolrCore is 1:1 with a low level Lucene index. There is no reason
that needs to always be that way. It's possible that we may at some point add
built in micro sharding support that means a SolrCore could have multiple
underlying Lucene indexes. Or we may not.
- Mark
On Jan 4, 2013,
Good point. Agree.
Sent from my Verizon Wireless 4G LTE Smartphone
Original message
From: Upayavira
Date:
To: solr-user@lucene.apache.org
Subject: Re: Terminology question: Core vs. Collection vs...
Using your terminology, I'd say core is a physical solr term, and index
Using your terminology, I'd say core is a physical solr term, and index
is a physical lucene term. A collection or a shard is a logical solr
term.
Upayavira
On Fri, Jan 4, 2013, at 06:28 PM, darren wrote:
> My understanding is core is a logical solr term. Index is a physical
> lucene term. A solr
I agree. In my opinion index is a low level lucene thing. I never say a
collection has an index directly. That confuses levels and creates confusion.
To me at least. I think the terminology discussed is good. Just some lingering
usage inconsistencies.
Sent from my Verizon Wireless 4G LTE Smart
On Fri, Jan 4, 2013 at 1:35 PM, Alexandre Rafalovitch
wrote:
> Hmm. Doesn't that make (logical) index=collection? And (physical)
> index=core? Which creates duplication of terminology and at the same time
> can cause confusion between highest logical and lowest physical level.
That's why I've avo
Hmm. Doesn't that make (logical) index=collection? And (physical)
index=core? Which creates duplication of terminology and at the same time
can cause confusion between highest logical and lowest physical level.
Regards,
Alex.
P.s. Hoping not to start a new terminology war.
Personal blog: http:
My understanding is core is a logical solr term. Index is a physical lucene
term. A solr core is backed by a physical lucene index. One index per core.
Solr team can correct me if it's not accurate. :)
Sent from my Verizon Wireless 4G LTE Smartphone
Original message
From: Alex
The entire collection does have an index - a distributed index - which
consists of a Lucene index on each core/replica for the subset of the data
in that shard.
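
One way to picture that containment, as a small Python data model. This is only an illustration of the terminology in this thread (collection, shard, replica/core, Lucene index); the class and field names are made up and do not correspond to any Solr API:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Replica:
    core_name: str   # the Solr core that implements this replica
    index_dir: str   # the Lucene index backing that core (1:1 per the thread)


@dataclass
class Shard:
    name: str        # logical subset of the collection's data
    replicas: List[Replica] = field(default_factory=list)


@dataclass
class Collection:
    name: str
    shards: List[Shard] = field(default_factory=list)

    def lucene_indexes(self) -> List[str]:
        """All physical Lucene indexes making up the 'distributed index'."""
        return [r.index_dir for s in self.shards for r in s.replicas]
```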
-- Jack Krupansky
-Original Message-
From: Alexandre Rafalovitch
Sent: Friday, January 04, 2013 1:12 PM
To: solr-user@lucen
Can I just start by saying that this was AMAZING. :-) When I asked the
question, I certainly did not expect this level of details.
And I vote on the cake diagram for WIKI as well. Perhaps, two with the
first one showing the trivial collapsed state of single
collection/shard/replica/core. The trivi
Well, I hope this won't spoil everything then:
https://issues.apache.org/jira/browse/SOLR-4260
I'll continue tests monday
-Original message-
> From:Mark Miller
> Sent: Fri 04-Jan-2013 17:54
> To: solr-user@lucene.apache.org
> Subject: Re: Solr 4 (CloudSolrServer and LBHttpSolrServer que
Hi Mark,
SOLR-3929 rocks!
A nightly build of 4.1 with maxIndexingThreads configured to 24 takes
80% to 100% of the cpu resources :-)
Thank you, Otis and Gora
"mpstat 10"
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
00 0 13 607 241 234 78 100
I'm going to push *hard* for a Jan release. Woe to those that get in my way :)
- Mark
On Jan 4, 2013, at 11:37 AM, Shawn Heisey wrote:
> On 1/4/2013 8:54 AM, Luis Cappa Banda wrote:
>> Any estimated release date, Mark? I heard something about January. I was
>> considering using 4.0 for producti
Thanks Mark.
-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com]
Sent: Friday, January 04, 2013 9:51 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 4 (CloudSolrServer and LBHttpSolrServer question)
CloudSolrServer can be used for indexing and is smart about indexing
Thanks for pointing me to Solr's ZooKeeper servlet. I will look at the
source to see how I can use it to fulfill my needs.
Bill
On Thu, Jan 3, 2013 at 6:43 PM, Mark Miller wrote:
> Technically, you want to make sure zookeeper reports the node as live and
> active.
>
> You could use the same api
On 1/4/2013 8:54 AM, Luis Cappa Banda wrote:
Any estimated release date, Mark? I heard something about January. I was
considering using 4.0 for production but if the 4.1 release is incoming I
could wait a little more.
I'm not a committer, but I contribute the occasional patch and keep an
eye on t
OK, thank you for the answer.
Maybe you can point me to documentation or any other source where I can
get an idea of how to develop such an extension.
Thanks
Oleg.
On Fri, Jan 4, 2013 at 2:47 PM, Upayavira wrote:
> Solr does not support federated search in the form you describe - that
> is, to
Any estimated release date, Mark? I heard something about January. I was
considering using 4.0 for production but if the 4.1 release is incoming I
could wait a little more.
2013/1/4 Mark Miller
> CloudSolrServer can be used for indexing and is smart about indexing since
> it knows the current clus
CloudSolrServer can be used for indexing and is smart about indexing since it
knows the current cluster state.
For 4.0 I'd use one per collection because there is a bug around this fixed in
the upcoming 4.1 (using one for more than one collection).
In fact, if you are moving to 4, it's a good i
This is the containment hierarchy I understand, but it includes both physical and
logical.
Sent from my Verizon Wireless 4G LTE Smartphone
Original message
From: darren
Date:
To: dar...@ontrenet.com,yo...@lucidworks.com,solr-user@lucene.apache.org
Subject: Re: Terminology qu
Hi,
I am trying to migrate to Solr 4 (from 3.6) for a
multithreaded/multicollection environment using the Solrj java client. I need
some clarification on when to use
CloudSolrServer vs LBHttpSolrServer. Any help is appreciated.
Which one do I use? The CloudSolrServer uses the LB server
Actually. Node/collection/shard/replica/core/index
Sent from my Verizon Wireless 4G LTE Smartphone
Original message
From: darren
Date:
To: yo...@lucidworks.com,solr-user@lucene.apache.org
Subject: Re: Terminology question: Core vs. Collection vs...
Agreed. But for compl
Agreed. But for completeness can it be node/collection/shard/replica/core?
Sent from my Verizon Wireless 4G LTE Smartphone
Original message
From: Yonik Seeley
Date:
To: solr-user@lucene.apache.org
Subject: Re: Terminology question: Core vs. Collection vs...
On Fri, Jan
On Fri, Jan 4, 2013 at 2:26 AM, Per Steffensen wrote:
> Our biggest problem is that we really haven't decided once and for all and
> made sure to reflect the decision consistently across code and
> documentation. As long as we haven't I believe it is still ok to change our
> minds.
IMO, I *think* it
Yes. That's it. It's clear if we separate logical terms from physical terms. A
simple cake diagram on the wiki along with perhaps a UML diagram will solidify these
concepts.
Sent from my Verizon Wireless 4G LTE Smartphone
Original message
From: Jack Krupansky
Date:
To: solr-user@lu
I thought about adding Solr core, but it only muddies the water. Yes, it
needs to be added, but carefully.
In the context of SolrCloud, a Solr core is the underlying representation of
a replica. Alternatively, a replica of a shard of a collection is
implemented as a Solr core. [Need to factor
This is a good explanation and makes sense. The one inconsistency is referring
to a replica of a shard that has no replication. But it's not that big of a
problem. If you wove the term 'core' into your writeup below it would be
complete and should be posted on the wiki.
Sent from my Verizon Wi
Replication makes perfect sense even if our explanations so far do not.
A shard is an abstraction of a subset of the data for a collection.
A replica is an instance of the data of the shard and instances of Solr
servers that have indicated a readiness to service queries and updates for
the dat
That is very odd. Have there been any hard commits performed at all? Even
if not, there should still be an index directory.
Solr will do a full replication if the replica is too far out of date, but
that shouldn't
create (I don't think) a new index directory unless it's a misleading
message.
Is th
Solr does not support federated search in the form you describe - that
is, to make a query to Solr which solr defers to another search system.
There may be ways you could achieve it (Solr is pretty extensible) and
such a feature would be a very useful one, but it would take some,
likely significan
3.6.2 is a maintenance release with bug fixes for existing 3.x users for
whom an upgrade to 4.0 is too big a leap at present. 4.0 is the release
that will see active development from here on in. If you are starting
with a new project, 4.0 seems a reasonable place to start. I'd expect
4.1 to be out
First, I'm assuming SolrCloud with Zookeeper etc.
1> Don't do anything. If Node A is the leader, the replica for that shard
will become the leader.
2> This is a little unclear. There are two cases, a> the leader crashed or
b> the replica crashed.
a> no problem, distributed in
Hi Friends,
I need some help. I want to implement clustering in my Solr setup. I have studied
both the Carrot2 and Apache UIMA frameworks. Can anyone suggest which is
better to use, with reasons?
Thanks in advance
Puneet Chaturvedi
--
View this message in context:
http://lucene.472066.n3.nabb
As someone in the forum correctly said, if all Solr releases were
evolutionary, Solr 4.0 is revolutionary. It has lots of improvements over the
previous releases, like NoSQL features, atomic updates, cloud features and
a lot more.
Solr 4.0 would be the right migration I believe.
Can someone in the for
We are starting a new e-com application from this month onwards, for which I
am trying to identify the right SOLR release. We were using 3.4 in our
previous project, but I have read in multiple blogs and forums about the
improvements that SOLR 4 has in terms of efficient memory management, less
OOMs
On 1/4/13 9:21 AM, Hassan wrote:
Hi,
I am considering SolrCloud for our applications but I have run into
the limitation of not being able to use Join Queries in distributed
searches.
Our requirements are the following:
- SolrCloud will serve many applications where each application
"index" i
Aha! &mlt=true, that was the key I hadn't worked out before (thought it was
&qt=mlt that achieved that), things are looking rosy now, and these results
are a perfect fit for my needs. Thanks very much for your time to help
explain this!!
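
For anyone finding this thread later, a small Python sketch of building such a request URL. The base URL, the example query, and the field names are assumptions; `mlt`, `mlt.fl` and `mlt.count` are the MoreLikeThis parameters passed to the regular select handler:

```python
from typing import List
from urllib.parse import urlencode


def mlt_query_url(base_url: str, query: str, fields: List[str], count: int = 5) -> str:
    """Build a select URL that turns on MoreLikeThis via mlt=true.

    `base_url` and the field names are assumptions for illustration only.
    """
    params = {
        "q": query,
        "mlt": "true",                # enables the MoreLikeThis component
        "mlt.fl": ",".join(fields),   # fields used to find similar documents
        "mlt.count": count,           # similar documents per result
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"
```

For example, `mlt_query_url("http://localhost:8983/solr", "id:42", ["title", "body"])` yields a URL ending in `wt=json` with `mlt=true` among its parameters.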
David
-Original Message-
From: Jack Krupansky [mai
Hi,
I am considering SolrCloud for our applications but I have run into the
limitation of not being able to use Join Queries in distributed searches.
Our requirements are the following:
- SolrCloud will serve many applications where each application "index"
is separate from other application.