Re: SolrCloud and exernal file fields

2012-11-28 Thread Mikhail Khludnev
Mark, your comment is quite valuable. Let me mention the keyword so it can be found later: NoOpDistributingUpdateProcessorFactory. Thanks! On Wed, Nov 28, 2012 at 5:56 PM, Mark Miller wrote: > Keep in mind that the distrib update proc will be auto inserted into > chains! You have to includ

Re: SolrCloud and exernal file fields

2012-11-28 Thread Mark Miller
Keep in mind that the distrib update proc will be auto inserted into chains! You have to include a proc that disables it - see the FAQ: http://wiki.apache.org/solr/SolrCloud#FAQ - Mark On Nov 28, 2012, at 7:25 AM, Mikhail Khludnev wrote: > Martin, > Right as far node in Zookeeper Distributed

Re: SolrCloud and exernal file fields

2012-11-28 Thread Mikhail Khludnev
Martin, Right: as long as the node is registered in ZooKeeper, DistributedUpdateProcessor will broadcast commits to all peers. To work around this, you can introduce a dedicated UpdateProcessorChain without DistributedUpdateProcessor and send the commit to that chain. On 28.11.2012 13:16, "Martin Koch" wrote: > Mikhail >
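The workaround suggested in this message, sending the commit through a dedicated update chain, can be sketched as a request URL. This is only a minimal sketch under stated assumptions: `update.chain` is the Solr request parameter that selects an update processor chain, but the host, the core name `core0`, and the chain name `local-commit` are hypothetical, and the chain itself would still have to be defined in solrconfig.xml so that it does not distribute the commit.

```python
from urllib.parse import urlencode


def commit_url(base_url, core, chain=None):
    """Build an update URL that asks a single core to commit.

    `chain` is the name of an update processor chain; the name used by a
    caller is hypothetical and must match a chain defined in solrconfig.xml.
    """
    params = {"commit": "true"}
    if chain is not None:
        # Route the request through the named chain instead of the default one.
        params["update.chain"] = chain
    return f"{base_url.rstrip('/')}/{core}/update?{urlencode(params)}"


# Hypothetical host, core, and chain names, for illustration only.
print(commit_url("http://localhost:8080/solr", "core0", "local-commit"))
```

Whether the commit then stays on that core depends on how the chain is configured, which this sketch does not cover.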

Re: SolrCloud and exernal file fields

2012-11-28 Thread Martin Koch
Mikhail I haven't experimented further yet. I think that the previous experiment of issuing a commit to a specific core proved that all cores get the commit, so I don't think that this approach will work. Thanks, /Martin On Tue, Nov 27, 2012 at 6:24 PM, Mikhail Khludnev < mkhlud...@griddynamics

Re: SolrCloud and exernal file fields

2012-11-27 Thread Mikhail Khludnev
Martin, It's still not clear to me whether you solved the problem completely or partially: Does reducing the number of cores free some resources for searching during commit? Does committing the cores one by one prevent the "freeze"? Thanks On Thu, Nov 22, 2012 at 4:31 PM, Martin Koch wrote: > Mikha

Re: SolrCloud and exernal file fields

2012-11-25 Thread Simone Gianni
Hi Gopal, the post you linked is interesting. It takes a different approach from mine: it implements a codec for Lucene, so it works at a lower level than my solution, which works at the Solr UpdateHandler level, before the document reaches Lucene. The lucene-codec approach should offer a few advantages: th

Re: SolrCloud and exernal file fields

2012-11-23 Thread Gopal Patwa
Hi, I am also very much interested in this, since we use Solr 4 with NRT, where we update the index every second, but most of the time it updates only a stored field. If Solr/Lucene could provide an external datastore without re-indexing, even for stored fields only, it would be very beneficial for frequent update

Re: SolrCloud and exernal file fields

2012-11-23 Thread Simone Gianni
Posted, see it here http://lucene.472066.n3.nabble.com/Possible-sharded-and-replicated-replacement-for-ExternalFileFields-in-SolrCloud-td4022108.html Simone 2012/11/23 Simone Gianni > 2012/11/22 Martin Koch > >> IMO it would be ideal if the lucene/solr community could come up with a >> good w

Re: SolrCloud and exernal file fields

2012-11-23 Thread Simone Gianni
2012/11/22 Martin Koch > IMO it would be ideal if the lucene/solr community could come up with a > good way of updating fields in a document without reindexing. This could be > by linking to some external data store, or in the lucene/solr internals. If > it would make things easier, a good first

Re: SolrCloud and exernal file fields

2012-11-23 Thread Martin Koch
The short answer is no; the number was chosen in an attempt to get as many cores working in parallel to complete the search faster, but I realize that there is an overhead incurred by distribution and merging the results. We've now gone to 8 shards and will be monitoring performance. /Martin On

Re: SolrCloud and exernal file fields

2012-11-22 Thread Yonik Seeley
On Tue, Nov 20, 2012 at 4:16 AM, Martin Koch wrote: > around 7M documents in the index; each document has a 45 character ID. 7M documents isn't that large. Is there a reason why you need so many shards (16 in your case) on a single box? -Yonik http://lucidworks.com

Re: SolrCloud and exernal file fields

2012-11-22 Thread Martin Koch
Mikhail To avoid freezes we deployed the patches that are now on the 4.1 trunk (bug 3985). But this wasn't good enough, because SOLR would still take very long to restart when that was necessary. I don't see how we could throw more hardware at the problem without making it worse, really - the onl

Re: SolrCloud and exernal file fields

2012-11-21 Thread Mikhail Khludnev
Martin, I don't think solrconfig.xml sheds any light on it. I've just found what I didn't get in your setup: how to explicitly assign a core to a collection. Now I understand most of the details after all! The ball is in your court; let us know whether you have managed to get your cores to commit one by one

Re: SolrCloud and exernal file fields

2012-11-21 Thread Simone Gianni
Hi Martin, thanks for sharing your experience with EFF and saving me a lot of time figuring it out myself; I was afraid of exactly this kind of problem. Mikhail, thanks for expanding the thread with even more useful information! Simone 2012/11/20 Martin Koch > Solr 4.0 does support using EF

Re: SolrCloud and exernal file fields

2012-11-21 Thread Martin Koch
Mikhail, PSB On Wed, Nov 21, 2012 at 10:08 AM, Mikhail Khludnev < mkhlud...@griddynamics.com> wrote: > On Wed, Nov 21, 2012 at 11:53 AM, Martin Koch wrote: > > > > > I wasn't aware until now that it is possible to send a commit to one core > > only. What we observed was the effect of curl > > l

Re: SolrCloud and exernal file fields

2012-11-21 Thread Mikhail Khludnev
On Wed, Nov 21, 2012 at 11:53 AM, Martin Koch wrote: > > I wasn't aware until now that it is possible to send a commit to one core > only. What we observed was the effect of curl > localhost:8080/solr/update?commit=true but perhaps we should experiment > with solr/coreN/update?commit=true. A quic
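The idea raised here, committing to `solr/coreN/update?commit=true` for one core at a time instead of issuing a single commit for the whole instance, can be sketched as follows. This is a minimal sketch, not a tested recipe: the core names are hypothetical, and it assumes each per-core commit is not forwarded to the other cores.

```python
import urllib.request


def core_commit_urls(base_url, cores):
    """Return one commit URL per core, in the order given."""
    root = base_url.rstrip("/")
    return [f"{root}/{core}/update?commit=true" for core in cores]


def commit_one_by_one(base_url, cores, fetch=None):
    """Commit each core in turn, waiting for each response before the next.

    `fetch` takes a URL and returns a status; it can be replaced in tests so
    that no network access is needed.
    """
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url) as response:
                return response.status
    # The list comprehension calls fetch sequentially, never in parallel.
    return [(url, fetch(url)) for url in core_commit_urls(base_url, cores)]
```

A call such as `commit_one_by_one("http://localhost:8080/solr", ["core0", "core1"])` would then issue the commits strictly in sequence, so only one core at a time does the heavy work.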

Re: SolrCloud and exernal file fields

2012-11-21 Thread Martin Koch
On Wed, Nov 21, 2012 at 7:08 AM, Mikhail Khludnev < mkhlud...@griddynamics.com> wrote: > On Wed, Nov 21, 2012 at 2:07 AM, Martin Koch wrote: > > > I'm not sure about the mmap directory or where that > > would be configured in solr - can you explain that? > > > > You can check it at Solr Admin/St

Re: SolrCloud and exernal file fields

2012-11-20 Thread Martin Koch
Mikhail I appreciate your input, it's very useful :) On Wed, Nov 21, 2012 at 6:30 AM, Mikhail Khludnev < mkhlud...@griddynamics.com> wrote: > Martin, > This deployment seems a little bit confusing to me. You have 16-way fairy > virtual "box", and send 16 request for really heavy operation at the

Re: SolrCloud and exernal file fields

2012-11-20 Thread Mikhail Khludnev
On Wed, Nov 21, 2012 at 2:07 AM, Martin Koch wrote: > I'm not sure about the mmap directory or where that > would be configured in solr - can you explain that? > You can check it at Solr Admin/Statistics/core/searcher/stats/readerDir should be org.apache.lucene.store.MMapDirectory -- Sincerel

Re: SolrCloud and exernal file fields

2012-11-20 Thread Mikhail Khludnev
Martin, This deployment seems a little bit confusing to me. You have a 16-way, fairly virtual "box" and send 16 requests for a really heavy operation at the same moment, so it does not surprise me that you are losing it for some period of time. At that time you should see more than 16 in the load average metrics.

Re: SolrCloud and exernal file fields

2012-11-20 Thread Martin Koch
Mikhail PSB On Tue, Nov 20, 2012 at 7:22 PM, Mikhail Khludnev < mkhlud...@griddynamics.com> wrote: > Martin, > > Please find additional question from me below. > > Simone, > > I'm sorry for hijacking your thread. The only what I've heard about it at > recent ApacheCon sessions is that Zookeeper

Re: SolrCloud and exernal file fields

2012-11-20 Thread Mikhail Khludnev
Martin, Please find an additional question from me below. Simone, I'm sorry for hijacking your thread. The only thing I've heard about it at recent ApacheCon sessions is that ZooKeeper is supposed to replicate those files as configs under solr home. And I'm really looking forward to know how it work

Re: SolrCloud and exernal file fields

2012-11-20 Thread Martin Koch
Hi Mikhail Please see answers below. On Tue, Nov 20, 2012 at 12:28 PM, Mikhail Khludnev < mkhlud...@griddynamics.com> wrote: > Martin, > > Thank you for telling your own "war-story". It's really useful for > community. > The first question might seems not really conscious, but would you tell me

Re: SolrCloud and exernal file fields

2012-11-20 Thread Mikhail Khludnev
Martin, Thank you for telling your own "war-story". It's really useful for the community. The first question might not seem well thought out, but would you tell me what blocks searching during EFF reload when it's triggered by a handler or by a listener? I don't really get the sentence about sequential

Re: SolrCloud and exernal file fields

2012-11-20 Thread Martin Koch
Solr 4.0 does support using EFFs, but it might not give you what you're hoping for. We tried using Solr Cloud, and have given up again. The EFF is placed in the parent of the index directory in each core; each core reads the entire EFF and picks out the IDs that it is responsible for. In the cu
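The behaviour described in this message, where every core reads the whole external file and keeps only the IDs it is responsible for, can be illustrated with a small sketch. It is not Solr's implementation: it assumes the external file holds one `id=value` pair per line, and the function names and the default value are hypothetical.

```python
def parse_external_file(lines):
    """Parse 'id=value' lines into a dict of floats, skipping malformed lines."""
    values = {}
    for line in lines:
        line = line.strip()
        if not line or "=" not in line:
            continue
        # Split on the last '=' so that an ID containing '=' stays intact.
        key, _, raw = line.rpartition("=")
        try:
            values[key] = float(raw)
        except ValueError:
            continue
    return values


def values_for_core(all_values, core_ids, default=0.0):
    """Keep only the entries for documents held by one core.

    Documents missing from the file get `default` (an assumed fallback).
    """
    return {doc_id: all_values.get(doc_id, default) for doc_id in core_ids}
```

The sketch makes the cost visible: `parse_external_file` runs over every line in the file on each core, even though `values_for_core` then keeps only that core's share.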

SolrCloud and exernal file fields

2012-11-19 Thread Simone Gianni
Hi all, I'm planning to move a quite big Solr index to SolrCloud. However, in this index, an external file field is used for popularity ranking. Does SolrCloud support external file fields? How does it cope with sharding and replication? Where should the external file be placed now that the index