Re: Solr 5.2.1 versus Solr 4.7.0 performance

2015-08-27 Thread Esther Goldbraich
We are using GC tuning options: -Xgcpolicy:gencon, -verbose:gc.
RAM: 64GB
Solr heap: -Xms512m -Xmx32768m
Index per server: 500G

Surprisingly, running a different setup on the same machines, 64 collections /
1 shard per collection, gives significantly better results.
Any ideas?

Thank you,
Esther



From:   Shawn Heisey 
To: solr-user@lucene.apache.org
Date:   26/08/2015 06:25 PM
Subject:Re: Solr 5.2.1 versus Solr 4.7.0 performance



On 8/26/2015 1:11 AM, Esther Goldbraich wrote:
> We have benchmarked a set of queries on Solr 4.7.0 and 5.2.1 (with same
> data, same solrconfig.xml) and saw better query performance on Solr 4.7.0
> (5-15% better than 5.2.1, with an exception of 100% improvement for one of
> the queries).
> Using same JVM (IBM 1.7) and JVM params.
> Index's size is ~500G, spread over 64 shards, with replication factor 2.
> Do you know about any config / setup change for Solr 5.2.1 that can 
> improve the performance? Any idea what causes this behavior?

I have little experience comparing the performance of different
versions, but I have a general sense that OS disk caching becomes
increasingly important to Solr's performance as time goes on.  What this
means in real terms is that if you have enough memory for adequate OS
disk caching, using a later version of Solr will probably yield better
performance, but if you don't have enough memory, you might actually see
*worse* performance.

A question that might become important later, but doesn't really affect
the immediate things I'm thinking about: What GC tuning options are you
using?

How much RAM do you have in each machine, and how big is Solr's heap? 
How much index data actually lives on each server?  Be sure to count all
replicas on each machine.

https://wiki.apache.org/solr/SolrPerformanceProblems#RAM

Thanks,
Shawn





SolrCloud: collection creation: There are duplicate coreNodeName in core.properties in a same collection.

2015-08-27 Thread forest_soup
https://issues.apache.org/jira/browse/SOLR-7982

We have a SolrCloud with 3 ZooKeeper and 5 Solr servers.
We created collection1 and collection2, each with 80 shards and
replicationFactor=2.
But after creation, we found that within the same collection there are
duplicate coreNodeName values in the core.properties files of different
core folders. For example:
[tanglin@solr64 home]$ ll collection1_shard13_replica2/core.properties
-rw-r--r-- 1 solr solr 173 Jul 29 11:52 collection1_shard13_replica2/core.properties
[tanglin@solr64 home]$ ll collection1_shard66_replica1/core.properties
-rw-r--r-- 1 solr solr 173 Jul 29 11:52 collection1_shard66_replica1/core.properties
[tanglin@solr64 home]$ cat collection1_shard66_replica1/core.properties
#Written by CorePropertiesLocator
#Wed Jul 29 11:52:54 UTC 2015
numShards=80
name=collection1_shard66_replica1
shard=shard66
collection=collection1
coreNodeName=core_node19
[tanglin@solr64 home]$ cat collection1_shard13_replica2/core.properties
#Written by CorePropertiesLocator
#Wed Jul 29 11:52:53 UTC 2015
numShards=80
name=collection1_shard13_replica2
shard=shard13
collection=collection1
coreNodeName=core_node19
[tanglin@solr64 home]$
The consequence of the issue is that clusterstate.json in ZooKeeper also
carries the wrong core_node numbers, and updating the state of one core
sometimes changes the state of another core in another shard.
Snippet from clusterstate.json:
"shard13":{
"range":"a666-a998",
"state":"active",
"replicas":{
"core_node33":
{ "state":"active", "base_url":"https://solr65.somesite.com:8443/solr";,
"core":"collection1_shard13_replica1",
"node_name":"solr65.somesite.com:8443_solr"}
,
"core_node19":
{ "state":"active", "base_url":"https://solr64.somesite.com:8443/solr";,
"core":"collection1_shard13_replica2",
"node_name":"solr64.somesite.com:8443_solr", "leader":"true"}
}},
...
"shard66":{
"range":"5000-5332",
"state":"active",
"replicas":{
"core_node105":
{ "state":"active", "base_url":"https://solr63.somesite.com:8443/solr";,
"core":"collection1_shard66_replica2",
"node_name":"solr63.somesite.com:8443_solr", "leader":"true"}
,
"core_node19":
{ "state":"active", "base_url":"https://solr64.somesite.com:8443/solr";,
"core":"collection1_shard66_replica1",
"node_name":"solr64.somesite.com:8443_solr"}
}},
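Duplicates like this can be spotted mechanically by scanning every core.properties file under the Solr home. A minimal sketch of such a check (the file contents are fed in as strings; the helper names and example paths are ours, not part of Solr):

```python
def parse_props(text):
    """Parse a flat Java-properties style core.properties file."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments like '#Written by CorePropertiesLocator'
        key, _, value = line.partition("=")
        props[key] = value
    return props


def find_duplicate_core_node_names(props_by_path):
    """Return {coreNodeName: [paths]} for names declared by more than one core."""
    seen = {}
    for path, text in props_by_path.items():
        name = parse_props(text).get("coreNodeName")
        if name:
            seen.setdefault(name, []).append(path)
    return {n: sorted(p) for n, p in seen.items() if len(p) > 1}
```

Reading the files from disk (e.g. a glob over `*/core.properties`) is left out; any non-empty result means two cores in the collection share a coreNodeName.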



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-collection-creation-There-are-duplicate-coreNodeName-in-core-properties-in-a-same-collecti-tp4225532.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr 5.2.1 versus Solr 4.7.0 performance

2015-08-27 Thread Toke Eskildsen
On Thu, 2015-08-27 at 11:23 +0300, Esther Goldbraich wrote:
> We are using GC tuning options: Xgcpolicy:gencon , verbose:gc.
> RAM: 64GB
> Solr heap: -Xms512m -Xmx32768m
> Index per server: 500G

Expecting "Your RAM size should be equal to index size"-posts to arrive
in 3, 2, 1...

> Surprisingly, running different setup on same machines, 64 collections / 1 
> shard per collection gives significantly better results.
> Any ideas? 

Compared to what? If your other scenario is to have 1 collection split
over 64 shards, then the difference boils down to distributed search vs.
single-shard search. There is a non-trivial overhead with doing
distributed search, so if a collection fits well into a single shard
(replicas do not count as distribution here), that is preferable.

Come to think of it, distribution in itself might account for the
difference between 4 & 5 that you are observing. Are you doing faceting
as part of your test?

- Toke Eskildsen, State and University Library, Denmark




Re: Exact substring search with ngrams

2015-08-27 Thread Christian Ramseyer
On 26/08/15 18:05, Erick Erickson wrote:
> bq: my dog
> has fleas
> I wouldn't  want some variant of "og ha" to match,
> 
> Here's where the mysterious "positionIncrementGap" comes in. If you
> make this field "multiValued",  and index this like this:
> 
> my dog
> has fleas
> 
> 
> then the position of "dog" will be 2 and the position of "has" will be
> 102 assuming
> the positionIncrementGap is the default 100. N.B. I'm not sure you'll
> see this in the
> admin/analysis page or not.
> 
> Anyway, now your example won't match across the two parts unless
> you specify a "slop" up in the 101 range.

Oh that's nifty, thanks!
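The gap behavior Erick describes can be modeled with a toy position assigner. This is a simplification (whitespace tokens, gap added between values) and the exact positions depend on the analyzer, but the key point — a roughly 100-position hole between values — comes out the same:

```python
def token_positions(values, position_increment_gap=100):
    """Assign Lucene-style token positions across a multiValued field.

    The first token of each value after the first is bumped by the
    positionIncrementGap, which is what keeps a sloppy phrase query
    from matching across two values.
    """
    positions = {}
    pos = 0
    for i, value in enumerate(values):
        if i > 0:
            pos += position_increment_gap  # the gap between values
        for token in value.split():
            positions.setdefault(token, []).append(pos)
            pos += 1
    return positions
```

For `["my dog", "has fleas"]`, "dog" and "has" end up 101 positions apart in this model, so only a phrase slop in that range would let "dog has" match across the values.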



Re: Solr 5.2.1 versus Solr 4.7.0 performance

2015-08-27 Thread Esther Goldbraich
:)

Yes, the 64 collections / 1 shard setup is compared to 1 collection / 64
shards (with router.name=compositeId) on Solr 5.
Shouldn't querying with "_route_" eliminate the distributed-search overhead?

What is the difference in the distribution mechanism between Solr 4 and
Solr 5? In particular, is there any change in filter cache management?
We see a very low hit ratio, with many evictions. It seems that a specific
shard holds entries belonging to other shards (based on the _route_ field).
Why can this happen?

Yes, we have facets, though not in all queries. We are using regular facets
and haven't upgraded to JSON facets yet.

Appreciate your help,
Esther
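The eviction pattern reported above is easy to reproduce with a toy LRU cache whenever the set of distinct filter queries exceeds the cache capacity. This is illustrative only — Solr's actual filterCache implementations differ — but it shows why a working set larger than the cache drives the hit ratio toward zero:

```python
from collections import OrderedDict

class ToyLRUCache:
    """Minimal LRU cache that counts hits, misses, and evictions."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()
        self.hits = self.misses = self.evictions = 0

    def get_or_compute(self, key, compute):
        if key in self.data:
            self.hits += 1
            self.data.move_to_end(key)  # mark as most recently used
            return self.data[key]
        self.misses += 1
        if len(self.data) >= self.capacity:
            self.data.popitem(last=False)  # evict the least recently used entry
            self.evictions += 1
        self.data[key] = compute(key)
        return self.data[key]

# Cycling 8 distinct filter keys through a 4-entry cache, twice:
cache = ToyLRUCache(capacity=4)
for _ in range(2):
    for i in range(8):
        cache.get_or_compute("fq%d" % i, len)
```

Every lookup in the second pass misses too, because each key was already evicted before it came around again — 0 hits, 16 misses, 12 evictions.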





Re: Solr 5.2.1 versus Solr 4.7.0 performance

2015-08-27 Thread Toke Eskildsen
On Thu, 2015-08-27 at 13:16 +0300, Esther Goldbraich wrote:
> Yes, 64 collections / 1 shard is compared to 1 collection / 64 shards 
> (with router.name=compositeId) on Solr 5.
> Quering with "_route_" should not eliminate distributed search overhead?

Caveat: I am guessing a bit here.

When you specify a _route_, the request goes to one or more shards (as
you can specify more than one _route_):
https://cwiki.apache.org/confluence/display/solr/Advanced+Distributed+Request+Options#AdvancedDistributedRequestOptions-RoutingParameters

When a distributed facet request is issued, the result set is
over-provisioned (FacetComponent#modifyRequestForFieldFacets). I do
not know if Solr is smart enough to recognize a shards list of length 1
as a non-distributed call.

You should be able to verify this in your Solr log: look for facet.limit
and see whether it matches the limit you requested or is higher
(the default formula is limit * 1.5 + 10).
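The default formula Toke cites can be turned into a quick check (the formula is as stated above; the helper name is ours):

```python
def distributed_facet_limit(facet_limit):
    """Per-shard over-request for a distributed field facet,
    using the default formula limit * 1.5 + 10."""
    return int(facet_limit * 1.5) + 10

# A request with facet.limit=100 should therefore show up in each
# shard's log as facet.limit=160 if it was treated as distributed.
```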


The rest of your questions are too far outside of my knowledge for me to
try and answer.

- Toke Eskildsen, State and University Library, Denmark




Re: Solr 5.2.1 versus Solr 4.7.0 performance

2015-08-27 Thread Esther Goldbraich
Found the reason for the many evictions (a bug in our code), so please
ignore the specific question about the filter cache.
All other questions are still very relevant.





Problem with JSON sub-faceting

2015-08-27 Thread Pritam Kute
Do the result grouping and the tagging/excluding-filters features work with
JSON sub-faceting? If yes, it would be a great help if someone could point
me to documentation for them.

Thanks in advance.

Thanks & Regards,
--
*Pritam Kute*


Re: Solr relevancy score order

2015-08-27 Thread Steven White
Thanks Erick.

Your summary about doc IDs is very helpful.

I tested the second-level sort with a small set of data (10K records) and
didn't see a significant impact.  I will test with 10M records later.

Steve

On Mon, Aug 24, 2015 at 11:03 PM, Erick Erickson 
wrote:

> Getting the most recent doc first in the case of a tie
> will _not_ "just happen". I don't think you really get the
> nuance here...
>
> You index doc1, and doc2 later. Let's
> claim that doc1 gets internal Lucene doc ID of 1 and
> doc2 gets an internal doc ID of 2. So far you're golden.
> Let's further claim that doc1 is in a different segment than
> doc2. Sometime later, as you add/update/delete docs,
> segments are merged and doc1 and doc2 may or may
> not be in the merged segment. At that point, doc1 can get an
> internal Lucene doc ID of, say, 823 and doc2 can get an internal
> doc ID of, say 64. So their relative order is changed.
>
> You have to have a secondary sort criteria then. And it has to be
> something monotonically increasing by time that won't ever change
> like internal doc IDs can. Adding a timestamp
> to every doc is certainly an option. Adding your own counter
> is also reasonable.
>
> But this is a _secondary_ sort, so it's not even consulted if the
> first sort (score) is not a tie. You can get a sense of how this would
> affect your query time/CPU usage/RAM by specifying
> sort=score desc,id asc
> where id is your uniqueKey field. This won't do what you want,
> but it will simulate it without having to re-index.
>
> Best,
> Erick
>
> On Mon, Aug 24, 2015 at 11:54 AM, Steven White 
> wrote:
> > Thanks Hoss.
> >
> > I understand the dynamic nature of doc IDs.  All that I care about is
> > that the most recent docs be at the top of the hit list when there is a
> > tie.  From your reply, it is not clear if that's what happens.  If not,
> > then I have to sort, but that is something I want to avoid because of
> > the cost it would add to my queries (CPU and RAM).
> >
> > Can you help me answer those two questions?
> >
> > Steve
> >
> > On Mon, Aug 24, 2015 at 2:16 PM, Chris Hostetter <
> hossman_luc...@fucit.org>
> > wrote:
> >
> >>
> >> : A follow-up question.  Is the sub-sorting on the lucene internal doc
> >> : IDs in ascending or descending order?  That is, does the most recently
> >> : indexed doc
> >>
> >> you can not make any generic assumptions about the order of the internal
> >> lucene doc IDs -- the secondary sort on the internal IDs is stable (and
> >> FWIW: ascending) for static indexes, but as mentioned before: the *actual*
> >> order of the IDs changes as the index changes -- if there is an index
> >> merge, the ids can be totally different and docs can be re-arranged into a
> >> diff order...
> >>
> >> : > However, internal Lucene Ids can change when index changes. (merges,
> >> : > updates etc).
> >>
> >> ...
> >>
> >> : show up first in this set of docs that have tied scores?  If not, how
> >> : can I have the most recent be first?  Do I have to sort on lucene's
> >> : internal doc
> >>
> >> add a "timestamp" or "counter" field when you index your documents that
> >> means whatevery you want it to mean (order added, order updated, order
> >> according to some external sort criteria from some external system) and
> >> then do an explicit sort on that.
> >>
> >>
> >> -Hoss
> >> http://www.lucidworks.com/
> >>
>
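The tie-break Erick and Hoss recommend can be sketched outside Solr: sort by score descending, then by a monotonically increasing counter descending, so the newest doc wins a tie. This is illustrative only — the field names are made up, and in Solr this would be a `sort` parameter, not client-side sorting:

```python
def rank(docs):
    """Score descending, then a counter descending as the tie-break,
    instead of relying on unstable internal Lucene doc IDs."""
    return sorted(docs, key=lambda d: (-d["score"], -d["counter"]))

docs = [
    {"id": "doc1", "score": 1.0, "counter": 1},
    {"id": "doc2", "score": 1.0, "counter": 2},  # same score, indexed later
    {"id": "doc3", "score": 2.0, "counter": 3},
]
```

Here `rank(docs)` puts doc3 first on score, then doc2 ahead of doc1 because the counter breaks their tie in favor of the later document.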


Re: Solr 5.2.1 versus Solr 4.7.0 performance

2015-08-27 Thread Shawn Heisey
On 8/27/2015 3:26 AM, Toke Eskildsen wrote:
> On Thu, 2015-08-27 at 11:23 +0300, Esther Goldbraich wrote:
>> We are using GC tuning options: Xgcpolicy:gencon , verbose:gc.
>> RAM: 64GB
>> Solr heap: -Xms512m -Xmx32768m
>> Index per server: 500G

Is this in use for Solr 5.2.1 as well?  The start script for Solr 5.x
includes garbage collection tuning that has worked well for most people.

I have no idea what the "gencon" strategy actually does, and I cannot
find much information with Google.  Most of the info I can find on this
setting indicates this setting on Oracle Java is for JRockit, which is
gone in Java 7.  I suspect that it doesn't do much beyond changing which
collectors are in use.  This is *not* enough GC tuning for Solr, and if
you are running Oracle Java 7 or OpenJDK 7, it may not actually do anything
useful at all.  It looks like it is a valid setting on IBM Java, but
there are plenty of reasons (bugs) to never use Java from IBM.

The GC settings in the bin/solr start script included with 5.x are good,
or you can look at my wiki page for info on all the GC testing that I
have done personally:

https://wiki.apache.org/solr/ShawnHeisey

> Expecting "Your RAM size should be equal to index size"-posts to arrive
> in 3, 2, 1...

I see that I am predictable. :)

There is about 32GB left over after Java is done allocating memory for
Solr, assuming there isn't anything else on the system that requires a
significant chunk of memory, like a webserver or a database server.

Like Toke mentions, I will tell you that for *ideal* performance, your
available RAM should be equal to the index size.  In a perfect world,
you would have at least 532GB of RAM -- 32GB for Java and 500GB for
caching the index.  Since that's not a typical RAM size for a server,
call it 576GB or 640GB.

Perfection is not required for Solr, but I can tell you that 32GB of RAM
for 500GB of index will almost certainly lead to performance issues.
This most likely wouldn't be enough even if your index is on SSD.

If I couldn't achieve the dream of 512GB to run 500GB of index, I would
hope for at least 128GB of RAM for SSD storage or at least 256GB of RAM
for spinning disks, and I would not be surprised to learn that's not
enough either.
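Shawn's arithmetic, as a quick back-of-the-envelope calculator (the thresholds are his rules of thumb, not hard limits, and the function name is ours):

```python
def cache_headroom(ram_gb, heap_gb, index_gb):
    """RAM left for the OS page cache after the JVM heap is taken out,
    and the fraction of the index that fits in it."""
    free_gb = ram_gb - heap_gb
    return free_gb, free_gb / index_gb

free_gb, fraction = cache_headroom(ram_gb=64, heap_gb=32, index_gb=500)
# 32 GB of page cache for a 500 GB index: only 6.4% of the index fits
```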

I suspect this may be most of why you don't see a performance
improvement with the newer Solr version.  Newer versions generally have
better performance, but only if your hardware has enough resources for
the job.

Thanks,
Shawn



Re: Lucene/Solr 5.0 and custom FieldCahe implementation

2015-08-27 Thread Tomás Fernández Löbbe
I don't think there is a way to do this now. Maybe we should separate the
logic of creating the SolrIndexSearcher to a factory. Moving this logic
away from SolrCore is already a win, plus it will make it easier to unit
test and extend for advanced use cases.

Tomás

On Wed, Aug 26, 2015 at 8:10 PM, Jamie Johnson  wrote:

> Sorry to poke this again but I'm not following the last comment of how I
> could go about extending the solr index searcher and have the extension
> used.  Is there an example of this?  Again thanks
>
> Jamie
> On Aug 25, 2015 7:18 AM, "Jamie Johnson"  wrote:
>
> > I had seen this as well; if I overrode this by extending
> > SolrIndexSearcher, how do I have my extension used?  I didn't see a way
> > that it could be plugged in.
> > On Aug 25, 2015 7:15 AM, "Mikhail Khludnev" 
> > wrote:
> >
> >> On Tue, Aug 25, 2015 at 2:03 PM, Jamie Johnson  wrote:
> >>
> >>
> >> > Thanks Mikhail.  If I'm reading the SimpleFacets class correctly, it
> >> > delegates to DocValuesFacets when the facet method is FC (what used to
> >> > be FieldCache, I believe).  DocValuesFacets either uses DocValues or
> >> > builds them using the UninvertingReader.
> >> >
> >>
> >> Ah.. got it. Thanks for reminding me of those details. It seems like even
> >> docValues=true doesn't help with your custom implementation.
> >>
> >>
> >> >
> >> > I am not seeing a clean extension point to add a custom
> >> > UninvertingReader to Solr; would the only way be to copy the
> >> > FacetComponent and SimpleFacets and modify as needed?
> >> >
> >> Sadly, yes. There is no proper extension point. Also, consider overriding
> >> SolrIndexSearcher.wrapReader(SolrCore, DirectoryReader), where the
> >> particular UninvertingReader is created; there you can pass your own one,
> >> which refers to the custom FieldCache.
> >>
> >>
> >> > On Aug 25, 2015 12:42 AM, "Mikhail Khludnev" <
> >> mkhlud...@griddynamics.com>
> >> > wrote:
> >> >
> >> > > Hello Jamie,
> >> > > I don't understand how it could choose DocValuesFacets (it occurs on
> >> > > docValues=true fields) but then switch to
> >> > > UninvertingReader/FieldCache, which means docValues=false. If you can
> >> > > provide more details it would be great.
> >> > > Besides that, I suppose you can only implement and inject your own
> >> > > UninvertingReader; I don't think there is an extension point for this.
> >> > > It's too specific a requirement.
> >> > >
> >> > > On Tue, Aug 25, 2015 at 3:50 AM, Jamie Johnson 
> >> > wrote:
> >> > >
> >> > > > as mentioned in a previous email I have a need to provide security
> >> > > > controls at the term level.  I know that Lucene/Solr doesn't support
> >> > > > this, so I had baked something onto a 4.x baseline that was
> >> > > > sufficient for my use cases.  I am now looking to move that
> >> > > > implementation to 5.x and am running into an issue around faceting.
> >> > > > Previously we were able to provide a custom cache implementation
> >> > > > that would create separate cache entries given a particular set of
> >> > > > security controls, but in Solr 5 some faceting is delegated to
> >> > > > DocValuesFacets, which delegates to UninvertingReader in my case (we
> >> > > > are not storing DocValues).  The issue I am running into is that
> >> > > > before 5.x I had the ability to influence the FieldCache that was
> >> > > > used at the Solr level to also include a security token in the key,
> >> > > > so each cache entry was scoped to a particular level.  With the
> >> > > > current implementation the FieldCache seems to be an internal detail
> >> > > > that I can't influence in any way.  Is this correct?  I had noticed
> >> > > > this Jira ticket
> >> > > > https://issues.apache.org/jira/browse/LUCENE-5427, is there any
> >> > > > movement on this?  Is there another way to influence the information
> >> > > > that is put into these caches?  As always thanks in advance for any
> >> > > > suggestions.
> >> > > >
> >> > > > -Jamie
> >> > > >
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > Sincerely yours
> >> > > Mikhail Khludnev
> >> > > Principal Engineer,
> >> > > Grid Dynamics
> >> > >
> >> > > 
> >> > > 
> >> > >
> >> >
> >>
> >>
> >>
> >> --
> >> Sincerely yours
> >> Mikhail Khludnev
> >> Principal Engineer,
> >> Grid Dynamics
> >>
> >> 
> >> 
> >>
> >
>


Re: Lucene/Solr 5.0 and custom FieldCahe implementation

2015-08-27 Thread Jamie Johnson
That makes sense, then I could extend the SolrIndexSearcher by creating a
different factory class that did whatever magic I needed.  If you create a
Jira ticket for this please link it here so I can track it!  Again thanks


Re: Lucene/Solr 5.0 and custom FieldCahe implementation

2015-08-27 Thread Jamie Johnson
Also in this vein, I think that Lucene should support factories for
cache creation, as described at
https://issues.apache.org/jira/browse/LUCENE-2394.  I'm not endorsing the
patch that is provided (I haven't even looked at it), just the concept in
general.


Re: Lucene/Solr 5.0 and custom FieldCache implementation

2015-08-27 Thread Yonik Seeley
The FieldCache has become implementation rather than interface, so I
don't think you're going to see plugins at that level (it's all
package protected now).

One could either subclass or re-implement UnInvertingReader though.

-Yonik


On Thu, Aug 27, 2015 at 12:09 PM, Jamie Johnson  wrote:
> Also in this vein I think that Lucene should support factories for the
> cache creation as described @
> https://issues.apache.org/jira/browse/LUCENE-2394.  I'm not endorsing the
> patch that is provided (I haven't even looked at it) just the concept in
> general.
>
> On Thu, Aug 27, 2015 at 12:01 PM, Jamie Johnson  wrote:
>
>> That makes sense, then I could extend the SolrIndexSearcher by creating a
>> different factory class that did whatever magic I needed.  If you create a
>> Jira ticket for this please link it here so I can track it!  Again thanks
>>
>> On Thu, Aug 27, 2015 at 11:59 AM, Tomás Fernández Löbbe <
>> tomasflo...@gmail.com> wrote:
>>
>>> I don't think there is a way to do this now. Maybe we should separate the
>>> logic of creating the SolrIndexSearcher to a factory. Moving this logic
>>> away from SolrCore is already a win, plus it will make it easier to unit
>>> test and extend for advanced use cases.
>>>
>>> Tomás
>>>
>>> On Wed, Aug 26, 2015 at 8:10 PM, Jamie Johnson  wrote:
>>>
>>> > Sorry to poke this again but I'm not following the last comment of how I
>>> > could go about extending the solr index searcher and have the extension
>>> > used.  Is there an example of this?  Again thanks
>>> >
>>> > Jamie
>>> > On Aug 25, 2015 7:18 AM, "Jamie Johnson"  wrote:
>>> >
>>> > > I had seen this as well, if I over wrote this by extending
>>> > > SolrIndexSearcher how do I have my extension used?  I didn't see a way
>>> > that
>>> > > could be plugged in.
>>> > > On Aug 25, 2015 7:15 AM, "Mikhail Khludnev" <
>>> mkhlud...@griddynamics.com>
>>> > > wrote:
>>> > >
>>> > >> On Tue, Aug 25, 2015 at 2:03 PM, Jamie Johnson 
>>> > wrote:
>>> > >>
>>> > >> > Thanks Mikhail.  If I'm reading the SimpleFacets class correctly,
>>> it
>>> > >> > delegates to DocValuesFacets when facet method is FC, what used to
>>> be
>>> > >> > FieldCache I believe.  DocValuesFacets either uses DocValues or
>>> builds
>>> > >> then
>>> > >> > using the UninvertingReader.
>>> > >> >
>>> > >>
>>> > >> Ah.. got it. Thanks for reminding me of these details. It seems like even
>>> > >> docValues=true doesn't help with your custom implementation.
>>> > >>
>>> > >>
>>> > >> >
>>> > >> > I am not seeing a clean extension point to add a custom
>>> > >> UninvertingReader
>>> > >> > to Solr, would the only way be to copy the FacetComponent and
>>> > >> SimpleFacets
>>> > >> > and modify as needed?
>>> > >> >
>>> > >> Sadly, yes. There is no proper extension point. Also, consider
>>> > overriding
>>> > >> SolrIndexSearcher.wrapReader(SolrCore, DirectoryReader) where the
>>> > >> particular UninvertingReader is created, there you can pass the own
>>> one,
>>> > >> which refers to custom FieldCache.
>>> > >>
>>> > >>
>>> > >> > On Aug 25, 2015 12:42 AM, "Mikhail Khludnev" <
>>> > >> mkhlud...@griddynamics.com>
>>> > >> > wrote:
>>> > >> >
>>> > >> > > Hello Jamie,
>>> > >> > > I don't understand how it could choose DocValuesFacets (it
>>> occurs on
>>> > >> > > docValues=true) field, but then switches to
>>> > >> UninvertingReader/FieldCache
>>> > >> > > which means docValues=false. If you can provide more details it
>>> > would
>>> > >> be
>>> > >> > > great.
>>> > >> > > Beside of that, I suppose you can only implement and inject your
>>> own
>>> > >> > > UninvertingReader, I don't think there is an extension point for
>>> > this.
>>> > >> > It's
>>> > >> > > too specific requirement.
>>> > >> > >
>>> > >> > > On Tue, Aug 25, 2015 at 3:50 AM, Jamie Johnson <
>>> jej2...@gmail.com>
>>> > >> > wrote:
>>> > >> > >
>>> > >> > > > as mentioned in a previous email I have a need to provide
>>> security
>>> > >> > > controls
>>> > >> > > > at the term level.  I know that Lucene/Solr doesn't support
>>> this
>>> > so
>>> > >> I
>>> > >> > had
>>> > >> > > > baked something onto a 4.x baseline that was sufficient for my
>>> use
>>> > >> > cases.
>>> > >> > > > I am now looking to move that implementation to 5.x and am
>>> running
>>> > >> into
>>> > >> > > an
>>> > >> > > > issue around faceting.  Previously we were able to provide a
>>> > custom
>>> > >> > cache
>>> > >> > > > implementation that would create separate cache entries given a
>>> > >> > > particular
>>> > >> > > > set of security controls, but in Solr 5 some faceting is
>>> delegated
>>> > >> to
>>> > >> > > > DocValuesFacets which delegates to UninvertingReader in my case
>>> > (we
>>> > >> are
>>> > >> > > not
>>> > >> > > > storing DocValues).  The issue I am running into is that before
>>> > 5.x
>>> > >> I
>>> > >> > had
>>> > >> > > > the ability to influence the FieldCache that was used at the
>>> Solr
>>> > >> level
>>> > >> > > to
>>> > >> > > > also include a security token into the key s

Re: Lucene/Solr 5.0 and custom FieldCache implementation

2015-08-27 Thread Yonik Seeley
On Thu, Aug 27, 2015 at 11:59 AM, Tomás Fernández Löbbe
 wrote:
> I don't think there is a way to do this now. Maybe we should separate the
> logic of creating the SolrIndexSearcher to a factory.

That should probably be extended down to where lucene creates
searchers as well (delete-by-query).
Right now there's this hacky DeleteByQueryWrapper to handle wrapping
with UnInvertingReader.

-Yonik


Not receiving mails from the mailing list

2015-08-27 Thread Vijaya Narayana Reddy Bhoomi Reddy
Hi,

Sorry to spam everyone with this email.

I am not able to get emails into my inbox from the Solr mailing list. However,
I am able to send mails to the mailing list, and I am able to see them in the
mail archive on the website. If anyone replies to my mail, it appears
in the archive, but I don't receive any email for it.

Can anyone please let me know what could have gone wrong?

Thanks & Regards
Vijay

-- 
The contents of this e-mail are confidential and for the exclusive use of 
the intended recipient. If you receive this e-mail in error please delete 
it from your system immediately and notify us either by e-mail or 
telephone. You should not copy, forward or otherwise disclose the content 
of the e-mail. The views expressed in this communication may not 
necessarily be the view held by WHISHWORKS.


Re: Not receiving mails from the mailing list

2015-08-27 Thread Upayavira
You were subscribed to the "allow" list, meaning you could post, but
would not receive messages.

This is almost certainly because you started sending mails to the list
without subscribing first. Moderators then moderated through your
request in such a way as to allow you to post in future - as in, adding
you to the "allow" list.

I have switched your subscription to full list membership - you should
get this mail to your inbox.

Upayavira

On Thu, Aug 27, 2015, at 05:39 PM, Vijaya Narayana Reddy Bhoomi Reddy
wrote:
> Hi,
> 
> Sorry to spam everyone with this email.
> 
> I am not able to get emails into my inbox from Solr mailing list.
> However,
> I am able to send mails to the mailing list. I am able to see them on the
> mail archive on the website. In case anyone replies to my mail, it
> appears
> in the archive, but I don't receive any email for the same.
> 
> Can anyone please let me know what could have gone wrong?
> 
> Thanks & Regards
> Vijay
> 
> -- 
> The contents of this e-mail are confidential and for the exclusive use of 
> the intended recipient. If you receive this e-mail in error please delete 
> it from your system immediately and notify us either by e-mail or 
> telephone. You should not copy, forward or otherwise disclose the content 
> of the e-mail. The views expressed in this communication may not 
> necessarily be the view held by WHISHWORKS.


Looking for Traditional Chinese support

2015-08-27 Thread Steven White
Hi Everyone

Per
https://cwiki.apache.org/confluence/display/solr/Language+Analysis#LanguageAnalysis-Language-SpecificFactories
I see the languages Solr supports.  Where is Traditional Chinese?  Is CJK
the one?

Thanks

Steve


Re: Lucene/Solr 5.0 and custom FieldCache implementation

2015-08-27 Thread Jamie Johnson
I think a custom UnInvertingReader would work, as I could skip the process
of putting things in the cache.  Right now in Solr 4.x, though, I am caching,
but including the user's authorities in the key of the cache so we're
not rebuilding the UninvertedField on every request.  Where in 5.x is the
object actually cached?  Will this be possible in 5.x?

On Thu, Aug 27, 2015 at 12:32 PM, Yonik Seeley  wrote:

> The FieldCache has become implementation rather than interface, so I
> don't think you're going to see plugins at that level (it's all
> package protected now).
>
> One could either subclass or re-implement UnInvertingReader though.
>
> -Yonik
>
>
> On Thu, Aug 27, 2015 at 12:09 PM, Jamie Johnson  wrote:
> > Also in this vein I think that Lucene should support factories for the
> > cache creation as described @
> > https://issues.apache.org/jira/browse/LUCENE-2394.  I'm not endorsing
> the
> > patch that is provided (I haven't even looked at it) just the concept in
> > general.
> >
> > On Thu, Aug 27, 2015 at 12:01 PM, Jamie Johnson 
> wrote:
> >
> >> That makes sense, then I could extend the SolrIndexSearcher by creating
> a
> >> different factory class that did whatever magic I needed.  If you
> create a
> >> Jira ticket for this please link it here so I can track it!  Again
> thanks
> >>
> >> On Thu, Aug 27, 2015 at 11:59 AM, Tomás Fernández Löbbe <
> >> tomasflo...@gmail.com> wrote:
> >>
> >>> I don't think there is a way to do this now. Maybe we should separate
> the
> >>> logic of creating the SolrIndexSearcher to a factory. Moving this logic
> >>> away from SolrCore is already a win, plus it will make it easier to
> unit
> >>> test and extend for advanced use cases.
> >>>
> >>> Tomás
> >>>
> >>> On Wed, Aug 26, 2015 at 8:10 PM, Jamie Johnson 
> wrote:
> >>>
> >>> > Sorry to poke this again but I'm not following the last comment of
> how I
> >>> > could go about extending the solr index searcher and have the
> extension
> >>> > used.  Is there an example of this?  Again thanks
> >>> >
> >>> > Jamie
> >>> > On Aug 25, 2015 7:18 AM, "Jamie Johnson"  wrote:
> >>> >
> >>> > > I had seen this as well, if I over wrote this by extending
> >>> > > SolrIndexSearcher how do I have my extension used?  I didn't see a
> way
> >>> > that
> >>> > > could be plugged in.
> >>> > > On Aug 25, 2015 7:15 AM, "Mikhail Khludnev" <
> >>> mkhlud...@griddynamics.com>
> >>> > > wrote:
> >>> > >
> >>> > >> On Tue, Aug 25, 2015 at 2:03 PM, Jamie Johnson  >
> >>> > wrote:
> >>> > >>
> >>> > >> > Thanks Mikhail.  If I'm reading the SimpleFacets class
> correctly,
> >>> it
> >>> > >> > delegates to DocValuesFacets when facet method is FC, what used
> to
> >>> be
> >>> > >> > FieldCache I believe.  DocValuesFacets either uses DocValues or
> >>> builds
> >>> > >> then
> >>> > >> > using the UninvertingReader.
> >>> > >> >
> >>> > >>
> >>> > >> Ah.. got it. Thanks for reminding me of these details. It seems like even
> >>> > >> docValues=true doesn't help with your custom implementation.
> >>> > >>
> >>> > >>
> >>> > >> >
> >>> > >> > I am not seeing a clean extension point to add a custom
> >>> > >> UninvertingReader
> >>> > >> > to Solr, would the only way be to copy the FacetComponent and
> >>> > >> SimpleFacets
> >>> > >> > and modify as needed?
> >>> > >> >
> >>> > >> Sadly, yes. There is no proper extension point. Also, consider
> >>> > overriding
> >>> > >> SolrIndexSearcher.wrapReader(SolrCore, DirectoryReader) where the
> >>> > >> particular UninvertingReader is created, there you can pass the
> own
> >>> one,
> >>> > >> which refers to custom FieldCache.
> >>> > >>
> >>> > >>
> >>> > >> > On Aug 25, 2015 12:42 AM, "Mikhail Khludnev" <
> >>> > >> mkhlud...@griddynamics.com>
> >>> > >> > wrote:
> >>> > >> >
> >>> > >> > > Hello Jamie,
> >>> > >> > > I don't understand how it could choose DocValuesFacets (it
> >>> occurs on
> >>> > >> > > docValues=true) field, but then switches to
> >>> > >> UninvertingReader/FieldCache
> >>> > >> > > which means docValues=false. If you can provide more details
> it
> >>> > would
> >>> > >> be
> >>> > >> > > great.
> >>> > >> > > Beside of that, I suppose you can only implement and inject
> your
> >>> own
> >>> > >> > > UninvertingReader, I don't think there is an extension point
> for
> >>> > this.
> >>> > >> > It's
> >>> > >> > > too specific requirement.
> >>> > >> > >
> >>> > >> > > On Tue, Aug 25, 2015 at 3:50 AM, Jamie Johnson <
> >>> jej2...@gmail.com>
> >>> > >> > wrote:
> >>> > >> > >
> >>> > >> > > > as mentioned in a previous email I have a need to provide
> >>> security
> >>> > >> > > controls
> >>> > >> > > > at the term level.  I know that Lucene/Solr doesn't support
> >>> this
> >>> > so
> >>> > >> I
> >>> > >> > had
> >>> > >> > > > baked something onto a 4.x baseline that was sufficient for
> my
> >>> use
> >>> > >> > cases.
> >>> > >> > > > I am now looking to move that implementation to 5.x and am
> >>> running
> >>> > >> into
> >>> > >> > > an
> >>> > >> > > > issue around fac

FW: Issue while setting Solr on Slider / YARN

2015-08-27 Thread Vijay Bhoomireddy
 

Hi Tim,

 

For some reason, I was not receiving messages from the Solr mailing list,
though I could post to it. Now I have that sorted. For my query below, I
saw your response on the mailing list. Below is the snippet of your
response:

 

Hi Vijay,

 

Verify the ResourceManager URL and try passing the --manager param to

explicitly set the ResourceManager URL during the create step.

 

Cheers,

Tim

 

 

I verified the Resource Manager URL and it's pointing correctly. In my case,
it's myhdpcluster.com:8032. I even tried passing the --manager param to the
slider create solr command, but without any luck, so I am not sure where it's
going wrong. Can you please help me understand whether I need to modify
something else to get this working? I am also wondering whether
"${AGENT_WORK_ROOT}" in the step below has any impact? I haven't changed this
line in the json file. Should it be modified?

 

I am able to log in to the Ambari console and see all the services running
fine, including the YARN-related ones and ZooKeeper. I can also log in to the
Resource Manager's web UI and it's working fine. Can you please let me know
what / where it could have gone wrong?

 

 

Thanks & Regards

Vijay

 

From: Vijay Bhoomireddy [mailto:vijaya.bhoomire...@whishworks.com] 
Sent: 17 August 2015 11:37
To: solr-user@lucene.apache.org  
Subject: Issue while setting Solr on Slider / YARN

 

Hi,

 

Any help on this please?

 

Thanks & Regards

Vijay

 

From: Vijay Bhoomireddy [mailto:vijaya.bhoomire...@whishworks.com] 
Sent: 14 August 2015 18:03
To: solr-user@lucene.apache.org  
Subject: Issue while setting Solr on Slider / YARN

 

Hi,

 

We have a requirement to set up SolrCloud to work alongside Hadoop.
Earlier, I could set up a SolrCloud cluster separately alongside the Hadoop
cluster, i.e. it looked like two logical clusters sitting next to each other,
both relying on HDFS.

 

However, the experiment I am now trying is to install SolrCloud on
YARN using Apache Slider. I am following the LucidWorks blog at
https://github.com/LucidWorks/solr-slider for this. I already have a
Hortonworks HDP cluster. When I try to set up Solr on my HDP cluster using
Slider, I am facing some issues.

 

As per the blog, I have performed the below steps:

 

1.   I have set up a single-node HDP cluster whose hostname is
myhdpcluster.com, with all the essential services, including ZooKeeper and
Slider, running on it.

2.   Updated the resource manager address and port in slider-client.xml
present under /usr/hdp/current/slider-client/conf



<property>
  <name>yarn.resourcemanager.address</name>
  <value>myhdpcluster.com:8032</value>
</property>



3.   Cloned the LucidWorks git repo and moved it under /home/hdfs/solr-slider

4.   Downloaded the latest stable Solr distribution, renamed it
solr.tgz, and placed it at /home/hdfs/solr-slider/package/files/solr.tgz

5.   Next ran the following command from within the
/home/hdfs/solr-slider folder

zip -r solr-on-yarn.zip metainfo.xml package/

6.   Next ran the following command as hdfs user

slider install-package --replacepkg --name solr --package
/home/hdfs/solr-slider/solr-on-yarn.zip

7.   Modified the following settings in the
/home/hdfs/solr-slider/appConfig-default.json file

"java_home": MY_JAVA_HOME_LOCATION

"site.global.app_root": "${AGENT_WORK_ROOT}/app/install/solr-5.2.1",
(Should this be changed to any other value?)

"site.global.zk_host": " myhdpcluster.com:2181",

8.   Set yarn.component.instances to 1 in resources-default.json file

9.   Next ran the following command

slider create solr --template /home/hdfs/solr-slider/appConfig-default.json
--resources /home/hdfs/solr-slider/resources-default.json

 

During this step, I am seeing a message INFO client.RMProxy - Connecting to
ResourceManager at myhdpcluster.com/10.0.2.15:8032

 
INFO ipc.Client - Retrying connect to server:
myhdpcluster.com/10.0.2.15:8032. Already tried 0 time(s); 

 

This message repeats 50 times, then pauses for a couple of
seconds, and then prints the same message again in an endless loop. I am not
sure where the problem is.

 

Can anyone please help me get past this issue and set up
Solr on Slider/YARN?

 

Thanks & Regards

Vijay


-- 
The contents of this e-mail are confidential and for the exclusive use of 
the intended recipient. If you receive this e-mail in error please delete 
it from your system immediately and notify us either by e-mail or 
telephone. You should not copy, forward or otherwise disclose the content 
of the e-mail. The views expressed in this communication may not 
necessarily be the view held by WHISHWORKS.


Re: Lucene/Solr 5.0 and custom FieldCache implementation

2015-08-27 Thread Yonik Seeley
UnInvertingReader makes indexed fields look like docvalues fields.
The caching itself is still done in FieldCache/FieldCacheImpl
but you could perhaps wrap what is cached there to either screen out
stuff or construct a new entry based on the user.

-Yonik
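The per-user wrapping Yonik describes can be sketched in plain Java. Everything below is illustrative (none of these names are Solr or Lucene APIs): the point is simply that keying the cache on the field plus a normalized set of user authorities gives each security context its own cached entry, instead of one shared entry per field.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

public class PerUserFieldCache {

    // Composite cache key: field name plus a normalized set of authorities,
    // so the same authorities in any order map to the same entry.
    static final class Key {
        final String field;
        final Set<String> authorities;

        Key(String field, Set<String> authorities) {
            this.field = field;
            this.authorities = new TreeSet<>(authorities); // normalize ordering
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return field.equals(k.field) && authorities.equals(k.authorities);
        }

        @Override public int hashCode() {
            return Objects.hash(field, authorities);
        }
    }

    private final Map<Key, Object> cache = new HashMap<>();

    // Build (or reuse) the per-security-context view of an uninverted field.
    // The value here is a placeholder for whatever structure would be cached.
    Object getOrBuild(String field, Set<String> authorities) {
        return cache.computeIfAbsent(new Key(field, authorities),
                k -> "uninverted:" + k.field + ":" + k.authorities);
    }

    public static void main(String[] args) {
        PerUserFieldCache c = new PerUserFieldCache();
        Object a = c.getOrBuild("title", Set.of("secret", "public"));
        Object b = c.getOrBuild("title", Set.of("public", "secret")); // same set
        Object d = c.getOrBuild("title", Set.of("public"));           // different set
        System.out.println(a == b); // true: entry reused for equal authority sets
        System.out.println(a == d); // false: separate entry per authority set
    }
}
```

This is the composite-key idea Jamie describes using in 4.x; in 5.x it would have to live in whatever wraps the now package-private FieldCacheImpl.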


On Thu, Aug 27, 2015 at 12:55 PM, Jamie Johnson  wrote:
> I think a custom UnInvertingReader would work, as I could skip the process
> of putting things in the cache.  Right now in Solr 4.x, though, I am caching,
> but including the user's authorities in the key of the cache so we're
> not rebuilding the UninvertedField on every request.  Where in 5.x is the
> object actually cached?  Will this be possible in 5.x?
>
> On Thu, Aug 27, 2015 at 12:32 PM, Yonik Seeley  wrote:
>
>> The FieldCache has become implementation rather than interface, so I
>> don't think you're going to see plugins at that level (it's all
>> package protected now).
>>
>> One could either subclass or re-implement UnInvertingReader though.
>>
>> -Yonik
>>
>>
>> On Thu, Aug 27, 2015 at 12:09 PM, Jamie Johnson  wrote:
>> > Also in this vein I think that Lucene should support factories for the
>> > cache creation as described @
>> > https://issues.apache.org/jira/browse/LUCENE-2394.  I'm not endorsing
>> the
>> > patch that is provided (I haven't even looked at it) just the concept in
>> > general.
>> >
>> > On Thu, Aug 27, 2015 at 12:01 PM, Jamie Johnson 
>> wrote:
>> >
>> >> That makes sense, then I could extend the SolrIndexSearcher by creating
>> a
>> >> different factory class that did whatever magic I needed.  If you
>> create a
>> >> Jira ticket for this please link it here so I can track it!  Again
>> thanks
>> >>
>> >> On Thu, Aug 27, 2015 at 11:59 AM, Tomás Fernández Löbbe <
>> >> tomasflo...@gmail.com> wrote:
>> >>
>> >>> I don't think there is a way to do this now. Maybe we should separate
>> the
>> >>> logic of creating the SolrIndexSearcher to a factory. Moving this logic
>> >>> away from SolrCore is already a win, plus it will make it easier to
>> unit
>> >>> test and extend for advanced use cases.
>> >>>
>> >>> Tomás
>> >>>
>> >>> On Wed, Aug 26, 2015 at 8:10 PM, Jamie Johnson 
>> wrote:
>> >>>
>> >>> > Sorry to poke this again but I'm not following the last comment of
>> how I
>> >>> > could go about extending the solr index searcher and have the
>> extension
>> >>> > used.  Is there an example of this?  Again thanks
>> >>> >
>> >>> > Jamie
>> >>> > On Aug 25, 2015 7:18 AM, "Jamie Johnson"  wrote:
>> >>> >
>> >>> > > I had seen this as well, if I over wrote this by extending
>> >>> > > SolrIndexSearcher how do I have my extension used?  I didn't see a
>> way
>> >>> > that
>> >>> > > could be plugged in.
>> >>> > > On Aug 25, 2015 7:15 AM, "Mikhail Khludnev" <
>> >>> mkhlud...@griddynamics.com>
>> >>> > > wrote:
>> >>> > >
>> >>> > >> On Tue, Aug 25, 2015 at 2:03 PM, Jamie Johnson > >
>> >>> > wrote:
>> >>> > >>
>> >>> > >> > Thanks Mikhail.  If I'm reading the SimpleFacets class
>> correctly,
>> >>> it
>> >>> > >> > delegates to DocValuesFacets when facet method is FC, what used
>> to
>> >>> be
>> >>> > >> > FieldCache I believe.  DocValuesFacets either uses DocValues or
>> >>> builds
>> >>> > >> then
>> >>> > >> > using the UninvertingReader.
>> >>> > >> >
>> >>> > >>
>> >>> > >> Ah.. got it. Thanks for reminding me of these details. It seems like even
>> >>> > >> docValues=true doesn't help with your custom implementation.
>> >>> > >>
>> >>> > >>
>> >>> > >> >
>> >>> > >> > I am not seeing a clean extension point to add a custom
>> >>> > >> UninvertingReader
>> >>> > >> > to Solr, would the only way be to copy the FacetComponent and
>> >>> > >> SimpleFacets
>> >>> > >> > and modify as needed?
>> >>> > >> >
>> >>> > >> Sadly, yes. There is no proper extension point. Also, consider
>> >>> > overriding
>> >>> > >> SolrIndexSearcher.wrapReader(SolrCore, DirectoryReader) where the
>> >>> > >> particular UninvertingReader is created, there you can pass the
>> own
>> >>> one,
>> >>> > >> which refers to custom FieldCache.
>> >>> > >>
>> >>> > >>
>> >>> > >> > On Aug 25, 2015 12:42 AM, "Mikhail Khludnev" <
>> >>> > >> mkhlud...@griddynamics.com>
>> >>> > >> > wrote:
>> >>> > >> >
>> >>> > >> > > Hello Jamie,
>> >>> > >> > > I don't understand how it could choose DocValuesFacets (it
>> >>> occurs on
>> >>> > >> > > docValues=true) field, but then switches to
>> >>> > >> UninvertingReader/FieldCache
>> >>> > >> > > which means docValues=false. If you can provide more details
>> it
>> >>> > would
>> >>> > >> be
>> >>> > >> > > great.
>> >>> > >> > > Beside of that, I suppose you can only implement and inject
>> your
>> >>> own
>> >>> > >> > > UninvertingReader, I don't think there is an extension point
>> for
>> >>> > this.
>> >>> > >> > It's
>> >>> > >> > > too specific requirement.
>> >>> > >> > >
>> >>> > >> > > On Tue, Aug 25, 2015 at 3:50 AM, Jamie Johnson <
>> >>> jej2...@gmail.com>
>> >>> > >> > wrote:
>> >>> > >> > >
>> >>> > >> > > > as mentioned in a previous email I have a n

Re: StrDocValues

2015-08-27 Thread Jamie Johnson
Thanks Yonik.  I currently am using this to negate the score of a document
given the value of a particular field within the document, then using a
custom AnalyticQuery to only collect documents with a score > 0.  Will this
also impact the faceting counts?
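As a toy model of the two steps described here (a boost-like step that negates the score of restricted documents, then a gate modeled on the custom AnalyticsQuery that keeps only documents still scoring above zero), with purely illustrative names and no Solr APIs:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ScoreGateDemo {
    // Negate the score of documents flagged as restricted, mimicking a
    // boost function driven by a field value in the document.
    static Map<String, Double> applyBoost(Map<String, Double> scores, List<String> restricted) {
        Map<String, Double> out = new LinkedHashMap<>();
        scores.forEach((doc, score) ->
                out.put(doc, restricted.contains(doc) ? -score : score));
        return out;
    }

    // Collect only documents still scoring above zero, mimicking the
    // custom AnalyticsQuery acting as a post-boost gate.
    static List<String> collectPositive(Map<String, Double> scores) {
        return scores.entrySet().stream()
                .filter(e -> e.getValue() > 0)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("doc1", 1.2);
        scores.put("doc2", 0.8);
        scores.put("doc3", 2.5);
        List<String> visible = collectPositive(applyBoost(scores, List.of("doc2")));
        System.out.println(visible); // [doc1, doc3]
    }
}
```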

On Wed, Aug 26, 2015 at 8:32 PM, Yonik Seeley  wrote:

> On Wed, Aug 26, 2015 at 6:20 PM, Jamie Johnson  wrote:
> > I don't see it explicitly mentioned, but does the boost only get applied
> to
> > the final documents/score that matched the provided query or is it called
> > for each field that matched?  I'm assuming only once per document that
> > matched the main query, is that right?
>
> Correct.
>
> -Yonik
>


Re: Search to Ignore ","

2015-08-27 Thread EXTERNAL Taminidi Ravi (ETI, AA-AS/PAS-PTS)
Hi Team, can someone help with implementing a filter to ignore "," in a string?

e.g. I have "Technical, Specification"

If I search "Technical" or "Technical, Specification" I should get this
result, but currently I do not.


I simply want to ignore the ",". I can't re-index the existing data, as it is
very huge; is there any way to do it at query time?


Thanks

Ravi


Re: StrDocValues

2015-08-27 Thread Jamie Johnson
Actually I should have just tried this before asking but I'll say what I'm
seeing and maybe someone can confirm.

Faceting looks like it took this into account, i.e. the counts were 0 for
values that were in documents that I removed using my AnalyticQuery.  I had
expected that the AnalyticsQuery might be done after everything was
completed, but it looks like it was executed before faceting which is great.

On Thu, Aug 27, 2015 at 1:17 PM, Jamie Johnson  wrote:

> Thanks Yonik.  I currently am using this to negate the score of a document
> given the value of a particular field within the document, then using a
> custom AnalyticQuery to only collect documents with a score > 0.  Will this
> also impact the faceting counts?
>
> On Wed, Aug 26, 2015 at 8:32 PM, Yonik Seeley  wrote:
>
>> On Wed, Aug 26, 2015 at 6:20 PM, Jamie Johnson  wrote:
>> > I don't see it explicitly mentioned, but does the boost only get
>> applied to
>> > the final documents/score that matched the provided query or is it
>> called
>> > for each field that matched?  I'm assuming only once per document that
>> > matched the main query, is that right?
>>
>> Correct.
>>
>> -Yonik
>>
>
>


Re: Search to Ignore ","

2015-08-27 Thread Alexandre Rafalovitch
This is both a very specific and a very general question at the same time.
The way indexing and search are both done is via analyzer chains, as
defined in your schema. So, you need to check what the definition is
for the field you search and then play with that.

There is "Analysis" screen in the Web Admin UI, which shows you what
happens with your text as it gets indexed and searched. So, try
different field type definitions and see what happens. You don't even
need to index the text, just have field type definitions in the loaded
core.

Regards,
   Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/
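For the comma question above, for example, a field type along these lines (the name text_general is illustrative; check the definition your field actually uses) discards the comma during tokenization at both index and query time:

```xml
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- StandardTokenizer splits on punctuation, so "Technical, Specification"
         becomes the tokens "technical" and "specification" after lowercasing -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Note that if the field is currently a plain string type, switching it to a tokenized type does require re-indexing that field; only the query-time half of the analysis chain applies without it.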


On 27 August 2015 at 13:17, EXTERNAL Taminidi Ravi (ETI,
AA-AS/PAS-PTS)  wrote:
> Hi Team, can someone help with implementing a filter to ignore "," in a string?
>
> e.g. I have "Technical, Specification"
>
> If I search "Technical" or "Technical, Specification" I should get this
> result, but currently I do not.
>
>
> I simply want to ignore the ",". I can't re-index the existing data, as it is
> very huge; is there any way to do it at query time?
>
>
> Thanks
>
> Ravi


Re: Looking for Traditional Chinese support

2015-08-27 Thread Jeanne Wang
"Chinese", as opposed to "Simplified Chinese", should be Traditional Chinese.

Jeanne

On Thu, Aug 27, 2015 at 12:51 PM, Steven White  wrote:

> Hi Everyone
>
> Per
>
> https://cwiki.apache.org/confluence/display/solr/Language+Analysis#LanguageAnalysis-Language-SpecificFactories
> I see the languages Solr supports.  Where is Traditional Chinese?  Is CJK
> the one?
>
> Thanks
>
> Steve
>


Re: StrDocValues

2015-08-27 Thread Erick Erickson
Right: when scoring, any document that scores 0 is removed from the
results, and facets etc. are calculated afterwards.

Best,
Erick

On Thu, Aug 27, 2015 at 10:24 AM, Jamie Johnson  wrote:
> Actually I should have just tried this before asking but I'll say what I'm
> seeing and maybe someone can confirm.
>
> Faceting looks like it took this into account, i.e. the counts were 0 for
> values that were in documents that I removed using my AnalyticQuery.  I had
> expected that the AnalyticsQuery might be done after everything was
> completed, but it looks like it was executed before faceting which is great.
>
> On Thu, Aug 27, 2015 at 1:17 PM, Jamie Johnson  wrote:
>
>> Thanks Yonik.  I currently am using this to negate the score of a document
>> given the value of a particular field within the document, then using a
>> custom AnalyticQuery to only collect documents with a score > 0.  Will this
>> also impact the faceting counts?
>>
>> On Wed, Aug 26, 2015 at 8:32 PM, Yonik Seeley  wrote:
>>
>>> On Wed, Aug 26, 2015 at 6:20 PM, Jamie Johnson  wrote:
>>> > I don't see it explicitly mentioned, but does the boost only get
>>> applied to
>>> > the final documents/score that matched the provided query or is it
>>> called
>>> > for each field that matched?  I'm assuming only once per document that
>>> > matched the main query, is that right?
>>>
>>> Correct.
>>>
>>> -Yonik
>>>
>>
>>


Re: Looking for Traditional Chinese support

2015-08-27 Thread Steven White
Hi Jeanne,

I don't understand.  Are you saying "Chinese Tokenizer" per
https://cwiki.apache.org/confluence/display/solr/Language+Analysis#LanguageAnalysis-Chinese
is "Traditional Chinese"?  If so, then it "is deprecated as of Solr 3.4"
and I just tried it with Solr 5.2 and could not get Solr started because
solr.ChineseFilterFactory cannot be loaded.

This is what I tried:

<analyzer>
  <tokenizer class="solr.ChineseTokenizerFactory"/>
  <filter class="solr.ChineseFilterFactory"/>
</analyzer>

Thanks

Steve

On Thu, Aug 27, 2015 at 2:20 PM, Jeanne Wang  wrote:

> "Chinese", as opposed to "Simplified Chinese", should be Traditional Chinese.
>
> Jeanne
>
> On Thu, Aug 27, 2015 at 12:51 PM, Steven White 
> wrote:
>
> > Hi Everyone
> >
> > Per
> >
> >
> https://cwiki.apache.org/confluence/display/solr/Language+Analysis#LanguageAnalysis-Language-SpecificFactories
> > I see the languages Solr supports.  Where is Traditional Chinese?  Is CJK
> > the one?
> >
> > Thanks
> >
> > Steve
> >
>


"no default request handler is registered"

2015-08-27 Thread Scott Hollenbeck
I'm doing some experimenting with Solr 5.3 and the 7.x-1.x-dev version of
the Apache Solr Search module for Drupal. Things seem to be working fine,
except that this warning message appears in the Solr admin logging window
and in the server log:

"no default request handler is registered (either '/select' or 'standard')"

Looking at the solrconfig.xml file that comes with the Drupal module I see a
requestHandler named "standard":

  
 
   content
   explicit
   true
 
  

I also see a handler named pinkPony with a "default" attribute set to
"true":

  
  

  edismax
  content
  explicit
  true
  0.01
  
  ${solr.pinkPony.timeAllowed:-1}
  *:*

  
  false
  
  true
  false
  
  1


  spellcheck
  elevator

  

So it seems like there are both standard and default requestHandlers
specified. Why is the warning produced? What am I missing?

Thank you,
Scott
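For reference, Solr example configs since 3.6 register the default search handler under the path-style name "/select" rather than "standard"; a minimal equivalent of the handler quoted above would look something like this (the df value is taken from the Drupal config above, the rest is assumed):

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="df">content</str>
  </lst>
</requestHandler>
```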



Re: StrDocValues

2015-08-27 Thread Yonik Seeley
On Thu, Aug 27, 2015 at 2:43 PM, Erick Erickson  wrote:
> Right, when scoring any document that scores 0 is removed from the
> results

Just to clarify, I think Jamie removed 0 scoring documents himself.

Solr has never done this itself.  Lucene used to, a long time ago, and
then stopped, IIRC.

-Yonik


Re: "no default request handler is registered"

2015-08-27 Thread Chris Hostetter

That's... strange.

Looking at the code, it appears to be a totally bogus and misleading
warning -- but it also shouldn't affect anything.

You can feel free to ignore it for now...

https://issues.apache.org/jira/browse/SOLR-7984



: Date: Thu, 27 Aug 2015 15:10:18 -0400
: From: Scott Hollenbeck 
: Reply-To: solr-user@lucene.apache.org
: To: solr-user@lucene.apache.org
: Subject: "no default request handler is registered"
: 
: I'm doing some experimenting with Solr 5.3 and the 7.x-1.x-dev version of
: the Apache Solr Search module for Drupal. Things seem to be working fine,
: except that this warning message appears in the Solr admin logging window
: and in the server log:
: 
: "no default request handler is registered (either '/select' or 'standard')"
: 
: Looking at the solrconfig.xml file that comes with the Drupal module I see a
: requestHandler named "standard":
: 
:   
:  
:content
:explicit
:true
:  
:   
: 
: I also see a handler named pinkPony with a "default" attribute set to
: "true":
: 
:   
:   
: 
:   edismax
:   content
:   explicit
:   true
:   0.01
:   
:   ${solr.pinkPony.timeAllowed:-1}
:   *:*
: 
:   
:   false
:   
:   true
:   false
:   
:   1
: 
: 
:   spellcheck
:   elevator
: 
:   
: 
: So it seems like there are both standard and default requestHandlers
: specified. Why is the warning produced? What am I missing?
: 
: Thank you,
: Scott
: 
: 

-Hoss
http://www.lucidworks.com/


Re: "no default request handler is registered"

2015-08-27 Thread Shawn Heisey
On 8/27/2015 1:10 PM, Scott Hollenbeck wrote:
> I'm doing some experimenting with Solr 5.3 and the 7.x-1.x-dev version of
> the Apache Solr Search module for Drupal. Things seem to be working fine,
> except that this warning message appears in the Solr admin logging window
> and in the server log:
> 
> "no default request handler is registered (either '/select' or 'standard')"
> 
> Looking at the solrconfig.xml file that comes with the Drupal module I see a
> requestHandler named "standard":
> 
>   
>  
>content
>explicit
>true
>  
>   
> 
> I also see a handler named pinkPony with a "default" attribute set to
> "true":



> So it seems like there are both standard and default requestHandlers
> specified. Why is the warning produced? What am I missing?

I think the warning message may be misworded, or logged in incorrect
circumstances, and might need some attention.

The solrconfig.xml that you are using (which I assume came from the
Drupal project) is geared towards a 3.x version of Solr prior to 3.6.x
(the last minor version in the 3.x line).

Starting in the 3.6 version, all request handlers in examples have names
that start with a forward slash, like "/select", none of them have the
"default" attribute, and the handleSelect parameter found elsewhere in
the solrconfig.xml is false.
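
For comparison, a minimal modern-style handler definition (an illustrative
sketch only -- the class and parameter values are assumptions, not the exact
Drupal config) looks like this:

```xml
<!-- The handler is addressed by path; no "default" attribute is needed. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="df">content</str>
  </lst>
</requestHandler>

<!-- Elsewhere in solrconfig.xml: stop routing /select through qt. -->
<requestDispatcher handleSelect="false">
  <httpCaching never304="true"/>
</requestDispatcher>
```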

You should bring this up with the Drupal folks and ask them to upgrade
their config/schema and their code for modern versions of Solr.  Solr
3.6.0 (which deprecated their handler naming convention and the
"default" attribute) was released over three years ago.

More info than you probably wanted to know:  The reason this change was
made is security-related.  With the old way of naming request handlers
and handling /select indirectly, you could send a query to /select,
include a qt=/update parameter, and change the index via a handler
intended only for queries.
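
To illustrate the risk (a hypothetical URL and core name, not taken from
this thread): with the old handleSelect=true behavior, a request arriving
at the "query" URL could be redirected to the update handler, e.g.

```
http://localhost:8983/solr/collection1/select?qt=/update
    &stream.body=<delete><query>*:*</query></delete>&commit=true
```

which would delete every document in the index even though it looks like a
search request.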

Thanks,
Shawn



Re: "no default request handler is registered"

2015-08-27 Thread Chris Hostetter

I just want to clarify: all of Shawn's points below are valid and good -- 
but they still don't explain the warning message you are getting.  It makes 
no sense as the code is currently written, and doesn't do anything to help 
encourage people to transition to path-based handler names.



: Date: Thu, 27 Aug 2015 13:50:51 -0600
: From: Shawn Heisey 
: Reply-To: solr-user@lucene.apache.org
: To: solr-user@lucene.apache.org
: Subject: Re: "no default request handler is registered"
: 
: On 8/27/2015 1:10 PM, Scott Hollenbeck wrote:
: > I'm doing some experimenting with Solr 5.3 and the 7.x-1.x-dev version of
: > the Apache Solr Search module for Drupal. Things seem to be working fine,
: > except that this warning message appears in the Solr admin logging window
: > and in the server log:
: > 
: > "no default request handler is registered (either '/select' or 'standard')"
: > 
: > Looking at the solrconfig.xml file that comes with the Drupal module I see a
: > requestHandler named "standard":
: > 
: >   
: >  
: >content
: >explicit
: >true
: >  
: >   
: > 
: > I also see a handler named pinkPony with a "default" attribute set to
: > "true":
: 
: 
: 
: > So it seems like there are both standard and default requestHandlers
: > specified. Why is the warning produced? What am I missing?
: 
: I think the warning message may be misworded, or logged in incorrect
: circumstances, and might need some attention.
: 
: The solrconfig.xml that you are using (which I assume came from the
: Drupal project) is geared towards a 3.x version of Solr prior to 3.6.x
: (the last minor version in the 3.x line).
: 
: Starting in the 3.6 version, all request handlers in examples have names
: that start with a forward slash, like "/select", none of them have the
: "default" attribute, and the handleSelect parameter found elsewhere in
: the solrconfig.xml is false.
: 
: You should bring this up with the Drupal folks and ask them to upgrade
: their config/schema and their code for modern versions of Solr.  Solr
: 3.6.0 (which deprecated their handler naming convention and the
: "default" attribute) was released over three years ago.
: 
: More info than you probably wanted to know:  The reason this change was
: made is security-related.  With the old way of naming request handlers
: and handling /select indirectly, you could send a query to /select,
: include a qt=/update parameter, and change the index via a handler
: intended only for queries.
: 
: Thanks,
: Shawn
: 
: 

-Hoss
http://www.lucidworks.com/


Re: StrDocValues

2015-08-27 Thread Jamie Johnson
Right, I am removing them myself.  Another feature which would be great is
the ability to specify a custom collector -- like the positive-scores-only
collector in this case -- to avoid an extra pass over all of the scores,
but I don't believe there is a way to do that now, right?
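
For what it's worth, Lucene does ship a wrapper with exactly this shape
(PositiveScoresOnlyCollector), though Solr doesn't expose an easy hook for
plugging one in here. Stripped of the Lucene types, the wrapping idea is
just the following -- note the interface below is a simplified stand-in for
illustration, not a Solr or Lucene API:

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for a Lucene-style collector (illustration only). */
interface SimpleCollector {
    void collect(int docId, float score);
}

/**
 * Sketch of the positive-scores-only wrapping idea: rather than a second
 * pass that strips zero-score hits, the wrapper drops them at collect time.
 */
public class PositiveOnly implements SimpleCollector {
    private final SimpleCollector delegate;

    public PositiveOnly(SimpleCollector delegate) {
        this.delegate = delegate;
    }

    @Override
    public void collect(int docId, float score) {
        if (score > 0.0f) {          // skip zero (and negative) scores
            delegate.collect(docId, score);
        }
    }

    public static void main(String[] args) {
        List<Integer> kept = new ArrayList<>();
        SimpleCollector sink = (doc, score) -> kept.add(doc);
        SimpleCollector c = new PositiveOnly(sink);
        c.collect(1, 0.0f);
        c.collect(2, 0.8f);
        c.collect(3, 0.0f);
        c.collect(4, 1.2f);
        System.out.println(kept);    // prints [2, 4]
    }
}
```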

On Thu, Aug 27, 2015 at 3:16 PM, Yonik Seeley  wrote:

> On Thu, Aug 27, 2015 at 2:43 PM, Erick Erickson 
> wrote:
> > Right, when scoring any document that scores 0 is removed from the
> > results
>
> Just to clarify, I think Jamie removed 0 scoring documents himself.
>
> Solr has never done this itself.  Lucene used to a long time ago and
> then stopped IIRC.
>
> -Yonik
>


Re: FW: Issue while setting Solr on Slider / YARN

2015-08-27 Thread Timothy Potter
Hi Vijay,

I'm not sure what's wrong here ... have you posted to the Slider
mailing list? Also, which version of Java are you using when
interacting with Slider? I know it had some issues with Java 8 at one
point. Which version of Slider so I can try to reproduce ...

Cheers,
Tim

On Thu, Aug 27, 2015 at 11:05 AM, Vijay Bhoomireddy
 wrote:
>
>
> Hi Tim,
>
>
>
> For some reason, I was not receiving messages from Solr mailing list, though
> I could post it to the list. Now I got that sorted. For my below query, I
> saw your response on the mailing list. Below is the snippet of your
> response:
>
>
>
> Hi Vijay,
>
>
>
> Verify the ResourceManager URL and try passing the --manager param to
>
> explicitly set the ResourceManager URL during the create step.
>
>
>
> Cheers,
>
> Tim
>
>
>
>
>
> I verified the Resource Manager URL and its pointing correctly. In my case,
> it's myhdpcluster.com:8032 I even tried passing --manager param to the
> slider create solr command, but without any luck. So not sure where it's
> getting wrong. Can you please help me in understanding if I need to modify
> something else to get this working? I am just wondering whether
> "${AGENT_WORK_ROOT} in step below has any impact? I haven't changed this
> line in the json file. Should this be modified?
>
>
>
> I am able to login to Ambari console and see all the services running fine ,
> including YARN related and ZooKeeper. Also, I could login to Resource
> Manager's web UI as well and its working fine. Can you please let me know
> what / where it could have gone wrong?
>
>
>
>
>
> Thanks & Regards
>
> Vijay
>
>
>
> From: Vijay Bhoomireddy [mailto:vijaya.bhoomire...@whishworks.com]
> Sent: 17 August 2015 11:37
> To: solr-user@lucene.apache.org 
> Subject: Issue while setting Solr on Slider / YARN
>
>
>
> Hi,
>
>
>
> Any help on this please?
>
>
>
> Thanks & Regards
>
> Vijay
>
>
>
> From: Vijay Bhoomireddy [mailto:vijaya.bhoomire...@whishworks.com]
> Sent: 14 August 2015 18:03
> To: solr-user@lucene.apache.org 
> Subject: Issue while setting Solr on Slider / YARN
>
>
>
> Hi,
>
>
>
> We have a requirement of setting up of Solr Cloud to work along with Hadoop.
> Earlier, I could setup a SolrCloud cluster separately alongside the Hadoop
> cluster i.e. it looks like two logical  clusters sitting next to each other,
> both relying on HDFS.
>
>
>
> However, the experiment now I am trying to do is to install SolrCloud on
> YARN using Apache Slider. I am following LucidWorks blog at
> https://github.com/LucidWorks/solr-slider for the same. I already have a
> Hortonworks HDP cluster. When I try to setup Solr on my HDP cluster using
> Slider, I am facing some issues.
>
>
>
> As per the blog, I have performed the below steps:
>
>
>
> 1.   I have setup a single node HDP cluster for which the hostname is
> myhdpcluster.com with all the essential services including ZooKeeper and
> Slider running on it.
>
> 2.   Updated the resource manager address and port in slider-client.xml
> present under /usr/hdp/current/slider-client/conf
>
> 
>
> yarn.resourcemanager.address
>
>  myhdpcluster.com:8032
>
> 
>
> 3.   Cloned the LucidWorks git and moved it under /home/hdfs/solr-slider
>
> 4.   Downloaded solr latest stable distribution and renamed it as
> solr.tgz and placed it under /home/hdfs/solr-slider/package/files/solr.tgz
>
> 5.   Next ran the following command from within the
> /home/hdfs/solr-slider folder
>
> zip -r solr-on-yarn.zip metainfo.xml package/
>
> 6.   Next ran the following command as hdfs user
>
> slider install-package --replacepkg --name solr --package
> /home/hdfs/solr-slider/solr-on-yarn.zip
>
> 7.   Modified the following settings in the
> /home/hdfs/solr-slider/appConfig-default.json file
>
> "java_home": MY_JAVA_HOME_LOCATION
>
> "site.global.app_root": "${AGENT_WORK_ROOT}/app/install/solr-5.2.1",
> (Should this be changed to any other value?)
>
> "site.global.zk_host": " myhdpcluster.com:2181",
>
> 8.   Set yarn.component.instances to 1 in resources-default.json file
>
> 9.   Next ran the following command
>
> slider create solr --template /home/hdfs/solr-slider/appConfig-default.json
> --resources /home/hdfs/solr-slider/resources-default.json
>
>
>
> During this step, I am seeing an message INFO client.RMProxy - Connecting to
> ResourceManager at myhdpcluster.com/10.0.2.15:8032
>
>
> INFO ipc.Client - Retrying connect to server:
> myhdpcluster.com/10.0.2.15:8032. Already tried 0 time(s);
>
>
>
> This message keeps repeating for 50 times and then pauses for a couple of
> seconds and then prints the same message in a loop eternally. Not sure on
> where the problem is.
>
>
>
> Can anyone please help me out to get away from this issue and help me setup
> Solr on Slider/YARN?
>
>
>
> Thanks & Regards
>
> Vijay
>
>
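
One way to sanity-check step 2 above before probing the port is to confirm
which ResourceManager address slider-client.xml actually carries. This is a
sketch -- the real file lives at
/usr/hdp/current/slider-client/conf/slider-client.xml, and the hostname/port
are the ones mentioned in the thread, so substitute your own:

```shell
# A temp copy stands in for slider-client.xml so the sketch is self-contained.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>myhdpcluster.com:8032</value>
  </property>
</configuration>
EOF
# Pull the <value> that follows the yarn.resourcemanager.address property.
RM=$(grep -A1 'yarn.resourcemanager.address' "$CONF" \
     | sed -n 's:.*<value>\(.*\)</value>.*:\1:p')
echo "ResourceManager address: $RM"
# From here you would probe reachability, e.g.: nc -z -w5 ${RM%:*} ${RM#*:}
rm -f "$CONF"
```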

RE: FW: Issue while setting Solr on Slider / YARN

2015-08-27 Thread Vijay Bhoomireddy
Tim,

Here is the complete content of the appConfig-default.json file. I haven't 
worked with Slider before, so I am not sure whether a mistake crept into this 
file while I was modifying it per the changes LucidWorks describes on their 
GitHub page; I tried to follow the steps on the GitHub solr-slider page 
exactly. As you can see from the file below, I am using Java 7. I have 
installed the HDP 2.3 GA release, which includes Slider version 0.80.

{
  "schema": "http://example.org/specification/v2.0.0",
  "metadata": {
  },
  "global": {
    "application.def": "/home/hdfs/solr-slider/solr-on-yarn.zip",
    "java_home": "/usr/lib/jvm/jre-1.7.0-openjdk.x86_64",
    "site.global.app_root": "${AGENT_WORK_ROOT}/app/install/solr-5.2.1-SNAPSHOT",
    "site.global.zk_host": "myhdpcluster.com:2181",
    "site.global.solr_host": "${SOLR_HOST}",
    "site.global.listen_port": "${SOLR.ALLOCATED_PORT}",
    "site.global.xmx_val": "1g",
    "site.global.xms_val": "1g",
    "site.global.gc_tune": "-XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark -XX:PretenureSizeThreshold=64m -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=6000 -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime",
    "site.global.zk_timeout": "15000",
    "site.global.server_module": "--module=http",
    "site.global.stop_key": "solrrocks",
    "site.global.solr_opts": ""
  },
  "components": {
    "slider-appmaster": {
      "jvm.heapsize": "512M"
    },
    "SOLR": {
    }
  }
}

Thanks & Regards
Vijay

-Original Message-
From: Timothy Potter [mailto:thelabd...@gmail.com] 
Sent: 27 August 2015 21:44
To: solr-user@lucene.apache.org
Subject: Re: FW: Issue while setting Solr on Slider / YARN

Hi Vijay,

I'm not sure what's wrong here ... have you posted to the Slider mailing list? 
Also, which version of Java are you using when interacting with Slider? I know 
it had some issues with Java 8 at one point. Which version of Slider so I can 
try to reproduce ...

Cheers,
Tim

On Thu, Aug 27, 2015 at 11:05 AM, Vijay Bhoomireddy 
 wrote:
>
>
> Hi Tim,
>
>
>
> For some reason, I was not receiving messages from Solr mailing list, 
> though I could post it to the list. Now I got that sorted. For my 
> below query, I saw your response on the mailing list. Below is the 
> snippet of your
> response:
>
>
>
> Hi Vijay,
>
>
>
> Verify the ResourceManager URL and try passing the --manager param to
>
> explicitly set the ResourceManager URL during the create step.
>
>
>
> Cheers,
>
> Tim
>
>
>
>
>
> I verified the Resource Manager URL and its pointing correctly. In my 
> case, it's myhdpcluster.com:8032 I even tried passing --manager param 
> to the slider create solr command, but without any luck. So not sure 
> where it's getting wrong. Can you please help me in understanding if I 
> need to modify something else to get this working? I am just wondering 
> whether "${AGENT_WORK_ROOT} in step below has any impact? I haven't 
> changed this line in the json file. Should this be modified?
>
>
>
> I am able to login to Ambari console and see all the services running 
> fine , including YARN related and ZooKeeper. Also, I could login to 
> Resource Manager's web UI as well and its working fine. Can you please 
> let me know what / where it could have gone wrong?
>
>
>
>
>
> Thanks & Regards
>
> Vijay
>
>
>
> From: Vijay Bhoomireddy [mailto:vijaya.bhoomire...@whishworks.com]
> Sent: 17 August 2015 11:37
> To: solr-user@lucene.apache.org 
> Subject: Issue while setting Solr on Slider / YARN
>
>
>
> Hi,
>
>
>
> Any help on this please?
>
>
>
> Thanks & Regards
>
> Vijay
>
>
>
> From: Vijay Bhoomireddy [mailto:vijaya.bhoomire...@whishworks.com]
> Sent: 14 August 2015 18:03
> To: solr-user@lucene.apache.org 
> Subject: Issue while setting Solr on Slider / YARN
>
>
>
> Hi,
>
>
>
> We have a requirement of setting up of Solr Cloud to work along with Hadoop.
> Earlier, I could setup a SolrCloud cluster separately alongside the 
> Hadoop cluster i.e. it looks like two logical  clusters sitting next 
> to each other, both relying on HDFS.
>
>
>
> However, the experiment now I am trying to do is to install SolrCloud 
> on YARN using Apache Slider. I am following LucidWorks blog at 
> https://github.com/LucidWorks/solr-slider for the same. I already have 
> a Hortonworks HDP cluster. When I try to setup Solr on my HDP cluster 
> using Slider, I am facing some issues.
>
>
>
> As per the blog, I have performed the below steps:
>
>
>
> 1.   I have setup a single node HDP cluster for which the hostname is
> myhdpcluster.com with a