Not Able to Build Spellcheck index - SpellCheckComponent.prepare 500 Error

2014-04-05 Thread sameer
Hi
I am trying to use spellcheck in Solr with the config below, but it throws an
error when I use spellcheck build or reload.

It works fine otherwise for indexed search. Can someone please help me
implement spellcheck correctly?

schema.xml:

// fieldType declaration

  





  


//field name


//copyFields











solrconfig.xml:

//searchComponent


textSpell

  solr.IndexBasedSpellChecker
  default
  ./spellchecker
  categoryName,dealName,seoTags,description,dealTitle,merchantName,dealUri,highlights
  true
  0.9




//default requestHandler

  

 
explicit
true
direct
on
true
5
true
true
   
 
 
spellcheck
 
  


// URL params
select?q=*%3A*&wt=php&indent=true&spellcheck=true&spellcheck.build=true


//output


array(
  'responseHeader'=>array(
'status'=>500,
'QTime'=>4,
'params'=>array(
  'spellcheck'=>'true',
  'indent'=>'true',
  'q'=>'*:*',
  '_'=>'1396684768649',
  'wt'=>'php',
  'spellcheck.build'=>'true')),
  'error'=>array(
'trace'=>'java.lang.NullPointerException
at org.apache.solr.handler.component.SpellCheckComponent.prepare(SpellCheckComponent.java:125)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:187)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:710)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:413)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:197)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:767)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.handler.DebugHandler.handle(DebugHandler.java:77)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
',
'code'=>500))



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Not-Able-to-Build-Spellcheck-index-SpellCheckComponent-prepare-500-Error-tp4129368.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Not Able to Build Spellcheck index - SpellCheckComponent.prepare 500 Error

2014-04-05 Thread sameer
It's Solr 4.6.0.
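
One thing that may be worth checking, offered only as a guess because the parameter names were stripped from the config listing in the first message: if the handler default "direct" is the value of spellcheck.dictionary, it does not match the only configured spellchecker, which is named "default", and the component may then have no spellchecker to build. A minimal Python sketch of a build request that names the dictionary explicitly follows. The host, port and core name (localhost:8983, collection1) are placeholders, not values from the thread.

```python
from urllib.parse import urlencode


def spellcheck_build_url(core_url, dictionary):
    """Return a /select URL asking Solr to build the named spellcheck dictionary."""
    params = {
        "q": "*:*",
        "wt": "json",
        "spellcheck": "true",
        "spellcheck.build": "true",
        # Must match the "name" of a spellchecker configured in solrconfig.xml.
        "spellcheck.dictionary": dictionary,
    }
    return core_url + "/select?" + urlencode(params)


# Placeholder core URL; "default" is the spellchecker name from the config above.
print(spellcheck_build_url("http://localhost:8983/solr/collection1", "default"))
```

If the request succeeds with the explicit name, aligning the handler default with the spellchecker name would be the next thing to try.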





Re: Handling fields used for both display & index

2016-01-31 Thread Sameer Maggon
Hi Jay,

You could use one field for both unless there is a specific requirement you
are looking for that is not being met by that one field (e.g. faceting,
etc). Typically, if you have a field that is marked as both "indexed" and
"stored", the value that is passed while indexing to that field is stored
as is. However, it's indexed based on the field type that you've specified
for that field.

e.g. a description field with the field type of "text_en" would be indexed
per the pipeline in the text_en fieldtype and the text as is will be stored
(which is what is returned in your response in the results).
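
To illustrate the indexed-versus-stored distinction, here is a toy Python sketch. It is not Solr's analysis chain: the index_terms function is a hypothetical stand-in that only splits on whitespace and lowercases, which is roughly the part of text_en that makes "John" and "john" match.

```python
def index_terms(text):
    """Toy stand-in for an analysis chain: split on whitespace, then lowercase."""
    return [token.lower() for token in text.split()]


# One field, marked both indexed and stored.
original = "John Smith joined the team"
stored_value = original                 # stored as is, returned in results
indexed_terms = index_terms(original)   # what searches are matched against


def matches(query):
    """True if every analyzed query term appears among the indexed terms."""
    return all(term in indexed_terms for term in index_terms(query))


print(matches("John"), matches("john"))  # both queries hit the same indexed term
print(stored_value)                      # display text keeps its original case
```

So a single text_en field can serve both search and display; a second copy is only needed when another use (such as faceting on the exact value) calls for a different field type.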

Thanks,
-- 
*Sameer Maggon*
Measured Search | Solr-as-a-Service | Solr Monitoring | Search Analytics
www.measuredsearch.com

On Sun, Jan 31, 2016 at 5:56 PM, Jay Potharaju 
wrote:

> Hi,
> I am trying to decide if I should use text_en or string as my field type.
> The fields have to be both indexed and stored for display. One solution is
> to duplicate fields, one for indexing and the other for display. One of the fields
> happens to be a description field, which I would like to avoid duplicating.
> Solr should return results when someone searches for John or john. Is
> storing a copy of the field the best way to go about this problem?
>
>
> Thanks
>


Re: Amazon CloudSearch

2016-04-26 Thread Sameer Maggon
Hi Sergio,

CloudSearch is a Search-as-a-Service that uses Solr underneath, though it
has a proprietary API to interact with it, on both the document side and the
query side. It won't give you the ability to 'manage' Solr instances or a cluster.
If you have a use case where you want to keep pumping data in and forget
about it, it is good for that, as it offers auto-scaling. It does not offer
visibility into what's going on underneath, and gives no control over
solrconfig, custom plugins, etc. There is no spellcheck, etc.

If you are looking for a service that provides you direct access to Solr's
APIs without having to rewrite your application, then CloudSearch is
probably not what you are looking for.

Take a look at Measured Search (www.measuredsearch.com) - It offers
Solr-as-a-Service on top of AWS, Azure and Google Cloud that allows you
direct access to Solr and the ability to manage your instances. The platform
currently comprises three products:

1. SearchStax Cloud Manager - Allows you to deploy, manage and scale Solr.
- Provides High Availability as instances are front-ended with ELB (load
balancers).
- One time and scheduled backups.
- Cloning of deployments;
- Ability to add / remove nodes, real time log access and log archival.
- All deployments run on https, supports auth
- Enterprise version allows you to deploy & manage Solr within your AWS
account as well.
- Zookeeper deployment & setup.
- access to deploy custom JARs, etc.
- Supports Solr 4.8 and above (self serve version supports Solr 5.2.1 and
5.3.1)

2. SearchStax Pulse - Monitoring and Alerting for your Solr Clusters.
- System Level monitoring
- GC monitoring
- Search & Indexing monitoring
- Cache statistics
- Alerting on any of the above metrics at host and collection level.
- PagerDuty integration

3. SearchStax Analytics - User behavior Analytics that allows you to track
application level interactions and metrics to help you optimize your
search.
- Total searches,
- No result searches
- Click through rates
- conversion metrics for e-commerce scenarios
- query level details
- advanced version includes MRR reports, average click positions, etc.

Lastly, it provides 24x7x365 support and auto-scaling for customers that elect
for it.

Thanks,
Sameer.

On Tuesday, April 26, 2016, marotosg  wrote:

> Hi,
>
> I am evaluating the possibility of using Amazon CloudSearch to manage Solr
> instances. The reason is the price and the time needed to manage and deploy.
> I am not yet sure how flexible that service is in case you need to install a
> specific Solr version or plugin.
> Do you have any experience with it?
>
> Would you please share any thoughts?
>
> Thanks
> Sergio
>
>
>
>


-- 
*Sameer Maggon*
Measured Search
c: 310.344.7266
www.measuredsearch.com <http://measuredsearch.com>


Re: SolrCloud clarification/Question

2015-09-16 Thread Sameer Maggon
Absolutely. You can have a collection with a single shard and multiple
replicas for redundancy, and put a load balancer in front of it to remove the
dependency on a single node. One of the replicas will assume the role of
leader, and in case that leader goes down, another replica will be elected
leader and your application will be fine.

Thanks,

On Wed, Sep 16, 2015 at 9:44 AM, Ravi Solr  wrote:

> Hello,
>  We are trying to move away from Master-Slave configuration to a
> SolrCloud environment. I have a couple of questions. Currently in the
> Master-Slave setup we have 4 Machines 2 of which are indexers and 2 of them
> are query servers. The query servers are fronted via Load Balancer.
>
> There are 3 solr cores for 3 different/separate applications (mutually
> exclusive). Each core is a complete index of all docs (i.e. the data is not
> sharded).
>
>   We intend to keep it in a non-sharded mode even after the SolrCloud
> mode.The prime motivation to move to cloud is to effectively use all
> servers for indexing and querying (read fault tolerant/redundant).
>
> So, the real question is, can SolrCloud be used without shards ? i.e. a
> "collection" resides entirely on one machine rather than partitioning data
> onto different machines ?
>
> Thanks
>
> Ravi Kiran Bhaskar
>





Re: SolrCloud clarification/Question

2015-09-16 Thread Sameer Maggon
You'll have to say numShards=1 and replicationFactor=2.

http://[hostname]:8983/solr/admin/collections?action=CREATE&name=test&configName=test&numShards=1&replicationFactor=2

On Wed, Sep 16, 2015 at 11:23 AM, Ravi Solr  wrote:

> Thank you very much for responding Sameer so numShards=0 and
> replicationFactr=4 if I have 4 machines ??
>
> Thanks
>
> Ravi Kiran Bhaskar
>





Re: SolrCloud clarification/Question

2015-09-16 Thread Sameer Maggon
I just gave an example API call, but for your scenario, the
replicationFactor will be 4 (replicationFactor=4). In this way, all 4
machines will have the same copy of the data and you can put an LB in front
of those 4 machines.
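
As a rough sketch of that call for the 4-machine case, the Python below builds the Collections API CREATE URL with one shard and four replicas. The hostname, collection name and config name are placeholders. The sketch passes the config set as collection.configName, which is the name the Collections API reference documents for that parameter.

```python
from urllib.parse import urlencode


def create_collection_url(host, name, config_name, replication_factor):
    """Build a Collections API CREATE URL for an unsharded, replicated collection."""
    params = {
        "action": "CREATE",
        "name": name,
        "collection.configName": config_name,
        "numShards": 1,                           # no partitioning of the data
        "replicationFactor": replication_factor,  # total copies of the one shard
    }
    return "http://" + host + ":8983/solr/admin/collections?" + urlencode(params)


# Placeholder host and names; 4 copies means one full copy per machine.
print(create_collection_url("solr1.example.com", "test", "test", 4))
```

With numShards=1 and replicationFactor=4 on four nodes, each node ends up holding one complete copy, so any of them can answer a query routed by the load balancer.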

On Wed, Sep 16, 2015 at 12:00 PM, Ravi Solr  wrote:

> OK...I understood numShards=1, when you say replicationFactor=2 what does
> it mean ? I have 4 machines, then, only 3 copies of data (1 at leader and 2
> replicas) ?? so am i not under utilizing one machine ?
>
> I was more thinking in the lines of a Mesh connectivity format i.e.
> everybody has others copy so that I can put all 4 machines behind a Load
> Balancer...Is that a wrong way to look at it ?
>
> Thanks
>
> Ravi Kiran
>





Re: How to check Zookeeper ensemble status?

2015-09-18 Thread Sameer Maggon
Have you tried zkServer.sh status?

This will tell you whether zookeeper is running or not and whether it's
acting as a leader or follower.
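
To check all three nodes from one place, a minimal Python sketch is shown below. It relies on ZooKeeper's four-letter-word command "srvr" over the client port, whose output includes a "Mode:" line. The host names and port 2181 are assumptions, and on ZooKeeper 3.5 and later these commands have to be allowed through the 4lw.commands.whitelist setting.

```python
import socket


def four_letter_word(host, port, command, timeout=3.0):
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(srvr_output):
    """Extract the value of the 'Mode:' line (leader, follower, standalone)."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def ensemble_modes(hosts, port=2181):
    """Map each host to its reported mode, or to an 'unreachable' note."""
    modes = {}
    for host in hosts:
        try:
            modes[host] = parse_mode(four_letter_word(host, port, "srvr"))
        except OSError as exc:
            modes[host] = "unreachable (%s)" % exc
    return modes


# Example call with hypothetical host names:
# print(ensemble_modes(["zk1.example.com", "zk2.example.com", "zk3.example.com"]))
```

A healthy 3-node ensemble should report one leader and two followers; a node that is unreachable or reports no mode is the one to look at first.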

Sameer.

On Friday, September 18, 2015, Merlin Morgenstern <
merlin.morgenst...@gmail.com> wrote:

> I am running a 3 node zookeeper ensemble on 3 machines dedicated to
> SolrCloud 5.2.x
>
> Inside the Solr Admin-UI I can check "live nodes", but how can I check if
> all three zookeeper nodes are up?
>
> I am asking since node2 has 25% CPU usage by zookeeper while being idle
> and I wonder what the cause is. Maybe zookeeper cannot connect to the
> other nodes or whatever it is, which brought me to the question of how to
> check if all 3 nodes are operational.
>
> Thank you for any help on this!
>




Re: Using books.json in solr

2015-10-27 Thread Sameer Maggon
Hi Salonee, can you post the query and your schema file too?

Thanks,
-- 
*Sameer Maggon*
www.measuredsearch.com <http://measuredsearch.com/>
Solr Cloud Hosting | Managed Services | Solr Consulting


On Tue, Oct 27, 2015 at 10:44 AM, Salonee Rege  wrote:

> Hello,
>   We are trying to query the books.json that we have posted to Solr. But
> when we try to specifically query it on genre it does not return a complete
> JSON with valid key-value pairs. Kindly help.
>
> *Salonee Rege*
> USC Viterbi School of Engineering
> University of Southern California
> Master of Computer Science - Student
> Computer Science - B.E
> salon...@usc.edu  *||* *619-709-6756 <619-709-6756>*
>
>
>


Re: Using books.json in solr

2015-10-27 Thread Sameer Maggon
Hi Salonee,

I believe you missed adding the query screenshot?

Sameer.

On Tue, Oct 27, 2015 at 10:57 AM, Salonee Rege  wrote:

> Please find attached the following books.json which is in the example-docs
> file for your reference. And a screenshot of querying it on the field
> fantasy for genre key.
> Thanks for the help.
>
>
> *Salonee Rege*
> USC Viterbi School of Engineering
> University of Southern California
> Master of Computer Science - Student
> Computer Science - B.E
> salon...@usc.edu  *||* *619-709-6756 <619-709-6756>*
>
>
>
> On Tue, Oct 27, 2015 at 10:47 AM, Rallavagu  wrote:
>
>> Could you please share your query? You could use "wt=json" query
>> parameter to receive JSON formatted results if that is what you are looking
>> for.
>>
>> On 10/27/15 10:44 AM, Salonee Rege wrote:
>>
>>> Hello,
>>>We are trying to query the books.json that we have posted to Solr.
>>> But when we try to specifically query it on genre it does not return a
>>> complete JSON with valid key-value pairs. Kindly help.
>>>
>>> /Salonee Rege/
>>> USC Viterbi School of Engineering
>>> University of Southern California
>>> Master of Computer Science - Student
>>> Computer Science - B.E
>>> salon...@usc.edu <mailto:salon...@usc.edu> _||_ _619-709-6756_
>>>
>>
>




Re: Solr Suggester with Geo?

2015-11-09 Thread Sameer Maggon
Have you looked at the spatial extensions for Solr? If you are indexing
lat/lon along with your documents, you can compute the distance from the
origin and use that distance as one of the boost factors to affect the score.
Typically, use cases around that combine the geo score with other factors,
as a pure sort by geo score might not give you relevant results.

e.g. when searching for "sushi restaurants" near Santa Monica, CA, you
might not want "thai restaurants" just because they are closest to you.
(Local Search use case)

https://cwiki.apache.org/confluence/display/solr/Spatial+Search
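
A minimal Python sketch of such a request is below. It combines the text match with a distance-based multiplicative boost through the edismax "boost" parameter and the geodist() and recip() functions. The field name "location", the coordinates and the recip constants are illustrative assumptions, not values from this thread.

```python
from urllib.parse import urlencode


def local_search_params(text, lat, lon, sfield="location"):
    """Build edismax params where closer documents get a larger score multiplier."""
    return {
        "defType": "edismax",
        "q": text,
        "sfield": sfield,               # hypothetical lat/lon field in the schema
        "pt": "%s,%s" % (lat, lon),     # the searcher's position
        # recip(x,m,a,b) = a / (m*x + b): large when the distance x is small.
        "boost": "recip(geodist(),2,200,20)",
        "fl": "*,score",
    }


params = local_search_params("sushi restaurants", 34.0195, -118.4912)
print(urlencode(params))
```

Because the boost multiplies the text score instead of replacing it, a distant sushi place can still outrank a nearby thai place; tuning the recip constants changes how quickly distance takes over.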

Thanks,
-- 
*Sameer Maggon*
www.measuredsearch.com <http://measuredsearch.com/>
Fully Managed Solr-as-a-Service | Solr Consulting | Solr Support



On Mon, Nov 9, 2015 at 11:18 AM, William Bell  wrote:

> http://lucidworks.com/blog/solr-suggester/
>
>
> Wondering if anyone has used these new techniques with a boost on
> geodist() inverted? So the rows that get returned that are closest
> need to come back first.
>
>
> We are still using Edge Grams since we have not figured out how to
> boost the results on geo spatial.
>
>
> Anyone have thoughts?
>
>
>
>
> --
> Bill Bell
> billnb...@gmail.com
> cell 720-256-8076
>


Re: Solr Suggester with Geo?

2015-11-09 Thread Sameer Maggon
Looking through the code and some example suggesters, it seems that,
theoretically, one could write a GeoSuggester and provide it as the Lookup
implementation (lookupImpl) that factors in the geo score, or extend
the SolrSuggester to support spatial extensions in the same spirit as
"Filters" are supported today.

Sameer.

On Mon, Nov 9, 2015 at 11:47 AM, William Bell  wrote:

> Yeah we have that working today. But the issue is we want to use
> http://lucidworks.com/blog/solr-suggester/
>
> And you cannot do a boost with that right?
>
>
>


Re: Generating Index offline and loading into solrcloud

2015-11-19 Thread Sameer Maggon
If you are trying to create a large index and want speedups there, you
could use the MapReduceIndexerTool -
https://github.com/cloudera/search/tree/cdh5-1.0.0_5.2.1/search-mr. At a
high level, it takes your files (CSV, JSON, etc.) as input and can create
either a single or a sharded index that you can then copy to your Solr
servers. I've used this to create indexes that include hundreds of millions
of documents in a fairly decent amount of time.

Thanks,
-- 
*Sameer Maggon*
Measured Search
www.measuredsearch.com <http://measuredsearch.com/>

On Thu, Nov 19, 2015 at 11:17 AM, KNitin  wrote:

> Hi,
>
>  I was wondering if there are existing tools that will generate solr index
> offline (in solrcloud mode)  that can be later on loaded into solrcloud,
> before I decide to implement my own. I found some tools that do only solr
> based index loading (non-zk mode). Is there one with zk mode enabled?
>
>
> Thanks in advance!
> Nitin
>


Re: Fully automated replica creation in AWS

2015-12-09 Thread Sameer Maggon
Erick,

Typically, when creating a collection, a replicationFactor is specified.
Thus, the metadata about the collection does have information about what
the "desired" replicationFactor is for the collection. If that's the case,
when a Solr node joins the cluster, a proactive add-replica
operation could be initiated if Solr detects that the current
replica count is below the desired replicationFactor, pulling the
collection data from the leader.

Isn't that what the "autoAddReplicas" attribute does for HDFS? Can this be
done for non-shared filesystems? As a side note, we do this for our
customers, as it's baked into our cloud provisioning software, but it
would be nice if Solr supported that OOTB. Are there any underlying flaws in
doing that?
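
The idea above can be sketched in Python as follows. This is not how our provisioning software or Solr itself does it; it only shows the comparison of active replicas against the desired replicationFactor and the resulting Collections API ADDREPLICA calls. The shape of the state dictionary is modeled on what CLUSTERSTATUS returns and is an assumption, as are the node names.

```python
from urllib.parse import urlencode


def under_replicated_shards(collection_state):
    """Return {shard: missing_count} for shards below the desired replicationFactor."""
    desired = int(collection_state["replicationFactor"])
    missing = {}
    for shard, info in collection_state["shards"].items():
        active = sum(1 for r in info["replicas"].values() if r["state"] == "active")
        if active < desired:
            missing[shard] = desired - active
    return missing


def add_replica_urls(base_url, collection, collection_state, new_node):
    """One ADDREPLICA URL per under-replicated shard, targeting the new node."""
    urls = []
    for shard in sorted(under_replicated_shards(collection_state)):
        params = {
            "action": "ADDREPLICA",
            "collection": collection,
            "shard": shard,
            "node": new_node,
        }
        urls.append(base_url + "/admin/collections?" + urlencode(params))
    return urls


# Hypothetical state: one shard, two replicas desired, one of them down.
state = {
    "replicationFactor": "2",
    "shards": {
        "shard1": {"replicas": {
            "core_node1": {"state": "active", "node_name": "10.0.0.5:8983_solr"},
            "core_node2": {"state": "down", "node_name": "10.0.0.6:8983_solr"},
        }},
    },
}
print(add_replica_urls("http://10.0.0.5:8983/solr", "test", state, "10.0.0.7:8983_solr"))
```

A real implementation would also need to decide when a down replica counts as gone for good, which is exactly the policy question raised in the quoted reply below.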

Thanks,
-- 

*Sameer Maggon*
www.measuredsearch.com
|
Deploy, Scale & Manage Solr in the cloud of your choice.


On Wed, Dec 9, 2015 at 11:19 AM, Erick Erickson 
wrote:

> Not that I know of. The two systems are somewhat disconnected.
> AWS doesn't know that Solr lives on those nodes, it's just spinning
> one up, right? Albeit with Solr running.
>
> There's nothing in Solr that auto-detects the  existence of a new
> Solr node and automagically assigns collections and/or replicas.
>
> How would either system intuit that this new node is replacing
> something else and "do the right thing"?
>
> I'll tell you how, by interrogating Zookeeper and seeing that for some
> specific collection, shardX had fewer replicas than other shards and
> issuing the Collections API ADDREPLICA command.
>
> But now there are _three_ systems that need to be coordinated and
> doing the right thing in your situation would be the wrong thing in
> another. The last thing many sys ops want is having replicas started
> without their knowledge.
>
> And on top of that, I have doubts about the model. Having AWS
> elastically spin up a new replica is a heavyweight operation from
> Solr's perspective. I mean this potentially copies a many G set of
> index files from one place to another which could take a long time,
> is that really what's desired here?
>
> I have seen some folks spin up/down Solr instances based on a
> schedule if they know roughly when the peak load will be, but again
> there's nothing built in to handle this.
>
> Best,
> Erick
>
> On Wed, Dec 9, 2015 at 10:15 AM, Ugo Matrangolo
>  wrote:
> > Hi,
> >
> > I was trying to set up a SolrCloud cluster in AWS backed by an ASG (auto
> > scaling group) serving a replicated collection. I have just come across a
> > case where one of the Solr nodes became unresponsive, with AWS killing it
> > and spinning up a new one.
> >
> > Unfortunately, this new Solr node did not join as a replica of the
> existing
> > collection requiring human intervention to configure it as a new replica.
> >
> > I was wondering if there is around something that will make this process
> > fully automated by detecting that a new node just joined the cluster and
> > instructing it (e.g. via Collections API) to join as a replica of a given
> > collection.
> >
> > Best
> > Ugo
>


Re: [ANN] Relevant Search by Manning out! (Thanks Solr community!)

2016-06-23 Thread Sameer Maggon
Congrats Doug & John, will order a copy!

Thanks,

On Tuesday, June 21, 2016, Doug Turnbull <
dturnb...@opensourceconnections.com> wrote:

> Not much more to add than my post here! This book is targeted towards
> Lucene-based search (Elasticsearch and Solr) relevance.
>
> Announcement with discount code:
> http://opensourceconnections.com/blog/2016/06/21/relevant-search-published/
>
> Related hacker news thread:
> https://news.ycombinator.com/item?id=11946636
>
> Thanks to everyone in the Solr community that was helpful to my efforts.
> Specifically Trey Grainger, Eric Pugh (for keeping me employed), Charlie
> Hull and the Flax team, Alex Rafalovitch, Timothy Potter, Yonik Seeley,
> Grant Ingersoll (for basically teaching me Solr back in the day), Drew
> Farris (for encouraging my early blogging), everyone at OSC, and many
> others I'm probably forgetting!
>
> Best
> -Doug
>


-- 
*Sameer Maggon*
www.measuredsearch.com
1.844.9.SEARCH
Measured Search is the only *Fully Managed Solr as a Service* multi-cloud
capable offering.
Plus utilize our *On Demand Expertise* to build your applications faster
and with more confidence.


Re: Solr and Drupal

2016-08-09 Thread Sameer Maggon
Hi John,

As John B. mentioned, you can utilize the plugin here -
https://www.drupal.org/project/apachesolr.
If you are looking to not have to worry about hosting, deployment, scaling
and management, you can take a look at SearchStax by Measured Search to get
a Solr deployment up and running in a couple of minutes and not have to get
into installing Solr and going through a learning curve around setup and
scale.


Thanks,
Sameer.


On Tue, Aug 9, 2016 at 12:11 PM, Rose, John B  wrote:

> We are looking at Solr for a Drupal web site. We have never installed Solr.
>
>
> From my readings it is not clear exactly what we need to implement a
> search in Drupal with Solr. Some sites have implied Lucene and/or Tomcat
> are needed.
>
>
> Can someone point me to the site that explains minimally what is needed to
> implement Solr within Drupal?
>
>
> Thanks for your time
>





Re: Monitoring Apache Solr

2016-09-11 Thread Sameer Maggon
Hardika,

You can sign up at www.measuredsearch.com and take a look at SearchStax
Pulse, which provides detailed monitoring for Solr deployments, both
single-node and cloud setups.

Feel free to reach out to me if you have any questions around it.

Thanks,
Sameer.

On Tuesday, August 30, 2016, Hardika Catur S
 wrote:

> Hi,
>
> I try to monitor apache solr, because solr often over heap and status
> collection solr be "down". How to monitor apache solr ??
> is there any tools for monitoring solr or how ??
>
> Please help me to find a solution.
>
> Thanks,
> Hardika CS.
>




Re: SolrMeter is dead?

2014-05-15 Thread Sameer Maggon
Have you looked at JMeter? http://jmeter.apache.org/

Thanks,
Sameer.
--
http://measuredsearch.com


On Wed, May 7, 2014 at 7:51 AM, Al Krinker  wrote:

> I am trying to test performance of my cluster (solr 4.8).
>
> SolrMeter looked promising... small and standalone. Plus, open source so
> that I could make tweaks if needed.
>
> However, I see that the last update date was in Oct 2012. Is it dead? Any
> better non commercial and preferably open sourced projects out there?
>
> Thanks,
> Al
>


Re: writing logs of a speicific solr posting to a file

2014-06-09 Thread Sameer Maggon
Check out the patch on the issue below. We hit the same issue and posted a
patch. None of the committers have picked it up yet, but it would be good to
get some feedback on it and get it into the next dot release. If it works
for you, please vote it up.

https://issues.apache.org/jira/browse/SOLR-5940

Thanks,
-- 
*Sameer Maggon*
Founder | Measured Search
http://measuredsearch.com



On Mon, Jun 9, 2014 at 3:48 AM, pshahukhal  wrote:

> Hi
>I am using SimplePostTool to post the XML files to Solr like:
>
> java  -Durl=http://localhost:8080/solr/collection1/update -jar
> /var/lib/tomcat6/solr/collection1/dump/xmlinput/post.jar
> /var/lib/tomcat6/solr/collection1/dump/xmlinput/solr.xml
>
>When there are certain errors ,the response from above command just
> shows
> the 404 error or 500 server error but doesnt provide the complete log
> details like in
>   http://localhost:8080/solr/#/~logging  or in catalina.out
>I want to catch the exact log details that are thrown in  the logs when
> the above command is executed and write to a file .I am wondering if there
> are additional params that need to be passed in the command line or I have
> to work in the configurations .
>
>
>
>
>


Re: running Post jar from different server

2014-06-19 Thread Sameer Maggon
Ravi,

post.jar is a standalone utility that does not have to be on the same
server. If you can share the command you are executing, there might be some
pointers in there.

Thanks,
-- 
*Sameer Maggon*
http://measuredsearch.com


On Thu, Jun 19, 2014 at 8:54 PM, EXTERNAL Taminidi Ravi (ETI,
Automotive-Service-Solutions)  wrote:

> Hi,  I have situation where my SQL Job initiate a console application ,
> where I am calling the post.jar to upload data to SOLR. Both SQL DB and
> SOLR are 2 different servers.
>
> I am calling post.jar from my SQLDB where the path is mapped to a network
> drive. I am getting an error file not found.
>
> Is the above scenario is possible, if anyone has some experience on this
> can you share or any direction will be really appreciated.
>
> Thanks
>
> Ravi
>


Re: POST Vs GET

2014-06-25 Thread Sameer Maggon
Ravi,

The POST should work. Here's an example that works within tomcat.

curl -X POST --data "q=*:*&rows=1"
http://localhost:8080/solr/collection1/select
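
If it helps, the same request can be sketched in Python. This is only an
illustration under assumptions: the helper name build_select_post is made
up, and the URL is the one from the curl example above. It builds a
form-encoded POST without sending it, so you can inspect what would go over
the wire.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_select_post(base_url, params):
    """Build (but do not send) a form-encoded POST to a Solr select handler.

    build_select_post is a hypothetical helper name, not part of Solr.
    """
    body = urlencode(params).encode("utf-8")
    return Request(
        base_url,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"
        },
        method="POST",
    )


req = build_select_post(
    "http://localhost:8080/solr/collection1/select",
    {"q": "*:*", "rows": 1},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing req to urllib.request.urlopen would actually send it. Because the
query travels in the request body, the usual URL length limits on GET do
not apply.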

Sameer.


On Mon, Jun 23, 2014 at 10:37 AM, EXTERNAL Taminidi Ravi (ETI,
Automotive-Service-Solutions)  wrote:

> Hi, I am executing a solr query runs 10 to 12 lines with all the boosting
> and condition. I change the Http Contentype to POST from GET as post
> doesn't have any restriction for size. But I am getting an error. I am
> using Tomcat 7, Is there any place we need to specify in Tomcat to accept
> POST..
>
> FYI, From my Jetty solr version everthing works good.
>
> Thanks
>
> Ravi
>



-- 
*Sameer Maggon*
http://measuredsearch.com


Re: New to Solr - Need advice on clustering

2013-11-25 Thread Sameer Maggon
Anders,

Take a look at Solr Replication. Essentially, you'll treat one as a master
& one as a slave. Both master & slave can be used to serve traffic. If one
of them goes down, the other can be used as a master for the interim.

http://wiki.apache.org/solr/SolrReplication
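
As a rough illustration of driving this from a script: the replication
handler accepts HTTP commands such as details, indexversion and fetchindex.
The sketch below only builds the URLs. The host names and the core name are
made up, and replication_url is a hypothetical helper, so adjust them to
your setup.

```python
from urllib.parse import urlencode


def replication_url(core_url, command):
    """Return the replication handler URL for one command on one core.

    replication_url is a hypothetical helper name, not part of Solr.
    """
    query = urlencode({"command": command})
    return f"{core_url.rstrip('/')}/replication?{query}"


# Hypothetical hosts: solr1 acts as master, solr2 as slave.
master = "http://solr1:8983/solr/collection1"
slave = "http://solr2:8983/solr/collection1"

print(replication_url(master, "indexversion"))  # what the master has
print(replication_url(slave, "details"))        # slave replication status
print(replication_url(slave, "fetchindex"))     # ask the slave to pull now
```

Normally the slave polls on its own schedule, so a forced fetchindex is
only needed when you want the copy refreshed right after an update.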

Sameer.
--
http://measuredsearch.com


On Mon, Nov 25, 2013 at 9:50 PM, Anders Kåre Olsen  wrote:

>
> Hi Gora
>
> Thank you for your reply.
>
> We are planning on having a loadbalancer in front of our frontend servers.
>
> If I have two distinct solr indexes, how will I keep them synchronized? I
> expect that one of the frontend servers will have the task of updating the
> product repository on the e-commerce site. This server will then update the
> local solr index after product update has finished.
>
> Is there an easy  way that I can keep the two indexes synchronized without
> solrcloud?
>
> Regards
> Anders
>
> -Oprindelig meddelelse- From: Gora Mohanty
> Sent: Tuesday, November 26, 2013 2:37 AM
> To: solr-user@lucene.apache.org
> Subject: Re: New to Solr - Need advice on clustering
>
>
> On 26 November 2013 01:44, Anders Kåre Olsen  wrote:
>
>> Hi Solr-users
>>
>> I’m trying to setup Solr for search and indexing on the project I’m
>> working on.
>>
>> My project is a e-commerce B2B solution. We are planning on setting up 2
>> frontend servers for the website, and I was planning on installing Solr on
>> these servers. We are using Windows Server 2012 for the frontend servers.
>>
>> We are not expecting a huge load on the servers, so we expect these 2
>> servers to be adequate to handle both the website and search index.
>>
>> I have been looking at SolrCloud and ZooKeeper. Howver I have read that
>> you need at least 3 ZooKeepers in an ensamble, and I only have 2 servers.
>>
>> I need to handle the situation where one of the servers crashes, so I
>> need both servers to have a Solr index.
>>
> [...]
>
> If you do not want to get into SolrCloud, a simpler
> solution might be a HTTP load balancer in front of
> the two Solr instances. Hardware load balancers are
> better, but more expensive. A software load balancer
> like haproxy should meet your needs.
>
> Regards,
> Gora
>


Re: Off-line search on mobile devices

2013-12-16 Thread Sameer Maggon
1. Which platform are you looking at? Android, iOS, other?

If you are on Android, you can directly use lucene to build an embedded
solution for search. Depending upon your need, that can offer a small
enough footprint. We've done some work around embedding lucene for a
specific application on Android, happy to brainstorm offline.

Thanks,
Sameer.
--
http://measuredsearch.com



On Mon, Dec 16, 2013 at 3:07 PM, Arcadius Ahouansou wrote:

> Hello.
>
> We are planning to offer search as an embedded functionality into
> mobile/low-power devices.
>
> The main requirement are:
>
> - ability to index and search documents available on the mobile device,
> - no need of internet access,
> - lightweight, low footprint and fast
>
> We are looking into various options.
>
> As I understand it, Solr would be way too heavy for mobile devices.
>
> Has anyone used Lucene/Solr for off-line search on mobile devices?
>
> Are there better alternatives for off-line full-text search?
>
> Many thanks.
>
> Arcadius.
>



-- 
Sameer Maggon
Founder, Measured Search
m: 310.344.7266
tw: @measuredsearch
w: http://www.measuredsearch.com


Re: Off-line search on mobile devices

2013-12-17 Thread Sameer Maggon
Might want to look into http://clucene.sourceforge.net/ &
http://lucene.apache.org/pylucene/jcc/install.html. Unfortunately, I don't
have direct experience with either. -S.


On Tue, Dec 17, 2013 at 8:46 AM, Arcadius Ahouansou wrote:

> Hi Sameer.
> It's a generic Linux device, not iOS/Android.
>
> Thanks.
>
> Arcadius.
>
>
>
> On 16 December 2013 23:11, Sameer Maggon 
> wrote:
>
> > 1. Which platform are you looking at? Android, iOS, other?
> >
> > If you are on Android, you can directly use lucene to build an embedded
> > solution for search. Depending upon your need, that can offer a small
> > enough footprint. We've done some work around embedding lucene for a
> > specific application on Android, happy to brainstorm offline.
> >
> > Thanks,
> > Sameer.
> > --
> > http://measuredsearch.com
> >
> >
> >
> > On Mon, Dec 16, 2013 at 3:07 PM, Arcadius Ahouansou <
> arcad...@menelic.com
> > >wrote:
> >
> > > Hello.
> > >
> > > We are planning to offer search as an embedded functionality into
> > > mobile/low-power devices.
> > >
> > > The main requirement are:
> > >
> > > - ability to index and search documents available on the mobile device,
> > > - no need of internet access,
> > > - lightweight, low footprint and fast
> > >
> > > We are looking into various options.
> > >
> > > As I understand it, Solr would be way too heavy for mobile devices.
> > >
> > > Has anyone used Lucene/Solr for off-line search on mobile devices?
> > >
> > > Are there better alternatives for off-line full-text search?
> > >
> > > Many thanks.
> > >
> > > Arcadius.
> > >
> >
> >
> >
> > --
> > Sameer Maggon
> > Founder, Measured Search
> > m: 310.344.7266
> > tw: @measuredsearch
> > w: http://www.measuredsearch.com
> >
>



-- 
Sameer Maggon
Founder, Measured Search
m: 310.344.7266
tw: @measuredsearch
w: http://www.measuredsearch.com


Re: Limit amount of search result

2014-02-12 Thread Sameer Maggon
Chun,

Have you looked at the Grouping / Field Collapsing feature in Solr?

https://wiki.apache.org/solr/FieldCollapsing

If shop is one of your fields, you can use field collapsing on that field
with a maximum of 'n' documents returned per field value (or group).
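
A minimal sketch of the request parameters, assuming the field is really
called "shop" and you want at most 5 documents per shop. The parameter
names group, group.field and group.limit come from the FieldCollapsing
page. The helper name grouped_query is made up.

```python
from urllib.parse import urlencode


def grouped_query(query, group_field, per_group, rows=10):
    """Build the query string for a grouped (field-collapsed) search.

    grouped_query is a hypothetical helper name, not part of Solr.
    """
    params = {
        "q": query,
        "group": "true",           # turn on result grouping
        "group.field": group_field,  # one group per distinct field value
        "group.limit": per_group,    # max documents returned per group
        "rows": rows,                # with grouping on, this counts groups
    }
    return urlencode(params)


print(grouped_query("white shirt", "shop", 5))
```

Append the resulting string to your select URL. Note that the response
format changes when grouping is on, so the client code that reads the
results needs to handle the grouped structure.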

Sameer.
--
www.measuredsearch.com
tw: measuredsearch

On Wednesday, February 12, 2014, rachun  wrote:

> Dear all gurus,
>
>
> I would like to limit amount of search result, let's say I have many shop
> which is selling shirt. So when I search "white shirt" I want to give a
> maximum number per shop (ex. 5).
>
> The result should be like this...
> -> Shop A
> -> Shop A
> -> Shop B
> -> Shop B
> -> Shop B
> -> Shop B
> -> Shop B
> -> Shop C
> -> Shop C
> -> Shop C
>
> any suggestion would be very appreciate.
>
> Thank you very much,
> Chun.
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Limit-amount-of-search-result-tp4117062.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


-- 
Sameer Maggon
Founder, Measured Search
m: 310.344.7266
tw: @measuredsearch
w: http://www.measuredsearch.com


Re: Limit amount of search result

2014-02-18 Thread Sameer Maggon
You are welcome!

On Mon, Feb 17, 2014 at 11:07 PM, rachun  wrote:

> hi Samee,
>
> Thank you very much for your suggestion.
> Now I got it worked now;)
>
> Chun.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Limit-amount-of-search-result-tp4117062p4117952.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Need to write a start.jar file

2008-11-04 Thread Muhammed Sameer
Salaam,

I read somewhere that it is better to write a new start.jar file than use the 
one that is provided within the example directory, can someone please guide me 
to some documentation that can help me achieve this and write out my own 
start.jar file.

Regards,
Muhammed Sameer


  


Re: Need to write a start.jar file

2008-11-05 Thread Muhammed Sameer
Salaam,

Thanks for the response, I'll only change this if I need any customization done

Regards,
Muhammed Sameer
--- On Wed, 11/5/08, Erik Hatcher <[EMAIL PROTECTED]> wrote:

> From: Erik Hatcher <[EMAIL PROTECTED]>
> Subject: Re: Need to write a start.jar file
> To: solr-user@lucene.apache.org
> Date: Wednesday, November 5, 2008, 5:27 AM
> I've never heard of this need to provide a customized
> start.jar.  Could you send us a pointer to where you read
> that if you still have that available?
> 
> But, no, there is no need to provide a different start.jar.
>  However, Jetty is really just one example of how you deploy
> Solr - any modern servlet container should be fine.  I'd
> just stick with Jetty and the built-in start.jar unless you
> have a compelling reason to switch.
> 
>   Erik
> 
> 
> On Nov 4, 2008, at 11:16 PM, Muhammed Sameer wrote:
> 
> > Salaam,
> > 
> > I read somewhere that it is better to write a new
> start.jar file than use the one that is provided within the
> example directory, can someone please guide me to some
> documentation that can help me achieve this and write out my
> own start.jar file.
> > 
> > Regards,
> > Muhammed Sameer
> > 
> > 
> >


  


Redirecting output of post.jar and start.jar

2008-11-05 Thread Muhammed Sameer
Salaam,

When I run post.jar or start.jar, it prints a lot of information to the
screen. I even tried redirecting the output, but that does not seem to
help. I have configured a cron job to run post.jar every 2 minutes to keep
the index updated, and each time it runs it prints a lot of output to the
console.

Q1) What can I do so that start.jar and post.jar do not send output to
stdout?

Q2) Is running post.jar every 2 minutes a correct way of keeping the
indexes updated, or is there a more sane way?

Regards,
Muhammed Sameer


  


Re: Joining Solr Indexes

2009-01-28 Thread Sameer Maggon
IndexMergeTool - http://wiki.apache.org/solr/MergingSolrIndexes

Sameer.
-- 
http://www.productification.com

On Wed, Jan 28, 2009 at 7:30 AM, Jae Joo  wrote:

> Hi,
>
> Is there any way to join multiple indexes in Solr?
>
> Thanks,
>
> Jae
>


Multiple Masters - Solr Replication (1.4)

2009-03-10 Thread Sameer Maggon
I have been playing around with replication in Solr 1.4 and I must say that
it's a big "ease of use" improvement over scripts. Though, I have a few
questions about it.

*1. Is there a way to specify multiple master URLs in the slaves?*
I want to make sure I have redundancy, and if one master goes down the
slaves automatically start taking data from the other master. If not, has
anyone tried a load balancer approach where you put the multiple masters
behind the LB and have slaves talk to the LB?

*2. Is there a plan to add multicast support to Solr Replication?*
If I have ~100 slaves talking to the master over rsync, I see two problems:

   a. Network
   b. Master being choked as it's getting requests from 100 machines

Thoughts?

Thanks,
Sameer.
-- 
http://www.productification.com


NullPointerException while performing Merge

2009-03-29 Thread Sameer Maggon
In our application, we are getting NullPointerExceptions very frequently. It
seems like it's happening during the merge operation (commit). There are no
exceptions while adding documents to Solr. We are using Solr 1.3.0. I looked
around the mailing list and found that there is a JIRA issue open for a
similar bug (LUCENE-1374), but it's not exactly the same. Also, my fields
are not compressed.

Has anyone seen this before?

Below is the stacktrace.

Exception in thread "Lucene Merge Thread #142"
org.apache.lucene.index.MergePolicy$MergeException:
java.lang.NullPointerException
at
org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:325)
at
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:302)
Caused by: java.lang.NullPointerException
at
org.apache.lucene.index.FieldsWriter.writeField(FieldsWriter.java:179)
at
org.apache.lucene.index.FieldsWriter.addDocument(FieldsWriter.java:268)
at
org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:361)
at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:140)
at
org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4485)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4143)
at
org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:218)
at
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:274)

Thanks,
Sameer.


Talking to solr

2009-04-08 Thread Muhammed Sameer

Salaam,

I am running solr on port 8080 of my box. I need to write a check that
actually telnets to the solr box and exchanges some commands, so I can be
sure that solr is up and running.

Is there something that I can do to achieve this? For example, when we
telnet to a mail server we can exchange the helo commands; is there
something similar with solr?

Regards,
Muhammed Sameer


  


UTF8 compatibility

2009-04-29 Thread Muhammed Sameer

Salaam,

I have a question in two related parts.

We run post.jar periodically, i.e. every 15 minutes, to commit the changes.
Is this approach correct?

When I run this I get the following message
{code}
SimplePostTool: version 1.2
SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8, 
other encodings are not currently supported
SimplePostTool: COMMITting Solr index changes..
{code}

So I tried to run the test_utf8.sh script and got the following output
{code}
Solr server is up.
HTTP GET is accepting UTF-8
HTTP POST is accepting UTF-8
HTTP POST defaults to UTF-8
ERROR: HTTP GET is not accepting UTF-8 beyond the basic multilingual plane
ERROR: HTTP POST is not accepting UTF-8 beyond the basic multilingual plane
ERROR: HTTP POST + URL params is not accepting UTF-8 beyond the basic 
multilingual plane
{code}

Are these errors normal or do I need to change something ?

Thanks for your time.

Regards,
Muhammed Sameer


  


Index size concerns

2009-05-25 Thread Muhammed Sameer

Salaam,

We are using apache-solr to index our files for faster searches, all things 
happen without a problem, my only concern is the size of the cache.

It seems that the trend is that if I cache 1 GB of files the index goes to
800MB, i.e. we are seeing an 80% cache size.

Is this normal or am I missing something in the configuration of solr

Thanks and regards,
Muhammed Sameer


  


Re: Index size concerns

2009-05-25 Thread Muhammed Sameer

Salaam,

Sorry about that, here is the big picture.

We use solr to index all the mails that come to us so that we can allow
for faster lookups.

We have seen that after our mail server accepts, say, a GB of mails, the
index size goes up to 800MB.

I hope that this time I am clear in conveying the problem.

What I wanted to know is whether this index size is normal.

Regards,
Muhammed Sameer

--- On Mon, 5/25/09, Shalin Shekhar Mangar  wrote:

> From: Shalin Shekhar Mangar 
> Subject: Re: Index size concerns
> To: solr-user@lucene.apache.org
> Date: Monday, May 25, 2009, 11:19 AM
> On Mon, May 25, 2009 at 3:53 PM,
> Muhammed Sameer wrote:
> 
> >
> > We are using apache-solr to index our files for faster
> searches, all things
> > happen without a problem, my only concern is the size
> of the cache.
> >
> > It seems that the trend is that the if I cache 1 GB of
> files the index goes
> > to 800MB ie we are seeing a 80% cache size.
> >
> > Is this normal or am I missing something in the
> configuration of solr
> >
> 
> I'm sorry I do not understand your question. Which files
> are you talking
> about? The Solr cache has got nothing to do with files. It
> caches the
> query/filter results and solr documents.
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 


  


Re: Index size concerns

2009-05-26 Thread Muhammed Sameer

Thank you Otis, I will for sure check on this

wa salaam,
Muhammed Sameer

--- On Tue, 5/26/09, Otis Gospodnetic  wrote:

> From: Otis Gospodnetic 
> Subject: Re: Index size concerns
> To: solr-user@lucene.apache.org
> Date: Tuesday, May 26, 2009, 1:01 PM
> 
> Muhammed,
> 
> It sounds like you are talking about the ratio of original
> data size vs. index size.  The exact ratio depends on
> things such as:
> - whether you store fields or just index them
> - whether you compress fields if you store them
> - whether you have term vectors enabled or not
> - analyzers and what they do - they could stem tokens,
> remove them, etc., but they could also insert synonyms, and
> so on
> - nature of the input text - term distribution/variance
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> - Original Message 
> > From: Muhammed Sameer 
> > To: solr-user@lucene.apache.org
> > Sent: Monday, May 25, 2009 1:22:15 PM
> > Subject: Re: Index size concerns
> > 
> > 
> > Salaam,
> > 
> > Sorry for this here is the big picture
> > 
> > Actually we use solr to index all the mails that come
> to us so that we can allow 
> > for faster look ups.
> > 
> > We have seen that after our mail server accepts say a
> GB of mails the index size 
> > goes upto 800MB 
> > 
> > I hope that this time I am clear in conveying the
> problem
> > 
> > What I wanted to know is that is this index size
> normal ?
> > 
> > Regards,
> > Muhammed Sameer
> > 
> > --- On Mon, 5/25/09, Shalin Shekhar Mangar wrote:
> > 
> > > From: Shalin Shekhar Mangar 
> > > Subject: Re: Index size concerns
> > > To: solr-user@lucene.apache.org
> > > Date: Monday, May 25, 2009, 11:19 AM
> > > On Mon, May 25, 2009 at 3:53 PM,
> > > Muhammed Sameer wrote:
> > > 
> > > >
> > > > We are using apache-solr to index our files
> for faster
> > > searches, all things
> > > > happen without a problem, my only concern is
> the size
> > > of the cache.
> > > >
> > > > It seems that the trend is that the if I
> cache 1 GB of
> > > files the index goes
> > > > to 800MB ie we are seeing a 80% cache size.
> > > >
> > > > Is this normal or am I missing something in
> the
> > > configuration of solr
> > > >
> > > 
> > > I'm sorry I do not understand your question.
> Which files
> > > are you talking
> > > about? The Solr cache has got nothing to do with
> files. It
> > > caches the
> > > query/filter results and solr documents.
> > > 
> > > -- 
> > > Regards,
> > > Shalin Shekhar Mangar.
> > > 
> 
> 





Re: Local development and SolrCloud

2018-08-22 Thread Sameer Maggon
Why not just go with SolrCloud everywhere? The advantage is that you and
your team members use the same APIs, parameters, experience, etc. when
going from one environment to another. It also avoids the confusion of
explaining to someone why you are doing one thing in one environment and
something else in another. It does not seem like there is much overhead in
using SolrCloud in your lower environments or locally too.

You don't have to run a cluster within your local environment, you can
still have a single node "acting" as SolrCloud.

Maybe I am missing something, but what advantage or benefit do you get
from *not* using SolrCloud locally?

-- 
Sameer Maggon
https://www.searchstax.com


On Wed, Aug 22, 2018 at 5:23 PM, John Blythe  wrote:

> For those of you who are developing applications with solr and are using
> solrcloud in production: what are you doing locally? Cloud seems
> unnecessary locally besides testing strictly for cloud specific use cases
> or configurations. Am I totally off basis there? We are considering keeping
> a “standard” (read: non-cloud) local solr environment locally for our
> development workflow and using cloud only for our remote environments.
> Curious to know how wise or stupid that play would be.
>
> Thanks for any info!
> --
> John Blythe
>


Re: Frequently Used Search Terms.

2018-01-18 Thread Sameer Maggon
I don't think you can get this information from Solr, as it does not store
it. The stats component provides statistics, but they are mostly numeric in
nature. You could parse the server logs to build a list of Frequently
Searched Terms (e.g. pump those logs into SiLK or Kibana for
visualization).
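
As a rough sketch of the log-parsing idea, assuming your request log lines
contain a "params={...}" section with the q parameter inside (check your
own logs, the format below is an assumption and varies by version and
logging config). The function name top_queries and the sample lines are
made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import parse_qs

# Assumed log shape: "... path=/select params={q=...&rows=10} hits=..."
PARAMS_RE = re.compile(r"params=\{([^}]*)\}")


def top_queries(log_lines, n=10):
    """Count q= values found in request log lines, most frequent first."""
    counts = Counter()
    for line in log_lines:
        match = PARAMS_RE.search(line)
        if not match:
            continue  # not a request line in the assumed format
        for q in parse_qs(match.group(1)).get("q", []):
            counts[q.strip().lower()] += 1  # normalise case and whitespace
    return counts.most_common(n)


# Made-up sample lines, for illustration only.
sample = [
    "INFO [collection1] webapp=/solr path=/select params={q=white+shirt&rows=10} hits=42 status=0 QTime=3",
    "INFO [collection1] webapp=/solr path=/select params={q=White+Shirt&rows=10} hits=42 status=0 QTime=2",
    "INFO [collection1] webapp=/solr path=/select params={q=red+shoes} hits=7 status=0 QTime=1",
    "INFO [collection1] webapp=/solr path=/update params={commit=true} status=0 QTime=15",
]
print(top_queries(sample))
```

The regex stops at the first "}", so a query that itself contains a closing
brace would be cut short. This only covers searched terms. Clicks are not
in the Solr logs at all, so they would have to be tracked on the front end.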

If you have the ability to change the front end and add some JavaScript
code to the UI, or can intercept the search request and make async or
batch calls to APIs for tracking, you can use SearchStax Analytics [1],
which tracks searches, clicks, cart actions, revenue, etc. Sematext also
has a product that offers Search Analytics [2]; however, I am not able to
find it on their website anymore. @Otis?

Sameer
--
https://www.searchstax.com

[1] SearchStax Analytics - Documentation
https://www.searchstax.com/docs/search-analytics-start/
[2] Sematext Search Analytics -
https://sematext.com/blog/whats-new-in-sematext-search-analytics/


On Thu, Jan 18, 2018 at 10:16 AM, Fiz Newyorker 
wrote:

> Hi Team,
>
> I am using Solr 6.5, I want to retrieve the Information on the Frequently
> Searched Terms and User Clicks , Is there way to Store these information
> and Stats ? Where does the Lucene/Solr stores this Information.
>
> Is there way to retrieve this information .
>
> I want to use this information as an input to Search Relevancy.
>
> Please share your thoughts .
>
>
>
> Thanks
> Fiz..
>


Re: Bitnami, or other Solr on AWS recommendations?

2018-01-26 Thread Sameer Maggon
This is shameless promotion, but have you taken a look at SearchStax
(https://www.searchstax.com)? Why not use a Solr-as-a-Service offering?

On Fri, Jan 26, 2018 at 11:24 AM, TK Solr  wrote:

> If I want to deploy Solr on AWS, do people recommend using the prepackaged
> Bitnami Solr image? Or is it better to install Solr manually on a computer
> instance? Or are there a better way?
>
> TK
>
>
>


-- 
Sameer Maggon
Founder, SearchStax, Inc.
https://www.searchstax.com


Re: Using replicas in SOLR-6.5.1

2018-01-27 Thread Sameer Maggon
1. You could just have 2 VMs: one has all 20 shards of your collection, the
other one has the replicas for those shards. In this scenario, if one VM is
not available, you still have application availability as at least one
replica is available for each shard. This assumes that one VM can hold all
the data (all 20 shards) without compromising on performance or getting
into memory or garbage collection issues (I am not sure what the size of
your collection or shards is). For additional redundancy, you can add
another VM and add another replica for all your shards.

2. Can you provide more specifics about what sort of issues you are
thinking of? Replication in general is pretty solid in the version you are
talking about. You could comb through JIRA (
https://issues.apache.org/jira/browse/SOLR-5821?jql=project%20%3D%20SOLR%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20text%20~%20%22replica%22
)

3. I would recommend you take a look at the Solr Collection API (
https://lucene.apache.org/solr/guide/6_6/collections-api.html). Parameters
that you want to pay more attention to are "replicationFactor", "numShards"
and "maxShardsPerNode" that relate to the shards and replicas.
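
To make the two-VM layout from point 1 concrete, here is a small sketch
that builds a Collections API CREATE URL. It assumes a Solr base URL of
http://localhost:8983/solr and a collection name of "mycollection" (both
made up), and create_collection_url is a hypothetical helper. With 20
shards and replicationFactor=2 spread over 2 nodes, each node ends up
hosting 20 cores, hence maxShardsPerNode=20.

```python
from urllib.parse import urlencode


def create_collection_url(solr_url, name, num_shards, replication_factor,
                          max_shards_per_node):
    """Build a Collections API CREATE URL (nothing is sent).

    create_collection_url is a hypothetical helper name, not part of Solr.
    """
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "maxShardsPerNode": max_shards_per_node,
    }
    return f"{solr_url.rstrip('/')}/admin/collections?{urlencode(params)}"


# 20 shards x 2 replicas = 40 cores, 20 per node on a 2-node cluster.
url = create_collection_url(
    "http://localhost:8983/solr", "mycollection", 20, 2, 20
)
print(url)
```

If you later add a third VM for extra redundancy, you would add replicas to
the existing shards rather than re-create the collection.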

If you have a use case that warrants going beyond the above scenario of
having all shards on the same VM, then you should read more about
"maxShardsPerNode", etc. - but perhaps you can share a bit more about that
use case.

Thanks,
-- 
Sameer Maggon
https://www.searchstax.com | Solr-as-a-Service platform on AWS, Azure and
GCP

On Sat, Jan 27, 2018 at 2:08 AM, SOLR4189  wrote:

> I use SOLR-6.5.1. I would like to use SolrCloud replicas. And I have some
> questions:
>
> 1) What is the best architecture for this if my collection contains 20
> shards, and each shard is in different vm? 40 vms where 20 for leaders and
> 20 for replicas? Or maybe stay with 20 vms where leader and replica (of
> another leader) in the same vm but to add RAM?
>
> 2) What are opened issues about replicas in SOLR-6.5.1 that I need to
> check?
>
> 3) If I use SolrCloud replica, which configuration parameters should I
> change? Which can I change?
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Search Analytics Help

2018-05-23 Thread Sameer Maggon
Ennio,

Have you taken a look at SearchStax Analytics?

https://www.searchstax.com/docs/search-analytics-start/

Thanks,




On Wed, May 23, 2018 at 11:34 AM, ennio  wrote:

> Thanks all for the comments. I'm looking at the ELK option here.
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>



-- 
Sameer Maggon
https://www.searchstax.com


Re: saas based search With Solr

2018-06-11 Thread Sameer Maggon
If you are looking for Solr-as-a-Service options, there are a few of them,
including:

SearchStax
OpenSolr
WebSolr

Sameer.

On Mon, Jun 11, 2018 at 10:19 PM Sreenivas.T  wrote:

> All,
>
> Does any one aware of commercially available SAAS based Solr search tool?
>
> Regards,
> Sreenivas
>
-- 
Sameer Maggon
https://www.searchstax.com


Re: Hackdays in October, London & Montreal

2018-07-12 Thread Sameer Maggon
Charlie,

I might be able to get sponsorship from SearchStax for eve/drinks in
Montreal. Do you want to start a thread offline?

Sameer.

On Thu, Jul 12, 2018 at 4:28 AM Charlie Hull  wrote:

> Hi all,
>
> A couple of years ago I ran two free Lucene Hackdays in London and
> Boston (the latter just before Lucene Revolution). Here's what we got up
> to with the kind support of Alfresco, Bloomberg, BA Insight and
> Lucidworks
> http://www.flax.co.uk/blog/2016/10/21/tale-two-cities-two-lucene-hackdays/
>
> I'd like to do this again during the weeks of 8th and 15th October in
> London and Montreal (so just before the Activate event). It's a great
> chance to get together IRL with other Lucene/Solr/Elasticsearch hackers!
> I have a venue for London but a sponsor for evening curry/drinks would
> be wonderful, and for Montreal I still need a venue and evening sponsor
> - do let me know if you or your employer can help.
>
> I'll post again once there are more details and with a call for ideas as
> to what we should work on.
>
> Best
>
> Charlie
> --
> Charlie Hull
> Flax - Open Source Enterprise Search
>
> tel/fax: +44 (0)8700 118334
> mobile:  +44 (0)7767 825828
> web: www.flax.co.uk
>
-- 
Sameer Maggon
https://www.searchstax.com


Re: Getting rid of Master/Slave nomenclature in Solr

2020-06-17 Thread Sameer Maggon
+1 for simplifying and using the Leader/Follower terminology. Our company
operates SolrCloud, Standalone Solr, and Master/Slave configurations.
Outside of the Solr developer community, it's painful and confusing to talk
about Master/Slave and Leader/Replica. It would be easier if we had the
following:

The internal differences between manual configuration or SolrCloud being
smart about managing and assigning roles are just the evolution of the
design and details of a particular mode/implementation and shouldn't matter
to the end-user.

Today, when someone not involved in Solr development looks at the
terminology, it looks like new terminology is introduced without thinking
about existing customers or thinking through the system as a whole and how
to best evolve it (not saying that's what happened, but just a perception).
New terminology should be introduced carefully, and +1 on reducing the
cognitive load on an average guy like me.

- There are leaders and there are followers
- Solr Clusters can be configured in two modes/implementation (SolrCloud or
Master/Slave). This one is hard because you don't want to introduce yet
another name here as people are now already familiar with it.
- These modes happen to have different designs and depending upon the mode,
you can go into the design differences of these two modes.

Cheers!
-- 

*Sameer Maggon*
*SearchStax* | www.searchstax.com


On Wed, Jun 17, 2020 at 2:22 PM gnandre  wrote:

> +1 for Leader-Follower. How about Publisher-Subscriber?
>
> On Wed, Jun 17, 2020 at 5:19 PM Rahul Goswami 
> wrote:
>
> > +1 on avoiding SolrCloud terminology. In the interest of keeping it
> obvious
> > and simple, may I I please suggest primary/secondary?
> >
> > On Wed, Jun 17, 2020 at 5:14 PM Atita Arora 
> wrote:
> >
> > > I agree avoiding using of solr cloud terminology too.
> > >
> > > I may suggest going for "prime" and "clone"
> > > (Short and precise as Master and Slave).
> > >
> > > Best,
> > > Atita
> > >
> > >
> > >
> > >
> > >
> > > On Wed, 17 Jun 2020, 22:50 Walter Underwood, 
> > > wrote:
> > >
> > > > I strongly disagree with using the Solr Cloud leader/follower
> > terminology
> > > > for non-Cloud clusters. People in my company are confused enough
> > without
> > > > using polysemous terminology.
> > > >
> > > > “This node is the leader, but it means something different than the
> > > leader
> > > > in this other cluster.” I’m dreading that conversation.
> > > >
> > > > I like “principal”. How about “clone” for the slave role? That
> suggests
> > > > that
> > > > it does not accept updates and that it is loosely-coupled, only
> > depending
> > > > on the state of the no-longer-called-master.
> > > >
> > > > Chegg has five production Solr Cloud clusters and one production
> > > > master/slave
> > > > cluster, so this is not a hypothetical for us. We have 100+ Solr
> hosts
> > in
> > > > production.
> > > >
> > > > wunder
> > > > Walter Underwood
> > > > wun...@wunderwood.org
> > > > http://observer.wunderwood.org/  (my blog)
> > > >
> > > > > On Jun 17, 2020, at 1:36 PM, Trey Grainger 
> > wrote:
> > > > >
> > > > > Proposal:
> > > > > "A Solr COLLECTION is composed of one or more SHARDS, which each
> have
> > > one
> > > > > or more REPLICAS. Each replica can have a ROLE of either:
> > > > > 1) A LEADER, which can process external updates for the shard
> > > > > 2) A FOLLOWER, which receives updates from another replica"
> > > > >
> > > > > (Note: I prefer "role" but if others think it's too overloaded due
> to
> > > the
> > > > > overseer role, we could replace it with "mode" or something
> similar)
> > > > > ---
> > > > >
> > > > > To be explicit with the above definitions:
> > > > > 1) In SolrCloud, the roles of leaders and followers can dynamically
> > > > change
> > > > > based upon the status of the cluster. In standalone mode, they can
> be
> > > > > changed by manual intervention.
> > > > > 2) A leader does not have to have any followers (i.e. only one
> active
> > > > > replica)
> > > > > 3) Each shard always has one le