Not Able to Build Spellcheck index - SpellCheckComponent.prepare 500 Error
Hi,

I am trying to use spellcheck in Solr with the config below, but it throws an error when I use spellcheck build or reload. Otherwise it works fine for indexed search. Can someone please help me implement spellcheck correctly?

schema.xml:
// fieldType declaration
// field name
// copyFields

solrconfig.xml:
// searchComponent
textSpell solr.IndexBasedSpellChecker default ./spellchecker categoryName,dealName,seoTags,description,dealTitle,merchantName,dealUri,highlights true 0.9
// default requestHandler
explicit true direct on true 5 true true spellcheck

// URL params
select?q=*%3A*&wt=php&indent=true&spellcheck=true&spellcheck.build=true

// output
array(
  'responseHeader'=>array(
    'status'=>500,
    'QTime'=>4,
    'params'=>array(
      'spellcheck'=>'true',
      'indent'=>'true',
      'q'=>'*:*',
      '_'=>'1396684768649',
      'wt'=>'php',
      'spellcheck.build'=>'true')),
  'error'=>array(
    'trace'=>'java.lang.NullPointerException
      at org.apache.solr.handler.component.SpellCheckComponent.prepare(SpellCheckComponent.java:125)
      at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:187)
      at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
      at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
      at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:710)
      at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:413)
      at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:197)
      at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
      at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
      at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
      at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
      at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:767)
      at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
      at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
      at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
      at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
      at org.mortbay.jetty.handler.DebugHandler.handle(DebugHandler.java:77)
      at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
      at org.mortbay.jetty.Server.handle(Server.java:326)
      at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
      at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
      at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
      at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
      at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
      at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
      at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) ',
    'code'=>500))

--
View this message in context: http://lucene.472066.n3.nabble.com/Not-Able-to-Build-Spellcheck-index-SpellCheckComponent-prepare-500-Error-tp4129368.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Not Able to Build Spellcheck index - SpellCheckComponent.prepare 500 Error
It's Solr 4.6.0.
Re: Handling fields used for both display & index
Hi Jay,

You could use one field for both unless there is a specific requirement you are looking for that is not being met by that one field (e.g. faceting, etc).

Typically, if you have a field that is marked as both "indexed" and "stored", the value that is passed while indexing to that field is stored as is. However, it's indexed based on the field type that you've specified for that field.

e.g. a description field with the field type of "text_en" would be indexed per the pipeline in the text_en fieldtype, and the text as is will be stored (which is what is returned in your response in the results).

Thanks,
--
*Sameer Maggon*
Measured Search | Solr-as-a-Service | Solr Monitoring | Search Analytics
www.measuredsearch.com

On Sun, Jan 31, 2016 at 5:56 PM, Jay Potharaju wrote:
> Hi,
> I am trying to decide if I should use text_en or string as my field type.
> The fields have to be both indexed and stored for display. One solution is
> to duplicate fields, one for indexing, the other for display. One of the fields
> happens to be a description field which I would like to avoid duplicating.
> Solr should return results when someone searches for John or john. Is
> storing a copy of the field the best way to go about this problem?
>
> Thanks
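To make the indexed-versus-stored distinction in the reply above concrete, here is a rough sketch in plain Python. It is not Solr code: the `analyze` function is only a stand-in (an assumption) for a text_en-style analysis chain and does nothing beyond tokenizing and lowercasing, and `TinyField` is a hypothetical name used for illustration.

```python
import re


def analyze(text):
    """Crude stand-in for an analysis chain: split into word tokens and lowercase them."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]


class TinyField:
    """Toy field that is both 'stored' (original text kept) and 'indexed' (analyzed tokens kept)."""

    def __init__(self):
        self.stored = {}  # doc id -> original value, returned for display
        self.index = {}   # analyzed token -> set of doc ids, used for matching

    def add(self, doc_id, value):
        self.stored[doc_id] = value
        for tok in analyze(value):
            self.index.setdefault(tok, set()).add(doc_id)

    def search(self, query):
        hits = set()
        for tok in analyze(query):
            hits |= self.index.get(tok, set())
        # Matching used the analyzed tokens; display uses the stored original.
        return [self.stored[doc_id] for doc_id in sorted(hits)]


field = TinyField()
field.add(1, "John Smith wrote the description.")
print(field.search("john"))
print(field.search("John"))
```

Both searches match the same document because the query goes through the same analysis as the indexed text, while the value returned is the stored original with its capitalization intact. That is why a single indexed and stored field can cover both needs.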
Re: Amazon CloudSearch
Hi Sergio,

CloudSearch is a Search-as-a-Service that uses Solr underneath, though it has a proprietary API to interact with it, both on the document side and the query side. It won't give you the ability to 'manage' Solr instances or clusters. If you have a use case where you want to keep on pumping data and forget about it, it is good for that, as it offers auto-scaling. It does not offer visibility into what's going on underneath, and there is no control over solrconfig, custom plugins, etc. No spellcheck, etc.

If you are looking for a service that provides you direct access to Solr's APIs without having to rewrite your application, then CloudSearch is probably not what you are looking for.

Take a look at Measured Search (www.measuredsearch.com) - it offers Solr-as-a-Service on top of AWS, Azure and Google Cloud that allows you direct access to Solr and the ability to manage your instances. The platform currently comprises three products:

1. SearchStax Cloud Manager - Allows you to deploy, manage and scale Solr.
   - Provides High Availability as instances are front-ended with ELB (load balancers).
   - One-time and scheduled backups.
   - Cloning of deployments.
   - Ability to add / remove nodes, real-time log access and log archival.
   - All deployments run on https, supports auth.
   - Enterprise version allows you to deploy & manage Solr within your AWS account as well.
   - Zookeeper deployment & setup.
   - Access to deploy custom JARs, etc.
   - Supports Solr 4.8 and above (self-serve version supports Solr 5.2.1 and 5.3.1).

2. SearchStax Pulse - Monitoring and Alerting for your Solr Clusters.
   - System-level monitoring
   - GC monitoring
   - Search & Indexing monitoring
   - Cache statistics
   - Alerting on any of the above metrics at host and collection level.
   - PagerDuty integration

3. SearchStax Analytics - User behavior analytics that allows you to track application-level interactions and metrics to help you optimize your search.
   - Total searches
   - No-result searches
   - Click-through rates
   - Conversion metrics for e-commerce scenarios
   - Query-level details
   - Advanced version includes MRR reports, average click positions, etc.

Lastly, Measured Search provides 24x7x365 support and auto-scaling for customers that elect for it.

Thanks,
Sameer.

On Tuesday, April 26, 2016, marotosg wrote:
> Hi,
>
> I am evaluating the possibility of using Amazon CloudSearch to manage Solr
> instances. The reason is the price and the time to manage and deploy. I am not fully
> sure yet how flexible that service is, in case you need to install a
> specific Solr version or plugin.
> Do you have any experience with it?
>
> Would you please share any thoughts?
>
> Thanks
> Sergio

--
*Sameer Maggon*
Measured Search
c: 310.344.7266
www.measuredsearch.com <http://measuredsearch.com>
Re: SolrCloud clarification/Question
Absolutely. You can have a collection with a single shard (that is, no sharding of the data) and multiple replicas for redundancy, and have a load balancer in front of it that removes the dependency on a single node. One of the replicas will assume the role of leader, and in case that leader goes down, one of the other replicas will be elected as leader and your application will be fine.

Thanks,

On Wed, Sep 16, 2015 at 9:44 AM, Ravi Solr wrote:
> Hello,
> We are trying to move away from a Master-Slave configuration to a
> SolrCloud environment. I have a couple of questions. Currently in the
> Master-Slave setup we have 4 machines, 2 of which are indexers and 2 of which
> are query servers. The query servers are fronted via a Load Balancer.
>
> There are 3 Solr cores for 3 different/separate applications (mutually
> exclusive). Each core is a complete index of all docs (i.e. the data is not
> sharded).
>
> We intend to keep it in a non-sharded mode even after the move to SolrCloud
> mode. The prime motivation to move to cloud is to effectively use all
> servers for indexing and querying (read fault tolerant/redundant).
>
> So, the real question is, can SolrCloud be used without shards? i.e. a
> "collection" resides entirely on one machine rather than partitioning data
> onto different machines?
>
> Thanks
>
> Ravi Kiran Bhaskar

--
*Sameer Maggon*
Measured Search
c: 310.344.7266
www.measuredsearch.com <http://measuredsearch.com>
Re: SolrCloud clarification/Question
You'll have to say numShards=1 and replicationFactor=2.

http://[hostname]:8983/solr/admin/collections?action=CREATE&name=test&collection.configName=test&numShards=1&replicationFactor=2

On Wed, Sep 16, 2015 at 11:23 AM, Ravi Solr wrote:
> Thank you very much for responding Sameer, so numShards=0 and
> replicationFactor=4 if I have 4 machines ??
>
> Thanks
>
> Ravi Kiran Bhaskar

--
*Sameer Maggon*
Measured Search
c: 310.344.7266
www.measuredsearch.com <http://measuredsearch.com>
Re: SolrCloud clarification/Question
I just gave an example API call, but for your scenario, the replicationFactor will be 4 (replicationFactor=4). In this way, all 4 machines will have the same copy of the data and you can put an LB in front of those 4 machines.

On Wed, Sep 16, 2015 at 12:00 PM, Ravi Solr wrote:
> OK...I understood numShards=1, when you say replicationFactor=2 what does
> it mean ? I have 4 machines, then, only 3 copies of data (1 at leader and 2
> replicas) ?? So am I not under-utilizing one machine ?
>
> I was thinking more along the lines of a mesh connectivity format, i.e.
> everybody has the others' copy so that I can put all 4 machines behind a Load
> Balancer...Is that a wrong way to look at it ?
>
> Thanks
>
> Ravi Kiran

--
*Sameer Maggon*
Measured Search
c: 310.344.7266
www.measuredsearch.com <http://measuredsearch.com>
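For reference, the Collections API call discussed in this thread can be assembled programmatically. The sketch below only builds the URL and does not send it. The host name `solr1.example.com` and the helper name `create_collection_url` are hypothetical, and `collection.configName` is the parameter name as I understand it from the Collections API, so please check it against the reference guide for your Solr version.

```python
from urllib.parse import urlencode


def create_collection_url(host, name, config_name, replicas, port=8983):
    """Build a Collections API CREATE URL for an unsharded collection (numShards=1)."""
    params = {
        "action": "CREATE",
        "name": name,
        "collection.configName": config_name,  # config set already uploaded to ZooKeeper
        "numShards": 1,                         # a single shard, i.e. no partitioning of the data
        "replicationFactor": replicas,          # total copies of that shard, leader included
    }
    return f"http://{host}:{port}/solr/admin/collections?{urlencode(params)}"


# Four machines, each holding a full copy of the index.
url = create_collection_url("solr1.example.com", "test", "test", replicas=4)
print(url)
```

With replicationFactor=4 and four live nodes, each node ends up with one copy of the single shard, which matches the "all 4 machines behind a load balancer" layout described above.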
Re: How to check Zookeeper ensemble status?
Have you tried zkServer.sh status?

This will tell you whether zookeeper is running or not and whether it's acting as a leader or follower.

Sameer.

On Friday, September 18, 2015, Merlin Morgenstern <merlin.morgenst...@gmail.com> wrote:
> I am running a 3 node zookeeper ensemble on 3 machines dedicated to
> SolrCloud 5.2.x
>
> Inside the Solr Admin-UI I can check "live nodes", but how can I check if
> all three zookeeper nodes are up?
>
> I am asking since node2 has 25% CPU usage by zookeeper while being idle
> and I wonder what the cause is. Maybe zookeeper can not connect to the
> other nodes or whatever it is, which brought me to the question of how to
> check if all 3 nodes are operational.
>
> Thank you for any help on this!

--
*Sameer Maggon*
Measured Search
c: 310.344.7266
www.measuredsearch.com <http://measuredsearch.com>
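If you want to run the same check against all three nodes from a script, one option is ZooKeeper's "srvr" four-letter command, which reports the server's mode among other things. This is a minimal sketch under a few assumptions: the four-letter commands are reachable on the client port (2181 here), the helper names `parse_mode` and `zk_mode` are made up for illustration, and the sample text is invented to show the parsing rather than being real server output.

```python
import socket


def parse_mode(srvr_output):
    """Extract the value of the 'Mode:' line (e.g. leader, follower, standalone), or None."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def zk_mode(host, port=2181, timeout=3.0):
    """Send 'srvr' to one ZooKeeper node and return its reported mode."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"srvr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return parse_mode(b"".join(chunks).decode("utf-8", "replace"))


# Invented sample text, used only to exercise the parser without a live server.
sample = "Zookeeper version: 3.4.6\nLatency min/avg/max: 0/0/12\nMode: follower\nNode count: 27\n"
print(parse_mode(sample))

# Against a real ensemble (host names are placeholders):
# for host in ("zk1", "zk2", "zk3"):
#     print(host, zk_mode(host))
```

A healthy 3-node ensemble should show one leader and two followers. A node that cannot be reached, or that returns no "Mode:" line, is the one to look at first.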
Re: Using books.json in solr
Hi Salonee,

Can you post the query and your schema file too?

Thanks,
--
*Sameer Maggon*
www.measuredsearch.com <http://measuredsearch.com/>
Solr Cloud Hosting | Managed Services | Solr Consulting

On Tue, Oct 27, 2015 at 10:44 AM, Salonee Rege wrote:
> Hello,
> We are trying to query the books.json that we have posted to Solr. But
> when we try to specifically query it on genre it does not return a complete
> JSON with valid key-value pairs. Kindly help.
>
> *Salonee Rege*
> USC Viterbi School of Engineering
> University of Southern California
> Master of Computer Science - Student
> Computer Science - B.E
> salon...@usc.edu *||* *619-709-6756*
Re: Using books.json in solr
Hi Salonee,

I believe you missed adding the query screenshot?

Sameer.

On Tue, Oct 27, 2015 at 10:57 AM, Salonee Rege wrote:
> Please find attached the following books.json which is in the example-docs
> file for your reference. And a screenshot of querying it on the field
> fantasy for genre key.
> Thanks for the help.
>
> On Tue, Oct 27, 2015 at 10:47 AM, Rallavagu wrote:
>
>> Could you please share your query? You could use "wt=json" query
>> parameter to receive JSON formatted results if that is what you are looking
>> for.

--
*Sameer Maggon*
Measured Search
c: 310.344.7266
www.measuredsearch.com <http://measuredsearch.com>
Re: Solr Suggester with Geo?
Have you looked at the Spatial extensions for Solr? If you are indexing Lat/Lon along with your documents, you can compute the distance from the origin & use that distance as one of the boost factors to affect the score. Typically, use cases around that combine the geo score with other factors, as a pure sort by geo score might not give you the relevant results.

e.g. typing to search for "sushi restaurants" near Santa Monica, CA - you might not want "thai restaurants" that are closest to you. (Local Search use case)

https://cwiki.apache.org/confluence/display/solr/Spatial+Search

Thanks,
--
*Sameer Maggon*
www.measuredsearch.com <http://measuredsearch.com/>
Fully Managed Solr-as-a-Service | Solr Consulting | Solr Support

On Mon, Nov 9, 2015 at 11:18 AM, William Bell wrote:
> http://lucidworks.com/blog/solr-suggester/
>
> Wondering if anyone has used these new techniques with a boost on
> geodist() inverted? So the rows that get returned that are closest
> need to come back first.
>
> We are still using Edge Grams since we have not figured out how to
> boost the results on geo spatial.
>
> Anyone have thoughts?
>
> --
> Bill Bell
> billnb...@gmail.com
> cell 720-256-8076
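As a rough illustration of using distance as one boost factor rather than a pure sort, the sketch below builds the request parameters for an edismax query. The field name `location`, the coordinates, and the helper name `geo_boost_params` are assumptions for illustration. The `recip(geodist(),2,200,20)` function is the kind of distance-decay boost shown in the Spatial Search documentation, and its constants would need tuning for real data.

```python
from urllib.parse import urlencode


def geo_boost_params(text, lat, lon, sfield="location"):
    """Parameters for a text query whose score is multiplied by a distance-based factor."""
    return {
        "q": text,
        "defType": "edismax",
        "sfield": sfield,                      # lat/lon field in the schema (assumed name)
        "pt": f"{lat},{lon}",                  # the searcher's position
        "boost": "recip(geodist(),2,200,20)",  # closer documents get a larger multiplier
    }


params = geo_boost_params("sushi restaurants", 34.0195, -118.4912)
print("/select?" + urlencode(params))
```

Because the boost multiplies the text relevance score instead of replacing it, a nearby "thai restaurant" that matches the query poorly should still rank below a slightly farther sushi place.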
Re: Solr Suggester with Geo?
Looking through the code and some example Suggesters, it seems that, theoretically, one can write a GeoSuggester and provide that as the Lookup implementation (lookupImpl) that would factor in the geo score, or extend the SolrSuggester to support spatial extensions in the same spirit as "Filters" are supported today.

Sameer.

On Mon, Nov 9, 2015 at 11:47 AM, William Bell wrote:
> Yeah we have that working today. But the issue is we want to use
> http://lucidworks.com/blog/solr-suggester/
>
> And you cannot do a boost with that right?
Re: Generating Index offline and loading into solrcloud
If you are trying to create a large index and want speedups there, you could use the MapReduceIndexerTool - https://github.com/cloudera/search/tree/cdh5-1.0.0_5.2.1/search-mr. At a high level, it takes your files (CSV, JSON, etc.) as input and can create either a single or a sharded index that you can then copy to your Solr servers. I've used this to create indexes that include hundreds of millions of documents in a fairly decent amount of time.

Thanks,
--
*Sameer Maggon*
Measured Search
www.measuredsearch.com <http://measuredsearch.com/>

On Thu, Nov 19, 2015 at 11:17 AM, KNitin wrote:
> Hi,
>
> I was wondering if there are existing tools that will generate a Solr index
> offline (in SolrCloud mode) that can be later on loaded into SolrCloud,
> before I decide to implement my own. I found some tools that do only Solr
> based index loading (non-zk mode). Is there one with zk mode enabled?
>
> Thanks in advance!
> Nitin
Re: Fully automated replica creation in AWS
Erick,

Typically, while creating collections, a replicationFactor is specified. Thus, the metadata about the collection does have information about what the "desired" replicationFactor is for the collection. If that's the case, when a Solr node joins the cluster, there could be a pro-active add-replica operation that can be initiated if Solr detects that the current replicas are fewer than the desired replicationFactor, and pull the collection data from the leader.

Isn't that what the attribute "autoAddReplicas" does for HDFS - can this be done for a non-shared filesystem?

As a side note, we do this for our customers as that's baked into our cloud provisioning software, but it would be nice if Solr supported that OOTB. Are there any underlying flaws in doing that?

Thanks,
--
*Sameer Maggon*
www.measuredsearch.com | Deploy, Scale & Manage Solr in the cloud of your choice.

On Wed, Dec 9, 2015 at 11:19 AM, Erick Erickson wrote:
> Not that I know of. The two systems are somewhat disconnected.
> AWS doesn't know that Solr lives on those nodes, it's just spinning
> one up, right? Albeit with Solr running.
>
> There's nothing in Solr that auto-detects the existence of a new
> Solr node and automagically assigns collections and/or replicas.
>
> How would either system intuit that this new node is replacing
> something else and "do the right thing"?
>
> I'll tell you how, by interrogating Zookeeper and seeing that for some
> specific collection, shardX had fewer replicas than other shards and
> issuing the Collections API ADDREPLICA command.
>
> But now there are _three_ systems that need to be coordinated and
> doing the right thing in your situation would be the wrong thing in
> another. The last thing many sys ops want is having replicas started
> without their knowledge.
>
> And on top of that, I have doubts about the model. Having AWS
> elastically spin up a new replica is a heavyweight operation from
> Solr's perspective. I mean this potentially copies a many G set of
> index files from one place to another which could take a long time,
> is that really what's desired here?
>
> I have seen some folks spin up/down Solr instances based on a
> schedule if they know roughly when the peak load will be, but again
> there's nothing built in to handle this.
>
> Best,
> Erick
>
> On Wed, Dec 9, 2015 at 10:15 AM, Ugo Matrangolo wrote:
> > Hi,
> >
> > I was trying to set up a SolrCloud cluster in AWS backed by an ASG (auto
> > scaling group) serving a replicated collection. I have just come across a
> > case where one of the Solr nodes became unresponsive, with AWS killing it and
> > spinning up a new one.
> >
> > Unfortunately, this new Solr node did not join as a replica of the existing
> > collection, requiring human intervention to configure it as a new replica.
> >
> > I was wondering if there is something around that will make this process
> > fully automated by detecting that a new node just joined the cluster and
> > instructing it (e.g. via Collections API) to join as a replica of a given
> > collection.
> >
> > Best
> > Ugo
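The approach Erick outlines (look at the cluster state, find a shard with fewer replicas than wanted, issue ADDREPLICA) could be scripted roughly as follows. This is only a sketch under stated assumptions: the `shards` dictionary mimics the general shape of cluster state data but is hand-written here, the collection name `products` and node name `10.0.0.5:8983_solr` are placeholders, and the helper names are hypothetical. Nothing is sent to Solr; the code only decides which shards are short and builds the corresponding URLs.

```python
from urllib.parse import urlencode


def under_replicated(shards, desired):
    """Return the names of shards that have fewer than `desired` active replicas."""
    short = []
    for shard_name, shard in sorted(shards.items()):
        active = [r for r in shard["replicas"].values() if r.get("state") == "active"]
        if len(active) < desired:
            short.append(shard_name)
    return short


def add_replica_url(base, collection, shard, node):
    """Build a Collections API ADDREPLICA URL that places the new replica on `node`."""
    params = {"action": "ADDREPLICA", "collection": collection, "shard": shard, "node": node}
    return f"{base}/admin/collections?{urlencode(params)}"


# Hand-written example state: one shard, one healthy replica, one replica lost with its node.
shards = {
    "shard1": {"replicas": {
        "core_node1": {"state": "active"},
        "core_node2": {"state": "down"},
    }},
}

for shard in under_replicated(shards, desired=2):
    print(add_replica_url("http://localhost:8983/solr", "products", shard, "10.0.0.5:8983_solr"))
```

As noted in the thread, adding a replica copies the full index for that shard to the new node, so any automation along these lines probably wants a delay or an operator confirmation before it acts.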
Re: [ANN] Relevant Search by Manning out! (Thanks Solr community!)
Congrats Doug & John, will order a copy!

Thanks,

On Tuesday, June 21, 2016, Doug Turnbull <dturnb...@opensourceconnections.com> wrote:
> Not much more to add than my post here! This book is targeted towards
> Lucene-based search (Elasticsearch and Solr) relevance.
>
> Announcement with discount code:
> http://opensourceconnections.com/blog/2016/06/21/relevant-search-published/
>
> Related hacker news thread:
> https://news.ycombinator.com/item?id=11946636
>
> Thanks to everyone in the Solr community that was helpful to my efforts.
> Specifically Trey Grainger, Eric Pugh (for keeping me employed), Charlie
> Hull and the Flax team, Alex Rafalovitch, Timothy Potter, Yonik Seeley,
> Grant Ingersoll (for basically teaching me Solr back in the day), Drew
> Farris (for encouraging my early blogging), everyone at OSC, and many
> others I'm probably forgetting!
>
> Best
> -Doug

--
*Sameer Maggon*
www.measuredsearch.com
1.844.9.SEARCH
Measured Search is the only *Fully Managed Solr as a Service* multi-cloud capable offering. Plus utilize our *On Demand Expertise* to build your applications faster and with more confidence.
Re: Solr and Drupal
Hi John,

As John B. mentioned, you can utilize the plugin here - https://www.drupal.org/project/apachesolr

If you are looking to not have to worry about hosting, deployment, scaling and management, you can take a look at SearchStax by Measured Search to get a Solr deployment up and running in a couple of minutes and not have to get into installing Solr and going through a learning curve around setup and scale.

Thanks,
Sameer.

On Tue, Aug 9, 2016 at 12:11 PM, Rose, John B wrote:
> We are looking at Solr for a Drupal web site. We have never installed Solr.
>
> From my readings it is not clear exactly what we need to implement a
> search in Drupal with Solr. Some sites have implied Lucene and/or Tomcat
> are needed.
>
> Can someone point me to the site that explains minimally what is needed to
> implement Solr within Drupal?
>
> Thanks for your time

--
*Sameer Maggon*
www.measuredsearch.com
1.844.9.SEARCH
Re: Monitoring Apache Solr
Hardika,

You can sign up at www.measuredsearch.com and take a look at SearchStax Pulse, which provides detailed monitoring for Solr deployments, both single-node and cloud setups.

Feel free to reach out to me if you have any questions around it.

Thanks,
Sameer.

On Tuesday, August 30, 2016, Hardika Catur S wrote:
> Hi,
>
> I am trying to monitor Apache Solr, because Solr often runs out of heap and the
> collection status becomes "down". How do I monitor Apache Solr ??
> Are there any tools for monitoring Solr ??
>
> Please help me to find a solution.
>
> Thanks,
> Hardika CS.

--
*Sameer Maggon*
www.measuredsearch.com
1.844.9.SEARCH
Re: SolrMeter is dead?
Have you looked at JMeter - http://jmeter.apache.org/

Thanks,
Sameer.
--
http://measuredsearch.com

On Wed, May 7, 2014 at 7:51 AM, Al Krinker wrote:
> I am trying to test performance of my cluster (solr 4.8).
>
> SolrMeter looked promising... small and standalone. Plus, open source so
> that I could make tweaks if needed.
>
> However, I see that the last update date was in Oct 2012. Is it dead? Any
> better non-commercial and preferably open-source projects out there?
>
> Thanks,
> Al
Re: writing logs of a speicific solr posting to a file
Check out the patch on the issue below. We hit the same issue and posted a patch. None of the committers have picked it up yet, but it would be good to get some feedback on it and get this into the next dot release. If it works for you, please vote it up.

https://issues.apache.org/jira/browse/SOLR-5940

Thanks,
--
*Sameer Maggon*
Founder | Measured Search
http://measuredsearch.com

On Mon, Jun 9, 2014 at 3:48 AM, pshahukhal wrote:
> Hi
> I am using SimplePostTool to post the XML files to Solr like:
>
> java -Durl=http://localhost:8080/solr/collection1/update -jar /var/lib/tomcat6/solr/collection1/dump/xmlinput/post.jar /var/lib/tomcat6/solr/collection1/dump/xmlinput/solr.xml
>
> When there are certain errors, the response from the above command just shows
> the 404 error or 500 server error but doesn't provide the complete log
> details like in
> http://localhost:8080/solr/#/~logging or in catalina.out
> I want to catch the exact log details that are thrown in the logs when
> the above command is executed and write them to a file. I am wondering if there
> are additional params that need to be passed in the command line or whether I have
> to work in the configurations.
Re: running Post jar from different server
Ravi,

post.jar is a standalone utility that does not have to be on the same server. If you can share the command you are executing, there might be some pointers in there.

Thanks,
--
*Sameer Maggon*
http://measuredsearch.com

On Thu, Jun 19, 2014 at 8:54 PM, EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions) wrote:
> Hi, I have a situation where my SQL job initiates a console application,
> where I am calling post.jar to upload data to Solr. The SQL DB and
> Solr are on 2 different servers.
>
> I am calling post.jar from my SQL DB server, where the path is mapped to a network
> drive. I am getting a "file not found" error.
>
> Is the above scenario possible? If anyone has some experience with this,
> can you share it? Any direction will be really appreciated.
>
> Thanks
>
> Ravi
Re: POST Vs GET
Ravi,

The POST should work. Here's an example that works within tomcat.

curl -X POST --data "q=*:*&rows=1" http://localhost:8080/solr/collection1/select

Sameer.

On Mon, Jun 23, 2014 at 10:37 AM, EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions) wrote:
> Hi, I am executing a Solr query that runs 10 to 12 lines with all the boosting
> and conditions. I changed the HTTP method from GET to POST, as POST
> doesn't have any restriction on size. But I am getting an error. I am
> using Tomcat 7. Is there any place we need to specify in Tomcat to accept
> POST?
>
> FYI, with my Jetty Solr version everything works fine.
>
> Thanks
>
> Ravi

--
*Sameer Maggon*
http://measuredsearch.com
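For a long query built in application code, the same form-encoded POST as the curl example can be prepared like this. The sketch only constructs the request object and does not send it. The base URL is taken from the curl example above, and the helper name `build_select_post` is made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_select_post(base_url, params):
    """Prepare a POST to /select with the query parameters in a form-encoded body."""
    body = urlencode(params).encode("utf-8")
    return Request(
        base_url + "/select",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


req = build_select_post("http://localhost:8080/solr/collection1", {"q": "*:*", "rows": 1})
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To actually send it:
# from urllib.request import urlopen
# with urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Putting the parameters in the body rather than the URL is what avoids the request-line length limits that long GET queries tend to hit.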
Re: New to Solr - Need advice on clustering
Anders, Take a look at Solr Replication. Essentially, you'll treat one as a master & one as a slave. Both master & slave can be used to serve traffic. If one of them goes down, the other can be used as a master for the interim. http://wiki.apache.org/solr/SolrReplication Sameer. -- http://measuredsearch.com On Mon, Nov 25, 2013 at 9:50 PM, Anders Kåre Olsen wrote: > > Hi Gora > > Thank you for your reply. > > We are planning on having a loadbalancer in front of our frontend servers. > > If I have two distinct solr indexes, how will I keep them synchronized? I > expect that one of the frontend servers will have the task of updating the > product repository on the e-commerce site. This server will then update the > local solr index after product update has finished. > > Is there an easy way that I can keep the two indexes synchronized without > solrcloud? > > Regards > Anders > > -Oprindelig meddelelse- From: Gora Mohanty > Sent: Tuesday, November 26, 2013 2:37 AM > To: solr-user@lucene.apache.org > Subject: Re: New to Solr - Need advice on clustering > > > On 26 November 2013 01:44, Anders Kåre Olsen wrote: > >> Hi Solr-users >> >> I’m trying to setup Solr for search and indexing on the project I’m >> working on. >> >> My project is a e-commerce B2B solution. We are planning on setting up 2 >> frontend servers for the website, and I was planning on installing Solr on >> these servers. We are using Windows Server 2012 for the frontend servers. >> >> We are not expecting a huge load on the servers, so we expect these 2 >> servers to be adequate to handle both the website and search index. >> >> I have been looking at SolrCloud and ZooKeeper. Howver I have read that >> you need at least 3 ZooKeepers in an ensamble, and I only have 2 servers. >> >> I need to handle the situation where one of the servers crashes, so I >> need both servers to have a Solr index. >> > [...] 
> > If you do not want to get into SolrCloud, a simpler > solution might be a HTTP load balancer in front of > the two Solr instances. Hardware load balancers are > better, but more expensive. A software load balancer > like haproxy should meet your needs. > > Regards, > Gora >
Re: Off-line search on mobile devices
1. Which platform are you looking at? Android, iOS, other? If you are on Android, you can directly use lucene to build an embedded solution for search. Depending upon your need, that can offer a small enough footprint. We've done some work around embedding lucene for a specific application on Android, happy to brainstorm offline. Thanks, Sameer. -- http://measuredsearch.com On Mon, Dec 16, 2013 at 3:07 PM, Arcadius Ahouansou wrote: > Hello. > > We are planning to offer search as an embedded functionality into > mobile/low-power devices. > > The main requirement are: > > - ability to index and search documents available on the mobile device, > - no need of internet access, > - lightweight, low footprint and fast > > We are looking into various options. > > As I understand it, Solr would be way too heavy for mobile devices. > > Has anyone used Lucene/Solr for off-line search on mobile devices? > > Are there better alternatives for off-line full-text search? > > Many thanks. > > Arcadius. > -- Sameer Maggon Founder, Measured Search m: 310.344.7266 tw: @measuredsearch w: http://www.measuredsearch.com
Re: Off-line search on mobile devices
Might want to look into http://clucene.sourceforge.net/ & http://lucene.apache.org/pylucene/jcc/install.html. Unfortunately, I don't have direct experience with either. -S. On Tue, Dec 17, 2013 at 8:46 AM, Arcadius Ahouansou wrote: > Hi Sameer. > It's a generic Linux device, not iOS/Android. > > Thanks. > > Arcadius. > > > > On 16 December 2013 23:11, Sameer Maggon > wrote: > > > 1. Which platform are you looking at? Android, iOS, other? > > > > If you are on Android, you can directly use lucene to build an embedded > > solution for search. Depending upon your need, that can offer a small > > enough footprint. We've done some work around embedding lucene for a > > specific application on Android, happy to brainstorm offline. > > > > Thanks, > > Sameer. > > -- > > http://measuredsearch.com > > > > > > > > On Mon, Dec 16, 2013 at 3:07 PM, Arcadius Ahouansou < > arcad...@menelic.com > > >wrote: > > > > > Hello. > > > > > > We are planning to offer search as an embedded functionality into > > > mobile/low-power devices. > > > > > > The main requirement are: > > > > > > - ability to index and search documents available on the mobile device, > > > - no need of internet access, > > > - lightweight, low footprint and fast > > > > > > We are looking into various options. > > > > > > As I understand it, Solr would be way too heavy for mobile devices. > > > > > > Has anyone used Lucene/Solr for off-line search on mobile devices? > > > > > > Are there better alternatives for off-line full-text search? > > > > > > Many thanks. > > > > > > Arcadius. > > > > > > > > > > > -- > > Sameer Maggon > > Founder, Measured Search > > m: 310.344.7266 > > tw: @measuredsearch > > w: http://www.measuredsearch.com > > > -- Sameer Maggon Founder, Measured Search m: 310.344.7266 tw: @measuredsearch w: http://www.measuredsearch.com
Re: Limit amount of search result
Chun,

Have you looked at the Grouping / Field Collapsing feature in Solr?

https://wiki.apache.org/solr/FieldCollapsing

If shop is one of your fields, you can use field collapsing on that field, with a maximum of 'n' documents to return per field value (or group).

Sameer.
--
www.measuredsearch.com
tw: measuredsearch

On Wednesday, February 12, 2014, rachun wrote:
> Dear all gurus,
>
> I would like to limit the number of search results. Let's say I have many shops that sell shirts. When I search for "white shirt", I want to return a maximum number of results per shop (e.g. 5).
>
> The result should be like this...
> -> Shop A
> -> Shop A
> -> Shop B
> -> Shop B
> -> Shop B
> -> Shop B
> -> Shop B
> -> Shop C
> -> Shop C
> -> Shop C
>
> Any suggestion would be very much appreciated.
>
> Thank you very much,
> Chun.
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Limit-amount-of-search-result-tp4117062.html
> Sent from the Solr - User mailing list archive at Nabble.com.

--
Sameer Maggon
Founder, Measured Search
m: 310.344.7266
tw: @measuredsearch
w: http://www.measuredsearch.com
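The field collapsing suggestion above maps onto a handful of request parameters. The sketch below assembles them in Python; the helper name `grouped_query_params` is hypothetical, and the field name `shop` and the limit of 5 are taken from the question, assuming `shop` is an indexed, single-valued field in the schema. `group`, `group.field` and `group.limit` are the standard result-grouping parameters.

```python
from urllib.parse import urlencode


def grouped_query_params(query, group_field, per_group_limit):
    """Return Solr request parameters that cap the results per group.

    Hypothetical helper: group=true turns result grouping on, group.field
    names the field to collapse on, and group.limit is the maximum number
    of documents returned for each distinct value of that field.
    """
    return {
        "q": query,
        "group": "true",
        "group.field": group_field,
        "group.limit": per_group_limit,
    }


# At most 5 documents per shop for the "white shirt" search from the question.
params = grouped_query_params("white shirt", "shop", 5)
query_string = urlencode(params)
print("/select?" + query_string)
```

With grouping enabled, `rows` should then count groups (shops) rather than individual documents, so it is worth checking that value against the number of shops you want on a page.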
Re: Limit amount of search result
You are welcome! On Mon, Feb 17, 2014 at 11:07 PM, rachun wrote: > hi Samee, > > Thank you very much for your suggestion. > Now I got it worked now;) > > Chun. > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Limit-amount-of-search-result-tp4117062p4117952.html > Sent from the Solr - User mailing list archive at Nabble.com. >
Need to write a start.jar file
Salaam,

I read somewhere that it is better to write a new start.jar file than to use the one provided in the example directory. Can someone please point me to some documentation that would help me write my own start.jar file?

Regards,
Muhammed Sameer
Re: Need to write a start.jar file
Salaam, Thanks for the response, I'll only change this if I need any customization done Regards, Muhammed Sameer --- On Wed, 11/5/08, Erik Hatcher <[EMAIL PROTECTED]> wrote: > From: Erik Hatcher <[EMAIL PROTECTED]> > Subject: Re: Need to write a start.jar file > To: solr-user@lucene.apache.org > Date: Wednesday, November 5, 2008, 5:27 AM > I've never heard of this need to provide a customized > start.jar. Could you send us a pointer to where you read > that if you still have that available? > > But, no, there is no need to provide a different start.jar. > However, Jetty is really just one example of how you deploy > Solr - any modern servlet container should be fine. I'd > just stick with Jetty and the built-in start.jar unless you > have a compelling reason to switch. > > Erik > > > On Nov 4, 2008, at 11:16 PM, Muhammed Sameer wrote: > > > Salaam, > > > > I read somewhere that it is better to write a new > start.jar file than use the one that is provided within the > example directory, can someone please guide me to some > documentation that can help me achieve this and write out my > own start.jar file. > > > > Regards, > > Muhammed Sameer > > > > > >
Redirecting output of post.jar and start.jar
Salaam,

When I run post.jar or start.jar, it writes a lot of information to the screen. I even tried redirecting the output, but that does not seem to help. I have configured a cron job to run post.jar every 2 minutes to keep the index updated, and each time it runs it writes a lot of output to the console.

Q1) What can I do so that start.jar and post.jar do not send output to stdout?
Q2) Is running post.jar every 2 minutes a correct way of keeping the indexes updated, or is there a saner way?

Regards,
Muhammed Sameer
Re: Joining Solr Indexes
IndexMergeTool - http://wiki.apache.org/solr/MergingSolrIndexes Sameer. -- http://www.productification.com On Wed, Jan 28, 2009 at 7:30 AM, Jae Joo wrote: > Hi, > > Is there any way to join multiple indexes in Solr? > > Thanks, > > Jae >
Multiple Masters - Solr Replication (1.4)
I have been playing around with replication in Solr 1.4 and I must say that it's a big "ease of use" improvement over scripts. I do have a few questions about it, though.

1. Is there a way to specify multiple master URLs in the slaves?

I want to make sure I have redundancy, so that if one master goes down the slaves automatically start taking data from the other master. If not, has anyone tried a load balancer approach, where you put the multiple masters behind the LB and have the slaves talk to the LB?

2. Is there a plan to add multicast support to Solr Replication?

If I have ~100 slaves talking to the master over rsync, I see two problems:
a. Network
b. Master being choked as it's getting requests from 100 machines

Thoughts?

Thanks,
Sameer.
--
http://www.productification.com
NullPointerException while performing Merge
In our application, we are getting NullPointerExceptions very frequently. It seems to be happening during the merge operation (commit). There are no exceptions while adding documents to Solr. We are using Solr 1.3.0.

I looked around the mailing list and found that there is a JIRA issue open for a similar bug (LUCENE-1374), but it's not exactly the same. Also, my fields are not compressed.

Has anyone seen this before? Below is the stack trace.

Exception in thread "Lucene Merge Thread #142" org.apache.lucene.index.MergePolicy$MergeException: java.lang.NullPointerException
    at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:325)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:302)
Caused by: java.lang.NullPointerException
    at org.apache.lucene.index.FieldsWriter.writeField(FieldsWriter.java:179)
    at org.apache.lucene.index.FieldsWriter.addDocument(FieldsWriter.java:268)
    at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:361)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:140)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4485)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4143)
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:218)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:274)

Thanks,
Sameer.
Talking to solr
Salaam,

I am running Solr on port 8080 of my box. I need to write a check that actually telnets to the Solr box and exchanges some commands, so I can be sure that Solr is up and running.

Is there something I can do to achieve this? When we telnet to a mail server we can exchange HELO commands; is there something similar for Solr?

Regards,
Muhammed Sameer
UTF8 compatibility
Salaam,

I have a question in two related parts.

We run post.jar periodically, i.e. every 15 minutes, to commit the changes. Is this approach correct?

When I run it I get the following message:

{code}
SimplePostTool: version 1.2
SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8, other encodings are not currently supported
SimplePostTool: COMMITting Solr index changes..
{code}

So I tried running the test_utf8.sh script and got the following output:

{code}
Solr server is up.
HTTP GET is accepting UTF-8
HTTP POST is accepting UTF-8
HTTP POST defaults to UTF-8
ERROR: HTTP GET is not accepting UTF-8 beyond the basic multilingual plane
ERROR: HTTP POST is not accepting UTF-8 beyond the basic multilingual plane
ERROR: HTTP POST + URL params is not accepting UTF-8 beyond the basic multilingual plane
{code}

Are these errors normal, or do I need to change something?

Thanks for your time.

Regards,
Muhammed Sameer
Index size concerns
Salaam,

We are using apache-solr to index our files for faster searches. Everything works without a problem; my only concern is the size of the cache.

The trend seems to be that if I cache 1 GB of files the index grows to 800 MB, i.e. we are seeing an 80% cache size.

Is this normal, or am I missing something in the Solr configuration?

Thanks and regards,
Muhammed Sameer
Re: Index size concerns
Salaam,

Sorry for the confusion; here is the big picture.

We use Solr to index all the mail that comes to us, so that we can offer faster lookups.

We have seen that after our mail server accepts, say, 1 GB of mail, the index size goes up to 800 MB.

I hope I have conveyed the problem clearly this time. What I want to know is: is this index size normal?

Regards,
Muhammed Sameer

--- On Mon, 5/25/09, Shalin Shekhar Mangar wrote:
> From: Shalin Shekhar Mangar
> Subject: Re: Index size concerns
> To: solr-user@lucene.apache.org
> Date: Monday, May 25, 2009, 11:19 AM
>
> On Mon, May 25, 2009 at 3:53 PM, Muhammed Sameer wrote:
> > We are using apache-solr to index our files for faster searches, all things
> > happen without a problem, my only concern is the size of the cache.
> >
> > It seems that the trend is that the if I cache 1 GB of files the index goes
> > to 800MB ie we are seeing a 80% cache size.
> >
> > Is this normal or am I missing something in the configuration of solr
>
> I'm sorry I do not understand your question. Which files are you talking
> about? The Solr cache has got nothing to do with files. It caches the
> query/filter results and solr documents.
>
> --
> Regards,
> Shalin Shekhar Mangar.
Re: Index size concerns
Thank you Otis, I will for sure check on this wa salaam, Muhammed Sameer --- On Tue, 5/26/09, Otis Gospodnetic wrote: > From: Otis Gospodnetic > Subject: Re: Index size concerns > To: solr-user@lucene.apache.org > Date: Tuesday, May 26, 2009, 1:01 PM > > Muhammed, > > It sounds like you are talking about the ratio of original > data size vs. index size. The exact ratio depends on > things such as: > - whether you store fields or just index them > - whether you compress fields if you store them > - whether you have term vectors enabled or not > - analyzers and what they do - they could stem tokens, > remove them, etc., but they could also insert synonyms, and > so on > - nature of the input text - term distribution/variance > > Otis > -- > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > > - Original Message > > From: Muhammed Sameer > > To: solr-user@lucene.apache.org > > Sent: Monday, May 25, 2009 1:22:15 PM > > Subject: Re: Index size concerns > > > > > > Salaam, > > > > Sorry for this here is the big picture > > > > Actually we use solr to index all the mails that come > to us so that we can allow > > for faster look ups. > > > > We have seen that after our mail server accepts say a > GB of mails the index size > > goes upto 800MB > > > > I hope that this time I am clear in conveying the > problem > > > > What I wanted to know is that is this index size > normal ? > > > > Regards, > > Muhammed Sameer > > > > --- On Mon, 5/25/09, Shalin Shekhar Mangar wrote: > > > > > From: Shalin Shekhar Mangar > > > Subject: Re: Index size concerns > > > To: solr-user@lucene.apache.org > > > Date: Monday, May 25, 2009, 11:19 AM > > > On Mon, May 25, 2009 at 3:53 PM, > > > Muhammed Sameer wrote: > > > > > > > > > > > We are using apache-solr to index our files > for faster > > > searches, all things > > > > happen without a problem, my only concern is > the size > > > of the cache. 
> > > > > > > > It seems that the trend is that the if I > cache 1 GB of > > > files the index goes > > > > to 800MB ie we are seeing a 80% cache size. > > > > > > > > Is this normal or am I missing something in > the > > > configuration of solr > > > > > > > > > > I'm sorry I do not understand your question. > Which files > > > are you talking > > > about? The Solr cache has got nothing to do with > files. It > > > caches the > > > query/filter results and solr documents. > > > > > > -- > > > Regards, > > > Shalin Shekhar Mangar. > > > > >
Re: Local development and SolrCloud
Why not just use SolrCloud everywhere? The advantage is that you and your team members use the same APIs, parameters, experience, etc. as you move from one environment to another. That is less confusing than having to explain to someone why you do one thing in one environment and something else in another. There does not seem to be much overhead in using SolrCloud in your lower environments or locally. You don't have to run a cluster within your local environment; you can have a single node "acting" as SolrCloud.

Maybe I am missing something, but what advantage or benefit do you get from *not* using SolrCloud locally?

--
Sameer Maggon
https://www.searchstax.com

On Wed, Aug 22, 2018 at 5:23 PM, John Blythe wrote:
> For those of you who are developing applications with solr and are using solrcloud in production: what are you doing locally? Cloud seems unnecessary locally besides testing strictly for cloud specific use cases or configurations. Am I totally off basis there? We are considering keeping a “standard” (read: non-cloud) local solr environment locally for our development workflow and using cloud only for our remote environments. Curious to know how wise or stupid that play would be.
>
> Thanks for any info!
> --
> John Blythe
Re: Frequently Used Search Terms.
I don't think you can get this information from Solr, as it does not store it. The stats component provides statistics, but they are mostly numeric in nature. You could parse the server logs to build a list of Frequently Searched Terms (e.g. pump those logs into SiLK or Kibana for visualization).

If you have the ability to change the front end and add some JavaScript code to the UI, or can intercept the search request and make async or batch calls to APIs for tracking, you can use SearchStax Analytics [1], which provides search analytics that track searches, clicks, cart actions, revenue, etc.

There is also Sematext's product that offers Search Analytics [2]; however, I am not able to find it on their website anymore. @Otis?

Sameer
--
https://www.searchstax.com

[1] SearchStax Analytics - Documentation https://www.searchstax.com/docs/search-analytics-start/
[2] Sematext Search Analytics - https://sematext.com/blog/whats-new-in-sematext-search-analytics/

On Thu, Jan 18, 2018 at 10:16 AM, Fiz Newyorker wrote:
> Hi Team,
>
> I am using Solr 6.5. I want to retrieve information on frequently searched terms and user clicks. Is there a way to store this information and these stats? Where does Lucene/Solr store this information?
>
> Is there a way to retrieve this information?
>
> I want to use this information as an input to search relevancy.
>
> Please share your thoughts.
>
> Thanks
> Fiz..
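The log-parsing route mentioned in the reply above can be prototyped in a few lines before bringing in SiLK or Kibana. This is a minimal sketch under stated assumptions: the sample lines only imitate a Solr request log (`path=/select params={...}`), and the exact layout depends on your Solr version and logging configuration, so the regular expression will likely need adjusting. The function name `count_queries` and the sample data are hypothetical.

```python
import re
from collections import Counter
from urllib.parse import parse_qs

# Assumed log layout: "... path=/select params={q=...&rows=...} hits=... status=..."
# Queries that themselves contain "}" would be cut short by this pattern.
PARAMS_RE = re.compile(r"path=/select\s+params=\{([^}]*)\}")


def count_queries(log_lines):
    """Count how often each user query (the q parameter) appears in the log lines."""
    counts = Counter()
    for line in log_lines:
        match = PARAMS_RE.search(line)
        if not match:
            continue  # not a select request (e.g. an update or commit)
        for q in parse_qs(match.group(1)).get("q", []):
            normalized = q.strip().lower()
            # Skip match-all queries; they are not user search terms.
            if normalized and normalized != "*:*":
                counts[normalized] += 1
    return counts


# Made-up sample lines, for illustration only.
sample_log = [
    "INFO [collection1] webapp=/solr path=/select params={q=white+shirt&rows=10} hits=12 status=0 QTime=3",
    "INFO [collection1] webapp=/solr path=/select params={q=White+Shirt&wt=json} hits=12 status=0 QTime=2",
    "INFO [collection1] webapp=/solr path=/select params={q=blue+jeans} hits=4 status=0 QTime=5",
    "INFO [collection1] webapp=/solr path=/update params={commit=true} status=0 QTime=40",
    "INFO [collection1] webapp=/solr path=/select params={q=*:*&rows=0} hits=100 status=0 QTime=1",
]
top_terms = count_queries(sample_log).most_common(2)
print(top_terms)
```

Note that this only covers searched terms. User clicks never reach the Solr log, which is why the reply points to front-end tracking for that part.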
Re: Bitnami, or other Solr on AWS recommendations?
Although this is shameless promotion, have you taken a look at SearchStax (https://www.searchstax.com)? Why not use Solr-as-a-Service?

On Fri, Jan 26, 2018 at 11:24 AM, TK Solr wrote:
> If I want to deploy Solr on AWS, do people recommend using the prepackaged Bitnami Solr image? Or is it better to install Solr manually on a computer instance? Or is there a better way?
>
> TK

--
Sameer Maggon
Founder, SearchStax, Inc.
https://www.searchstax.com
Re: Using replicas in SOLR-6.5.1
1. You could just have 2 VMs: one holds all 20 shards of your collection, and the other holds the replicas for those shards. In this scenario, if one VM is not available, your application is still available, because at least one replica of each shard is up. This assumes that one VM can hold all the data (all 20 shards) without compromising performance or running into memory or garbage collection issues (I am not sure how large your collection or shards are). For additional redundancy, you can add another VM and another replica for all your shards.

2. Can you provide more specifics about what sort of issues you are thinking of? Replication in general is pretty solid in the version you are talking about. You could comb through JIRA ( https://issues.apache.org/jira/browse/SOLR-5821?jql=project%20%3D%20SOLR%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20text%20~%20%22replica%22 ).

3. I would recommend you take a look at the Solr Collections API ( https://lucene.apache.org/solr/guide/6_6/collections-api.html ). The parameters to pay most attention to are "replicationFactor", "numShards" and "maxShardsPerNode", which relate to shards and replicas. If you have a use case that requires going beyond the above scenario of having all shards on the same VM, then you should read more about "maxShardsPerNode", etc., but perhaps you can share a bit more about that use case.

Thanks,
--
Sameer Maggon
https://www.searchstax.com | Solr-as-a-Service platform on AWS, Azure and GCP

On Sat, Jan 27, 2018 at 2:08 AM, SOLR4189 wrote:
> I use SOLR-6.5.1. I would like to use SolrCloud replicas. And I have some questions:
>
> 1) What is the best architecture for this if my collection contains 20 shards, and each shard is in a different VM? 40 VMs, with 20 for leaders and 20 for replicas? Or maybe stay with 20 VMs, where a leader and a replica (of another leader) are in the same VM, but add RAM?
>
> 2) What are the open issues about replicas in SOLR-6.5.1 that I need to check?
>
> 3) If I use SolrCloud replicas, which configuration parameters should I change? Which can I change?
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
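To make the three Collections API parameters from point 3 of the reply concrete, here is a small Python sketch that builds a CREATE request URL and works out how many cores each node has to hold. The helper names, the base URL `http://localhost:8983/solr` and the collection name `products` are assumptions for illustration; `action=CREATE`, `numShards`, `replicationFactor` and `maxShardsPerNode` are the parameter names from the Solr 6.x Collections API. The URL is only constructed, not called.

```python
from urllib.parse import urlencode


def create_collection_url(solr_base, name, num_shards, replication_factor, max_shards_per_node):
    """Build a Collections API CREATE URL (hypothetical helper; the request is not sent)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "maxShardsPerNode": max_shards_per_node,
    }
    return solr_base.rstrip("/") + "/admin/collections?" + urlencode(params)


def cores_per_node(num_shards, replication_factor, node_count):
    """Cores each node must hold if replicas are spread evenly (rounded up)."""
    total_cores = num_shards * replication_factor
    return -(-total_cores // node_count)  # ceiling division


# Scenario from point 1: 20 shards, 2 copies of each, spread over 2 VMs.
needed = cores_per_node(num_shards=20, replication_factor=2, node_count=2)
url = create_collection_url("http://localhost:8983/solr", "products", 20, 2, needed)
print(needed)
print(url)
```

In the two-VM layout each node carries 20 cores, so `maxShardsPerNode` has to be at least 20 for the CREATE call to place everything. With 40 VMs the same arithmetic gives one core per node.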
Re: Search Analytics Help
Ennio, Have you taken a look at SearchStax Analytics? https://www.searchstax.com/docs/search-analytics-start/ Thanks, On Wed, May 23, 2018 at 11:34 AM, ennio wrote: > Thanks all for the comments. I'm looking at the ELK option here. > > > > -- > Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html > -- Sameer Maggon https://www.searchstax.com
Re: saas based search With Solr
If you are looking for Solr-as-a-Service options, there are a few of them, including:

- SearchStax
- OpenSolr
- WebSolr

Sameer.

On Mon, Jun 11, 2018 at 10:19 PM Sreenivas.T wrote:
> All,
>
> Is anyone aware of a commercially available SaaS-based Solr search tool?
>
> Regards,
> Sreenivas

--
Sameer Maggon
https://www.searchstax.com
Re: Hackdays in October, London & Montreal
Charlie, I might be able to get sponsorship from SearchStax for eve/drinks in Montreal. Do you want to start a thread offline? Sameer. On Thu, Jul 12, 2018 at 4:28 AM Charlie Hull wrote: > Hi all, > > A couple of years ago I ran two free Lucene Hackdays in London and > Boston (the latter just before Lucene Revolution). Here's what we got up > to with the kind support of Alfresco, Bloomberg, BA Insight and > Lucidworks > http://www.flax.co.uk/blog/2016/10/21/tale-two-cities-two-lucene-hackdays/ > > I'd like to do this again during the weeks of 8th and 15th October in > London and Montreal (so just before the Activate event). It's a great > chance to get together IRL with other Lucene/Solr/Elasticsearch hackers! > I have a venue for London but a sponsor for evening curry/drinks would > be wonderful, and for Montreal I still need a venue and evening sponsor > - do let me know if you or your employer can help. > > I'll post again once there are more details and with a call for ideas as > to what we should work on. > > Best > > Charlie > -- > Charlie Hull > Flax - Open Source Enterprise Search > > tel/fax: +44 (0)8700 118334 > mobile: +44 (0)7767 825828 > web: www.flax.co.uk > -- Sameer Maggon https://www.searchstax.com
Re: Getting rid of Master/Slave nomenclature in Solr
+1 for simplifying and using the Leader/Follower terminology. Our company operates SolrCloud, standalone Solr, and Master/Slave configurations. Outside of the Solr developer community, it's painful and confusing to talk about both Master/Slave and Leader/Replica. It would be easier if we had the following:

- There are leaders and there are followers.
- Solr clusters can be configured in two modes/implementations (SolrCloud or Master/Slave). This one is hard, because you don't want to introduce yet another name here now that people are already familiar with these.
- These modes happen to have different designs, and depending upon the mode, you can go into the design differences between them.

The internal differences between manual configuration and SolrCloud being smart about managing and assigning roles are just the evolution of the design, and details of a particular mode/implementation; they shouldn't matter to the end user. Today, when someone not involved in Solr development looks at the terminology, it looks as if new terminology was introduced without thinking about existing customers, or without thinking through the system as a whole and how best to evolve it (I am not saying that's what happened, just that it is the perception). New terminology should be introduced carefully, and +1 on reducing the cognitive load on an average guy like me.

Cheers!
--
*Sameer Maggon*
*SearchStax* | www.searchstax.com

On Wed, Jun 17, 2020 at 2:22 PM gnandre wrote:
> +1 for Leader-Follower. How about Publisher-Subscriber?
>
> On Wed, Jun 17, 2020 at 5:19 PM Rahul Goswami wrote:
> > +1 on avoiding SolrCloud terminology. In the interest of keeping it obvious
> > and simple, may I please suggest primary/secondary?
> >
> > On Wed, Jun 17, 2020 at 5:14 PM Atita Arora wrote:
> > > I agree with avoiding the SolrCloud terminology too.
> > >
> > > I may suggest going for "prime" and "clone"
> > > (Short and precise as Master and Slave).
> > > > > > Best, > > > Atita > > > > > > > > > > > > > > > > > > On Wed, 17 Jun 2020, 22:50 Walter Underwood, > > > wrote: > > > > > > > I strongly disagree with using the Solr Cloud leader/follower > > terminology > > > > for non-Cloud clusters. People in my company are confused enough > > without > > > > using polysemous terminology. > > > > > > > > “This node is the leader, but it means something different than the > > > leader > > > > in this other cluster.” I’m dreading that conversation. > > > > > > > > I like “principal”. How about “clone” for the slave role? That > suggests > > > > that > > > > it does not accept updates and that it is loosely-coupled, only > > depending > > > > on the state of the no-longer-called-master. > > > > > > > > Chegg has five production Solr Cloud clusters and one production > > > > master/slave > > > > cluster, so this is not a hypothetical for us. We have 100+ Solr > hosts > > in > > > > production. > > > > > > > > wunder > > > > Walter Underwood > > > > wun...@wunderwood.org > > > > http://observer.wunderwood.org/ (my blog) > > > > > > > > > On Jun 17, 2020, at 1:36 PM, Trey Grainger > > wrote: > > > > > > > > > > Proposal: > > > > > "A Solr COLLECTION is composed of one or more SHARDS, which each > have > > > one > > > > > or more REPLICAS. Each replica can have a ROLE of either: > > > > > 1) A LEADER, which can process external updates for the shard > > > > > 2) A FOLLOWER, which receives updates from another replica" > > > > > > > > > > (Note: I prefer "role" but if others think it's too overloaded due > to > > > the > > > > > overseer role, we could replace it with "mode" or something > similar) > > > > > --- > > > > > > > > > > To be explicit with the above definitions: > > > > > 1) In SolrCloud, the roles of leaders and followers can dynamically > > > > change > > > > > based upon the status of the cluster. In standalone mode, they can > be > > > > > changed by manual intervention. 
> > > > > 2) A leader does not have to have any followers (i.e. only one > active > > > > > replica) > > > > > 3) Each shard always has one le