disappearing index
I built up two indexes using a multicore configuration, one containing 52,000+ documents and the other over 10 million; the entire indexing process showed no errors. The server crashed overnight, well after the indexing had completed, and now no documents are reported for either index. This despite the fact that both cores have huge /data folders (one is 1.5 GB, the other 8.5 GB). Any ideas?
small facets not working
I have a Solr index which contains research data from the Human Genome Project. Each document contains about 60 facet fields, including one general composite field that contains all the facet data. The general field is anywhere from 100 KB to 7 MB. One facet is called Gene.Symbol and, you guessed it, it contains only the gene symbol. There is only one symbol per gene (for the smarty pantses out there, the aliases are contained in another facet). When I do a search for anything in the big general field, I find what I'm looking for. But if I do a search in the Gene.Symbol facet, it does not find anything. I realize it's probably finding the string repeated elsewhere in the document, but how do I get it to find it in the Gene.Symbol facet? So a search for http://localhost:8983/solr/core0/select?indent=on&version=2.2&q=Gene.Symbol:abc returns nothing, but a search for http://localhost:8983/solr/core0/select?indent=on&version=2.2&q=abc returns ABCC2 ABCC8 ABCD1 ABCG1 ABCA1 ... CABC1 ... ABCD3 ABCC5 ABCC9 ABCG2 ABCB11 ABCC3 ABCF1 ABCC1 ABCF2 ABCB9 Schema.xml: ... ... ... BFDText
out of memory every time
I'm indexing a large number of documents. As a server I'm using /solr/example/start.jar. No matter how much memory I allocate, it fails around 7,200 documents. I am committing every 100 docs and optimizing every 300. All of my XML files contain one doc each and range in size from 2 KB to 700 KB. When I restart start.jar it again reports out of memory. A sample document looks like this: 1851 TRAJ20 12049 ENSG0211869 28735 HUgn28735 TRA_ TRAJ20 9953837 ENSG0211869 T cell receptor alpha joining 20 14q11.2 14q11 14q11.2 AE000662.1 M94081.1 CH471078.2 NC_14.7 NT_026437.11 NG_001332.2 8188290 The human T-cell receptor TCRAC/TCRDC (C alpha/C delta) region: organization, sequence, and evolution of 97.6 kb of DNA. Koop B.F. Rowen L. Hood L. Wang K. Kuo C.L. Seto D. Lenstra J.A. Howard S. Shan W. Deshpande P. 31311_at The schema is (in summary): PK text and my conf is: false 100 900 2147483647 1
Quoted searches
When I issue a search in quotes, like "tay sachs", Lucene returns results as if it were written: tay OR sachs. Any reason why? Any way to stop it?
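For reference, the request I'm issuing looks roughly like this (the core name is a placeholder); adding debugQuery=on shows the parsed query, which is where the phrase appears to come apart into separate terms:

  http://localhost:8983/solr/core0/select?q=%22tay+sachs%22&debugQuery=on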
solrconfig.xml location for cloud setup not working (works on single node)
I have configured a single node with a working data import handler. I need to move it to a clustered setup. Seems like I cannot, for the life of me, get the data import handler to work with the cloud setup.

ls /opt/solr/solr-6.0.0/example/cloud/node1/solr/activityDigest_shard1_replica1/conf/
currency.xml db-data-config.xml lang protwords.txt _rest_managed.json schema.xml solrconfig.xml stopwords.txt synonyms.txt

Inside solrconfig.xml I have created a request handler that points to my config file, db-data-config.xml. In db-data-config.xml I have a working config (it works in a single-node setup, that is).
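The handler declaration in solrconfig.xml is roughly the following (a standard DataImportHandler registration; the handler path /dataimport is just the conventional name):

  <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">db-data-config.xml</str>
    </lst>
  </requestHandler>

With SolrCloud, configuration is read from ZooKeeper rather than the local conf/ directory, so a db-data-config.xml referenced this way generally needs to be part of the uploaded configset as well.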
Getting a list of matching terms and offsets
Is anyone aware of a way of getting a list of each matching token and their offsets after executing a search? The reason I want to do this is because I have the physical coordinates of each token in the original document stored out of band, and I want to be able to highlight in the original document. I would really like to have Solr return the list of matching tokens because then things like stemming and phrase matching will work as expected. I'm thinking of something like the highlighter component, except instead of returning HTML, it would return just the matching tokens and their offsets. I have googled high and low and can't seem to find an exact answer to this question, so I have spent the last few days examining the internals of the various highlighting classes in Solr and Lucene. I think the bulk of the action is in WeightedSpanTermExtractor and its interaction with getBestTextFragments in the Highlighter class. But before I spend any more time on this I thought I'd ask (1) whether anyone knows of an easier way of doing this, and (2) whether I'm at least barking up the right tree. Thanks much, Justin
Re: Getting a list of matching terms and offsets
Thanks for the responses Alex and Ahmet. The TermVector component was the first thing I looked at, but what it gives you is offset information for every token in the document. I'm trying to get a list of tokens that actually match the search query, and unless I'm missing something, the TermVector component doesn't give you that information. The TermSpans class does contain the right information, but again the hard part is: how do I reliably get a list of TokenSpans for the tokens that actually match the search query? That's why I ended up in the highlighter source code, because the highlighter has to do just this in order to create snippets with accurate highlighting. Justin On Sun, Jun 5, 2016 at 9:09 AM Ahmet Arslan wrote: > Hi, > > May be org.apache.lucene.search.spans.TermSpans ? > > > > On Sunday, June 5, 2016 7:59 AM, Alexandre Rafalovitch > wrote: > It sounds like TermVector component's output: > https://cwiki.apache.org/confluence/display/solr/The+Term+Vector+Component > > Perhaps with additional flags enabled (e.g. tv.offsets and/or > tv.positions). > > Regards, >Alex. > > Newsletter and resources for Solr beginners and intermediates: > http://www.solr-start.com/ > > > > On 5 June 2016 at 07:39, Justin Lee wrote: > > Is anyone aware of a way of getting a list of each matching token and > their > > offsets after executing a search? The reason I want to do this is > because > > I have the physical coordinates of each token in the original document > > stored out of band, and I want to be able to highlight in the original > > document. I would really like to have Solr return the list of matching > > tokens because then things like stemming and phrase matching will work as > > expected. I'm thinking of something like the highlighter component, > except > > instead of returning html, it would return just the matching tokens and > > their offsets. > > > > I have googled high and low and can't seem to find an exact answer to > this > > question, so I have spent the last few days examining the internals of > the > > various highlighting classes in Solr and Lucene. I think the bulk of the > > action is in WeightedSpanTermExtractor and its interaction with > > getBestTextFragments in the Highlighter class. But before I spend > anymore > > time on this I thought I'd ask (1) whether anyone knows of an easier way > of > > doing this, and (2) whether I'm at least barking up the right tree. > > > > Thanks much, > > Justin >
Re: Getting a list of matching terms and offsets
Thanks, yea, I looked at debug query too. Unfortunately the output of debug query doesn't quite do it. For example, if you use a wildcard query, it will simply explain the score associated with that wildcard query, not the actual matching token. In other words, if you search for "hour*" and the actual matching text is "hours", debug query doesn't tell you that. Instead, it just reports the score associated with "hour*". The closest example I've ever found is this: https://lucidworks.com/blog/2013/05/09/update-accessing-words-around-a-positional-match-in-lucene-4/ But this kind of approach won't let me use the full power of the Solr ecosystem. I'd basically be back to dealing with Lucene directly, which I think is a step backwards. I think the right approach is to write my own SearchComponent, using the highlighter as a starting point (a rough skeleton of what I have in mind is at the end of this message). But I wanted to make sure there wasn't a simpler way. On Sun, Jun 5, 2016 at 11:30 AM Ahmet Arslan wrote: > Well debug query has the list of token that caused match. > If i am not mistaken i read an example about span query and spans thing. > It was listing the positions of the matches. > Cannot find the example at the moment.. > > Ahmet > > > > On Sunday, June 5, 2016 9:10 PM, Justin Lee > wrote: > Thanks for the responses Alex and Ahmet. > > The TermVector component was the first thing I looked at, but what it gives > you is offset information for every token in the document. I'm trying to > get a list of tokens that actually match the search query, and unless I'm > missing something, the TermVector component doesn't give you that > information. > > The TermSpans class does contain the right information, but again the hard > part is: how do I reliably get a list of TokenSpans for the tokens that > actually match the search query? That's why I ended up in the highlighter > source code, because the highlighter has to do just this in order to create > snippets with accurate highlighting. > > Justin > > > On Sun, Jun 5, 2016 at 9:09 AM Ahmet Arslan > wrote: > > > Hi, > > > > May be org.apache.lucene.search.spans.TermSpans ? > > > > > > > > On Sunday, June 5, 2016 7:59 AM, Alexandre Rafalovitch < > arafa...@gmail.com> > > wrote: > > It sounds like TermVector component's output: > > > https://cwiki.apache.org/confluence/display/solr/The+Term+Vector+Component > > > > Perhaps with additional flags enabled (e.g. tv.offsets and/or > > tv.positions). > > > > Regards, > >Alex. > > > > Newsletter and resources for Solr beginners and intermediates: > > http://www.solr-start.com/ > > > > > > > > On 5 June 2016 at 07:39, Justin Lee wrote: > > > Is anyone aware of a way of getting a list of each matching token and > > their > > > offsets after executing a search? The reason I want to do this is > > because > > > I have the physical coordinates of each token in the original document > > > stored out of band, and I want to be able to highlight in the original > > > document. I would really like to have Solr return the list of matching > > > tokens because then things like stemming and phrase matching will work as > > > expected. I'm thinking of something like the highlighter component, > > except > > > instead of returning html, it would return just the matching tokens and > > > their offsets. > > > > > > I have googled high and low and can't seem to find an exact answer to > > this > > > question, so I have spent the last few days examining the internals of > > the > > > various highlighting classes in Solr and Lucene. 
I think the bulk of > the > > > action is in WeightedSpanTermExtractor and its interaction with > > > getBestTextFragments in the Highlighter class. But before I spend > > anymore > > > time on this I thought I'd ask (1) whether anyone knows of an easier > way > > of > > > doing this, and (2) whether I'm at least barking up the right tree. > > > > > > Thanks much, > > > Justin > > >
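For what it's worth, the custom SearchComponent I mentioned is only a skeleton in my head right now; something roughly like the following, where the class name and response key are made up and the actual offset extraction is the part that would borrow from WeightedSpanTermExtractor:

  import java.io.IOException;
  import org.apache.solr.common.util.NamedList;
  import org.apache.solr.handler.component.ResponseBuilder;
  import org.apache.solr.handler.component.SearchComponent;

  public class MatchOffsetsComponent extends SearchComponent {

    @Override
    public void prepare(ResponseBuilder rb) throws IOException {
      // nothing to do before the main query runs
    }

    @Override
    public void process(ResponseBuilder rb) throws IOException {
      // Walk the query terms the way the highlighter does, collect
      // (term, startOffset, endOffset) tuples per matching document,
      // and attach them to the response instead of building HTML snippets.
      NamedList<Object> offsets = new NamedList<>();
      // ... extraction logic would go here ...
      rb.rsp.add("matchOffsets", offsets);
    }

    @Override
    public String getDescription() {
      return "Returns matching tokens and their offsets";
    }
  }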
Re: Getting a list of matching terms and offsets
Thank you very much! That JIRA entry led me to https://issues.apache.org/jira/browse/SOLR-4722, which still works against Solr 6 with a couple of modifications and should serve as the basis for what I want to do. You saved me a bunch of work, so thanks very much. (Also, it is always nice to know that people with more experience than me took the same approach.) On Sun, Jun 5, 2016 at 1:09 PM Ahmet Arslan wrote: > Hi Lee, > > May be you can find useful starting point on > https://issues.apache.org/jira/browse/SOLR-1397 > > Please consider to contribute when you gather something working. > > Ahmet > > > > > On Sunday, June 5, 2016 10:37 PM, Justin Lee > wrote: > Thanks, yea, I looked at debug query too. Unfortunately the output of > debug query doesn't quite do it. For example, if you use a wildcard query, > it will simply explain the score associated with that wildcard query, not > the actual matching token. In order words, if you search for "hour*" and > the actual matching text is "hours", debug query doesn't tell you that. > Instead, it just reports the score associated with "hour*". > > The closest example I've ever found is this: > > > https://lucidworks.com/blog/2013/05/09/update-accessing-words-around-a-positional-match-in-lucene-4/ > > But this kind of approach won't let me use the full power of the Solr > ecosystem. I'd basically be back to dealing with Lucene directly, which I > think is a step backwards. I think the right approach is to write my own > SearchComponent, using the highlighter as a starting point. But I wanted > to make sure there wasn't a simpler way. > > > On Sun, Jun 5, 2016 at 11:30 AM Ahmet Arslan > wrote: > > > Well debug query has the list of token that caused match. > > If i am not mistaken i read an example about span query and spans thing. > > It was listing the positions of the matches. > > Cannot find the example at the moment.. > > > > Ahmet > > > > > > > > On Sunday, June 5, 2016 9:10 PM, Justin Lee > > wrote: > > Thanks for the responses Alex and Ahmet. > > > > The TermVector component was the first thing I looked at, but what it > gives > > you is offset information for every token in the document. I'm trying to > > get a list of tokens that actually match the search query, and unless I'm > > missing something, the TermVector component doesn't give you that > > information. > > > > The TermSpans class does contain the right information, but again the > hard > > part is: how do I reliably get a list of TokenSpans for the tokens that > > actually match the search query? That's why I ended up in the > highlighter > > source code, because the highlighter has to do just this in order to > create > > snippets with accurate highlighting. > > > > Justin > > > > > > On Sun, Jun 5, 2016 at 9:09 AM Ahmet Arslan > > wrote: > > > > > Hi, > > > > > > May be org.apache.lucene.search.spans.TermSpans ? > > > > > > > > > > > > On Sunday, June 5, 2016 7:59 AM, Alexandre Rafalovitch < > > arafa...@gmail.com> > > > wrote: > > > It sounds like TermVector component's output: > > > > > > https://cwiki.apache.org/confluence/display/solr/The+Term+Vector+Component > > > > > > Perhaps with additional flags enabled (e.g. tv.offsets and/or > > > tv.positions). > > > > > > Regards, > > >Alex. 
> > > > > > Newsletter and resources for Solr beginners and intermediates: > > > http://www.solr-start.com/ > > > > > > > > > > > > On 5 June 2016 at 07:39, Justin Lee wrote: > > > > Is anyone aware of a way of getting a list of each matching token and > > > their > > > > offsets after executing a search? The reason I want to do this is > > > because > > > > I have the physical coordinates of each token in the original > document > > > > stored out of band, and I want to be able to highlight in the > original > > > > document. I would really like to have Solr return the list of > matching > > > > tokens because then things like stemming and phrase matching will > work > > as > > > > expected. I'm thinking of something like the highlighter component, > > > except > > > > instead of returning html, it would return just the matching tokens > and > > > > their offsets. > > > > > > > > I have googled high and low and can't seem to find an exact answer to > > > this > > > > question, so I have spent the last few days examining the internals > of > > > the > > > > various highlighting classes in Solr and Lucene. I think the bulk of > > the > > > > action is in WeightedSpanTermExtractor and its interaction with > > > > getBestTextFragments in the Highlighter class. But before I spend > > > anymore > > > > time on this I thought I'd ask (1) whether anyone knows of an easier > > way > > > of > > > > doing this, and (2) whether I'm at least barking up the right tree. > > > > > > > > Thanks much, > > > > Justin > > > > > >
Bypassing ExtractingRequestHandler
Has anybody had any experience bypassing ExtractingRequestHandler and simply managing Tika manually? I want to make a small modification to Tika to get and save additional data from my PDFs, but I have been procrastinating in no small part due to the unpleasant prospect of setting up a development environment where I could compile and debug modifications that might run through PDFBox, Tika, and ExtractingRequestHandler. It occurs to me that it would be much easier if the two were separate, so I could have direct control over Tika and just submit the text to Solr after extraction. Am I going to regret this approach? I'm not sure what ExtractingRequestHandler really does for me that Tika doesn't already do. Also, I was reading this Stack Overflow entry <http://stackoverflow.com/questions/33292776/solr-tika-processor-not-crawling-my-pdf-files-prefectly> and someone offhandedly mentioned that ExtractingRequestHandler might be separated in the future anyway. Is there a public roadmap for the project, or does one have to keep up with the dev mailing list and hunt through JIRA entries to keep up with the pulse of the project? Thanks, Justin
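For context, what I have in mind is roughly the following (a sketch only; the core name, field names, and file handling are placeholders):

  import java.io.File;
  import org.apache.solr.client.solrj.SolrClient;
  import org.apache.solr.client.solrj.impl.HttpSolrClient;
  import org.apache.solr.common.SolrInputDocument;
  import org.apache.tika.Tika;

  public class ManualTikaIndexer {
    public static void main(String[] args) throws Exception {
      // Run Tika in my own process, outside Solr, so I control the version
      // and can patch it; a bad PDF also can't take the Solr JVM down with it.
      Tika tika = new Tika();
      String text = tika.parseToString(new File(args[0]));

      // Send only the extracted text to Solr over SolrJ.
      try (SolrClient solr = new HttpSolrClient("http://localhost:8983/solr/core0")) {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", args[0]);
        doc.addField("text", text);
        solr.add(doc);
        solr.commit();
      }
    }
  }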
Re: Bypassing ExtractingRequestHandler
Thanks everyone for the help and advice. The SolrJ example makes sense to me. The import of SOLR-8166 was kind of mind-boggling to me, but maybe I'll revisit after some time. Tim: for context, I'm ultimately trying to create an external highlighter. See https://issues.apache.org/jira/browse/SOLR-1397. I want to store the bounding box (in PDF units) for each token in the extracted text stream. Then when I get results from Solr using the above patch, I'll convert the UTF-16 offsets into X/Y coordinates and perform highlighting as appropriate in the UI. I like this approach because I get highlighting that accurately reflects the search, even when the search is complex (e.g. wildcards or proximity searches). I think it would take quite a bit of thinking to get something general enough to add into Tika. For example, what units? Take a look at the discussion of what units to report offsets in here: https://issues.apache.org/jira/browse/SOLR-1954 (see the comments by Robert Muir -- although whatever issues exist there, they would seem to apply equally to the offsets reported by the Term Vector Component). As another example, I'm just not sure what format is general enough to make sense for everybody. I think I'll just create a mapping from UTF-16 offsets into (x1,y1) (x2,y2) pairs, dump it into a JSON blob (sketched at the end of this message), and store that in a NoSQL store. Then, when I get Solr results, I'll look at the matching offsets, the JSON blob, and the original document and be on my merry way. I'm happy to open a JIRA entry in Tika if you think this is a coherent request. The other approach, I suppose, is to try to pass the information along during indexing and store as a token payload. But it seems like the indexing interface is really text-oriented. I have also thought about using DelimitedPayloadTokenFilter, which will increase the index size I imagine (how much, though?) and require more customization of Solr internals. I don't know which is the better approach. On Mon, Jun 13, 2016 at 7:22 AM Allison, Timothy B. wrote: > > > > >Two things: Here's a sample bit of SolrJ code, pulling out the DB stuff > should be straightforward: > http://searchhub.org/2012/02/14/indexing-with-solrj/ > > +1 > > > We tend to prefer running Tika externally as it's entirely possible > > that Tika will crash or hang with certain files - and that will bring > > down Solr if you're running Tika within it. > > +1 > > >> I want to make a small modification > >> to Tika to get and save additional data from my PDFs > What info do you need, and if it is common enough, could you ask over on > Tika's JIRA and we'll try to add it directly? > > > >
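To make that concrete, the per-document blob I have in mind would look roughly like this (field names and values are invented; entries are keyed by UTF-16 character offsets into the extracted text, with page numbers and PDF-unit bounding boxes):

  [
    {"start": 1042, "end": 1047, "page": 3, "x1": 72.0,  "y1": 540.2, "x2": 101.5, "y2": 552.0},
    {"start": 1048, "end": 1056, "page": 3, "x1": 104.1, "y1": 540.2, "x2": 160.8, "y2": 552.0}
  ]

When Solr reports that a match covers offsets 1042-1047, the UI just looks up the corresponding box(es) and draws the highlight on the original PDF.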
Re: Send kill -9 to a node and can not delete down replicas with onlyIfDown.
Pardon me for hijacking the thread, but I'm curious about something you said, Erick. I always thought that the point (in part) of going through the pain of using zookeeper and creating replicas was so that the system could seamlessly recover from catastrophic failures. Wouldn't an OOM condition have a similar effect (or maybe Java is better at cleanup on that kind of error)? The reason I ask is that I'm trying to set up a Solr system that is highly available and I'm a little bit surprised that a kill -9 on one process on one machine could put the entire system in a bad state. Is it common to have to address problems like this with manual intervention in production systems? Ideally, I'd hope to be able to set up a system where a single node dying a horrible death would never require intervention. On Tue, Jul 19, 2016 at 8:54 AM Erick Erickson wrote: > First of all, killing with -9 is A Very Bad Idea. You can > leave write lock files laying around. You can leave > the state in an "interesting" place. You haven't given > Solr a chance to tell Zookeeper that it's going away. > (which would set the state to "down"). In short > when you do this you have to deal with the consequences > yourself, one of which is this mismatch between > cluster state and live_nodes. > > Now, that rant done the bin/solr script tries to stop Solr > gracefully but issues a kill if solr doesn't stop nicely. Personally > I think that timeout should be longer, but that's another story. > > The onlyIfDown='true' option is there specifically as a > safety valve. It was provided for those who want to guard against > typos and the like, so just don't specify it and you should be fine. > > Best, > Erick > > On Mon, Jul 18, 2016 at 11:51 PM, Jerome Yang wrote: > > Hi all, > > > > Here's the situation. > > I'm using solr5.3 in cloud mode. > > > > I have 4 nodes. > > > > After use "kill -9 pid-solr-node" to kill 2 nodes. > > These replicas in the two nodes still are "ACTIVE" in zookeeper's > > state.json. > > > > The problem is, when I try to delete these down replicas with > > parameter onlyIfDown='true'. > > It says, > > "Delete replica failed: Attempted to remove replica : > > demo.public.tbl/shard0/core_node4 with onlyIfDown='true', but state is > > 'active'." > > > > From this link: > > < http://www.solr-start.com/javadoc/solr-lucene/org/apache/solr/common/cloud/Replica.State.html#ACTIVE > > > > It says: > > *NOTE*: when the node the replica is hosted on crashes, the replica's state > > may remain ACTIVE in ZK. To determine if the replica is truly active, you > > must also verify that its node > > < http://www.solr-start.com/javadoc/solr-lucene/org/apache/solr/common/cloud/Replica.html#getNodeName-- > > > > is > > under /live_nodes in ZK (or use ClusterState.liveNodesContain(String) > > < http://www.solr-start.com/javadoc/solr-lucene/org/apache/solr/common/cloud/ClusterState.html#liveNodesContain-java.lang.String- > > > > ). > > > > So, is this a bug? > > > > Regards, > > Jerome >
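For reference, the Collections API call being discussed looks roughly like this (the host is a placeholder; the collection, shard, and replica names are taken from the quoted error message), with onlyIfDown=true being the safety flag Erick suggests simply omitting:

  http://localhost:8983/solr/admin/collections?action=DELETEREPLICA&collection=demo.public.tbl&shard=shard0&replica=core_node4&onlyIfDown=true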
Re: Send kill -9 to a node and can not delete down replicas with onlyIfDown.
Thanks for taking the time for the detailed response. I completely get what you are saying. Makes sense. On Tue, Jul 19, 2016 at 10:56 AM Erick Erickson wrote: > Justin: > > Well, "kill -9" just makes it harder. The original question > was whether a replica being "active" was a bug, and it's > not when you kill -9; the Solr node has no chance to > tell Zookeeper it's going away. ZK does modify > the live_nodes by itself, thus there are checks as > necessary when a replica's state is referenced > whether the node is also in live_nodes. And an > overwhelming amount of the time this is OK, Solr > recovers just fine. > > As far as the write locks are concerned, those are > a Lucene level issue so if you kill Solr at just the > wrong time it's possible that that'll be left over. The > write locks are held for as short a period as possible > by Lucene, but occasionally they can linger if you kill > -9. > > When a replica comes up, if there is a write lock already, it > doesn't just take over; it fails to load instead. > > A kill -9 won't bring the cluster down by itself except > if there are several coincidences. Just don't make > it a habit. For instance, consider if you kill -9 on > two Solrs that happen to contain all of the replicas > for a shard1 for collection1. And you _happen_ to > kill them both at just the wrong time and they both > leave Lucene write locks for those replicas. Now > no replica will come up for shard1 and the collection > is unusable. > > So the shorter form is that using "kill -9" is a poor practice > that exposes you to some risk. The hard-core Solr > guys work extremely had to compensate for this kind > of thing, but kill -9 is a harsh, last-resort option and > shouldn't be part of your regular process. And you should > expect some "interesting" states when you do. And > you should use the bin/solr script to stop Solr > gracefully. > > Best, > Erick > > > On Tue, Jul 19, 2016 at 9:29 AM, Justin Lee > wrote: > > Pardon me for hijacking the thread, but I'm curious about something you > > said, Erick. I always thought that the point (in part) of going through > > the pain of using zookeeper and creating replicas was so that the system > > could seamlessly recover from catastrophic failures. Wouldn't an OOM > > condition have a similar effect (or maybe java is better at cleanup on > that > > kind of error)? The reason I ask is that I'm trying to set up a solr > > system that is highly available and I'm a little bit surprised that a > kill > > -9 on one process on one machine could put the entire system in a bad > > state. Is it common to have to address problems like this with manual > > intervention in production systems? Ideally, I'd hope to be able to set > up > > a system where a single node dying a horrible death would never require > > intervention. > > > > On Tue, Jul 19, 2016 at 8:54 AM Erick Erickson > > wrote: > > > >> First of all, killing with -9 is A Very Bad Idea. You can > >> leave write lock files laying around. You can leave > >> the state in an "interesting" place. You haven't given > >> Solr a chance to tell Zookeeper that it's going away. > >> (which would set the state to "down"). In short > >> when you do this you have to deal with the consequences > >> yourself, one of which is this mismatch between > >> cluster state and live_nodes. > >> > >> Now, that rant done the bin/solr script tries to stop Solr > >> gracefully but issues a kill if solr doesn't stop nicely. Personally > >> I think that timeout should be longer, but that's another story. 
> >> > >> The onlyIfDown='true' option is there specifically as a > >> safety valve. It was provided for those who want to guard against > >> typos and the like, so just don't specify it and you should be fine. > >> > >> Best, > >> Erick > >> > >> On Mon, Jul 18, 2016 at 11:51 PM, Jerome Yang > wrote: > >> > Hi all, > >> > > >> > Here's the situation. > >> > I'm using solr5.3 in cloud mode. > >> > > >> > I have 4 nodes. > >> > > >> > After use "kill -9 pid-solr-node" to kill 2 nodes. > >> > These replicas in the two nodes still are "ACTIVE" in zookeeper's > >> > state.json. > >> > > >> > The problem is, when I try to delete these down replicas with
Contributors Group
Hello There. Can you please add the following user to the contributors group: JustinBaynton Thank you! Justin
Documents Added Not Available After Commit (Both Soft and Hard)
gt;> _2gqy(4.5):C697/215 _2gr2(4.5):C878/352 _2gr7(4.5):C28135/11775 >> _2gr9(4.5):C3276/1341 _2grb(4.5):C5/1 _2grc(4.5):C3247/1219 _2grd(4.5):C6/1 >> _2grf(4.5):C5/2 _2grg(4.5):C23659/10967 _2grh(4.5):C1 _2grj(4.5):C1 >> _2grk(4.5):C5160/1482 _2grm(4.5):C1210/351 _2grn(4.5):C3957/1372 >> _2gro(4.5):C7734/2207 _2grp(4.5):C220/36)} > > INFO - 2014-06-05 21:14:26.949; >> org.apache.solr.update.DirectUpdateHandler2; start >> commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false} > > INFO - 2014-06-05 21:14:36.727; org.apache.solr.core.SolrDeletionPolicy; >> SolrDeletionPolicy.onCommit: commits: num=2 > > >> commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/data/solr-data/index >> lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@26041cb3; >> maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_acl,generation=13413} > > >> commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/data/solr-data/index >> lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@26041cb3; >> maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_acm,generation=13414} > > INFO - 2014-06-05 21:14:36.728; org.apache.solr.core.SolrDeletionPolicy; >> newest commit generation = 13414 > > INFO - 2014-06-05 21:14:36.749; org.apache.solr.search.SolrIndexSearcher; >> Opening Searcher@5bf20a8a main > > INFO - 2014-06-05 21:14:36.750; >> org.apache.solr.update.DirectUpdateHandler2; end_commit_flush > > INFO - 2014-06-05 21:14:36.759; org.apache.solr.core.QuerySenderListener; >> QuerySenderListener sending requests to Searcher@5bf20a8a >> main{StandardDirectoryReader(segments_acm:1367002775958 >> _2f28(4.5):C13583563/4088615 _2gl6(4.5):C2754573/202192 >> _2g21(4.5):C1046256/298243 _2ge2(4.5):C835858/208834 >> _2gqd(4.5):C383500/35732 _2gmu(4.5):C125197/33714 _2grl(4.5):C46906/3282 >> _2gpj(4.5):C66480/17459 _2gra(4.5):C364/40 _2gr1(4.5):C36064/3442 >> _2gqg(4.5):C42504/22410 _2gqm(4.5):C26821/13787 _2gqu(4.5):C24172/10804 >> _2gqy(4.5):C697/231 _2gr2(4.5):C878/382 _2gr7(4.5):C28135/12761 >> _2gr9(4.5):C3276/1478 _2grb(4.5):C5/1 _2grc(4.5):C3247/1323 _2grd(4.5):C6/1 >> _2grf(4.5):C5/2 _2grg(4.5):C23659/11895 _2grh(4.5):C1 _2grj(4.5):C1 >> _2grk(4.5):C5160/1982 _2grm(4.5):C1210/531 _2grn(4.5):C3957/1790 >> _2gro(4.5):C7734/3504 _2grp(4.5):C220/106 _2grq(4.5):C72751/30166 >> _2grr(4.5):C1)} > > INFO - 2014-06-05 21:14:36.759; org.apache.solr.core.SolrCore; >> [zoomCollection] webapp=null path=null >> params={event=newSearcher&q=d_name:ibm&distrib=false} hits=38 status=0 >> QTime=0 > > INFO - 2014-06-05 21:14:36.760; org.apache.solr.core.QuerySenderListener; >> QuerySenderListener done. 
> > INFO - 2014-06-05 21:14:36.760; org.apache.solr.core.SolrCore; >> [zoomCollection] Registered new searcher Searcher@5bf20a8a >> main{StandardDirectoryReader(segments_acm:1367002775958 >> _2f28(4.5):C13583563/4088615 _2gl6(4.5):C2754573/202192 >> _2g21(4.5):C1046256/298243 _2ge2(4.5):C835858/208834 >> _2gqd(4.5):C383500/35732 _2gmu(4.5):C125197/33714 _2grl(4.5):C46906/3282 >> _2gpj(4.5):C66480/17459 _2gra(4.5):C364/40 _2gr1(4.5):C36064/3442 >> _2gqg(4.5):C42504/22410 _2gqm(4.5):C26821/13787 _2gqu(4.5):C24172/10804 >> _2gqy(4.5):C697/231 _2gr2(4.5):C878/382 _2gr7(4.5):C28135/12761 >> _2gr9(4.5):C3276/1478 _2grb(4.5):C5/1 _2grc(4.5):C3247/1323 _2grd(4.5):C6/1 >> _2grf(4.5):C5/2 _2grg(4.5):C23659/11895 _2grh(4.5):C1 _2grj(4.5):C1 >> _2grk(4.5):C5160/1982 _2grm(4.5):C1210/531 _2grn(4.5):C3957/1790 >> _2gro(4.5):C7734/3504 _2grp(4.5):C220/106 _2grq(4.5):C72751/30166 >> _2grr(4.5):C1)} > > I've also shared via Google Drive a more complete log for a period of time where this is occurring, as well as our solrconfig.xml in case that is useful. Any ideas on why the Solr commit is not finding any changes despite the clear logging of the adds. For some reason, after hours of this it will find changes and commit everything, including the documents that were skipped previously. Thanks for any assistance! Justin Sweeney solr_commit_issue.log <https://docs.google.com/file/d/0B7jKxYrZOSvac21nV0JuRWF0SW8/edit?usp=drive_web> solrconfig.xml <https://docs.google.com/file/d/0B7jKxYrZOSvaRUY2QzhUN2tQYmM/edit?usp=drive_web>
Re: Documents Added Not Available After Commit (Both Soft and Hard)
Thanks for the input! Erick - To clarify, we see the No Uncommitted Changes message repeatedly for a number of commits (not a consistent number each time this happens) and then eventually we see a commit that successfully finds changes, at which point the documents are available. Shalin - That bug looks like it could be related to our case, did you notice any impact of the bug in situations where there were not just pending deletes by term? In our case, we are adding documents, we do have some deletes, but the bulk are adds. We can see the logging of the adds in the solr log prior to seeing the No Uncommitted Changes message. Either way, it may be useful for us to upgrade and see if it fixes the issue. I'll let you know if that works out once we get a chance to do that. Thanks, Justin On Mon, Jun 9, 2014 at 3:02 AM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > I think this may be the same bug as LUCENE-5289 which was fixed in 4.5.1. > Can you upgrade to 4.5.1 and see if that solves the problem? > > > > > On Fri, Jun 6, 2014 at 7:17 PM, Justin Sweeney > > wrote: > > > Hi, > > > > An application I am working on indexes documents to a Solr index. This > Solr > > index is setup as a single node, without any replication. This index is > > running Solr 4.5.0. > > > > We have noticed an issue lately that is causing some problems for our > > application. The problem is that we add/update a number of documents in > the > > Solr index and we have the index setup to autoCommit (hard) once every 30 > > minutes. In the Solr logs, I am able to see the add command to Solr and I > > can also see Solr start the hard commit. When this hard commit occurs, we > > see the following message: > > INFO - 2014-06-04 20:13:55.135; > > org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. > > Skipping IW.commit. > > > > This only happens sometimes, but Solr will go hours (we have seen 6-12 > > hours of this behavior) before it does a hard commit where it find > changes. > > After the hard commit where the changes are found, we are then able to > > search for and find the documents that were added hours ago, but up until > > that point the documents are not searchable. > > > > We tried enabling autoSoftCommit every 5 minutes in the hope that this > > would help, but we are seeing the same behavior. 
> > > > Here is a sampling of the logs showing this occurring (I've trimmed it > down > > to just show what is happening): > > > > INFO - 2014-06-05 20:00:41.300; > > >> org.apache.solr.update.processor.LogUpdateProcessor; [zoomCollection] > > >> webapp=/solr path=/update params={wt=javabin&version=2} > > {add=[359453225]} 0 > > >> 0 > > > > > > INFO - 2014-06-05 20:00:41.376; > > >> org.apache.solr.update.processor.LogUpdateProcessor; [zoomCollection] > > >> webapp=/solr path=/update params={wt=javabin&version=2} > > {add=[347170717]} 0 > > >> 1 > > > > > > INFO - 2014-06-05 20:00:51.527; > > >> org.apache.solr.update.DirectUpdateHandler2; start > > >> > > > commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false} > > > > > > INFO - 2014-06-05 20:00:51.533; > > org.apache.solr.search.SolrIndexSearcher; > > >> Opening Searcher@257c43d main > > > > > > INFO - 2014-06-05 20:00:51.533; > > >> org.apache.solr.update.DirectUpdateHandler2; end_commit_flush > > > > > > INFO - 2014-06-05 20:00:51.545; > > org.apache.solr.core.QuerySenderListener; > > >> QuerySenderListener sending requests to Searcher@257c43d > > >> main{StandardDirectoryReader(segments_acl:1367002775953 > > >> _2f28(4.5):C13583563/4081507 _2gl6(4.5):C2754573/193533 > > >> _2g21(4.5):C1046256/296354 _2ge2(4.5):C835858/206139 > > >> _2gqd(4.5):C383500/31051 _2gmu(4.5):C125197/32491 > _2grl(4.5):C46906/1255 > > >> _2gpj(4.5):C66480/16562 _2gra(4.5):C364/22 _2gr1(4.5):C36064/2556 > > >> _2gqg(4.5):C42504/21515 _2gqm(4.5):C26821/12659 > _2gqu(4.5):C24172/10240 > > >> _2gqy(4.5):C697/215 _2gr2(4.5):C878/352 _2gr7(4.5):C28135/11775 > > >> _2gr9(4.5):C3276/1341 _2grb(4.5):C5/1 _2grc(4.5):C3247/1219 > > _2grd(4.5):C6/1 > > >> _2grf(4.5):C5/2 _2grg(4.5):C23659/10967 _2grh(4.5):C1 _2grj(4.5):C1 > > >> _2grk(4.5):C5160/1482 _2grm(4.5):C1210/351 _2grn(4.5):C3957/1372 > > >> _2gro(4.5):C7734/2207 _2grp(4.5):C220/36)} > > > > > > INFO - 2014-
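For reference, the commit cadence described above corresponds to roughly the following in solrconfig.xml (intervals taken from the description; the openSearcher setting shown is an assumption, since documents only become searchable once a new searcher is opened):

  <autoCommit>
    <maxTime>1800000</maxTime>      <!-- hard commit every 30 minutes -->
    <openSearcher>true</openSearcher>
  </autoCommit>
  <autoSoftCommit>
    <maxTime>300000</maxTime>       <!-- soft commit every 5 minutes -->
  </autoSoftCommit>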
Large-scale Solr publish - hanging at blockUntilFinished indefinitely - stuck on SocketInputStream.socketRead0
*Problem:* We periodically rebuild our Solr index from scratch. We have built a custom publisher that horizontally scales to increase write throughput. On a given rebuild, we will have ~60 JVMs running with 5 threads that are actively publishing to all Solr masters. For each thread, we instantiate one StreamingUpdateSolrServer( QueueSize:100, QueueThreadSize: 2 ) for each master = 20 servers/thread. At the end of a publish cycle (we publish in smaller chunks = 5MM records), we execute server.blockUntilFinished() on each of the 20 servers on each thread ( 100 total ). Before we applied a recent change, this would always execute to completion. There were a few hang-ups on publishes but we consistently re-published our entire corpus in 6-7 hours. The *problem* is that the blockUntilFinished hangs indefinitely. From the java thread dumps, it appears that the loop in StreamingUpdateSolrServer thinks a runner thread is still active so it blocks (as expected). The other note about the java thread dump is that the active runner thread is exactly this: *Hung Runner Thread:* "pool-1-thread-8" prio=3 tid=0x0001084c nid=0xfe runnable [0x5c7fe000] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:129) at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) at java.io.BufferedInputStream.read(BufferedInputStream.java:237) - locked <0xfffe81dbcbe0> (a java.io.BufferedInputStream) at org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:78) at org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:106) at org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1116) at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.readLine(MultiThreadedHttpConnectionManager.java:1413) at org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1973) at org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1735) at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1098) at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398) at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171) at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397) at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323) at org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner.run(StreamingUpdateSolrServer.java:154) Although the runner thread is reading the socket, there is absolutely no activity on the Solr clients. Other than the blockUntilFinished thread, the client is basically sleeping. * * * * ***Recent Change:* We increased the "maxFieldLength" from 1(default) to 2147483647 (Integer.MAX_VALUE). Given this change is server side, I don't know how this would impact adding a new document. I see how it would increase commit times and index size, but don't see the relationship to hanging client adds. *Ingest Workflow:* 1) Pull artifacts from relational database (PDF/TXT/Java bean) 2) Extract all searchable text fields -- this is where we use Tika, independent of Solr 3) Using Solr4J client, we publish an object that is serialized to XML and written to the master 4) execute "blockUntilFinished" for all 20 servers on each thread. 5) Autocommit set on servers at 30 minutes or 50k documents. During republish, 50k threshold is met first. 
*Environment:* Solr v3.5.0 20 masters 2 slaves/master = 40 slaves *Corpus:* We have ~100MM records, ranging in size from 50 MB PDFs to 1 KB TXT files. Our schema has an unusually large number of fields, 200. Our index size averages about 30 GB/shard, totaling 600 GB. *Related Bugs:* My symptoms most resemble this bug, but we are not executing any deletes, so I have low confidence that it is 100% related: https://issues.apache.org/jira/browse/SOLR-1990 Although we have similar stack traces, we are only ADDING docs. Thanks ahead for any input/help! -- Justin Babuscio
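For completeness, the per-thread client setup described above amounts to roughly this (SolrJ 3.5; the URL list and the chunking are placeholders matching the description):

  import java.util.ArrayList;
  import java.util.List;
  import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;

  public class PublisherThreadSketch {
    public static void main(String[] args) throws Exception {
      // one StreamingUpdateSolrServer per master for this publishing thread;
      // queue size 100 and 2 runner threads draining each queue
      List<StreamingUpdateSolrServer> servers = new ArrayList<StreamingUpdateSolrServer>();
      for (String masterUrl : args) {
        servers.add(new StreamingUpdateSolrServer(masterUrl, 100, 2));
      }
      // ... add documents for this 5MM-record chunk here ...
      for (StreamingUpdateSolrServer server : servers) {
        server.blockUntilFinished();   // this is the call that hangs
      }
    }
  }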
Re: Large-scale Solr publish - hanging at blockUntilFinished indefinitely - stuck on SocketInputStream.socketRead0
Shawn, Thank you! Just some quick responses: On your overflow theory, why would this impact the client? Is it possible that a write attempt to Solr would block indefinitely while the Solr server is running wild or in a bad state due to the overflow? We attempt to set the BinaryRequestWriter but per this bug: https://issues.apache.org/jira/browse/SOLR-1565, v3.5 uses the default XML writer. On upgrading to 3.6.2 or 4.x, we have an organizational challenge that requires approval of the software/upgrade. I am promoting/supporting this idea but cannot execute in the short-term. For the mass publish, we originally used the CommonsHttpSolrServer (what we use in live production updates) but we found the trade-off with performance was quite large. I really like your idea about KISS on threading. Since I'm already introducing complexity with all the multi-threading, why stress the older 3.x software. We may need to trade off time for this. My first tactics will be to adjust the maxFieldLength and toggle the configuration to use CommonsHttpSolrServer (a rough sketch of that change is at the end of this message). I will follow up with any discoveries. Thanks again, Justin On Wed, May 22, 2013 at 11:46 AM, Shawn Heisey wrote: > On 5/22/2013 9:08 AM, Justin Babuscio wrote: >> We periodically rebuild our Solr index from scratch. We have built a >> custom publisher that horizontally scales to increase write throughput. >> On >> a given rebuild, we will have ~60 JVMs running with 5 threads that are >> actively publishing to all Solr masters. >> >> For each thread, we instantiate one StreamingUpdateSolrServer( >> QueueSize:100, QueueThreadSize: 2 ) for each master = 20 servers/thread. >> > > Looking over all your details, you might want to try first reducing the > maxFieldLength to slightly below Integer.MAX_VALUE. Try setting it to 2 > billion, or even something more modest, in the millions. It's > theoretically possible that the other value might be leading to an overflow > somewhere. I've been looking for evidence of this, nothing's turned up yet. > > There MIGHT be bugs in the Apache Commons libraries that SolrJ uses. The > next thing I would try is upgrading those component jars in your > application's classpath - httpclient, commons-io, commons-codec, etc. > > Upgrading to a newer SolrJ version is also a good idea. Your notes imply > that you are using the default XML request writer in SolrJ. If that's > true, you should be able to use a 4.3 SolrJ even with an older Solr > version, which would give you a server object that's based on > HttpComponents 4.x, where your current objects are based on HttpClient 3.x. > You would need to make adjustments in your source code. If you're not > using the default XML request writer, you can get a similar change by using > SolrJ 3.6.2. > > IMHO you should switch to HttpSolrServer (CommonsHttpSolrServer in SolrJ > 3.5 and earlier). StreamingUpdateSolrServer (and its replacement in 3.6 > and later, named ConcurrentUpdateSolrServer) has one glaring problem - it > never informs the calling application about any errors that it encounters > during indexing. It lies to you, and tells you that everything has > succeeded even when it doesn't. > > The one advantage that SUSS/CUSS has over its Http sibling is that it is > multi-threaded, so it can send updates concurrently. You seem to know > enough about how it works, so I'll just say that you don't need additional > complexity that is not under your control and refuses to throw exceptions > when an error occurs. 
You already have a large-scale concurrent and > multi-threaded indexing setup, so SolrJ's additional thread handling > doesn't really buy you much. > > Thanks, > Shawn > > -- Justin Babuscio 571-210-0035 http://linchpinsoftware.com
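For the record, the toggle I have in mind is roughly this (SolrJ 3.5; the URL and field names are placeholders) -- the point being that errors surface as exceptions on each add instead of being swallowed:

  import org.apache.solr.client.solrj.SolrServerException;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.common.SolrInputDocument;

  public class SimplePublisherSketch {
    public static void main(String[] args) throws Exception {
      CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://master1:8983/solr");
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "example-1");
      doc.addField("text", "example body");
      try {
        server.add(doc);   // throws on failure, unlike StreamingUpdateSolrServer
      } catch (SolrServerException e) {
        // log and retry or skip -- at least we know this document failed
        e.printStackTrace();
      }
    }
  }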
Re: Geo spatial search with multi-valued locations (SOLR-2155 / lucene-spatial-playground)
Mike, I've applied the patch as of a June-dated trunk. There were some trivial conflicts, but mostly-easy to apply. It has been in production for a couple months with no major hiccups so far :). Justin "Smiley, David W." writes: > Hi Mike. > > I have hopes that LSP will be ready in time for Solr 4. It's usable now with > the understanding that it's still fairly early and so there are bound to be > bugs. I've been focusing a lot on testing lately. You could try applying > SOLR-2155 but I think there was some Lucene/Solr code re-organization > regarding the ValueSource API. It shouldn't be hard to update. I don't think > JTeam's plugin handles multi-value but I could be wrong (Chris Male will be > sure to jump in and correct me if so). QBase/Metacarta has a Solr plugin > I've used indirectly through a packaged deal with their products > http://www.metacarta.com/products-overview.htm I have no idea if you can get > it stand-alone. As of a few months ago, it was based on a version of Solr > trunk from March 2010 and they have yet to update it. > > ~ David Smiley > > On Aug 29, 2011, at 2:27 PM, Mike Austin wrote: > >> Besides the full integration into solr for this, would you recommend any >> third party solr plugins such as " >> http://www.jteam.nl/products/spatialsolrplugin.html";, or others? >> >> I can understand that spacial features can get complex and there could be >> many use cases, but this seems like a "basic" feature that you would use >> with a standard set of spacial features like what is in solr4 now. >> >> Thanks, >> Mike >> >> On Mon, Aug 29, 2011 at 12:38 PM, Darren Govoni wrote: >> >>> It doesn't. >>> >>> >>> On 08/29/2011 01:37 PM, Mike Austin wrote: >>> >>>> I've been trying to follow the progress of this and I'm not sure what the >>>> current status is. Can someone update me on what is currently in Solr4 >>>> and >>>> does it support multi-valued location in a single document? I saw that >>>> SOLR-2155 was not included and is now lucene-spatial-playground. >>>> >>>> Thanks, >>>> Mike >>>> >>>> >>>
Re: Easy way to tell if there are pending documents
You can enable the stats handler (https://issues.apache.org/jira/browse/SOLR-1750), and inspect the JSON programmatically. -- Justin "Latter, Antoine" writes: > Thank you, that does help - but I am more looking for a way to get at this > programmatically. > > -Original Message- > From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com] > Sent: Tuesday, November 15, 2011 11:22 AM > To: solr-user@lucene.apache.org > Subject: Re: Easy way to tell if there are pending documents > > Antoine, > > On Solr Admin Stats page search for "docsPending". I think this is what you > are looking for. > > Otis > > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem > search :: http://search-lucene.com/ > > >> >>From: "Latter, Antoine" >>To: "'solr-user@lucene.apache.org'" >>Sent: Monday, November 14, 2011 11:39 AM >>Subject: Easy way to tell if there are pending documents >> >>Hi Solr, >> >>Does anyone know of an easy way to tell if there are pending documents >>waiting for commit? >> >>Our application performs operations that are never safe to perform >> while commits are pending. We make this work by making sure that all >> indexing operations end in a commit, and stop the unsafe operations >> from running while a commit is running. >> >>This works great most of the time, except when we have enough disk >> space to add documents to the pending area, but not enough disk >> space to do a commit - then the indexing operations only error out >> after they've done all of their adds. >> >>It would be nice if the unsafe operation could somehow detect that there are >>pending documents and abort. >> >>In the interim I'll have the unsafe operation perform a commit when it >>starts, but I've been weeding out useless commits from my app recently and I >>don't like them creeping back in. >> >>Thanks, >>Antoine >> >> >>
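For example, assuming the handler is registered at the usual /admin/mbeans path (the exact path and parameters depend on how it is set up), a request like the following returns the update handler statistics as JSON, and a non-zero docsPending means there are uncommitted documents:

  http://localhost:8983/solr/admin/mbeans?cat=UPDATEHANDLER&stats=true&wt=json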
Re: inconsistent JVM crash with version 4.0-SNAPSHOT
Lasse Aagren writes: > Hi, > > We are running Solr-Lucene 4.0-SNAPSHOT (1199777M - hudson - 2011-11-09 > 14:58:50) on severel servers running: > > 64bit Debian Squeeze (6.0.3) > OpenJDK6 (b18-1.8.9-0.1~squeeze1) > Tomcat 6.028 (6.0.28-9+squeeze1) > > Some of the servers have 48G RAM and in that case java have 16G (-Xmx16g) and > some of the servers have 96G RAM and in that case java have 48G (-Xmx48G). > > We are seeing some inconsistent crashes of tomcat's JVM under different > Solr/Lucene operations/circumstances. Sadly we can't replicate it. > > It doesn't happen often, but often enough that we can't rely on it in > production. > > When it happens, something like the following appears in the logs: > > == > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGSEGV (0xb) at pc=0x7f6c318d0902, pid=16516, tid=139772378892032 > # > # JRE version: 6.0_18-b18 > # Java VM: OpenJDK 64-Bit Server VM (14.0-b16 mixed mode linux-amd64 ) > # Derivative: IcedTea6 1.8.9 > # Distribution: Debian GNU/Linux 6.0.2 (squeeze), package > 6b18-1.8.9-0.1~squeeze1 > # Problematic frame: > # j > org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(Lorg/apache/lucene/index/IndexReader$AtomicReaderContext;Lorg/apache/lucene/util/Bits;)Lorg/apache/lucene/search/DocIdSet;+193 > # > # An error report file with more information is saved as: > # /tmp/hs_err_pid16516.log > # > # If you would like to submit a bug report, please include > # instructions how to reproduce the bug and visit: > # http://icedtea.classpath.org/bugzilla > # > == > > Every time it happens the problematic frame is: > > Problematic frame: > # j > org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(Lorg/apache/lucene/index/IndexReader$AtomicReaderContext;Lorg/apache/lucene/util/Bits; > )Lorg/apache/lucene/search/DocIdSet;+193 > > And /tmp/hs_err_pid16516.log is attached to this mail. > > Has anyone seen this before? > > Please don't hesitate to ask for further specification about our setup. > > Best regards, I seem to remember a recent Java release that fixed seemingly random SIGSEGVs causing Solr/Lucene to crash non-deterministically. http://lucene.apache.org/solr/#26+October+2011+-+Java+7u1+fixes+index+corruption+and+crash+bugs+in+Apache+Lucene+Core+and+Apache+Solr Hopefully this will provide you with some answers. If not, please let the list know. justin
Re: cache monitoring tools?
At my work, we use Munin and Nagio for monitoring and alerts. Munin is great because writing a plugin for it so simple, and with Solr's statistics handler, we can track almost any solr stat we want. It also comes with included plugins for load, file system stats, processes, etc. http://munin-monitoring.org/ Justin Paul Libbrecht writes: > Allow me to chim in and ask a generic question about monitoring tools > for people close to developers: are any of the tools mentioned in this > thread actually able to show graphs of loads, e.g. cache counts or CPU > load, in parallel to a console log or to an http request log?? > > I am working on such a tool currently but I have a bad feeling of reinventing > the wheel. > > thanks in advance > > Paul > > > > Le 8 déc. 2011 à 08:53, Dmitry Kan a écrit : > >> Otis, Tomás: thanks for the great links! >> >> 2011/12/7 Tomás Fernández Löbbe >> >>> Hi Dimitry, I pointed to the wiki page to enable JMX, then you can use any >>> tool that visualizes JMX stuff like Zabbix. See >>> >>> http://www.lucidimagination.com/blog/2011/10/02/monitoring-apache-solr-and-lucidworks-with-zabbix/ >>> >>> On Wed, Dec 7, 2011 at 11:49 AM, Dmitry Kan wrote: >>> >>>> The culprit seems to be the merger (frontend) SOLR. Talking to one shard >>>> directly takes substantially less time (1-2 sec). >>>> >>>> On Wed, Dec 7, 2011 at 4:10 PM, Dmitry Kan wrote: >>>> >>>>> Tomás: thanks. The page you gave didn't mention cache specifically, is >>>>> there more documentation on this specifically? I have used solrmeter >>>> tool, >>>>> it draws the cache diagrams, is there a similar tool, but which would >>> use >>>>> jmx directly and present the cache usage in runtime? >>>>> >>>>> pravesh: >>>>> I have increased the size of filterCache, but the search hasn't become >>>> any >>>>> faster, taking almost 9 sec on avg :( >>>>> >>>>> name: search >>>>> class: org.apache.solr.handler.component.SearchHandler >>>>> version: $Revision: 1052938 $ >>>>> description: Search using components: >>>>> >>>> >>> org.apache.solr.handler.component.QueryComponent,org.apache.solr.handler.component.FacetComponent,org.apache.solr.handler.component.MoreLikeThisComponent,org.apache.solr.handler.component.HighlightComponent,org.apache.solr.handler.component.StatsComponent,org.apache.solr.handler.component.DebugComponent, >>>>> >>>>> stats: handlerStart : 1323255147351 >>>>> requests : 100 >>>>> errors : 3 >>>>> timeouts : 0 >>>>> totalTime : 885438 >>>>> avgTimePerRequest : 8854.38 >>>>> avgRequestsPerSecond : 0.008789442 >>>>> >>>>> the stats (copying fieldValueCache as well here, to show term >>>> statistics): >>>>> >>>>> name: fieldValueCache >>>>> class: org.apache.solr.search.FastLRUCache >>>>> version: 1.0 >>>>> description: Concurrent LRU Cache(maxSize=1, initialSize=10, >>>>> minSize=9000, acceptableSize=9500, cleanupThread=false) >>>>> stats: lookups : 79 >>>>> hits : 77 >>>>> hitratio : 0.97 >>>>> inserts : 1 >>>>> evictions : 0 >>>>> size : 1 >>>>> warmupTime : 0 >>>>> cumulative_lookups : 79 >>>>> cumulative_hits : 77 >>>>> cumulative_hitratio : 0.97 >>>>> cumulative_inserts : 1 >>>>> cumulative_evictions : 0 >>>>> item_shingleContent_trigram : >>>>> >>>> >>> {field=shingleContent_trigram,memSize=326924381,tindexSize=4765394,time=215426,phase1=213868,nTerms=14827061,bigTerms=35,termInstances=114359167,uses=78} >>>>> name: filterCache >>>>> class: org.apache.solr.search.FastLRUCache >>>>> version: 1.0 >>>>> description: Concurrent LRU Cache(maxSize=153600, initialSize=4096, >>>>> minSize=138240, 
acceptableSize=145920, cleanupThread=false) >>>>> stats: lookups : 1082854 >>>>> hits : 940370 >>>>> hitratio : 0.86 >>>>> inserts : 142486 >>>>> evictions : 0 >>>>> size : 142486 >>>>> warmupTime : 0 >>>>> cumulative_lookups : 1082854 >>>>> cumulative_hits
Re: cache monitoring tools?
Dmitry, The only added stress that munin puts on each box is the 1 request per stat per 5 minutes to our admin stats handler. Given that we get 25 requests per second, this doesn't make much of a difference. We don't have a sharded index (yet) as our index is only 2-3 GB, but we do have slave servers with replicated indexes that handle the queries, while our master handles updates/commits. Justin Dmitry Kan writes: > Justin, in terms of the overhead, have you noticed if Munin puts much of it > when used in production? In terms of the solr farm: how big is a shard's > index (given you have sharded architecture). > > Dmitry > > On Sun, Dec 11, 2011 at 6:39 PM, Justin Caratzas > wrote: > >> At my work, we use Munin and Nagio for monitoring and alerts. Munin is >> great because writing a plugin for it so simple, and with Solr's >> statistics handler, we can track almost any solr stat we want. It also >> comes with included plugins for load, file system stats, processes, >> etc. >> >> http://munin-monitoring.org/ >> >> Justin >> >> Paul Libbrecht writes: >> >> > Allow me to chim in and ask a generic question about monitoring tools >> > for people close to developers: are any of the tools mentioned in this >> > thread actually able to show graphs of loads, e.g. cache counts or CPU >> > load, in parallel to a console log or to an http request log?? >> > >> > I am working on such a tool currently but I have a bad feeling of >> reinventing the wheel. >> > >> > thanks in advance >> > >> > Paul >> > >> > >> > >> > Le 8 déc. 2011 à 08:53, Dmitry Kan a écrit : >> > >> >> Otis, Tomás: thanks for the great links! >> >> >> >> 2011/12/7 Tomás Fernández Löbbe >> >> >> >>> Hi Dimitry, I pointed to the wiki page to enable JMX, then you can use >> any >> >>> tool that visualizes JMX stuff like Zabbix. See >> >>> >> >>> >> http://www.lucidimagination.com/blog/2011/10/02/monitoring-apache-solr-and-lucidworks-with-zabbix/ >> >>> >> >>> On Wed, Dec 7, 2011 at 11:49 AM, Dmitry Kan >> wrote: >> >>> >> >>>> The culprit seems to be the merger (frontend) SOLR. Talking to one >> shard >> >>>> directly takes substantially less time (1-2 sec). >> >>>> >> >>>> On Wed, Dec 7, 2011 at 4:10 PM, Dmitry Kan >> wrote: >> >>>> >> >>>>> Tomás: thanks. The page you gave didn't mention cache specifically, >> is >> >>>>> there more documentation on this specifically? I have used solrmeter >> >>>> tool, >> >>>>> it draws the cache diagrams, is there a similar tool, but which would >> >>> use >> >>>>> jmx directly and present the cache usage in runtime? 
>> >>>>> >> >>>>> pravesh: >> >>>>> I have increased the size of filterCache, but the search hasn't >> become >> >>>> any >> >>>>> faster, taking almost 9 sec on avg :( >> >>>>> >> >>>>> name: search >> >>>>> class: org.apache.solr.handler.component.SearchHandler >> >>>>> version: $Revision: 1052938 $ >> >>>>> description: Search using components: >> >>>>> >> >>>> >> >>> >> org.apache.solr.handler.component.QueryComponent,org.apache.solr.handler.component.FacetComponent,org.apache.solr.handler.component.MoreLikeThisComponent,org.apache.solr.handler.component.HighlightComponent,org.apache.solr.handler.component.StatsComponent,org.apache.solr.handler.component.DebugComponent, >> >>>>> >> >>>>> stats: handlerStart : 1323255147351 >> >>>>> requests : 100 >> >>>>> errors : 3 >> >>>>> timeouts : 0 >> >>>>> totalTime : 885438 >> >>>>> avgTimePerRequest : 8854.38 >> >>>>> avgRequestsPerSecond : 0.008789442 >> >>>>> >> >>>>> the stats (copying fieldValueCache as well here, to show term >> >>>> statistics): >> >>>>> >> >>>>> name: fieldValueCache >> >>>>> class: org.apache.solr.search.FastLRUCache >> >>>>> version: 1.0 >> >>>>> descr
Re: cache monitoring tools?
Dmitry, Thats beyond the scope of this thread, but Munin essentially runs "plugins" which are essentially scripts that output graph configuration and values when polled by the Munin server. So it uses a plain text protocol, so that the scripts can be written in any language. Munin then feeds this info into RRDtool, which displays the graph. There are some examples[1] of solr plugins that people have used to scrape the stats.jsp page. Justin 1. http://exchange.munin-monitoring.org/plugins/search?keyword=solr Dmitry Kan writes: > Thanks, Justin. With zabbix I can gather jmx exposed stats from SOLR, how > about munin, what protocol / way it uses to accumulate stats? It wasn't > obvious from their online documentation... > > On Mon, Dec 12, 2011 at 4:56 PM, Justin Caratzas > wrote: > >> Dmitry, >> >> The only added stress that munin puts on each box is the 1 request per >> stat per 5 minutes to our admin stats handler. Given that we get 25 >> requests per second, this doesn't make much of a difference. We don'tg >> have a sharded index (yet) as our index is only 2-3 GB, but we do have >> slave servers with replicated >> indexes that handle the queries, while our master handles >> updates/commits. >> >> Justin >> >> Dmitry Kan writes: >> >> > Justin, in terms of the overhead, have you noticed if Munin puts much of >> it >> > when used in production? In terms of the solr farm: how big is a shard's >> > index (given you have sharded architecture). >> > >> > Dmitry >> > >> > On Sun, Dec 11, 2011 at 6:39 PM, Justin Caratzas >> > wrote: >> > >> >> At my work, we use Munin and Nagio for monitoring and alerts. Munin is >> >> great because writing a plugin for it so simple, and with Solr's >> >> statistics handler, we can track almost any solr stat we want. It also >> >> comes with included plugins for load, file system stats, processes, >> >> etc. >> >> >> >> http://munin-monitoring.org/ >> >> >> >> Justin >> >> >> >> Paul Libbrecht writes: >> >> >> >> > Allow me to chim in and ask a generic question about monitoring tools >> >> > for people close to developers: are any of the tools mentioned in this >> >> > thread actually able to show graphs of loads, e.g. cache counts or CPU >> >> > load, in parallel to a console log or to an http request log?? >> >> > >> >> > I am working on such a tool currently but I have a bad feeling of >> >> reinventing the wheel. >> >> > >> >> > thanks in advance >> >> > >> >> > Paul >> >> > >> >> > >> >> > >> >> > Le 8 déc. 2011 à 08:53, Dmitry Kan a écrit : >> >> > >> >> >> Otis, Tomás: thanks for the great links! >> >> >> >> >> >> 2011/12/7 Tomás Fernández Löbbe >> >> >> >> >> >>> Hi Dimitry, I pointed to the wiki page to enable JMX, then you can >> use >> >> any >> >> >>> tool that visualizes JMX stuff like Zabbix. See >> >> >>> >> >> >>> >> >> >> http://www.lucidimagination.com/blog/2011/10/02/monitoring-apache-solr-and-lucidworks-with-zabbix/ >> >> >>> >> >> >>> On Wed, Dec 7, 2011 at 11:49 AM, Dmitry Kan >> >> wrote: >> >> >>> >> >> >>>> The culprit seems to be the merger (frontend) SOLR. Talking to one >> >> shard >> >> >>>> directly takes substantially less time (1-2 sec). >> >> >>>> >> >> >>>> On Wed, Dec 7, 2011 at 4:10 PM, Dmitry Kan >> >> wrote: >> >> >>>> >> >> >>>>> Tomás: thanks. The page you gave didn't mention cache >> specifically, >> >> is >> >> >>>>> there more documentation on this specifically? 
I have used >> solrmeter >> >> >>>> tool, >> >> >>>>> it draws the cache diagrams, is there a similar tool, but which >> would >> >> >>> use >> >> >>>>> jmx directly and present the cache usage in runtime? >> >> >>>>> >> >> >>>>> pravesh: >> >> >>>>> I have increased the size of f
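A minimal plugin following the protocol Justin describes could look like the sketch below. It is a hypothetical example rather than one of the plugins from the exchange; the stats URL and the text scraping are assumptions that would need adapting to the stats output of the Solr version being monitored.

#!/bin/sh
# Hypothetical Munin plugin sketch: polls Solr's legacy stats page and reports one value.
# The URL and the scraping below are assumptions for illustration only.
STATS_URL="http://localhost:8983/solr/admin/stats.jsp"

case "$1" in
config)
    # Munin calls the plugin with "config" to learn how to draw the graph.
    echo "graph_title Solr numDocs"
    echo "graph_category solr"
    echo "numdocs.label numDocs"
    exit 0
    ;;
esac

# Normal poll: emit "field.value N" lines; Munin stores them in RRDtool and graphs them.
VALUE=$(curl -s "$STATS_URL" | tr -d '\n' | grep -o '<stat name="numDocs"[^<]*<' | grep -oE '[0-9]+' | head -1)
echo "numdocs.value ${VALUE:-U}"

Dropping a script like this into /etc/munin/plugins (one per stat, or one multi-value plugin) is all the "integration" Munin needs, which is why it pairs well with Solr's statistics handler.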
setting up clustering
I'm trying to enable clustering in Solr 1.4. I'm following these instructions: http://wiki.apache.org/solr/ClusteringComponent However, `ant get-libraries` fails for me. Before it tries to download the four jar files, it tries to compile Lucene. Is this necessary? Has anyone gotten clustering working properly? My next attempt was to just copy contrib/clustering/lib/*.jar and contrib/clustering/lib/downloads/*.jar to WEB-INF/lib and enable clustering in solrconfig.xml, but this doesn't work either and I can't tell from the error log whether it just couldn't find the jar files or if there is some other problem: SEVERE: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.clustering.ClusteringComponent'
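One common cause of that "Error loading class" message is the clustering jars not being on the core's plugin classpath. Below is a hedged sketch of the pieces the stock Solr 1.4 example solrconfig.xml uses for clustering; the relative lib paths are assumptions and have to resolve from this core's instanceDir to the contrib/clustering jars, and the component is switched on by starting Solr with -Dsolr.clustering.enabled=true (or by hard-coding enable="true").

<!-- Sketch based on the Solr 1.4 example config; adjust the relative paths to your layout. -->
<lib dir="../../contrib/clustering/lib/downloads/" />
<lib dir="../../contrib/clustering/lib/" />

<searchComponent name="clustering"
                 enable="${solr.clustering.enabled:false}"
                 class="org.apache.solr.handler.clustering.ClusteringComponent">
  <lst name="engine">
    <str name="name">default</str>
    <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
  </lst>
</searchComponent>

The component then still has to be referenced from a request handler (for example in its last-components list), as the 1.4 example config does.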
boosting particular field values
I'm using the dismax request handler, Solr 1.4. I would like to boost the weight of certain fields according to their values... this appears to work: bq=category:electronics^5.5 However, I think this boosting only affects the ordering of results that have already matched? So if I only get 10 rows back, I might not get any records back that are in category electronics. If I get 100 rows, I can see that bq is working. However, I only want to get 10 rows. How does one affect the kinds of results that are matched to begin with? bq is the wrong thing to use, right? Thanks for any help, Justin
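A hedged pair of example requests (the core path, query term, and field below are illustrative, not taken from this thread): bq only adds optional boost clauses on top of whatever q already matched, so it reorders the result list, while fq restricts which documents can match at all.

Prefer electronics but still allow other categories (only reorders whatever already matched):
http://localhost:8983/solr/select?qt=dismax&q=ipod&bq=category:electronics^5.5&rows=10

Return only electronics (restricts the matched set; the filter is also cached separately):
http://localhost:8983/solr/select?qt=dismax&q=ipod&fq=category:electronics&rows=10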
Re: boosting particular field values
I might have misunderstood, but I think I cant do string literals in function queries, right? myfield:"something"^3.0 I tried it anyway using solr 1.4, doesnt seem to work. On Wed, Jul 21, 2010 at 1:48 PM, Markus Jelsma wrote: > function queries match all documents > > > http://wiki.apache.org/solr/FunctionQuery#Using_FunctionQuery > > > -Original message- > From: Justin Lolofie > Sent: Wed 21-07-2010 20:24 > To: solr-user@lucene.apache.org; > Subject: boosting particular field values > > I'm using dismax request handler, solr 1.4. > > I would like to boost the weight of certain fields according to their > values... this appears to work: > > bq=category:electronics^5.5 > > However, I think this boosting only affects sorting the results that > have already matched? So if I only get 10 rows back, I might not get > any records back that are category electronics. If I get 100 rows, I > can see that bq is working. However, I only want to get 10 rows. > > How does one affect the kinds of results that are matched to begin > with? bq is the wrong thing to use, right? > > Thanks for any help, > Justin >
Re: Dismax query response field number
scrapy what version of solr are you using? I'd like to do "fq=city:Paris" but it doesnt seem to work for me (solr 1.4) and the docs seem to suggest its a feature that is coming but not there yet? Or maybe I misunderstood? On Thu, Jul 22, 2010 at 6:00 AM, wrote: > > Thanks, > > That was the problem! > > > > > select?q=moto&qt=dismax& fq =city:Paris > > > > > > > > > > > > -Original Message- > From: Chantal Ackermann > To: solr-user@lucene.apache.org > Sent: Thu, Jul 22, 2010 12:47 pm > Subject: Re: Dismax query response field number > > > is this a typo in your query or in your e-mail? > > you have the "q" parameter twice. > use "fq" for query inputs that mention a field explicitly when using > dismax. > > So it should be: > select?q=moto&qt=dismax& fq =city:Paris > > (the whitespace is only for visualization) > > > chantal > > > On Thu, 2010-07-22 at 11:03 +0200, scr...@asia.com wrote: >> Yes i've data... maybe my query is wrong? >> >> select?q=moto&qt=dismax&q=city:Paris >> >> Field city is not showing? >> >> >> >> >> >> >> >> >> -Original Message- >> From: Grijesh.singh >> To: solr-user@lucene.apache.org >> Sent: Thu, Jul 22, 2010 10:07 am >> Subject: Re: Dismax query response field number >> >> >> >> Do u have data in that field also,Solr returns field which have data only. > > > > > >
analysis tool vs. reality
Hello, I have found the analysis tool in the admin page to be very useful in understanding my schema. I've made changes to my schema so that a particular case I'm looking at matches properly. I restarted solr, deleted the document from the index, and added it again. But still, when I do a query, the document does not get returned in the results. Does anyone have any tips for debugging this sort of issue? What is different between what I see in analysis tool and new documents added to the index? Thanks, Justin
analysis tool vs. reality
Hi Erik, thank you for replying. So, turning on debugQuery shows information about how the query is processed. Is there a way to see how things are stored internally in the index? My query is "ABC12". There is a document whose "title" field is "ABC12". However, I can only get it to match if I search for "ABC" or "12". This was also true in the analysis tool up until recently. However, I changed schema.xml and turned on catenate-all in WordDelimiterFilterFactory for the title fieldtype. Now, in the analysis tool "ABC12" matches "ABC12". However, when doing an actual query, it does not match. Thank you for any help, Justin -- Forwarded message -- From: Erik Hatcher To: solr-user@lucene.apache.org Date: Tue, 3 Aug 2010 16:50:06 -0400 Subject: Re: analysis tool vs. reality The analysis tool is merely that, but during querying there is also a query parser involved. Adding debugQuery=true to your request will give you the parsed query in the response offering insight into what might be going on. Could be lots of things, like not querying the fields you think you are to a misunderstanding about some text not being analyzed (like wildcard clauses). Erik On Aug 3, 2010, at 4:43 PM, Justin Lolofie wrote: Hello, I have found the analysis tool in the admin page to be very useful in understanding my schema. I've made changes to my schema so that a particular case I'm looking at matches properly. I restarted solr, deleted the document from the index, and added it again. But still, when I do a query, the document does not get returned in the results. Does anyone have any tips for debugging this sort of issue? What is different between what I see in analysis tool and new documents added to the index? Thanks, Justin
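For reference, the switch Justin describes is the catenateAll flag on WordDelimiterFilterFactory. Below is a hedged schema.xml sketch of a title field type with it enabled at index time; the tokenizer and surrounding filters are illustrative rather than Justin's actual schema. Both the index-time and query-time analyzers matter for whether "ABC12" can match, and index-time changes only apply to documents added (and committed) after the schema change.

<fieldType name="text_title" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- catenateAll="1" makes "ABC" + "12" also produce the combined token "ABC12" -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="1" catenateNumbers="1" catenateAll="1"
            splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>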
analysis tool vs. reality
Erik: Yes, I did re-index if that means adding the document again. Here are the exact steps I took: 1. analysis.jsp "ABC12" does NOT match title "ABC12" (however, ABC or 12 does) 2. changed schema.xml WordDelimeterFilterFactory catenate-all 3. restarted tomcat 4. deleted the document with title "ABC12" 5. added the document with title "ABC12" 6. query "ABC12" does NOT result in the document with title "ABC12" 7. analysis.jsp "ABC12" DOES match that document now Is there any way to see, given an ID, how something is indexed internally? Lance: I understand the index/query sections of analysis.jsp. However, it operates on text that you enter into the form, not on actual index data. Since all my documents have a unique ID, I'd like to supply an ID and a query, and get back the same index/query sections- using whats actually in the index. -- Forwarded message -- From: Erik Hatcher To: solr-user@lucene.apache.org Date: Tue, 3 Aug 2010 22:43:17 -0400 Subject: Re: analysis tool vs. reality Did you reindex after changing the schema? On Aug 3, 2010, at 7:35 PM, Justin Lolofie wrote: Hi Erik, thank you for replying. So, turning on debugQuery shows information about how the query is processed- is there a way to see how things are stored internally in the index? My query is "ABC12". There is a document who's "title" field is "ABC12". However, I can only get it to match if I search for "ABC" or "12". This was also true in the analysis tool up until recently. However, I changed schema.xml and turned on catenate-all in WordDelimterFilterFactory for title fieldtype. Now, in the analysis tool "ABC12" matches "ABC12". However, when doing an actual query, it does not match. Thank you for any help, Justin -- Forwarded message -- From: Erik Hatcher To: solr-user@lucene.apache.org Date: Tue, 3 Aug 2010 16:50:06 -0400 Subject: Re: analysis tool vs. reality The analysis tool is merely that, but during querying there is also a query parser involved. Adding debugQuery=true to your request will give you the parsed query in the response offering insight into what might be going on. Could be lots of things, like not querying the fields you think you are to a misunderstanding about some text not being analyzed (like wildcard clauses). Erik On Aug 3, 2010, at 4:43 PM, Justin Lolofie wrote: Hello, I have found the analysis tool in the admin page to be very useful in understanding my schema. I've made changes to my schema so that a particular case I'm looking at matches properly. I restarted solr, deleted the document from the index, and added it again. But still, when I do a query, the document does not get returned in the results. Does anyone have any tips for debugging this sort of issue? What is different between what I see in analysis tool and new documents added to the index? Thanks, Justin
analysis tool vs. reality
Wow, I got to work this morning and my query results now include the 'ABC12' document. I'm not sure what that means. Either I made a mistake in the process I described in the last email (I dont think this is the case) or there is some kind of caching of query results going on that doesnt get flushed on a restart of tomcat. Erik: Yes, I did re-index if that means adding the document again. Here are the exact steps I took: 1. analysis.jsp "ABC12" does NOT match title "ABC12" (however, ABC or 12 does) 2. changed schema.xml WordDelimeterFilterFactory catenate-all 3. restarted tomcat 4. deleted the document with title "ABC12" 5. added the document with title "ABC12" 6. query "ABC12" does NOT result in the document with title "ABC12" 7. analysis.jsp "ABC12" DOES match that document now Is there any way to see, given an ID, how something is indexed internally? Lance: I understand the index/query sections of analysis.jsp. However, it operates on text that you enter into the form, not on actual index data. Since all my documents have a unique ID, I'd like to supply an ID and a query, and get back the same index/query sections- using whats actually in the index. -- Forwarded message -- From: Erik Hatcher To: solr-user@lucene.apache.org Date: Tue, 3 Aug 2010 22:43:17 -0400 Subject: Re: analysis tool vs. reality Did you reindex after changing the schema? On Aug 3, 2010, at 7:35 PM, Justin Lolofie wrote: Hi Erik, thank you for replying. So, turning on debugQuery shows information about how the query is processed- is there a way to see how things are stored internally in the index? My query is "ABC12". There is a document who's "title" field is "ABC12". However, I can only get it to match if I search for "ABC" or "12". This was also true in the analysis tool up until recently. However, I changed schema.xml and turned on catenate-all in WordDelimterFilterFactory for title fieldtype. Now, in the analysis tool "ABC12" matches "ABC12". However, when doing an actual query, it does not match. Thank you for any help, Justin -- Forwarded message -- From: Erik Hatcher To: solr-user@lucene.apache.org Date: Tue, 3 Aug 2010 16:50:06 -0400 Subject: Re: analysis tool vs. reality The analysis tool is merely that, but during querying there is also a query parser involved. Adding debugQuery=true to your request will give you the parsed query in the response offering insight into what might be going on. Could be lots of things, like not querying the fields you think you are to a misunderstanding about some text not being analyzed (like wildcard clauses). Erik On Aug 3, 2010, at 4:43 PM, Justin Lolofie wrote: Hello, I have found the analysis tool in the admin page to be very useful in understanding my schema. I've made changes to my schema so that a particular case I'm looking at matches properly. I restarted solr, deleted the document from the index, and added it again. But still, when I do a query, the document does not get returned in the results. Does anyone have any tips for debugging this sort of issue? What is different between what I see in analysis tool and new documents added to the index? Thanks, Justin
Re: DIH silently ignoring a record
Shalin, Thanks for your questions- the mystery is solved this morning. My "unique" key was only unique within an entity and not between them. There was only one instance of overlap- the no-longer mysterious record and its doppelganger. All the other symptoms were side effects from how I was troubleshooting. For example, if I did a full import, the doppelganger record (which I didnt know about) would be imported- but my test query was only looking for the one that didnt make it in. However, if I imported only that entity, it would, as expected, update the index record and things would appear fine to me. So, no bug. Just plain old bad/narrow troubleshooting combined with coincidence (only record not getting imported is first row, etc). -justin On Mon, Mar 18, 2013 at 7:34 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > That does sound perplexing. > > Justin, can you tell us which field in the query is your record id? What is > the record id's type in database and in solr schema? What is your unique > key and its type in solr schema? > > > On Tue, Mar 19, 2013 at 5:19 AM, Justin L. wrote: > > > Every time I do an import, DataImportHandler is not importing 1 row from > my > > database. > > > > I have 3 entities each defined with a single query. I have confirmed, by > > looking at totals from solr as well as comparing a "*:*" query to direct > db > > queries-- exactly 1 row is missing every time. And its the same row- the > > first row of one of my entities when sorted by primary key. The other two > > entities are fully imported without trouble. > > > > There are no errors in the log- even when DIH logging is turned up to > FINE. > > When I alter the query to retrieve only the mysterious record, it shows > up > > as "Fetched: 1 Skipped: 0 Processed: 1". But when I do a query for *:* it > > returns 0 documents. > > > > Ready for a twist? The DIH query for this entity does not have an ORDER > BY > > clause- when I add one to sort by primary key DESC it imports all of the > > rows for that entity, including the mysterious record. > > > > Ready to have your mind blown? I am using the alternative method for > doing > > delta imports (see query below). When I make clean=false, and update the > > timestamp on the mysterious record- yup- it gets imported properly. > > > > > > > > Because I have the ORDER BY DESC hack, I can get by and live to fight > > another day. But I thought someone might like to know this because I > think > > I am hitting a bug in DIH- specifically, something after the querying but > > before the posting to solr. If someone familiar with DIH innards wants to > > suggest where I should look or how to step through it, I'd be willing to > > take a look. > > > > xoxo, > > Justin > > > > > > * Fun facts: > > Solr 4.0 > > Oracle 11g > > The mysterious record's id is "01" > > I use field elements to rename the columns rather than in-the-sql aliases > > because of a problem I had with them earlier. But I will try changing > that. 
> > > > > > * Alternative delta import method: > > > > http://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport > > > > > > * DIH query that should import mysterious record: > > > > select organization_name, organization_id, address > > from organization o > > join rolodex r on r.rolodex_id = o.contact_address_id > > and r.sponsor_address_flag = 'N' > > and r.actv_ind = 'Y' > > where '${dataimporter.request.clean}' = 'true' > > or to_char(o.update_timestamp,'-MM-DD HH24:MI:SS') > > > '${dataimporter.organization.last_index_time > > > > > > -- > Regards, > Shalin Shekhar Mangar. >
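For readers following the DeltaQueryViaFullImport link above, here is a hedged sketch of what such an entity can look like in db-data-config.xml. The driver, connection details, table, and field names are placeholders, not Justin's actual configuration; the import is then triggered with command=full-import&clean=false so that '${dataimporter.request.clean}' evaluates to false and last_index_time drives the delta.

<dataConfig>
  <dataSource driver="oracle.jdbc.OracleDriver"
              url="jdbc:oracle:thin:@//dbhost:1521/SID" user="solr" password="***"/>
  <document>
    <entity name="organization"
            query="select organization_id, organization_name
                   from organization
                   where '${dataimporter.request.clean}' = 'true'
                      or update_timestamp &gt;
                         to_date('${dataimporter.organization.last_index_time}', 'YYYY-MM-DD HH24:MI:SS')">
      <!-- explicit field mappings instead of SQL aliases, as Justin mentions doing -->
      <field column="ORGANIZATION_ID" name="id"/>
      <field column="ORGANIZATION_NAME" name="organization_name"/>
    </entity>
  </document>
</dataConfig>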
Solr v3.5.0 - numFound changes when paging through results on 8-shard cluster
Solr v3.5.0 8 Master Shards 2 Slaves Per Master Confirming that there are no active records being written, the "numFound" value is decreasing as we page through the results. For example, Page1 - numFound = 3683 Page2 - numFound = 3683 Page3 - numFound = 3683 Page4 - numFound = 2866 Page5 - numFound = 2419 Page5 - numFound = 1898 Page6 - numFound = 1898 ... PageN - numFound = 1898 It looks like it eventually settles on the real count. Is this a limitation when using a distributed cluster, or is numFound only intended to give an approximate count, similar to how Google responds with total hits? I also increased start to higher than 1898 (made it 3000) and it returned 0 results with numFound = 1898. Thanks ahead, -- Justin Babuscio 571-210-0035 http://linchpinsoftware.com
Re: Solr v3.5.0 - numFound changes when paging through results on 8-shard cluster
1) We have 1 core and we use the default search handler. 2) For the shards, we use the URL parameters, shards=s1/solr,s2/solr,s3/solr,...,s8/solr where s# point to a baremetal load balancer that routes the requests to one of the two slave shards. There is definitely the chance that on each page, the load balancer is mixing the shards used for searching. Is there a possibility that the Master1 may have two shards with two different counts? This could explain it. 3) For the URL to execute the aggregating search, we have a virtual IP that round robins to all 16 slaves in the cluster. On a given search, any one of the 16 slaves may field the aggregation request then the shard single-node queries are fired off to the load balancer which then may mix up the nodes. Other than the load balancing, we have no other configuration or search differences besides the start parameter in the URL to move to the next page. On Tue, Jun 19, 2012 at 4:20 PM, Yury Kats wrote: > On 6/19/2012 4:06 PM, Justin Babuscio wrote: > > Solr v3.5.0 > > 8 Master Shards > > 2 Slaves Per Master > > > > Confirming that there are no active records being written, the "numFound" > > value is decreasing as we page through the results. > > > > For example, > > Page1 - numFound = 3683 > > Page2 - numFound = 3683 > > Page3 - numFound = 3683 > > Page4 - numFound = 2866 > > Page5 - numFound = 2419 > > Page5 - numFound = 1898 > > Page6 - numFound = 1898 > > ... > > PageN - numFound = 1898 > > > > > > > > It looks like it eventually settles on the real count. Is this a > > limitation when using a distributed cluster or is the numFound always > > intended to give an approximately similar to how Google responds with > total > > hits? > > numFound should return the real count for any given query. > How are you sepcifying which shards/cores to use for each query? > Does this change between queries? > > -- Justin Babuscio 571-210-0035 http://linchpinsoftware.com
Re: Solr v3.5.0 - numFound changes when paging through results on 8-shard cluster
As I understand your problem, it sounds like you were using your master as part of your search cluster so the two distributed queries were returning conflicting numbers. In my scenario, our eight Masters are used for /updates & /deletes only. There are no queries issued to these nodes. When the distributed query is executed, it could be possible for the two slaves to be out of sync (i.e. one replicated faster than the other). The proof that may eliminate this is that there is no activity on my servers right now. The search counts have stabilized. It's consistently returning less results as I execute the query with a "start" URL param > 400. On Tue, Jun 19, 2012 at 5:05 PM, Shawn Heisey wrote: > On 6/19/2012 2:32 PM, Justin Babuscio wrote: > >> 2) For the shards, we use the URL >> parameters, shards=s1/solr,s2/solr,s3/**solr,...,s8/solr >> where s# point to a baremetal load balancer that routes the requests >> to >> one of the two slave shards. >> > > This most likely has nothing to do with your question about changing > numFound, just a side issue that I wanted to comment on. I was at one time > using a similar method where I had each shard as an entry in the load > balancer. This led to an unusual occasional problem. > > As you may know, a distributed query results in two queries being sent to > each shard -- the first one finds the documents on each shard, then once > Solr has gathered those results, it makes another request that retrieves > the document. > > Imagine that you have just updated your master server, and you make a > query that will include one or more of the new documents in the results. > If you make that query just after the master server gets updated, but > before the slave has had a chance to copy and commit the changes, you can > run into this: The first (search) query goes to the master server and will > see the new document. The second (retrieval) query will then go to the > slave, requesting a document that does not yet exist there. This *will* > happen eventually. I would run into it at least once a day on a monitoring > system that checked the age of the newest document. > > Here's one way to deal with that: I have a dedicated core on each server > that has the shards parameter included in the request handler. This core > does not have an index of its own, it exists only to act as a search > broker, pointing at all the cores with the data. The name of this core is > ncmain, and its standard request handler contains the following: > > idxa2.example.**com:8981/solr/inclive,idxa1.** > example.com:8981/solr/s0live,**idxa1.example.com:8981/solr/** > s1live,idxa1.example.com:8981/**solr/s2live,idxa2.example.com:** > 8981/solr/s3live,idxa2.**example.com:8981/solr/s4live,** > idxa2.example.com:8981/solr/**s5live<http://idxa2.example.com:8981/solr/inclive,idxa1.example.com:8981/solr/s0live,idxa1.example.com:8981/solr/s1live,idxa1.example.com:8981/solr/s2live,idxa2.example.com:8981/solr/s3live,idxa2.example.com:8981/solr/s4live,idxa2.example.com:8981/solr/s5live> > > > On the servers for chain A (idxa1, idxa2), the shards parameter references > only chain A server cores. On the servers for chain B (idxb1, idxb2), the > shards parameter references only chain B server cores. > > The load balancer only talks to these broker cores, not the cores with the > actual indexes. Neither the client nor the load balancer needs to use (or > even know about) the shards parameter. That is handled entirely within the > Solr configuration. 
> > Thanks, > Shawn > > -- Justin Babuscio 571-210-0035 http://linchpinsoftware.com
Re: Solr v3.5.0 - numFound changes when paging through results on 8-shard cluster
I believe that is the issue. We recently lost a physical server and a misinformed (due to weekend fire) sys admin moved one of the master shards. This caused the automated deployment scripts to change the order of publishing. When a rebuild followed the following day, we essentially wrote the same record to multiple servers causing this false positive count. Thank you for all the feedback in resolving this. We are going to delete our entire index, rebuild from scratch (achievable for our user base), and it should clear up any discrepancies. Justin On Tue, Jun 19, 2012 at 5:40 PM, Chris Hostetter wrote: > : Confirming that there are no active records being written, the "numFound" > : value is decreasing as we page through the results. > > 1) check that the "clones" of each shard are in fact identical (just look > at the index files on each machine and make sure they are the same. > > 2) distributed searching relies heavily on using a uniqeuKey, and can > behave oddly if documents with identical keys exist in multiple shards. > > > http://wiki.apache.org/solr/DistributedSearch?#Distributed_Searching_Limitations > > If i remember correctly, what you are describing sounds like one of the > things that can hapen if you violate the uniqueKey rule across differnet > shards when indexing. > > I *think* what you are seeing is that in the distributed request for > page#1 the coordinator sums up the numFound from all shards, and merges > results 1-$rows acording to the sort, likewise for pages 2 & 3 when you > get to page #4, it suddenly sees that doc#9876543 is included in hte > responses from 3 diff shards, and it subtracts 2 from the numFound, and so > on as you page farther through the results. the more documents with > duplicate uniqueKeys it find in the results as it pages through, the lower > the cumulative numFound gets. > > : For example, > : Page1 - numFound = 3683 > : Page2 - numFound = 3683 > : Page3 - numFound = 3683 > : Page4 - numFound = 2866 > : Page5 - numFound = 2419 > : Page5 - numFound = 1898 > : Page6 - numFound = 1898 > : ... > : PageN - numFound = 1898 > > -Hoss > -- Justin Babuscio 571-210-0035 http://linchpinsoftware.com
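One hedged way to confirm Hoss's diagnosis before wiping the index (the hostnames and query below are illustrative): run the same query against each shard core directly, without the shards parameter, and compare the sum of the per-shard numFound values with the distributed numFound. If the distributed number keeps dropping as you page, some uniqueKey values are present on more than one shard.

Distributed query through the aggregator (reports the possibly inflated numFound):
http://searchvip:8983/solr/select?q=field:value&rows=0&shards=s1/solr,s2/solr,s3/solr,s4/solr,s5/solr,s6/solr,s7/solr,s8/solr

The same query per shard, with no shards parameter; these counts should sum to the distributed numFound:
http://s1:8983/solr/select?q=field:value&rows=0
http://s2:8983/solr/select?q=field:value&rows=0
...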
Highlighting error InvalidTokenOffsetsException: Token oedipus exceeds length of provided text sized 11
ava:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) at org.mortbay.jetty.Server.handle(Server.java:326) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) Caused by: org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token oedipus exceeds length of provided text sized 11 at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:233) at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlightingByHighlighter(DefaultSolrHighlighter.java:490) ... 24 more I am not even sure where the “oedipus” token is coming from. It doesn’t show up in the analysis. Help please? Thank you, Justin
Solr replication hangs on multiple slave nodes
After a large index rebuild (16 masters with ~15GB each), some slaves fail to completely replicate. We are running Solr v3.5 with 16 masters and 2 slaves each for a total of 48 servers. 4 of the 32 slaves sit in a stalled replication state with similar messages: Files Downloaded: 254/260 Downloaded: 12.09 GB / 12.09 GB [ 100% ] Downloading File: _t6.fdt, Downloaded: 3.1 MB / 3.1 MB [ 100 % ] Time Elapsed: 3215s, Estimated Time Remaining: 0s, Speed: 24.5 MB/s As you'll notice, all download sizes appear to be complete, but the count of files downloaded is not. This also prevents the servers from polling for a new update from the masters. When searching, we are occasionally seeing 500 responses from the slaves that fail to replicate. The errors are ArrayIndexOutOfBounds - this occurs when writing the HTTP response (our container is WebSphere) NullPointerExceptions - org.apache.lucene.queryParser.QueryParser.parse (QueryParser.java:203) We have tried to stop the slave, delete the /data directory, and restart. This started downloading the index but stalled as expected. Thanks, Justin
encountered the "Cannot allocate memory" when calling snapshooter program after optimize command
Hi, I configured solr to listen on postOptimize event and call the snapshooter program after an optimize command. It works well when the Java heap size is set to less than 4G. But if I increased the java heap size to 5G, the snapshooter program can't be successfully called after the optimize command and error message is here: SEVERE: java.io.IOException: Cannot run program "/home/solr_1.3/solr/bin/snapshooter" (in directory "/home/solr_1.3/solr/bin"): java.io.IOException: error=12, Cannot allocate memory at java.lang.ProcessBuilder.start(ProcessBuilder.java:459) at java.lang.Runtime.exec(Runtime.java:593) Here is my server platform: OS: CentOS 5.2 x86_64 Memory: 8G Solr: 1.3 Any suggestion is appreciated. Thanks, Justin
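The error=12 from Runtime.exec in this situation is usually the JVM's fork() failing: to spawn snapshooter the kernel must momentarily be willing to commit a copy of the 5G heap's address space, and with 8G of RAM and little swap it refuses. Two commonly suggested workarounds, offered as things to verify on this CentOS box rather than a confirmed fix: add swap, or relax the kernel's overcommit policy.

cat /proc/sys/vm/overcommit_memory      # 0 = heuristic overcommit (the default)
sysctl -w vm.overcommit_memory=1        # always overcommit, so fork() of a large-heap JVM can succeed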
SpellCheckerRequestHandler and onlyMorePopular
I'm running the example Solr install with a custom schema.xml and solrconfig.xml. I'm seeing some unexpected results for searches using the SpellCheckerRequestHandler using the onlyMorePopular option. Namely, searching for certain terms with onlyMorePopular set to true returns a suggestion which, when searched for in turn itself, returns a suggestion back to the original term. For example, a query such as: http://localhost:8983/solr/select/? q=Eft&qt=spellchecker&onlyMorePopular=true returns: − 0 2 − oft And a query for http://localhost:8983/solr/select/? q=oft&qt=spellchecker&onlyMorePopular=true − 0 2 − Eft It seems that "onlyMorePopular" should be an asymmetric relation. I thought perhaps it might actually be implemented as a >= instead of a strict >, making it antisymmetric and perhaps explaining this result as a popularity tie. However, taking a clean copy of the example install, adding entries into the spellchecker.xml file, then inserting and rebuilding the index results in onlyMorePopular cross- recommendations as above even when I've created a clear popularity inequality between similar terms (e.g. adding two docs with word="blackkerry" makes it more popular than the existing "blackberry" doc, but each is suggested for the other). I checked the defaults list for the spellchecker requestHandler in my solrconfig.xml, and it didn't specify a value for onlyMorePopular. I added a default value of true and restarted Solr, but that has no effect. I've also tried using Luke to inspect the spell index, but I'm not sure exactly what to look for. I'd be more than happy to provide any details which might assist others in lending their expertise. Any insights would be very much appreciated. Thanks, Justin Knoll
Distribution without SSH?
Hello, I recently set up Solr with distribution on a couple of servers. I just learned that our network policies do not permit us to use SSH with passphraseless keys, and the snappuller script uses SSH to examine the master Solr instance's state before it pulls the newest index via rsync. We plan to attempt to rewrite the snappuller (and possibly other distribution scripts, as required) to eliminate this dependency on SSH. I thought I ask the list in case anyone has experience with this same situation or any insights into the reasoning behind requiring SSH access to the master instance. Thanks, Justin Knoll
RE: Commit performance problem
a script for posting large sets (23GB here) http://www.nabble.com/file/p15786630/post3.sh post3.sh
Solr 8.2 Cloud Replication Locked
Hi all, We are running Solr 8.2 Cloud in a cluster where we have a single TLOG replica per shard and multiple PULL replicas for each shard. We have noticed an issue recently where some of the PULL replicas stop replicating from the masters. The will have a replication which outputs: o.a.s.h.IndexFetcher Number of files in latest index in master: Then nothing else for IndexFetcher after that. I went onto a few instances and took a thread dump and we see the following where it seems to be locked getting the index write lock. I don’t see anything else in the thread dump indicating deadlock. Any ideas here? "indexFetcher-19-thread-1" #468 prio=5 os_prio=0 cpu=285847.01ms > elapsed=62993.13s tid=0x7fa8fc004800 nid=0x254 waiting on condition > [0x7ef584ede000] > java.lang.Thread.State: TIMED_WAITING (parking) > at jdk.internal.misc.Unsafe.park(java.base@11.0.6/Native Method) > - parking to wait for <0x0003aa5e4ad8> (a > java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) > at java.util.concurrent.locks.LockSupport.parkNanos(java.base@11.0.6 > /LockSupport.java:234) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireNanos(java.base@11.0.6 > /AbstractQueuedSynchronizer.java:980) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(java.base@11.0.6 > /AbstractQueuedSynchronizer.java:1288) > at > java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.tryLock(java.base@11.0.6 > /ReentrantReadWriteLock.java:1131) > at > org.apache.solr.update.DefaultSolrCoreState.lock(DefaultSolrCoreState.java:179) > at > org.apache.solr.update.DefaultSolrCoreState.closeIndexWriter(DefaultSolrCoreState.java:240) > at > org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:569) > at > org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:351) > at > org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:424) > at > org.apache.solr.handler.ReplicationHandler.lambda$setupPolling$13(ReplicationHandler.java:1193) > at > org.apache.solr.handler.ReplicationHandler$$Lambda$668/0x000800d0f440.run(Unknown > Source) > at java.util.concurrent.Executors$RunnableAdapter.call(java.base@11.0.6 > /Executors.java:515) > at java.util.concurrent.FutureTask.runAndReset(java.base@11.0.6 > /FutureTask.java:305) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(java.base@11.0.6 > /ScheduledThreadPoolExecutor.java:305) > at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.6 > /ThreadPoolExecutor.java:1128) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.6 > /ThreadPoolExecutor.java:628) > at java.lang.Thread.run(java.base@11.0.6/Thread.java:834)
Performance of /export requests
Hi, We are currently working on a project where we are making heavy use of /export in Solr in order to stream data back. We have an index with about 16 fields that are all docvalues fields and any number of them may be requested to be streamed in results. Our index has ~450 million documents spread across 10 shards. We are creating a CloudSolrStream and when we call CloudSolrStream.open() we see that call being slower than we had hoped. For some queries, that call can take 800 ms. What we found interesting was that doing the same request repeatedly resulted in the same time of 800 ms, which seems to indicate that /export does not take advantage of caching or there is something else at play. I’m starting to dig through the code to better understand, but I wanted to reach out to see what sort of expectations we should have here and if there is anything we can do to increase performance of these requests. We are currently using Solr 5, but we’ve also tried with Solr 7 and seen similar results. If I can provide any additional information, please let me know. Thank you! Justin
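For context on what that open() call does, here is a minimal SolrJ sketch of the pattern described. The zkHost, collection, and field names are placeholders, and the SolrParams-based constructor is the SolrJ 6+ form, so the Solr 5 code in use may differ slightly.

import java.io.IOException;
import org.apache.solr.client.solrj.io.Tuple;
import org.apache.solr.client.solrj.io.stream.CloudSolrStream;
import org.apache.solr.common.params.ModifiableSolrParams;

public class ExportStreamSketch {
  public static void main(String[] args) throws IOException {
    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set("q", "*:*");                  // match everything; real queries filter on tint/string fields
    params.set("qt", "/export");             // stream from the export handler rather than /select
    params.set("fl", "id,timestamp_l");      // docValues fields only
    params.set("sort", "timestamp_l asc");   // /export requires an explicit sort

    CloudSolrStream stream = new CloudSolrStream("zk1:2181,zk2:2181,zk3:2181/solr",
                                                 "mycollection", params);
    try {
      stream.open();                         // opens one stream per shard; the first tuples are computed here
      for (Tuple t = stream.read(); !t.EOF; t = stream.read()) {
        // consume t.getString("id"), t.getLong("timestamp_l"), ...
      }
    } finally {
      stream.close();
    }
  }
}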
Re: Performance of /export requests
Thanks for the quick response. We are generally seeing exports from Solr 5 and 7 to be roughly the same, but I’ll check out Solr 8. Joel - We are generally sorting a on tlong field and criteria can vary from searching everything (*:*) to searching on a combination of a few tint and string types. All of our 16 fields are docvalues. Is there any performance degradation as the number of docvalues fields increases or should that not have an impact? Also, is the 30k sliding window configurable? In many cases we are streaming back a few thousand, maybe up to 10k and then cutting off the stream. If we could configure the size of that window, could that speed things up some? Thanks again for the info. On Sat, May 11, 2019 at 2:38 PM Joel Bernstein wrote: > Can you share the sort criteria and search query? The main strategy for > improving performance of the export handler is adding more shards. This is > different than with typical distributed search, where deep paging issues > get worse as you add more shards. With the export handler if you double the > shards you double the pushing power. There are no deep paging drawbacks to > adding more shards. > > On Sat, May 11, 2019 at 2:17 PM Toke Eskildsen wrote: > > > Justin Sweeney wrote: > > > > [Index: 10 shards, 450M docs] > > > > > We are creating a CloudSolrStream and when we call > CloudSolrStream.open() > > > we see that call being slower than we had hoped. For some queries, that > > > call can take 800 ms. [...] > > > > As far as I can see in the code, CloudSolrStream.open() opens streams > > against the relevant shards and checks if there is a result. The last > step > > is important as that means the first batch of tuples must be calculated > in > > the shards. Streaming works internally by having a sliding window of 30K > > tuples through the result set in each shard, so open() results in (up to) > > 30K tuples being calculated. On the other hand, getting the first 30K > > tuples should be very fast after open(). > > > > > We are currently using Solr 5, but we’ve also tried with Solr 7 and > seen > > > similar results. > > > > Solr 7 has a performance regression for export (or rather a regression > for > > DocValues that is very visible when using export. See > > https://issues.apache.org/jira/browse/SOLR-13013), so I would expect it > > to be slower than Solr 5. You could try with Solr 8 where this regression > > should be mitigated somewhat. > > > > - Toke Eskildsen > > >
Solr Slack Workspace
Hi all, I did some googling and didn't find anything, but is there a Slack workspace for Solr? I think this could be useful to expand interaction within the community of Solr users and connect people solving similar problems. I'd be happy to get this setup if it does not exist already. Justin
Re: Solr Slack Workspace
Thanks, I joined the Relevance Slack: https://opensourceconnections.com/slack, I definitely think a dedicated Solr workspace would also be good allowing for channels to get involved with development as well as user based questions. It does seem like slack has made it increasingly difficult to create open workspaces and not force someone to approve or only allow specific email domains. Has anyone tried to do that recently? I tried for an hour or so last weekend and it seemed to not be very straightforward anymore. On Tue, Jan 26, 2021 at 12:57 PM Houston Putman wrote: > There is https://solr-dev.slack.com > > It's not really used, but it's there and we can open it up for people to > join and start using. > > On Tue, Jan 26, 2021 at 5:38 AM Ishan Chattopadhyaya < > ichattopadhy...@gmail.com> wrote: > > > Thanks ufuk. I'll take a look. > > > > On Tue, 26 Jan, 2021, 4:05 pm ufuk yılmaz, > > wrote: > > > > > It’s asking for a searchscale.com email address? > > > > > > Sent from Mail for Windows 10 > > > > > > From: Ishan Chattopadhyaya > > > Sent: 26 January 2021 13:33 > > > To: solr-user > > > Subject: Re: Solr Slack Workspace > > > > > > There is a Slack backed by official IRC support. Please see > > > https://lucene.472066.n3.nabble.com/Solr-Users-Slack-td4466856.html > for > > > details on how to join it. > > > > > > On Tue, 19 Jan, 2021, 2:54 pm Charlie Hull, < > > > ch...@opensourceconnections.com> > > > wrote: > > > > > > > Relevance Slack is open to anyone working on search & relevance - > #solr > > > is > > > > only one of the channels, there's lots more! Hope to see you there. > > > > > > > > Cheers > > > > > > > > Charlie > > > > https://opensourceconnections.com/slack > > > > > > > > > > > > On 16/01/2021 02:18, matthew sporleder wrote: > > > > > IRC has kind of died off, > > > > > https://lucene.apache.org/solr/community.html has a slack > mentioned, > > > > > I'm on https://opensourceconnections.com/slack after taking their > > solr > > > > > training class and assume it's mostly open to solr community. > > > > > > > > > > On Fri, Jan 15, 2021 at 8:10 PM Justin Sweeney > > > > > wrote: > > > > >> Hi all, > > > > >> > > > > >> I did some googling and didn't find anything, but is there a Slack > > > > >> workspace for Solr? I think this could be useful to expand > > interaction > > > > >> within the community of Solr users and connect people solving > > similar > > > > >> problems. > > > > >> > > > > >> I'd be happy to get this setup if it does not exist already. > > > > >> > > > > >> Justin > > > > > > > > > > > > -- > > > > Charlie Hull - Managing Consultant at OpenSource Connections Limited > > > > > > > > Founding member of The Search Network <https://thesearchnetwork.com/ > > > > > > and co-author of Searching the Enterprise > > > > <https://opensourceconnections.com/about-us/books-resources/> > > > > tel/fax: +44 (0)8700 118334 > > > > mobile: +44 (0)7767 825828 > > > > > > > > > > > > >
Re: Solr Slack Workspace
Worked for me and a few others, thanks for doing that! On Tue, Feb 2, 2021 at 5:04 AM Ishan Chattopadhyaya < ichattopadhy...@gmail.com> wrote: > Hi all, > I've created an invite link for the Slack workspace: > https://s.apache.org/solr-slack. > Please test it out. I'll send a broader notification once this is tested > out to be working well. > Thanks and regards, > Ishan > > On Thu, Jan 28, 2021 at 12:26 AM Justin Sweeney < > justin.sweene...@gmail.com> > wrote: > > > Thanks, I joined the Relevance Slack: > > https://opensourceconnections.com/slack, I definitely think a dedicated > > Solr workspace would also be good allowing for channels to get involved > > with development as well as user based questions. > > > > It does seem like slack has made it increasingly difficult to create open > > workspaces and not force someone to approve or only allow specific email > > domains. Has anyone tried to do that recently? I tried for an hour or so > > last weekend and it seemed to not be very straightforward anymore. > > > > On Tue, Jan 26, 2021 at 12:57 PM Houston Putman > > > wrote: > > > > > There is https://solr-dev.slack.com > > > > > > It's not really used, but it's there and we can open it up for people > to > > > join and start using. > > > > > > On Tue, Jan 26, 2021 at 5:38 AM Ishan Chattopadhyaya < > > > ichattopadhy...@gmail.com> wrote: > > > > > > > Thanks ufuk. I'll take a look. > > > > > > > > On Tue, 26 Jan, 2021, 4:05 pm ufuk yılmaz, > > > > > > > wrote: > > > > > > > > > It’s asking for a searchscale.com email address? > > > > > > > > > > Sent from Mail for Windows 10 > > > > > > > > > > From: Ishan Chattopadhyaya > > > > > Sent: 26 January 2021 13:33 > > > > > To: solr-user > > > > > Subject: Re: Solr Slack Workspace > > > > > > > > > > There is a Slack backed by official IRC support. Please see > > > > > > https://lucene.472066.n3.nabble.com/Solr-Users-Slack-td4466856.html > > > for > > > > > details on how to join it. > > > > > > > > > > On Tue, 19 Jan, 2021, 2:54 pm Charlie Hull, < > > > > > ch...@opensourceconnections.com> > > > > > wrote: > > > > > > > > > > > Relevance Slack is open to anyone working on search & relevance - > > > #solr > > > > > is > > > > > > only one of the channels, there's lots more! Hope to see you > there. > > > > > > > > > > > > Cheers > > > > > > > > > > > > Charlie > > > > > > https://opensourceconnections.com/slack > > > > > > > > > > > > > > > > > > On 16/01/2021 02:18, matthew sporleder wrote: > > > > > > > IRC has kind of died off, > > > > > > > https://lucene.apache.org/solr/community.html has a slack > > > mentioned, > > > > > > > I'm on https://opensourceconnections.com/slack after taking > > their > > > > solr > > > > > > > training class and assume it's mostly open to solr community. > > > > > > > > > > > > > > On Fri, Jan 15, 2021 at 8:10 PM Justin Sweeney > > > > > > > wrote: > > > > > > >> Hi all, > > > > > > >> > > > > > > >> I did some googling and didn't find anything, but is there a > > Slack > > > > > > >> workspace for Solr? I think this could be useful to expand > > > > interaction > > > > > > >> within the community of Solr users and connect people solving > > > > similar > > > > > > >> problems. > > > > > > >> > > > > > > >> I'd be happy to get this setup if it does not exist already. 
> > > > > > >> > > > > > > >> Justin > > > > > > > > > > > > > > > > > > -- > > > > > > Charlie Hull - Managing Consultant at OpenSource Connections > > Limited > > > > > > > > > > > > Founding member of The Search Network < > > https://thesearchnetwork.com/ > > > > > > > > > > and co-author of Searching the Enterprise > > > > > > <https://opensourceconnections.com/about-us/books-resources/> > > > > > > tel/fax: +44 (0)8700 118334 > > > > > > mobile: +44 (0)7767 825828 > > > > > > > > > > > > > > > > > > > > > > > > > >