Re: Negative CDCR Queue Size?
Hi Webster,

A queue size of "-1" means the log reader for that target was never initialised; you should see a WARN in the logs indicating that something went wrong for that target. I am also posting the relevant source code for reference. Any chance you can look for that WARN in the logs, and check at the respective source and target that CDCR is configured and was running OK, without any manual intervention? Also, you mentioned there are a number of intermittent issues with CDCR, and I see you have reported a few JIRAs. I would be grateful if you could report the rest as well.

Code:

for (CdcrReplicatorState state : replicatorManager.getReplicatorStates()) {
  NamedList queueStats = new NamedList();
  CdcrUpdateLog.CdcrLogReader logReader = state.getLogReader();
  if (logReader == null) {
    String collectionName =
        req.getCore().getCoreDescriptor().getCloudDescriptor().getCollectionName();
    String shard =
        req.getCore().getCoreDescriptor().getCloudDescriptor().getShardId();
    log.warn("The log reader for target collection {} is not initialised @ {}:{}",
        state.getTargetCollection(), collectionName, shard);
    queueStats.add(CdcrParams.QUEUE_SIZE, -1l);
  } else {
    queueStats.add(CdcrParams.QUEUE_SIZE, logReader.getNumberOfRemainingRecords());
  }
  queueStats.add(CdcrParams.LAST_TIMESTAMP, state.getTimestampOfLastProcessedOperation());
  if (hosts.get(state.getZkHost()) == null) {
    hosts.add(state.getZkHost(), new NamedList());
  }
  ((NamedList) hosts.get(state.getZkHost())).add(state.getTargetCollection(), queueStats);
}
rsp.add(CdcrParams.QUEUES, hosts);

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2
Medium: https://medium.com/@sarkaramrit2

On Wed, Nov 7, 2018 at 12:47 AM Webster Homer <webster.ho...@milliporesigma.com> wrote:
> I'm sorry, I should have included that. We are running Solr 7.2. We use
> CDCR for almost all of our collections. We have experienced several
> intermittent problems with CDCR; this one seems to be new, at least I
> hadn't seen it before.
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Tuesday, November 06, 2018 12:36 PM
> To: solr-user
> Subject: Re: Negative CDCR Queue Size?
>
> What version of Solr? CDCR has changed quite a bit in the 7x code line so
> it's important to know the version.
>
> On Tue, Nov 6, 2018 at 10:32 AM Webster Homer <webster.ho...@milliporesigma.com> wrote:
> >
> > Several times I have noticed that the CDCR action=QUEUES will return a
> > negative queueSize. When this happens we seem to be missing data in the
> > target collection. How can this happen? What does a negative queue size
> > mean? The timestamp is an empty string.
> >
> > We have two targets for a source. One looks like this, with a negative
> > queue size:
> > "queues":
> > ["uc1f-ecom-mzk01.sial.com:2181,uc1f-ecom-mzk02.sial.com:2181,uc1f-ecom-mzk03.sial.com:2181/solr",
> >  ["ucb-catalog-material-180317",["queueSize",-1,"lastTimestamp",""]]],
> >
> > The other is healthy:
> > "ae1b-ecom-mzk01.sial.com:2181,ae1b-ecom-mzk02.sial.com:2181,ae1b-ecom-mzk03.sial.com:2181/solr",
> >  ["ucb-catalog-material-180317",["queueSize",246980,"lastTimestamp","2018-11-06T16:21:53.265Z"]]
> >
> > We are not seeing CDCR errors.
> >
> > What could cause this behavior?
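Not Solr API, just an illustration for anyone monitoring this: a minimal sketch of how a health check might interpret the queueSize values returned by action=QUEUES, following the logReader == null branch in the source above. The classify helper and its messages are my own, not part of Solr:

```java
// Sketch: interpreting the queueSize value from /cdcr?action=QUEUES.
// Per the Solr source quoted above, -1 is a sentinel meaning the
// CdcrLogReader for that target was never initialised -- it does NOT
// mean "empty queue" or an arithmetic underflow.
public class CdcrQueueCheck {

    // Hypothetical monitoring helper: turn a raw queueSize into a diagnosis.
    public static String classify(long queueSize) {
        if (queueSize < 0) {
            // Matches the logReader == null branch: look for the
            // "log reader for target collection ... is not initialised" WARN
            // in the source cluster's logs.
            return "log reader not initialised - check source logs for WARN";
        }
        if (queueSize == 0) {
            return "queue drained";
        }
        return queueSize + " updates still queued for the target";
    }

    public static void main(String[] args) {
        // The two readings from Webster's QUEUES output:
        System.out.println(classify(-1L));     // the unhealthy target
        System.out.println(classify(246980L)); // the healthy target
    }
}
```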
Re: Master Slave Replication Issue
Hi,

We have switched from 5.4 to 7.2.1 and have started to see more issues with replication. I think it may be related to the fact that a delta-import was started during a full-import (this was not the case for Solr 5.4).

I am getting the error below:

XXX: java.lang.IllegalArgumentException:java.lang.IllegalArgumentException: Directory MMapDirectory@XXX\index lockFactory=org.apache.lucene.store.NativeFSLockFactory@21ff4974 still has pending deleted files; cannot initialize IndexWriter

Are there more known issues with replication on Solr 7.X? Based on https://issues.apache.org/jira/browse/SOLR-11938 I can no longer trust Solr 7.X.

How can I fix the "still has pending deleted files; cannot initialize IndexWriter" issue?

Thank you
Damian

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Sql server data import
Hello, I managed to set up a connection to my SQL Server to import data into Solr. The idea is to import FileTables, but for now I first want to get it working using regular tables. So I created *data-config.xml* and *schema.xml*, where I added field definitions and changed the uniqueKey entry to Id.

When I want to import my data (which is just data like Id: 5, PublicId: "test"), I get the following error in the logging:

Error creating document : SolrInputDocument(fields: [PublicId=10065, Id=117])

I have tried all sorts of things but can't get it fixed. Would anyone be willing to give me a hand?

Thanks in advance!

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: Sql server data import
Which version of Solr is it? Solr has not shipped with schema.xml for a very long time; it has been managed-schema instead.

Also, have you tried taking the DIH example that uses a database and modifying it just enough to read data from your database? Even if it has a lot of extra junk, this would test half of the pipeline, which you can then transfer to the clean setup.

Regards,
Alex.

On Fri, 9 Nov 2018 at 08:09, Verthosa wrote:
> Hello, i managed to set up a connection to my sql server to import data into
> Solr. The idea is to import filetables but for now i first want to get it
> working using regular tables. So i created
>
> *data-config.xml*
>
> driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
> url="jdbc:sqlserver://localhost;databaseName=inConnexion_Tenant2;integratedSecurity=true"
> />
>
> *schema.xml*
> i added
> multiValued="false" />
> multiValued="false"/>
>
> and changed uniqueKey entry to
> Id
>
> When i want to import my data (which is just data like Id: 5, PublicId:
> "test"), i get the following error in the logging.
>
> Error creating document : SolrInputDocument(fields: [PublicId=10065,
> Id=117])
>
> I tried all sorts of things but can't get it fixed. Is anyone want to give
> me a hand?
>
> thanks in advance!
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
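For reference, a complete data-config.xml along the lines of the fragments quoted above (the archive stripped the XML tags) would look something like this. The driver and url are from the original message; the entity name and query are assumptions:

```xml
<!-- Sketch of a minimal DIH data-config.xml; entity name and SQL query
     are illustrative, the dataSource attributes are from the thread. -->
<dataConfig>
  <dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://localhost;databaseName=inConnexion_Tenant2;integratedSecurity=true"/>
  <document>
    <entity name="item" query="SELECT Id, PublicId FROM SomeTable">
      <field column="Id" name="Id"/>
      <field column="PublicId" name="PublicId"/>
    </entity>
  </document>
</dataConfig>
```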
RE: Sql server data import
What is "" in the PublicId? Is it part of the data? Did you check if the special characters in your data cause the problem? Steve ### Error creating document : SolrInputDocument(fields: [PublicId=10065, Id=117]) -Original Message- From: Verthosa Sent: Friday, November 9, 2018 7:51 AM To: solr-user@lucene.apache.org Subject: Sql server data import Hello, i managed to set up a connection to my sql server to import data into Solr. The idea is to import filetables but for now i first want to get it working using regular tables. So i created *data-config.xml* *schema.xml* i added and changed uniqueKey entry to Id When i want to import my data (which is just data like Id: 5, PublicId: "test"), i get the following error in the logging. Error creating document : SolrInputDocument(fields: [PublicId=10065, Id=117]) I tried all sorts of things but can't get it fixed. Is anyone want to give me a hand? thanks in advance! -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: Disabling jvm properties from ui
Yes, it is important to understand that only trusted clients and persons should be given access to Solr's port. But it may still be surprising to users that e.g. passwords to a DB or an SSL keystore are available over HTTP when there is no need for them on the client side. I'm not saying it is a bug, but it may be surprising. So I think we should continue step by step to address these and have Solr behave according to the principle of least surprise; hence the discussion in https://issues.apache.org/jira/browse/SOLR-12976

After locking down secrets as well as possible, the next logical step would be to couple this to Solr's authentication/authorization feature, so that if a client has a role with the read/edit securityconfig permission, she could be allowed to see those properties. So far authorization is true/false based on handler/HTTP method, meaning we'd have to add a new /solr/admin/info/system/secrets/ handler which could return those hidden props. But there may not be a need to retrieve these at the API level at all.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> 8. nov. 2018 kl. 19:54 skrev Gus Heck :
>
> That's an interesting feature, and it addresses X, but there are lots of
> ways to discover system properties. In a managed schema, enter a field name
> ${java.version} and you'll get a field named 1.8.0_144 (or whatever). I
> still think it's important to address Y: they are trying to hide the system
> properties from someone they have placed their trust in already.
>
> On Thu, Nov 8, 2018 at 1:16 PM Jan Høydahl wrote:
>
>> It's not documented in the Ref Guide, but you can set this system property
>> to fix it:
>>
>> SOLR_OPTS="-Dsolr.redaction.system.pattern=(.*password.*|.*your-own-regex.*)"
>>
>> Then the property will show as --REDACTED-- in the UI.
>> >> Note that the property still will leak through /solr/admin/metrics and you >> need to add the same exclusion in solr.xml, see >> https://lucene.apache.org/solr/guide/7_5/metrics-reporting.html#the-metrics-hiddensysprops-element >> >> -- >> Jan Høydahl, search solution architect >> Cominvent AS - www.cominvent.com >> >>> 7. nov. 2018 kl. 20:51 skrev Naveen M : >>> >>> Hi, >>> >>> Is there a way to disable jvm properties from the solr UI. >>> >>> It has some information which we don’t want to expose. Any pointers would >>> be helpful. >>> >>> >>> Thanks >> >> > > -- > http://www.the111shift.com
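For the archives, the solr.xml exclusion Jan mentions would look something like the fragment below. The entries shown are the defaults listed in the metrics documentation; extend the list with your own secret property names (e.g. a DB password property):

```xml
<!-- In solr.xml: hide matching system properties from /solr/admin/metrics.
     These entries are illustrative defaults; add your own property names. -->
<metrics>
  <hiddenSysProps>
    <str>javax.net.ssl.keyStorePassword</str>
    <str>javax.net.ssl.trustStorePassword</str>
    <str>basicauth</str>
    <str>zkDigestPassword</str>
    <str>zkDigestReadonlyPassword</str>
  </hiddenSysProps>
</metrics>
```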
Re: Want to subscribe to this list
Hi Michela, For subscription info see: http://lucene.apache.org/solr/community.html#mailing-lists-irc I'm not aware of any Slack discussion groups, but there are two freenode.net IRC channels - see: http://lucene.apache.org/solr/community.html#irc Steve > On Nov 8, 2018, at 10:42 AM, Michela Dennis wrote: > > Do you by any chance have a slack discussion group as well? > > Michela Dennis
Re: Master Slave Replication Issue
Damian: You say you've switched from 5x to 7x. Did you try to use an index created with 5x or did you index fresh with 7x? Solr/Lucene do not guarantee backward compatibility across more than one major version. Best, Erick On Fri, Nov 9, 2018 at 2:34 AM damian.pawski wrote: > > Hi, > We have switched from 5.4 to 7.2.1 and we have started to see more issues > with the replication. > I think it may be related to the fact that a delta import was started during > a full import (not the case for the Solr 5.4). > > I am getting below error: > > XXX: java.lang.IllegalArgumentException:java.lang.IllegalArgumentException: > Directory MMapDirectory@XXX\index > lockFactory=org.apache.lucene.store.NativeFSLockFactory@21ff4974 still has > pending deleted files; cannot initialize IndexWriter > > Are there more known issues with Solr 7.X and the replication? > Based on https://issues.apache.org/jira/browse/SOLR-11938 I can not trust > Solr 7.X anymore. > > How can I fix the > "XXX: java.lang.IllegalArgumentException:java.lang.IllegalArgumentException: > Directory MMapDirectory@XXX\index > lockFactory=org.apache.lucene.store.NativeFSLockFactory@21ff4974 still has > pending deleted files; cannot initialize IndexWriter > " > issue? > > Thank you > Damian > > > > -- > Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: Sql server data import
Hello, I managed to fix the problem. I'm using Solr 7.5.0. My problem was that in the server logs I got "This IndexSchema is not mutable" (I did not know about the logs folder, so I only just found that out 5 minutes ago). I fixed it by modifying solrconfig.xml to:

<updateRequestProcessorChain name="add-unknown-fields-to-the-schema"
    default="${update.autoCreateFields:false}"
    processor="uuid,remove-blank,field-name-mutating,parse-boolean,parse-long,parse-double,parse-date,add-schema-fields">
  ...
</updateRequestProcessorChain>

Since then the indexing is done correctly. I even got the blob field indexing working now! Thanks for your reply; everything is fixed for now.

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: Sql server data import
Ok, what that means is you're letting Solr do its best to figure out what fields you should have in the schema and how they're defined. Almost invariably, you can do better by explicitly defining the fields you need in your schema rather than enabling add-unknown. It's fine for getting started, but not advised for production. Best, Erick On Fri, Nov 9, 2018 at 7:52 AM Verthosa wrote: > > Hello, i managed to fix the problem. I'm using Solr 7.5.0. My problem was > that in the server logs i got "This Indexschema is not mutable" (i did not > know about the logs folder, so i just found out 5 minutes ago). I fixed it > by modifying solrconfig.xml to > > name="add-unknown-fields-to-the-schema" > default="${update.autoCreateFields:false*}" > > processor="uuid,remove-blank,field-name-mutating,parse-boolean,parse-long,parse-double,parse-date,add-schema-fields"> > > > > > > Since then the indexing is done correctly. I even got the blob fields > indexation working now ! Thanks for your reply, everything is fixed for now. > > > > > -- > Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
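Erick's suggestion, sketched concretely: with only the two columns mentioned in this thread (Id and PublicId; the "string" types below are an assumption), the explicit definitions in managed-schema would look something like:

```xml
<!-- Explicit fields instead of relying on add-unknown-fields.
     Field names are from this thread; the types are an assumption. -->
<field name="Id" type="string" indexed="true" stored="true" multiValued="false"/>
<field name="PublicId" type="string" indexed="true" stored="true" multiValued="false"/>
<uniqueKey>Id</uniqueKey>
```

With the fields declared up front, update.autoCreateFields can stay false and the schema no longer needs to be mutable.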
Indexing vs Search node
Hi guys, I read in several blog posts that it's never a good idea to index and search on the same node. I wonder how that can be achieved in Solr Cloud or if it happens automatically. -- Fernando Otero Sr Engineering Manager, Panamera Buenos Aires - Argentina Mobile: +54 911 67697108 Email: fernando.ot...@olx.com
Re: Indexing vs Search node
On 11/9/2018 12:13 PM, Fernando Otero wrote:
> I read in several blog posts that it's never a good idea to index and
> search on the same node. I wonder how that can be achieved in Solr Cloud or
> if it happens automatically.

I would disagree with that blanket assertion.

Indexing does put extra load on a server that can interfere with query performance. Whether that will be a real problem pretty much depends on exactly how much indexing you're doing, and what kind of query load you need to handle. For extreme scaling, it can be a good idea to separate indexing and searching.

With a master/slave architecture, any version of Solr can separate indexing and querying.

Before 7.x, it wasn't possible to separate indexing and querying with SolrCloud: ALL replicas did the same indexing. With 7.x, that's still the default behavior, but 7.x has new replica types that make it possible for indexing to only take place on shard leaders. The latest version of Solr 7.x has a way to prefer certain replica types, which is how the separation can be achieved.

Thanks,
Shawn
Re: Indexing vs Search node
Fernando:

I'd phrase it more strongly than Shawn. Prior to 7.0, all replicas both indexed and searched (they were NRT replicas), so there wasn't any choice but to index and search on every replica.

It's one of those things where, if you have a very high-throughput (indexing) situation, you _might_ want to use TLOG and/or PULL replicas.

But TANSTAAFL (There Ain't No Such Thing As A Free Lunch). TLOG/PULL replicas copy index segments around, which may be up to 5G each (the default TieredMergePolicy cap on individual segment sizes), whereas NRT replicas just get the raw document.

So in the TLOG/PULL situations you'll get bursts of network traffic, but each replica has less CPU load because all the replicas but one for each shard do not have to index the doc.

In the NRT case, the raw documents are forwarded, so the network is less bursty, but all of the replicas spend CPU cycles indexing.

So I wouldn't worry about it unless you're running into performance problems; _then_ I'd investigate TLOG/PULL replicas.

Best,
Erick

On Fri, Nov 9, 2018 at 11:37 AM Shawn Heisey wrote:
>
> On 11/9/2018 12:13 PM, Fernando Otero wrote:
> > I read in several blog posts that it's never a good idea to index and
> > search on the same node. I wonder how that can be achieved in Solr Cloud or
> > if it happens automatically.
>
> I would disagree with that blanket assertion.
>
> Indexing does put extra load on a server that can interfere with query
> performance. Whether that will be a real problem pretty much depends on
> exactly how much indexing you're doing, and what kind of query load you
> need to handle. For extreme scaling, it can be a good idea to separate
> indexing and searching.
>
> With a master/slave architecture, any version of Solr can separate
> indexing and querying.
>
> Before 7.x, it wasn't possible to separate indexing and querying with
> SolrCloud. With previous major versions, ALL replicas do the same
> indexing. With 7.x, that's still the default behavior, but 7.x has new
> replica types that make it possible for indexing to only take place on
> shard leaders. The latest version of Solr 7.x has a way to prefer
> certain replica types, which is how the separation can be achieved.
>
> Thanks,
> Shawn
Re: Indexing vs Search node
I personally like standalone Solr for this reason: I can tune the indexing "master" to do nothing but take in documents, and that way the slaves don't battle for resources in the process.

On Fri, Nov 9, 2018 at 3:10 PM Erick Erickson wrote:
> Fernando:
>
> I'd phrase it more strongly than Shawn. Prior to 7.0
> all replicas both indexed and search (they were NRT replica),
> so there wasn't any choice but to index and search on
> every replica.
>
> It's one of those things that if you have very high
> throughput (indexing) situations, you _might_
> want to use TLOG and/or PULL replicas.
>
> But TANSTAAFL (There Ain't No Such Thing As A Free Lunch).
> TLOG/PULL replicas copy index segments around, which
> may be up to 5G each (default TieredMergePolicy cap on individual
> segment sizes), whereas NRT replicas just get the raw document.
>
> So in the TLOG/PULL situations, you'll get bursts of network traffic
> but each replica has less CPU load because all the replicas but one
> for each shard do not have to index the doc.
>
> In the NRT case, the raw documents are forwarded so the
> network is less bursty, but all of the replicas spend CPU
> cycles indexing.
>
> So I wouldn't worry about it unless you running into performance
> problems, _then_ I'd investigate TLOG/PULL replicas.
>
> Best,
> Erick
>
> On Fri, Nov 9, 2018 at 11:37 AM Shawn Heisey wrote:
> >
> > On 11/9/2018 12:13 PM, Fernando Otero wrote:
> > > I read in several blog posts that it's never a good idea to index and
> > > search on the same node. I wonder how that can be achieved in Solr Cloud or
> > > if it happens automatically.
> >
> > I would disagree with that blanket assertion.
> >
> > Indexing does put extra load on a server that can interfere with query
> > performance. Whether that will be a real problem pretty much depends on
> > exactly how much indexing you're doing, and what kind of query load you
> > need to handle. For extreme scaling, it can be a good idea to separate
> > indexing and searching.
> >
> > With a master/slave architecture, any version of Solr can separate
> > indexing and querying.
> >
> > Before 7.x, it wasn't possible to separate indexing and querying with
> > SolrCloud. With previous major versions, ALL replicas do the same
> > indexing. With 7.x, that's still the default behavior, but 7.x has new
> > replica types that make it possible for indexing to only take place on
> > shard leaders. The latest version of Solr 7.x has a way to prefer
> > certain replica types, which is how the separation can be achieved.
> >
> > Thanks,
> > Shawn
Re: Indexing vs Search node
On 11/9/2018 1:58 PM, David Hastings wrote:
> I personally like standalone solr for this reason, i can tune the indexing
> "master" for doing nothing but taking in documents and that way the slaves
> dont battle for resources in the process.

SolrCloud can be set up pretty similarly if you're running 7.5. You set things up so each collection has two TLOG replicas and the rest of them are PULL.

SolrCloud doesn't have master and slave in the same way as the old architecture. There are no single points of failure if the hardware is set up correctly. But because PULL replicas cannot become leader, they are a lot like slaves.

Solr 7.5 and later can configure a preference for different replica types at query time. So with the setup described above, you tell it to prefer PULL replicas. If all the PULL replicas were to die, then SolrCloud would use whatever is left.

Let's say that you set up a collection so it has two TLOG replicas and four PULL replicas. You could have the TLOG replicas live on a pair of servers with SSD drives and less memory than the other four servers that have PULL replicas, which could be running standard hard drives. Queries love memory; indexing loves fast disks. The preference that indicates PULL replicas would keep queries running only on the four machines with more memory.

The reason you want two TLOG replicas instead of one is so that if the current leader dies, another TLOG replica is available to become leader.

Thanks,
Shawn
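The setup Shawn describes maps onto two HTTP calls, sketched below. The collection name and host are placeholders; the tlogReplicas/pullReplicas parameters are from the Solr 7.x Collections API and shards.preference from the 7.4+ query documentation:

```
# Create a collection with two TLOG and four PULL replicas per shard
# (and no NRT replicas):
http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection
    &numShards=1&nrtReplicas=0&tlogReplicas=2&pullReplicas=4

# At query time, prefer PULL replicas so searches stay off the leader nodes:
http://localhost:8983/solr/mycollection/select?q=*:*
    &shards.preference=replica.type:PULL
```

Note that shards.preference is only a preference: if the PULL replicas are unavailable, the query falls back to whatever replicas remain, matching Shawn's description.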