Solr REST API for monitoring
Hi all: I am using CDH 5.13 with Solr 4.10. I am trying to automate metrics gathering for the JVM (CPU, RAM, storage, etc.) by calling the REST APIs described here -> https://lucene.apache.org/solr/guide/6_6/metrics-reporting.html. Are these not supported in my version of Solr? If not, what options do I have?

I tried calling this:

http://hadoop-nn2.esolocal.com:8983/solr/admin/metrics?wt=json&type=counter&group=core

and received "404 - request not available". Are any configuration changes needed?

Thanks,
Abhi
-- Abhi Basu
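A sketch of the request being attempted, composed from the message above. Note the hedge: the linked guide documents Solr 6.6, so if the /admin/metrics endpoint does not exist in the 4.10 release actually running, a 404 is the expected outcome rather than a configuration problem.

```shell
# Sketch only: hostname taken from this thread; substitute your own Solr node.
SOLR_NODE="hadoop-nn2.esolocal.com:8983"
# Compose the metrics request from the v6.6 guide. On 4.10 this endpoint
# may simply not exist, which would explain the 404 response.
METRICS_URL="http://${SOLR_NODE}/solr/admin/metrics?wt=json&type=counter&group=core"
echo "$METRICS_URL"
# To issue the request for real:  curl -s "$METRICS_URL"
```

Piping the URL to curl on a cron schedule would be one way to automate the gathering, assuming the endpoint responds.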
Synonym not working in 4.10 / CDH 5.14
Can someone please help me?

Schema.xml:

<field name="PropertyAddressState" type="string" indexed="true" stored="true" docValues="true"/>
<field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>
<copyField source="PropertyAddressState" dest="text"/>

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" tokenizerFactory="solr.StandardTokenizerFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
  </analyzer>
</fieldType>

Synonyms.txt has been populated with State abbreviations and names.

When searching for PropertyAddressState:"Oregon", I do not find docs with "OR".

What am I missing?

Thanks,
Abhi
Re: Synonym not working in 4.10 / CDH 5.14
Yes, I have tested with PA and NY; nothing works.

On Thu, Mar 1, 2018 at 11:38 AM, Alessandro Hoss wrote:
> Have you tested with another state?
>
> I'm asking because maybe solr is considering "OR" as a clause separator
> instead of a search term, and in this case the problem is not with synonym,
> it is with your query.
>
> On Thu, Mar 1, 2018 at 2:24 PM Abhi Basu <9000r...@gmail.com> wrote:
> > Can someone please help me?
> >
> > [...]

-- Abhi Basu
Re: Synonym not working in 4.10 / CDH 5.14
I am testing the index analyzer first. Do I need to turn on the query analyzer too?

synonyms.txt:

Alabama, AL
Alaska, AK
Arizona, AZ
Arkansas, AR
California, CA
Colorado, CO
Connecticut, CT
Delaware, DE
Florida, FL
Georgia, GA
Hawaii, HI
Idaho, ID
Illinois, IL
Indiana, IN
Iowa, IA
etc ...

On Thu, Mar 1, 2018 at 12:27 PM, Alessandro Hoss wrote:
> How's your synonyms declared in the file?
>
> That xml comment (<!-- ... -->) in the synonym filter section isn't there in
> your running solr schema.xml, right? :)
>
> On Thu, Mar 1, 2018 at 2:53 PM Abhi Basu <9000r...@gmail.com> wrote:
> > Yes have tested with PA and NY, nothing works.
> >
> > [...]

-- Abhi Basu
Re: Synonym not working in 4.10 / CDH 5.14
What should the PropertyAddressState type be in order to be caught by the text_general config? I have removed the copyField now.

On Thu, Mar 1, 2018 at 1:12 PM, Steve Rowe wrote:
> Hi Abhi,
>
> PropertyAddressState is of type “string”, which has no analysis applied.
>
> Since you copyField to the “text” field, which has the analysis you expect,
> you could try querying it instead.
>
> --
> Steve
> www.lucidworks.com
>
> > On Mar 1, 2018, at 12:23 PM, Abhi Basu <9000r...@gmail.com> wrote:
> >
> > Can someone please help me?
> >
> > [...]

-- Abhi Basu
Re: Synonym not working in 4.10 / CDH 5.14
Should it be defined as this instead?

<field name="PropertyAddressState" type="text_general" indexed="true" stored="true" docValues="true"/>

On Thu, Mar 1, 2018 at 1:16 PM, Abhi Basu <9000r...@gmail.com> wrote:
> What should the PropertyAddressState type be in order to be caught by the
> text_general config? I have removed the copyField now.
>
> [...]
>
> On Thu, Mar 1, 2018 at 1:12 PM, Steve Rowe wrote:
> > Hi Abhi,
> >
> > PropertyAddressState is of type “string”, which has no analysis applied.
> >
> > [...]

-- Abhi Basu
Re: Synonym not working in 4.10 / CDH 5.14
Yes, agreed. Just tested and it works. :)

I will have a lot more fields, so every field I need a synonym feature for will have to be of type "text_general", right?

On Thu, Mar 1, 2018 at 1:57 PM, Steve Rowe wrote:
> I think you want type=“text_general”
>
> --
> Steve
> www.lucidworks.com
>
> > On Mar 1, 2018, at 2:19 PM, Abhi Basu <9000r...@gmail.com> wrote:
> >
> > Should it be defined as this instead?
> >
> > [...]

-- Abhi Basu
Re: Synonym not working in 4.10 / CDH 5.14
Thanks for your help.

Abhi

On Thu, Mar 1, 2018 at 2:06 PM, Steve Rowe wrote:
> Yes, either type “text_general” or some other TextField-based field type
> that includes a synonym filter.
>
> --
> Steve
> www.lucidworks.com
>
> > On Mar 1, 2018, at 3:02 PM, Abhi Basu <9000r...@gmail.com> wrote:
> >
> > Yes, agreed. Just tested and it works. :)
> >
> > [...]

-- Abhi Basu
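For reference, a sketch of the configuration this thread converged on: the field typed as text_general (or any other TextField-based type) with a SynonymFilter in the query analyzer. Attribute details beyond those quoted in the thread are assumptions, not the poster's exact schema.

```xml
<!-- Sketch only: attributes not quoted in the thread are assumed. -->
<field name="PropertyAddressState" type="text_general" indexed="true" stored="true"/>

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <!-- expand="true" maps "Oregon" and "OR" onto each other at query time -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
  </analyzer>
</fieldType>
```

With a synonyms.txt line like "Oregon, OR" and expand="true" on the query side, a search for PropertyAddressState:"Oregon" should also match documents containing "OR".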
HDInsight with Solr 4.9.0 Create Collection
Folks: I'm in a bind. I added Solr 4.9.0 to an HDInsight cluster and find no solrctl commands installed. So I am doing the following to create a collection.

I have my collection schema in this location:

/home/sshuser/abhi/ems-collection/conf

Using this command to create a collection:

http://headnode1:8983/solr/admin/cores?action=CREATE&name=ems-collection&instanceDir=/home/sshuser/abhi/ems-collection/conf

I get an error like:

Error CREATEing SolrCore 'ems-collection': Unable to create core: ems-collection Caused by: Could not find configName for collection ems-collection found:[collection1, hditestconfig]

I guess I need to register my config name with ZooKeeper. How do I register the collection schema with ZooKeeper? Is there a way to bypass the registration with ZK and build the collection directly from my schema files at that folder location, like I was able to do with Solr 4.10 in CDH 5.14:

solrctl --zk hadoop-dn6.eso.local:2181/solr instancedir --create ems-collection /home/sshuser/abhi/ems-collection/
solrctl --zk hadoop-dn6.eso.local:2181/solr collection --create ems-collection -s 3 -r 2

Your help is appreciated.

Thanks,
Abhi
-- Abhi Basu
Re: HDInsight with Solr 4.9.0 Create Collection
Thanks for the reply, this really helped me.

For Solr 4.9, what is the actual zkcli command to upload config?

java -classpath example/solr-webapp/WEB-INF/lib/* org.apache.solr.cloud.ZkCLI -cmd upconfig -zkhost 127.0.0.1:9983 -confdir example/solr/collection1/conf -confname conf1 -solrhome example/solr

OR

./server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:9983 -cmd upconfig -confname my_new_config -confdir server/solr/configsets/basic_configs/conf

I don't know why HDP/HDInsight does not provide something like the solrctl commands to make life easier for all!

On Thu, Mar 8, 2018 at 5:43 PM, Shawn Heisey wrote:
> On 3/8/2018 1:26 PM, Abhi Basu wrote:
> > I'm in a bind. Added Solr 4.9.0 to HDInsight cluster and find no Solrctl
> > commands installed. So, I am doing the following to create a collection.
>
> This 'solrctl' command is NOT part of Solr. Google tells me it's part
> of software from Cloudera.
>
> You need to talk to Cloudera for support on that software.
>
> > I have my collection schema in a location:
> >
> > /home/sshuser/abhi/ems-collection/conf
> >
> > Using this command to create a collection:
> >
> > http://headnode1:8983/solr/admin/cores?action=CREATE&name=ems-collection&instanceDir=/home/sshuser/abhi/ems-collection/conf
>
> You're using the term "collection". And later you mention ZooKeeper. So
> you're almost certainly running in SolrCloud mode. If your Solr is
> running in SolrCloud mode, do not try to use the CoreAdmin API
> (/solr/admin/cores). Use the Collections API instead. But before that,
> you need to get the configuration into ZooKeeper. For standard Solr
> without Cloudera's tools, you would typically use the "zkcli" script
> (either zkcli.sh or zkcli.bat). See page 376 of the reference guide for
> that specific version of Solr for help with the "upconfig" command for
> that script:
>
> http://archive.apache.org/dist/lucene/solr/ref-guide/apache-solr-ref-guide-4.9.pdf
>
> > I guess i need to register my config name with Zk. How do I register the
> > collection schema with Zookeeper?
> >
> > Is there way to bypass the registration with zk and build the collection
> > directly from my schema files at that folder location, like I was able to
> > do in Solr 4.10 in CDH 5.14:
> >
> > solrctl --zk hadoop-dn6.eso.local:2181/solr instancedir --create
> > ems-collection /home/sshuser/abhi/ems-collection/
> >
> > solrctl --zk hadoop-dn6.eso.local:2181/solr collection --create
> > ems-collection -s 3 -r 2
>
> The solrctl command is not something we can help you with on this
> mailing list. Cloudera customizes Solr to the point where only they are
> able to really provide support for their version. Your best bet will be
> to talk to Cloudera.
>
> When Solr is running with ZooKeeper, it's in SolrCloud mode. In
> SolrCloud mode, you cannot create cores in the same way that you can in
> standalone mode -- you MUST create collections, and all configuration
> will be in zookeeper, not on the disk.
>
> Thanks,
> Shawn

-- Abhi Basu
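Putting Shawn's advice together, a sketch of the two-step flow: upload the config directory to ZooKeeper with zkcli, then create the collection via the Collections API rather than CoreAdmin. Hostnames, paths, and the shard/replica counts are the placeholder values from this thread; the zkcli script location varies by install.

```shell
# Sketch only: hosts, ports, and paths are placeholders from this thread.
ZK_HOST="127.0.0.1:9983"
CONF_DIR="/home/sshuser/abhi/ems-collection/conf"
CONF_NAME="ems-collection"
SOLR_NODE="headnode1:8983"

# Step 1: upload the config directory to ZooKeeper with zkcli.
UPCONFIG="./server/scripts/cloud-scripts/zkcli.sh -zkhost $ZK_HOST -cmd upconfig -confname $CONF_NAME -confdir $CONF_DIR"

# Step 2: create the collection via the Collections API (not CoreAdmin),
# referencing the uploaded config and mirroring the solrctl -s 3 -r 2 layout.
CREATE_URL="http://$SOLR_NODE/solr/admin/collections?action=CREATE&name=$CONF_NAME&numShards=3&replicationFactor=2&collection.configName=$CONF_NAME"

echo "$UPCONFIG"
echo "$CREATE_URL"
# To run for real:  eval "$UPCONFIG"  and then  curl -s "$CREATE_URL"
```

The script only composes and prints the two commands; running them requires a live SolrCloud cluster and ZooKeeper ensemble.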
Re: HDInsight with Solr 4.9.0 Create Collection
Ok, so I tried the following:

/usr/hdp/current/solr/example/scripts/cloud-scripts/zkcli.sh -cmd upconfig -zkhost zk0-esohad.mzwz3dh4pb1evcdwc1lcsddrbe.jx.internal.cloudapp.net:2181 -confdir /home/sshuser/abhi/ems-collection/conf -confname ems-collection

And got this exception:

java.lang.IllegalArgumentException: Illegal directory: /home/sshuser/abhi/ems-collection/conf

On Fri, Mar 9, 2018 at 10:43 AM, Abhi Basu <9000r...@gmail.com> wrote:
> Thanks for the reply, this really helped me.
>
> For Solr 4.9, what is the actual zkcli command to upload config?
>
> [...]

-- Abhi Basu
Re: HDInsight with Solr 4.9.0 Create Collection
That was due to a folder not being present. Is this something to do with version?

http://hn0-esohad.mzwz3dh4pb1evcdwc1lcsddrbe.jx.internal.cloudapp.net:8983/solr/admin/collections?action=CREATE&name=ems-collection2&numShards=2&replicationFactor=2&maxShardsPerNode=1

org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error CREATEing SolrCore 'ems-collection2_shard2_replica2': Unable to create core: ems-collection2_shard2_replica2 Caused by: No enum constant org.apache.lucene.util.Version.4.10.3

On Fri, Mar 9, 2018 at 11:11 AM, Abhi Basu <9000r...@gmail.com> wrote:
> Ok, so I tried the following:
>
> /usr/hdp/current/solr/example/scripts/cloud-scripts/zkcli.sh -cmd upconfig
> -zkhost zk0-esohad.mzwz3dh4pb1evcdwc1lcsddrbe.jx.internal.cloudapp.net:2181
> -confdir /home/sshuser/abhi/ems-collection/conf -confname ems-collection
>
> And got this exception:
> java.lang.IllegalArgumentException: Illegal directory:
> /home/sshuser/abhi/ems-collection/conf
>
> [...]

-- Abhi Basu
Re: HDInsight with Solr 4.9.0 Create Collection
This has been resolved! It turned out to be a schema and config file version difference between 4.10 and 4.9.

Thanks,
Abhi

On Fri, Mar 9, 2018 at 11:41 AM, Abhi Basu <9000r...@gmail.com> wrote:
> That was due to a folder not being present. Is this something to do with
> version?
>
> org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error
> CREATEing SolrCore 'ems-collection2_shard2_replica2': Unable to create
> core: ems-collection2_shard2_replica2 Caused by: No enum constant
> org.apache.lucene.util.Version.4.10.3
>
> [...]

-- Abhi Basu
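The "No enum constant org.apache.lucene.util.Version.4.10.3" error is consistent with a solrconfig.xml written for 4.10 being loaded by a 4.9 node: the node only knows version constants up to its own release. Assuming that is the offending setting, the fix is to pin the match version to the running release:

```xml
<!-- Sketch: in solrconfig.xml, keep this at or below the Solr version
     actually running; 4.10.3 is unknown to a 4.9.0 node. -->
<luceneMatchVersion>4.9</luceneMatchVersion>
```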
Solr on HDInsight to write to Active Data Lake
MS Azure does not support Solr 4.9 on HDI, so I am posting here. I would like to write index collection data to HDFS (hosted on ADL).

Note: I am able to get to ADL from the hadoop fs command line, so hadoop is configured correctly to reach ADL:

hadoop fs -ls adl://

This is what I have done so far:

1. Copied all required jars to the solr ext lib folder:

sudo cp -f /usr/hdp/current/hadoop-client/*.jar /usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/hadoop-client/lib/*.jar /usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/hadoop-hdfs-client/*.jar /usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/hadoop-hdfs-client/lib/*.jar /usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/storm-client/contrib/storm-hbase/storm-hbase*.jar /usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/phoenix-client/lib/phoenix*.jar /usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/hbase-client/lib/hbase*.jar /usr/hdp/current/solr/example/lib/ext

This includes the Azure Active Data Lake jars also.

2. Edited my solrconfig.xml file for my collection: dataDir stays at ${solr.core.name}/data/, and the directoryFactory is solr.HdfsDirectoryFactory with solr.hdfs.home set to adl://esodevdleus2.azuredatalakestore.net/clusters/esohadoopdeveus2/solr/, solr.hdfs.confdir set to /usr/hdp/2.6.2.25-1/hadoop/conf, solr.hdfs.blockcache.global set to ${solr.hdfs.blockcache.global:true}, and block-cache settings with the values true / 1 / true / 16384 / true / true / 16 (the enclosing XML tags were garbled in the archive).

When this collection is deployed to solr, I see this error message:

org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error CREATEing SolrCore 'ems-collection_shard2_replica2': Unable to create core: ems-collection_shard2_replica2 Caused by: Class org.apache.hadoop.fs.adl.HdiAdlFileSystem not found

The same "Class org.apache.hadoop.fs.adl.HdiAdlFileSystem not found" error is repeated for ems-collection_shard2_replica1, ems-collection_shard1_replica1, and ems-collection_shard1_replica2.

Has anyone done this and can help me out?

Thanks,

Abhi
-- Abhi Basu
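The solrconfig.xml excerpt above lost its XML tags in the archive. The surviving values line up with the stock HdfsDirectoryFactory example from the Solr Reference Guide, so a plausible reconstruction looks like this (property names other than solr.hdfs.home, solr.hdfs.confdir, and solr.hdfs.blockcache.global are assumed from that stock example, as is the mapping of the bare values onto them):

```xml
<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">adl://esodevdleus2.azuredatalakestore.net/clusters/esohadoopdeveus2/solr/</str>
  <str name="solr.hdfs.confdir">/usr/hdp/2.6.2.25-1/hadoop/conf</str>
  <bool name="solr.hdfs.blockcache.global">${solr.hdfs.blockcache.global:true}</bool>
  <!-- The remaining values (true, 1, true, 16384, true, true, 16) are assumed
       to map onto the stock block-cache settings in this order: -->
  <bool name="solr.hdfs.blockcache.enabled">true</bool>
  <int name="solr.hdfs.blockcache.slab.count">1</int>
  <bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
  <int name="solr.hdfs.blockcache.blocksize">16384</int>
  <bool name="solr.hdfs.blockcache.read.enabled">true</bool>
  <bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
  <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">16</int>
</directoryFactory>
```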
Re: Solr on HDInsight to write to Active Data Lake
I'll try it out. Thanks Abhi On Fri, Mar 23, 2018, 6:22 PM Rick Leir wrote: > Abhi > Check your lib directives. > > https://lucene.apache.org/solr/guide/6_6/lib-directives-in-solrconfig.html#lib-directives-in-solrconfig > > I suspect your jars are not in a lib dir mentioned in solrconfig.xml > Cheers -- Rick > > On March 23, 2018 11:12:17 AM EDT, Abhi Basu <9000r...@gmail.com> wrote: > >MS Azure does not support Solr 4.9 on HDI, so I am posting here. I > >would > >like to write index collection data to HDFS (hosted on ADL). > > > >Note: I am able to get to ADL from hadoop fs command like, so hadoop is > >configured correctly to get to ADL: > >hadoop fs -ls adl:// > > > >This is what I have done so far: > >1. Copied all required jars to sol ext lib folder: > >sudo cp -f /usr/hdp/current/hadoop-client/*.jar > >/usr/hdp/current/solr/example/lib/ext > >sudo cp -f /usr/hdp/current/hadoop-client/lib/*.jar > >/usr/hdp/current/solr/example/lib/ext > >sudo cp -f /usr/hdp/current/hadoop-hdfs-client/*.jar > >/usr/hdp/current/solr/example/lib/ext > >sudo cp -f /usr/hdp/current/hadoop-hdfs-client/lib/*.jar > >/usr/hdp/current/solr/example/lib/ext > >sudo cp -f > >/usr/hdp/current/storm-client/contrib/storm-hbase/storm-hbase*.jar > >/usr/hdp/current/solr/example/lib/ext > >sudo cp -f /usr/hdp/current/phoenix-client/lib/phoenix*.jar > >/usr/hdp/current/solr/example/lib/ext > >sudo cp -f /usr/hdp/current/hbase-client/lib/hbase*.jar > >/usr/hdp/current/solr/example/lib/ext > > > >This includes the Azure active data lake jars also. > > > >2. 
Edited my solr-config.xml file for my collection: > > > >${solr.core.name}/data/ > > > > >class="solr.HdfsDirectoryFactory"> > > >name="solr.hdfs.home">adl:// > esodevdleus2.azuredatalakestore.net/clusters/esohadoopdeveus2/solr/ > > /usr/hdp/2.6.2.25-1/hadoop/conf > > > >name="solr.hdfs.blockcache.global">${solr.hdfs.blockcache.global:true} > > true > > 1 > > true > > 16384 > > true > > true > > 16 > > > > > > > >When this collection is deployed to solr, I see this error message: > > > > > > > >0 > >2189 > > > > >org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error > >CREATEing SolrCore 'ems-collection_shard2_replica2': > >Unable to create core: ems-collection_shard2_replica2 Caused by: Class > >org.apache.hadoop.fs.adl.HdiAdlFileSystem not > > >foundorg.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error > >CREATEing SolrCore 'ems-collection_shard2_replica1': Unable to create > >core: ems-collection_shard2_replica1 Caused by: Class > >org.apache.hadoop.fs.adl.HdiAdlFileSystem not > > >foundorg.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error > >CREATEing SolrCore 'ems-collection_shard1_replica1': Unable to create > >core: ems-collection_shard1_replica1 Caused by: Class > >org.apache.hadoop.fs.adl.HdiAdlFileSystem not > > >foundorg.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error > >CREATEing SolrCore 'ems-collection_shard1_replica2': Unable to create > >core: ems-collection_shard1_replica2 Caused by: Class > >org.apache.hadoop.fs.adl.HdiAdlFileSystem not found > > > > > > > > > >Has anyone done this and can help me out? > > > >Thanks, > > > >Abhi > > > > > >-- > >Abhi Basu > > -- > Sorry for being brief. Alternate email is rickleir at yahoo dot com
Re: Solr on HDInsight to write to Active Data Lake
Adding this to solrconfig.xml did not work. I put all the azure and hadoop jars in the ext folder. Caused by: Class org.apache.hadoop.fs.adl.HdiAdlFileSystem not found Thanks, Abhi On Fri, Mar 23, 2018 at 7:40 PM, Abhi Basu <9000r...@gmail.com> wrote: > I'll try it out. > > Thanks > > Abhi > > On Fri, Mar 23, 2018, 6:22 PM Rick Leir wrote: > >> Abhi >> Check your lib directives. >> https://lucene.apache.org/solr/guide/6_6/lib-directives- >> in-solrconfig.html#lib-directives-in-solrconfig >> >> I suspect your jars are not in a lib dir mentioned in solrconfig.xml >> Cheers -- Rick >> >> On March 23, 2018 11:12:17 AM EDT, Abhi Basu <9000r...@gmail.com> wrote: >> >MS Azure does not support Solr 4.9 on HDI, so I am posting here. I >> >would >> >like to write index collection data to HDFS (hosted on ADL). >> > >> >Note: I am able to get to ADL from hadoop fs command like, so hadoop is >> >configured correctly to get to ADL: >> >hadoop fs -ls adl:// >> > >> >This is what I have done so far: >> >1. Copied all required jars to sol ext lib folder: >> >sudo cp -f /usr/hdp/current/hadoop-client/*.jar >> >/usr/hdp/current/solr/example/lib/ext >> >sudo cp -f /usr/hdp/current/hadoop-client/lib/*.jar >> >/usr/hdp/current/solr/example/lib/ext >> >sudo cp -f /usr/hdp/current/hadoop-hdfs-client/*.jar >> >/usr/hdp/current/solr/example/lib/ext >> >sudo cp -f /usr/hdp/current/hadoop-hdfs-client/lib/*.jar >> >/usr/hdp/current/solr/example/lib/ext >> >sudo cp -f >> >/usr/hdp/current/storm-client/contrib/storm-hbase/storm-hbase*.jar >> >/usr/hdp/current/solr/example/lib/ext >> >sudo cp -f /usr/hdp/current/phoenix-client/lib/phoenix*.jar >> >/usr/hdp/current/solr/example/lib/ext >> >sudo cp -f /usr/hdp/current/hbase-client/lib/hbase*.jar >> >/usr/hdp/current/solr/example/lib/ext >> > >> >This includes the Azure active data lake jars also. >> > >> >2. 
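Rick's lib-directive suggestion from the quoted reply would look something like the following in solrconfig.xml — the directories are the ones used earlier in this thread and are assumptions, and the second path is hypothetical. The point is that the jar actually providing org.apache.hadoop.fs.adl.HdiAdlFileSystem must sit in a directory one of these directives covers:

```xml
<!-- Load every jar dropped into the ext folder used in this thread: -->
<lib dir="/usr/hdp/current/solr/example/lib/ext" regex=".*\.jar" />
<!-- Or point straight at the Azure/ADL client jars (hypothetical location): -->
<lib dir="/usr/hdp/current/hadoop-client/lib" regex=".*azure.*\.jar" />
```

If the class is still not found after adding directives like these, the likely cause is that no jar on any covered path contains that class at all.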
Edited my solr-config.xml file for my collection: >> > >> >${solr.core.name}/data/ >> > >> >> >class="solr.HdfsDirectoryFactory"> >> >> >name="solr.hdfs.home">adl://esodevdleus2.azuredatalakestore.net/ >> clusters/esohadoopdeveus2/solr/ >> > /usr/hdp/2.6.2.25-1/hadoop/conf >> >> >name="solr.hdfs.blockcache.global">${solr.hdfs. >> blockcache.global:true} >> > true >> > 1 >> > true >> > 16384 >> > true >> > true >> > 16 >> > >> > >> > >> >When this collection is deployed to solr, I see this error message: >> > >> > >> > >> >0 >> >2189 >> > >> >org.apache.solr.client.solrj.impl.HttpSolrServer$ >> RemoteSolrException:Error >> >CREATEing SolrCore 'ems-collection_shard2_replica2': >> >Unable to create core: ems-collection_shard2_replica2 Caused by: Class >> >org.apache.hadoop.fs.adl.HdiAdlFileSystem not >> >foundorg.apache.solr.client.solrj.impl.HttpSolrServer$ >> RemoteSolrException:Error >> >CREATEing SolrCore 'ems-collection_shard2_replica1': Unable to create >> >core: ems-collection_shard2_replica1 Caused by: Class >> >org.apache.hadoop.fs.adl.HdiAdlFileSystem not >> >foundorg.apache.solr.client.solrj.impl.HttpSolrServer$ >> RemoteSolrException:Error >> >CREATEing SolrCore 'ems-collection_shard1_replica1': Unable to create >> >core: ems-collection_shard1_replica1 Caused by: Class >> >org.apache.hadoop.fs.adl.HdiAdlFileSystem not >> >foundorg.apache.solr.client.solrj.impl.HttpSolrServer$ >> RemoteSolrException:Error >> >CREATEing SolrCore 'ems-collection_shard1_replica2': Unable to create >> >core: ems-collection_shard1_replica2 Caused by: Class >> >org.apache.hadoop.fs.adl.HdiAdlFileSystem not found >> > >> > >> > >> > >> >Has anyone done this and can help me out? >> > >> >Thanks, >> > >> >Abhi >> > >> > >> >-- >> >Abhi Basu >> >> -- >> Sorry for being brief. Alternate email is rickleir at yahoo dot com > > -- Abhi Basu
Solr 4.9 - configs and collections
Running on MS HDInsight and Solr 4.9. What is the BKM (best-known method) for creating, updating, and deleting configurations and collections? I do the following: 1. First I create the zk config: sudo zkcli.sh -cmd upconfig -zkhost zknode <http://zk1-esohad.tzun3mpncofedp04lr3ird23xc.jx.internal.cloudapp.net/>:2181 -confdir /home/sshuser/ems-collection-49/conf/ -confname ems-collection 2. Then I create the collection: curl ' http://headnode0:8983/solr/admin/collections?action=CREATE&name=ems-collection&numShards=2&replicationFactor=2&maxShardsPerNode=1 ' This works the first time. When I change the zk config, do I run the same command #1? Also, do I do a reload: curl ' http://headnode0:8983/solr/admin/collections?action=RELOAD&name=ems-collection ' Or, do I need to delete and recreate the collection? I am very familiar with the CDH solrctl commands, which make life easier by needing only one command for this. Any help is appreciated. Thanks, Abhi -- Abhi Basu
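The update-then-reload sequence being asked about can be sketched as a short script. The hostnames and paths are the ones from this message and are assumptions about the cluster; the commands that need a live cluster are commented out, and the script only assembles the reload URL:

```shell
ZK_HOST="zk1-esohad:2181"                 # a single ZK node is enough for zkcli
COLLECTION="ems-collection"
CONF_DIR="/home/sshuser/ems-collection-49/conf/"

# 1. Re-upload the changed config under the SAME confname (overwrites it in ZK):
# sudo zkcli.sh -cmd upconfig -zkhost "$ZK_HOST" -confdir "$CONF_DIR" -confname "$COLLECTION"

# 2. Reload the collection so the running cores pick up the new config:
RELOAD_URL="http://headnode0:8983/solr/admin/collections?action=RELOAD&name=$COLLECTION"
# curl "$RELOAD_URL"
echo "$RELOAD_URL"
```

Re-running the same upconfig command is the intended workflow; the reload is what makes the new config take effect without recreating the collection.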
Re: Solr on HDInsight to write to Active Data Lake
; did you restart the JVM? > > Best, > Erick > > On Mon, Mar 26, 2018 at 6:49 AM, Abhi Basu <9000r...@gmail.com> wrote: > > Adding this to solrconfig.xml did not work. I put all the azure and > hadoop > > jars in the ext folder. > > > > > > > > Caused by: Class org.apache.hadoop.fs.adl.HdiAdlFileSystem not found > > > > Thanks, > > > > Abhi > > > > On Fri, Mar 23, 2018 at 7:40 PM, Abhi Basu <9000r...@gmail.com> wrote: > > > >> I'll try it out. > >> > >> Thanks > >> > >> Abhi > >> > >> On Fri, Mar 23, 2018, 6:22 PM Rick Leir wrote: > >> > >>> Abhi > >>> Check your lib directives. > >>> https://lucene.apache.org/solr/guide/6_6/lib-directives- > >>> in-solrconfig.html#lib-directives-in-solrconfig > >>> > >>> I suspect your jars are not in a lib dir mentioned in solrconfig.xml > >>> Cheers -- Rick > >>> > >>> On March 23, 2018 11:12:17 AM EDT, Abhi Basu <9000r...@gmail.com> > wrote: > >>> >MS Azure does not support Solr 4.9 on HDI, so I am posting here. I > >>> >would > >>> >like to write index collection data to HDFS (hosted on ADL). > >>> > > >>> >Note: I am able to get to ADL from hadoop fs command like, so hadoop > is > >>> >configured correctly to get to ADL: > >>> >hadoop fs -ls adl:// > >>> > > >>> >This is what I have done so far: > >>> >1. 
Copied all required jars to sol ext lib folder: > >>> >sudo cp -f /usr/hdp/current/hadoop-client/*.jar > >>> >/usr/hdp/current/solr/example/lib/ext > >>> >sudo cp -f /usr/hdp/current/hadoop-client/lib/*.jar > >>> >/usr/hdp/current/solr/example/lib/ext > >>> >sudo cp -f /usr/hdp/current/hadoop-hdfs-client/*.jar > >>> >/usr/hdp/current/solr/example/lib/ext > >>> >sudo cp -f /usr/hdp/current/hadoop-hdfs-client/lib/*.jar > >>> >/usr/hdp/current/solr/example/lib/ext > >>> >sudo cp -f > >>> >/usr/hdp/current/storm-client/contrib/storm-hbase/storm-hbase*.jar > >>> >/usr/hdp/current/solr/example/lib/ext > >>> >sudo cp -f /usr/hdp/current/phoenix-client/lib/phoenix*.jar > >>> >/usr/hdp/current/solr/example/lib/ext > >>> >sudo cp -f /usr/hdp/current/hbase-client/lib/hbase*.jar > >>> >/usr/hdp/current/solr/example/lib/ext > >>> > > >>> >This includes the Azure active data lake jars also. > >>> > > >>> >2. Edited my solr-config.xml file for my collection: > >>> > > >>> >${solr.core.name}/data/ > >>> > > >>> > >>> >class="solr.HdfsDirectoryFactory"> > >>> > >>> >name="solr.hdfs.home">adl://esodevdleus2.azuredatalakestore.net/ > >>> clusters/esohadoopdeveus2/solr/ > >>> > /usr/hdp/2.6.2.25-1/hadoop/conf > >>> > >>> >name="solr.hdfs.blockcache.global">${solr.hdfs. 
> >>> blockcache.global:true} > >>> > true > >>> > 1 > >>> > > true > >>> > 16384 > >>> > true > >>> > true > >>> > 16 > >>> > > >>> > > >>> > > >>> >When this collection is deployed to solr, I see this error message: > >>> > > >>> > > >>> > > >>> >0 > >>> >2189 > >>> > > >>> >org.apache.solr.client.solrj.impl.HttpSolrServer$ > >>> RemoteSolrException:Error > >>> >CREATEing SolrCore 'ems-collection_shard2_replica2': > >>> >Unable to create core: ems-collection_shard2_replica2 Caused by: Class > >>> >org.apache.hadoop.fs.adl.HdiAdlFileSystem not > >>> >foundorg.apache.solr.client.solrj.impl.HttpSolrServer$ > >>> RemoteSolrException:Error > >>> >CREATEing SolrCore 'ems-collection_shard2_replica1': Unable to create > >>> >core: ems-collection_shard2_replica1 Caused by: Class > >>> >org.apache.hadoop.fs.adl.HdiAdlFileSystem not > >>> >foundorg.apache.solr.client.solrj.impl.HttpSolrServer$ > >>> RemoteSolrException:Error > >>> >CREATEing SolrCore 'ems-collection_shard1_replica1': Unable to create > >>> >core: ems-collection_shard1_replica1 Caused by: Class > >>> >org.apache.hadoop.fs.adl.HdiAdlFileSystem not > >>> >foundorg.apache.solr.client.solrj.impl.HttpSolrServer$ > >>> RemoteSolrException:Error > >>> >CREATEing SolrCore 'ems-collection_shard1_replica2': Unable to create > >>> >core: ems-collection_shard1_replica2 Caused by: Class > >>> >org.apache.hadoop.fs.adl.HdiAdlFileSystem not found > >>> > > >>> > > >>> > > >>> > > >>> >Has anyone done this and can help me out? > >>> > > >>> >Thanks, > >>> > > >>> >Abhi > >>> > > >>> > > >>> >-- > >>> >Abhi Basu > >>> > >>> -- > >>> Sorry for being brief. Alternate email is rickleir at yahoo dot com > >> > >> > > > > > > -- > > Abhi Basu > -- Abhi Basu
Re: Solr 4.9 - configs and collections
Thanks for the explanations, very helpful. One more question, what is the sequence to delete the collection? If I use the rest api to delete the collection, then when I go to create it again, I sometimes get an error message saying shard already present. How to clean up the underlying directories on all the nodes? Thanks, Abhi On Mon, Mar 26, 2018 at 6:22 PM, Shawn Heisey wrote: > On 3/26/2018 8:43 AM, Abhi Basu wrote: > > Running on MS HDInsight and Solr 4.9.What is the BKM for creation, > update, > > delete of configurations and collections? > > I have no idea what a BKM is. I will cover the update of configuration > below. > > > I do the following: > > > > 1. First I create the zk config: > > sudo zkcli.sh -cmd upconfig -zkhost zknode > > <http://zk1-esohad.tzun3mpncofedp04lr3ird23xc.jx.internal.cloudapp.net/ > >:2181 > > -confdir /home/sshuser/ems-collection-49/conf/ -confname ems-collection > > Exactly what you've got configured there for the zkhost parameter is > difficult to decipher because it looks like the hsotname got replaced > with a URL by your mail client. But I think you've only got one ZK > server there. Usually there are at least three of them. The command > actually only needs one, but the zkHost string usually has at least > three. It's generally a good idea to use the same string for zkcli that > you use for Solr itself, so it works even when a server is down. > > > 2. Then I create the collection: > > curl ' > > http://headnode0:8983/solr/admin/collections?action= > CREATE&name=ems-collection&numShards=2&replicationFactor= > 2&maxShardsPerNode=1 > > ' > > > > This works the first time. When I change the zk config, do I run the same > > command #1? Also, do I do a reload: > > Yes, if you want to change an existing config and then make it active, > you re-upload the config and then reload any affected collection. 
> Deleting and recreating the collection is not something you would want > to do unless you plan to completely rebuild it anyway -- deleting the > collection will also delete all the index data. If that's what you > WANT, then deleting and recreating the collection is a good way to make > it happen. Many config updates *do* require a reindex, and some changes > will also require completely deleting the index directories before > building it again. > > > Very familiar with CDH solrctl commands that make life easier by only > > having one command for this. Any help is appreciated. > > If you're using CDH, you'll want to talk to Cloudera for help. They > customize their Solr install to the point where they're the only ones > who know how to use it properly. > > Thanks, > Shawn > > -- Abhi Basu
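For the cleanup question above, a hedged sketch of the delete-and-recreate path follows. The hostname comes from this thread; the core-directory path is hypothetical, since the actual solr home layout was not shown. Destructive commands are commented out and the script only assembles the delete URL:

```shell
COLLECTION="ems-collection"

# 1. Delete via the Collections API (this removes the index data too):
DELETE_URL="http://headnode0:8983/solr/admin/collections?action=DELETE&name=$COLLECTION"
# curl "$DELETE_URL"

# 2. If a later CREATE still reports the shard already exists, look for stale
#    core directories under each node's solr home and remove them, e.g.:
# rm -rf /path/to/solr/home/${COLLECTION}_shard*_replica*
echo "$DELETE_URL"
```

The "shard already present" error on recreate usually means some node kept a leftover core directory, which is why step 2 may be needed on every node.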
Re: Solr on HDInsight to write to Active Data Lake
Yes, for the life of me, I cannot find info on the Azure Data Lake jars, and MS has not been much help either. Maybe they don't want us to use Solr on ADLS. Thanks, Abhi On Wed, Mar 28, 2018 at 10:59 AM, Rick Leir wrote: > Hi, > The class that is not found is likely in the Azure related libraries. As > Erick said, are you sure that you have a library containing it? > Cheers > Rick > -- > Sorry for being brief. Alternate email is rickleir at yahoo dot com > -- Abhi Basu
Solr 7.2 cannot see all running nodes
What am I missing? I used the following instructions http://blog.thedigitalgroup.com/susheelk/2015/08/03/solrcloud-2-nodes-solr-1-node-zk-setup/#comment-4321 on 4 nodes. The only difference is I have 3 external zk servers. So this is how I am starting each solr node: ./bin/solr start -cloud -s /usr/local/bin/solr-7.2.1/server/solr/node1/ -p 8983 -z zk0-esohad,zk1-esohad,zk3-esohad:2181 -m 8g They all run without any errors, but when trying to create a collection with 2S/2R, I get an error saying only one node is running. ./server/scripts/cloud-scripts/zkcli.sh -zkhost zk0-esohad,zk1-esohad,zk3-esohad:2181 -cmd upconfig -confname ems-collection -confdir /usr/local/bin/solr-7.2.1/server/solr/configsets/ems-collection-72_configs/conf "Operation create caused exception:":"org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Cannot create collection ems-collection. Value of maxShardsPerNode is 1, and the number of nodes currently live or live and part of your createNodeSet is 1. This allows a maximum of 1 to be created. Value of numShards is 2, value of nrtReplicas is 2, value of tlogReplicas is 0 and value of pullReplicas is 0. This requires 4 shards to be created (higher than the allowed number)", Any ideas? Thanks, Abhi -- Abhi Basu
Re: Solr 7.2 cannot see all running nodes
Yes, only showing one live node on admin site. Checking zk logs. Thanks, Abhi On Thu, Mar 29, 2018 at 9:32 AM, Ganesh Sethuraman wrote: > may be you can check int he Admin UI --> Cloud --> Tree --> /live_nodes. To > see the list of live nodes before running. If it is less than what you > expected, check the Zoo keeper logs? or make sure connectivity between the > shards and zookeeper. > > On Thu, Mar 29, 2018 at 10:25 AM, Abhi Basu <9000r...@gmail.com> wrote: > > > What am I missing? I used the following instructions > > http://blog.thedigitalgroup.com/susheelk/2015/08/03/ > > solrcloud-2-nodes-solr-1-node-zk-setup/#comment-4321 > > on 4 nodes. The only difference is I have 3 external zk servers. So this > > is how I am starting each solr node: > > > > ./bin/solr start -cloud -s /usr/local/bin/solr-7.2.1/server/solr/node1/ > -p > > 8983 -z zk0-esohad,zk1-esohad,zk3-esohad:2181 -m 8g > > > > They all run without any errors, but when trying to create a collection > > with 2S/2R, I get an error saying only one node is running. > > > > ./server/scripts/cloud-scripts/zkcli.sh -zkhost > > zk0-esohad,zk1-esohad,zk3-esohad:2181 -cmd upconfig -confname > > ems-collection -confdir > > /usr/local/bin/solr-7.2.1/server/solr/configsets/ems- > > collection-72_configs/conf > > > > > > "Operation create caused > > exception:":"org.apache.solr.common.SolrException:org. > apache.solr.common. > > SolrException: > > Cannot create collection ems-collection. Value of maxShardsPerNode is 1, > > and the number of nodes currently live or live and part of your > > createNodeSet is 1. This allows a maximum of 1 to be created. Value of > > numShards is 2, value of nrtReplicas is 2, value of tlogReplicas is 0 and > > value of pullReplicas is 0. This requires 4 shards to be created (higher > > than the allowed number)", > > > > > > Any ideas? > > > > Thanks, > > > > Abhi > > > > -- > > Abhi Basu > > > -- Abhi Basu
Re: Solr 7.2 cannot see all running nodes
So, in the solr.xml on each node should I set the host to the actual host name? ${host:} ${jetty.port:8983} ${hostContext:solr} ${genericCoreNodeNames:true} ${zkClientTimeout:3} ${distribUpdateSoTimeout:60} ${distribUpdateConnTimeout:6} ${zkCredentialsProvider:org.apache.solr.common.cloud.DefaultZkCredentialsProvider} ${zkACLProvider:org.apache.solr.common.cloud.DefaultZkACLProvider} On Thu, Mar 29, 2018 at 9:46 AM, Shawn Heisey wrote: > On 3/29/2018 8:25 AM, Abhi Basu wrote: > >> "Operation create caused >> exception:":"org.apache.solr.common.SolrException:org.apache >> .solr.common.SolrException: >> Cannot create collection ems-collection. Value of maxShardsPerNode is 1, >> and the number of nodes currently live or live and part of your >> > > I'm betting that all your nodes are registering themselves with the same > name, and that name is probably either 127.0.0.1 or 127.1.1.0 -- an address > on the loopback interface. > > Usually this problem (on an OS other than Windows, at least) is caused by > an incorrect /etc/hosts file that maps your hostname to a loopback address > instead of a real address. > > You can override the value that SolrCloud uses to register itself into > zookeeper so it doesn't depend on the OS configuration. In solr.in.sh, I > think this is the SOLR_HOST variable, which gets translated into -Dhost=XXX > on the java commandline. It can also be configured in solr.xml. > > Thanks, > Shawn > > -- Abhi Basu
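Shawn's suggestion above can be pinned down in shell form. This is a minimal sketch assuming `hostname -f` returns the node's externally resolvable name (rather than a loopback alias from /etc/hosts):

```shell
# In bin/solr.in.sh (or exported before starting Solr), register the real
# hostname with ZooKeeper instead of whatever the OS resolves to:
SOLR_HOST="$(hostname -f 2>/dev/null || hostname)"

# Equivalent one-off form on the start command line (maps to -Dhost=...):
#   ./bin/solr start -cloud -h "$(hostname -f)" -p 8983 -z zk0-esohad:2181,...
echo "$SOLR_HOST"
```

Either approach avoids editing solr.xml per node: the ${host:} placeholder in solr.xml picks up the value at startup.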
Re: Solr 7.2 cannot see all running nodes
Ok, will give it a try along with the host name. On Thu, Mar 29, 2018 at 12:20 PM, Webster Homer wrote: > This Zookeeper ensemble doesn't look right. > > > > ./bin/solr start -cloud -s /usr/local/bin/solr-7.2.1/server/solr/node1/ > -p > > 8983 -z zk0-esohad,zk1-esohad,zk3-esohad:2181 -m 8g > > > Shouldn't the zookeeper ensemble be specified as: > zk0-esohad:2181,zk1-esohad:2181,zk3-esohad:2181 > > You should put the zookeeper port on each node in the comma separated list. > I don't know if this is your problem, but I think your solr nodes will only > be connecting to 1 zookeeper > > On Thu, Mar 29, 2018 at 10:56 AM, Walter Underwood > wrote: > > > I had that problem. Very annoying and we probably should require special > > flag to use localhost. > > > > We need to start solr like this: > > > > ./solr start -c -h `hostname` > > > > If anybody ever forgets, we get a 127.0.0.1 node that shows down in > > cluster status. No idea how to get rid of that. > > > > wunder > > Walter Underwood > > wun...@wunderwood.org > > http://observer.wunderwood.org/ (my blog) > > > > > On Mar 29, 2018, at 7:46 AM, Shawn Heisey wrote: > > > > > > On 3/29/2018 8:25 AM, Abhi Basu wrote: > > >> "Operation create caused > > >> exception:":"org.apache.solr.common.SolrException:org. > > apache.solr.common.SolrException: > > >> Cannot create collection ems-collection. Value of maxShardsPerNode is > 1, > > >> and the number of nodes currently live or live and part of your > > > > > > I'm betting that all your nodes are registering themselves with the > same > > name, and that name is probably either 127.0.0.1 or 127.1.1.0 -- an > address > > on the loopback interface. > > > > > > Usually this problem (on an OS other than Windows, at least) is caused > > by an incorrect /etc/hosts file that maps your hostname to a loopback > > address instead of a real address. 
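Webster's point in concrete form: in `-z zk0-esohad,zk1-esohad,zk3-esohad:2181`, only the last host carries an explicit port, so spelling the port on every host removes any ambiguity. A sketch using the hostnames from this thread:

```shell
# Every ZooKeeper host gets its own :2181 in the connect string:
ZK_HOST="zk0-esohad:2181,zk1-esohad:2181,zk3-esohad:2181"

# Start command from the thread, with the corrected ensemble string:
# ./bin/solr start -cloud -s /usr/local/bin/solr-7.2.1/server/solr/node1/ \
#     -p 8983 -z "$ZK_HOST" -m 8g
echo "$ZK_HOST"
```

As noted in the reply, this may not be the root cause, but it is the safe way to write the ensemble string.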
> > > > > > You can override the value that SolrCloud uses to register itself into > > zookeeper so it doesn't depend on the OS configuration. In solr.in.sh, > I > > think this is the SOLR_HOST variable, which gets translated into > -Dhost=XXX > > on the java commandline. It can also be configured in solr.xml. > > > > > > Thanks, > > > Shawn > > > > > > > > > -- > > > This message and any attachment are confidential and may be privileged or > otherwise protected from disclosure. If you are not the intended recipient, > you must not copy this message or attachment or disclose the contents to > any other person. If you have received this transmission in error, please > notify the sender immediately and delete the message and any attachment > from your system. Merck KGaA, Darmstadt, Germany and any of its > subsidiaries do not accept liability for any omissions or errors in this > message which may arise as a result of E-Mail-transmission or for damages > resulting from any unauthorized changes of the content of this message and > any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its > subsidiaries do not guarantee that this message is free of viruses and does > not accept liability for any damages caused by any virus transmitted > therewith. > > Click http://www.emdgroup.com/disclaimer to access the German, French, > Spanish and Portuguese versions of this disclaimer. > -- Abhi Basu
Re: Solr 7.2 cannot see all running nodes
Also, another question, where it says to copy the zoo.cfg from /solr72/server/solr folder to /solr72/server/solr/node1/solr, should I actually be grabbing the zoo.cfg from one of my external zk nodes? Thanks, Abhi On Thu, Mar 29, 2018 at 1:04 PM, Abhi Basu <9000r...@gmail.com> wrote: > Ok, will give it a try along with the host name. > > > On Thu, Mar 29, 2018 at 12:20 PM, Webster Homer > wrote: > >> This Zookeeper ensemble doesn't look right. >> > >> > ./bin/solr start -cloud -s /usr/local/bin/solr-7.2.1/server/solr/node1/ >> -p >> > 8983 -z zk0-esohad,zk1-esohad,zk3-esohad:2181 -m 8g >> >> >> Shouldn't the zookeeper ensemble be specified as: >> zk0-esohad:2181,zk1-esohad:2181,zk3-esohad:2181 >> >> You should put the zookeeper port on each node in the comma separated >> list. >> I don't know if this is your problem, but I think your solr nodes will >> only >> be connecting to 1 zookeeper >> >> On Thu, Mar 29, 2018 at 10:56 AM, Walter Underwood > > >> wrote: >> >> > I had that problem. Very annoying and we probably should require special >> > flag to use localhost. >> > >> > We need to start solr like this: >> > >> > ./solr start -c -h `hostname` >> > >> > If anybody ever forgets, we get a 127.0.0.1 node that shows down in >> > cluster status. No idea how to get rid of that. >> > >> > wunder >> > Walter Underwood >> > wun...@wunderwood.org >> > http://observer.wunderwood.org/ (my blog) >> > >> > > On Mar 29, 2018, at 7:46 AM, Shawn Heisey >> wrote: >> > > >> > > On 3/29/2018 8:25 AM, Abhi Basu wrote: >> > >> "Operation create caused >> > >> exception:":"org.apache.solr.common.SolrException:org. >> > apache.solr.common.SolrException: >> > >> Cannot create collection ems-collection. 
Value of maxShardsPerNode >> is 1, >> > >> and the number of nodes currently live or live and part of your >> > > >> > > I'm betting that all your nodes are registering themselves with the >> same >> > name, and that name is probably either 127.0.0.1 or 127.1.1.0 -- an >> address >> > on the loopback interface. >> > > >> > > Usually this problem (on an OS other than Windows, at least) is caused >> > by an incorrect /etc/hosts file that maps your hostname to a loopback >> > address instead of a real address. >> > > >> > > You can override the value that SolrCloud uses to register itself into >> > zookeeper so it doesn't depend on the OS configuration. In solr.in.sh, >> I >> > think this is the SOLR_HOST variable, which gets translated into >> -Dhost=XXX >> > on the java commandline. It can also be configured in solr.xml. >> > > >> > > Thanks, >> > > Shawn >> > > >> > >> > >> >> -- >> >> >> This message and any attachment are confidential and may be privileged or >> otherwise protected from disclosure. If you are not the intended >> recipient, >> you must not copy this message or attachment or disclose the contents to >> any other person. If you have received this transmission in error, please >> notify the sender immediately and delete the message and any attachment >> from your system. Merck KGaA, Darmstadt, Germany and any of its >> subsidiaries do not accept liability for any omissions or errors in this >> message which may arise as a result of E-Mail-transmission or for damages >> resulting from any unauthorized changes of the content of this message and >> any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its >> subsidiaries do not guarantee that this message is free of viruses and >> does >> not accept liability for any damages caused by any virus transmitted >> therewith. >> >> Click http://www.emdgroup.com/disclaimer to access the German, French, >> Spanish and Portuguese versions of this disclaimer. >> > > > > -- > Abhi Basu > -- Abhi Basu
Re: Solr 7.2 cannot see all running nodes
Just an update. Adding hostnames to solr.xml and using "-z zk1:2181,zk2:2181,zk3:2181" worked and I can see 4 live nodes and able to create collection with 2S/2R. Thanks for your help, greatly appreciate it. Regards, Abhi On Thu, Mar 29, 2018 at 1:45 PM, Abhi Basu <9000r...@gmail.com> wrote: > Also, another question, where it says to copy the zoo.cfg from > /solr72/server/solr folder to /solr72/server/solr/node1/solr, should I > actually be grabbing the zoo.cfg from one of my external zk nodes? > > Thanks, > > Abhi > > On Thu, Mar 29, 2018 at 1:04 PM, Abhi Basu <9000r...@gmail.com> wrote: > >> Ok, will give it a try along with the host name. >> >> >> On Thu, Mar 29, 2018 at 12:20 PM, Webster Homer >> wrote: >> >>> This Zookeeper ensemble doesn't look right. >>> > >>> > ./bin/solr start -cloud -s /usr/local/bin/solr-7.2.1/server/solr/node1/ >>> -p >>> > 8983 -z zk0-esohad,zk1-esohad,zk3-esohad:2181 -m 8g >>> >>> >>> Shouldn't the zookeeper ensemble be specified as: >>> zk0-esohad:2181,zk1-esohad:2181,zk3-esohad:2181 >>> >>> You should put the zookeeper port on each node in the comma separated >>> list. >>> I don't know if this is your problem, but I think your solr nodes will >>> only >>> be connecting to 1 zookeeper >>> >>> On Thu, Mar 29, 2018 at 10:56 AM, Walter Underwood < >>> wun...@wunderwood.org> >>> wrote: >>> >>> > I had that problem. Very annoying and we probably should require >>> special >>> > flag to use localhost. >>> > >>> > We need to start solr like this: >>> > >>> > ./solr start -c -h `hostname` >>> > >>> > If anybody ever forgets, we get a 127.0.0.1 node that shows down in >>> > cluster status. No idea how to get rid of that. 
>>> > >>> > wunder >>> > Walter Underwood >>> > wun...@wunderwood.org >>> > http://observer.wunderwood.org/ (my blog) >>> > >>> > > On Mar 29, 2018, at 7:46 AM, Shawn Heisey >>> wrote: >>> > > >>> > > On 3/29/2018 8:25 AM, Abhi Basu wrote: >>> > >> "Operation create caused >>> > >> exception:":"org.apache.solr.common.SolrException:org. >>> > apache.solr.common.SolrException: >>> > >> Cannot create collection ems-collection. Value of maxShardsPerNode >>> is 1, >>> > >> and the number of nodes currently live or live and part of your >>> > > >>> > > I'm betting that all your nodes are registering themselves with the >>> same >>> > name, and that name is probably either 127.0.0.1 or 127.1.1.0 -- an >>> address >>> > on the loopback interface. >>> > > >>> > > Usually this problem (on an OS other than Windows, at least) is >>> caused >>> > by an incorrect /etc/hosts file that maps your hostname to a loopback >>> > address instead of a real address. >>> > > >>> > > You can override the value that SolrCloud uses to register itself >>> into >>> > zookeeper so it doesn't depend on the OS configuration. In solr.in.sh, >>> I >>> > think this is the SOLR_HOST variable, which gets translated into >>> -Dhost=XXX >>> > on the java commandline. It can also be configured in solr.xml. >>> > > >>> > > Thanks, >>> > > Shawn >>> > > >>> > >>> > >>> >>> -- >>> >>> >>> This message and any attachment are confidential and may be privileged or >>> otherwise protected from disclosure. If you are not the intended >>> recipient, >>> you must not copy this message or attachment or disclose the contents to >>> any other person. If you have received this transmission in error, please >>> notify the sender immediately and delete the message and any attachment >>> from your system. 
Merck KGaA, Darmstadt, Germany and any of its >>> subsidiaries do not accept liability for any omissions or errors in this >>> message which may arise as a result of E-Mail-transmission or for damages >>> resulting from any unauthorized changes of the content of this message >>> and >>> any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its >>> subsidiaries do not guarantee that this message is free of viruses and >>> does >>> not accept liability for any damages caused by any virus transmitted >>> therewith. >>> >>> Click http://www.emdgroup.com/disclaimer to access the German, French, >>> Spanish and Portuguese versions of this disclaimer. >>> >> >> >> >> -- >> Abhi Basu >> > > > > -- > Abhi Basu > -- Abhi Basu
Solr 7.2 solr.log is missing
Not located in the /server/logs/ folder. Have these files instead: solr-8983-console.log and solr_gc.log.0.current. I can see logs from the Solr dashboard. Where is the solr.log file going to? A search for "solr.log" on the system did not find the file. Is the file called something else in SolrCloud mode? log4j.properties shows this:

# Default Solr log4j config
# rootLogger log level may be programmatically overridden by -Dsolr.log.level
solr.log=${solr.log.dir}
log4j.rootLogger=INFO, file, CONSOLE

# Console appender will be programmatically disabled when Solr is started with option -Dsolr.log.muteconsole
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.layout=org.apache.log4j.EnhancedPatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss.SSS} %-5p (%t) [%X{collection} %X{shard} %X{replica} %X{core}] %c{1.} %m%n

#- size rotation with log cleanup.
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.MaxFileSize=4MB
log4j.appender.file.MaxBackupIndex=9

#- File to log to and log format
log4j.appender.file.File=${solr.log}/solr.log
log4j.appender.file.layout=org.apache.log4j.EnhancedPatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss.SSS} %-5p (%t) [%X{collection} %X{shard} %X{replica} %X{core}] %c{1.} %m%n

# Adjust logging levels that should differ from root logger
log4j.logger.org.apache.zookeeper=WARN
log4j.logger.org.apache.hadoop=WARN
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.eclipse.jetty.server.Server=INFO
log4j.logger.org.eclipse.jetty.server.ServerConnector=INFO

# set to INFO to enable infostream log messages
log4j.logger.org.apache.solr.update.LoggingInfoStream=OFF

Thanks, Abhi -- Abhi Basu
Re: Solr 7.2 solr.log is missing
Wow, life is complicated. :) Since I am using this to start Solr, I am assuming the one in /server/scripts/cloud-scripts is being used:

./bin/solr start -cloud -s /usr/local/bin/solr-7.2.1/server/solr/node1/solr -p 8983 -z zk0-esohad:2181,zk1-esohad:2181,zk5-esohad:2181 -m 10g

So, I guess I need to edit that one.

Thanks,
Abhi

On Mon, Apr 2, 2018 at 1:14 PM, Erick Erickson wrote:

> Technically, Solr doesn't name the file at all; that's in your log4j
> config, this line:
>
> log4j.appender.file.File=${solr.log}/solr.log
>
> so it's weird that you can't find it on your machine at all. How do
> you _start_ Solr? In particular, do you define a system variable
> "-Dsolr.log=some_path"?
>
> And also note that there are three log4j configs, and it's easy to be
> using one you don't think you are using; see SOLR-12008.
>
> Best,
> Erick
>
> On Mon, Apr 2, 2018 at 10:02 AM, Abhi Basu <9000r...@gmail.com> wrote:
> > Not located in the /server/logs/ folder.
> >
> > Have these files instead:
> >
> > solr-8983-console.log
> > solr_gc.log.0.current
> >
> > I can see logs from the Solr dashboard. Where is the solr.log file going
> > to? A search for "solr.log" on the system did not find the file.
> >
> > Is the file called something else for SolrCloud mode?
> >
> > log4j.properties shows this:
> >
> > # Default Solr log4j config
> > # rootLogger log level may be programmatically overridden by
> > -Dsolr.log.level
> > solr.log=${solr.log.dir}
> > log4j.rootLogger=INFO, file, CONSOLE
> >
> > # Console appender will be programmatically disabled when Solr is started
> > with option -Dsolr.log.muteconsole
> > log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
> > log4j.appender.CONSOLE.layout=org.apache.log4j.EnhancedPatternLayout
> > log4j.appender.CONSOLE.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss.SSS}
> > %-5p (%t) [%X{collection} %X{shard} %X{replica} %X{core}] %c{1.} %m%n
> >
> > #- size rotation with log cleanup.
> > log4j.appender.file=org.apache.log4j.RollingFileAppender
> > log4j.appender.file.MaxFileSize=4MB
> > log4j.appender.file.MaxBackupIndex=9
> >
> > #- File to log to and log format
> > log4j.appender.file.File=${solr.log}/solr.log
> > log4j.appender.file.layout=org.apache.log4j.EnhancedPatternLayout
> > log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss.SSS}
> > %-5p (%t) [%X{collection} %X{shard} %X{replica} %X{core}] %c{1.} %m%n
> >
> > # Adjust logging levels that should differ from root logger
> > log4j.logger.org.apache.zookeeper=WARN
> > log4j.logger.org.apache.hadoop=WARN
> > log4j.logger.org.eclipse.jetty=WARN
> > log4j.logger.org.eclipse.jetty.server.Server=INFO
> > log4j.logger.org.eclipse.jetty.server.ServerConnector=INFO
> >
> > # set to INFO to enable infostream log messages
> > log4j.logger.org.apache.solr.update.LoggingInfoStream=OFF
> >
> > Thanks,
> >
> > Abhi
> >
> > --
> > Abhi Basu

--
Abhi Basu
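The behavior Erick describes comes down to log4j property substitution: `solr.log=${solr.log.dir}` means the log directory is whatever the `-Dsolr.log.dir` system property was set to when Solr started, and `log4j.appender.file.File=${solr.log}/solr.log` builds the final path from it. The sketch below is not Solr code, just a minimal illustration of that `${name}` resolution (the `/var/solr/logs` path is a made-up example):

```python
import re

def resolve(value: str, props: dict) -> str:
    """Expand ${name} placeholders the way log4j property substitution does.

    Unknown properties are left intact, which is why an unset -Dsolr.log.dir
    can silently produce an unexpected log location.
    """
    return re.sub(r"\$\{([^}]+)\}",
                  lambda m: props.get(m.group(1), m.group(0)),
                  value)

# System properties as a start script might set them (example path, not a default)
sysprops = {"solr.log.dir": "/var/solr/logs"}

solr_log = resolve("${solr.log.dir}", sysprops)
log_file = resolve("${solr.log}/solr.log", {"solr.log": solr_log})
print(log_file)  # /var/solr/logs/solr.log
```

If `solr.log.dir` is never defined, the placeholder survives unresolved, which matches the symptom of solr.log not appearing where expected.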
Re: Largest number of indexed documents used by Solr
We have tested Solr 4.10 with 200 million docs with an avg doc size of 250 KB. No issues with performance when using 3 shards / 2 replicas.

On Tue, Apr 3, 2018 at 8:12 PM, Steven White wrote:

> Hi everyone,
>
> I'm about to start a project that requires indexing 36 million records
> using Solr 7.2.1. Each record ranges from 500 KB to 0.25 MB, where the
> average is 0.1 MB.
>
> Has anyone indexed this number of records? What are the things I should
> worry about? And out of curiosity, what is the largest number of records
> that Solr has indexed which is published out there?
>
> Thanks
>
> Steven

--
Abhi Basu
Re: Decision on Number of shards and collection
*The BKM I have read so far (trying to find the source) says 50 million docs/shard performs well. I have found this in my recent tests as well. But of course it depends on index structure, etc.*

On Wed, Apr 11, 2018 at 10:37 AM, Shawn Heisey wrote:

> On 4/11/2018 4:15 AM, neotorand wrote:
> > I believe heterogeneous data can be indexed to the same collection, and I
> > can have multiple shards for the index to be partitioned. So what's the
> > need of a second collection? Yes, when collection size grows I should
> > look for more collections. What exactly is that size? What KPI drives the
> > decision of having more collections? Any pointers or links for best
> > practice?
>
> There are no hard rules. Many factors affect these decisions.
>
> https://lucidworks.com/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
>
> Creating multiple collections should be done when there is a logical or
> business reason for keeping different sets of data separate from each
> other. If there's never any need for people to query all the data at
> once, then it might make sense to use separate collections. Or you
> might want to put them together just for convenience, and use data in
> the index to filter the results to only the information that the user is
> allowed to access.
>
> > When should I go for multiple shards?
> > Yes, when shard size grows, right? What's the size, and how do I benchmark?
>
> Some indexes function really well with 300 million documents or more per
> shard. Other indexes struggle with less than a million per shard. It's
> impossible to give you any specific number. It depends on a bunch of
> factors.
>
> If query rate is very high, then you want to keep the shard count low.
> Using one shard might not be possible due to index size, but it should
> be as low as you can make it. You're also going to want to have a lot
> of replicas to handle the load.
>
> If query rate is extremely low, then sharding the index can actually
> *improve* performance, because there will be idle CPU capacity that can
> be used for the subqueries.
>
> Thanks,
> Shawn

--
Abhi Basu
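As Shawn says, there is no universal docs-per-shard number, so any formula is only a starting point for benchmarking. Still, the 50M-docs/shard rule of thumb from earlier in this thread can be turned into a first estimate (the function and its default are illustrative, not a Solr guarantee):

```python
import math

def estimate_shards(total_docs: int, target_docs_per_shard: int = 50_000_000) -> int:
    """Starting-point shard count from a docs-per-shard heuristic.

    This is only a first guess; actual capacity depends on index structure,
    query rate, and hardware, and must be validated by benchmarking.
    """
    if total_docs <= 0:
        raise ValueError("total_docs must be positive")
    return max(1, math.ceil(total_docs / target_docs_per_shard))

# 200M docs at 50M docs/shard -> 4 shards as a first guess
print(estimate_shards(200_000_000))  # 4
# 36M docs fits in a single shard by this heuristic
print(estimate_shards(36_000_000))   # 1
```

Note that this ignores Shawn's query-rate point entirely: a high query rate argues for fewer shards and more replicas, a very low one may tolerate more shards than the heuristic suggests.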