Solr Cloud Security not working for internal authentication
I am trying to use Solr security on Solr 5.0 in cloud mode. This is the process I followed:

1. Modified web.xml: AdminAllowedQueries /admin/* admin BASIC Solr Realm Admin admin
2. Changed jetty.xml: Solr Realm /etc/realm.properties 0
3. Created realm.properties:
   solradmin: solradmin,admin
4. Set SOLR_OPTS in solr.in.sh:
   SOLR_OPTS="$SOLR_OPTS -DinternalAuthCredentialsBasicAuthUsername=solradmin"
   SOLR_OPTS="$SOLR_OPTS -DinternalAuthCredentialsBasicAuthPassword=solradmin"

I get an Unauthorized error when creating a collection with the following command:

curl -i -X GET \
  -H "Authorization:Basic c29scmFkbWluOnNvbHJhZG1pbg==" \
  'http://localhost:8080/solr/admin/collections?action=CREATE&name=test&collection.configName=testconf&numShards=1'

Kindly help or suggest the best way to get this done. Thanks in advance.

Regards,
Swaraj Kumar
Senior Software Engineer I
MakeMyTrip.com
✆ +91-9811774497
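When debugging a request like the curl command above, it can help to rule out a malformed header first. The following is a minimal sketch, assuming the credentials from realm.properties (solradmin / solradmin); the function names are hypothetical helpers, not part of Solr. It rebuilds the Basic auth token and the Collections API URL so they can be compared with what was actually sent.

```python
import base64
from urllib.parse import urlencode


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def create_collection_url(base_url: str, name: str, config_name: str, num_shards: int) -> str:
    """Build the Collections API CREATE URL used in the message (hypothetical helper)."""
    params = {
        "action": "CREATE",
        "name": name,
        "collection.configName": config_name,
        "numShards": num_shards,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    print(basic_auth_header("solradmin", "solradmin"))
    print(create_collection_url("http://localhost:8080/solr", "test", "testconf", 1))
```

If the printed header matches the one in the curl command, the token itself is well formed, and the Unauthorized response is more likely to come from the server-side configuration.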
Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr
I have used the following, and it works very fast in DIH on Solr 5.0. You can try this for getting groupNames from the regex.

Regards,
Swaraj Kumar
Senior Software Engineer I
MakeMyTrip.com
+91-9811774497
Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr
I am not sure, but the following regex has worked for me in Java. Kindly check it against yours.

([^\x01])\x01([^\x01])\x01..([^\x01])$

Thanks,
Swaraj
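The idea behind a pattern like the one above can be sketched as follows: one capture group per field, with the \x01 control character as the delimiter. This is only an illustration under assumptions. It assumes each group is meant to capture a whole field, so it uses a `*` quantifier that does not appear in the pattern as posted, and `build_pattern` and `split_record` are hypothetical names.

```python
import re

FIELD = r"([^\x01]*)"  # assumption: one group captures one whole field


def build_pattern(num_fields: int) -> re.Pattern:
    """Compile a pattern with num_fields groups separated by \\x01."""
    return re.compile(r"\x01".join([FIELD] * num_fields))


def split_record(line: str, num_fields: int):
    """Return the captured fields, or None if the field count differs."""
    match = build_pattern(num_fields).fullmatch(line)
    return list(match.groups()) if match else None


if __name__ == "__main__":
    record = "\x01".join(["Delhi", "IN", "110001"])
    print(split_record(record, 3))
```

For a 28-column file the same builder would be called with `num_fields=28`; a plain split on the delimiter is usually simpler when named groups are not needed.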
Re: Setting up SolrCloud 5.0.0 and ZooKeeper 3.4.6
As per http://stackoverflow.com/questions/11765015/zookeeper-not-starting, running zkServer.cmd without the "start" argument will fix this.

One more change you need to make: Solr runs on port 8983 by default, and you have used 8983 as the ZooKeeper clientPort, so start Solr on a different port.

Regards,
Swaraj Kumar
Senior Software Engineer I
MakeMyTrip.com
Mob No- 9811774497

On Tue, Apr 7, 2015 at 9:42 AM, Zheng Lin Edwin Yeo wrote:

> Hi Erick,
>
> I think I'll just setup the ZooKeeper server in standalone mode first,
> before I get more confused as I'm quite new to both Solr and ZooKeeper too.
> Better not to jump the gun.
>
> However, I face this error when I try to start it in standalone mode.
>
> 2015-04-07 11:59:51,789 [myid:] - ERROR [main:ZooKeeperServerMain@54] - Invalid arguments, exiting abnormally
> java.lang.NumberFormatException: For input string: "C:\Users\edwin\zookeeper-3.4.6\bin\..\conf\zoo.cfg"
> at java.lang.NumberFormatException.forInputString(Unknown Source)
> at java.lang.Integer.parseInt(Unknown Source)
> at java.lang.Integer.parseInt(Unknown Source)
> at org.apache.zookeeper.server.ServerConfig.parse(ServerConfig.java:60)
> at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:83)
> at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)
> at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
> at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
> 2015-04-07 11:59:51,796 [myid:] - INFO [main:ZooKeeperServerMain@55] - Usage: ZooKeeperServerMain configfile | port datadir [ticktime] [maxcnxns]
>
> I have the following information in my zoo.cfg:
>
> tickTime=2000
> initLimit=10
> syncLimit=5
> dataDir=C:\\Users\\edwin\\zookeeper-3.4.6\\singleserver
> clientPort=8983
>
> I got the same error even if I set the clientPort=2888.
>
> Regards,
> Edwin
>
> On 7 April 2015 at 11:26, Erick Erickson wrote:
>
> > Believe me, I'm no Zookeeper expert, but it looks to me like you're
> > mixing Solr ports and Zookeeper ports. AFAIK, the two ports in
> > the zoo.cfg file are exclusively for the Zookeeper instances to talk
> > to each other. Zookeeper isn't aware that the listening nodes are
> > Solr nodes, so putting Solr ports in there is confusing Zookeeper
> > I'd guess.
> >
> > Assuming you're starting your three ZK instances on ports 2888, 2889 and 2890,
> > I'd expect the proper ports are
> > 2888:3888
> > 2889:3889
> > 2890:3890
> >
> > But as I said I'm not a Zookeeper expert so beware..
> >
> > Best,
> > Erick
> >
> > On Mon, Apr 6, 2015 at 7:57 PM, Zheng Lin Edwin Yeo wrote:
> > > Hi,
> > >
> > > I'm using Solr 5.0.0 and ZooKeeper 3.4.6. I'm trying to set up a ZooKeeper
> > > with simulation of 3 servers, but they are all located on the same machine
> > > for testing purpose.
> > >
> > > In my zoo.cfg file, I have listed down the 3 servers to be as follows:
> > > server.1=localhost:8983:3888
> > > server.2=localhost:8984:3889
> > > server.3=localhost:8985:3890
> > >
> > > Then I try to start Solr using the following command:
> > > bin/solr start -e cloud -z localhost:8983-noprompt
> > >
> > > However, I'm unable to establish a connection from my Solr to the
> > > ZooKeeper. Is this configuration possible, or is there anything which I
> > > missed out?
> > >
> > > Thank you in advance for your help.
> > >
> > > Regards,
> > > Edwin
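The NumberFormatException in this thread is consistent with the usage line printed in the log, "configfile | port datadir [ticktime] [maxcnxns]". The sketch below is a simplified model of that rule written for illustration, not ZooKeeper's actual code, and it assumes the startup script passed an extra argument (such as "start") after the config path. With one argument the value is treated as a config file; with more, the first argument must be a port number.

```python
def parse_server_args(args):
    """Simplified model of 'configfile | port datadir [ticktime] [maxcnxns]'."""
    if len(args) == 1:
        # A single argument is taken as the path to the config file.
        return {"config_file": args[0]}
    if len(args) < 2:
        raise ValueError("expected: configfile | port datadir [ticktime] [maxcnxns]")
    # With two or more arguments the first one is parsed as a port number,
    # so a file path in this position fails the integer conversion.
    parsed = {"port": int(args[0]), "data_dir": args[1]}
    if len(args) > 2:
        parsed["tick_time"] = int(args[2])
    if len(args) > 3:
        parsed["max_cnxns"] = int(args[3])
    return parsed


if __name__ == "__main__":
    print(parse_server_args([r"C:\zk\conf\zoo.cfg"]))
    try:
        parse_server_args([r"C:\zk\conf\zoo.cfg", "start"])
    except ValueError as err:
        print("rejected:", err)
```

In this model, dropping the extra argument makes the call fall into the single-argument branch, which matches the advice to run the script without "start".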
Re: What is the best way of Indexing different formats of documents?
You can choose either DIH or /update/extract to index documents in Solr. DIH has several benefits, which I am listing below:

1. Clean and update using a single command.
2. DIH can also optimize the index after an import, using optimize=true.
3. You can do a delta-import based on the last index time, whereas with /update/extract you have to handle delta imports manually.
4. You can use multiple entity processors and transformers with DIH, which is very useful for indexing exactly the data you want.
5. The query parameter "rows" limits the number of records.

Regards,
Swaraj Kumar
Senior Software Engineer I
MakeMyTrip.com
Mob No- 9811774497

On Tue, Apr 7, 2015 at 4:18 PM, sangeetha.subraman...@gtnexus.com <sangeetha.subraman...@gtnexus.com> wrote:

> Hi,
>
> I am a newbie to SOLR and basically from database background. We have a
> requirement of indexing files of different formats (x12,edifact, csv,xml).
> The files which are inputted can be of any format and we need to do a
> content based search on it.
>
> From the web I understand we can use TIKA processor to extract the content
> and store it in SOLR. What I want to know is, is there any better approach
> for indexing files in SOLR ? Can we index the document through streaming
> directly from the Application ? If so what is the disadvantage of using it
> (against DIH which fetches from the database)? Could someone share me some
> insight on this ? Is there any web links which I can refer to get some idea
> on it ? Please do help.
>
> Thanks
> Sangeetha
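The DIH options listed above are passed as request parameters to the import handler. The following is a minimal sketch of how such requests could be assembled; the host, the core name "docs", the handler path /dataimport, and the helper name `dih_url` are assumptions made for illustration and depend on how the handler is registered in solrconfig.xml.

```python
from urllib.parse import urlencode


def dih_url(core_url: str, command: str, **options) -> str:
    """Build a DataImportHandler request URL (hypothetical helper)."""
    params = {"command": command}
    for key, value in options.items():
        # Solr expects lowercase true/false for boolean parameters.
        params[key] = str(value).lower() if isinstance(value, bool) else value
    return f"{core_url}/dataimport?{urlencode(params)}"


if __name__ == "__main__":
    core = "http://localhost:8983/solr/docs"  # assumed host and core name
    # Points 1 and 2: clean, commit and optimize in a single command.
    print(dih_url(core, "full-import", clean=True, commit=True, optimize=True))
    # Point 3: only pick up changes since the last import.
    print(dih_url(core, "delta-import", commit=True))
    # Point 5: limit the number of records, for example while testing.
    print(dih_url(core, "full-import", clean=False, rows=100))
```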
Re: Deploying multiple ZooKeeper ensemble on a single machine
Hi Zheng,

I am not sure whether the command "zkServer.cmd start zoo.cfg" works on Windows or not, but zkServer.cmd calls zkEnv.cmd, where "ZOOCFG=%ZOOCFGDIR%\zoo.cfg" is set. So, if you want to run multiple instances of ZooKeeper, change zoo.cfg there to your config file and start ZooKeeper. The command should not include "start".

Regards,
Swaraj Kumar
Senior Software Engineer I
MakeMyTrip.com
Mob No- 9811774497

On Wed, Apr 8, 2015 at 12:29 PM, Zheng Lin Edwin Yeo wrote:

> Thank you nutchsolruser and Shawn.
>
> I've changed the clientPort to different port for each of the machine.
> It is able to work for my another setup, in which I have 3 different
> zookeeper folder, and each has its own configuration and all are using
> zoo.cfg. For that setup I can start the 3 servers individually using
> zkServer.cmd.
>
> However, when I use zkServer.cmd in the setup which I posted earlier, only
> the first server managed to get started up, and I see the same error
> message for the other 2 servers. Although some documents says use the
> following commands, it doesn't help too, since I'm supposed to use the
> zkServer.cmd
> zkServer.cmd start zoo.cfg
> zkServer.cmd start zoo2.cfg
> zkServer.cmd start zoo3.cfg
>
> Regards,
> Edwin
>
> On 8 April 2015 at 13:46, Shawn Heisey wrote:
>
> > On 4/7/2015 9:16 PM, Zheng Lin Edwin Yeo wrote:
> > > I'm using SolrCloud 5.0.0 and ZooKeeper 3.4.6 running on Windows, and now
> > > I'm trying to deploy a multiple ZooKeeper ensemble (3 servers) on a single
> > > machine. These are the settings which I have configured, according to the
> > > Solr Reference Guide.
> > >
> > > These files are under \conf\ directory
> > > (C:\Users\edwin\zookeeper-3.4.6\conf)
> > >
> > > *zoo.cfg*
> > > tickTime=2000
> > > initLimit=10
> > > syncLimit=5
> > > dataDir=C:\\Users\\edwin\\zookeeper-3.4.6\\1
> > > clientPort=2181
> > > server.1=localhost:2888:3888
> > > server.2=localhost:2889:3889
> > > server.3=localhost:2890:3890
> > >
> > > [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] - Cannot
> > > open channel to 2 at election address localhost/127.0.0.1:3889
> > > java.net.ConnectException: Connection refused: connect
> >
> > The first thing I would suspect when running any network program on a
> > Windows machine that won't communicate is the Windows firewall, unless
> > you have either turned off the firewall or you have explicitly
> > configured an exception in the firewall for the relevant ports.
> >
> > Your other reply that you got from nutchsolruser does point out that all
> > three zookeeper configs are using 2181 as the clientPort. Because these
> > are all running on the same machine, you must use a different port for
> > each one. I'm not sure what happens to subsequent processes after the
> > first one starts, but they won't work even if they do manage to start.
> >
> > Thanks,
> > Shawn
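The point made in this thread, that each ZooKeeper instance on one machine needs its own clientPort and dataDir while all of them share the same server.N list, can be sketched as a small generator. This is only an illustration: the file names zoo1.cfg to zoo3.cfg, the data directories, and the function name are hypothetical, and the tickTime, initLimit and syncLimit values are copied from the posted zoo.cfg. Each dataDir would also need a myid file containing that server's number.

```python
def ensemble_configs(data_dirs, client_port=2181, peer_port=2888, election_port=3888):
    """Return {file name: zoo.cfg text} for a single-machine test ensemble."""
    count = len(data_dirs)
    # The server.N list is identical in every file; only the ports differ per server.
    servers = [
        f"server.{i}=localhost:{peer_port + i - 1}:{election_port + i - 1}"
        for i in range(1, count + 1)
    ]
    configs = {}
    for i, data_dir in enumerate(data_dirs, start=1):
        lines = [
            "tickTime=2000",
            "initLimit=10",
            "syncLimit=5",
            f"dataDir={data_dir}",
            f"clientPort={client_port + i - 1}",  # must be unique per instance
            *servers,
        ]
        configs[f"zoo{i}.cfg"] = "\n".join(lines) + "\n"
    return configs


if __name__ == "__main__":
    dirs = ["/var/zk/1", "/var/zk/2", "/var/zk/3"]  # hypothetical directories
    for name, text in ensemble_configs(dirs).items():
        print(f"--- {name} ---\n{text}")
```

Solr would then be pointed at the client ports (2181, 2182, 2183 in this sketch), not at the peer or election ports.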
Re: What is the best way of Indexing different formats of documents?
Hi Sangeetha,

/update/extract refers to the ExtractingRequestHandler. If you only want to index the data, you can do it with the ExtractingRequestHandler. I don't think it requires metadata, but you need to provide literal.id to supply the value of the unique id field.

For more information:
https://wiki.apache.org/solr/ExtractingRequestHandler
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Solr+Cell+using+Apache+Tika

Regards,
Swaraj Kumar
Senior Software Engineer I
MakeMyTrip.com
Mob No- 9811774497

On Wed, Apr 8, 2015 at 2:20 PM, sangeetha.subraman...@gtnexus.com <sangeetha.subraman...@gtnexus.com> wrote:

> Hi Swaraj,
>
> Thanks for the answers.
>
> From my understanding We can index,
>
> · Using DIH from db
> · Using DIH from filesystem - this is where I am concentrating on.
>   o For this we can use SolrJ with Tika(solr cell) from Java layer in
>     order to extract the content and send the data through REST API to
>     solrserver
>   o Or we can use extractrequesthandler to do the job.
>
> I just want to index only certain documents and there will not be any
> update happening on the indexed document.
>
> In our existing system we already have DIH implemented which indexes
> document from sql server (As you said based on last index time). In this
> case the metadata is there available in database.
>
> But if we are streaming via url, we would need to append the metadata too.
> correct me if i am wrong. And how does the indexing happening here based on
> last index time or something else ? Also for extractrequesthandler when
> you say manual operation what is it you are talking about ? Can you please
> clarify.
>
> Thanks
> Sangeetha
>
> -Original Message-
> From: Swaraj Kumar [mailto:swaraj2...@gmail.com]
> Sent: 07 April 2015 18:02
> To: solr-user@lucene.apache.org
> Subject: Re: What is the best way of Indexing different formats of documents?
>
> You can always choose either DIH or /update/extract to index docs in solr.
>
> Now there are multiple benefits of DIH which I am listing below :-
>
> 1. Clean and update using a single command.
> 2. DIH also optimize indexing using optimize=true
> 3. You can do delta-import based on last index time where as in case of
>    /update/extract you need to do manual operation in case of delta import.
> 4. You can use multiple entity processor and transformers in case of DIH
>    which is very useful to index exact data you want.
> 5. Query parameter "rows" limits the num of records.
>
> Regards,
>
> Swaraj Kumar
> Senior Software Engineer I
> MakeMyTrip.com
> Mob No- 9811774497
>
> On Tue, Apr 7, 2015 at 4:18 PM, sangeetha.subraman...@gtnexus.com <sangeetha.subraman...@gtnexus.com> wrote:
>
> > Hi,
> >
> > I am a newbie to SOLR and basically from database background. We have
> > a requirement of indexing files of different formats (x12,edifact, csv,xml).
> > The files which are inputted can be of any format and we need to do a
> > content based search on it.
> >
> > From the web I understand we can use TIKA processor to extract the
> > content and store it in SOLR. What I want to know is, is there any
> > better approach for indexing files in SOLR ? Can we index the document
> > through streaming directly from the Application ? If so what is the
> > disadvantage of using it (against DIH which fetches from the
> > database)? Could someone share me some insight on this ? Is there any
> > web links which I can refer to get some idea on it ? Please do help.
> >
> > Thanks
> > Sangeetha
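As a rough illustration of the /update/extract route discussed in this thread, the sketch below only builds the request URL with a literal.id parameter; the file itself would be sent as the body of a POST to that URL. The host, the core name "docs", the document id, and the helper name `extract_url` are assumptions for the example.

```python
from urllib.parse import urlencode


def extract_url(core_url: str, doc_id: str, commit: bool = True) -> str:
    """Build an /update/extract URL that sets the unique id via literal.id."""
    params = {
        "literal.id": doc_id,                     # value stored in the id field
        "commit": "true" if commit else "false",  # make the document searchable right away
    }
    return f"{core_url}/update/extract?{urlencode(params)}"


if __name__ == "__main__":
    # Assumed host and core; the id could be a file name or a key from the application.
    print(extract_url("http://localhost:8983/solr/docs", "invoice-001"))
```

Because nothing here tracks a last index time, the application has to decide which files to send, which is the manual step mentioned for delta imports.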
Suggestion in Solr Cloud
Hi All,

I want to use the suggest option in Solr, but my Solr is in cloud mode, so to get suggestions I need to provide the shard URLs with every query, like below:

http://node1/solr/city/suggest?suggest.dictionary=solr-suggester&suggest=true&suggest.build=true&suggest.q=Delhi&shards=node1/solr/city,node2/solr/city&shards.qt=/suggest

My requirement is this: is there any way to get the same suggestions without providing shards in the query?

Regards,
Swaraj Kumar
Senior Software Engineer I
MakeMyTrip.com
Mob No- 9811774497
Re: Suggestion in Solr Cloud
Does anyone have any idea on this?

Regards,
Swaraj Kumar
Senior Software Engineer I
MakeMyTrip.com
Mob No- 9811774497

On Wed, Apr 22, 2015 at 11:47 AM, Swaraj Kumar wrote:

> Hi All,
>
> I want to use suggest option in solr but my SOLR is in cloud mode hence to
> get the suggestion every time in query I need to provide shard url with it
> like below:-
>
> http://node1/solr/city/suggest?suggest.dictionary=solr-suggester&suggest=true&suggest.build=true&suggest.q=Delhi&shards=node1/solr/city,node2/solr/city&shards.qt=/suggest
>
> Here my requirement is, if any ways where I get the same suggestion by
> not providing shards in the query.
>
> Regards,
>
> Swaraj Kumar
> Senior Software Engineer I
> MakeMyTrip.com
> Mob No- 9811774497
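To make the structure of the query in this thread explicit, here is a minimal sketch that assembles the same URL from a list of shard addresses. It does not remove the need for the shards parameter, which is the open question of the thread; it only shows which parts of the request are fixed and which depend on the cluster layout. The function name is a hypothetical helper.

```python
from urllib.parse import urlencode


def suggest_url(node: str, collection: str, dictionary: str, query: str, shards) -> str:
    """Build the distributed suggest request used in the message."""
    params = {
        "suggest.dictionary": dictionary,
        "suggest": "true",
        "suggest.build": "true",
        "suggest.q": query,
        "shards": ",".join(shards),  # depends on the cluster layout
        "shards.qt": "/suggest",     # handler used on each shard
    }
    # Keep '/' and ',' readable, as in the original URL.
    return f"http://{node}/solr/{collection}/suggest?{urlencode(params, safe='/,')}"


if __name__ == "__main__":
    print(suggest_url("node1", "city", "solr-suggester", "Delhi",
                      ["node1/solr/city", "node2/solr/city"]))
```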
Issue with Solr Suggester
I am trying to implement the Suggester in Solr 5.0. Below is my configuration:

my-suggester
FuzzyLookupFactory
DocumentDictionaryFactory
name
cityname
id
text_general
true

true
10
suggest

In schema.xml:

My data is like:

Delhi
New Delhi
The New Castle

When I query using the following parameters:

http://localhost/solr/location/suggest?suggest.dictionary=my-suggester&suggest=true&suggest.build=true&suggest.q=Delhi

I get only "Delhi" in the result, but the ideal result is both Delhi and New Delhi.

It would be very helpful if you could suggest how to achieve this.

Regards,
Swaraj
Re: Issue with Solr Suggester
This is working as expected, but I run into a problem with misspelled searches. When I give suggest.q=Bhopml, the fuzzy lookup suggests bhopal, which is correct, but AnalyzingInfixSuggester doesn't provide this.

Regards,
Swaraj Kumar
Senior Software Engineer I
MakeMyTrip.com
Mob No- 9811774497

On Thu, Apr 23, 2015 at 9:38 PM, Erick Erickson wrote:

> I'm pretty sure that FuzzyLookup only goes from the beginning of the
> field, so this is not surprising. To get what you're looking for you
> probably would get more joy from the AnalyzingInfixSuggester.
>
> Best,
> Erick
>
> On Thu, Apr 23, 2015 at 6:20 AM, Swaraj Kumar wrote:
> > I am trying to implement Suggester in SOLR 5.0,
> >
> > Below is my configuration :-
> >
> > my-suggester
> > FuzzyLookupFactory
> > DocumentDictionaryFactory
> > name
> > cityname
> > id
> > text_general
> > true
> >
> > startup="lazy">
> > true
> > 10
> > suggest
> >
> > In Schema.xml :-
> >
> > positionIncrementGap="100">
> > words="stopwords.txt" />
> > words="stopwords.txt" />
> > ignoreCase="true" expand="true"/>
> >
> > My data is like :-
> >
> > Delhi
> > New Delhi
> > The New Castle
> >
> > When I Query using following parameter below :-
> >
> > http://localhost/solr/location/suggest?suggest.dictionary=my-suggester&suggest=true&suggest.build=true&suggest.q=Delhi
> >
> > I get "Delhi" only in result but the ideal result is both Delhi and New
> > Delhi.
> >
> > It will be very helpful if you guyz suggest me how to achieve this.
> >
> > Regards,
> > Swaraj
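The difference described in this thread, matching from the beginning of the field versus matching inside it, can be shown with a toy model. This is not how the Solr lookups are implemented. It is a plain-Python sketch using the three sample entries from the message, and it leaves out the fuzzy (misspelling-tolerant) part entirely.

```python
CITIES = ["Delhi", "New Delhi", "The New Castle"]  # sample data from the message


def prefix_suggest(query, entries):
    """Match only entries that start with the query (beginning-of-field lookup)."""
    q = query.lower()
    return [entry for entry in entries if entry.lower().startswith(q)]


def infix_suggest(query, entries):
    """Match entries where any word starts with the query (infix-style lookup)."""
    q = query.lower()
    return [
        entry for entry in entries
        if any(word.startswith(q) for word in entry.lower().split())
    ]


if __name__ == "__main__":
    print(prefix_suggest("Delhi", CITIES))
    print(infix_suggest("Delhi", CITIES))
```

In this model the prefix lookup finds only the entry that begins with the query, which mirrors the behaviour reported in the original question, while neither function would tolerate a misspelling such as "Bhopml".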
Delete Collection in SolrCloud
Hi,

I was trying to delete a collection in SolrCloud, but some servers didn't respond, so some shards and replicas didn't get deleted. I manually deleted the on-disk data of the remaining shards and replicas, but I can still see the collection referenced in SolrCloud, which reports that it is not able to find the cores.

Now when I try to create the collection again, it fails, saying the collection is already present.

I have deleted the collection reference in ZooKeeper under /collection and also deleted the config from /configs. There is nothing in my clusterstate.json.

Please help: how can I create the collection?

Regards,
Swaraj Kumar
Senior Software Engineer I
MakeMyTrip.com
Mob No- 9811774497