Securing Solr 5.0.0

2015-03-22 Thread Frederik Arnold
I followed the "Taking Solr to Production" tutorial and I now have a Solr 5.0.0 instance up and running. What is the recommended way of securing Solr? Searching should be available to everyone, but I want authentication for the Solr Admin UI and also for posting and deleting files.

schemaless slow indexing

2015-03-22 Thread Mike Murphy
I'm trying out schemaless mode in Solr 5.0, but indexing seems quite a bit slower than it was on 4.10. Any pointers? --Mike

Re: Securing Solr 5.0.0

2015-03-22 Thread Erick Erickson
Have you looked at https://wiki.apache.org/solr/SolrSecurity? Best, Erick On Sun, Mar 22, 2015 at 4:20 AM, Frederik Arnold wrote: > I followed the "Taking Solr to Production" tutorial and I now have a > Solr 5.0.0 instance up and running. > > What is the recommended way for securing Solr? > Sea

Re: schemaless slow indexing

2015-03-22 Thread Erick Erickson
Please review: http://wiki.apache.org/solr/UsingMailingLists You haven't quantified the slowdown. Or given any details on how you're measuring the "slowdown". Or how you've configured your setups in 4.10 and 5.0. Or... As Hossman would say "details matter". Best, Erick On Sun, Mar 22, 2015 at 8:

Re: schemaless slow indexing

2015-03-22 Thread Mike Murphy
I start up Solr in schemaless mode and index a bunch of data, and it takes a lot longer to finish indexing. No configuration changes, just straight schemaless. --Mike On Sun, Mar 22, 2015 at 12:27 PM, Erick Erickson wrote: > Please review: http://wiki.apache.org/solr/UsingMailingLists > > You haven't qu

Re: Need help using DIH with FileListEntityProcessor with XPathEntityProcessor

2015-03-22 Thread Martin Wunderlich
Hi Alex, Thanks a lot for the reply and apologies for being unclear. The XPathEntityProcessor provides an option to specify an XSLT file that should be applied to the XML input prior to the actual data import. I am including my current configuration below, with the respective attribute highlig

Re: Securing Solr 5.0.0

2015-03-22 Thread Frederik Arnold
I have, and I tried all sorts of things, but they didn't work. But I figured it out now. I set up Apache as a reverse proxy and it works. 2015-03-22 17:25 GMT+01:00 Erick Erickson : > Have you looked at https://wiki.apache.org/solr/SolrSecurity? > > Best, > Erick > > On Sun, Mar 22, 2015 at 4:20 AM,
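Editorial sketch: the message above does not include the proxy configuration, so the following only illustrates the access rule described in the original question (open search, authenticated admin UI and updates). It is a minimal Python sketch, not the poster's setup; the function name `requires_auth`, the `/solr/` URL prefix, and the choice of `/select` as the only public handler are all assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Assumption: only read-only search handlers are public in this sketch.
PUBLIC_SUFFIXES = ("/select",)


def requires_auth(method: str, url: str) -> bool:
    """Return True when a fronting proxy should demand credentials."""
    path = urlsplit(url).path.rstrip("/")
    # Anything that is not below /solr/ (including the admin UI root)
    # is treated as protected.
    if not path.startswith("/solr/"):
        return True
    # Public only when it is a GET request to a search handler.
    is_public = method.upper() == "GET" and path.endswith(PUBLIC_SUFFIXES)
    return not is_public
```

With this rule, a GET to a core's `/select` handler passes through without credentials, while update requests and the admin UI are challenged. A real deployment would express the same rule in the proxy's own configuration.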

Re: Solr hangs / LRU operations are heavy on cpu

2015-03-22 Thread Umesh Prasad
We use filters very heavily because we run an e-commerce site which has a lot of faceting and drill-downs configured at different paths on the store. We are using master-slave replication and we use slaves to support higher QPS. filterCache : Concurrent LFU Cache(maxSize=1, initialSize

Re: Need help using DIH with FileListEntityProcessor with XPathEntityProcessor

2015-03-22 Thread Alexandre Rafalovitch
I am not entirely sure your problem is at the XSL level yet. *) I see problems with quotes in two places (in datasource, and in outer entity). Did you paste definitions from MS Word by any chance? *) I see that you declare outer entity to be rootEntity=true, so you will not get anything from inner

Re: schemaless slow indexing

2015-03-22 Thread Alexandre Rafalovitch
Same data with the same version of Solr, with the only difference being Schema vs. Schemaless? How much longer, 10%, 2x, 20x? Schemaless mode has a much more complex UpdateRequestProcessor chain, that's partially what makes it schemaless. But I hesitate to point fingers at that without any real detai

Re: schemaless slow indexing

2015-03-22 Thread Yonik Seeley
I took a quick look at the stock schemaless configs... unfortunately they contain a performance trap. There's a copyField by default that copies *all* fields to a catch-all field called "_text". IMO, that's not a great default. Double the index size (well, the "index" portion of it at least... no
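Editorial sketch: the message above describes a catch-all copyField that copies every field into "_text". The Python snippet below shows, under stated assumptions, what removing such a rule from a schema document could look like. The sample XML is a made-up stand-in for a schema (only the catch-all rule and the "_text" field name come from the thread), and the helper name `drop_catch_all_copyfield` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: a tiny stand-in schema, not the stock Solr 5.0 file.
SAMPLE = """<schema name="example" version="1.5">
  <field name="id" type="string" indexed="true" stored="true"/>
  <field name="_text" type="text_general" indexed="true" stored="false" multiValued="true"/>
  <copyField source="*" dest="_text"/>
</schema>"""


def drop_catch_all_copyfield(schema_xml: str, dest: str = "_text") -> str:
    """Remove <copyField source="*" dest=dest/> rules and return the new XML."""
    root = ET.fromstring(schema_xml)
    # Snapshot the elements first so removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if (child.tag == "copyField"
                    and child.get("source") == "*"
                    and child.get("dest") == dest):
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(drop_catch_all_copyfield(SAMPLE))
```

Note that `ElementTree` discards XML comments when parsing, so this is only an illustration of the change, not a tool for rewriting a production schema in place.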

Re: How to use ConcurrentUpdateSolrServer for Secured Solr?

2015-03-22 Thread Ramkumar R. Aiyengar
Not a direct answer, but Anshum just created this: https://issues.apache.org/jira/browse/SOLR-7275 On 20 Mar 2015 23:21, "Furkan KAMACI" wrote: > Is there any way to use ConcurrentUpdateSolrServer for secured Solr, like > CloudSolrServer: > > HttpClientUtil.setBasicAuth(cloudSolrServer.getLbS

Re: schemaless slow indexing

2015-03-22 Thread Mike Murphy
That's it! I hand edited the file that says you are not supposed to edit it and removed that copyField. Indexing performance is now back to expected levels. I created an issue for this, https://issues.apache.org/jira/browse/SOLR-7284 --Mike On Sun, Mar 22, 2015 at 3:29 PM, Yonik Seeley wrote: >

Error trying to index files to Solr

2015-03-22 Thread Majisha Parambath
Hello, As part of an assignment, we initially crawled and collected NSF and NASA Polar Datasets using Nutch. We used the nutch dump command to dump out the segments that were created as part of the crawl. Now we have to index this data into Solr. I am using java -jar post.jar filename to post to

SolrCloud on Hadoop (Hortonworks Data Platform)

2015-03-22 Thread Vijay Bhoomireddy
Hi, I am trying to set up a SolrCloud cluster on top of a Hadoop cluster using Hortonworks Data Platform. I understood how to configure Solr to enable it to store data in HDFS (process given below). However, I could not understand how to enable Solr to set up the cluster using Zookeeper already av

Re: schemaless slow indexing

2015-03-22 Thread Erick Erickson
I think you mean https://issues.apache.org/jira/browse/SOLR-7290? Erick On Sun, Mar 22, 2015 at 2:30 PM, Mike Murphy wrote: > That's it! > I hand edited the file that says you are not supposed to edit it and > removed that copyField. > Indexing performance is now back to expected levels. > > I c

Re: Error trying to index files to Solr

2015-03-22 Thread Shawn Heisey
On 3/22/2015 5:04 PM, Majisha Parambath wrote: > As part of an assignment, we initially crawled and collected NSF and > NASA Polar Datasets using Nutch. We used the nutch dump command to dump > out the segments that were created as part of the crawl. > Now we have to index this data into Solr. I a