Re: Is it OK to have very big number of fields in solr/lucene ?

2014-08-11 Thread Lisheng Zhang
Thanks for the help! The use case is that there is a file which different users assess differently. If it were only for filtering, I could create one field, like status_field, with a value like user1_important user2_important user3_unimportant ... then use a filter to get all files which are importan
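The single-field filtering idea above can be sketched roughly as follows. Only `status_field` and the `userN_important` token format come from the message; the helper names and the sample document are hypothetical, and the sketch assumes the field is tokenized on whitespace so that each token can be matched on its own.

```python
def build_status_value(statuses):
    """Join per-user flags into one whitespace-separated field value."""
    return " ".join(f"{user}_{flag}" for user, flag in statuses.items())


def important_filter(user, field="status_field"):
    """Build a filter query string selecting files that `user` marked important."""
    return f"{field}:{user}_important"


# Hypothetical document for one file with three users' flags.
doc = {
    "id": "file-1",
    "status_field": build_status_value(
        {"user1": "important", "user2": "important", "user3": "unimportant"}
    ),
}
```

This covers filtering only; it does not help with per-user sorting, which is the harder requirement raised later in the thread.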

Re: what os env you use to develop lucene or solr?

2014-08-11 Thread Toke Eskildsen
On Mon, 2014-08-11 at 03:49 +0200, rulinma wrote: > I want to know this: is Linux the best choice? The only "special" OS-thing about Lucene/Solr is that you should use a 64-bit OS for proper memory mapping. With that out of the way, the question becomes "What OS should one use for developing"

Re: what os env you use to develop lucene or solr?

2014-08-11 Thread Paul Libbrecht
I have used Mac OS X for development for more than 10 years. It's, by far, the user-friendliest Unix-based system. So copy and paste works "correctly" from the terminal to the IDE. Find in the terminal behaves nicely (really!). This is kilometers away from XWindows' terminals and megameters away fro

Re: How to grab matching stats in Similarity class

2014-08-11 Thread Varun Thacker
On Thu, Aug 7, 2014 at 6:10 AM, Hafiz Mian M Hamid < mianhami...@yahoo.com.invalid> wrote: > We're using solr 4.2.1 and use an extension of Lucene's DefaultSimilarity > as our similarity class. I am trying to figure out how we could get hold of > the matching stats (i.e. how many/which terms in th

Re: Is it OK to have very big number of fields in solr/lucene ?

2014-08-11 Thread Toke Eskildsen
On Mon, 2014-08-11 at 09:44 +0200, Lisheng Zhang wrote: [...] > But tough part is that we may need to sort files according to such flag, > for each user (I should have mentioned in last mail). My solution is to add > many fields to file document, like > > user1_status, user2_status, user3_status

Solr search \ special cases

2014-08-11 Thread Shay Sofer
Hi, I have some strange cases while searching with Solr. I have docs with names like: rule #22, rule +33, rule %44. When searching for #22 or %55 or +33, Solr returns, as expected: rule #22 and rule +33 and rule %44. But when appending a star (*) to each search (#22*, +33*, %55*), just the one with +

Re: Solr search \ special cases

2014-08-11 Thread Harshvardhan Ojha
Hi Shay, I believe + is treated as a space. Is it a REST call or the API? What is your field type? Regards Harshvardhan Ojha On Mon, Aug 11, 2014 at 4:04 PM, Shay Sofer wrote: > Hi, > > I have some strange cases while search with Solr. > > I have doc with names like: rule #22, rule +33, rule %44.
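The point about + being treated as a space can be illustrated with a minimal sketch, assuming the query is sent as a raw URL parameter. It uses only Python's standard `urllib.parse`; the helper name `encode_q` is hypothetical. A literal `+` in a query string decodes to a space, `#` starts a URL fragment, and `%` begins an escape sequence, so all three need percent-encoding before they reach Solr.

```python
from urllib.parse import unquote_plus, urlencode


def encode_q(term):
    """Percent-encode a search term for use as the q parameter of a URL."""
    return urlencode({"q": term})


# What a server sees when a raw "+" is sent unencoded: it decodes to a space.
decoded_raw_plus = unquote_plus("+33*")

# Encoded forms of the three search terms from the thread.
encoded = {term: encode_q(term) for term in ("#22*", "+33*", "%55*")}
```

Encoding only fixes transport. Whether the term then matches still depends on how the field was analyzed at index time and on how the query parser treats `+`.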

RE: Solr search \ special cases

2014-08-11 Thread Shay Sofer
I call it directly from the Solr web API. The field type is string. Shouldn't * bring more results? Isn't this a suffix search? Am I wrong? Thanks ! -Original Message- From: Harshvardhan Ojha [mailto:ojha.harshvard...@gmail.com] Sent: Monday, August 11, 2014 1:40 PM To: solr-user@lucene.apache.org Subj

Re: Solr search \ special cases

2014-08-11 Thread Jack Krupansky
The use of a wildcard suppresses analysis of the query term, so the special characters remain, but... they were removed when the terms were indexed, so no match. You must manually emulate the index term analysis in order to use wildcards. -- Jack Krupansky -Original Message- From: Sh
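A rough sketch of what "manually emulating the index term analysis" could look like on the client side. The analysis chain here (lowercase, then drop everything except ASCII letters and digits) is purely an assumption for illustration; the real chain depends on the field type in the schema, and the function names and the `name` field are hypothetical.

```python
import re


def emulate_index_analysis(term):
    """Apply an ASSUMED analysis chain: lowercase, keep only ASCII letters and digits."""
    return re.sub(r"[^0-9a-z]", "", term.lower())


def prefix_query(field, raw_term):
    """Build a wildcard (prefix) query from a term normalized like the indexed terms."""
    return f"{field}:{emulate_index_analysis(raw_term)}*"


# With the assumed chain, "#22" would have been indexed as "22",
# so the wildcard query has to be built from "22" rather than "#22".
example = prefix_query("name", "#22")
```

If the field really is an unanalyzed string type, nothing is stripped at index time and this normalization would be wrong; in that case the special characters need escaping instead.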

RE: SqlEntityProcessor

2014-08-11 Thread Dyer, James
I've heard of a user adding a separate section to the end of their data-config.xml with a SqlEntityProcessor and an UPDATE statement. It would run after your main section. I have not tried it myself, and surely DIH was not designed to do this, but it might work. A better solution might be t

Re: SolrCloud Scale Struggle

2014-08-11 Thread Shawn Heisey
On 8/10/2014 11:07 PM, anand.mahajan wrote: > Thank you for your suggestions. With the autoCommit (every 10 mins) and > softCommit (every 10 secs) frequencies reduced things work much better now. > The CPU usages has gone down considerably too (by about 60%) and the > read/write throughput is showi

When I use minimum match and maxCollationTries parameters together in edismax, Solr gets stuck

2014-08-11 Thread Harun Reşit Zafer
Hi, In the following configuration, when I uncomment both the mm and maxCollationTries lines and run a query on |/select|, Solr gets stuck with no exception. I tried different values for both parameters and found that values for mm less than 40% still work. | explicit

RE: When I use minimum match and maxCollationTries parameters together in edismax, Solr gets stuck

2014-08-11 Thread Dyer, James
Harun, Just to clarify, is this happening during startup when a warmup query is running, or is this once the server is fully started? This might be another instance of https://issues.apache.org/jira/browse/SOLR-5386 . James Dyer Ingram Content Group (615) 213-4311 -Original Message-

Performance comparison of uploading SolrInputDocument vs JSONRequestHandler

2014-08-11 Thread georgelavash
I have a large number of documents that I am trying to load into Solr. I am about to begin benchmarking this effort, but I thought I would ask here. I have the documents in JSONArrays already. I am most concerned with ingest rate on the server. So I don't mind performing extra work on the cl

Re: Performance comparison of uploading SolrInputDocument vs JSONRequestHandler

2014-08-11 Thread Erick Erickson
I think you're worrying about the wrong problem ;) Often, the difference between the JSON and SolrInputDocument decoding on the server is dwarfed by the time it takes the client to assemble the docs to send. Quick test: When you start indexing, how hard is the Solr server working (measure crudely

RES: SOLRJ Stop Streaming

2014-08-11 Thread Felipe Dantas de Souza Paiva
Hey Guys, any ideas? Thanks, Felipe From: Felipe Paiva Sent: Wednesday, August 6, 2014 16:40 To: solr-user@lucene.apache.org Subject: SOLRJ Stop Streaming Hi Guys, in version 4.0 of SOLRJ support for streaming responses was added: https://issues.apache.org/jira/browse/SOLR-2112 I

Need some help with solr not restarting

2014-08-11 Thread Mike Thomsen
I'm very new to SolrCloud. When I tried restarting our tomcat server running SolrCloud, I started getting this in our logs: SEVERE: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /configs/configuration1/default-collection/data/index/_3ts3_Lucene4

regexTransformer returns no results if there is no match

2014-08-11 Thread alxsss
Hello, I am trying to construct a Wikipedia page URL from the page title using regexTransformer with This does not work for titles that have no space, so title_underscore for them is empty. Any ideas what is wrong here? This is with solr-4.8.1 Thanks. Alex.
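The configuration from the message above did not survive in the archive, so the following is not a reconstruction of it. It is only a small sketch of the intended mapping, written in Python: replace whitespace in the title with underscores, and leave titles without spaces unchanged instead of returning nothing. The function names and the base URL are assumptions; `title_underscore` is the field name from the message.

```python
import re


def title_to_underscore(title):
    """Replace runs of whitespace with "_"; titles without spaces pass through unchanged."""
    return re.sub(r"\s+", "_", title.strip())


def wikipedia_url(title, base="https://en.wikipedia.org/wiki/"):
    """Build a page URL from a title (base URL is an assumption for illustration)."""
    return base + title_to_underscore(title)


title_underscore = title_to_underscore("Lucene")  # no space, so no regex match
```

The key property is the fallback: when the pattern does not match, the original value is kept. One way to get the same effect elsewhere is to use a pattern that matches every title, not only titles containing a space.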

SolrCloud OOM Problem

2014-08-11 Thread dancoleman
My SolrCloud of 3 shards / 3 replicas is having a lot of OOM errors. Here are some specs on my setup: hosts: all are EC2 m1.large with 250G data volumes documents: 120M total zookeeper: 5 external t1.micros startup command with memory and GC values

Re: SolrCloud OOM Problem

2014-08-11 Thread Shawn Heisey
On 8/11/2014 5:27 PM, dancoleman wrote: > My SolrCloud of 3 shard / 3 replicas is having a lot of OOM errors. Here are > some specs on my setup: > > hosts: all are EC2 m1.large with 250G data volumes > documents: 120M total > zookeeper: 5 external t1.micros > Linux "top" command output with no

Re: SolrCloud OOM Problem

2014-08-11 Thread dancoleman
90G is correct, each host is currently holding that much data. Are you saying that 32GB to 96GB would be needed for each host? Assuming we did not add more shards that is. -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-OOM-Problem-tp4152389p4152401.html Sent f

Re: SolrCloud OOM Problem

2014-08-11 Thread Shawn Heisey
> 90G is correct, each host is currently holding that much data. > > Are you saying that 32GB to 96GB would be needed for each host? Assuming > we did not add more shards that is. If you want good performance and enough memory to give Solr the heap it will need, yes. Lucene (the search API that

what's the difference between solr and elasticsearch in hdfs case?

2014-08-11 Thread Jianyi
Hi~ I'm new to both Solr and Elasticsearch. I have read that both support creating an index on HDFS. So, what's the difference between Solr and Elasticsearch in the HDFS case? -- View this message in context: http://lucene.472066.n3.nabble.com/what-s-the-difference-between-solr-and-elasticse

SpatialForTimeDurations question

2014-08-11 Thread 小川修
Hello. I am sorry for my bad English. I am using Solr 4.7.1. I want to run a date range query against a multiValued field. I found a solution: http://wiki.apache.org/solr/SpatialForTimeDurations The solution is almost perfect. But for some values, I get an error message #message 2014/08/10 23:28:12.558: ERR

Re: what's the difference between solr and elasticsearch in hdfs case?

2014-08-11 Thread Alexandre Rafalovitch
Are you comparing Solr vs. ElasticSearch? Or Cloudera vs. ElasticSearch? Because Cloudera is also commercial like ElasticSearch and has a full HDFS story. Regards, Alex. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrsta

Can I use multiple cores

2014-08-11 Thread Ramprasad Padmanabhan
I need to store in Solr all data of my clients' mailing activity. The data contains metadata like From;To:Date;Time:Subject etc. I would easily have 1000 million records every 2 months. What I am currently doing is creating cores per client. So I have 400 cores already. Is this a good idea to do

Re: When I use minimum match and maxCollationTries parameters together in edismax, Solr gets stuck

2014-08-11 Thread Harun Reşit Zafer
It happens once the server is fully started. And when it gets stuck sometimes I have to restart the server, sometimes I'm able to edit the solrconfig.xml and reload it. Harun Reşit Zafer TÜBİTAK BİLGEM BTE Bulut Bilişim ve Büyük Veri Analiz Sistemleri Bölümü T +90 262 675 3268 W http://www.hrza