Perf. difference when the solr core is 'current' or not 'current'

2013-07-01 Thread jchen2000
In Solr's admin statistics page, there is a 'current' flag indicating whether the core index reader is 'current' or not. According to some discussions on this mailing list a few months back, it shouldn't affect anything. But my observation is completely different. When the current flag was not check

Re: does solr support query time only stopwords?

2013-06-10 Thread jchen2000
on terms you that are > not stopwords. Try adding &debug=query and > seeing what the parsed query actually is. > > And, of course, I have no idea what Datastax is > doing. > > And, you have to at least reload the core > to pick up the new stopwords. > > Best >

Re: does solr support query time only stopwords?

2013-06-09 Thread jchen2000
Nope. I only searched with individual stop words. Very strange to me. Otis Gospodnetic-5 wrote > Maybe returned hits match other query terms. > > Otis > Solr & ElasticSearch Support > http://sematext.com/ > On Jun 8, 2013 6:34 PM, "jchen2000" < > jchen200@

does solr support query time only stopwords?

2013-06-08 Thread jchen2000
I wanted to analyze high-frequency terms using Solr's Luke request handler and keep updating the stopwords file for new queries from time to time. Obviously, I have to index all terms whether or not they belong to the stopwords list. So I configured a stopwords list for the query analyzer but disabled index anal
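
The setup described above (every term indexed, stopwords removed only at query time) can be illustrated with a toy model. This is a minimal sketch in plain Python, not Solr's analyzer chain; the stopword set, the function names, and the sample documents are all hypothetical.

```python
# Toy model of asymmetric analysis: no stop filter at index time,
# a stop filter at query time only. All names here are hypothetical.
STOPWORDS = {"the", "of", "and"}  # stand-in for a stopwords file


def index_analyze(text):
    """Index-time analysis: lowercase and split, keeping every term."""
    return text.lower().split()


def query_analyze(text, stopwords=STOPWORDS):
    """Query-time analysis: same tokenization, then drop stopwords."""
    return [t for t in text.lower().split() if t not in stopwords]


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in index_analyze(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query):
    """Return ids of documents matching any surviving query term."""
    hits = set()
    for term in query_analyze(query):
        hits |= index.get(term, set())
    return hits


docs = {1: "The history of search", 2: "Search and rescue"}
index = build_index(docs)
```

In this toy model the stopwords stay in the index, so their frequencies can still be inspected, while a query made up only of stopwords matches nothing.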

Re: customize solr search/scoring for performance

2012-11-12 Thread jchen2000
The following was generated from jvisualvm. It seems that much of the time is spent in scoring. Any ideas or pointers on how to customize that part?

Re: customize solr search/scoring for performance

2012-11-11 Thread jchen2000
Yes, we only need term overlap information to choose the top candidates (we may incorporate boost factors for different terms later, but that's another story). We are quite new to Solr, so we haven't really profiled the process. Is there any rough estimate of the expected latency in such cases? ou

customize solr search/scoring for performance

2012-11-09 Thread jchen2000
Hi, we have 20 million short docs (about 60 terms, less than 1 KB each) on each box, and we wanted to rank results based only on how many terms matched. In particular, we are only interested in the top N with the best scores (say, a small number like 5). With some help from the forum users
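
The ranking rule described above (score equals the number of matched terms, keep only the top N) can be sketched outside Solr as follows. This is a brute-force illustration under assumed data, not how Solr or Lucene scores documents; the function name and the sample documents are hypothetical.

```python
import heapq


def top_n_by_overlap(query_terms, docs, n=5):
    """Return up to n (doc_id, matched_count) pairs, best first.

    The score is simply the number of distinct query terms present in
    the document; documents with no matching term are skipped.
    """
    query = set(query_terms)
    scored = (
        (doc_id, len(query & set(terms))) for doc_id, terms in docs.items()
    )
    matched = (pair for pair in scored if pair[1] > 0)
    # nlargest keeps only n candidates in memory at a time.
    return heapq.nlargest(n, matched, key=lambda pair: pair[1])


# Hypothetical sample data.
docs = {
    "d1": ["a1", "a2", "b1"],
    "d2": ["a1", "c1"],
    "d3": ["x1", "x2"],
}
result = top_n_by_overlap(["a1", "a2", "b1", "z9"], docs, n=2)
```

A linear scan like this would not meet the latency goals for millions of documents; it only pins down the intended ordering so that a customized scorer can be checked against it.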

Re: need help on solr search

2012-11-05 Thread jchen2000
I used the mm parameter and it works! Right now I am preparing a perf test. Please share if anybody has a method to optimize dismax queries. Thanks! Jeremy Otis Gospodnetic-5 wrote > Hi, > > Have a look at your solrconfig.xml and look for your default operator. > Also > look at the docs for the mm parameter
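
For readers unfamiliar with mm (minimum should match), a request of the kind mentioned above might be built like this. It is a sketch only: the host, core name, field names, and the mm value of 1 are assumptions, since the message does not say which values were used. The `defType`, `q`, `qf`, `mm`, and `rows` request parameters are standard Solr parameters.

```python
from urllib.parse import urlencode


def dismax_url(base_url, query, fields, mm, rows=5):
    """Build a select URL for a dismax query with a minimum-should-match."""
    params = {
        "defType": "dismax",     # use the DisMax query parser
        "q": query,              # raw query terms
        "qf": " ".join(fields),  # fields to search
        "mm": mm,                # how many optional clauses must match
        "rows": rows,            # number of results to return
    }
    return base_url + "?" + urlencode(params)


# Hypothetical host, core, field names, and mm value.
url = dismax_url(
    "http://localhost:8983/solr/core1/select",
    "a1 a2 b1",
    ["field1", "field2"],
    "1",
)
```

With mm set to 1, a document needs to match only one of the query terms to be returned, and documents matching more terms generally score higher.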

Re: need help on solr search

2012-11-01 Thread jchen2000
Otis Gospodnetic-5 wrote > You want "ordered term matching" (like in a phrase), but you cannot use > AND > because you do not want all query terms to be required. Correct? That's exactly right! Actually, none of the query terms is required, but we need to base the similarity score on how many terms are

Re: need help on solr search

2012-11-01 Thread jchen2000
It seems that a phrase query is close, but not exactly what we need. Here is an example assuming just one field. The doc: a1 a2 a3 b1 b2 c1 c2 c3 c4 d1 d2. The query: a1 a2* a3 a4* b1 b2* c2 d1* d2. Both doc and query terms are ordered. We know that an a term should never match b or c terms. Obvio

Re: need help on solr search

2012-10-31 Thread jchen2000
Sure. Here are some more details: 1) We have 30M to 60M documents per node (right now we have 4 nodes, but that will increase in the future). Documents are relatively small (around 3K), but 99% of searches must be returned within 200 ms, and this is measured by test drivers sitting right in front

need help on solr search

2012-10-30 Thread jchen2000
Hi Solr experts, Our documents as well as queries consist of 10 properties in a particular order. Because of stringent requirements on search latency, we grouped them into only 2 fields with 5 properties each (we may use just 1 field; more than 3 fields seems too slow), and each property value is