RE: Index an entire Phrase and not it's constituent parts?

2010-03-14 Thread MitchK
Hmm, I don't understand the problem. Look: If your analyzer looks like: And your document looks like: "There is a big performance issue. Solving the problem would be great. As long as we try to give our best, ..." After the LowerCaseFilterFactory every

RE: Index an entire Phrase and not it's constituent parts?

2010-03-14 Thread MitchK
I'm sorry for double-posting: drinking a cup of coffee was a good idea. KeepWordFilter seems to mean that you give it a set of words; everything that is not in the set is deleted. Furthermore, the description is correct, since it really behaves like an inversion of StopWordFilter.
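
The behaviour described in this message can be sketched in a few lines of plain Python. This is only an illustration of the keep/stop semantics, not Solr's implementation; the function names `keep_word_filter` and `stop_word_filter` and the sample word set are hypothetical.

```python
def stop_word_filter(tokens, stop_words):
    """Drop every token that IS in stop_words (stop-word semantics)."""
    return [t for t in tokens if t not in stop_words]


def keep_word_filter(tokens, keep_words):
    """Drop every token that is NOT in keep_words (the inverse behaviour)."""
    return [t for t in tokens if t in keep_words]


# Hypothetical sample input: lowercased, whitespace-split tokens.
tokens = "there is a big performance issue".split()
words = {"big", "performance", "issue"}

kept = keep_word_filter(tokens, words)      # only tokens found in the set
stopped = stop_word_filter(tokens, words)   # only tokens missing from the set
```

Given the same word set, the two filters split the token stream into complementary halves, which is the sense in which one is an inversion of the other.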

Re: DIH field options

2010-03-14 Thread Dennis Gearon
I asked, but did not see a reply to, the following question (for a newbie like me): Question: What does DIH mean? Answer: Data Import Handler. Sent to the list to aid searches by other newbies in the future. Dennis Gearon

DIH datasource configuration

2010-03-14 Thread blargy
My current DIH is configured via the requestHandler block in solrconfig.xml data-config.xml ${datasource.driver} ${datasource.url} ${datasource.user} ${datasource.password} -1 true My question is, does the batchsize a

Re: Best performance for facet dates in trunk using solr.TrieDateField

2010-03-14 Thread Peter Sturge
Hi Yonik, I'm a bit confused now. In your recent Mastering Solr webinar (great stuff, btw, thank you!), the slides imply using tdate fields with a precisionStep of 8 for faster range queries: - Use tint, tfloat, tlong, tdouble, tdate for faster range queries - - Date faceting also

RegexTransformer

2010-03-14 Thread blargy
How would I go about splitting a column by a certain delimiter AND ignoring all empty matches? For example: I have some columns that don't have a value for values, so it's actually getting indexed as blank. I just want to totally ignore those values. Is this possible?

Re: Best performance for facet dates in trunk using solr.TrieDateField

2010-03-14 Thread Yonik Seeley
On Sun, Mar 14, 2010 at 3:39 PM, Peter Sturge wrote: > I'm a bit confused now. In your recent Mastering Solr webinar (great stuff, > btw, thank you!), the slides imply using tdate fields with a precisionStep > of 8 for faster range queries: > >   - Use tint, tfloat, tlong, tdouble, tdate for faste

some hyphenated words not found

2010-03-14 Thread george young
I have a nearly generic out-of-the-box installation of Solr. When I search on a short text document containing a few hyphenated words, I get hits on *some* of the words, but not all. I'm quite puzzled as to why. I've checked that the text is only plain ASCII. How can I find out what's wrong? In th

create core with separate solrconfig.xml

2010-03-14 Thread Mark Fletcher
Hi, I wanted to configure one core as master and one core as slave. This is my existing configuration: In my SOLR_HOME I have conf/schema.xml, conf/solrconfig.xml and the others from when no core was present. Also in my SOLR_HOME are solr.xml and coreA, created using the CREATE command for cores. I ha

Re: some hyphenated words not found

2010-03-14 Thread Lance Norskog
Look at the terms in the index with the analysis.jsp file, or with Luke. The difference here is that love-lorn is a separate phrase, but life-long has a comma after it. Try inserting a space before the comma. On 3/14/10, george young wrote: > I have a nearly generic out-of-box installation of sol

Re: Warning : no lockType configured for...

2010-03-14 Thread Lance Norskog
Doing an exhaustive scan of this problem, I did find this one hole: This constructor is not deprecated, but it uses a super() call that is deprecated. Also, this constructor is not used anywhere. I nominate it for deprecation as well. SolrIndexWriter.java, around line 170 /** * */ publi

Re: Multi valued fields

2010-03-14 Thread Lance Norskog
This could be done with a function query, except that the function I would use does not exist. There is no function that returns the number of values that exist for a field. If there were, you could say: -field:A OR (field:A and function() > 1) I don't know the Lucene data structures well, but I