shingles and dismax?

2011-10-29 Thread Vijay Ramachandran
Hello. While trying to understand why phrase match and boost was not working with shingles and the dismax parser, I saw this thread - http://lucene.472066.n3.nabble.com/Local-Params-syntax-not-protecting-Shingles-in-DisMax-from-Lucene-query-parser-td1563090.html It states "I really like the DisMax

Re: difference between analysis output and searches

2011-10-29 Thread Erik Hatcher
Robert - Can you give us a concrete input text, the field type definition, and the query(/ies) that you'd expect to match? The devil is in the details. Just because analysis.jsp _only_ means that an index and query time output token for the given text was equal. But in the "real world" of doi

difference between analysis output and searches

2011-10-29 Thread Robert Petersen
Why is it that I can see in the analysis admin page an obvious match between terms, yet sometimes they don't come back in searches? Debug output on the searches indicates a non-match, yet the analysis page shows an obvious match. I don't get it.

Re: large scale indexing issues / single threaded bottleneck

2011-10-29 Thread Nagendra Nagarajayya
Roman: 2) what would be the best way to port these (and only these) changes to 3.4.0? I tried to dig into the branching and revisions, but got lost quickly. Tried something like "svn diff […]realtime_search@r953476 […]realtime_search@r1097767", but I'm not sure if it's even possible to merge th

Re: large scale indexing issues / single threaded bottleneck

2011-10-29 Thread Yonik Seeley
On Sat, Oct 29, 2011 at 6:35 AM, Michael McCandless wrote: > I saw a mention somewhere that you can tell Solr not to use > IW.addDocument (not IW.updateDocument) when you add a document if you > are certain it's not replacing a previous document with the same ID Right - adding overwrite=false to

Re: Uncomplete date expressions

2011-10-29 Thread Erik Fäßler
Hello François, thank you for your quick reply. I thought about just storing which information I am lacking, and this would be a possibility of course. It just seemed a bit quick and dirty to me, and I wondered whether Solr really cannot understand dates which only consist of the year. Isn't it

Re: Uncomplete date expressions

2011-10-29 Thread François Schiettecatte
Erik, I would complement the date with default values as you suggest and store a boolean flag indicating whether the date was complete or not, or store the original date if it is not complete, which would probably be better because the presence of that data would tell you that the original date w
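
A minimal sketch of the suggestion above (pad a partial date with defaults and keep a flag recording whether the original was complete), written in Python. The function name `complete_date`, the accepted input shapes (YYYY, YYYY-MM, YYYY-MM-DD), and the choice of January 1st at midnight as defaults are assumptions for illustration; none of them come from the thread.

```python
import re


def complete_date(partial):
    """Pad a partial date to the "1995-12-31T23:59:59Z" style format.

    Accepts "YYYY", "YYYY-MM" or "YYYY-MM-DD" (an assumed input convention).
    Returns (padded_date, is_complete), where is_complete is True only when
    year, month and day were all present in the input.
    """
    match = re.fullmatch(r"(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?", partial)
    if match is None:
        raise ValueError("unsupported date expression: %r" % partial)

    year, month, day = match.groups()
    is_complete = month is not None and day is not None

    # Missing parts default to the first month / first day (an assumption).
    padded = "%s-%s-%sT00:00:00Z" % (year, month or "01", day or "01")
    return padded, is_complete


if __name__ == "__main__":
    for value in ("1995", "1995-12", "1995-12-31"):
        print(value, "->", complete_date(value))
```

The padded string and the flag could then be indexed as two separate fields, so that a query can tell a genuine January 1st apart from a defaulted one.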

Uncomplete date expressions

2011-10-29 Thread Erik Fäßler
Hi all, I want to index MEDLINE documents, which do not always contain complete dates of publication. The year is always known. Now the Solr documentation states that dates must have the format "1995-12-31T23:59:59Z", for which the month, day and even the time of day must be known. I could, of course, j

Re: URL Redirect

2011-10-29 Thread Erik Hatcher
I would personally implement this in the app tier, above Solr. One way to do it using Solr to match keywords to URLs is to index special "redirect" documents with the keywords in the search field (either in the main index, or in a separate core index). But there is nothing magically built i

Re: large scale indexing issues / single threaded bottleneck

2011-10-29 Thread Michael McCandless
On Fri, Oct 28, 2011 at 3:27 PM, Simon Willnauer wrote: > one more thing, after somebody (thanks robert) pointed me at the > stacktrace it seems kind of obvious what the root cause of your > problem is. It's Solr :) Solr closes the IndexWriter on commit which is > very wasteful since you basically