HTML Standard Strip filter word boundary bug

2008-08-07 Thread matt connolly
I found a bug in the HTML Standard Strip filter: it doesn't place word boundaries at HTML tags that should mark the end of a block. I've just discovered that if I index some text like this: titlesome text it is stripped and indexed as "titlesome" and "text". Putting a space or newline between th
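
The behaviour described above can be illustrated outside Solr with a small sketch. This is not Solr's filter code; it is a hypothetical Python approximation, and the sample markup, the `BLOCK_TAGS` set, and both function names are assumptions made for illustration (the original markup in the post did not survive the archive). It contrasts deleting tags outright with replacing block-level tags by a space.

```python
import re

# Hypothetical set of tags treated as block boundaries (an assumption,
# not the list used by any real Solr filter).
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6", "li", "br", "td", "tr"}

# Matches an opening or closing tag and captures the tag name.
TAG = re.compile(r"</?([a-zA-Z][a-zA-Z0-9]*)[^>]*>")


def strip_tags_naive(html: str) -> str:
    """Delete every tag, leaving adjacent text glued together."""
    return TAG.sub("", html)


def strip_tags_with_boundaries(html: str) -> str:
    """Replace block-level tags with a space and delete inline tags."""
    def replacement(match: re.Match) -> str:
        return " " if match.group(1).lower() in BLOCK_TAGS else ""

    # Collapse the runs of whitespace introduced by the substitutions.
    return " ".join(TAG.sub(replacement, html).split())


# Hypothetical input: a heading followed directly by body text.
sample = "<h1>title</h1>some <b>te</b>xt"

naive_tokens = strip_tags_naive(sample).split()
fixed_tokens = strip_tags_with_boundaries(sample).split()
```

With this input, the naive version produces the tokens "titlesome" and "text", while the boundary-aware version keeps "title" and "some" apart and still joins the text around the inline `<b>` tag.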

dismax and empty query

2008-07-29 Thread matt connolly
I'm having trouble setting up a dismax handler. I'm trying something really simple, like this: explicit 0.1 title^1.5 tags^1.0 body^0.5 *,score When I analyse a query, I get this (example) in the response: +DisjunctionMaxQuery((title:chair^1.5 | body:chair
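
As a rough sketch of how such a handler might be queried, the Python below builds a request URL with the dismax parameters passed explicitly. The mapping of the values in the post to the `tie`, `qf`, and `fl` parameters is an assumption, since the surrounding XML was lost in the archive; the base URL and the function name are hypothetical, and selecting the dismax handler itself is left out because it depends on the local configuration. `q.alt` is the dismax fallback query used when `q` is not supplied, which is one common way to handle an empty query.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint; adjust to the handler configured in solrconfig.xml.
BASE_URL = "http://localhost:8983/solr/select"


def dismax_url(user_query: str) -> str:
    """Build a query URL, falling back to q.alt when the user query is empty."""
    params = {
        "tie": "0.1",                          # assumed meaning of "0.1" in the post
        "qf": "title^1.5 tags^1.0 body^0.5",   # assumed meaning of the boosted fields
        "fl": "*,score",                       # assumed meaning of "*,score"
        "q.alt": "*:*",                        # fallback query when q is absent
    }
    if user_query.strip():
        params["q"] = user_query
    return BASE_URL + "?" + urlencode(params)


with_query = parse_qs(urlparse(dismax_url("chair")).query)
without_query = parse_qs(urlparse(dismax_url("   ")).query)
```

When the user query is blank, `q` is omitted entirely so that the handler can fall back to `q.alt` instead of parsing an empty string.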

Re: nested data structure definition

2008-07-28 Thread matt connolly
In my site, I have a document, which may have multiple comments. For each comment, I would like to know several pieces of information, like: text, author, and date.

-Matt

Shalin Shekhar Mangar wrote:
> Hi Ranjeet,
>
> Solr supports multi-valued fields and you can always denormalize your

Re: solr synonyms behaviour

2008-07-15 Thread matt connolly
You won't have the multiple word problem if you use synonyms at index time instead of query time.

swarag wrote:
> Here is a basic example of some synonyms in my synonyms.txt:
> club=>club,bar,night cabaret
> bar=>bar,club
>
> As you can see, a search for 'bar' will return any documents with

Re: Filter by Type increases search results.

2008-07-15 Thread matt connolly
Of course - it's so obvious now. Thanks!

Re: Filter by Type increases search results.

2008-07-15 Thread matt connolly
Yes, the same, except for the filter. For example:

http://localhost:8983/solr/select?q=fish returns: etc (followed by 2 docs)

http://localhost:8983/solr/select?q=fish+type:idea returns: . (followed by 9 docs)

-Matt

Preetam Rao wrote:
> Hi Matt,
>
> Other than applying o
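
One likely explanation for the two URLs above, assuming the default query operator is OR, is that `q=fish+type:idea` matches documents containing "fish" or having type "idea", which widens the result set instead of narrowing it. A minimal sketch of the usual alternative is to send the restriction as a separate `fq` (filter query) parameter. The helper name below is hypothetical; the base URL is the one quoted in the message.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8983/solr/select"


def search_url(text, doc_type=None):
    """Build a search URL, passing the type restriction as a filter query."""
    params = [("q", text)]
    if doc_type is not None:
        # fq restricts the result set without being combined into q.
        params.append(("fq", f"type:{doc_type}"))
    return BASE_URL + "?" + urlencode(params)


unfiltered = search_url("fish")
filtered = search_url("fish", "idea")
```

Marking both clauses as required in the main query (`+fish +type:idea`) is another option; `fq` keeps the restriction out of relevance scoring.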

Filter by Type increases search results.

2008-07-15 Thread matt connolly
I'm using Solr with a Drupal site, and one of the fields in the schema is "type". In my example development site, searching for the word "fish" returns 2 documents, one type='story', and the other type='idea'. If I filter by type:idea then I get 9 results, the correct first result, followed by 8

Re: solr synonyms behaviour

2008-07-15 Thread matt connolly
swarag wrote:
> Knowing that Lucene struggles with multi-word query-time synonyms, my
> question is, does this also affect index-time synonyms? What other
> alternatives do we have if we require there to be multiple word synonyms?

No, the multiple word problem doesn't happen with index synon
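
The difference can be sketched with a toy model. This is a simplified simulation, not Lucene or Solr code: it assumes the query parser splits the query on whitespace before the synonym step sees it, so a multi-word rule can never match at query time, whereas at index time the whole token stream is available. The rule and all names below are hypothetical.

```python
# Hypothetical rule: the two-token phrase "night cabaret" also indexes as "club".
RULES = {("night", "cabaret"): ["club"]}


def expand(tokens, rules):
    """Append synonyms after any token sequence that matches a rule key."""
    out = []
    i = 0
    while i < len(tokens):
        for key, synonyms in rules.items():
            if tuple(tokens[i:i + len(key)]) == key:
                out.extend(key)
                out.extend(synonyms)
                i += len(key)
                break
        else:
            # No rule matched at this position; keep the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


def index_time(text):
    """The synonym step sees the full token stream of the document."""
    return expand(text.lower().split(), RULES)


def query_time(text):
    """Each whitespace-separated word is analysed on its own (the assumption)."""
    out = []
    for word in text.lower().split():
        out.extend(expand([word], RULES))
    return out


indexed = index_time("A night cabaret")
queried = query_time("night cabaret")
```

In this model the indexed document gains the extra token "club", while the query is left as the two separate words because the rule never sees them together.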

Re: Synonyms list breaks solr

2008-07-11 Thread matt connolly
H... The Analyzer shows me *almost* what I am expecting to see. When I show it being verbose with debug info, I can see exactly what is going on, which is great. Thanks for the tip. What's happening (for most of my test cases) is that some of the synonyms are multiple words (and it's a big sy

Re: Synonyms list breaks solr

2008-07-11 Thread matt connolly
There are no errors in my log, just a list of GET, HEAD and POST entries; it looks just like an Apache access log. There are a few entries in the log file that have " " and "-" in them, but as far as I can see that isn't a problem. Is there a way to make Solr's logging a bit more verbose to help de

Re: Synonyms list breaks solr

2008-07-11 Thread matt connolly
I discovered that moving the synonym expansion to index time rather than query time works just fine with my synonym list. I'd still like to know why it doesn't work expanding at query time though :(

Synonyms list breaks solr

2008-07-11 Thread matt connolly
I'm setting up Solr to run on a web site I'm working on. Basically, if I use no synonym file, then Solr works really well for finding text; the Porter stemmer filter is great. It also works with a small synonym file, like the one in the example, which defines Television,TV. But when I add