Re: Problems with WordDelimiterFilterFactory

2009-10-08 Thread Christian Zambrano
] Error on searching: "400" Status: org.apache.lucene.queryParser.ParseException: Cannot parse ' (Asia -- Civilization AND status_i:(2)) ': Encount Bern -Original Message- From: Christian Zambrano [mailto:czamb...@gmail.com] Sent: Thursday, 8 October 2009 12:48 PM To

Re: Problems with WordDelimiterFilterFactory

2009-10-07 Thread Christian Zambrano
of the "television broadcasting -- asia" links, or type it in the Quick Search box. TIA bern -Original Message- From: Christian Zambrano [mailto:czamb...@gmail.com] Sent: Thursday, 8 October 2009 9:43 AM To: solr-user@lucene.apache.org Subject: Re:

Re: Problems with WordDelimiterFilterFactory

2009-10-07 Thread Christian Zambrano
Could you please provide the exact URL of a query where you are experiencing this problem? eg(Not URL encoded): q=fieldName:"hot and cold: temperatures" On 10/07/2009 05:32 PM, Bernadette Houghton wrote: We are having some issues with our solr parent application not retrieving records as expec

Re: Facet query pb

2009-10-07 Thread Christian Zambrano
Clico, Because you are doing a wildcard query, the token 'AMERICA' will not be analyzed at all. This means that 'AMERICA*' will NOT match 'america'. On 10/07/2009 12:30 PM, Avlesh Singh wrote: I have no idea what "pb" mean but this is what you probably want - fq=(location_field:(NORTH AMERICA

Re: Weird Facet and KeywordTokenizerFactory Issue

2009-10-06 Thread Christian Zambrano
http://localhost:8080/solr-admin/topicscore/select/?facet=true&facet.limit=-1&*facet.field=location* On 10/06/2009 04:09 PM, Ravi Kiran wrote: Yes Exactly the same On Tue, Oct 6, 2009 at 4:52 PM, Christian Zambrano wrote: And you had the analyzer for

Re: Weird Facet and KeywordTokenizerFactory Issue

2009-10-06 Thread Christian Zambrano
sition 1 term text New York term type word source start,end 0,8 payload On Tue, Oct 6, 2009 at 4:19 PM, Christian Zambrano wrote: Have you tried using the Analysis page to see what tokens are generated for the string "New York"? It could be one of the toke

Re: Weird Facet and KeywordTokenizerFactory Issue

2009-10-06 Thread Christian Zambrano
And you had the analyzer for that field set up the same way as shown in your previous e-mail when you indexed the data? On 10/06/2009 03:46 PM, Ravi Kiran wrote: I did in fact check it out and there is no weirdness in the analysis page... see result below Index Analyzer org.apache.solr.analysis.Ke

Re: Weird Facet and KeywordTokenizerFactory Issue

2009-10-06 Thread Christian Zambrano
Have you tried using the Analysis page to see what tokens are generated for the string "New York"? It could be that one of the token filters is adding the token 'new' for all strings that start with 'new'. On 10/06/2009 02:54 PM, Ravi Kiran wrote: Hello All, I am getting some ghost face

Re: Question about PatternReplace filter and automatic Synonym generation

2009-10-05 Thread Christian Zambrano
Prasanna, Wouldn't it be better to use built-in token filters at both index and query time that will convert 'it!' to just 'it'? I believe the WordDelimiterFilterFactory will do that for you. Christian On Oct 5, 2009, at 7:31 PM, Prasanna Ranganathan wrote: On 10/5/09 2:46 AM, "Shalin S

Re: Need "OR" in DisMax Query

2009-10-05 Thread Christian Zambrano
David, If your schema includes fields with analyzers that use the StopFilterFactory and the dismax QueryHandler is set-up to search within those fields, then you are correct. On 10/05/2009 01:36 PM, David Giffin wrote: Hi There, Maybe I'm missing something, but I can't seem to get the dism

Re: wildcard searches

2009-10-05 Thread Christian Zambrano
'not analyzed' On 10/05/2009 12:27 PM, Avlesh Singh wrote: No filters are applied to wildcard/fuzzy searches. Ah! Not like that .. I guess, it is just that the phrase searches using wildcards are not analyzed. Cheers Avlesh On Mon, Oct 5, 2009 at 10:42 P

Re: A little help with indexing joined words

2009-10-05 Thread Christian Zambrano
Would you mind explaining how omitNorms has any effect on the IDF problem I described earlier? I agree with your second sentence. I had to use the NGramTokenFilter to accommodate partial matches. On 10/05/2009 12:11 PM, Avlesh Singh wrote: Using synonyms might be a better solution because the

Re: wildcard searches

2009-10-05 Thread Christian Zambrano
Avlesh, I don't understand your answer. First of all, I know of no way of doing wildcard phrase queries. When I said no filters, I meant TokenFilters, which is what I believe you mean by 'not analyzed'. On 10/05/2009 12:27 PM, Avlesh Singh wrote: No filters are applied to wildcard/fuzzy searc

Re: wildcard searches

2009-10-05 Thread Christian Zambrano
No filters are applied to wildcard/fuzzy searches. I couldn't find a reference to this in either the Solr or Lucene documentation, but I read it in the Solr book from Packt. On 10/05/2009 12:09 PM, Angel Ice wrote: Hi everyone, I have a little question regarding the search engine when a wildca

Re: Question regarding synonym

2009-10-05 Thread Christian Zambrano
lease correct me if my assumption is wrong. Thanks darniz Christian Zambrano wrote: On 10/02/2009 06:02 PM, darniz wrote: Thanks As i said it even works by giving double quotes too. like carDescription:"austin martin" So is that the conclusion that in order to map tw

Re: A little help with indexing joined words

2009-10-05 Thread Christian Zambrano
Using synonyms might be a better solution because the use of EdgeNGramTokenizerFactory has the potential of creating a large number of tokens, which will artificially increase the number of tokens in the index, which in turn will affect the IDF score. A query for "borderland" should have returned
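
To give a rough sense of the token growth mentioned above, here is a minimal Python sketch of edge n-gram generation. It is not the Solr implementation; the function name `edge_ngrams` and its `min_gram`/`max_gram` parameters are hypothetical and chosen only for illustration.

```python
def edge_ngrams(term: str, min_gram: int = 1, max_gram: int = 10) -> list:
    """Return the leading substrings of `term` (hypothetical helper, not Solr code)."""
    upper = min(max_gram, len(term))
    return [term[:n] for n in range(min_gram, upper + 1)]


# One indexed word turns into one token per prefix length.
grams = edge_ngrams("borderland")
print(len(grams), grams)
```

Under these assumed settings a single ten-letter word yields ten tokens, which is the kind of inflation that can skew term statistics across a whole index.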

Re: Always spellcheck (suggest)

2009-10-05 Thread Christian Zambrano
Shalin, Thanks for the clarification. That explains a lot. I should have looked at the lucene documentation. On 10/05/2009 05:28 AM, Shalin Shekhar Mangar wrote: On Mon, Oct 5, 2009 at 10:24 AM, Christian Zambrano wrote: I am really surprised that a query for "behaviour" returns "behav

Re: Always spellcheck (suggest)

2009-10-04 Thread Christian Zambrano
ar 'correct' terms. Eg. 'behaviour' suggests 'behavior' because it has four times as many hits, but they are both 'correct' and the suggestion does not occur without the 'onlyMorePopular' flag set. 'behavior' will not suggest 'behaviour

Re: Always spellcheck (suggest)

2009-10-04 Thread Christian Zambrano
e search terms are present in the dictionary (ie. correct)? Is there any way to force behaviour (1) without behaviour (2) (filtering on frequency). Ta, Greg -----Original Message- From: Christian Zambrano [mailto:czamb...@gmail.com] Sent: Monday, 5 October 2009 11:59 AM To: solr-user@lucene.

Re: Question regarding synonym

2009-10-04 Thread Christian Zambrano
occupy the same position: there is no way to indicate that a "phrase" occupies the same position as a term. For our example the resulting MultiPhraseQuery would be "(sea | sea | seabiscuit) (biscuit | biscit)" which would not match the simple case of "seabisui

Re: Always spellcheck (suggest)

2009-10-04 Thread Christian Zambrano
I believe your understanding is incorrect. The first behavior you described is produced by adding the parameter "spellcheck=true". Suggestions will be returned regardless of whether there are results. The only time I believe spelling suggestions might not be included is when all of the words ar

Re: Question regarding synonym

2009-10-02 Thread Christian Zambrano
When you use a field qualifier (fieldName:valueToLookFor), it only applies to the word right after the colon. If you look at the debug information you will notice that for the second word it is using the default field. carDescription:austin *text*:martin the following should work: carDescri
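
The scoping rule described above can be illustrated with a toy Python function. This is only a sketch, not the Lucene query parser: `assign_fields` is a hypothetical name, it deliberately ignores quotes, parentheses and operators, and the default field name `text` is taken from the debug output quoted in the message.

```python
def assign_fields(query: str, default_field: str = "text") -> list:
    """Toy model: a field qualifier binds only to the word it is attached to."""
    clauses = []
    for word in query.split():
        field, sep, value = word.partition(":")
        if sep:
            # The word carries its own qualifier, e.g. carDescription:austin
            clauses.append((field, value))
        else:
            # An unqualified word falls back to the default field
            clauses.append((default_field, word))
    return clauses


parsed = assign_fields("carDescription:austin martin")
print(parsed)
```

In this toy model the second word lands in the default field, which mirrors the `carDescription:austin text:martin` form reported by the debug information.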

Re: Problem with Wildcard...

2009-10-02 Thread Christian Zambrano
Another thing to remember about wildcard and fuzzy searches is that none of the token filters will be applied. If you are using the LowerCaseFilterFactory at index time, then "RI-MC50034-1" gets converted to "ri-mc50034-1" which is never going to match "RI-MC5000*" Also, I would probably use
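
A small Python simulation may make the case mismatch concrete. It assumes an index-time analyzer that only lowercases, uses `fnmatch.fnmatchcase` as a stand-in for wildcard matching, and uses the pattern `RI-MC5003*`, which is my own choice so that letter case is the only difference; none of this is Solr code.

```python
from fnmatch import fnmatchcase


def index_token(value: str) -> str:
    """Stand-in for an index-time analyzer that only lowercases (assumption)."""
    return value.lower()


def wildcard_matches(pattern: str, indexed_token: str) -> bool:
    """Compare the raw, unanalyzed pattern against the indexed token, case-sensitively."""
    return fnmatchcase(indexed_token, pattern)


indexed = index_token("RI-MC50034-1")
raw_hit = wildcard_matches("RI-MC5003*", indexed)              # pattern sent as typed
lowered_hit = wildcard_matches("RI-MC5003*".lower(), indexed)  # pattern lowercased by the client
print(indexed, raw_hit, lowered_hit)
```

Under these assumptions the uppercase pattern misses the lowercased token, while the same pattern lowercased on the client side finds it.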

Re: What Tokenizerfactory/TokenFilterFactory can/should I use so a search for "wal mart" matches "walmart"(quotes not included in search or index)?

2009-09-11 Thread Christian Zambrano
Ahmet, Thanks a lot. Your suggestion was really helpful. I tried using synonyms before but for some reason it didn't work but this time around it worked. On 09/11/2009 02:55 AM, AHMET ARSLAN wrote: There are a lot of company names that people are uncertain as to the correct spelling. A few of

What Tokenizerfactory/TokenFilterFactory can/should I use so a search for "wal mart" matches "walmart"(quotes not included in search or index)?

2009-09-10 Thread Christian Zambrano
There are a lot of company names that people are uncertain as to the correct spelling. A few examples are: 1. best buy, bestbuy 2. walmart, wal mart, wal-mart 3. Holiday Inn, HolidayInn What Tokenizer Factory and/or TokenFilterFactory should I use so that somebody typing "wal mart"(quotes no