Is your browser caching the older search results? The example solrconfig.xml ships with HTTP caching turned on, and if you comment that section out, Solr still defaults to caching on. So you have to configure Solr explicitly, in the XML, to stop caching.
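
As a rough sketch, turning HTTP caching off in the XML could look something like the fragment below. It assumes the `<httpCaching>` element inside `<requestDispatcher>` in solrconfig.xml and its `never304` attribute, as described in the comments of the stock example config; the `handleSelect="true"` attribute is simply carried over from that example, so keep whatever your existing element already has.

```xml
<!-- solrconfig.xml (sketch only; merge into your existing <requestDispatcher>) -->
<requestDispatcher handleSelect="true">
  <!-- never304="true": Solr emits no HTTP caching headers and never answers
       a conditional request with 304 Not Modified -->
  <httpCaching never304="true" />
</requestDispatcher>
```

After restarting Solr, repeating the query with a hard reload, or from a client that keeps no cache such as curl, should help rule out a stale response held by the browser.
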

On Fri, Oct 8, 2010 at 6:52 AM, Markus Jelsma <markus.jel...@openindex.io> wrote:
>
> On Friday, October 08, 2010 03:40:09 pm Allistair Crossley wrote:
>> Well, a lot of this is working but not all.
>>
>> Consider the company name Shooters Inc
>>
>> My ngram field is able to match queries to the name for shoot and hoot and
>> so on. This works.
>>
>> However consider the company name
>>
>> Location Scotland
>>
>> If I query scot I get one result back - but it's for a company called
>> Prescott Inc
>>
>> I looked at the analyzer and realised that the NGramTokenizer was
>> generating substrings from the start (left) of the *whole phrase*
>>
>> location scotland
>>
>> Because my max was set to 15 it was not generating a token for scot
>
> Huh? Your supplied config does generate scot as a token. The 15 is just the
> maximum size of a gram, it does not set a limit to how many new terms are
> generated.
>
> Are you querying the correct server? Did you reindex on the correct server? It
> should work.
>
>> So I figured I would change to a whitespace tokenizer first and then apply
>> the ngram as a filter.
>>
>> This now looks like it is generating scot in the tokens as shown below:
>>
>> Index Analyzer
>>
>> org.apache.solr.analysis.WhitespaceTokenizerFactory {}
>>
>> term position     1         2
>> term text         location  scotland
>> term type         word      word
>> source start,end  0,8       9,17
>> payload
>>
>> org.apache.solr.analysis.NGramFilterFactory {maxGramSize=15, minGramSize=4}
>>
>> term position     1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
>>                   16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
>> term text         loca ocat cati atio tion locat ocati catio ation
>>                   locati ocatio cation locatio ocation location
>>                   scot cotl otla tlan land scotl cotla otlan tland
>>                   scotla cotlan otland scotlan cotland scotland
>> term type         word (all 30 terms)
>> source start,end  0,4 1,5 2,6 3,7 4,8 0,5 1,6 2,7 3,8
>>                   0,6 1,7 2,8 0,7 1,8 0,8
>>                   9,13 10,14 11,15 12,16 13,17 9,14 10,15 11,16 12,17
>>                   9,15 10,16 11,17 9,16 10,17 9,17
>> payload
>>
>> Query Analyzer
>>
>> scot
>> scot
>>
>> BUT it still results no results for scot, but does continue to return the
>> Prescott result.
>>
>> So ngramming is working but it is not working when the query is something
>> far to the right of the indexed value.
>>
>> Is this another user-error or have I missed something else here?
>>
>> Cheers
>>
>> On Oct 8, 2010, at 9:02 AM, Allistair Crossley wrote:
>> > Oh my. I am basically being a total monkey. Every time I was changing my
>> > schema.xml to try new things out I was then reindexing our staging
>> > server's index instead of my local dev index so no changes were
>> > occurring locally.
>> >
>> > Dear me.
>> >
>> > This is working now, surprise.
>> >
>> > On Oct 8, 2010, at 8:53 AM, Markus Jelsma wrote:
>> >> How come your query analyser spits out grams? It isn't configured to do
>> >> so or you posted an older field definition. Anyway, do you actually
>> >> search on your new field?
>> >>
>> >> On Friday, October 08, 2010 02:46:08 pm Allistair Crossley wrote:
>> >>> Hi,
>> >>>
>> >>> Yep, I was just looking at the analyzer jsp. The ngrams *do* exist as
>> >>> expected, so it's not my configuration that is at fault (he says)
>> >>>
>> >>> Index Analyzer
>> >>>
>> >>> sh ho oo ot te er sho hoo oot ote ter shoo hoot oote oter shoot hoote ooter shoote hooter
>> >>> sh ho oo ot te er sho hoo oot ote ter shoo hoot oote oter shoot hoote ooter shoote hooter
>> >>> sh ho oo ot te er sho hoo oot ote ter shoo hoot oote oter shoot hoote ooter shoote hooter
>> >>> sh ho oo ot te er sho hoo oot ote ter shoo hoot oote oter shoot hoote ooter shoote hooter
>> >>>
>> >>> Query Analyzer
>> >>>
>> >>> sh ho oo ot te er sho hoo oot ote ter shoo hoot oote oter shoot hoote ooter shoote hooter
>> >>> sh ho oo ot te er sho hoo oot ote ter shoo hoot oote oter shoot hoote ooter shoote hooter
>> >>> sh ho oo ot te er sho hoo oot ote ter shoo hoot oote oter shoot hoote ooter shoote hooter
>> >>> sh ho oo ot te er sho hoo oot ote ter shoo hoot oote oter shoot hoote ooter shoote hooter
>> >>>
>> >>> Yet, searching either
>> >>>
>> >>> /solr/select?q=hoot
>> >>>
>> >>> or
>> >>>
>> >>> /solr/select?q=name:hoot
>> >>>
>> >>> does not yield results.
>> >>>
>> >>> When searching for shooter I see 2 results with names:
>> >>>
>> >>> 1. <str name="name">Shooters International Inc</str>
>> >>> 2. <str name="name">Hong Kong Shooter</str>
>> >>>
>> >>> Yours, puzzled :)
>> >>>
>> >>> On Oct 8, 2010, at 8:38 AM, Jan Høydahl / Cominvent wrote:
>> >>>> Hi,
>> >>>>
>> >>>> The first thing I would try is to go to the analysis page, enter your
>> >>>> test data, and report back what each analysis stage prints out:
>> >>>> http://localhost:8983/solr/admin/analysis.jsp
>> >>>>
>> >>>> --
>> >>>> Jan Høydahl, search solution architect
>> >>>> Cominvent AS - www.cominvent.com
>> >>>>
>> >>>> On 8. okt. 2010, at 14.19, Allistair Crossley wrote:
>> >>>>> Morning all,
>> >>>>>
>> >>>>> I would like to ngram a company name field in our index. I have read
>> >>>>> about the costs of doing so in the great David Smiley Solr 1.4 book
>> >>>>> and just to get started I have followed his example in setting up an
>> >>>>> ngram field type as follows:
>> >>>>>
>> >>>>> <fieldType name="text_substring" class="solr.TextField"
>> >>>>>            positionIncrementGap="100" stored="false"
>> >>>>>            multiValued="true">
>> >>>>>   <analyzer type="index">
>> >>>>>     <tokenizer class="solr.StandardTokenizerFactory" />
>> >>>>>     <filter class="solr.LowerCaseFilterFactory" />
>> >>>>>     <filter class="solr.NGramFilterFactory" minGramSize="4"
>> >>>>>             maxGramSize="15" />
>> >>>>>   </analyzer>
>> >>>>>   <analyzer type="query">
>> >>>>>     <tokenizer class="solr.StandardTokenizerFactory" />
>> >>>>>     <filter class="solr.LowerCaseFilterFactory" />
>> >>>>>   </analyzer>
>> >>>>> </fieldType>
>> >>>>>
>> >>>>> I have restarted/reindexed everything but I still cannot search
>> >>>>>
>> >>>>> hoot
>> >>>>>
>> >>>>> and get back the company named Shooter. searching shooter is fine.
>> >>>>>
>> >>>>> I have followed other examples on the internet regards an ngram field
>> >>>>> type. Some examples seem to use an index analyzer that has an ngram
>> >>>>> tokenizer rather than filter if this makes a difference. But in all
>> >>>>> cases I am not seeing the expected result, just 0 results.
>> >>>>>
>> >>>>> Is there anything else I should be considering here? I feel like I
>> >>>>> must be very close, it doesn't seem complicated but yet it's not
>> >>>>> working like everything else I have done with solr to date :)
>> >>>>>
>> >>>>> Any guidance appreciated,
>> >>>>>
>> >>>>> Allistair
>
> --
> Markus Jelsma - CTO - Openindex
> http://www.linkedin.com/in/markus17
> 050-8536600 / 06-50258350
>

--
Lance Norskog
goks...@gmail.com