Thanks! Any idea why

  Miguel : three dimensions : [Exhibitio

parses to:

  miguel, three, dimensions, exhibitio

BUT

  Miguel : three dimensions : [Exhibition]

parses to:

  miguel, three, dimensions, null_1, exhibition

seems quite strange...

--peter

On Mon, Aug 3, 2009 at 4:02 PM, Andrzej Bialecki <a...@getopt.org> wrote:

> Peter Keane wrote:
>
>> I've used Luke to figure out what is going on, and I see in the fields
>> that fail to match, a "null_1". Could someone tell me what that is? I see
>> some null_100s there as well, which seem to separate field values. Clearly
>> the null_1s are causing the search to fail.
>>
>
> You used the "Reconstruct" function to obtain the field values for unstored
> fields, right? null_NNN is Luke's way of telling you that the tokens that
> should be on these positions are absent, because they were removed by
> analyzer during indexing, and there is no stored value of this field from
> which you could recover the original text. In other words, they are holes in
> the token stream, of length NNN.
>
> Such holes may be also produced by artificially increasing the token
> positions, hence the null_100 that serves to separate multiple field values
> so that e.g. phrase queries don't match unrelated text.
>
> Phrase queries that you can construct using QueryParser can't match two
> tokens separated by a hole, unless you set a slop value > 0.
>
> --
> Best regards,
> Andrzej Bialecki <><
> ___. ___ ___ ___ _ _ __________________________________
> [__ || __|__/|__||\/| Information Retrieval, Semantic Web
> ___|||__|| \| || | Embedded Unix, System Integration
> http://www.sigram.com Contact: info at sigram dot com
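
To make the explanation quoted above concrete, here is a minimal Python sketch of the idea. It is not Lucene or Luke code: the stop-word list, the gap of 100 between field values, and the function names (index_positions, reconstruct, phrase_matches) are all assumptions chosen for illustration, and the slop check is a simplified model of how a sloppy phrase match behaves. It only shows how removed tokens leave holes in the position sequence, how a reconstruction could label them null_N, and why an exact phrase cannot span a hole.

```python
from itertools import product

STOP_WORDS = {"the", "of"}   # hypothetical stop-word list
VALUE_GAP = 100              # assumed position gap between field values


def index_positions(values, stop_words=STOP_WORDS, gap=VALUE_GAP):
    """Return (position, token) pairs; dropped tokens still consume a position."""
    postings = []
    pos = -1
    for i, value in enumerate(values):
        if i > 0:
            pos += gap           # artificial jump between field values
        for word in value.lower().split():
            pos += 1
            if word in stop_words:
                continue         # token removed, its position becomes a hole
            postings.append((pos, word))
    return postings


def reconstruct(postings):
    """Rebuild the token list, marking each hole of length N as null_N."""
    out = []
    expected = 0
    for pos, token in postings:
        hole = pos - expected
        if hole > 0:
            out.append(f"null_{hole}")
        out.append(token)
        expected = pos + 1
    return out


def phrase_matches(postings, terms, slop=0):
    """Simplified phrase check: terms must line up within `slop` positions."""
    positions = [[p for p, t in postings if t == term] for term in terms]
    if any(not plist for plist in positions):
        return False
    for combo in product(*positions):
        # A perfect phrase has identical offsets (position minus term index).
        offsets = [p - i for i, p in enumerate(combo)]
        if max(offsets) - min(offsets) <= slop:
            return True
    return False


postings = index_positions(["the wizard of oz", "emerald city"])
print(reconstruct(postings))
# ['null_1', 'wizard', 'null_1', 'oz', 'null_100', 'emerald', 'city']
print(phrase_matches(postings, ["wizard", "oz"]))          # False
print(phrase_matches(postings, ["wizard", "oz"], slop=1))  # True
```

In this toy model the phrase "wizard oz" fails with slop 0 because of the hole left by "of", and matches once slop is 1, while the gap of 100 keeps "oz emerald" from matching across the two field values even with a small slop.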