On Nov 18, 2007 2:25 PM, Tricia Williams <[EMAIL PROTECTED]> wrote:
> Thanks for your comments, Yonik!
> > All for it... depending on what one means by "payload functionality" of 
> > course.
> > We should probably hold off on adding a new lucene version to Solr
> > until the Payload API has stabilized (it will most likely be changing
> > very soon).
> >
> >
> It sounds like Lucene 2.3 is going to be released soonish
> (http://www.nabble.com/How%27s-2.3-doing--tf4802426.html#a13740605).  As
> best I can tell it will include the Payload stuff marked experimental.
> The new Lucene version will have many improvements besides Payloads
> which would benefit Solr (examples galore in CHANGES.txt
> http://svn.apache.org/viewvc/lucene/java/trunk/CHANGES.txt?view=log).
> So I find it hard to believe that the new release will not be included.

Sorry for the misunderstanding... Solr will include Lucene 2.3 when
it comes out (or even before).  When I mentioned holding off, my
assumption was that the payload API would be nailed down before 2.3
was released.
http://www.nabble.com/Payload-API-tf4828837.html#a13815548

> I agree that is a lot of data to associate with every token - especially
> since the data is repetitive in nature.  Erik Hatcher suggested I store
> a representation of the structure of the document in a separate field,
> store a numeric representation of the mapping of the token to the
> structure as the payload for each token, and do a lookup at query time
> based on the numeric mapping in the payload at the position hit to get
> the structure/context back for the token.

That seems like it would work for highlighting-type scenarios (where
few stored fields would be loaded), but not during querying.
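
To make the scheme concrete, here is a rough sketch in plain Java of the
mapping described above: each distinct structural path gets a small
integer id, the id is what would be stored as the token's payload bytes,
and the path is looked up again from the id afterwards. Everything here
(the StructureTable class, its method names, the XPath-like strings) is
hypothetical and for illustration only; none of it is Lucene or Solr API,
and how the table itself would be persisted in a separate field is left
out.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: maps structural paths to small integer ids and back.
// Not part of Lucene or Solr.
class StructureTable {
    private final List<String> paths = new ArrayList<>();
    private final Map<String, Integer> ids = new HashMap<>();

    // Returns the id for a path, assigning the next free id on first use.
    int idFor(String path) {
        Integer id = ids.get(path);
        if (id == null) {
            id = paths.size();
            paths.add(path);
            ids.put(path, id);
        }
        return id;
    }

    // Recovers the structural path for an id read back from a payload.
    String pathFor(int id) {
        return paths.get(id);
    }

    // Encodes an id as the 4 bytes that would be stored as a token payload.
    static byte[] encode(int id) {
        return ByteBuffer.allocate(4).putInt(id).array();
    }

    // Decodes the 4 payload bytes back into the id.
    static int decode(byte[] payload) {
        return ByteBuffer.wrap(payload).getInt();
    }
}
```

The per-token cost is then a fixed 4 bytes instead of the repeated path
text, but every hit still needs the table loaded to be resolved, which is
why this looks cheaper for highlighting than for querying.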

> I'm also wondering how others have accomplished this.  Grant Ingersoll
> noted that one of the original use cases was XPath queries so I'm
> particularly interested in finding out if anyone has implemented that,
> and how.

Me too. Any clarifications on that, Grant?

> Maybe we will have to write new TokenFilters for each Tokenizer that
> uses Payloads (but I sure hope not!).  Maybe we can build some optional
> configuration options into the TokenFilter constructor that guide their
> behavior with regard to Payloads.

Yes, that was my thought too.

>  Maybe there is something stored in
> the TokenStream that dictates how the Payloads are handled by the
> TokenFilters.

Interesting idea... it could easily be implemented as a flag in a bitfield:
http://www.nabble.com/new-Token-API-tf4828894.html#a13815702
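
For anyone unfamiliar with the pattern, a minimal sketch of what "a flag
in a bitfield" could look like in plain Java follows. The flag names
(KEEP_PAYLOAD, MERGE_PAYLOAD) and the TokenFlags class are made up for
illustration and are assumptions, not anything in Lucene or in the
proposal linked above.

```java
// Hypothetical flag constants packed into a single int.
// Not part of any Lucene release.
final class TokenFlags {
    // Each flag occupies its own bit so flags can be combined with |.
    static final int KEEP_PAYLOAD = 1 << 0;
    static final int MERGE_PAYLOAD = 1 << 1;

    private TokenFlags() {}

    // True if the given flag bit is set in flags.
    static boolean isSet(int flags, int flag) {
        return (flags & flag) != 0;
    }

    // Returns flags with the given flag bit turned on.
    static int set(int flags, int flag) {
        return flags | flag;
    }

    // Returns flags with the given flag bit turned off.
    static int clear(int flags, int flag) {
        return flags & ~flag;
    }
}
```

A filter could then check something like
TokenFlags.isSet(flags, TokenFlags.KEEP_PAYLOAD) to decide whether to copy
a payload through, without needing a separate field per option.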

-Yonik
