On Nov 18, 2007 2:25 PM, Tricia Williams <[EMAIL PROTECTED]> wrote:
> Thanks for your comments, Yonik!
> > All for it... depending on what one means by "payload functionality" of
> > course.
> > We should probably hold off on adding a new lucene version to Solr
> > until the Payload API has stabilized (it will most likely be changing
> > very soon).
> >
> It sounds like Lucene 2.3 is going to be released soonish
> (http://www.nabble.com/How%27s-2.3-doing--tf4802426.html#a13740605). As
> best I can tell it will include the Payload stuff marked experimental.
> The new Lucene version will have many improvements besides Payloads
> which would benefit Solr (examples galore in CHANGES.txt
> http://svn.apache.org/viewvc/lucene/java/trunk/CHANGES.txt?view=log).
> So I find it hard to believe that the new release will not be included.

Sorry for the misunderstanding... Solr will include Lucene 2.3 when
it comes out (or even before). When I mentioned holding off, my
assumption was that the payload API would be nailed down before 2.3
was released.
http://www.nabble.com/Payload-API-tf4828837.html#a13815548

> I agree that is a lot of data to associate with every token - especially
> since the data is repetitive in nature. Erik Hatcher suggested I store
> a representation of the structure of the document in a separate field,
> store a numeric representation of the mapping of the token to the
> structure as the payload for each token, and do a lookup at query time
> based on the numeric mapping in the payload at the position hit to get
> the structure/context back for the token.

That seems like it would work for highlighting-type scenarios (where
few stored fields would be loaded), but not during querying.

> I'm also wondering how others have accomplished this. Grant Ingersoll
> noted that one of the original use cases was XPath queries so I'm
> particularly interested in finding out if anyone has implemented that,
> and how.

Me too. Any clarifications on that, Grant?

> Maybe we will have to write new TokenFilters for each Tokenizer that
> uses Payloads (but I sure hope not!). Maybe we can build some optional
> configuration options into the TokenFilter constructor that guide their
> behavior with regard to Payloads.

Yes, that was my thought too.

> Maybe there is something stored in
> the TokenStream that dictates how the Payloads are handled by the
> TokenFilters.

Interesting idea.... it could be easily implemented as a flag in a bitfield:
http://www.nabble.com/new-Token-API-tf4828894.html#a13815702

-Yonik
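
For anyone skimming the thread later, here is a rough sketch of what a "flag in a bitfield" telling filters how to treat payloads might look like. It is plain Java and does not use the Lucene API. Every name in it (PayloadFlagSketch, Tok, KEEP_PAYLOAD, COPY_TO_SPLIT, splitOnHyphen) is made up for illustration, and the real Token/TokenFilter design may well end up different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch only; none of these names exist in Lucene. */
public class PayloadFlagSketch {

    // Assumed flag bits a stream could carry to guide filters.
    public static final int KEEP_PAYLOAD  = 1 << 0; // pass payload through on unchanged tokens
    public static final int COPY_TO_SPLIT = 1 << 1; // copy payload onto every piece of a split token

    /** Stand-in for a token: its text plus an opaque payload (may be null). */
    public static final class Tok {
        public final String text;
        public final byte[] payload;

        public Tok(String text, byte[] payload) {
            this.text = text;
            this.payload = payload;
        }
    }

    /** True if the given bit is set in the flag word. */
    public static boolean isSet(int flags, int bit) {
        return (flags & bit) != 0;
    }

    /** Toy filter: splits tokens on '-' and consults the flags for payload handling. */
    public static List<Tok> splitOnHyphen(List<Tok> in, int flags) {
        List<Tok> out = new ArrayList<>();
        for (Tok t : in) {
            String[] parts = t.text.split("-");
            if (parts.length == 1) {
                // Token is not split: keep or drop its payload per KEEP_PAYLOAD.
                out.add(new Tok(t.text, isSet(flags, KEEP_PAYLOAD) ? t.payload : null));
                continue;
            }
            for (String p : parts) {
                if (p.isEmpty()) {
                    continue; // skip empty pieces such as in "a--b"
                }
                // Token was split: each piece gets the payload only if COPY_TO_SPLIT is set.
                out.add(new Tok(p, isSet(flags, COPY_TO_SPLIT) ? t.payload : null));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Tok> in = Arrays.asList(
                new Tok("full-text", new byte[] {7}),
                new Tok("solr", new byte[] {9}));
        for (Tok t : splitOnHyphen(in, KEEP_PAYLOAD | COPY_TO_SPLIT)) {
            System.out.println(t.text + " payload=" + Arrays.toString(t.payload));
        }
    }
}
```

The point of the bitfield is that one existing filter can serve both payload-aware and payload-agnostic chains: the caller ORs together the behaviors it wants, so no separate filter class is needed per Tokenizer.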