Hi Erick, yes, absolutely, it's a great pleasure for me to contribute.

On Wed, Oct 23, 2019 at 2:25 PM Erick Erickson <erickerick...@gmail.com> wrote:
> Bookmarked. Do you intend that this should be incorporated into Solr? If
> so, please raise a JIRA and link your PR in….
>
> Thanks!
> Erick
>
> > On Oct 22, 2019, at 6:56 PM, Vincenzo D'Amore <v.dam...@gmail.com> wrote:
> >
> > Hi all,
> >
> > this evening I had a spare hour to put everything together in a
> > repository.
> >
> > https://github.com/freedev/solr-payload-string-function-query
> >
> > On Tue, Oct 22, 2019 at 5:54 PM Vincenzo D'Amore <v.dam...@gmail.com> wrote:
> >
> >> Hi all,
> >>
> >> thanks for the support. And many thanks to whoever implemented the
> >> integration of the GitHub Solr repository with the IntelliJ IDE.
> >> Configuring the environment and running the debugger took me less than
> >> one hour (and most of that time I spent waiting for the compilation).
> >> Solr and you guys really rock together.
> >>
> >> What I've done:
> >>
> >> I looked at how the original payload function is defined in the
> >> ValueSourceParser; this function uses a FloatPayloadValueSource to
> >> return the value found.
> >>
> >> As said, I wrote a new version of the payload function that handles
> >> strings. I named it spayload, and basically it is able to extract the
> >> string value from the payload.
> >>
> >> Given the former example where I have a multivalued field
> >> payloadCurrency
> >>
> >> payloadCurrency: [
> >>   "store1|USD",
> >>   "store2|EUR",
> >>   "store3|GBP"
> >> ]
> >>
> >> executing spayload(payloadCurrency,store2) returns "EUR", and so on for
> >> the remaining key/value pairs in the field.
> >>
> >> To implement the spayload function, I've added a new ValueSourceParser
> >> instance to the list of defined functions, which returns a
> >> StringPayloadValueSource with the value inside (it does the same thing
> >> as the former FloatPayloadValueSource).
> >>
> >> That's all. As said, always beware of your code when it works at first run.
> >> And really there was something wrong: initially I messed up the
> >> conversion of the payload into a String (bytes, offset, etc.).
> >> Now it is fixed, or at least it seems so to me.
> >> I see this function cannot be used in a sort; very likely the simple
> >> implementation of the StringPayloadValueSource is missing something.
> >>
> >> As far as I understand I'm only scratching the surface of this
> >> solution, and there are a few things I'm worried about. I have a bunch
> >> of questions, please be patient.
> >> This function returns an empty string "" when it does not match any
> >> key. Should it return an empty value instead? I'm not sure: what's the
> >> correct way to return an empty value?
> >> I wasn't able to find a unit test for the payload function in the tests.
> >> Could you give me a few suggestions on how to test the implementation
> >> properly?
> >> In case spayload is used on a different field type (e.g. spayload on a
> >> float payload), the behaviour is not handled. Can this function check
> >> the type of the payload content?
> >> And lastly, what do you think: can this simple fix be interesting for
> >> the Solr community? May I try to submit a pull request or add a feature
> >> to JIRA?
> >>
> >> Best regards,
> >> Vincenzo
> >>
> >>
> >> On Mon, Oct 21, 2019 at 9:12 PM Erik Hatcher <erik.hatc...@gmail.com> wrote:
> >>
> >>> Yes. The decoding of a payload based on its schema type is what the
> >>> payload() function does.
> >>> Your Payloader won't currently work well/legibly for fields encoded
> >>> numerically:
> >>>
> >>> https://github.com/o19s/payload-component/blob/master/src/main/java/com/o19s/payloads/Payloader.java#L130
> >>>
> >>> I think that code could probably be slightly enhanced to leverage
> >>> PayloadUtils.getPayloadDecoder(fieldType) and use bytes if the field
> >>> type doesn't have a better decoder.
> >>>
> >>> Erik
> >>>
> >>>
> >>>> On Oct 21, 2019, at 2:55 PM, Eric Pugh <ep...@opensourceconnections.com> wrote:
> >>>>
> >>>> Have you checked out
> >>>> https://github.com/o19s/payload-component
> >>>>
> >>>> On Mon, Oct 21, 2019 at 2:47 PM Erik Hatcher <erik.hatc...@gmail.com> wrote:
> >>>>
> >>>>> How about a single field, with terms like:
> >>>>>
> >>>>> store1_USD|125.0 store2_EUR|220.0 store3_GBP|225.0
> >>>>>
> >>>>> Would that do the trick?
> >>>>>
> >>>>> And yeah, payload decoding is currently limited to float and int with
> >>>>> the built-in payload() function. We'd need a new way to pull out
> >>>>> textual/bytes payloads - like maybe a DocTransformer?
> >>>>>
> >>>>> Erik
> >>>>>
> >>>>>
> >>>>>> On Oct 21, 2019, at 9:59 AM, Vincenzo D'Amore <v.dam...@gmail.com> wrote:
> >>>>>>
> >>>>>> Hi Erick,
> >>>>>>
> >>>>>> thanks for getting back to me. We started to use payloads because we
> >>>>>> have the classic per-store pricing problem: thousands of stores,
> >>>>>> each with different prices.
> >>>>>> Then we found payloads very useful and started to use them for many
> >>>>>> purposes, like enabling/disabling the product for a given store,
> >>>>>> saving the stock availability, or saving other info like buy/sell
> >>>>>> price, discount rates, and so on.
> >>>>>> All of this information is numeric, but stores can also be in
> >>>>>> different countries, so it would be useful to also have the currency
> >>>>>> and other attributes related to the store.
> >>>>>>
> >>>>>> Thinking about an alternative to payloads, maybe I could use
> >>>>>> dynamic fields; well, I know it is ugly.
> >>>>>>
> >>>>>> Consider this hypothetical case where I have two payload fields:
> >>>>>>
> >>>>>> payloadPrice: [
> >>>>>>   "store1|125.0",
> >>>>>>   "store2|220.0",
> >>>>>>   "store3|225.0"
> >>>>>> ]
> >>>>>>
> >>>>>> payloadCurrency: [
> >>>>>>   "store1|USD",
> >>>>>>   "store2|EUR",
> >>>>>>   "store3|GBP"
> >>>>>> ]
> >>>>>>
> >>>>>> With dynamic fields I could have different fields for each document.
> >>>>>>
> >>>>>> currency_store1_s: "USD"
> >>>>>> currency_store2_s: "EUR"
> >>>>>> currency_store3_s: "GBP"
> >>>>>>
> >>>>>> But how many dynamic fields like this can I have? More than
> >>>>>> thousands?
> >>>>>>
> >>>>>> Again, I've just started to look at the solr-ocrhighlighting GitHub
> >>>>>> project you suggested.
> >>>>>> They seem to have written their own payload object type in which
> >>>>>> they store OCR highlighting information.
> >>>>>> It seems interesting, I'll take a look immediately.
> >>>>>>
> >>>>>> Thanks again for your time.
> >>>>>>
> >>>>>> Best regards,
> >>>>>> Vincenzo
> >>>>>>
> >>>>>>
> >>>>>> On Mon, Oct 21, 2019 at 2:55 PM Erick Erickson <erickerick...@gmail.com> wrote:
> >>>>>>
> >>>>>>> This is one of those situations where I know a client did it, but
> >>>>>>> didn’t see the code myself.
> >>>>>>>
> >>>>>>> So I can’t help much.
> >>>>>>>
> >>>>>>> Perhaps a good question at this point, though, is “why do you want
> >>>>>>> to add string payloads anyway”?
> >>>>>>>
> >>>>>>> This isn’t the client, but it might give you some pointers:
> >>>>>>>
> >>>>>>> https://github.com/dbmdz/solr-ocrpayload-plugin/blob/master/src/main/java/de/digitalcollections/solr/plugin/components/ocrhighlighting/OcrHighlighting.java
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Erick
> >>>>>>>
> >>>>>>>> On Oct 21, 2019, at 6:37 AM, Vincenzo D'Amore <v.dam...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> Hi Erick,
> >>>>>>>>
> >>>>>>>> It seems I've reached a dead end, or at least, looking at the
> >>>>>>>> code, it seems I can't easily add a custom decoder.
> >>>>>>>>
> >>>>>>>> In the PayloadUtils class, the getPayloadDecoder method is
> >>>>>>>> invoked to return the PayloadDecoder:
> >>>>>>>>
> >>>>>>>> public static PayloadDecoder getPayloadDecoder(FieldType fieldType) {
> >>>>>>>>   PayloadDecoder decoder = null;
> >>>>>>>>
> >>>>>>>>   String encoder = getPayloadEncoder(fieldType);
> >>>>>>>>
> >>>>>>>>   if ("integer".equals(encoder)) {
> >>>>>>>>     decoder = (BytesRef payload) -> payload == null ? 1 :
> >>>>>>>>         PayloadHelper.decodeInt(payload.bytes, payload.offset);
> >>>>>>>>   }
> >>>>>>>>   if ("float".equals(encoder)) {
> >>>>>>>>     decoder = (BytesRef payload) -> payload == null ? 1 :
> >>>>>>>>         PayloadHelper.decodeFloat(payload.bytes, payload.offset);
> >>>>>>>>   }
> >>>>>>>>   // encoder could be "identity" at this point, in the case of
> >>>>>>>>   // DelimitedTokenFilterFactory encoder="identity"
> >>>>>>>>
> >>>>>>>>   // TODO: support pluggable payload decoders?
> >>>>>>>>
> >>>>>>>>   return decoder;
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> Any advice on how to work around this situation?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Mon, Oct 21, 2019 at 1:51 AM Erick Erickson <erickerick...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> You’d need to write one.
> >>>>>>>>> Payloads are generally intended to hold numerics you can then
> >>>>>>>>> use in a function query to factor into the score…
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Erick
> >>>>>>>>>
> >>>>>>>>>> On Oct 20, 2019, at 4:57 PM, Vincenzo D'Amore <v.dam...@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Sorry, I just realized that I was wrong in how I'm using the
> >>>>>>>>>> payload function.
> >>>>>>>>>> Given that the payload function only handles a numeric (integer
> >>>>>>>>>> or float) payload, could you suggest an alternative function
> >>>>>>>>>> that handles strings?
> >>>>>>>>>> If not, should I write one?
> >>>>>>>>>>
> >>>>>>>>>> On Sun, Oct 20, 2019 at 10:43 PM Vincenzo D'Amore <v.dam...@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi all,
> >>>>>>>>>>>
> >>>>>>>>>>> I'm trying to understand what I did wrong with a payload query
> >>>>>>>>>>> that returns
> >>>>>>>>>>>
> >>>>>>>>>>> error: {
> >>>>>>>>>>>   metadata: [ "error-class", "org.apache.solr.common.SolrException",
> >>>>>>>>>>>     "root-error-class", "org.apache.solr.common.SolrException" ],
> >>>>>>>>>>>   msg: "No payload decoder found for field: colorCode",
> >>>>>>>>>>>   code: 400
> >>>>>>>>>>> }
> >>>>>>>>>>>
> >>>>>>>>>>> I have reduced my problem to a little sample to show what happens to me.
> >>>>>>>>>>> Basically I have a document with a couple of payload fields,
> >>>>>>>>>>> one delimited_payloads_string and one
> >>>>>>>>>>> delimited_payloads_integer
> >>>>>>>>>>>
> >>>>>>>>>>> {
> >>>>>>>>>>>   field_dps: "key|data",
> >>>>>>>>>>>   field_dpi: "key|1",
> >>>>>>>>>>> }
> >>>>>>>>>>>
> >>>>>>>>>>> When I execute this query, Solr returns the payload for the
> >>>>>>>>>>> key, as expected
> >>>>>>>>>>>
> >>>>>>>>>>> q=*:*&fl=payload(field_dpi,key)
> >>>>>>>>>>>
> >>>>>>>>>>> {
> >>>>>>>>>>>   payload(field_dpi,key): 1
> >>>>>>>>>>> }
> >>>>>>>>>>>
> >>>>>>>>>>> But for strings there has to be something different to do,
> >>>>>>>>>>> because I'm unable to get the payload value back. Executing
> >>>>>>>>>>> this query, as in the short introduction of this post, I
> >>>>>>>>>>> receive an error.
> >>>>>>>>>>>
> >>>>>>>>>>> ?q=*:*&fl=payload(field_dps,key)
> >>>>>>>>>>>
> >>>>>>>>>>> error: {
> >>>>>>>>>>>   metadata: [ "error-class", "org.apache.solr.common.SolrException",
> >>>>>>>>>>>     "root-error-class", "org.apache.solr.common.SolrException" ],
> >>>>>>>>>>>   msg: "No payload decoder found for field: colorCode",
> >>>>>>>>>>>   code: 400
> >>>>>>>>>>> }
> >>>>>>>>>>>
> >>>>>>>>>>> Am I doing something wrong? How can I read string payload data?
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks in advance for your time,
> >>>>>>>>>>> Vincenzo
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> Vincenzo D'Amore
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Vincenzo D'Amore
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Vincenzo D'Amore
> >>>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Vincenzo D'Amore
> >>>>>
> >>>
> >>
> >> --
> >> Vincenzo D'Amore
> >>
> >
> > --
> > Vincenzo D'Amore
>

--
Vincenzo D'Amore
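
P.S. On the "bytes, offset, etc." conversion mentioned earlier in the thread: the sketch below is only an illustration, not the code from the linked repository. It assumes the payload arrives as a byte array plus an offset and a length (the three pieces of information a Lucene BytesRef carries) and that the text was stored as UTF-8. The class name StringPayloadSketch and the method decode are hypothetical, and returning null for a missing payload is just one possible choice, not an answer to the open question about empty values.

```java
import java.nio.charset.StandardCharsets;

public class StringPayloadSketch {

    /**
     * Decodes only the slice [offset, offset + length) of the buffer.
     * Hypothetical helper: assumes UTF-8 text; returns null when there is no payload.
     */
    public static String decode(byte[] bytes, int offset, int length) {
        if (bytes == null) {
            return null;
        }
        // Passing offset and length matters: the backing array may be shared
        // and can hold other data before and after the payload itself.
        return new String(bytes, offset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Simulated shared buffer: the payload "EUR" sits at offset 6, length 3.
        byte[] buffer = "store2EURstore3".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(buffer, 6, 3)); // prints EUR
    }
}
```

Decoding the whole array instead of the slice would return "store2EURstore3" here, which is the kind of mistake that is easy to make when the offset and length are ignored.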