Hi all,

this evening I had a spare hour, so I put everything together in a repository (link below).
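To give an idea of what spayload does without opening the repository, here is a minimal, self-contained sketch. It is not the code from the repository and it does not use the Solr or Lucene classes: Slice is a hypothetical stand-in for Lucene's BytesRef (which exposes bytes, offset and length), and index, decode and spayload are illustrative names only. It mimics the payloadCurrency example from the quoted message below and shows the step that is easy to get wrong, namely decoding only the bytes between offset and offset + length of a shared array. Returning null for a missing key is an assumption; the quoted message leaves that question open.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class StringPayloadSketch {

    /** Hypothetical stand-in for Lucene's BytesRef: a slice of a (possibly shared) byte array. */
    static final class Slice {
        final byte[] bytes;
        final int offset;
        final int length;

        Slice(byte[] bytes, int offset, int length) {
            this.bytes = bytes;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Decodes only bytes[offset, offset + length); ignoring offset or length is the easy mistake. */
    static String decode(Slice payload) {
        if (payload == null) {
            return null; // assumption: a missing key yields null rather than ""
        }
        return new String(payload.bytes, payload.offset, payload.length, StandardCharsets.UTF_8);
    }

    /** Parses "key|value" terms and stores every value as a slice of one shared buffer. */
    static Map<String, Slice> index(String... terms) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        Map<String, int[]> spans = new LinkedHashMap<>();
        for (String term : terms) {
            int sep = term.indexOf('|');
            if (sep < 0) {
                continue; // term without a payload
            }
            byte[] value = term.substring(sep + 1).getBytes(StandardCharsets.UTF_8);
            // Remember where this value starts in the shared buffer and how long it is.
            spans.put(term.substring(0, sep), new int[] {buffer.size(), value.length});
            buffer.write(value, 0, value.length);
        }
        byte[] shared = buffer.toByteArray();
        Map<String, Slice> field = new LinkedHashMap<>();
        for (Map.Entry<String, int[]> e : spans.entrySet()) {
            field.put(e.getKey(), new Slice(shared, e.getValue()[0], e.getValue()[1]));
        }
        return field;
    }

    /** Looks up the payload stored for key and returns it as a string. */
    static String spayload(Map<String, Slice> field, String key) {
        return decode(field.get(key));
    }

    public static void main(String[] args) {
        Map<String, Slice> payloadCurrency = index("store1|USD", "store2|EUR", "store3|GBP");
        System.out.println(spayload(payloadCurrency, "store2")); // EUR
        System.out.println(spayload(payloadCurrency, "store9")); // null
    }
}
```

Because all values live in one shared array, a decoder that ignored offset and length would return "USDEURGBP" for every key, which is the kind of bug the (bytes, offset, etc) remark in the quoted message refers to.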
https://github.com/freedev/solr-payload-string-function-query

On Tue, Oct 22, 2019 at 5:54 PM Vincenzo D'Amore <v.dam...@gmail.com> wrote:

> Hi all,
>
> thanks for supporting. And many thanks whom have implemented
> the integration of the github Solr repository with the intellij IDE.
> To configure the environment and run the debugger I spent less than one
> hour, (and most of the time I had to wait the compilation).
> Solr and you guys really rocks together.
>
> What I've done:
>
> I was looking at the original payload function is defined into
> the ValueSourceParser, this function uses a FloatPayloadValueSource to
> return the value found.
>
> As said I wrote a new version of payload function that handles strings, I
> named it spayload, and basically is able to extract the string value from
> the payload.
>
> Given the former example where I have a multivalue field payloadCurrency
>
> payloadCurrency: [
>   "store1|USD",
>   "store2|EUR",
>   "store3|GBP"
> ]
>
> executing spayload(payloadCurrency,store2) returns "EUR", and so on for
> the remaining key/value in the field.
>
> To implement the spayload function, I've added a new ValueSourceParser
> instance to the list of defined functions and which returns
> a StringPayloadValueSource with the value inside (does the same thing of
> former FloatPayloadValueSource).
>
> That's all. As said, always beware of your code when works at first run.
> And really there was something wrong, initially I messed up in the
> conversion of the payload into String (bytes, offset, etc).
> Now it is fixed, or at least it seems to me.
> I see this function cannot be used in the sort, very likely the simple
> implementation of the StringPayloadValueSource miss something.
>
> As far as I understand I'm scratching the surface of this solution, there
> are few things I'm worried about. I have a bunch of questions, please be
> patient.
> This function returns an empty string "" when does not match any key, or
> should return an empty value? not sure about, what's the correct way to
> return an empty value?
> I wasn't able to find a test unit for the payload function in the tests.
> Could you give me few suggestion in order to test properly the
> implementation?
> In case the spayload is used on a different field type (i.e. the use
> spayload on a float payload) the behaviour is not handled. Can this
> function check the type of the payload content?
> And at last, what do you think, can this simple fix be interesting for the
> Solr community, may I try to submit a pull request or add a feature to JIRA?
>
> Best regards,
> Vincenzo
>
>
> On Mon, Oct 21, 2019 at 9:12 PM Erik Hatcher <erik.hatc...@gmail.com>
> wrote:
>
>> Yes. The decoding of a payload based on its schema type is what the
>> payload() function does. Your Payloader won't currently work well/legibly
>> for fields encoded numerically:
>>
>> https://github.com/o19s/payload-component/blob/master/src/main/java/com/o19s/payloads/Payloader.java#L130
>>
>> I think that code could probably be slightly enhanced to leverage
>> PayloadUtils.getPayloadDecoder(fieldType) and use bytes if the field type
>> doesn't have a better decoder.
>>
>> Erik
>>
>>
>> > On Oct 21, 2019, at 2:55 PM, Eric Pugh <ep...@opensourceconnections.com> wrote:
>> >
>> > Have you checked out
>> > https://github.com/o19s/payload-component
>> >
>> > On Mon, Oct 21, 2019 at 2:47 PM Erik Hatcher <erik.hatc...@gmail.com> wrote:
>> >
>> >> How about a single field, with terms like:
>> >>
>> >> store1_USD|125.0 store2_EUR|220.0 store3_GBP|225.0
>> >>
>> >> Would that do the trick?
>> >>
>> >> And yeah, payload decoding is currently limited to float and int with the
>> >> built-in payload() function. We'd need a new way to pull out
>> >> textual/bytes payloads - like maybe a DocTransformer?
>> >>
>> >> Erik
>> >>
>> >>
>> >>> On Oct 21, 2019, at 9:59 AM, Vincenzo D'Amore <v.dam...@gmail.com> wrote:
>> >>>
>> >>> Hi Erick,
>> >>>
>> >>> thanks for getting back to me. We started to use payloads because we have
>> >>> the classical per-store pricing problem.
>> >>> Thousands of stores across and different prices.
>> >>> Then we found the payloads very useful started to use it for many reasons,
>> >>> like enabling/disabling the product for such store, save the stock
>> >>> availability, or save the other info like buy/sell price, discount rates,
>> >>> and so on.
>> >>> All those information are numbers, but stores can also be in different
>> >>> countries, I mean would be useful also have the currency and other
>> >>> attributes related to the store.
>> >>>
>> >>> Thinking about an alternative for payloads maybe I could use the dynamic
>> >>> fields, well, I know it is ugly.
>> >>>
>> >>> Consider this hypothetical case where I have two field payload :
>> >>>
>> >>> payloadPrice: [
>> >>>   "store1|125.0",
>> >>>   "store2|220.0",
>> >>>   "store3|225.0"
>> >>> ]
>> >>>
>> >>> payloadCurrency: [
>> >>>   "store1|USD",
>> >>>   "store2|EUR",
>> >>>   "store3|GBP"
>> >>> ]
>> >>>
>> >>> with dynamic fields I could have different fields for each document.
>> >>>
>> >>> currency_store1_s: "USD"
>> >>> currency_store2_s: "EUR"
>> >>> currency_store3_s: "GBP"
>> >>>
>> >>> But how many dynamic fields like this can I have? more than thousands?
>> >>>
>> >>> Again, I've just started to look at solr-ocrhighlighting github project you
>> >>> suggested.
>> >>> Those seems have written their own payload object type where store ocr
>> >>> highlighting information.
>> >>> It seems interesting, I'll take a look immediately.
>> >>>
>> >>> Thanks again for your time.
>> >>>
>> >>> Best regards,
>> >>> Vincenzo
>> >>>
>> >>>
>> >>> On Mon, Oct 21, 2019 at 2:55 PM Erick Erickson <erickerick...@gmail.com>
>> >>> wrote:
>> >>>
>> >>>> This is one of those situations where I know a client did it, but didn’t
>> >>>> see the code myself.
>> >>>>
>> >>>> So I can’t help much.
>> >>>>
>> >>>> Perhaps a good question at this point, though, is “why do you want to add
>> >>>> string payloads anyway”?
>> >>>>
>> >>>> This isn’t the client, but it might give you some pointers:
>> >>>>
>> >>>> https://github.com/dbmdz/solr-ocrpayload-plugin/blob/master/src/main/java/de/digitalcollections/solr/plugin/components/ocrhighlighting/OcrHighlighting.java
>> >>>>
>> >>>> Best,
>> >>>> Erick
>> >>>>
>> >>>>> On Oct 21, 2019, at 6:37 AM, Vincenzo D'Amore <v.dam...@gmail.com> wrote:
>> >>>>>
>> >>>>> Hi Erick,
>> >>>>>
>> >>>>> It seems I've reached a dead-point, or at least it seems looking at the
>> >>>>> code, it seems I can't easily add a custom decoder:
>> >>>>>
>> >>>>> Looking at PayloadUtils class there is getPayloadDecoder method invoked to
>> >>>>> return the PayloadDecoder :
>> >>>>>
>> >>>>> public static PayloadDecoder getPayloadDecoder(FieldType fieldType) {
>> >>>>>   PayloadDecoder decoder = null;
>> >>>>>
>> >>>>>   String encoder = getPayloadEncoder(fieldType);
>> >>>>>
>> >>>>>   if ("integer".equals(encoder)) {
>> >>>>>     decoder = (BytesRef payload) -> payload == null ? 1 :
>> >>>>>         PayloadHelper.decodeInt(payload.bytes, payload.offset);
>> >>>>>   }
>> >>>>>   if ("float".equals(encoder)) {
>> >>>>>     decoder = (BytesRef payload) -> payload == null ? 1 :
>> >>>>>         PayloadHelper.decodeFloat(payload.bytes, payload.offset);
>> >>>>>   }
>> >>>>>   // encoder could be "identity" at this point, in the case of DelimitedTokenFilterFactory encoder="identity"
>> >>>>>
>> >>>>>   // TODO: support pluggable payload decoders?
>> >>>>>
>> >>>>>   return decoder;
>> >>>>> }
>> >>>>>
>> >>>>> Any advice to work around this situation?
>> >>>>>
>> >>>>>
>> >>>>> On Mon, Oct 21, 2019 at 1:51 AM Erick Erickson <erickerick...@gmail.com>
>> >>>>> wrote:
>> >>>>>
>> >>>>>> You’d need to write one. Payloads are generally intended to hold numerics
>> >>>>>> you can then use in a function query to factor into the score…
>> >>>>>>
>> >>>>>> Best,
>> >>>>>> Erick
>> >>>>>>
>> >>>>>>> On Oct 20, 2019, at 4:57 PM, Vincenzo D'Amore <v.dam...@gmail.com> wrote:
>> >>>>>>>
>> >>>>>>> Sorry, I just realized that I was wrong in how I'm using the payload
>> >>>>>>> function.
>> >>>>>>> Give that the payload function only handles a numeric (integer or float)
>> >>>>>>> payload, could you suggest me an alternative function that handles strings?
>> >>>>>>> If not, should I write one?
>> >>>>>>>
>> >>>>>>> On Sun, Oct 20, 2019 at 10:43 PM Vincenzo D'Amore <v.dam...@gmail.com>
>> >>>>>>> wrote:
>> >>>>>>>
>> >>>>>>>> Hi all,
>> >>>>>>>>
>> >>>>>>>> I'm trying to understand what I did wrong with a payload query that
>> >>>>>>>> returns
>> >>>>>>>>
>> >>>>>>>> error: {
>> >>>>>>>>   metadata: [ "error-class", "org.apache.solr.common.SolrException",
>> >>>>>>>>     "root-error-class", "org.apache.solr.common.SolrException" ],
>> >>>>>>>>   msg: "No payload decoder found for field: colorCode",
>> >>>>>>>>   code: 400
>> >>>>>>>> }
>> >>>>>>>>
>> >>>>>>>> I have reduced my problem in a little sample to show what happens to me.
>> >>>>>>>> Basically I have a document with a couple of payload fields one
>> >>>>>>>> delimited_payloads_string and one delimited_payloads_integer
>> >>>>>>>>
>> >>>>>>>> {
>> >>>>>>>>   field_dps: "key|data",
>> >>>>>>>>   field_dpi: "key|1",
>> >>>>>>>> }
>> >>>>>>>>
>> >>>>>>>> When I execute this query solr returns as expected the payload for the key
>> >>>>>>>>
>> >>>>>>>> q=*:*&fl=payload(field_dpi,key)
>> >>>>>>>>
>> >>>>>>>> {
>> >>>>>>>>   payload(field_dpi,key): 1
>> >>>>>>>> }
>> >>>>>>>>
>> >>>>>>>> But for the strings there have to be something of different to do, because
>> >>>>>>>> I'm unable receive the payload value back. Executing this query, as in the
>> >>>>>>>> short introduction of this post, I receive an error.
>> >>>>>>>>
>> >>>>>>>> ?q=*:*&fl=payload(field_dps,key)
>> >>>>>>>>
>> >>>>>>>> error: {
>> >>>>>>>>   metadata: [ "error-class", "org.apache.solr.common.SolrException",
>> >>>>>>>>     "root-error-class", "org.apache.solr.common.SolrException" ],
>> >>>>>>>>   msg: "No payload decoder found for field: colorCode",
>> >>>>>>>>   code: 400
>> >>>>>>>> }
>> >>>>>>>>
>> >>>>>>>> Am I doing something wrong? How can I read strings payload data?
>> >>>>>>>>
>> >>>>>>>> Thanks in advance for your time,
>> >>>>>>>> Vincenzo
>> >>>>>>>>
>> >>>>>>>> --
>> >>>>>>>> Vincenzo D'Amore
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Vincenzo D'Amore
>> >>>>>>
>> >>>>>
>> >>>>> --
>> >>>>> Vincenzo D'Amore
>> >>>>
>> >>>
>> >>> --
>> >>> Vincenzo D'Amore
>> >>
>> >
>
> --
> Vincenzo D'Amore

--
Vincenzo D'Amore