Thanks Geert-Jan, this is indeed very helpful. The delimiters I gave were just for the sake of the example. I will use an infrequent delimiter.
Cheers,
-Saïd

On Jun 26, 2010, at 1:53 PM, Geert-Jan Brits wrote:

>> If I understand your suggestion correctly, you said that there's NO need to
>> have many Dynamic Fields; instead, we can have one definitive field name,
>> which can store a long string (concatenation of information about tens of
>> pictures), e.g., using "-" and "%" delimiters:
>> pic_url_value1-pic_caption_value1-pic_description_value1%pic_url_value2-pic_caption_value2-pic_description_value2%...
>> I don't clearly see the reason of doing this. Is there a gain in terms of
>> performance? Or does this make programming on the client-side easier? Or
>> something else?
>
> I think you should ask the exact opposite question. If you don't do anything
> with these fields which Solr is particularly good at (searching / filtering
> / faceting / sorting) why go through the trouble of creating dynamic fields?
> (more fields is more overhead cost / tracking cost no matter how you look at
> it)
>
> Moreover, indeed from a client-view it's easier the way I suggested, since
> otherwise you:
> - would have to ask (through SolrJ) to include all dynamic fields to be
>   returned in the fl parameter
>   (http://wiki.apache.org/solr/CommonQueryParameters#fl). This is difficult,
>   because a priori you don't know how many dynamic fields to query. In
>   other words, you can't just ask Solr (through SolrJ, like you asked) to
>   return all dynamic fields beginning with pic_*. (afaik)
> - your client iteration code (looping over the pics) is a bit more involved.
>
> HTH, Cheers,
>
> Geert-Jan
>
> 2010/6/26 Saïd Radhouani <r.steve....@gmail.com>
>
>> Thanks Geert-Jan for the detailed answer. Actually, I don't search at all
>> on these fields. I'm only filtering (w/ vs w/o pic) and sorting (based on the
>> number of pictures). Thus, your suggestion of adding an extra field NrOfPics
>> [0,N] would be the best solution.
>>
>> Regarding the other suggestion:
>>
>>> If you don't need search at all on these fields, the best thing imo is to
>>> store all pic-related info of all pics together by concatenating them with
>>> some delimiter which you know how to separate at the client-side.
>>> That or just store it in an external RDB since Solr is just sitting on the
>>> data and not doing anything intelligent with it.
>>
>> If I understand your suggestion correctly, you said that there's NO need to
>> have many Dynamic Fields; instead, we can have one definitive field name,
>> which can store a long string (concatenation of information about tens of
>> pictures), e.g., using "-" and "%" delimiters:
>> pic_url_value1-pic_caption_value1-pic_description_value1%pic_url_value2-pic_caption_value2-pic_description_value2%...
>>
>> I don't clearly see the reason of doing this. Is there a gain in terms of
>> performance? Or does this make programming on the client-side easier? Or
>> something else?
>>
>>
>> My other question was: in case we use Dynamic Fields, is there any
>> documentation about using SolrJ for this purpose?
>>
>> Thanks
>> -Saïd
>>
>> On Jun 26, 2010, at 12:29 PM, Geert-Jan Brits wrote:
>>
>>> You can treat dynamic fields like any other field, so you can facet, sort,
>>> filter, etc. on these fields (afaik)
>>>
>>> I believe the confusion arises because the use case for dynamic fields
>>> sometimes seems to be ill-understood, i.e.: to be able to use them to do
>>> some kind of wildcard search, e.g.: search for a value in any of the
>>> dynamic fields at once like pic_url_*. This however is NOT possible.
>>>
>>> As far as your question goes:
>>>
>>>> Now, I'm trying to make facets on pictures: display doc w/ pic vs. doc w/o
>>>> pic
>>>> To the best of my knowledge, everyone is saying that faceting cannot be
>>>> done on dynamic fields (only on definitive field names).
>>>> Thus, I tried the
>>>> following and it's working: I assume that the stored pictures have a
>>>> sequential number (_1, _2, etc.), i.e., if pic_url_1 exists in the index, it
>>>> means that the underlying doc has at least one picture:
>>>> ...&facet=on&facet.field=pic_url_1&facet.mincount=1&fq=pic_url_1:*
>>>> While this is working fine, I'm wondering whether there's a cleaner way to
>>>> do the same thing without assuming that pictures have a sequential number.
>>>
>>> If I understand your question correctly: faceting on docs with and without
>>> pics could of course be done like you mention, however it would be more
>>> efficient to have an extra field defined: hasAtLeastOnePic with values (0 |
>>> 1)
>>> use that to facet / filter on.
>>>
>>> you can extend this to NrOfPics [0,N) if you need to filter / facet on docs
>>> with a certain nr of pics.
>>>
>>> also I wondered what else you wanted to do with this pic-related info. Do
>>> you want to search on pic-description / pic-caption for instance? In that
>>> case the dynamic-fields approach may not be what you want: how would you
>>> know in which dynamic-field to search for a particular term? Would it be
>>> pic_desc_1, or pic_desc_x? Of course you could OR over all dynamic fields,
>>> but you need to know an upper bound for the nr of pics and it
>>> really doesn't feel right, to me at least.
>>>
>>> If you need search on pic_description for instance, but don't mind what pic
>>> matches, you could create a single field pic_description and put in the
>>> concat of all pic-descriptions and search on that, or just make it a
>>> multi-valued field.
>>>
>>> If you don't need search at all on these fields, the best thing imo is to
>>> store all pic-related info of all pics together by concatenating them with
>>> some delimiter which you know how to separate at the client-side.
>>> That or just store it in an external RDB since Solr is just sitting on the
>>> data and not doing anything intelligent with it.
>>>
>>> I assume btw that you don't want to sort / facet on pic-desc / pic_caption /
>>> pic_url either (I have a hard time thinking of a useful use case for that)
>>>
>>> HTH,
>>>
>>> Geert-Jan
>>>
>>>
>>>
>>> 2010/6/26 Saïd Radhouani <r.steve....@gmail.com>
>>>
>>>> Thanks so much Otis. This is working great.
>>>>
>>>> Now, I'm trying to make facets on pictures: display doc w/ pic vs. doc w/o
>>>> pic
>>>>
>>>> To the best of my knowledge, everyone is saying that faceting cannot be
>>>> done on dynamic fields (only on definitive field names). Thus, I tried the
>>>> following and it's working: I assume that the stored pictures have a
>>>> sequential number (_1, _2, etc.), i.e., if pic_url_1 exists in the index, it
>>>> means that the underlying doc has at least one picture:
>>>>
>>>> ...&facet=on&facet.field=pic_url_1&facet.mincount=1&fq=pic_url_1:*
>>>>
>>>> While this is working fine, I'm wondering whether there's a cleaner way to
>>>> do the same thing without assuming that pictures have a sequential number.
>>>>
>>>> Also, do you have any documentation about handling Dynamic Fields using
>>>> SolrJ? So far, I found only issues about that on JIRA, but no documentation.
>>>>
>>>> Thanks a lot.
>>>>
>>>> -Saïd
>>>>
>>>> On Jun 26, 2010, at 1:18 AM, Otis Gospodnetic wrote:
>>>>
>>>>> Saïd,
>>>>>
>>>>> Dynamic fields could help here, for example imagine a doc with:
>>>>> id
>>>>> pic_url_*
>>>>> pic_caption_*
>>>>> pic_description_*
>>>>>
>>>>> See http://wiki.apache.org/solr/SchemaXml#Dynamic_fields
>>>>>
>>>>> So, for you:
>>>>>
>>>>> <dynamicField name="pic_url_*" type="string" indexed="true" stored="true"/>
>>>>> <dynamicField name="pic_caption_*" type="text" indexed="true" stored="true"/>
>>>>> <dynamicField name="pic_description_*" type="text" indexed="true" stored="true"/>
>>>>>
>>>>> Then you can add docs with unlimited number of
>>>>> pic_(url|caption|description)_* fields, e.g.
>>>>>
>>>>> id
>>>>> pic_url_1
>>>>> pic_caption_1
>>>>> pic_description_1
>>>>>
>>>>> id
>>>>> pic_url_2
>>>>> pic_caption_2
>>>>> pic_description_2
>>>>>
>>>>>
>>>>> Otis
>>>>> ----
>>>>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
>>>>> Lucene ecosystem search :: http://search-lucene.com/
>>>>>
>>>>>
>>>>>
>>>>> ----- Original Message ----
>>>>>> From: Saïd Radhouani <r.steve....@gmail.com>
>>>>>> To: solr-user@lucene.apache.org
>>>>>> Sent: Fri, June 25, 2010 6:01:13 PM
>>>>>> Subject: Setting many properties for a multivalued field. Schema.xml ? External file?
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm trying to index data containing a multivalued field "picture",
>>>>>> that has three properties: url, caption and description:
>>>>>>
>>>>>> <picture/>
>>>>>> <url/>
>>>>>> <caption/>
>>>>>> <description/>
>>>>>>
>>>>>> Thus, each indexed document might have many pictures, each of them has
>>>>>> a url, a caption, and a description.
>>>>>>
>>>>>> I wonder whether it's possible to store this data using
>>>>>> only schema.xml. I couldn't figure it out so far.
>>>>>> Instead, I'm thinking of using
>>>>>> an external file to store the properties of each picture, but I haven't
>>>>>> tried this solution yet, waiting for your suggestions...
>>>>>>
>>>>>> Thanks,
>>>>>> -Saïd
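
For reference, the single-field approach discussed at the top of this thread (concatenate each picture's url, caption and description with delimiters, store the result in one field, and split it again on the client) can be sketched as below. This is only a minimal illustration in plain Python, not SolrJ code. The tab and newline delimiters and the field name "pics" are assumptions made for the example; NrOfPics is the counter field suggested in the thread.

```python
# Hypothetical delimiters (not from the thread): tab between a picture's
# properties, newline between pictures. Use whatever never occurs in your data.
FIELD_SEP = "\t"
RECORD_SEP = "\n"


def pack_pictures(pictures):
    """Join (url, caption, description) tuples into one delimited string."""
    for picture in pictures:
        for value in picture:
            # Refuse values that would make the packed string ambiguous.
            if FIELD_SEP in value or RECORD_SEP in value:
                raise ValueError("value contains a delimiter: %r" % (value,))
    return RECORD_SEP.join(FIELD_SEP.join(picture) for picture in pictures)


def unpack_pictures(packed):
    """Split the stored string back into (url, caption, description) tuples."""
    if not packed:
        return []
    return [tuple(record.split(FIELD_SEP)) for record in packed.split(RECORD_SEP)]


def build_doc(doc_id, pictures):
    """Build a plain dict of field values; "pics" is a hypothetical field name."""
    return {
        "id": doc_id,
        "pics": pack_pictures(pictures),  # stored only, never searched
        "NrOfPics": len(pictures),        # the field to filter / sort on
    }
```

On the client side, unpack_pictures(doc["pics"]) returns the pictures in their original order, so there is no need to enumerate pic_* fields in the fl parameter, and NrOfPics alone covers the "with vs. without pictures" filter and the sort by number of pictures.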