Thanks again, Jeff. I will check the documentation on join queries because I have never used them before.
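A minimal sketch of how the two-document idea might look with Solr's `{!join}` query parser. The field names `content`, `book_id` and `price`, and the `price-<id>` document id scheme, are assumptions for illustration and not taken from our schema. The join runs from the large content documents to the small price documents, so the result set consists of price documents and can be sorted and paginated by price inside Solr.

```python
from urllib.parse import urlencode


def build_price_sorted_search(text_query, start=0, rows=10):
    """Build select parameters that search book content but return price docs.

    Assumed (hypothetical) schema: content docs carry `id` and `content`,
    price docs carry `id`, `book_id` and `price`.
    """
    return {
        # Match content docs on the full text, then join from their `id`
        # to the `book_id` of the small price docs.
        # Note: text_query is inserted as-is; real code should escape it.
        "q": "{!join from=id to=book_id}content:(%s)" % text_query,
        # The returned docs are price docs, so sorting by price works.
        "sort": "price desc",
        "start": start,
        "rows": rows,
    }


def build_price_doc(book_id, new_price):
    """Build the small price document that is re-indexed on a price change.

    Only this tiny document is rewritten; the 1 MB content doc is untouched.
    """
    return {"id": "price-%s" % book_id, "book_id": book_id, "price": new_price}
```

The parameters could be passed to the select handler as `urlencode(build_price_sorted_search("walking"))`, and the price document posted to the update handler as JSON. Whether the join is fast enough for 1 million documents would still need to be measured.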
Regards,
Roland

2016-02-24 19:07 GMT+01:00 Jeff Wartes <jwar...@whitepages.com>:

>
> I suspect your problem is the intersection of “very large document” and
> “high rate of change”. Either of those alone would be fine.
>
> You’re correct, if the thing you need to search or sort by is the thing
> with a high change rate, you probably aren’t going to be able to peel those
> things out of your index.
>
> Perhaps you could work something out with join queries? So you have two
> kinds of documents - book content and book price - and your high-frequency
> change is limited to documents with very little data.
>
>
> On 2/24/16, 4:01 AM, "roland.sz...@booknwalk.com on behalf of Szűcs
> Roland" <roland.sz...@booknwalk.com on behalf of
> szucs.rol...@bookandwalk.hu> wrote:
>
> >I have already checked it in the ref. guide. It states that you cannot
> >search in external fields:
> >
> >https://cwiki.apache.org/confluence/display/solr/Working+with+External+Files+and+Processes
> >
> >I am really curious whether my problem is an unusual one, or whether
> >SOLR mainly focuses on search and not on a kind of end-to-end support.
> >How does this approach work with 1 million documents with frequently
> >changing prices?
> >
> >Thanks for your time,
> >
> >Roland
> >
> >2016-02-24 12:39 GMT+01:00 Stefan Matheis <matheis.ste...@gmail.com>:
> >
> >> Depending on what features you actually need, it might be worth a look
> >> at "External File Fields", Roland?
> >>
> >> -Stefan
> >>
> >> On Wed, Feb 24, 2016 at 12:24 PM, Szűcs Roland
> >> <szucs.rol...@bookandwalk.hu> wrote:
> >> > Thanks for your help, Jeff.
> >> >
> >> > Can it work in a production environment? Imagine that my customer
> >> > initiates a query with 1,000 docs in the result set. I cannot use
> >> > SOLR's pagination because the field that is the basis of the sort,
> >> > for example the price, is not included in the schema. The customer
> >> > wants the list in descending order of price.
> >> >
> >> > So I have to get all 1,000 doc ids from Solr and find their metadata
> >> > in an SQL database, or in a cache in the best case. Is this the way
> >> > you suggested? Is it not too slow?
> >> >
> >> > Regards,
> >> > Roland
> >> >
> >> > 2016-02-23 19:29 GMT+01:00 Jeff Wartes <jwar...@whitepages.com>:
> >> >
> >> >>
> >> >> My suggestion would be to split your problem domain. Use Solr
> >> >> exclusively for search - index the id and only those fields you need
> >> >> to search on. Then use some other data store for retrieval. Get the
> >> >> ids from the Solr results, and look them up in the data store to get
> >> >> the rest of your fields. This allows you to keep your Solr docs as
> >> >> small as possible, and you only need to update them when a
> >> >> *searchable* field changes.
> >> >>
> >> >> Every "update" in Solr is a delete/insert. Even the "atomic update"
> >> >> feature is just a shortcut for that. It requires stored fields
> >> >> because the data from the stored fields gets copied into the new
> >> >> insert.
> >> >>
> >> >>
> >> >> On 2/22/16, 12:21 PM, "Roland Szűcs" <roland.sz...@booknwalk.com>
> >> >> wrote:
> >> >>
> >> >> >Hi folks,
> >> >> >
> >> >> >We use SOLR 5.2.1. We have ebooks stored in SOLR. The majority of
> >> >> >the fields, like content, author and publisher, do not change at
> >> >> >all. Only the price field changes frequently.
> >> >> >
> >> >> >We let the customers run full-text searches, so we indexed the
> >> >> >content field. Due to the frequency of the price updates we use the
> >> >> >atomic update feature. As a requirement of atomic updates we have
> >> >> >to store all the fields, even the content field, which is
> >> >> >1 MB/document and which we wanted only to index, not to store.
> >> >> >
> >> >> >When we wanted to update 100 documents with atomic updates it took
> >> >> >about 3 minutes. Taking into account that our metadata per document
> >> >> >is 1 KB and our content field per document is 1 MB, we use 1000
> >> >> >times more memory to accelerate the update process.
> >> >> >
> >> >> >I am almost 100% sure that we are doing something wrong.
> >> >> >
> >> >> >What is the best practice for frequent updates when 99% of a given
> >> >> >document is constant forever?
> >> >> >
> >> >> >Thanks in advance
> >> >> >
> >> >> >--
> >> >> >Roland Szűcs
> >> >> >Connect with me on LinkedIn
> >> >> ><https://www.linkedin.com/pub/roland-sz%C5%B1cs/28/226/24/hu>
> >> >> ><https://bookandwalk.hu/>
> >> >> >CEO
> >> >> >Phone: +36 1 210 81 13
> >> >> >Bookandwalk.hu <https://bokandwalk.hu/>
> >> >>
> >> >

--
Szűcs Roland
Connect with me on LinkedIn
<https://www.linkedin.com/pub/roland-sz%C5%B1cs/28/226/24/hu>
<https://bookandwalk.hu/>
CEO
Phone: +36 1 210 81 13
Bookandwalk.hu <https://bokandwalk.hu/>