Re: very slow frequent updates

Szűcs Roland Wed, 24 Feb 2016 10:10:10 -0800

Thanks again, Jeff. I will check the documentation on join queries because I
have never used them before.
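
As a starting point, the idea of separate book content and book price documents might translate into a request like the one sketched below. It is only a sketch under assumptions: the core URL and the field names content, book_id and price are hypothetical, and nothing here has been run against our index. It builds the request parameters, using Solr's {!join} query parser as a filter, and does not send anything.

```python
from urllib.parse import urlencode

# Hypothetical names: the core URL and the fields "content", "book_id" and
# "price" are assumptions for illustration, not taken from a real schema.
SOLR_SELECT_URL = "http://localhost:8983/solr/books/select"


def build_join_params(text, max_price, rows=10):
    """Build parameters for a full-text query filtered through a price join."""
    return {
        # Full-text search runs against the large, rarely changing book documents.
        # Note: the search text is inserted as-is, without any escaping.
        "q": "content:({})".format(text),
        # Join filter: keep only books whose "id" equals the "book_id" of a
        # small price document that matches the price range.
        "fq": "{{!join from=book_id to=id}}price:[* TO {}]".format(max_price),
        "rows": rows,
        "wt": "json",
    }


params = build_join_params("walking", 3000)
query_url = SOLR_SELECT_URL + "?" + urlencode(params)
print(query_url)
```

With this layout a price change would only rewrite the small price document, not the 1 MB content document. Whether the results can also be sorted by a field that lives on the price documents is something I still need to verify in the documentation.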


Regards

Roland

2016-02-24 19:07 GMT+01:00 Jeff Wartes <jwar...@whitepages.com>:

>
> I suspect your problem is the intersection of “very large document” and
> “high rate of change”. Either of those alone would be fine.
>
> You’re correct, if the thing you need to search or sort by is the thing
> with a high change rate, you probably aren’t going to be able to peel those
> things out of your index.
>
> Perhaps you could work something out with join queries? So you have two
> kinds of documents - book content and book price - and your high-frequency
> change is limited to documents with very little data.
>
>
>
>
>
> On 2/24/16, 4:01 AM, "roland.sz...@booknwalk.com on behalf of Szűcs
> Roland" <roland.sz...@booknwalk.com on behalf of
> szucs.rol...@bookandwalk.hu> wrote:
>
> >I have already checked it in the ref. guide. It states that you cannot
> >search on external fields:
> >
> >https://cwiki.apache.org/confluence/display/solr/Working+with+External+Files+and+Processes
> >
> >I am really curious whether my problem is an unusual one, or whether Solr
> >simply focuses on search rather than on end-to-end support. How does this
> >approach work with 1 million documents with frequently changing prices?
> >
> >Thanks for your time,
> >
> >Roland
> >
> >2016-02-24 12:39 GMT+01:00 Stefan Matheis <matheis.ste...@gmail.com>:
> >
> >> Depending on which features you actually need, it might be worth a look
> >> at "External File Fields", Roland?
> >>
> >> -Stefan
> >>
> >> On Wed, Feb 24, 2016 at 12:24 PM, Szűcs Roland
> >> <szucs.rol...@bookandwalk.hu> wrote:
> >> > Thanks for your help, Jeff.
> >> >
> >> > Can it work in a production environment? Imagine that my customer runs a
> >> > query with 1,000 docs in the result set. I cannot use Solr's pagination
> >> > because the field the sort is based on, for example the price, is not
> >> > included in the schema. The customer wants the list in descending order
> >> > of price.
> >> >
> >> > So I have to get all 1,000 doc ids from Solr and look up their metadata
> >> > in an SQL database, or in a cache in the best case. Is this what you
> >> > suggested? Is it not too slow?
> >> >
> >> > Regards,
> >> > Roland
> >> >
> >> > 2016-02-23 19:29 GMT+01:00 Jeff Wartes <jwar...@whitepages.com>:
> >> >
> >> >>
> >> >> My suggestion would be to split your problem domain. Use Solr exclusively
> >> >> for search - index the id and only those fields you need to search on.
> >> >> Then use some other data store for retrieval. Get the ids from the Solr
> >> >> results, and look them up in the data store to get the rest of your
> >> >> fields. This allows you to keep your Solr docs as small as possible, and
> >> >> you only need to update them when a *searchable* field changes.
> >> >>
> >> >> Every “update” in Solr is a delete/insert. Even the “atomic update”
> >> >> feature is just a shortcut for that. It requires stored fields because
> >> >> the data from the stored fields gets copied into the new insert.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> On 2/22/16, 12:21 PM, "Roland Szűcs" <roland.sz...@booknwalk.com> wrote:
> >> >>
> >> >> >Hi folks,
> >> >> >
> >> >> >We use Solr 5.2.1. We have ebooks stored in Solr. The majority of the
> >> >> >fields, such as content, author and publisher, do not change at all.
> >> >> >Only the price field changes frequently.
> >> >> >
> >> >> >We let customers run full-text searches, so we indexed the content
> >> >> >field. Due to the frequency of the price updates we use the atomic
> >> >> >update feature. As a requirement of atomic updates we have to store all
> >> >> >the fields, even the content field, which is 1 MB/document and which we
> >> >> >wanted only to index, not to store.
> >> >> >
> >> >> >When we tried to update 100 documents with atomic updates, it took
> >> >> >about 3 minutes. Given that our metadata per document is 1 KB and our
> >> >> >content field per document is 1 MB, we use 1,000 times more memory to
> >> >> >accelerate the update process.
> >> >> >
> >> >> >I am almost 100% sure that we are doing something wrong.
> >> >> >
> >> >> >What is the best practice for frequent updates when 99% of a given
> >> >> >document is constant forever?
> >> >> >
> >> >> >Thanks in advance
> >> >> >
> >> >> >--
> >> >> ><https://www.linkedin.com/pub/roland-sz%C5%B1cs/28/226/24/hu>
> Roland
> >> >> Szűcs
> >> >> ><https://www.linkedin.com/pub/roland-sz%C5%B1cs/28/226/24/hu>
> Connect
> >> >> with
> >> >> >me on Linkedin <
> >> >> https://www.linkedin.com/pub/roland-sz%C5%B1cs/28/226/24/hu>
> >> >> ><https://bookandwalk.hu/>
> >> >> >CEO Phone: +36 1 210 81 13
> >> >> >Bookandwalk.hu <https://bokandwalk.hu/>
> >> >>
> >> >
> >> >
> >> >
> >>
> >
> >
> >
>



-- 
<https://www.linkedin.com/pub/roland-sz%C5%B1cs/28/226/24/hu> Szűcs Roland
<https://www.linkedin.com/pub/roland-sz%C5%B1cs/28/226/24/hu> Connect with
me on LinkedIn <https://www.linkedin.com/pub/roland-sz%C5%B1cs/28/226/24/hu>
<https://bookandwalk.hu/>
CEO Phone: +36 1 210 81 13
Bookandwalk.hu <https://bokandwalk.hu/>
