Egor, would you mind sharing some best practices for using cursorMark with SolrEntityProcessor?
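
For reference, here is a minimal sketch of the entity from the quoted thread with cursor paging switched on. It assumes the uniqueKey field is named `id`, which the thread does not state; cursors need a sort that includes the uniqueKey field, so adjust the `sort` value to the real schema. The other attributes are copied from Karl's original config.

```xml
<dataConfig>
  <document>
    <!-- cursorMark="true" enables cursor-based paging instead of start/rows offsets. -->
    <!-- sort must include the uniqueKey field; "id" is an assumed field name. -->
    <entity name="solr_doc" processor="SolrEntityProcessor"
            url="http://127.0.0.1/solr/at-uk"
            query="*:*"
            cursorMark="true"
            sort="id asc"
            rows="100"
            fl="*,old_version:_version_"
            wt="javabin">
    </entity>
  </document>
</dataConfig>
```

According to the thread, this took the copy from roughly 800k to 1.2m documents without slowdown, but old-generation GC times still climbed at around 1.3m, so it is a starting point rather than a complete fix.
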

On Thu, Feb 6, 2020 at 1:04 PM Karl Stoney
<karl.sto...@autotrader.co.uk.invalid> wrote:

> Spoke too soon, looks like it memory leaks. After about 1.3m the old gc
> times went through the root and solr was almost unresponsive, had to
> abort. We're going to write our own implementation to copy data from one
> core to another that runs outside of solr.
>
> On 06/02/2020, 09:57, "Karl Stoney" <karl.sto...@autotrader.co.uk> wrote:
>
> I cannot believe how much of a difference that cursorMark and sort
> order made.
> Previously it died about 800k docs, now we're at 1.2m without any
> slowdown.
>
> Thank you so much
>
> On 06/02/2020, 08:14, "Mikhail Khludnev" <m...@apache.org> wrote:
>
> Hello, Karl.
> Please check these:
>
> https://lucene.apache.org/solr/guide/6_6/pagination-of-results.html#constraints-when-using-cursors
>
> https://lucene.apache.org/solr/guide/6_6/uploading-structured-data-store-data-with-the-data-import-handler.html#solrentityprocessor
> cursorMark="true"
> Good luck.
>
> On Wed, Feb 5, 2020 at 10:06 PM Karl Stoney
> <karl.sto...@autotrader.co.uk.invalid> wrote:
>
> > Hey All,
> > I'm trying to implement a simplistic reindex strategy to copy all of
> > the data out of one collection, into another, on a single node (no
> > distributed queries).
> >
> > It's approx 4 million documents, with an index size of 26gig. Based on
> > your experience, I'm wondering what people feel sensible values for the
> > SolrEntityProcessor are (to give me a sensible starting point, to save
> > me iterating over loads of them).
> >
> > This is where I'm at right now. I know `rows` would increase memory
> > pressure but speed up the copy, I can't really find anywhere online
> > where people have benchmarked different values for rows and the
> > default (50) seems quite low.
> >
> > <dataConfig>
> >   <document>
> >     <entity name="solr_doc" processor="SolrEntityProcessor"
> >             query="*:*"
> >             rows="100"
> >             fl="*,old_version:_version_"
> >             wt="javabin"
> >             url="http://127.0.0.1/solr/at-uk">
> >     </entity>
> >   </document>
> > </dataConfig>
> >
> > Any suggestions are welcome.
> > Thanks
> > This e-mail is sent on behalf of Auto Trader Group Plc, Registered
> > Office: 1 Tony Wilson Place, Manchester, Lancashire, M15 4FN
> > (Registered in England No. 9439967). This email and any files
> > transmitted with it are confidential and may be legally privileged,
> > and intended solely for the use of the individual or entity to whom
> > they are addressed. If you have received this email in error please
> > notify the sender. This email message has been swept for the presence
> > of computer viruses.
>
> --
> Sincerely yours
> Mikhail Khludnev

--
Sincerely yours
Mikhail Khludnev