Has anyone ever been successful in building the Suggester component over 150M records? If you have built it at that scale, please comment on how you made it work.
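For reference, this is a minimal sketch of how I am triggering the builds, assuming the standard SuggestComponent parameters (suggest.build, suggest.dictionary) against the /suggest handler and the dictionary names from the thread below; your handler names may differ:

    # Build one dictionary at a time instead of suggest.buildAll=true,
    # so a failure can be traced to a specific lookup implementation.
    curl "http://localhost/solr/addressbook/suggest?suggest=true&suggest.dictionary=mySuggester2&suggest.build=true&wt=json"
    curl "http://localhost/solr/addressbook/suggest?suggest=true&suggest.dictionary=mySuggester1&suggest.build=true&wt=json"

Building them separately should at least show whether it is the FST-based or the infix suggester that gives up at this scale.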
On Tue, Jun 26, 2018 at 1:37 AM, Ratnadeep Rakshit <ratnad...@qedrix.com> wrote:

> The site_address field has all the addresses of the United States. The idea
> is to build something similar to Google Places autosuggest.
>
> Here's an example query:
> curl "http://localhost/solr/addressbook/suggest?suggest.q=1054%20club&wt=json"
>
> Response:
>
> {
>   "responseHeader": {
>     "status": 0,
>     "QTime": 3125,
>     "params": {
>       "suggest.q": "1054 club",
>       "wt": "json"
>     }
>   },
>   "suggest": {
>     "mySuggester2": {
>       "1054 club": {
>         "numFound": 3,
>         "suggestions": [{
>           "term": "<b>1054</b> null N COUNTRY <b>CLUB</b> null BLVD null STOCKTON CA 95204 5008",
>           "weight": 0,
>           "payload": "0023865882|06077|37.970769,-121.310433"
>         }, {
>           "term": "<b>1054</b> null E HERITAGE <b>CLUB</b> null CIR null DELRAY BEACH FL 33483 3482",
>           "weight": 0,
>           "payload": "0117190535|12099|26.445485,-80.069336"
>         }, {
>           "term": "<b>1054</b> null null CORAL <b>CLUB</b> null DR <b>1054</b> CORAL SPRINGS FL 33071 5657",
>           "weight": 0,
>           "payload": "0111342342|12011|26.243918,-80.267577"
>         }]
>       }
>     },
>     "mySuggester1": {
>       "1054 club": {
>         "numFound": 0,
>         "suggestions": []
>       }
>     }
>   }
> }
>
> Now, when I build with 25M address records in the addressbook core, the
> process runs smoothly. Heap utilization peaks at about 56% of the 20GB
> allotted to Solr.
> I am not very experienced in measuring Solr performance, but it looks like
> when I increase the record count beyond 25M in the core, the build process
> fails. Querying the suggester still works.
>
> Did that answer your questions correctly?
>
> On Tue, Jun 12, 2018 at 3:17 PM, Alessandro Benedetti <a.benede...@sease.io> wrote:
>
>> Hi,
>> first of all, the two suggesters you are using are based on different
>> data structures (with different memory utilisation):
>>
>> - FuzzyLookupFactory -> FST (in memory, and stored in binary form on disk)
>> - AnalyzingInfixLookupFactory -> auxiliary Lucene index
>>
>> Both data structures should be very memory efficient (both in building
>> and storage).
>> What is the cardinality of the fields you are building suggestions from
>> (site_address and site_address_other)?
>> What is the memory situation in Solr when you start the suggester build?
>> You are allocating much more memory to the JVM Solr process than to the
>> OS, which in your situation means the OS cache cannot hold the entire
>> index (the ideal scenario).
>>
>> I would recommend putting some monitoring in place (there are plenty of
>> open source tools to do that).
>>
>> Regards
>>
>> ---------------
>> Alessandro Benedetti
>> Search Consultant, R&D Software Engineer, Director
>> Sease Ltd. - www.sease.io
>> --
>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
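On the monitoring point above: if you happen to be on Solr 6.4 or later, the built-in Metrics API is one quick way to watch heap while the build runs, without installing a separate tool. A rough sketch (metric names come from the JVM metrics group; adjust the host and port to your setup):

    # Poll JVM heap usage every 10 seconds while the suggester build is running.
    while true; do
      curl -s "http://localhost/solr/admin/metrics?group=jvm&prefix=memory.heap&wt=json"
      sleep 10
    done

Comparing memory.heap.used against memory.heap.max during the 25M build and the larger builds should make it clearer whether the failure is actually heap related.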