Hi Andy,

The amount of memory, and hence the NumberOfBuffers setting, that a dataset needs in order to be fully cached in memory for best performance depends on how well the dataset compresses. That in turn depends on its structure (data regularity), the size of its literals, and so on, which makes it hard to predict accurately beforehand.

As a general rule for the column store, we say the average storage requirement is 10 bytes per quad, i.e. 10 GB of RAM per billion triples, but as noted this varies: some datasets need less and some more. For example, we have seen 9 bytes/quad with DBpedia, 7 bytes/quad with BSBM, and 14 bytes/quad with web crawl data (which is not very regular).
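To make the arithmetic concrete, here is a rough sketch of turning a quad count into a buffer count using the figures above. It assumes each buffer caches one 8 KB database page, which is not stated in this thread, and the function names are purely illustrative. Treat the result as a starting estimate, not a guarantee.

```python
import math

# Assumption (not stated in this thread): one buffer caches one 8 KB page.
PAGE_SIZE_BYTES = 8 * 1024

# Bytes-per-quad figures quoted in this message.
OBSERVED_BYTES_PER_QUAD = {
    "general rule": 10,
    "DBpedia": 9,
    "BSBM": 7,
    "web crawl": 14,
}


def estimate_data_bytes(num_quads, bytes_per_quad=10):
    """Estimated column-store size in bytes for the given number of quads."""
    return num_quads * bytes_per_quad


def estimate_number_of_buffers(num_quads, bytes_per_quad=10,
                               page_size=PAGE_SIZE_BYTES):
    """Number of page-sized buffers needed to hold the estimated data."""
    return math.ceil(estimate_data_bytes(num_quads, bytes_per_quad) / page_size)


if __name__ == "__main__":
    quads = 1_000_000_000  # one billion quads, as in the rule of thumb
    for name, bpq in OBSERVED_BYTES_PER_QUAD.items():
        size_gb = estimate_data_bytes(quads, bpq) / 1e9
        buffers = estimate_number_of_buffers(quads, bpq)
        print(f"{name}: ~{size_gb:.0f} GB, ~{buffers} buffers")
```

The server process needs memory beyond the cached pages themselves, so the VM should be sized with some headroom above this figure, and the estimate is best checked against the actual space utilisation once the data is loaded.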
Once a dataset is loaded, its space utilisation can be determined as detailed at:

http://docs.openlinksw.com/virtuoso/coredbengine.html#colstorespaceutil

I hope this helps ...

Best Regards
Hugh Williams
Professional Services
OpenLink Software, Inc. // http://www.openlinksw.com/
Weblog -- http://www.openlinksw.com/blogs/
LinkedIn -- http://www.linkedin.com/company/openlink-software/
Twitter -- http://twitter.com/OpenLink
Google+ -- http://plus.google.com/100570109519069333827/
Facebook -- http://www.facebook.com/OpenLinkSoftware
Universal Data Access, Integration, and Management Technology Providers

> On 22 Apr 2015, at 17:01, Andy Jenkinson <andy.jenkin...@ebi.ac.uk> wrote:
>
> Hi,
>
> I would like to find a more efficient way to work out how much memory to
> allocate to some new Virtuoso VMs, which is directly dependent on how many
> buffers I allocate to Virtuoso. I want to allocate enough memory so that the
> column store is effectively operating in-memory. To work this out, currently
> we have to allocate a large number of buffers and then monitor the service to
> see how many are actually used, but this is:
> a) time consuming
> b) an inefficient use of memory in the meantime
> c) difficult to plan
>
> I realise one would "normally" allocate the buffers according to the
> available memory, but I want to do the reverse. Is there a rough guide to
> calculate an appropriate number of buffers from the number of triples in the
> dataset?
>
> Cheers,
> Andy
> _______________________________________________
> Virtuoso-users mailing list
> Virtuoso-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/virtuoso-users