Hi Andy,

The amount of memory, and hence the NumberOfBuffers setting, that a dataset 
needs in order to be cached fully in memory for best performance depends on 
how well the dataset compresses. That in turn depends on its structure (data 
regularity), the size of its literals and so on, which makes it hard to 
predict accurately beforehand. As a general rule for the column store, we say 
the average storage requirement is 10 bytes per quad, i.e. 10 GB of RAM per 
billion triples, but as noted this varies, with some datasets needing less and 
some more. For example, we have seen 9 bytes/quad with DBpedia, 7 bytes/quad 
with BSBM, and 14 bytes/quad with web crawl data (not very regular).
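
As a rough illustration of that arithmetic, here is a minimal Python sketch 
that turns a triple count into an approximate RAM figure and buffer count. It 
assumes each buffer caches one 8 KiB database page and ignores any per-buffer 
overhead; the function names are hypothetical, and the bytes/quad figure 
should be replaced with whatever you measure on your own data.

```python
import math

# Assumption: one Virtuoso buffer caches one 8 KiB database page.
BYTES_PER_BUFFER = 8 * 1024

# Rule-of-thumb average from this email; real datasets vary (7 to 14 seen).
DEFAULT_BYTES_PER_QUAD = 10.0


def estimate_ram_gb(num_quads, bytes_per_quad=DEFAULT_BYTES_PER_QUAD):
    """Approximate RAM (decimal GB) needed to hold the dataset in memory."""
    return num_quads * bytes_per_quad / 1e9


def estimate_buffers(num_quads, bytes_per_quad=DEFAULT_BYTES_PER_QUAD):
    """Approximate NumberOfBuffers needed to cache the whole dataset."""
    return math.ceil(num_quads * bytes_per_quad / BYTES_PER_BUFFER)


if __name__ == "__main__":
    quads = 1_000_000_000  # one billion triples
    # Compare the rule of thumb with the low and high figures quoted above.
    for label, bpq in [("BSBM", 7.0), ("rule of thumb", 10.0), ("web crawl", 14.0)]:
        print(label, estimate_ram_gb(quads, bpq), "GB,",
              estimate_buffers(quads, bpq), "buffers")
```

Treat the result as a starting point only, and leave headroom for the OS and 
for working memory beyond the buffer pool.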

Once a dataset is loaded its space utilisation can be determined as detailed at:

        http://docs.openlinksw.com/virtuoso/coredbengine.html#colstorespaceutil

I hope this helps ...

Best Regards
Hugh Williams
Professional Services
OpenLink Software, Inc.      //              http://www.openlinksw.com/
Weblog   -- http://www.openlinksw.com/blogs/
LinkedIn -- http://www.linkedin.com/company/openlink-software/
Twitter  -- http://twitter.com/OpenLink
Google+  -- http://plus.google.com/100570109519069333827/
Facebook -- http://www.facebook.com/OpenLinkSoftware
Universal Data Access, Integration, and Management Technology Providers

> On 22 Apr 2015, at 17:01, Andy Jenkinson <andy.jenkin...@ebi.ac.uk> wrote:
> 
> Hi,
> 
> I would like to find a more efficient way to work out how much memory to 
> allocate to some new Virtuoso VMs, which is directly dependent on how many 
> buffers I allocate to Virtuoso. I want to allocate enough memory so that the 
> column store is effectively operating in-memory. To work this out, currently 
> we have to allocate a large number of buffers and then monitor the service to 
> see how many are actually used, but this is:
> a) time consuming
> b) an inefficient use of memory in the meantime
> c) difficult to plan
> 
> I realise one would "normally" allocate the buffers according to the 
> available memory, but I want to do the reverse. Is there a rough guide to 
> calculate an appropriate number of buffers from the number of triples in the 
> dataset?
> 
> Cheers,
> Andy
> ------------------------------------------------------------------------------
> BPM Camp - Free Virtual Workshop May 6th at 10am PDT/1PM EDT
> Develop your own process in accordance with the BPMN 2 standard
> Learn Process modeling best practices with Bonita BPM through live exercises
> http://www.bonitasoft.com/be-part-of-it/events/bpm-camp-virtual- event?utm_
> source=Sourceforge_BPM_Camp_5_6_15&utm_medium=email&utm_campaign=VA_SF
> _______________________________________________
> Virtuoso-users mailing list
> Virtuoso-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/virtuoso-users

