Awesome. Thank you.

On Wed, Feb 9, 2011 at 8:31 PM, Kingsley Idehen <kide...@openlinksw.com> wrote:
> On 2/9/11 8:29 AM, Abhi wrote:
>> Thanks for that.
>>
>> Say I have to load a triple file for a single instance, then I use the
>> ld_dir function to inform Virtuoso of the file and then use rdf_loader_run
>> to load it into Virtuoso (this is shown on the DBpedia example page).
>>
>> Now for the cluster, say I have 4 triple files belonging to 4 different
>> graphs. Do you mean, load each instance with one of the triple files?
>
> Yes, so they load in parallel.
>
>> Also, say I have a huge file of 10 GB, then can I split the file into
>> 2.5 GB pieces of well-formed triples and then load the split files into
>> the 4 instances?
>
> Yes, again so they load in parallel.
>
> This is how we load the entire DBpedia dataset in 15 mins on the LOD cloud
> cache instance of Virtuoso. We split the load across 8 instances in the
> 8-node cluster :-)
>
>> When you say "You talk to Virtuoso as you would the single server edition
>> from any port." do you mean to say, I can talk to any cluster instance and
>> I will have access to the data in all the cluster instances?
>
> Yep! That's the essence of the matter re. our "shared nothing" cluster.
>
> Kingsley
>
>> On Wed, Feb 9, 2011 at 5:55 PM, Kingsley Idehen <kide...@openlinksw.com> wrote:
>>
>>> On 2/9/11 3:46 AM, Abhi wrote:
>>>> Can a Virtuoso cluster be treated as a virtual single instance? To expand:
>>>>
>>>> Say I have a cluster of 4 Virtuoso instances with one of them configured
>>>> as a master. Now, I have to load the cluster with say 3 billion triples
>>>> belonging to say 5 different graphs.
>>>>
>>>> 1. I just load the data into the master server and Virtuoso clustering
>>>> takes care of spreading the data into the different servers as it sees fit?
>>>> Also, is the data partitioned into the different servers or is it just
>>>> replicated?
>>>
>>> You can load across all 4 instances in parallel. That's the very essence
>>> of the horizontal partitioning that underlies our cluster engine. It's one
>>> virtual database in a sense where access to any node delivers access to the
>>> entire parallelized cluster.
>>>
>>>> 2. When I have to query this data say from 3 interconnected graphs, then
>>>> I just run the query against the master cluster and Virtuoso cluster will
>>>> take care of fetching the partitioned data (assuming it is partitioned) from
>>>> the different instances?
>>>>
>>>> Are my assumptions correct?
>>>
>>> You talk to Virtuoso as you would the single server edition from any
>>> port. The cluster engine deals with the rest of the work :-)
>>>
>>> Kingsley
>>>
>>>> --
>>>> Cheers,
>>>> Abhi
>>>>
>>>> _______________________________________________
>>>> Virtuoso-users mailing list
>>>> Virtuoso-users@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/virtuoso-users
>>>
>>> --
>>> Regards,
>>>
>>> Kingsley Idehen
>>> President & CEO
>>> OpenLink Software
>>> Web: http://www.openlinksw.com
>>> Weblog: http://www.openlinksw.com/blog/~kidehen
>>> Twitter/Identi.ca: kidehen
>>
>> --
>> Cheers,
>> Abhi

--
Cheers,
Abhi
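
As a rough illustration of the splitting step discussed in this thread, the following Python sketch cuts one large triple file into a fixed number of parts of roughly equal size. It assumes the file is in N-Triples format, where every triple sits on its own line, so cutting only at line ends keeps each part well formed; this would not be safe for Turtle or RDF/XML. The function name `split_ntriples` and the `part-NN.nt` file names are made up for this example and are not part of Virtuoso.

```python
import os


def split_ntriples(src_path, out_dir, parts):
    """Split a line-based N-Triples file into at most `parts` files.

    Cuts happen only at line ends, so no triple is broken across files.
    Returns the list of output paths in order.
    """
    total = os.path.getsize(src_path)
    # Ceiling division: target number of bytes per part (at least 1).
    target = max(1, -(-total // parts))
    out_paths = []
    out = None
    written = 0
    with open(src_path, "rb") as src:
        for line in src:
            # Start a new part when none is open yet, or when the current
            # part has reached the target size and parts remain unused.
            if out is None or (written >= target and len(out_paths) < parts):
                if out is not None:
                    out.close()
                path = os.path.join(out_dir, "part-%02d.nt" % len(out_paths))
                out = open(path, "wb")
                out_paths.append(path)
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return out_paths
```

Each resulting part could then be registered with ld_dir on a different cluster instance and loaded with rdf_loader_run, as described above, so that the four loads run in parallel.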