Awesome. Thank you.

On Wed, Feb 9, 2011 at 8:31 PM, Kingsley Idehen <kide...@openlinksw.com> wrote:

>  On 2/9/11 8:29 AM, Abhi wrote:
>
> Thanks for that.
>
> Say I have to load a triple file for a single instance. I use the
> ld_dir function to tell Virtuoso about the file, then use rdf_loader_run
> to load it into Virtuoso (this is shown on the DBpedia example page).
>
> Now for the cluster, say I have 4 triple files belonging to 4 different
> graphs. Do you mean I should load each instance with one of the triple files?
>
>
> Yes, so they load in parallel.
>
>
> Also, say I have a huge file of 10 GB. Can I split it into four 2.5 GB
> files of well-formed triples and then load the split files into the 4
> instances?
>
>
> Yes, again so they load in parallel.
>
> This is how we load the entire DBpedia dataset in 15 mins on the LOD cloud
> cache instance of Virtuoso. We split the load across 8 instances in the
> 8-node cluster :-)
>
>
> When you say "You talk to Virtuoso as you would the single server edition
> from any port," do you mean that I can talk to any cluster instance and
> have access to the data in all the cluster instances?
>
>
> Yep! That's the essence of the matter re. our "shared nothing" cluster.
>
> Kingsley
>
>
> On Wed, Feb 9, 2011 at 5:55 PM, Kingsley Idehen <kide...@openlinksw.com> wrote:
>
>>  On 2/9/11 3:46 AM, Abhi wrote:
>>
>> Can a Virtuoso cluster be treated as a virtual single instance? To expand:
>>
>>
>>  Say I have a cluster of 4 Virtuoso instances with one of them configured
>> as a master. Now, I have to load the cluster with, say, 3 billion triples
>> belonging to 5 different graphs.
>>
>>  1. Do I just load the data into the master server, and Virtuoso clustering
>> takes care of spreading the data across the different servers as it sees fit?
>> Also, is the data partitioned across the different servers, or is it just
>> replicated?
>>
>>
>>  You can load across all 4 instances in parallel. That's the very essence
>> of the horizontal partitioning that underlies our cluster engine. It's one
>> virtual database in a sense where access to any node delivers access to the
>> entire parallelized cluster.
>>
>>
>>  2. When I have to query this data, say from 3 interconnected graphs, do
>> I just run the query against the master, and the Virtuoso cluster takes
>> care of fetching the partitioned data (assuming it is partitioned) from
>> the different instances?
>>
>>  Are my assumptions correct?
>>
>>
>>  You talk to Virtuoso as you would the single server edition from any
>> port. The cluster engine deals with the rest of the work :-)
>>
>>
>> Kingsley
>>
>>
>> --
>> Cheers,
>> Abhi
>>
>>
>> ------------------------------------------------------------------------------
>> The ultimate all-in-one performance toolkit: Intel(R) Parallel Studio XE:
>> Pinpoint memory and threading errors before they happen.
>> Find and fix more than 250 security defects in the development cycle.
>> Locate bottlenecks in serial and parallel code that limit performance.
>> http://p.sf.net/sfu/intel-dev2devfeb
>> _______________________________________________
>> Virtuoso-users mailing list
>> Virtuoso-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/virtuoso-users
>>
>>
>>
>> --
>>
>> Regards,
>>
>> Kingsley Idehen      
>> President & CEO
>> OpenLink Software
>> Web: http://www.openlinksw.com
>> Weblog: http://www.openlinksw.com/blog/~kidehen
>> Twitter/Identi.ca: kidehen
>>
>
>
> --
> Cheers,
> Abhi
>
>
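
The split-and-load step discussed in the quoted thread (one large file cut into four pieces, one per instance) could be sketched roughly as below. This is only an illustration under stated assumptions: it assumes the triple file is line-oriented (one triple per line, as in N-Triples), which the thread does not state, and the function name split_ntriples, the ".partN.nt" file naming, and the example.org sample data are all hypothetical. Because lines are dealt out whole, each output file should remain well formed.

```python
import os
import tempfile


def split_ntriples(src_path, out_dir, parts=4):
    """Split a line-oriented triple file into `parts` files of similar size.

    Lines are dealt out round-robin, so every output file holds whole
    lines only and no triple is cut in half.
    """
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.splitext(os.path.basename(src_path))[0]
    # Hypothetical naming scheme: <name>.part0.nt, <name>.part1.nt, ...
    out_paths = [os.path.join(out_dir, f"{base}.part{i}.nt") for i in range(parts)]
    outs = [open(p, "w", encoding="utf-8") for p in out_paths]
    try:
        with open(src_path, encoding="utf-8") as src:
            # Stream line by line so a multi-gigabyte file never sits in memory.
            for n, line in enumerate(src):
                outs[n % parts].write(line)
    finally:
        for f in outs:
            f.close()
    return out_paths


if __name__ == "__main__":
    # Tiny self-check with made-up triples (hypothetical example.org data).
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "sample.nt")
        with open(src, "w", encoding="utf-8") as f:
            for i in range(10):
                f.write(f'<http://example.org/s{i}> <http://example.org/p> "{i}" .\n')
        for path in split_ntriples(src, os.path.join(tmp, "out")):
            with open(path, encoding="utf-8") as f:
                print(os.path.basename(path), sum(1 for _ in f))
```

Each resulting part could then be registered with ld_dir and loaded with rdf_loader_run, one part per instance, so the loads run in parallel as described above. Round-robin dealing is chosen here only because it keeps the parts nearly equal in size without a first pass to count lines.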


-- 
Cheers,
Abhi
