hi,

this sounds very interesting to me, i'm currently fiddling
around with a suitable row and column setup for triples.

i'm about to implement openrdf's sail api for hbase (i just did
a lucene quad store implementation which is superfast and scales
to a couple of hundred million triples (http://turnguard.com/tuqs)),
but i'm in my first days of hbase encounters, so my experience
in row and column design is limited.

from my point of view the problem is to store really efficiently,
besides the triples themselves, the contexts (named graphs) and
the languages of literals.
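
a rough sketch of one possible layout, only to make the problem concrete.
everything here is an assumption for illustration and not a tested hbase
schema: one row per subject, and a column qualifier that packs predicate,
context, language tag and object. the names encode_quad, decode_quad and
SEP are hypothetical, and no hbase client is involved.

```python
# Hypothetical cell layout (an assumption, not a tested HBase schema):
#   row key   = subject
#   qualifier = predicate \x00 context \x00 language \x00 object
SEP = b"\x00"  # assumes IRIs and language tags never contain a NUL byte


def encode_quad(subject, predicate, obj, context="", lang=None):
    """Map one quad (plus optional literal language) to (row_key, qualifier)."""
    row_key = subject.encode("utf-8")
    parts = [predicate, context, lang or "", obj]
    qualifier = SEP.join(part.encode("utf-8") for part in parts)
    return row_key, qualifier


def decode_quad(row_key, qualifier):
    """Inverse of encode_quad; the object comes last so it may contain SEP."""
    predicate, context, lang, obj = qualifier.split(SEP, 3)
    return (
        row_key.decode("utf-8"),
        predicate.decode("utf-8"),
        obj.decode("utf-8"),
        context.decode("utf-8"),
        lang.decode("utf-8") or None,
    )
```

with this layout all statements about one subject sit in a single row, and
the same triple in two named graphs yields two distinct qualifiers. lookups
by predicate or object alone would need additional index tables.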

by the way : i just did a small tablemanager (in beta) that lets
you create htables -> from <- rdf (see 
http://sourceforge.net/projects/hbasetablemgr/)

i'd be really happy to contribute on the rdf and sparql side,
but could certainly use some help on the hbase table design side.

wkr www.turnguard.com/turnguard



----- Original Message -----
From: "Raffi Basmajian" <[email protected]>
To: [email protected], [email protected]
Sent: Thursday, April 1, 2010 9:45:59 PM
Subject: RE: Using SPARQL against HBase


This is an interesting article from a few guys over at BBN/Raytheon.
Storing triples in flat files, they used a custom algorithm, detailed in
the article, to iterate over the WHERE clause of a SPARQL query and
reduce the map into the desired result.
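
To make "iterating over the WHERE clause" concrete, here is a minimal
sketch of evaluating a list of triple patterns with a naive nested-loop
join over an in-memory list of triples. It is not the algorithm from the
article and involves neither MapReduce nor HBase; the function names and
the convention that variables start with "?" are assumptions for
illustration.

```python
def is_var(term):
    """Variables are written SPARQL-style with a leading '?'."""
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Return binding extended so that pattern matches triple, else None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if is_var(term):
            # Bind the variable, or reject if it is already bound differently.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def evaluate(patterns, triples):
    """Apply the patterns one by one, keeping only consistent bindings."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings
```

Each pattern narrows the set of candidate bindings, so the work done per
pattern depends heavily on the order in which the patterns are applied.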

This is very similar to what I need to do; the only difference being
that our data is stored in Hbase tables, not as triples in flat files. 
 

-----Original Message-----
From: Amandeep Khurana [mailto:[email protected]] 
Sent: Wednesday, March 31, 2010 3:30 PM
To: [email protected]; [email protected]
Subject: Re: Using SPARQL against HBase

Why do you need to build an in-memory graph which you would want to
read/write to? You could store the graph in HBase directly. As pointed
out, HBase might not be best suited for SPARQL queries, but it's not
impossible to do. Using the triples, you can form a graph that can be
represented in HBase as an adjacency list. I've stored graphs with
16-17M nodes which was data equivalent to about 600M triples. And this
was on a small cluster and could certainly scale way more than 16M graph
nodes.
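
As a minimal sketch of the adjacency-list idea, assuming each graph node
becomes one row and each outgoing edge one (predicate, object) entry in
that row: the code below models the table as a plain dictionary rather
than using an HBase client, and the function names are hypothetical.

```python
from collections import defaultdict


def triples_to_adjacency(triples):
    """Group triples by subject: row key -> set of (predicate, object)."""
    rows = defaultdict(set)
    for subject, predicate, obj in triples:
        rows[subject].add((predicate, obj))
    return dict(rows)


def neighbors(rows, node, predicate=None):
    """Nodes reachable from node in one hop, optionally via one predicate."""
    return sorted(
        obj
        for pred, obj in rows.get(node, ())
        if predicate is None or pred == predicate
    )
```

In this picture a one-hop traversal is a single row read, which is why the
number of rows tracks the number of nodes rather than the number of triples.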

In case you are interested in working on SPARQL over HBase, we could
collaborate on it...

-ak


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Wed, Mar 31, 2010 at 11:56 AM, Andrew Purtell
<[email protected]> wrote:

> Hi Raffi,
>
> To read up on fundamentals I suggest Google's BigTable paper:
> http://labs.google.com/papers/bigtable.html
>
> Detail on how HBase implements the BigTable architecture within the 
> Hadoop ecosystem can be found here:
>
>  http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture
>  http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html
>
>  http://www.larsgeorge.com/2010/01/hbase-architecture-101-write-ahead-log.html
>
> Hope that helps,
>
>   - Andy
>
> > From: Basmajian, Raffi <[email protected]>
> > Subject: RE: Using SPARQL against HBase
> > To: [email protected], [email protected]
> > Date: Wednesday, March 31, 2010, 11:42 AM
> >
> > If Hbase can't respond to SPARQL-like queries, then what type of
> > query language can it respond to? In a traditional RDBMS database
> > one would use SQL; so what is the counterpart query language with
> > Hbase?
>
>
>
>
>



-- 
punkt. netServices
______________________________
Jürgen Jakobitsch
Codeography

Lerchenfelder Gürtel 43 Top 5/2
A - 1160 Wien
Tel.: 01 / 897 41 22 - 29
Fax: 01 / 897 41 22 - 22

netServices http://www.punkt.at
