Re: [proposal] TripleSoup - a SPARQL endpoint for httpd

Sanjiva Weerawarana Tue, 30 Jan 2007 11:07:42 -0800

+1 from me, despite my still sparse understanding of semantic Web stuff.

If there's significant XML processing involved, might I suggest you
consider using Axiom/C from Axis2/C? That's the XML Infoset model we use,
and it's fast and it works. Plus I'm sure you can help improve it :).


Thanks,

Sanjiva.

On Mon, 2007-01-29 at 17:16 +0100, Leo Simons wrote:
> Hi all,
> 
> This is a proposal to start an RDF database server project at Apache.
> 
> What do you think?
> 
> cheers!
> 
> - Leo
> 
> ----
> = summary =
> 
> TripleSoup is the simplest thing that you can do to turn your apache
> web server into a SPARQL endpoint.
> 
> TripleSoup will be an RDF [2] store [3], tooling to work with that
> database, and a REST [4] web interface to talk to that database using
> SPARQL [5], implemented as an apache webserver module.
> 
> {{{
> Target:    TLP
> Sponsor:   Incubator PMC
> Champion:  Leo Simons <[EMAIL PROTECTED]>
> Mentors:   Dirk-Willem van Gulik <[EMAIL PROTECTED]>,
>             Ben Hyde <[EMAIL PROTECTED]>,
>             Stefano Mazzocchi <[EMAIL PROTECTED]>,
>             Leo Simons <[EMAIL PROTECTED]>
> Resources: SVN:     https://svn.apache.org/repos/asf/incubator/triplesoup/
>             Website: http://incubator.apache.org/triplesoup/
>             Jira:    http://issues.apache.org/jira/browse/TRIPLES
>             Wiki:    http://wiki.apache.org/triplesoup/
>             Mailing lists:
>                      [EMAIL PROTECTED]
>                      [EMAIL PROTECTED]
>                      [EMAIL PROTECTED]
>                      [EMAIL PROTECTED]
>              Moderators: [EMAIL PROTECTED]
>                          [EMAIL PROTECTED]
>                          [EMAIL PROTECTED]
> Initial committers:
>             Dave Beckett <[EMAIL PROTECTED]>, redland author
>             Dirk-Willem van Gulik <[EMAIL PROTECTED]>,
>             Ben Hyde <[EMAIL PROTECTED]>,
>             Stefano Mazzocchi <[EMAIL PROTECTED]>,
>             Andrea Marchesini <[EMAIL PROTECTED]>, b store author
>             Alberto Reggiori <[EMAIL PROTECTED]>, rdfstore author
>             David Reid <[EMAIL PROTECTED]>,
>             Leo Simons <[EMAIL PROTECTED]>
> Initial source:     mod_sparql, commercial triple store,
>                      existing open source triple store
> Known risks:        None
> Technologies:       c
> Reference:          http://wiki.apache.org/incubator/TripleSoupProposal
> }}}
> 
> = Proposal details =
> 
> == Technology (basics) ==
> 
> What is RDF? It is just about any kind of data, represented as triples of
> (subject, predicate, object), usually with a rich vocabulary describing
> the semantics of the data (with the vocabulary typically also encoded as
> triples).
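
A minimal sketch of the triple model described above, written in Python purely for illustration (the proposal itself targets C). The `Triple` type, the `match` helper, and the in-memory `STORE` are hypothetical names invented for this example and are not part of any TripleSoup, Redland, or rdfstore code; the sample data reuses the book from the example further down.

```python
from typing import Iterator, List, NamedTuple, Optional


class Triple(NamedTuple):
    """One RDF statement: (subject, predicate, object)."""
    subject: str
    predicate: str
    object: str


DC_TITLE = "http://purl.org/dc/elements/1.1/title"

# A toy in-memory "store": just a list of triples (illustrative data only).
STORE: List[Triple] = [
    Triple("http://example.org/book/book6", DC_TITLE,
           "Harry Potter and the Half-Blood Prince"),
]


def match(store: List[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> Iterator[Triple]:
    """Yield every triple matching the pattern; None acts as a wildcard."""
    for t in store:
        if subject is not None and t.subject != subject:
            continue
        if predicate is not None and t.predicate != predicate:
            continue
        if obj is not None and t.object != obj:
            continue
        yield t
```

Calling `list(match(STORE, predicate=DC_TITLE))` returns every triple whose predicate is `dc:title`, which is roughly what the pattern `{ ?book dc:title ?title }` asks of a real store.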
> 
> This data can be represented as RDF/XML or in other formats such as N3,
> and SPARQL is the query language for searching through it. See [6] for an
> overview.
> 
> So if it is just some data in some format, why does it need a special
> server? Because RDF data is fundamentally not constrained to a "file", and
> there often is no "resource identifier" that readily identifies something
> as a "document" which can be served up over HTTP.
> 
> So why the REST interface? RDF is one of the building blocks proposed for
> the "semantic web", and that's why a system that works well with/over HTTP
> is needed from the start.
> 
> == Technology (concrete) ==
> 
> This is just an example. Imagine that there is an application "someapp" on
> the host foo.example.com which provides access to information about books,
> and you want to get a list of those books (their URIs) and the names of
> the books.
> 
> {{{
> $ telnet foo.example.com 80
> SELECT /someapp HTTP/1.0
> Host: foo.example.com
> Query-Language: http://www.w3.org/TR/2006/CR-rdf-sparql-query-20060406/
> Accept: application/sparql-results+xml, rdf/xml, rdf/n3
> 
> PREFIX books:   <http://example.org/book/>
> PREFIX dc:      <http://purl.org/dc/elements/1.1/>
> SELECT ?book ?title
> WHERE
>    { ?book dc:title ?title }
> 
> HTTP/1.0 200 Ok
> Content-Type: application/sparql-results+xml
> Content-Length: 1234
> 
> <?xml version="1.0"?>
> <sparql
>      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>      xmlns:xs="http://www.w3.org/2001/XMLSchema#"
>      xmlns="http://www.w3.org/2005/sparql-results#">
>    <head>
>      <variable name="book"/>
>      <variable name="title"/>
>    </head>
>    <results ordered="false" distinct="false">
>      <result>
>        <binding name="book">
>          <uri>http://example.org/book/book6</uri>
>        </binding>
>        <binding name="title">
>          <literal>Harry Potter and the Half-Blood Prince</literal>
>        </binding>
>      </result>
>    </results>
> </sparql>
> 
> Connection closed by foo.example.com
> $
> }}}
> 
> It turns out there's only one book in the database in this example.
> (Sample data taken from http://www.sparql.org/.) David Reid has some code
> that does something not unlike this already [7], implemented as an httpd
> module, using the Redland library [11,12] as its backend store.
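
As a rough sketch of the client side, the result document in the example above could be read with Python's standard `xml.etree.ElementTree`. The `parse_bindings` function and the `SAMPLE` constant are hypothetical names made up for this illustration; the namespace and the data are copied from the example response, and nothing here is taken from mod_sparql.

```python
import xml.etree.ElementTree as ET

# Namespace used by the SPARQL results document in the example response.
SR_NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# The example response body, trimmed to the parts this sketch reads.
SAMPLE = """<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="book"/>
    <variable name="title"/>
  </head>
  <results ordered="false" distinct="false">
    <result>
      <binding name="book">
        <uri>http://example.org/book/book6</uri>
      </binding>
      <binding name="title">
        <literal>Harry Potter and the Half-Blood Prince</literal>
      </binding>
    </result>
  </results>
</sparql>
"""


def parse_bindings(xml_text):
    """Return one dict per <result>, mapping variable name to its text value."""
    root = ET.fromstring(xml_text)
    rows = []
    for result in root.findall("sr:results/sr:result", SR_NS):
        row = {}
        for binding in result.findall("sr:binding", SR_NS):
            # Each binding wraps a single child such as <uri> or <literal>.
            if len(binding) > 0:
                row[binding.get("name")] = binding[0].text
        rows.append(row)
    return rows
```

For the sample above, `parse_bindings(SAMPLE)` yields a single row with the keys `book` and `title`, matching the one book in the example database.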
> 
> == What would you use TripleSoup for? ==
> 
> * It could be a backend for piggy bank [8].
> 
> * It could be a backend for the next version of wikipedia.
> 
> * It could be a backend for an "open" version of iTunes or IMDB.
> 
> * It could be the backend for the information management system of the
> Dutch ministry of water management [9].
> 
> * It could be the backend for projects.a.o [10] and similar applications.
> 
> * Most importantly, it could be a backend for dozens of useful new
> innovative projects that no-one has envisioned yet.
> 
> == The initial source ==
> 
> RDFstore is a standalone RDF storage system implemented as a C library,
> licensed under the ASL 1.1. It has perl bindings. Find its distribution at
> [15].
> 
> mod_sparql [7] is an in-development apache module that implements a SPARQL
> endpoint. It is licensed under the Apache License 2.0. It uses redland as
> a backend. The SVN repository can be found at [7].
> 
> B is an in-development storage backend for Redland implemented as a
> standalone C library. It is currently a closed source codebase. A code
> snapshot can be found at [16].
> 
> == The initial committers ==
> 
> Dirk-Willem, Ben, Stefano, David and Leo are ASF members who hopefully
> need no introduction.
> 
> Dave Beckett is the primary author of the Redland RDF application
> framework.
> 
> Alberto Reggiori is the primary author of rdfstore, an rdf store developed
> by asemantics [13], which will be contributed to TripleSoup. He is a
> partner at asemantics.
> 
> Andrea Marchesini is the primary author of B, a storage backend for RDF
> developed at Joost [14], which will be contributed to TripleSoup.
> 
> All initial committers have experience working on open source projects.
> They work for at least 5 different companies.
> 
> == TripleSoup as an apache project ==
> 
> We think TripleSoup will have to reference dozens of specifications from
> the W3C (XML, RDF, OWL, SPARQL, their standards for URIs, and more) and
> from the IETF (HTTP, URL, URI, URN, and more), and will make use of or
> integrate with quite a few existing open source projects (like the redland
> RDF libraries as well as apache apr and httpd). As such, it seems like
> TripleSoup should fit in really well at apache.
> 
> The responses we got from various members of the RDF and semantic web
> communities so far when discussing this proposal with them have all been
> quite positive, and we expect and hope there'll be quite a few people
> new to apache joining the project soon after it starts.
> 
> Most importantly, we think this project will be useful, innovative, and
> fun!
> 
> = References =
> 
> {{{
> [1] http://incubator.apache.org/
> [2] http://www.w3.org/RDF/
> [3] these are often called "triple stores"
> [4] http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm
> [5] http://www.w3.org/TR/rdf-sparql-query/
> [6] http://www.betaversion.org/~stefano/papers/ac2006.1.pdf
> [7] http://david-reid.com/repos/public/mod_sparql/
> [8] http://simile.mit.edu/wiki/Piggy_Bank
> [9] http://www.wadi.nl/uk/
> [10] http://projects.apache.org/
> [11] http://www.librdf.net/
> [12] http://svn.librdf.org/repository/
> [13] http://www.asemantics.com/
> [14] http://www.joost.com/
> [15] http://rdfstore.sourceforge.net/downloads/RDFStore-0.51.tar.gz
> [16] http://opensource.joost.com/libb/
> }}}
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
-- 
Sanjiva Weerawarana, Ph.D.
Founder & Director; Lanka Software Foundation; http://www.opensource.lk/
Founder, Chairman & CEO; WSO2, Inc.; http://www.wso2.com/
Director; Open Source Initiative; http://www.opensource.org/
Member; Apache Software Foundation; http://www.apache.org/
Visiting Lecturer; University of Moratuwa; http://www.cse.mrt.ac.lk/

