Re: [Virtuoso-users] Virtuoso RDF API and bnodes

imikhailov Sun, 29 Oct 2006 13:54:44 -0800

Hello Anton,

There's no really convenient way, because the bnode usually appears in more
than one insert statement and there should be a way of remembering them
between calls.


Blank nodes have no persistent names but they have persistent IRI_IDs,
because they're nodes anyway.

The function DB.DBA.TTLP_EXEC_NEW_BLANK (in g varchar, inout app_env any)
returns the IRI_ID of a new blank node that has never appeared before in the
database.

The parameter g is a 'hint'; it should be equal to the IRI of the graph where
the blank node will be used. If the bnode is used in more than one graph,
choose the 'most popular' one; if the graph name is not known when the bnode
is created, pass a NULL value. Future versions of Virtuoso will tend to
group together all blank node IDs of each graph, to get slightly better
locality of disk access and slightly better compression of bitmap indexes.

The parameter app_env is a variable that should be set to NULL before the
first call and then used only in calls of DB.DBA.TTLP_EXEC_NEW_BLANK().
It's OK to 'forget' the value between consecutive calls of
DB.DBA.TTLP_EXEC_NEW_BLANK() or to use many different variables for this
purpose. This is just a hint. So the following code is quite OK:

declare bnode_app_env any;
declare env_2 any;
declare bnode1, bnode2, bnode3, bnode4, bnode5 IRI_ID;
bnode_app_env := NULL;

bnode1 := DB.DBA.TTLP_EXEC_NEW_BLANK ('mygraph', bnode_app_env);
bnode2 := DB.DBA.TTLP_EXEC_NEW_BLANK ('mygraph', bnode_app_env);
bnode_app_env := NULL;
bnode3 := DB.DBA.TTLP_EXEC_NEW_BLANK ('mygraph', bnode_app_env);
env_2 := NULL;
bnode4 := DB.DBA.TTLP_EXEC_NEW_BLANK ('mygraph', env_2);
bnode5 := DB.DBA.TTLP_EXEC_NEW_BLANK ('mygraph', bnode_app_env);

That's the way of obtaining a blank node ID that will stay valid in future
versions of Virtuoso. In the current version (but maybe not in future
versions), the following code will also work:

bnode6 := iri_id_from_num (sequence_next ('RDF_URL_IID_BLANK'));

If you need to create one blank node per object, and you have an arbitrary
number of objects, you may wish to create a dictionary (using dict_new())
to keep pairs of objects and their blank node IDs there.
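
A minimal sketch of the dictionary approach follows. The procedure name
BNODE_FOR_OBJECT and the string key obj_key are hypothetical, chosen only for
illustration. The graph name 'mygraph' is taken from the example above, and
dict_get() is assumed to take a default value as its third argument.

```sql
-- Hypothetical helper: returns the bnode IRI_ID remembered for obj_key,
-- allocating a new blank node the first time the key is seen.
create procedure BNODE_FOR_OBJECT (
  in obj_key varchar,        -- illustrative key that identifies the object
  inout bnode_dict any,      -- dictionary created by the caller with dict_new()
  inout bnode_app_env any )  -- set to NULL by the caller before the first call
{
  declare bnode any;
  bnode := dict_get (bnode_dict, obj_key, NULL);
  if (bnode is null)
    {
      -- First use of this key: allocate a bnode and remember it.
      bnode := DB.DBA.TTLP_EXEC_NEW_BLANK ('mygraph', bnode_app_env);
      dict_put (bnode_dict, obj_key, bnode);
    }
  return bnode;
}
;
```

The caller would create the dictionary once with dict_new(), set the app_env
variable to NULL, and pass both to every call. The dictionary should live no
longer than the transaction in which the IDs were allocated.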

No matter how you got the bnode IRI_ID, you can 'cheat' and pass this
IRI_ID to DB.DBA.RDF_QUAD_URI(...) instead of a subject or object URI. Neither
the graph nor the predicate can be a bnode: no error is reported on an invalid
call, but SPARQL queries may incorrectly filter ill-formed triples, because the
SPARQL optimizer assumes that P and G are always named IRI references. Hence
the following three queries will be compiled into the same SQL:

sparql select * where { graph ?g { ?s ?p ?o . FILTER (isIRI(?g)) } };
sparql select * where { graph ?g { ?s ?p ?o . FILTER (isIRI(?p)) } };
sparql select * where { graph ?g { ?s ?p ?o } };

whereas the following will be compiled into select ... where (1=2):

sparql select * where { graph ?g { ?s ?p ?o . FILTER (isBLANK(?p)) } };
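
For the 'cheat' with DB.DBA.RDF_QUAD_URI(...) described above, a minimal
sketch might look like this. It assumes the argument order is graph, subject,
predicate, object. The procedure name and the example.com and FOAF URIs are
placeholders, not part of any real data set.

```sql
-- Hypothetical procedure: stores two triples that share one blank node.
create procedure INSERT_BNODE_TRIPLES_SKETCH ()
{
  declare bnode_app_env, person any;
  bnode_app_env := NULL;
  person := DB.DBA.TTLP_EXEC_NEW_BLANK ('mygraph', bnode_app_env);

  -- The bnode IRI_ID is passed in place of the subject URI ...
  DB.DBA.RDF_QUAD_URI ('mygraph', person,
    'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
    'http://xmlns.com/foaf/0.1/Person');

  -- ... and in place of the object URI. Graph and predicate stay named IRIs.
  DB.DBA.RDF_QUAD_URI ('mygraph', 'http://example.com/anton',
    'http://xmlns.com/foaf/0.1/knows', person);
}
;
```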



Note: Allocation of bnodes is transaction-sensitive.

The created bnode IRI_ID may become invalid on transaction rollback. You
should not save the created bnode ID in global variables or
dictionaries that may 'survive' the rollback of the transaction where the ID
was allocated. This is important when a function creates a large number of
bnode IDs and then performs a large number of insertions or updates in a
loop, with deadlock retries. If the first insertion or update fails with a
rollback, the whole set of allocated IDs becomes invalid; using these IDs may
result in a 'collision in the air' because they can be allocated again for
another purpose by another transaction. When in doubt, use 'commit work'
between allocating IRI_IDs and an operation that may result in a transaction
rollback.
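
A minimal sketch of the 'commit work' advice follows. The procedure name and
the example.com URIs are hypothetical, and the insertion stands in for any
operation that might be rolled back.

```sql
-- Hypothetical procedure: commit between allocating a bnode and using it.
create procedure ALLOCATE_THEN_INSERT_SKETCH ()
{
  declare bnode_app_env, bnode any;
  bnode_app_env := NULL;

  -- Step 1: allocate the blank node ID.
  bnode := DB.DBA.TTLP_EXEC_NEW_BLANK ('mygraph', bnode_app_env);

  -- Step 2: commit, so that a rollback of the work below does not
  -- take the allocation with it.
  commit work;

  -- Step 3: the operation that may fail and roll back (for example on a
  -- deadlock) now runs in a transaction that did not allocate the ID.
  DB.DBA.RDF_QUAD_URI ('mygraph', bnode,
    'http://example.com/property', 'http://example.com/value');
}
;
```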



Note: Allocation of bnodes is a function with side effects; use it with
caution in SELECT statements.

The following code may result in a weird error:

declare bnode_app_env any;
bnode_app_env := NULL;
for (
  select U_ID, U_NAME, U_E_MAIL,
     DB.DBA.TTLP_EXEC_NEW_BLANK ('mygraph', bnode_app_env) as U_BNODE
  from SYS_USERS )
do
  {
    COMPOSE_SOME_USER_DATA (U_ID, U_NAME, U_E_MAIL, U_BNODE);
    COMPOSE_MORE_USER_DATA (U_ID, U_NAME, U_E_MAIL, U_BNODE);
  }

The SQL optimizer is free to make only one call of
DB.DBA.TTLP_EXEC_NEW_BLANK ('mygraph', bnode_app_env) per code invocation,
instead of one call per retrieved user. This is probably not what you
expect. The safe code is as follows:

declare bnode_app_env any;
bnode_app_env := NULL;
for (
  select U_ID, U_NAME, U_E_MAIL from SYS_USERS )
do
  {
    declare u_bnode IRI_ID;
    u_bnode := DB.DBA.TTLP_EXEC_NEW_BLANK ('mygraph', bnode_app_env);
    COMPOSE_SOME_USER_DATA (U_ID, U_NAME, U_E_MAIL, u_bnode);
    COMPOSE_MORE_USER_DATA (U_ID, U_NAME, U_E_MAIL, u_bnode);
  }



Best Regards,
IvAn Mikhailov



-----Original Message-----
From: virtuoso-users-boun...@lists.sourceforge.net
[mailto:virtuoso-users-boun...@lists.sourceforge.net] On Behalf Of Anton
Krasovsky
Sent: Friday, October 27, 2006 9:47 PM
To: virtuoso-users@lists.sourceforge.net
Subject: [Virtuoso-users] Virtuoso RDF API and bnodes

Hi,

is there any way to programmatically insert triples containing bnodes into
the store?

Something similar to DB.DBA.RDF_QUAD_URI(...)?

Regards,
Anton
