Hello Hugh and Virtuoso users,

>> From: Hugh Williams <hwilli...@openlinksw.com>
>> To: armin.na...@neofonie.de
>> Cc: Benjamin Grossman <benja...@neofonie.de>,
>> virtuoso-users@lists.sourceforge.net
>> Subject: Re: [Virtuoso-users] dbpedia in virtuoso, extract personsdata
>> fails
>> Date: Thu, 26 Feb 2009 17:04:19 +0000
>>
>> Hi Armin,
>>
>>
>> Can you do the following:
I did ;-)
>>
>>
>> 1. Confirm the Virtuoso version in use

Version 05.09.3035-pthreads for Linux as of Jan 21 2009 (see the log file
below)

>> 2. Provide the output of running the virtuoso explain function showing
>> the compiled query as detailed at:
>> http://docs.openlinksw.com/virtuoso/fn_explain.html


SQL> explain ('sparql define output:format "TTL" PREFIX rdfs:  
<http://www.w3.org/2000/01/rdf-schema#> CONSTRUCT {  ?s ?p ?lo.  ?p  
rdfs:label ?lp. } FROM <http://neofonie.de/dbpedia_3_2> WHERE { ?s a  
<http://dbpedia.org/ontology/Person>. ?s ?p ?lo. ?p rdfs:label ?lp.}');
REPORT
VARCHAR
_______________________________________________________________________________

{
Fork
{

Precode:
       0: $25 ".de/dbpedia_3_2" := Call __i2idn (<constant  
(http://neofonie.de/dbpedia_3_2)>)
       5: $26 "-ns#type" := Call __i2idn (<constant  
(http://www.w3.org/1999/02/22-rdf-syntax-ns#type)>)
       10: $27 "org/ontology/Person" := Call __i2idn (<constant  
(http://dbpedia.org/ontology/Person)>)
       15: $28 "callret" := Call min_bnode_iri_id ()
       20: $29 "ema#label" := Call __i2idn (<constant  
(http://www.w3.org/2000/01/rdf-schema#label)>)
       25: $30 "callret" := Call vector (<constant (1)>, <constant (2)>,  
<constant (1)>, <constant (0)>, <constant (1)>, <constant (3)>)
       30: $31 "callret" := Call __bft (<constant  
(http://www.w3.org/2000/01/rdf-schema#label)>, <constant (1)>)
       35: $32 "callret" := Call __bft ($31 "callret", <constant (1)>)
       40: $33 "callret" := Call vector (<constant (1)>, <constant (0)>,  
<constant (3)>, $32 "callret", <constant (1)>, <constant (1)>)
       45: $34 "callret" := Call vector ($30 "callret", $33 "callret")
       50: $35 "callret" := Call vector ()
       55: BReturn 0
 from DB.DBA.RDF_QUAD by RDF_QUAD_OGPS    1.5e+05 rows
Key RDF_QUAD_OGPS  ASC ($37 "s-1-6-t2.S")
<col=415 O = $27 "org/ontology/Person"> , <col=412 G = $25  
".de/dbpedia_3_2"> , <col=414 P = $26 "-ns#type">
row specs: <col=415 O LIKE <constant (T�)>>

Current of: <$39 "<DB.DBA.RDF_QUAD s-1-6-t2>" spec 5>

Precode:
       0: $40 "callret" := Call __id2i ($37 "s-1-6-t2.S")
       5: BReturn 0
 from DB.DBA.RDF_QUAD by RDF_QUAD          4 rows
Key RDF_QUAD  ASC ($43 "s-1-6-t3.P", $42 "s-1-6-t3.O")
  inlined <col=412 G = $25 ".de/dbpedia_3_2"> , <col=413 S = $37  
"s-1-6-t2.S">

Current of: <$45 "<DB.DBA.RDF_QUAD s-1-6-t3>" spec 5>

Precode:
       0: $46 "callret" := Call __id2i ($43 "s-1-6-t3.P")
       5: $47 "callret" := Call __ro2sq ($42 "s-1-6-t3.O")
       10: BReturn 0
 from DB.DBA.RDF_QUAD by RDF_QUAD       0.85 rows
Key RDF_QUAD  ASC ($49 "s-1-6-t4.O")
  inlined <col=412 G = $25 ".de/dbpedia_3_2"> , <col=413 S = $43  
"s-1-6-t3.P"> , <col=414 P = $29 "ema#label">
row specs: <col=413 S < $28 "callret">

Current of: <$51 "<DB.DBA.RDF_QUAD s-1-6-t4>" spec 5>

After code:
       0: $52 "callret" := Call __ro2sq ($49 "s-1-6-t4.O")
       5: $53 "callret" := Call vector ($46 "callret", $52 "callret", $40  
"callret", $47 "callret")
       10: if ($56 "user_aggr_notfirst" 1(=) <constant (1)>) then 24 else  
13 unkn 13
       13: $56 "user_aggr_notfirst" :=  := artm <constant (1)>
       17: $58 "user_aggr_ret" := Call DB.DBA.SPARQL_CONSTRUCT_INIT ($57  
"user_aggr_env")
       24: $58 "user_aggr_ret" := Call DB.DBA.SPARQL_CONSTRUCT_ACC ($57  
"user_aggr_env", $34 "callret", $53 "callret", $35 "callret")
       31: BReturn 0
}

After code:
       0: $59 "callret" := Call DB.DBA.SPARQL_CONSTRUCT_FIN ($57  
"user_aggr_env")
       7: $60 "callretTTL-0" := Call DB.DBA.RDF_FORMAT_TRIPLE_DICT_AS_TTL  
($59 "callret")
       14: BReturn 0
Select (TOP <constant (1)>) ($60 "callretTTL-0", <$51 "<DB.DBA.RDF_QUAD  
s-1-6-t4>" spec 5>, <$45 "<DB.DBA.RDF_QUAD s-1-6-t3>" spec 5>, <$39  
"<DB.DBA.RDF_QUAD s-1-6-t2>" spec 5>)
}

60 Rows. -- 1020 msec.


>> 3. Turn virtuoso tracing on using the trace_on() function to provide
>> more debug info in the Virtuoso log as detailed at:
>> http://docs.openlinksw.com/virtuoso/fn_trace_on.html

I enabled all available trace options with the following statement:

trace_on ('user_names', 'user_log', 'failed_log', 'compile', 'ddl_log',  
'client_sql', 'errors', 'dsn', 'sql_send', 'transact', 'remote_transact',  
'exec', 'soap', 'thread', 'cursor');

>> 4. Provide a copy of your virtuoso.log file for analysis

                 Mon Mar 02 2009
16:12:21 INFO: OpenLink Virtuoso Universal Server
16:12:21 INFO: Version 05.09.3035-pthreads for Linux as of Jan 21 2009
16:12:21 INFO: uses parts of OpenSSL, PCRE, Html Tidy
16:12:21 INFO: Database version 3016
16:12:22 INFO: SQL Optimizer enabled (max 1000 layouts)
16:12:23 INFO: Compiler unit is timed at 0.001435 msec
16:13:14 INFO: Roll forward started
16:13:14 INFO: Roll forward complete
16:14:09 INFO: Checkpoint made, log reused
16:14:11 INFO: HTTP/WebDAV server online at 8890
16:14:11 INFO: Server online at 1111 (pid 23136)
16:14:24 INFO: LTRS_1 dba 127.0.0.1 1111:1 Commit transact 0x6f6d590 0
16:14:24 INFO: LTRS_2 dba 127.0.0.1 1111:1 Restart transact 0x6f6d590
16:14:50 INFO: CSLQ_0 dba 127.0.0.1 1111:1 s1111_1_0 string_to_file  
('/webdata_sempa/dbpedia/personen/persons_isql.n3', (sparql define  
output:format "TTL" PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>  
CONSTRUCT {  ?s ?p ?lo.  ?p rdfs:label ?lp. } FROM  
<http://neofonie.de/dbpedia_3_2> WHERE { ?s a  
<http://dbpedia.org/ontology/Person>. ?s ?p ?lo. ?p rdfs:label ?lp.}), -2)
16:14:50 INFO: COMP_2 dba 127.0.0.1 1111:1 Compile text:  string_to_file  
('/webdata_sempa/dbpedia/personen/persons_isql.n3', (sparql define  
output:format "TTL" PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>  
CONSTRUCT {  ?s ?p ?lo.  ?p rdfs:label ?lp. } FROM  
<http://neofonie.de/dbpedia_3_2> WHERE { ?s a  
<http://dbpedia.org/ontology/Person>. ?s ?p ?lo. ?p rdfs:label ?lp.}), -2)
16:14:52 INFO: EXEC_1 dba 127.0.0.1 1111:1 s1111_1_0 Exec 1 time(s)  
string_to_file ('/webdata_sempa/dbpedia/personen/persons_isql.n3', (sparql  
define output:format "TTL" PREFIX rdfs:  
<http://www.w3.org/2000/01/rdf-schema#> CONSTRUCT {  ?s ?p ?lo.  ?p  
rdfs:label ?lp. } FROM <http://neofonie.de/dbpedia_3_2> WHERE { ?s a  
<http://dbpedia.org/ontology/Person>. ?s ?p ?lo. ?p rdfs:label ?lp.}), -2)
16:24:12 INFO: LTRS_0 <DBA> Internal Internal Begin transact 0x3dbcf1b0
16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0  
140488380252160
16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0
16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0
16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0
16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0
16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0
16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0
16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0
16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0
16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0
16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0
16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0
16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0
16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0
16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0
16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0
16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0
16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0
16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0
16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0
16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0
16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0
16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0
16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0
16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0
16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0
16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0
16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0
16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0  
140488380252160
16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0


After a couple of minutes Virtuoso simply crashes without reporting any
error. Watching the process with htop showed that Virtuoso keeps loading
more and more data into main memory until the 4 GB available on my machine
are exhausted; that is exactly when it crashes.
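
As a rough cross-check of the memory figures: assuming that each Virtuoso
buffer caches one 8 KB database page (an assumption, with per-buffer
overhead ignored), the NumberOfBuffers = 400000 setting from the
virtuoso.ini quoted further down would by itself account for roughly 3 GiB
of the 4 GB. A small Python sketch of that arithmetic; the function name is
made up for illustration:

```python
# Rough estimate of the Virtuoso buffer-pool footprint.
# Assumption: one buffer caches one 8 KB database page; overhead is ignored.
PAGE_BYTES = 8 * 1024


def buffer_pool_gib(number_of_buffers: int, page_bytes: int = PAGE_BYTES) -> float:
    """Return the approximate buffer-pool size in GiB."""
    return number_of_buffers * page_bytes / 2**30


if __name__ == "__main__":
    # 400000 is the NumberOfBuffers value from the quoted virtuoso.ini.
    print(f"{buffer_pool_gib(400_000):.2f} GiB")
```

If that assumption holds, little memory would be left for the result that
the CONSTRUCT builds up while the query runs.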

I will now try with more main memory.
Swapping does not seem to work, and I don't know why (the system is Ubuntu 8.10).
I will investigate the problem further and let you know what I find.
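
One possible workaround, sketched here and untested against this Virtuoso
version, would be to split the export into batches with the standard SPARQL
LIMIT and OFFSET modifiers, so that no single CONSTRUCT has to hold the
whole result. The Python helper below only generates the isql statements.
The function names, the batch size and the ORDER BY clause are assumptions
for illustration, and whether batching really bounds the memory use here
would need to be verified.

```python
# Sketch: generate batched CONSTRUCT statements for isql.
# Helper names are hypothetical; the graph IRI and the query pattern are
# taken from the explain call earlier in this message.
PREFIX = "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>"
GRAPH = "http://neofonie.de/dbpedia_3_2"


def batched_construct(offset: int, limit: int) -> str:
    """Build one CONSTRUCT statement for solutions [offset, offset + limit)."""
    return (
        'sparql define output:format "TTL" '
        + PREFIX
        + " CONSTRUCT { ?s ?p ?lo. ?p rdfs:label ?lp. }"
        + f" FROM <{GRAPH}>"
        + " WHERE { ?s a <http://dbpedia.org/ontology/Person>."
        + " ?s ?p ?lo. ?p rdfs:label ?lp. }"
        # ORDER BY keeps the paging stable between calls (assumption: the
        # extra sorting cost is acceptable).
        + f" ORDER BY ?s ?p ?lo ?lp LIMIT {limit} OFFSET {offset};"
    )


def batches(total_solutions: int, batch_size: int):
    """Yield one statement per batch until total_solutions is covered."""
    for offset in range(0, total_solutions, batch_size):
        yield batched_construct(offset, batch_size)


if __name__ == "__main__":
    # 25000 and 10000 are placeholder numbers, not measured values.
    for statement in batches(25_000, 10_000):
        print(statement)
```

LIMIT and OFFSET apply to the query solutions rather than to the constructed
triples, so the ?p rdfs:label ?lp triples may be repeated across batches.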

Thanks for your help
>>
>> Best Regards
>> Hugh Williams
>> Professional Services
>> OpenLink Software
>> Web: http://www.openlinksw.com
>> Support: http://support.openlinksw.com
>> Forums: http://boards.openlinksw.com/support
>>
>>
>>
>>
>>
>> On 26 Feb 2009, at 14:36, Armin Nagel wrote:
>>
>> > Hello users,
>> >
>> >
>> > I have loaded parts of DBpedia into Virtuoso, merged into a single
>> > graph.
>> > That all works fine.
>> >
>> >
>> > Now I would like to extract, via SPARQL CONSTRUCT, all data
>> > belonging to
>> > resources of type Person.
>> >
>> >
>> > This is my sparql construct:
>> >
>> >
>> > define output:format "TTL"
>> > PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>> > CONSTRUCT {  ?s ?p ?lo.   ?p rdfs:label ?lp.}
>> > WHERE {  ?s a <http://dbpedia.org/ontology/Person>.
>> > { ?s ?p ?lo. FILTER isLiteral(?lo)  }
>> > UNION { ?s ?p ?o.  ?o rdfs:label ?lo. FILTER isIRI(?o)  }
>> > ?p rdfs:label ?lp.};
>> >
>> >
>> > Explanation:
>> > I collect all information about every resource of type Person.
>> > The resource must be the subject. If it is linked to another resource,
>> > such as
>> > a place, I collect that resource's label too.
>> >
>> >
>> > I tried different approaches. The CONSTRUCT works for type Actor over
>> > the HTTP endpoint, but for Person not all data is in the result. I think
>> > Virtuoso aborts if the query takes too much time.
>> >
>> >
>> > My virtuoso.ini:
>> > ;
>> > ;  Server parameters
>> > ;
>> > [Parameters]
>> > ServerPort                      = 1111
>> > DisableUnixSocket               = 1
>> > ;SSLServerPort                  = 2111
>> > ;SSLCertificate                 = cert.pem
>> > ;SSLPrivateKey                  = pk.pem
>> > ;X509ClientVerify               = 0
>> > ;X509ClientVerifyDepth          = 0
>> > ;X509ClientVerifyCAFile         = ca.pem
>> > ServerThreads                   = 20
>> > CheckpointInterval              = 60
>> > O_DIRECT                        = 0
>> > NumberOfBuffers                 = 400000
>> > MaxDirtyBuffers                 = 1200
>> > CaseMode                        = 2
>> > MaxStaticCursorRows             = 5000
>> > CheckpointAuditTrail            = 0
>> > AllowOSCalls                    = 0
>> > SchedulerInterval               = 10
>> > DirsAllowed                     = ., /foo/vad, /bar/dbpedia,
>> > ThreadCleanupInterval           = 0
>> > ThreadThreshold                 = 10
>> > ResourcesCleanupInterval        = 0
>> > FreeTextBatchSize               = 100000
>> > SingleCPU                       = 0
>> > VADInstallDir                   = /moo/vad/
>> > PrefixResultNames               = 0
>> >
>> >
>> > [SPARQL]
>> > ;ExternalQuerySource            = 1
>> > ;ExternalXsltSource             = 1
>> > ResultSetMaxRows                = 9223372036854775807
>> > DefaultGraph                    = http://neofonie.de/dbpedia_3_2
>> > ;ImmutableGraphs                = http://localhost:8890/dataspace
>> > ;MaxQueryCostEstimationTime     = 120 ; in seconds
>> > ;MaxQueryExecutionTime          = 10 ; in seconds
>> > ;PingService                    = http://rpc.pingthesemanticweb.com/
>> > DefaultQuery                    = select * where { ?s ?p ?o . } Limit 100
>> >
>> >
>> > I tried it via isql with the following command:
>> > /isql 1111 foo bar test.sql > test_r.nt
>> >
>> >
>> > test.sql contains:
>> > sparql
>> > PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>> > CONSTRUCT {  ?s ?p ?lo.   ?p rdfs:label ?lp.}
>> > WHERE {  ?s a <http://dbpedia.org/ontology/Person>.
>> > { ?s ?p ?lo. FILTER isLiteral(?lo)  }
>> > UNION { ?s ?p ?o.  ?o rdfs:label ?lo. FILTER isIRI(?o)  }
>> > ?p rdfs:label ?lp.};
>> >
>> >
>> > It breaks with the warning:
>> > Warning 01004: [Virtuoso Driver]CL077: Data truncated in column 1 of
>> > the
>> > result-se(callretRDF/XML-0, type 125)
>> >
>> >
>> > Is there any way to extract such a large amount of data without
>> > breaking Virtuoso?
>> >
>> >
>> > Kind regards
>> >
>> >
>> > Armin Nagel
>> >
>> >
>> > --
>> >
>> >
>> > You can also find us with our Web 2.0 search engine WeFind
>> > at CeBIT in Hall 006, Stand G60 (in the Webciety).
>> >
>> >
>> > We look forward to your visit!
>> > ________________________________
>> >
>> >
>> > Armin Nagel
>> > Software developer
>> >
>> >
>> > neofonie
>> > Technologieentwicklung und
>> > Informationsmanagement GmbH
>> > Robert-Koch-Platz 4
>> > 10115 Berlin
>> > fon: +49.30 24627 257
>> > fax: +49.30 24627 120
>> > armin.na...@neofonie.de
>> > http://www.neofonie.de
>> >
>> >
>> > Commercial register
>> > Berlin-Charlottenburg: HRB 67460
>> >
>> >
>> > Management
>> > Helmut Hoffer von Ankershoffen
>> > (Spokesman of the management)
>> > Nurhan Yildirim
>> > Uwe-Gernot Fasold
>> > ________________________________
>> >
>> >
>> > The first Web 2.0 search engine, now at http://www.wefind.de .
>> >
>> >
>> > Always well informed on the go with WeFind Mobile for iPhone and
>> > now also with WeFind Mobile for Android: free download in the
>> > iTunes
>> > AppStore and in the Android Market.
>> >
>> >
>> >
>> >
>> >
>> >
>> >  
>> ------------------------------------------------------------------------------
>> > Open Source Business Conference (OSBC), March 24-25, 2009, San
>> > Francisco, CA
>> > -OSBC tackles the biggest issue in open source: Open Sourcing the
>> > Enterprise
>> > -Strategies to boost innovation and cut costs with open source
>> > participation
>> > -Receive a $600 discount off the registration fee with the source
>> > code: SFAD
>> > http://p.sf.net/sfu/XcvMzF8H
>> > _______________________________________________
>> > Virtuoso-users mailing list
>> > Virtuoso-users@lists.sourceforge.net
>> > https://lists.sourceforge.net/lists/listinfo/virtuoso-users
>>
>>

-- 

Armin Nagel
Software developer

neofonie
Technologieentwicklung und
Informationsmanagement GmbH
Robert-Koch-Platz 4
10115 Berlin
fon: +49.30 24627 257
fax: +49.30 24627 120
armin.na...@neofonie.de
http://www.neofonie.de
