Hello Hugh and Virtuoso users,

>> From: Hugh Williams <hwilli...@openlinksw.com>
>> To: armin.na...@neofonie.de
>> Cc: Benjamin Grossman <benja...@neofonie.de>,
>> virtuoso-users@lists.sourceforge.net
>> Subject: Re: [Virtuoso-users] dbpedia in virtuoso, extract personsdata
>> fails
>> Date: Thu, 26 Feb 2009 17:04:19 +0000
>>
>> Hi Armin,
>>
>> Can you do the following:

I did ;-)

>> 1. Confirm the Virtuoso version in use
Version 05.09.3035-pthreads for Linux as of Jan 21 2009 (see the log file below)

>> 2. Provide the output of running the virtuoso explain function showing
>> the compiled query as detailed at:
>> http://docs.openlinksw.com/virtuoso/fn_explain.html

SQL> explain ('sparql define output:format "TTL" PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> CONSTRUCT { ?s ?p ?lo. ?p rdfs:label ?lp. } FROM <http://neofonie.de/dbpedia_3_2> WHERE { ?s a <http://dbpedia.org/ontology/Person>. ?s ?p ?lo. ?p rdfs:label ?lp.}');
REPORT
VARCHAR
_______________________________________________________________________________

{
Fork
{

Precode:
      0: $25 ".de/dbpedia_3_2" := Call __i2idn (<constant (http://neofonie.de/dbpedia_3_2)>)
      5: $26 "-ns#type" := Call __i2idn (<constant (http://www.w3.org/1999/02/22-rdf-syntax-ns#type)>)
      10: $27 "org/ontology/Person" := Call __i2idn (<constant (http://dbpedia.org/ontology/Person)>)
      15: $28 "callret" := Call min_bnode_iri_id ()
      20: $29 "ema#label" := Call __i2idn (<constant (http://www.w3.org/2000/01/rdf-schema#label)>)
      25: $30 "callret" := Call vector (<constant (1)>, <constant (2)>, <constant (1)>, <constant (0)>, <constant (1)>, <constant (3)>)
      30: $31 "callret" := Call __bft (<constant (http://www.w3.org/2000/01/rdf-schema#label)>, <constant (1)>)
      35: $32 "callret" := Call __bft ($31 "callret", <constant (1)>)
      40: $33 "callret" := Call vector (<constant (1)>, <constant (0)>, <constant (3)>, $32 "callret", <constant (1)>, <constant (1)>)
      45: $34 "callret" := Call vector ($30 "callret", $33 "callret")
      50: $35 "callret" := Call vector ()
      55: BReturn 0
from DB.DBA.RDF_QUAD by RDF_QUAD_OGPS 1.5e+05 rows
Key RDF_QUAD_OGPS ASC ($37 "s-1-6-t2.S")
<col=415 O = $27 "org/ontology/Person"> , <col=412 G = $25 ".de/dbpedia_3_2"> , <col=414 P = $26 "-ns#type">
row specs: <col=415 O LIKE <constant (T�)>>

Current of: <$39 "<DB.DBA.RDF_QUAD s-1-6-t2>" spec 5>

Precode:
      0: $40 "callret" := Call __id2i ($37 "s-1-6-t2.S")
      5: BReturn 0
from DB.DBA.RDF_QUAD by RDF_QUAD 4 rows
Key RDF_QUAD ASC ($43 "s-1-6-t3.P", $42 "s-1-6-t3.O")
inlined <col=412 G = $25 ".de/dbpedia_3_2"> , <col=413 S = $37 "s-1-6-t2.S">

Current of: <$45 "<DB.DBA.RDF_QUAD s-1-6-t3>" spec 5>

Precode:
      0: $46 "callret" := Call __id2i ($43 "s-1-6-t3.P")
      5: $47 "callret" := Call __ro2sq ($42 "s-1-6-t3.O")
      10: BReturn 0
from DB.DBA.RDF_QUAD by RDF_QUAD 0.85 rows
Key RDF_QUAD ASC ($49 "s-1-6-t4.O")
inlined <col=412 G = $25 ".de/dbpedia_3_2"> , <col=413 S = $43 "s-1-6-t3.P"> , <col=414 P = $29 "ema#label">
row specs: <col=413 S < $28 "callret">

Current of: <$51 "<DB.DBA.RDF_QUAD s-1-6-t4>" spec 5>

After code:
      0: $52 "callret" := Call __ro2sq ($49 "s-1-6-t4.O")
      5: $53 "callret" := Call vector ($46 "callret", $52 "callret", $40 "callret", $47 "callret")
      10: if ($56 "user_aggr_notfirst" 1(=) <constant (1)>) then 24 else 13 unkn 13
      13: $56 "user_aggr_notfirst" := := artm <constant (1)>
      17: $58 "user_aggr_ret" := Call DB.DBA.SPARQL_CONSTRUCT_INIT ($57 "user_aggr_env")
      24: $58 "user_aggr_ret" := Call DB.DBA.SPARQL_CONSTRUCT_ACC ($57 "user_aggr_env", $34 "callret", $53 "callret", $35 "callret")
      31: BReturn 0
}

After code:
      0: $59 "callret" := Call DB.DBA.SPARQL_CONSTRUCT_FIN ($57 "user_aggr_env")
      7: $60 "callretTTL-0" := Call DB.DBA.RDF_FORMAT_TRIPLE_DICT_AS_TTL ($59 "callret")
      14: BReturn 0
Select (TOP <constant (1)>) ($60 "callretTTL-0", <$51 "<DB.DBA.RDF_QUAD s-1-6-t4>" spec 5>, <$45 "<DB.DBA.RDF_QUAD s-1-6-t3>" spec 5>, <$39 "<DB.DBA.RDF_QUAD s-1-6-t2>" spec 5>)
}

60 Rows. -- 1020 msec.

>> 3. Turn virtuoso tracing on using the trace_on() function to provide
>> more debug info in the Virtuoso log as detailed at:
>> http://docs.openlinksw.com/virtuoso/fn_trace_on.html

I enabled all available trace options with this statement:

trace_on ('user_names', 'user_log', 'failed_log', 'compile', 'ddl_log', 'client_sql', 'errors', 'dsn', 'sql_send', 'transact', 'remote_transact', 'exec', 'soap', 'thread', 'cursor');

>> 4. Provide a copy of your virtuoso.log file for analysis

Mon Mar 02 2009
16:12:21 INFO: OpenLink Virtuoso Universal Server
16:12:21 INFO: Version 05.09.3035-pthreads for Linux as of Jan 21 2009
16:12:21 INFO: uses parts of OpenSSL, PCRE, Html Tidy
16:12:21 INFO: Database version 3016
16:12:22 INFO: SQL Optimizer enabled (max 1000 layouts)
16:12:23 INFO: Compiler unit is timed at 0.001435 msec
16:13:14 INFO: Roll forward started
16:13:14 INFO: Roll forward complete
16:14:09 INFO: Checkpoint made, log reused
16:14:11 INFO: HTTP/WebDAV server online at 8890
16:14:11 INFO: Server online at 1111 (pid 23136)
16:14:24 INFO: LTRS_1 dba 127.0.0.1 1111:1 Commit transact 0x6f6d590 0
16:14:24 INFO: LTRS_2 dba 127.0.0.1 1111:1 Restart transact 0x6f6d590
16:14:50 INFO: CSLQ_0 dba 127.0.0.1 1111:1 s1111_1_0 string_to_file ('/webdata_sempa/dbpedia/personen/persons_isql.n3', (sparql define output:format "TTL" PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> CONSTRUCT { ?s ?p ?lo. ?p rdfs:label ?lp. } FROM <http://neofonie.de/dbpedia_3_2> WHERE { ?s a <http://dbpedia.org/ontology/Person>. ?s ?p ?lo. ?p rdfs:label ?lp.}), -2)
16:14:50 INFO: COMP_2 dba 127.0.0.1 1111:1 Compile text: string_to_file ('/webdata_sempa/dbpedia/personen/persons_isql.n3', (sparql define output:format "TTL" PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> CONSTRUCT { ?s ?p ?lo. ?p rdfs:label ?lp. } FROM <http://neofonie.de/dbpedia_3_2> WHERE { ?s a <http://dbpedia.org/ontology/Person>. ?s ?p ?lo. ?p rdfs:label ?lp.}), -2)
16:14:52 INFO: EXEC_1 dba 127.0.0.1 1111:1 s1111_1_0 Exec 1 time(s) string_to_file ('/webdata_sempa/dbpedia/personen/persons_isql.n3', (sparql define output:format "TTL" PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> CONSTRUCT { ?s ?p ?lo. ?p rdfs:label ?lp. } FROM <http://neofonie.de/dbpedia_3_2> WHERE { ?s a <http://dbpedia.org/ontology/Person>. ?s ?p ?lo.
?p rdfs:label ?lp.}), -2) 16:24:12 INFO: LTRS_0 <DBA> Internal Internal Begin transact 0x3dbcf1b0 16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 140488380252160 16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0 16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0 16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0 16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0 16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0 16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0 16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0 16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0 16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0 16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0 16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0 16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0 16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0 16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0 16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0 16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0 16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0 16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0 16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0 16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0 16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0 16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0 16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0 16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0 16:24:12 INFO: LTRS_2 
<DBA> Internal Internal Restart transact 0x3dbcf1b0
16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 0
16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0
16:24:12 INFO: LTRS_1 <DBA> Internal Internal Commit transact 0x3dbcf1b0 140488380252160
16:24:12 INFO: LTRS_2 <DBA> Internal Internal Restart transact 0x3dbcf1b0

After a couple of minutes Virtuoso simply crashes without logging any error. Watching the process with htop showed that Virtuoso keeps loading more and more data into main memory until the 4 GB available on my machine are exhausted, and that is exactly when it crashes. I will now try with more main memory. Swapping does not seem to work, and I don't know why (the system is Ubuntu 8.10).

I will investigate the problem further and let you know what I find.

Thanks for your help

>> Best Regards
>> Hugh Williams
>> Professional Services
>> OpenLink Software
>> Web: http://www.openlinksw.com
>> Support: http://support.openlinksw.com
>> Forums: http://boards.openlinksw.com/support
>>
>> On 26 Feb 2009, at 14:36, Armin Nagel wrote:
>>
>> > Hello users,
>> >
>> > I have loaded some parts of DBpedia into Virtuoso, merged into a
>> > single graph.
>> > All works fine.
>> >
>> > Now I would like to extract, via SPARQL CONSTRUCT, all data
>> > belonging to resources of type Person.
>> >
>> > This is my SPARQL CONSTRUCT:
>> >
>> > define output:format "TTL"
>> > PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>> > CONSTRUCT { ?s ?p ?lo. ?p rdfs:label ?lp.}
>> > WHERE { ?s a <http://dbpedia.org/ontology/Person>.
>> > { ?s ?p ?lo. FILTER isLiteral(?lo) }
>> > UNION { ?s ?p ?o. ?o rdfs:label ?lo. FILTER isIRI(?o) }
>> > ?p rdfs:label ?lp.};
>> >
>> > Explained:
>> > I collect all information about every resource of type Person.
>> > The resource must be the subject. If it is linked to another
>> > resource, such as a place, I collect that resource's label too.
>> >
>> > I tried different ways.
>> > The construct works for type actor over the
>> > HTTP endpoint, but for person not all data is in the result. I think
>> > Virtuoso aborts if the query takes too much time.
>> >
>> > My virtuoso.ini:
>> > ;
>> > ; Server parameters
>> > ;
>> > [Parameters]
>> > ServerPort = 1111
>> > DisableUnixSocket = 1
>> > ;SSLServerPort = 2111
>> > ;SSLCertificate = cert.pem
>> > ;SSLPrivateKey = pk.pem
>> > ;X509ClientVerify = 0
>> > ;X509ClientVerifyDepth = 0
>> > ;X509ClientVerifyCAFile = ca.pem
>> > ServerThreads = 20
>> > CheckpointInterval = 60
>> > O_DIRECT = 0
>> > NumberOfBuffers = 400000
>> > MaxDirtyBuffers = 1200
>> > CaseMode = 2
>> > MaxStaticCursorRows = 5000
>> > CheckpointAuditTrail = 0
>> > AllowOSCalls = 0
>> > SchedulerInterval = 10
>> > DirsAllowed = ., /foo/vad, /bar/dbpedia,
>> > ThreadCleanupInterval = 0
>> > ThreadThreshold = 10
>> > ResourcesCleanupInterval = 0
>> > FreeTextBatchSize = 100000
>> > SingleCPU = 0
>> > VADInstallDir = /moo/vad/
>> > PrefixResultNames = 0
>> >
>> > [SPARQL]
>> > ;ExternalQuerySource = 1
>> > ;ExternalXsltSource = 1
>> > ResultSetMaxRows = 9223372036854775807
>> > DefaultGraph = http://neofonie.de/dbpedia_3_2
>> > ;ImmutableGraphs = http://localhost:8890/dataspace
>> > ;MaxQueryCostEstimationTime = 120 ; in seconds
>> > ;MaxQueryExecutionTime = 10 ; in seconds
>> > ;PingService = http://rpc.pingthesemanticweb.com/
>> > DefaultQuery = select * where { ?s ?p ?o . } Limit 100
>> >
>> > I tried over isql with the following command:
>> > /isql 1111 foo bar test.sql > test_r.nt
>> >
>> > test.sql contains:
>> > sparql
>> > PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>> > CONSTRUCT { ?s ?p ?lo. ?p rdfs:label ?lp.}
>> > WHERE { ?s a <http://dbpedia.org/ontology/Person>.
>> > { ?s ?p ?lo. FILTER isLiteral(?lo) }
>> > UNION { ?s ?p ?o. ?o rdfs:label ?lo. FILTER isIRI(?o) }
>> > ?p rdfs:label ?lp.};
>> >
>> > It breaks with the warning
>> > Warning 01004: [Virtuoso Driver]CL077: Data truncated in column 1 of the result-se(callretRDF/XML-0, type 125)
>> >
>> > Is there any way to extract such a large amount of data without
>> > breaking Virtuoso?
>> >
>> > Kind regards
>> >
>> > Armin Nagel
>> >
>> > --
>> >
>> > You can also find us and our Web 2.0 search engine WeFind
>> > at CeBIT in Hall 006, Stand G60 (in the Webciety).
>> >
>> > We look forward to your visit!
>> > ________________________________
>> >
>> > Armin Nagel
>> > Software Developer
>> >
>> > neofonie
>> > Technologieentwicklung und
>> > Informationsmanagement GmbH
>> > Robert-Koch-Platz 4
>> > 10115 Berlin
>> > fon: +49.30 24627 257
>> > fax: +49.30 24627 120
>> > armin.na...@neofonie.de
>> > http://www.neofonie.de
>> >
>> > Commercial register
>> > Berlin-Charlottenburg: HRB 67460
>> >
>> > Management
>> > Helmut Hoffer von Ankershoffen
>> > (Spokesman of the Management Board)
>> > Nurhan Yildirim
>> > Uwe-Gernot Fasold
>> > ________________________________
>> >
>> > The first Web 2.0 search engine, now at http://www.wefind.de .
>> >
>> > Always well informed on the go with WeFind Mobile for iPhone and
>> > now also with WeFind Mobile for Android: free download in the
>> > iTunes App Store and the Android Market.
>> > _______________________________________________
>> > Virtuoso-users mailing list
>> > Virtuoso-users@lists.sourceforge.net
>> > https://lists.sourceforge.net/lists/listinfo/virtuoso-users

--
Armin Nagel
neofonie
armin.na...@neofonie.de
http://www.neofonie.de
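
A rough sanity check of the memory numbers in this thread, offered as a sketch rather than a diagnosis. It assumes that each Virtuoso buffer caches one 8 KiB database page; the constant PAGE_BYTES and both function names are made up for this illustration and are not part of Virtuoso. It also multiplies the optimizer's row estimates from the explain output (1.5e+05, 4 and 0.85 rows), which are only estimates and may be far from the real row counts.

```python
# Back-of-the-envelope check of the buffer pool size against physical RAM.
# Assumption: each Virtuoso buffer caches one 8 KiB database page.

PAGE_BYTES = 8 * 1024          # assumed page size per buffer
NUMBER_OF_BUFFERS = 400_000    # NumberOfBuffers from the quoted virtuoso.ini
PHYSICAL_RAM_GIB = 4           # RAM of the machine that ran out of memory


def buffer_pool_gib(buffers: int, page_bytes: int = PAGE_BYTES) -> float:
    """Return the size of the buffer pool in GiB."""
    return buffers * page_bytes / 2**30


def estimated_solutions(*row_estimates: float) -> float:
    """Multiply the per-step row estimates of a nested-loop plan."""
    total = 1.0
    for rows in row_estimates:
        total *= rows
    return total


pool = buffer_pool_gib(NUMBER_OF_BUFFERS)
headroom = PHYSICAL_RAM_GIB - pool
# Optimizer estimates taken from the three RDF_QUAD steps of the plan.
solutions = estimated_solutions(1.5e5, 4, 0.85)

print(f"buffer pool: {pool:.2f} GiB, headroom: {headroom:.2f} GiB")
print(f"estimated solutions: {solutions:,.0f}")
```

Under that assumption the buffer pool alone comes to roughly 3 GiB, leaving less than 1 GiB of the 4 GB machine for everything else. Judging from the plan, that remainder also has to hold the triple dictionary that DB.DBA.SPARQL_CONSTRUCT_ACC appears to fill before DB.DBA.RDF_FORMAT_TRIPLE_DICT_AS_TTL serializes it in one piece, which would fit the steady memory growth seen in htop.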