I've also been working on optimizing SPARQL queries lately, and have gotten some impressive performance boosts out of subqueries. If parts of your query can be isolated to reduce the solution space early on, there may be a benefit in putting those criteria in a subquery.
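As a rough sketch of what I mean, here is the first UNION branch of the query quoted below with the GraphC label match pulled into a subquery. This is only an illustration under assumptions: the skos: prefix is my shorthand for the full IRIs in the original, the GraphA patterns and the other two branches are left out for brevity, and whether the engine actually evaluates the inner SELECT first depends on the optimizer, so it is worth comparing timings before and after.

```sparql
# Sketch only: the graph and property IRIs are the placeholders from the
# original query, and skos: is an assumed prefix for the full IRIs used there.
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

ASK
FROM NAMED <GraphBURI>
FROM NAMED <GraphCURI>
WHERE {
  # Inner SELECT: run the selective label match on its own and project
  # only the notation values, so fewer solutions reach the outer join.
  {
    SELECT DISTINCT ?variableJ
    WHERE {
      GRAPH <GraphCURI> {
        ?variableI skos:notation ?variableJ .
        ?variableI skos:prefLabel ?variableK .
        FILTER (isLiteral(?variableK) && REGEX(?variableK, "literalA", "i"))
      }
    }
  }
  # Outer pattern: unchanged from the original branch, now joined against
  # the reduced set of ?variableJ bindings.
  GRAPH <GraphBURI> {
    ?variableF <propertyCURI>/<propertyDURI> ?variableG .
    ?variableF <propertyEURI> ?variableH
  }
  FILTER (isLiteral(?variableJ) && ?variableG = ?variableJ)
}
```

The DISTINCT in the inner SELECT is a design choice rather than a requirement: for an ASK query duplicates do not change the answer, so collapsing them early just keeps the intermediate result small.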
Jim

On Wed, Sep 28, 2016 at 9:40 AM Peter Lawrence <peter.lawre...@inova8.com> wrote:

> Another observation I note is that there is a common factor within each
> UNION:
>
> GRAPH <GraphABaseURI>{
>   ?variableA a graphA:ClassA .
>   ?variableA graphA:propertyA ?variableB .
>   ?variableB dcterms:title ?variableC .
>   ?variableA graphA:propertyB ?variableD .}
>
> Could this be factored out to drive the overall query, and then UNION the
> sub-selects on the other conditions?
>
>
> On Wed, Sep 28, 2016 at 2:06 PM Jerven Tjalling Bolleman
> <jerven.bolleman@sib.swiss> wrote:
>
>> Hi Pantelis,
>>
>> One thing that struck me as an other virtuoso user is
>> the joins in the filters.
>>
>> e.g.
>>
>> FILTER (?variableE = ?variableH)
>>
>> I think if you remove all instances of ?variableH with ?variableE
>> you should get better results.
>>
>> Hope that helps.
>>
>> Regards,
>> Jerven
>>
>> On 28/09/16 14:44, Pantelis Natsiavas wrote:
>> > Hi everybody.
>> >
>> > I have a rather complex SPARQL query, which is executed thousands of
>> > times in parallel threads (400 threads). The query is here somewhat
>> > simplified (namespaces, properties and variables have been reduced) for
>> > readability, but the complexity is left untouched (unions, number of
>> > graphs etc.). The query is run against 4 graphs, the biggest of which
>> > contains 5561181 triples.
>> >
>> > PREFIX graphA: <GraphABaseURI:>
>> >
>> > ASK
>> > FROM NAMED <GraphBURI>
>> > FROM NAMED <GraphCURI>
>> > FROM NAMED <GraphABaseURI>
>> > FROM NAMED <GraphDBaseURI>
>> > WHERE{
>> >   {
>> >     GRAPH <GraphABaseURI>{
>> >       ?variableA a graphA:ClassA .
>> >       ?variableA graphA:propertyA ?variableB .
>> >       ?variableB dcterms:title ?variableC .
>> >       ?variableA graphA:propertyB ?variableD .
>> >       ?variableL<GraphABaseURI:propertyB> ?variableD .
>> >       ?variableD <propertyBURI> ?variableE
>> >     }
>> >     .
>> >     GRAPH <GraphBURI>{
>> >       ?variableF <propertyCURI>/<propertyDURI> ?variableG .
>> >       ?variableF <propertyEURI> ?variableH
>> >     }
>> >     .
>> >     GRAPH <GraphCURI>{
>> >       ?variableI <http://www.w3.org/2004/02/skos/core#notation> ?variableJ .
>> >       ?variableI <http://www.w3.org/2004/02/skos/core#prefLabel> ?variableK .
>> >       FILTER (isLiteral(?variableK) && REGEX(?variableK, "literalA", "i"))
>> >     }
>> >     .
>> >     FILTER (isLiteral(?variableJ) && ?variableG = ?variableJ) .
>> >     FILTER (?variableE = ?variableH)
>> >   }
>> >   UNION
>> >   {
>> >     GRAPH <GraphABaseURI>{
>> >       ?variableA a graphA:ClassA .
>> >       ?variableA graphA:propertyA ?variableB .
>> >       ?variableB dcterms:title ?variableC .
>> >       ?variableA graphA:propertyB ?variableD .
>> >       ?variableL<propertyBURI> ?variableE .
>> >       ?variableL <propertyFURI> ?variableD .
>> >     }
>> >     .
>> >     GRAPH <GraphDBaseURI>{
>> >       ?variableM <propertyGURI> ?variableN .
>> >       ?variableM <propertyHURI> ?variableO .
>> >       FILTER (isLiteral(?variableO) && REGEX(?variableO, "literalA", "i"))
>> >     }
>> >     .
>> >     FILTER (?variableE = ?variableN) .
>> >
>> >   }
>> >   UNION
>> >   {
>> >     GRAPH <GraphABaseURI>{
>> >       ?variableA a graphA:ClassA .
>> >       ?variableA graphA:propertyA ?variableB .
>> >       ?variableB dcterms:title ?variableC .
>> >       ?variableA graphA:propertyB ?variableD .
>> >       ?variableL<propertyBURI> ?variableE .
>> >       ?variableL <propertyIURI> ?variableD .
>> >     }
>> >     .
>> >     GRAPH <GraphDBaseURI>{
>> >       ?variableM <propertyGURI> ?variableN .
>> >       ?variableM <propertyHURI> ?variableO .
>> >       FILTER (isLiteral(?variableO) && REGEX(?variableO, "literalA", "i"))
>> >     }
>> >     .
>> >     FILTER (?variableE = ?variableN) .
>> >   }
>> >   . FILTER (isLiteral(?variableC) && REGEX(?variableC, "literalB", "i")) .
>> > }
>> >
>> >
>> > I would not expect someone to transform the above query (of course...).
>> > I am only posting the query to demonstrate the complexity and all the
>> > SPARQL structures used.
>> >
>> > My questions:
>> >
>> > 1. Would I gain regarding performance if I had all my triples in one
>> > graph? This way I would avoid unions and simplify my query, however,
>> > would this also benefit in terms of performance?
>> > 2. Are there any kind of indexes that I could built and they could be of
>> > any help with the above query? I am not really confident on data
>> > indexing, however reading in
>> > http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtRDFPerformanceTuning#RDF
>> > Index Scheme I wonder if the virtuoso 7's default indexing scheme is
>> > suitable for queries like the above. While the predicates are defined in
>> > the above query's SPARQL triple patterns, there are many triple patterns
>> > that have not defined subject or predicate. Could this be a major
>> > problem regarding performance?
>> > 3. Perhaps there is a SPARQL syntax structure that I am not aware of and
>> > could be of great help in the above query. Could you suggest something?
>> > For example, I have already improved performance by removing STR() casts
>> > and using the isLiteral() function. Could you suggest anything else?
>> > 4. Perhaps you could suggest overusing a complex SPARQL syntax
>> > structure?
>> >
>> > Please note that I use Virtuoso Open source edition, built on Ubuntu,
>> > Version: 07.20.3214, Build: Oct 14 2015.
>> >
>> > Regards,
>> > Pantelis Natsiavas
>> >
>> > ------------------------------------------------------------------------------
>> > _______________________________________________
>> > Virtuoso-users mailing list
>> > Virtuoso-users@lists.sourceforge.net
>> > https://lists.sourceforge.net/lists/listinfo/virtuoso-users
>> >
>>
>> --
>> -------------------------------------------------------------------
>> Jerven Bolleman Jerven.Bolleman@sib.swiss
>> SIB Swiss Institute of Bioinformatics Tel: +41 (0)22 379 58 85
>> CMU, rue Michel Servet 1 Fax: +41 (0)22 379 58 58
>> 1211 Geneve 4,
>> Switzerland www.sib.swiss - www.uniprot.org
>> Follow us at https://twitter.com/#!/uniprot
>> -------------------------------------------------------------------
>>
> --
> *Peter J. Lawrence*
>
> *inova8*
> *Providing answers for users' information questions*
> *Mobile:* +1 330 631 3772 | *Phone:* +1 330 342 0582 |*Skype:*
> PeterJLawrence
> *Email:* peter.lawre...@inova8.com | *Web:* www.inova8.com
> *LinkedIn: **http://www.linkedin.com/in/peterjohnlawrence
> <http://www.linkedin.com/in/peterjohnlawrence>*
>
--
James P. McCusker III, Ph.D.
http://tw.rpi.edu/web/JamesMcCusker