Hi Bart,

On 28-Aug-14 1:56 PM, Bart Vandewoestyne wrote:
> On 2014-07-16 10:34, Alexey Zakhlestin wrote:
>> On 14 Jul 2014, at 17:37, Bart Vandewoestyne <bart.vandewoest...@telenet.be> wrote:
>>> Our problem is the following: we observed that some of our SPARQL queries have subqueries (that return IDs) that run rather slowly (because of a slow bif:contains full-text search). We therefore 'migrated'/'copied'/'flattened' some of our data to another technology/database/store that allows for faster textual search. Querying this database results in a list of IDs, and these IDs are then 'injected' into a SPARQL query (we simply create a SPARQL query string with all these IDs in the FILTER clause). Apparently, we are limited to filtering 4094 IDs, and I don't see a way to overcome this if the subquery returns more. Could there be a way to work around this limitation at the SPARQL level?
>>
>> Couple of untested thoughts:
>>
>> 1. Try to use several IN() filters by joining them using OR:
>>    FILTER(?id IN(1,2,3) || ?id IN(4,5,6))
>> 2. Use UNION to join result sets:
>>    { … . FILTER(?id IN(1,2,3)) } UNION { … . FILTER(?id IN(4,5,6)) }
>> 3. Insert the IDs into a separate named graph, use them for the subquery, and then clear the graph:
>>    INSERT DATA { <id1> a <https://example.com/dummy> . <id2> a <https://example.com/dummy> . } etc.
>
> Getting back to my question from July, I have now tested solutions 1) and 2). Solution 2), using UNIONs, indeed seems a good way to overcome the 4094 FILTER limit.
>
> Concerning solution 1), I came across the following issue. I tried filtering using the following expression:
>
>   FILTER( ?id IN(<id1>) || ?id IN(<id2>) || ... and so on ... || ?id IN(<idN>) )
>
> or, similarly:
>
>   FILTER( (?id = <id1>) || (?id = <id2>) || ... and so on ... || (?id = <idN>) )
>
> This works with values of N up to and including 1024. As soon as I try with N=1025, Virtuoso returns the error:
>
>   Virtuoso 37000 Error SP031: SPARQL compiler: Internal error: The maximum number of elements in array too long
The error comes from the compiler's limitations. Does it make a difference if you use this instead:

  FILTER( ?id IN(<id1>, <id2> ... <id1025>) )

i.e. do you get the same error again, or does this work for you?

Best Regards,
Rumi Kocis
> I find this error message in the source file
> http://code.metager.de/source/xref/openlink/virtuoso-opensource/libsrc/Wi/sparql_sff.c
> but I cannot deduce the exact reason for it. Could it be that only 1024 expressions can be ORed together in a FILTER statement like the one above? Would that rule out solution 1) as a way to overcome the 4094 limit?
>
> Any thoughts on this are welcome!
>
> Bart
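
For reference, the UNION workaround that Bart reports as working (solution 2) can be generated when the query string is built. The following is only a minimal Python sketch under stated assumptions: the function name build_union_query, the graph pattern passed in, and the default chunk size of 4000 are hypothetical choices for illustration and appear nowhere in the thread. The chunk size is simply picked to stay below the 4094 limit reported above; it has not been verified against any Virtuoso version.

```python
from typing import List


def build_union_query(ids: List[str], pattern: str, chunk_size: int = 4000) -> str:
    """Build a SELECT query with one UNION branch per chunk of IDs.

    ids        -- IRIs without angle brackets (hypothetical input format)
    pattern    -- graph pattern binding ?id, repeated in every branch
    chunk_size -- assumed safe size, chosen to stay under the reported 4094 limit
    """
    if not ids:
        raise ValueError("at least one ID is required")
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")

    branches = []
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        # Each branch carries its own FILTER with a single IN() list.
        id_list = ", ".join(f"<{iri}>" for iri in chunk)
        branches.append(f"{{ {pattern} FILTER(?id IN({id_list})) }}")

    body = "\n  UNION\n  ".join(branches)
    return f"SELECT ?id WHERE {{\n  {body}\n}}"


if __name__ == "__main__":
    # Ten made-up IDs split into chunks of four give three branches.
    sample_ids = [f"http://example.com/id{i}" for i in range(10)]
    print(build_union_query(sample_ids, "?id a <https://example.com/dummy> .", chunk_size=4))
```

Each branch repeats the same graph pattern, so the query text grows with the number of chunks. Whether Virtuoso also limits the number of UNION branches is not established in this thread.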
_______________________________________________ Virtuoso-users mailing list Virtuoso-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/virtuoso-users