On 2014-07-16 10:34, Alexey Zakhlestin wrote:
>
> On 14 Jul 2014, at 17:37, Bart Vandewoestyne <bart.vandewoest...@telenet.be> 
> wrote:
>
>> Our problem is the following: we observed that certain of our SPARQL
>> queries have subqueries (that return IDs) that run rather slow (because
>> of a slow bif:contains full-text-search).  We therefore
>> 'migrated'/'copied'/'flattened' some of our data to another
>> technology/database/store that allows for faster textual search.
>> Querying this database results in a list of IDs, and these IDs are then
>> 'injected' in a SPARQL query (we simply create a SPARQL query string
>> with all these IDs in the FILTER command).
>>
>> Apparently, we are limited to filtering 4094 IDs and I don't see a way
>> to overcome this if the subquery returns more.  Could there be a way to
>> work-around this limitation at SPARQL level?
>
> Couple of untested thoughts:
>
> 1. try to use several IN() filters by joining them using OR
>
>      FILTER(?id IN(1,2,3) || ?id IN(4,5,6))
>
> 2. use UNION to join result sets
>
>      {
>          … . FILTER(?id IN(1,2,3))
>      }
>      UNION
>      {
>          … . FILTER(?id IN(4,5,6))
>      }
>
>
> 3. insert IDs in separate named graph, use them for subquery and then clear
> the graph
>
>      INSERT DATA { <id1> a <https://example.com/dummy> .
>                    <id2> a <https://example.com/dummy> . } etc.

Getting back to my question from July, I have now tested solutions 1)
and 2).  Solution 2), using UNIONs, does indeed seem to be a good way to
overcome the 4094 FILTER limit.  With solution 1), however, I ran into
the following issue:

I tried filtering using the following expression:

   FILTER( ?id IN(<id1>)
           || ?id IN(<id2>)
           || ... and so on ...
           || ?id IN(<idN>) )

or, similarly:

   FILTER( (?id = <id1>)
           || (?id = <id2>)
           || ... and so on ...
           || (?id = <idN>) )

This works for values of N up to and including 1024.  As soon as I try
N=1025, Virtuoso returns the error:

Virtuoso 37000 Error SP031: SPARQL compiler:
      Internal error: The maximum number of elements in array too long

I found this error message in the source file
http://code.metager.de/source/xref/openlink/virtuoso-opensource/libsrc/Wi/sparql_sff.c
but I cannot deduce the exact reason for it.
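
For anyone who wants to reproduce this, filter strings of the two shapes
above can be generated with a few lines of Python.  This is only a
minimal sketch: the helper name build_or_filter and the example.com IRIs
are hypothetical placeholders, not part of any real setup.

```python
def build_or_filter(ids, use_in=False):
    """Build one FILTER that ORs together a single term per IRI."""
    if not ids:
        raise ValueError("need at least one IRI")
    if use_in:
        # Shape 1: ?id IN(<id1>) || ?id IN(<id2>) || ...
        terms = [f"?id IN(<{iri}>)" for iri in ids]
    else:
        # Shape 2: (?id = <id1>) || (?id = <id2>) || ...
        terms = [f"(?id = <{iri}>)" for iri in ids]
    return "FILTER( " + "\n        || ".join(terms) + " )"


# Placeholder IRIs (assumption).  1025 terms is the size at which the
# SP031 error is reported above; 1024 terms is the largest that works.
ids = [f"http://example.com/id{n}" for n in range(1, 1026)]
filter_1025 = build_or_filter(ids)
```

Changing the upper bound of the range to 1025 gives the N=1024 case.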

Could it be that only 1024 expressions can be ORed together in a FILTER
statement like the one above?  If so, solution 1) is not really an
option for overcoming the 4094 limit.
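
For completeness, here is a rough Python sketch of how the query string
for solution 2) can be assembled: the ID list is cut into chunks of at
most 4094 and each chunk gets its own UNION branch.  The function names,
the SELECT clause, the triple pattern and the example.com IRIs are all
assumptions made for illustration; the real pattern goes where the "…"
is in the quoted example.

```python
def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_union_query(ids, pattern, chunk_size=4094):
    """Build a query with one UNION branch per chunk of IRIs.

    chunk_size defaults to 4094, the FILTER limit observed above.
    """
    if not ids:
        raise ValueError("need at least one IRI")
    blocks = []
    for chunk in chunked(ids, chunk_size):
        id_list = ", ".join(f"<{iri}>" for iri in chunk)
        blocks.append(
            f"  {{\n    {pattern}\n    FILTER(?id IN({id_list}))\n  }}"
        )
    body = "\n  UNION\n".join(blocks)
    return f"SELECT DISTINCT ?id WHERE {{\n{body}\n}}"


# Hypothetical input: 10000 placeholder IRIs and a dummy triple pattern.
ids = [f"http://example.com/id{n}" for n in range(1, 10001)]
query = build_union_query(ids, "?id a <https://example.com/dummy> .")
```

With 10000 IDs this yields three branches (4094 + 4094 + 1812 IDs).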

Any thoughts on this are welcome!

Bart
