Hi Jörn,

> 
> I know i can do something like the following, but it's terribly slow 
> (timeout, no result):
> 
> select * where {
>  ?s rdfs:label ?l .
>  FILTER(str(?l) = "贝拉克·奥巴马").
> }
> 
> e.g.
> http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&qtxt=prefix+dbpedia%3A+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2F%3E%0D%0Aselect+*+where+{%0D%0A++%3Fs+rdfs%3Alabel+%3Fl+.%0D%0A++FILTER%28str%28%3Fl%29+%3D+%22%E8%B4%9D%E6%8B%89%E5%85%8B%C2%B7%E5%A5%A5%E5%B7%B4%E9%A9%AC%22%29.%0D%0A}&format=text%2Fhtml&CXML_redir_for_subjs=121&CXML_redir_for_hrefs=&timeout=30000&debug=on
> 

This of course would result in a partial table scan as the str function would 
need to be evaluated for each record.


> 
> I already tried bif:contains like this:
> 
> select * where {
>  ?s rdfs:label ?l .
>  FILTER(bif:contains(?l, "贝拉克·奥巴马")).
> }
> 
> e.g.
> http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&qtxt=prefix+dbpedia%3A+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2F%3E%0D%0Aselect+*+where+{%0D%0A++%3Fs+rdfs%3Alabel+%3Fl+.%0D%0A++FILTER%28bif%3Acontains%28%3Fl%2C+%22%E8%B4%9D%E6%8B%89%E5%85%8B%C2%B7%E5%A5%A5%E5%B7%B4%E9%A9%AC%22%29%29.%0D%0A}&format=text%2Fhtml&CXML_redir_for_subjs=121&CXML_redir_for_hrefs=&timeout=30000&debug=on
> 
> but this returns the following error (and it's actually not exactly what i 
> want):
> 
>> Virtuoso 37000 Error XM029: Free-text expression, line 0: Invalid character 
>> in free-text search expression, it may not appear outside quoted string at è
>> 
>> 
>> SPARQL query:
>> define sql:big-data-const 0 
>> #output-format:text/html
>> define sql:signal-void-variables 1 define input:default-graph-uri 
>> <http://dbpedia.org> prefix dbpedia: <http://dbpedia.org/resource/>
>> select * where {
>>  ?s rdfs:label ?l .
>>  FILTER(bif:contains(?l, "贝拉克·奥巴马")).
>> }
> 
> 
> 
> So, is there a way for a performant exact literal search ignoring the 
> language?
> 
> Or could you maybe just make the standard `str(?l) = "something"` way quick 
> if `?l` is a literal?
> 


The problem is that when you use the bif:contains function with Unicode 
characters, it has difficulty finding the word boundaries, so you need to 
quote the phrase yourself.

Try it like this:

prefix dbpedia: <http://dbpedia.org/resource/>
select * where {
  ?s rdfs:label ?l .
  FILTER(bif:contains(?l, "'贝拉克·奥巴马'")).
}

which will give you super fast results:

http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=prefix+dbpedia%3A+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2F%3E%0D%0Aselect+*+where+%7B%0D%0A++%3Fs+rdfs%3Alabel+%3Fl+.%0D%0A++FILTER%28bif%3Acontains%28%3Fl%2C+%22%27%E8%B4%9D%E6%8B%89%E5%85%8B%C2%B7%E5%A5%A5%E5%B7%B4%E9%A9%AC%27%22%29%29.%0D%0A%7D&format=text%2Fhtml&CXML_redir_for_subjs=121&CXML_redir_for_hrefs=&timeout=30000&debug=on
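
If you generate these queries from a script, the quoting step is easy to forget, so here is a minimal sketch of it in Python. The function names (quote_phrase, build_query, build_url) are made up for illustration and are not part of any Virtuoso client. The "exact" option is my own assumption about how to narrow the result to exact matches by adding your original str() comparison to the free-text match; I have not timed it on the public endpoint.

```python
from urllib.parse import urlencode


def quote_phrase(text: str) -> str:
    """Wrap text in single quotes so bif:contains treats it as one phrase."""
    # Hypothetical safety check: these characters would need escaping,
    # which this sketch does not attempt.
    for ch in ("'", '"', "\\"):
        if ch in text:
            raise ValueError("unsupported character in phrase: " + repr(ch))
    return "'" + text + "'"


def build_query(label: str, exact: bool = False) -> str:
    """Build the SPARQL query from the reply for the given label."""
    phrase = quote_phrase(label)
    lines = [
        "prefix dbpedia: <http://dbpedia.org/resource/>",
        "select * where {",
        "  ?s rdfs:label ?l .",
        '  FILTER(bif:contains(?l, "' + phrase + '")).',
    ]
    if exact:
        # Assumption: keep only labels whose string form equals the input.
        lines.append('  FILTER(str(?l) = "' + label + '").')
    lines.append("}")
    return "\n".join(lines)


def build_url(label: str, endpoint: str = "http://dbpedia.org/sparql") -> str:
    """URL-encode the query using the same parameters as the link above."""
    params = {
        "default-graph-uri": "http://dbpedia.org",
        "query": build_query(label),
        "format": "text/html",
        "timeout": "30000",
    }
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_query("贝拉克·奥巴马", exact=True))
    print(build_url("贝拉克·奥巴马"))
```

The script only builds the query text and the URL; it does not contact the endpoint.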


Patrick
---
Patrick van Kleef
Program Manager
OpenLink Software

http://www.openlinksw.com/
http://twitter.com/openlink/

