On Wed,  6 Oct 2021 16:23:15 +0000
Rasmus Liland <j...@posteo.no> wrote:

>        "SELECT * FROM gene2refseq
>         LEFT JOIN gene_info ON
>           gene_info.GeneID = gene2refseq.GeneID
>         WHERE gene2refseq.`RNA_nucleotide_accession.version` 
>           LIKE ?"

<...>

>       x1 <- DBI::dbGetQuery(conn=conn, 
>         statement=statement, 
>         param=list(Håkan20210914$RNANucleotideAccession))

I think that the problem here is that you pass a vector as a bound
parameter to LIKE, when parameter placeholders usually expect a scalar.
DBI transparently handles this:

>> The elements of the `params` argument do not need to be scalars,
>> vectors of arbitrary length (including length 0) are supported.  For
>> queries, calling dbFetch() binding such parameters returns
>> concatenated results, equivalent to binding and fetching for each
>> set of values and connecting via rbind().

I think this means that DBI runs a SELECT for each value in
Håkan20210914$RNANucleotideAccession, which is understandably slower
than a single query. Unfortunately, it's hard to pass vectors of values
to queries with bound parameters; the SQL engines I know don't have a
syntax for "WHERE param IN (:multi_placeholder:)". SQLite comes with
carray [1], but I don't know whether it's exposed by RSQLite (could be
hard to do in a pointer-safe way), and you're already aware of the
traditional way of doing that: create a temporary table, populate it
and JOIN with the rest of the query.

-- 
Best regards,
Ivan

[1] https://www.sqlite.org/carray.html

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to