Re: AW: Derby Scheduler and FETCH FIRST question

Rick Hillegas Sat, 27 Jan 2018 13:31:01 -0800

On 1/26/18 12:11 AM, [email protected] wrote:

Hello Rick,

and thanks for your reply.

I will try what you've written as soon as I have time for it.
Unfortunately I already switch to another project and I don't knowwhen I will get the time to have a look on this again (blame mysuperiors ;-) ).
2) is surely a good idea.

1) Did you also read my 2nd mail?

I just responded to that 2nd message. Can you share the CREATE TABLE andCREATE INDEX statements associated with this table? They may providesome further clues.


Thanks,
-Rick

I also tried using a subselect, so I have a WHERE clause. I had thesame idea as you that the scheduler might not recognize the ORDER BYand FETCH FIRST.
It was faster, but still not what I would have expected. I've worked alot with Borland Interbase / Firebird, MySQL and especially withPostgreSQL.
And PostgreSQL would have done a lot faster than this.
By the way: PostgreSQL also has a more easy to use approach in aspectof analysis: https://www.postgresql.org/docs/9.6/static/using-explain.html
Would be great of Derby would offer something similar.

Regards,

Gerrit

*Von:*Rick Hillegas [mailto:[email protected]]
*Gesendet:* Freitag, 26. Januar 2018 00:39
*An:* [email protected]
*Betreff:* Re: Derby Scheduler and FETCH FIRST question

On 1/24/18 4:45 AM, [email protected] <mailto:[email protected]> wrote:

    Hello everyone,

    I'm using Apache Derby v10.14.1.0 and having some problems using
    the FETCH FIRST clauses.

    https://db.apache.org/derby/docs/10.14/ref/rrefsqljoffsetfetch.html

    I'm accessing the database using the Derby Embedded driver.

    I have a table which contains some indexes as well as some fields
    and a BLOB field. The table is somewhat big (means many rows, ~13 GB).

    I'm using a query like this (timestamp has an index):

    SELECT * FROM history ORDER BY timestamp DESC FETCH FIRST 10 ROWS ONLY

    The query takes ages (about 27 minutes for that ~13 GB table) and
    I can see how Derby slowly fills up my harddisk.

    And a look in the "tmp" folder of the database shows several
    ".tmp" files.

    First I get several files having 10 MB, then I get two big files
    having 5 GB, then the 10 MB files are deleted, then the 5 GB files
    are deleted and finally I get the result.

    As I thought something is wrong with my application I also did the
    same query on the same database and table using SQuirreL v3.8.1.
    But the result is the same.

    I would have expected that the scheduler of Derby would first look
    at the timestamp column / index (which should be sorted), taking
    the first 10 values from there and

    finally reading the first 10 rows matching these values.

    Instead it seems that it first processes the " SELECT * FROM
    history" part (as memory is not sufficient it swaps it to the
    harddisk), orders it and takes the first 10 elements.

    Is that correct?

    And if that is correct, where is the benefit of FETCH FIRST -
    beside that maybe not that much data is transferred (maybe only
    interesting if you use Derby not by the Embedded Driver because of
    the TCP/IP connection)?

    Regards,

    Gerrit

Hi Gerrit,
Can you share table and index DDL for this problem as well as thequery plan which Derby chose for the query? See the section on"Working with RunTimeStatistics" in the Derby Tuning Guide:http://db.apache.org/derby/docs/10.14/tuning/index.html
It may be that Derby did not choose the index. That in turn, may havehappened for 2 reasons:
1) You're selecting all of the columns in the table and there is nofiltering WHERE clause. That reduces the likelihood that Derby willpick an indexed access path since the optimizer sees this as a fulltable scan.
2) I don't think that any optimizer support was built for the FETCHFIRST clause. That's worth filing a performance bug for. I think thatthe FETCH FIRST clause is only applied at execution time in order toshort-circuit the number of rows which are returned.
Thanks,

-Rick

Re: AW: Derby Scheduler and FETCH FIRST question

Reply via email to