Re: [SM-USERS] Errors on thread view of over 20k messages folder

Marc Groot Koerkamp Tue, 21 Mar 2006 11:50:06 -0800

On Tue, March 21, 2006 20:13, Jonathan Angliss wrote:
>
> On Tue, March 21, 2006 11:13, Marc Groot Koerkamp wrote:
>
> [..]
>
>>> These IDs then need to be passed onto the code that fetches the
>>> message information.  This list of IDs is potentially enormous,
>>> especially in folders that have 20k messages (that's 20k message IDs
>>> that need to be returned).
>
>> Theoretically it involves parsing around 12 bytes per message (10
>> bytes to represent the maximum value of a 32-bit int and 2 bytes for a
>> space and a parenthesis), which is around 240KB for 20k messages. That
>> is about the same amount of data that has to be processed when we parse
>> the sort response (only the parentheses are not part of the sort
>> response). Practically, the amount of data to be parsed is smaller,
>> because normally the IMAP server does not return 10-byte string
>> representations for UIDs.
>
> Theoretically you are correct, the amount of data is roughly the same.
> Practically I believe you are wrong.  SORT uses a preg_split (we could
> probably change that to explode for speed), while THREAD checks byte for
> byte for special characters.  The code execution on SORT is pretty quick
> (one command), while the code execution on THREAD has to loop through a
> long list of values, checking for ( and ) and the numbers between to build
> the thread.  At least in stable anyway.
>


preg_split is the reason I rewrote the code in the first place (see my
CVS commit messages). preg_split is extremely slow on large strings, so I
have my doubts that preg_split is the reason why sort is fast. We should
remove that call from the SORT functions as well, because explode is much
better and a lot faster (but that's a different discussion). In fact,
regular expressions on large strings are performance killers, so I try to
avoid them if possible (in most cases we can avoid them).
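
To make the preg_split versus explode point concrete, here is a minimal
sketch in Python rather than PHP. It is not SquirrelMail code: re.split
stands in for preg_split, str.split stands in for explode, and the payload
and function names are made up for illustration. Both splits return the
same list; only the cost differs, and the timings depend on the machine
and on the regex engine, so none are quoted here.

```python
import re
import timeit


def split_with_regex(response: str) -> list[str]:
    # Stand-in for PHP preg_split: the regex engine walks the whole string.
    return re.split(r" ", response)


def split_plain(response: str) -> list[str]:
    # Stand-in for PHP explode: split on a literal one-character delimiter.
    return response.split(" ")


# Hypothetical SORT-style payload: 20,000 message IDs separated by spaces.
payload = " ".join(str(n) for n in range(1, 20001))


def time_both(repeats: int = 20) -> tuple[float, float]:
    # Returns (regex_seconds, plain_seconds) for `repeats` runs of each split.
    regex_time = timeit.timeit(lambda: split_with_regex(payload), number=repeats)
    plain_time = timeit.timeit(lambda: split_plain(payload), number=repeats)
    return regex_time, plain_time
```

Calling time_both() on your own machine shows how far apart the two are
for a folder of this size.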

Besides, with the current thread code I only need to iterate once through
the response string, and that is done pretty quickly.
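
A single pass over a THREAD-style response can be sketched as below. This
is a Python illustration of the one-iteration idea, not the actual 1.5.1
PHP code; the function name, the nested-list output, and the sample string
are assumptions made for the example.

```python
def parse_thread(response: str) -> list:
    """Parse the parenthesized part of a THREAD response in one pass.

    Each character is visited exactly once; a stack tracks the list that
    is currently open, so nesting depth does not cause extra passes.
    """
    root: list = []
    stack = [root]   # stack[-1] is the list currently being filled
    number = ""      # digits of the message ID being read

    for ch in response:
        if "0" <= ch <= "9":
            number += ch
            continue
        # Any non-digit ends the number that was being read, if any.
        if number:
            stack[-1].append(int(number))
            number = ""
        if ch == "(":
            child: list = []
            stack[-1].append(child)
            stack.append(child)
        elif ch == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            stack.pop()
        elif ch != " ":
            raise ValueError(f"unexpected character {ch!r}")

    if number:
        stack[-1].append(int(number))
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return root


# Made-up sample with two top-level threads, the second one branching.
threads = parse_thread("(2)(3 6 (4 23)(44 7 96))")
```

The work grows linearly with the length of the response, which is why a
20k-message folder should not be a problem for this kind of parser.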



>> Personally I don't think that the parsing of the thread response is an
>> issue in 1.5.1, because that part was rewritten the week before we
>> released 1.5.1. The reason for the rewrite was the inability of the
>> previous thread parsing code (which is the same as we use in 1.4.x) to
>> parse large thread responses.
>
> Having a quick look at the code, it just appears to be a tidier version
> of what is in stable; it is still looping through the THREAD response,
> looking for ( and ).  So I'm not sure, as I haven't tested.  Might be an
> ideal candidate for my dev laptop and profiler :)
>

You will be shocked when you compare 1.4.6 with 1.5.1 on a threaded
mailbox with 20k messages ;)
1.4.6 makes many more passes over the thread response before it completes
parsing.

Marc.


