On Tue, 9 Jan 2007, Joey Hess wrote:
David, it seems that there's a bug in the Grep implementation of the MessageParser that can lead to a hang. See discussion at http://bugs.debian.org/395268
Thanks for the heads up.
I share Steinar's confusion about what $self->{'CURRENT_EMAIL_INDEX'} should be used for and how it relates to $self->{'email_number'}
I renamed CURRENT_EMAIL_INDEX TO CHUNK_INDEX, and I've added the following documentation: # Reading grep data provides us with an array of potential email # starting locations. However, due to included emails and attachments, # we have to validate these locations as actually being the start of # emails. As a result, there may be more "chunks" in the array than # emails. So CHUNK_INDEX >= email_number-1. I've found that when the grep implementation goes into an infinite loop, it's because the grep data does not match the file, as would be the case if the file was modified after "grep" was run. My next release will detect this case and try to recover.
As a temporary workaround, I've disabled the Grep implementation in the Debian package.
I'll ping you when the release comes out so that you can test it. (I'm not sure how to recreate the bug myself.) Comments on one of the emails are below. BTW, I see from the link you provided that this is marked as closed. Did 1.4005 fix the bug or not? David On Sun, 12 Nov 2006 03:29:04 +0100, Steinar H. Gunderson wrote:
If I had to guess, I'd assume $self->{'email_number'} was somehow a _logical_ message number, and thus unfit for any sort of indexing. That's a guess, though.
That's right. CURRENT_EMAIL_INDEX (now CHUNK_INDEX) refers to an entry in the grep data array that corresponds to some block of text in the file that begins "From ...". In the case that this block of text may not be the start of a new email, we will need to continue incrementing CHUNK_INDEX and reading more chunks.
Part of the reason seems to be that _adjust_cache_data() somehow merges or deletes messages without adjusting email_number; I'm not really sure what it is supposed to do.
At the end, after validating the start of the next email, I add up the chunk entries to get the final, validated entry for the email. As for not checking the result of read(), that was sloppy programming on my part. I thought there was no way for the grep data and the file to get out of sync, but apparently someone has found a way. :) I don't know what the cause is in this case, but I'll try to detect and avoid/correct it in the next release. _____________________________________________________________________ David Coppit [EMAIL PROTECTED] The College of William and Mary http://coppit.org/ "When the president does it that means that it is not illegal." - Richard Nixon on domestic surveillance, 5/19/1977 "Do I have the legal authority to do this? And the answer is, absolutely." - George W. Bush on domestic surveillance, 12/19/2005 -- To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]