Re: a bug of extra long line

Marco Colombo Fri, 23 Mar 2001 11:12:25 -0800

On Sat, 10 Mar 2001, Lawrence Greenfield wrote:

> Could you try the following comparison instead?
> 
>       p = buf + strlen(buf) - 1;
>       if (p == buf || (p[0] == '\n' && p[-1] != '\r')) {
>           /* either a \0 by itself or a \n without a \r */
>           p[0] = '\r';
>           p[1] = '\n';
>           p[2] = '\0';
> 
> The test is important to make sure that no bare \n get through.
> 
> Larry
> 

Larry, there's another bug here. I've discovered it because I had
a nasty mail loop with the 'contains bare newlines' error.

The problem was that the messages did not contain ANY bare newlines.

It did contain NULs, BTW.

All the checks for both bare NLs and NULs are performed in
imap/message.c:message_copy_strict(). If I follow the code right,
first the message is handled by imap/lmtpengine.c:copy_msg(), then
by message_copy_strict() for each recipient.

Since you're using prot_fgets() to read the message in copy_msg(),
if the message contains NUL, the strlen() is doomed to fail its job.
E.g., if the input stream is:

<headers>
\r\n
Ciao\0Come va\r\n
\r\n.\r\n

the prot_fget() will "return" (write into buf) the sequence:
Ciao\0Come va\r\n\0 (the trailing \0 is added by prot_fget),

p will be bug+3 instead of bug+14. So the part of the line after tne NUL
is ignored. In some cases message_copy_strict() will later complain about
a bare NL (the NUL never gets written to fout, of course).

There's no easy way to fix this, since it's the fget() semantic which is
broken here (it adds a \0 in the end of the "returned value", and caller
has no other way to know how many bytes has been read).

I ended up in rewriting the whole function to use prot_read(). Now
NULs get thru, and after are correctly detected by message_copy_strict().
Besides this, I'm just wondering why you seem to convert bare \n into \r\n
in copy_msg(), if later in message_copy_strict() you're checking for them.
I believe this isn't the intended behavior. Moreover, you're "ignoring"
\r\0 (at the beginning of the line?), later removing bare CRs (WHY? if
you're going to complain loud for bare \n, as per RFC2045, why are you
silently stripping bare \r?).

So, here's *two* version of the copy_msg() function, supplied as patch.
The first one is my attempt to maintain the same semantic of the old
copy_msg(), the second one is simpler, and lets message_copy_strict()
do the job, adding a check for bare CRs, too. Not submitted for inclusion,
of course, just for showing and testing (the second one runs on a test
box here). Feel free to use/modify the code.

This won't avoid mail loops (two hosts exchanging mail for each other
postmaster, both running Cyrus which rejects messages). But I'm starting
to think that's a sendmail bug, that lets the bare NLs, CRs, or NULs 
thru without rejecting them.

.TM.
-- 
      ____/  ____/   /
     /      /       /                   Marco Colombo
    ___/  ___  /   /                  Technical Manager
   /          /   /                      ESI s.r.l.
 _____/ _____/  _/                     [EMAIL PROTECTED]

--- cyrus-imapd-2.0.12/imap/lmtpengine.c.esi    Mon Feb 19 19:05:37 2001
+++ cyrus-imapd-2.0.12/imap/lmtpengine.c        Thu Mar 22 17:07:10 2001
@@ -531,49 +531,101 @@
  * newlines are fiddled. "." terminates */
 static int copy_msg(struct protstream *fin, FILE *fout)
 {
-    char buf[8192], *p;
+    enum {ST_BEGIN, ST_DOT, ST_DOTCR, ST_CR, ST_INLINE} state;
+    char    buf[BUFSIZ];
+    char    *p;
+    ssize_t count;
+    int            end = 0;
 
-    while (prot_fgets(buf, sizeof(buf)-1, fin)) {
-       p = buf + strlen(buf) - 1;
-       if (p == buf || p[-1] != '\r') {
-           p[0] = '\r';
-           p[1] = '\n';
-           p[2] = '\0';
-       } else if (*p == '\r') {
-           if (buf[0] == '\r' && buf[1] == '\0') {
-               /* The message contained \r\0, and fgets is confusing us.
-                  XXX ignored
-                  */
-           } else {
-               /*
-                * We were unlucky enough to get a CR just before we ran
-                * out of buffer--put it back.
-                */
-               prot_ungetc('\r', fin);
-               *p = '\0';
+    state = ST_BEGIN;
+    while (!end && (count = prot_read(fin, buf, sizeof(buf)))) {
+       for(p = buf; !end && (p < buf + count); p++) {
+           switch (state) {
+           case ST_BEGIN:
+               switch (*p) {
+               case '\n':
+                   fputc('\r', fout);
+                   fputc('\n', fout);
+                   state = ST_BEGIN;
+                   break;
+               case '\r':
+                   state = ST_CR;
+                   break;
+               case '.':
+                   state = ST_DOT;
+                   break;
+               default:
+                   fputc(*p, fout);
+                   state = ST_INLINE;
+                   break;
+               }
+               break;
+           case ST_DOT:
+               switch (*p) {
+               case '\n':
+                   fputc('\r', fout);
+                   fputc('\n', fout);
+                   state = ST_BEGIN;
+                   break;
+               case '\r':
+                   state = ST_DOTCR;
+                   break;
+               default:
+                   fputc(*p, fout);
+                   state = ST_INLINE;
+                   break;
+               }
+               break;
+           case ST_DOTCR:
+               switch (*p) {
+               case '\n':
+                   end = 1;
+                   break;
+               default:
+                   /* fputc('\r', fout); */
+                   fputc(*p, fout);
+                   state = ST_INLINE;
+                   break;
+               }
+               break;
+           case ST_CR:
+               switch (*p) {
+               case '\n':
+                   fputc('\r', fout);
+                   fputc('\n', fout);
+                   state = ST_BEGIN;
+                   break;
+               default:
+                   /* fputc('\r', fout); */
+                   fputc(*p, fout);
+                   state = ST_INLINE;
+                   break;
+               }
+               break;
+           case ST_INLINE:
+               switch (*p) {
+               case '\n':
+                   fputc('\r', fout);
+                   fputc('\n', fout);
+                   state = ST_BEGIN;
+                   break;
+               case '\r':
+                   state = ST_CR;
+                   break;
+               default:
+                   fputc(*p, fout);
+                   state = ST_INLINE;
+                   break;
+               }
+               break;
            }
        }
-       /* Remove any lone CR characters */
-       while ((p = strchr(buf, '\r')) && p[1] != '\n') {
-           strcpy(p, p+1);
-       }
-       
-       if (buf[0] == '.') {
-           if (buf[1] == '\r' && buf[2] == '\n') {
-               /* End of message */
-               goto lmtpdot;
-           }
-           /* Remove the dot-stuffing */
-           fputs(buf+1, fout);
-       } else {
-           fputs(buf, fout);
-       }
     }
 
-    /* wow, serious error---got a premature EOF. */
-    return IMAP_IOERROR;
+    if (!end) {
+       return IMAP_IOERROR;
+    }
 
- lmtpdot:
     return 0;
 }

--- cyrus-imapd-2.0.12/imap/lmtpengine.c.esi    Mon Feb 19 19:05:37 2001
+++ cyrus-imapd-2.0.12/imap/lmtpengine.c        Thu Mar 22 15:46:57 2001
@@ -158,6 +158,9 @@
     case IMAP_MESSAGE_CONTAINSNULL:
        return "554 5.6.0 Message contains NUL characters";
        
+    case IMAP_MESSAGE_CONTAINSCR:
+       return "554 5.6.0 Message contains bare carriage returns";
+
     case IMAP_MESSAGE_CONTAINSNL:
        return "554 5.6.0 Message contains bare newlines";
 
@@ -531,49 +534,86 @@
  * newlines are fiddled. "." terminates */
 static int copy_msg(struct protstream *fin, FILE *fout)
 {
-    char buf[8192], *p;
-
-    while (prot_fgets(buf, sizeof(buf)-1, fin)) {
-       p = buf + strlen(buf) - 1;
-       if (p == buf || p[-1] != '\r') {
-           p[0] = '\r';
-           p[1] = '\n';
-           p[2] = '\0';
-       } else if (*p == '\r') {
-           if (buf[0] == '\r' && buf[1] == '\0') {
-               /* The message contained \r\0, and fgets is confusing us.
-                  XXX ignored
-                  */
-           } else {
-               /*
-                * We were unlucky enough to get a CR just before we ran
-                * out of buffer--put it back.
-                */
-               prot_ungetc('\r', fin);
-               *p = '\0';
-           }
-       }
-       /* Remove any lone CR characters */
-       while ((p = strchr(buf, '\r')) && p[1] != '\n') {
-           strcpy(p, p+1);
-       }
-       
-       if (buf[0] == '.') {
-           if (buf[1] == '\r' && buf[2] == '\n') {
-               /* End of message */
-               goto lmtpdot;
+    enum {ST_BEGIN, ST_DOT, ST_DOTCR, ST_CR, ST_INLINE} state;
+    char    buf[BUFSIZ];
+    char    *p;
+    ssize_t count;
+    int            end = 0;
+
+    state = ST_BEGIN;
+    while (!end && (count = prot_read(fin, buf, sizeof(buf)))) {
+       for(p = buf; !end && (p < buf + count); p++) {
+           switch (state) {
+           case ST_BEGIN:
+               switch (*p) {
+               case '.':
+                   state = ST_DOT;
+                   break;
+               case '\r':
+                   fputc('\r', fout);
+                   state = ST_CR;
+                   break;
+               default:
+                   fputc(*p, fout);
+                   state = ST_INLINE;
+                   break;
+               }
+               break;
+           case ST_DOT:
+               switch (*p) {
+               case '\r':
+                   state = ST_DOTCR;
+                   break;
+               default:
+                   fputc(*p, fout);
+                   state = ST_INLINE;
+                   break;
+               }
+               break;
+           case ST_DOTCR:
+               switch (*p) {
+               case '\n':
+                   end = 1;
+                   break;
+               default:
+                   fputc('\r', fout);
+                   fputc(*p, fout);
+                   state = ST_INLINE;
+                   break;
+               }
+               break;
+           case ST_CR:
+               switch (*p) {
+               case '\n':
+                   fputc('\n', fout);
+                   state = ST_BEGIN;
+                   break;
+               default:
+                   fputc(*p, fout);
+                   state = ST_INLINE;
+                   break;
+               }
+               break;
+           case ST_INLINE:
+               switch (*p) {
+               case '\r':
+                   fputc('\r', fout);
+                   state = ST_CR;
+                   break;
+               default:
+                   fputc(*p, fout);
+                   state = ST_INLINE;
+                   break;
+               }
+               break;
            }
-           /* Remove the dot-stuffing */
-           fputs(buf+1, fout);
-       } else {
-           fputs(buf, fout);
        }
     }
 
-    /* wow, serious error---got a premature EOF. */
-    return IMAP_IOERROR;
+    if (!end) {
+       return IMAP_IOERROR;
+    }
 
- lmtpdot:
     return 0;
 }
 
--- cyrus-imapd-2.0.12/imap/message.c.esi       Tue Dec 26 22:35:41 2000
+++ cyrus-imapd-2.0.12/imap/message.c   Thu Mar 22 15:46:42 2001
@@ -254,9 +254,11 @@
                blankline = 1;
            }
            else if (*p == '\r') {
+               if (sawcr) r = IMAP_MESSAGE_CONTAINSCR;
                sawcr = 1;
            }
            else {
+               if (sawcr) r = IMAP_MESSAGE_CONTAINSCR;
                sawcr = 0;
                blankline = 0;
                if (inheader && *p >= 0x80) {
--- cyrus-imapd-2.0.12/imap/imap_err.et.esi     Mon May 29 01:19:39 2000
+++ cyrus-imapd-2.0.12/imap/imap_err.et Thu Mar 22 15:46:42 2001
@@ -68,6 +68,9 @@
 ec IMAP_MESSAGE_CONTAINSNULL,
    "Message contains NUL characters"
 
+ec IMAP_MESSAGE_CONTAINSCR,
+   "Message contains bare carriage returns"
+
 ec IMAP_MESSAGE_CONTAINSNL,
    "Message contains bare newlines"

Re: a bug of extra long line

Reply via email to