Hi Paul. > Here are comments on the parts of the diff not incorporated in this round: > > > -static const char __re_error_msgid[] = > > +static const char __re_error_msgid[] attribute_hidden = > > Since the constant is static, there should be no need for attribute_hidden, > as > the constant is used only in this compilation unit. Similarly for other > introductions of attribute_hidden.
Thanks, I'll review that. > > - re_dfa_t *dfa = bufp->buffer; > > + re_dfa_t *dfa = (re_dfa_t *) bufp->buffer; > > This cast is not needed, as bufp->buffer is already of the proper type. > Similarly elsewhere. I don't see that. In regex.h: | struct re_pattern_buffer | { | /* Space that holds the compiled pattern. It is declared as | `unsigned char *' because its elements are sometimes used as | array indexes. */ | unsigned char *__REPB_PREFIX(buffer); | ... | typedef struct re_pattern_buffer regex_t; And this is: | static bin_tree_t * | parse (re_string_t *regexp, regex_t *preg, reg_syntax_t syntax, | reg_errcode_t *err) | { | re_dfa_t *dfa = (re_dfa_t *) preg->buffer; So the cast seems necessary. > > - preg->buffer = dfa; > > + preg->buffer = (unsigned char *) dfa; > > This cast seems counterproductive, as dfa is already of the proper type > and the unsigned char * is not the proper type. Same argument for the cast. > > - dfa->subexp_map = re_malloc (Idx, preg->re_nsub); > > + /* some malloc()-checkers don't like zero allocations */ > > + if (preg->re_nsub > 0) > > + dfa->subexp_map = re_malloc (int, preg->re_nsub); > > + else > > + dfa->subexp_map = NULL; > > + > > Since malloc (0) is well-defined to return either NULL or a valid pointer > to a zero-byte object that can be freed, the code is working as-is. Not on older systems. Although yes, I'm not sure I still support such systems. I don't know which checker complained. I think it was valgrind at some point in the past. > I'd rather look for a solution that involves silencing the checking > without making the code bulkier and typically slower. This is in compilation, not execution; not sure the difference is really noticable. > > + /* > > + * Fedora Core 2, maybe others, have broken `btowc' that returns -1 > > + * for any value > 127. Sigh. Note that `start_ch' and `end_ch' are > > + * unsigned, so we don't have sign extension problems. > > + */ > > start_wc = ((start_elem->type == SB_CHAR || start_elem->type == > > COLL_SYM) > > - ? __btowc (start_ch) : start_elem->opr.wch); > > + ? start_ch : start_elem->opr.wch); > > end_wc = ((end_elem->type == SB_CHAR || end_elem->type == COLL_SYM) > > - ? __btowc (end_ch) : end_elem->opr.wch); > > + ? end_ch : end_elem->opr.wch); > > if (start_wc == WEOF || end_wc == WEOF) > > return REG_ECOLLATE; > > I don't see how this patch is helpful. If btowc returns -1 we are looking > at an encoding error, and we can't treat that byte as if it were a > character. In some single-byte locales, btowc returns 1 for values < 128, > and they're encoding errors too. Perhaps you could mention a use case? Gawk's profile5 test fails without this. This code way predates that test though. If a byte value given inside [...] is greater than 127, it should be left alone to be matched as-is (no arbitrary limits). > > /* Build single byte matching table for this equivalence class. */ > > + char_buf[1] = (unsigned char) '\0'; > > This should be unnecessary, as the rest of the code shouldn't be looking > at that byte. Is this something to pacify a lint checker? Or valgrind at some point. I think a static checker that someone submitted a report from. > > - wc = wc2; > > + wc = (wint_t) wc2; > > wc and wc2 are both integers so the cast is unnecessary. Similarly elsewhere. There was a compiler (maybe VMS) for which this was necessary. Thanks, Arnold