On 5/9/11 8:06 PM, Mårten Wikström wrote:
>> Bash Version: 4.2
>> Patch Level: 8
>> Release Status: release
>>
>> Description:
>> When parsing double double-quotes (i.e. """") it will be replaced by the
>> value 0x7f, if there are characters before or after it. In bash 4.1 empty
>> double-quotes were simply removed.
>>
>> Repeat-By:
>> $ echo """"a >t
>> $ hexdump -C t
>> 00000000  7f 61 0a                                          |.a.|
>> 00000003
>>
>
> Fix:
> After some debugging it turns out that the problem lies in
> expand_word_internal() in subst.c. In 4.1.0 the "" will be removed in
> expand_word_internal() when we hit line 8040:

Thanks for your investigation and analysis.  You've correctly identified
the place in the code that changed between bash-4.1 and bash-4.2 and the
place that needs to be fixed.

>
>     /* We do not want to add quoted nulls to strings that are only
>        partially quoted; we can throw them away. */
>     if (temp == 0 && quoted_state == PARTIALLY_QUOTED)
>       continue;
>
> However, in 4.2.10 the "" will be converted to CTLNUL (0x7f). Because the
> above code has changed into
>
>     /* We do not want to add quoted nulls to strings that are only
>        partially quoted; we can throw them away. */
>     if (temp == 0 && quoted_state == PARTIALLY_QUOTED &&
>         (word->flags & (W_NOSPLIT|W_NOSPLIT2)))
>       continue;
>
> which won't match our case (the only flag set in word->flags is
> W_QUOTED). So instead we fall down to add_quoted_string: and it will
> add the CTLNUL character. So we end up with two 0x7f bytes in the
> resulting string when we get back to shell_expand_word_list(). Later
> only the first 0x7f will be removed by
> word_list_remove_quoted_nulls().

Correct.

>
> There are two problems/solutions here. The comment in the code above
> seems to indicate that the quotes should actually be thrown away as is
> done in 4.1. But on the other hand, word_list_remove_quoted_nulls()
> seems to indicate it should remove all nulls, not just the first.
> If I fix word_list_remove_quoted_nulls() to actually remove all
> consecutive nulls, the problem is solved. (At least my simple test-case
> works). If I revert the line above to the 4.1 version it also solves my
> problem.

Unfortunately, that will not work.  You can't throw away the empty strings
unless you're sure that you won't be performing word splitting.  The best
example is

f=" val"
e=
echo "$e"$f

which should result in two fields, the first of which is the empty string.
Bash-4.1 got that wrong.

>
> Alas, my understanding of the bash code is fairly limited so my fixes
> will likely break something.
> Perhaps someone with a little more insight could tell the right(tm)
> solution.
>
> Anyway, here are the patches.
> Solution 1, fixing remove_quoted_nulls():
>
> *** subst.c 2011-05-10 01:48:54.816322136 +0200
> --- ../bash-4.2-patched/subst.c 2011-05-10 01:53:31.350806960 +0200
> *************** remove_quoted_nulls (string)
> *** 3706,3712 ****
>             break;
>         }
>         else if (string[i] == CTLNUL)
> !         i++;
>
>         prev_i = i;
>         ADVANCE_CHAR (string, slen, i);
> --- 3706,3713 ----
>             break;
>         }
>         else if (string[i] == CTLNUL)
> !         while (string[i] == CTLNUL)
> !           i++;
>
>         prev_i = i;
>         ADVANCE_CHAR (string, slen, i);

It's the right place, but the wrong fix.  The code as it reads in bash-4.2
skips over each character immediately following a CTLNUL.  If a sequence
of CTLNULs appears, it skips every other one.  I attached a patch that does
the right thing.

This bug has been there for a long time -- I stopped looking when I got
back to bash-3.0.  It was just masked by the code in expand_word_internal().

> Solution 2, reverting to 4.1 behaviour:

As above, that doesn't do the right thing in all cases.

Chet

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/

*** ../bash-4.2-patched/subst.c 2011-03-06 14:11:11.000000000 -0500
--- subst.c 2011-05-11 11:23:33.000000000 -0400
***************
*** 3707,3711 ****
          }
        else if (string[i] == CTLNUL)
!         i++;
  
        prev_i = i;
--- 3710,3717 ----
          }
        else if (string[i] == CTLNUL)
!         {
!           i++;
!           continue;
!         }
  
        prev_i = i;
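
For readers who want to see the difference outside of bash, here is a
minimal sketch of the two loop shapes discussed above. It is not the code
from subst.c: it assumes single-byte characters (so ADVANCE_CHAR reduces to
i++), the function names strip_nulls_old and strip_nulls_fixed are
hypothetical, and the CTLESC/CTLNUL values are assumed to be 0x01 and 0x7f
(the latter matches the byte in the report).

```c
#include <stddef.h>
#include <string.h>

/* Assumed values for illustration; 0x7f matches the byte in the report. */
#define CTLESC '\001'
#define CTLNUL '\177'

/* Model of the 4.2 loop shape: after stepping past a CTLNUL, the next
   byte is copied without being examined, so the second CTLNUL of a pair
   survives. */
static void strip_nulls_old(char *s)
{
    size_t i = 0, j = 0, len = strlen(s);

    while (i < len) {
        if (s[i] == CTLESC) {
            s[j++] = s[i++];        /* keep the escape itself */
            if (i == len)
                break;
        } else if (s[i] == CTLNUL)
            i++;                    /* drop this CTLNUL ...            */

        if (i < len)
            s[j++] = s[i++];        /* ... but copy the next byte blindly */
    }
    s[j] = '\0';
}

/* Model of the patched loop shape: after dropping a CTLNUL, go back to
   the top of the loop so the following byte is examined as well. */
static void strip_nulls_fixed(char *s)
{
    size_t i = 0, j = 0, len = strlen(s);

    while (i < len) {
        if (s[i] == CTLESC) {
            s[j++] = s[i++];        /* keep the escape itself */
            if (i == len)
                break;
        } else if (s[i] == CTLNUL) {
            i++;
            continue;               /* re-check the byte that follows */
        }
        s[j++] = s[i++];            /* ordinary (or escaped) byte */
    }
    s[j] = '\0';
}
```

Given the three bytes 0x7f 0x7f 'a', strip_nulls_old leaves 0x7f 'a',
which corresponds to the hexdump in the report, while strip_nulls_fixed
leaves just 'a'.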