Good morning Santiago.

You're right, I upgraded from jessie to stretch, and the installed grep package is:

ii  grep  2.27-2  amd64  GNU grep, egrep and fgrep

I have a text file, identified as an HTML document by the "file" command, which (as far as I can see) contains only text characters. The file contains numerous strings with "2018", but when I use grep to find that string I get:

<li><a href="/diario_boe/calendarios.php?a=2018">Calendario</a></li>
<li><a href="/boe/dias/2018/02/21/index.php?s=1"><span class="linkBack">anterior</span></a></li>
<li><a href="/boe/dias/2018/02/23/index.php?s=1"><span class="linkFwd">siguiente</span></a></li>
<p><strong>Sumario</strong> BOE-S-2018-47:</p>
<a href="/boe/dias/2018/02/22/pdfs/BOE-S-2018-47.pdf" title="BOE-S-2018-47 en formato PDF firmado " onclick="javascript: pageTracker._trackPageview('/boe/dias/2018/02/22/pdfs/BOE-S-2018-47.pdf');">PDF</a>
<a href="/diario_boe/xml.php?id=BOE-S-20180222" title="Sumario jueves 22 de febrero de 2018 como documento XML">XML</a>
<li><a href="../../../../../boe_n/dias/2018/02/22/index.php?d=47&s=N">Notificaciones</a></li>
--->Coincidencia en el fichero binario ayer.html<---- (in English: "Binary file ayer.html matches")

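For anyone who wants to reproduce this without the original page, here is a minimal sketch. The sample file and its contents are made up for illustration (they are not taken from ayer.html), and it assumes GNU grep running in a UTF-8 locale such as the es_ES.UTF-8 one from the report. The second line carries a single ISO-8859-1 byte (0xF1), which is not valid UTF-8 on its own.

```shell
#!/bin/sh
# Hypothetical sample file: two lines containing "2018".
# The second line holds the ISO-8859-1 byte 0xF1 (octal 361), an
# encoding error when the file is read as UTF-8.
sample=$(mktemp)
printf '<li><a href="/boe/dias/2018/02/21/index.php">anterior</a></li>\r\n' > "$sample"
printf '<p>Espa\361a 2018</p>\n' >> "$sample"

# Default mode: in a UTF-8 locale, recent GNU grep versions may print the
# first matches and then report a binary-file match instead of the
# remaining lines (the exact output depends on grep version and locale).
grep 2018 "$sample"

# With -a (equivalent to --binary-files=text) the file is treated as text,
# so every matching line is counted or printed.
matches_as_text=$(grep -a -c 2018 "$sample")
echo "lines matched with -a: $matches_as_text"

rm -f "$sample"
```

Running the plain grep line under LANG=es_ES.UTF-8 and then under another locale is one way to check whether the locale is what triggers the binary-file message on a given system.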
With the previous grep version all the strings were found, but now, if I want grep to work as before, I need to use "grep -a". I guess the previous version of grep took the "-a" behaviour as the default, treating all files as text unless specified otherwise (which in my opinion is the right way to go). I can't see the security issues in this behaviour, nor how those issues disappear if I specify the "-a" parameter.

It looks to me (without having reviewed grep's code) as if grep is trying to identify what kind of file it is reading while it searches it (a couple of lines are found before the binary message), and I guess it shouldn't do that. I think it should just treat files as text unless told otherwise with the --binary-files parameter.

Regards.
Francisco

On Thursday, 22 February 2018 at 15:33, Santiago R.R. <santiag...@riseup.net> wrote:

On 22/02/18 at 11:18, rodrifra wrote:
> Package: grep
> Version: 2.27-2
> Severity: normal
>
> Dear Maintainer,
>
>
>    * What led up to the situation?
>
> Scripts working with grep stopped working after the update. No patterns
> were detected and the message reporting matches in the binary file
> was displayed. The file is a downloaded html and the "file" command returns:
>
> selecc.html: HTML document, ISO-8859 text, with CRLF, LF line terminators
>
>    * What exactly did you do (or not do) that was effective (or
>      ineffective)?
>
> Explicitly telling grep to treat the file as text solved the problem:
> "grep -a ...."
>
> -- System Information:
> Debian Release: 9.3
>   APT prefers stable-updates
>   APT policy: (500, 'stable-updates'), (500, 'stable')
> Architecture: amd64 (x86_64)
>
> Kernel: Linux 4.9.0-5-amd64 (SMP w/1 CPU core)
> Locale: LANG=es_ES.UTF-8, LC_CTYPE=es_ES.UTF-8 (charmap=UTF-8),
> LANGUAGE=es_ES.UTF-8 (charmap=UTF-8)
> Shell: /bin/sh linked to /bin/dash
> Init: sysvinit (via /sbin/init)
>

I suppose you upgraded from jessie to stretch. I am not sure I fully understand your message.
Could you please clarify which version of grep didn't detect the patterns?

Anyway, as far as I understand from upstream's comments, grep's previous behaviour when detecting "binary files" was not suitable. The change was made to avoid security issues, or undetermined behaviour, that could be related to invalid characters. In your case, the .html file could include invalid characters at the beginning, or the encoding may be wrong. This is probably not a bug.

Cheers,

 -- Santiago
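
One way to check the "invalid characters or wrong encoding" hypothesis is sketched below. It is only an illustration: the sample file is hypothetical, and ISO-8859-1 is an assumption (the "file" output in the report only says "ISO-8859 text", so the real page could use a different ISO-8859 variant). It relies on iconv exiting with a non-zero status when the input is not valid in the declared source encoding.

```shell
#!/bin/sh
# Hypothetical sample: ISO-8859-1 text, so the byte 0xF1 (octal 361)
# is not a valid UTF-8 sequence.
latin1=$(mktemp)
utf8=$(mktemp)
printf '<p>Espa\361a 2018</p>\n' > "$latin1"

# Validate the file as UTF-8: iconv fails on the first invalid sequence.
if iconv -f UTF-8 -t UTF-8 "$latin1" > /dev/null 2>&1; then
    latin1_is_valid_utf8=yes
else
    latin1_is_valid_utf8=no
fi
echo "valid UTF-8: $latin1_is_valid_utf8"

# Convert to UTF-8 (assuming the source really is ISO-8859-1) and search
# the converted copy, which no longer contains encoding errors.
iconv -f ISO-8859-1 -t UTF-8 "$latin1" > "$utf8"
converted_matches=$(grep -c 2018 "$utf8")
echo "matches after conversion: $converted_matches"

rm -f "$latin1" "$utf8"
```

If the validation step fails on the downloaded page, converting it before searching is an alternative to passing -a in every script.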