Hi -

> > If the perceived problem is that build tree scans (-F) may contain
> > binaries that refer to source files that are not appropriate for
> > later sharing, then IMO this is too much change, and unnecessarily
> > complicates other valid usage.
>
> Yes, that (and references to any other source files, whether those
> scanned by -F or -R or simply because they are reachable on the file
> system) is the problem that is being solved.

Files "simply reachable on the file system" are not indexed or sought
by debuginfod, unless (a) they are scanned due to being listed with an
explicit PATH, OR (b) they are referred to from within an -F DWARF file
as a source. That is all. There is nothing else "reachable".

> > If you are certain that source file censorship needs to be in the
> > code, I'd do it instead by adding just one option -S PATH to the code,
> > which would act like a whitelist for -F source file retrievals.
> > (There is no point to filtering -R rpm source files; those are only
> > serviced from other indexed RPMs.)
>
> By default all -F directories are already whitelisted. -S is just for
> extra places where source could be found.

We are speaking about hypothetical work, so "are already" is
incorrect. There is no whitelist of source files from -F type
searches "already".

Contemplating a whitelist: it may easily be the case that the sources
are relatively far from the build tree being scanned - indeed separate
sources is how we recommend GNU tools be built. Constructing the
whitelist from the -F paths only is bound to be incomplete in this
common usage scenario.

> > So:
> >     debuginfod -S /usr/src/debug -S /usr/include -F PATH1 PATH2 ... PATHn
> > would restrict -F source service to the given paths, and
> >     debuginfod -F PATH1 PATH2
> > would not, because normal people have trustworthy build systems etc.
>
> I guess we differ on how trustworthy generated debug files are.

All this work depends on debug files being trustworthy! The man pages
spell this out already. Imagine a doctored debug file deliberately
conflicting with a well-known buildid. Or deliberately containing
masses of garbage or harmful data.

> What I would like is:
>
> - By default only restrict the files served to those under the
>   directory that the file-scanner uses (that is why I split the
>   -R and -F cases).

Why? This is a tighter constraint than the problem statement at the top.
The only additional risk here would be from a file-scanner-found
DWARF file that makes a source reference to a file in a directory
that was already explicitly identified for RPM scanning, i.e., not a
sensitive location.

> - Have a more restrictive mode that simply doesn't add anything
>   to the sources white list (that is -N in my patch).
> - Have an anything goes mode (that is -A in my patch).
> - Be able to whitelist more selectively (that is -S).

IMHO, this is unnecessary complication. Maybe you'll see this if you
write out documentation and sample usage for all these cases.

> If I understand you correctly (given your other email in reply to
> why adding globbing support isn't enough), you also want a mode
> where all extra arguments on the command line are interpreted as
> "scannable" (either file based or rpm based).

This is the normal behaviour for unix tools.

> So I think the real issue is the splitting of -R and -F argument
> parsing. If that is the case, maybe just picking a default for how to
> interpret the extra arguments, as dirs for the file scanner or dirs for
> the rpm scanner or both, might make us both happy?

The branch code does "both", because it is simple.

- FChE
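
For illustration only: a minimal Python sketch of how the hypothetical
-S whitelist discussed in this thread might filter -F source paths.
This is not debuginfod code. The function name is_source_allowed is
made up, and the rule that an empty whitelist means "no restriction"
is an assumption read off the two example command lines quoted above.

```python
import os.path


def is_source_allowed(source_path, allowed_dirs):
    """Decide whether a DWARF-referenced source path may be served.

    Hypothetical model of an -S PATH whitelist:
    - no -S options given (empty list): nothing is restricted;
    - otherwise the path must lie under one of the listed directories.
    """
    if not allowed_dirs:
        return True  # assumed default: plain "-F PATH1 PATH2" restricts nothing

    # Only absolute paths are considered; ".." segments are collapsed first
    # so that /usr/include/../../etc/passwd cannot slip through.
    if not os.path.isabs(source_path):
        return False
    path = os.path.normpath(source_path)

    for directory in allowed_dirs:
        if not os.path.isabs(directory):
            continue  # ignore malformed whitelist entries in this sketch
        root = os.path.normpath(directory)
        # commonpath compares whole components, so /usr/include2 is not
        # treated as being under /usr/include.
        if os.path.commonpath([path, root]) == root:
            return True
    return False
```

Note that this sketch does not resolve symlinks, and it shows the
incompleteness concern raised above: a source tree kept apart from the
scanned build tree is rejected unless its directory is listed too.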