On Monday, 2020-09-07 at 11:39:32 -04, Alexander Bulekov wrote:
> On 200902 1103, Darren Kenny wrote:
...
>> > +
>> > +    while (ind >= 0 && fuzzable_memoryregions->len) {
>> > +        *result = (address_range){0, 0};
>> > +        mr = g_ptr_array_index(fuzzable_memoryregions, i);
>> > +        if (mr->enabled) {
>> > +            abs_addr = mr->addr;
>> > +            for (root = mr; root->container; ) {
>> > +                root = root->container;
>> > +                abs_addr += root->addr;
>> > +            }
>> > +            /*
>> > +             * Only consider the region if it is rooted at the io_space we want
>> > +             */
>> > +            if (root == io_space) {
>> > +                hwaddr xlat, len;
>> > +                if(address_space_translate(as, abs_addr, &xlat, &len, true, MEMTXATTRS_UNSPECIFIED) == mr){
>> > +                    ind--;
>>
>> I'm wondering what is the purpose of ind, we never really do anything
>> with it except possibly decrement it here and test in the while
>> condition.
>>
>> With candidate_regions also only being incremented here, we could just
>> as easily compare that against index.
>>
>
> Yes it is not clear. The overall idea here is:
> * fuzzable_memoryregions contains regions that belong both to the
>   Memory/MMIO AddressSpace and the PIO AddressSpace.
> * Thus fuzzable_mr can look like [PIO_1, MMIO_1, MMIO_2, PIO_2, PIO_3]
> * If index == 4 and we want an MMIO region, we need to use that as an
>   index into the sub-array of only MMIO-Type regions
>
> I think instead, I should either
> 1. Have separate arrays for PIO/MMIO MRs. This will simplify this
>    function, but I'm also not sure whether it is always possible to
>    identify whether the mr is pio/mmio (e.g. when a PCI BAR has not yet
>    been mapped)
> 2. Have a single read/write operation instead of in/out and read/write.
>    Then, instead of differentiating between MMIO and PIO here, we could
>    do that in the OP.
> 3. Instead of keeping track of MemoryRegions here, try instead to walk
>    the corresponding "flatview" and match the memory-region pointers.
>
> I'll try out (3) first. hopefully that will clear this up and make
> everything more legible.

OK, thanks.

...
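For reference, the "index into the sub-array" idea described above could be sketched roughly as follows. This is only an illustration and not code from the patch: struct region, enum region_kind and pick_region() are hypothetical stand-ins rather than QEMU types, and wrapping the index with a modulo is my assumption about the intended behaviour.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not QEMU types. */
enum region_kind { KIND_PIO, KIND_MMIO };

struct region {
    enum region_kind kind;
    const char *name;
};

/*
 * Treat `index` as an index into the sub-array of regions of the
 * requested kind. The index wraps around (assumption), so any fuzzer
 * supplied value selects some region. Returns NULL if the array holds
 * no region of that kind.
 */
static const struct region *pick_region(const struct region *regions,
                                        size_t n, enum region_kind kind,
                                        size_t index)
{
    size_t matches = 0;
    size_t target;
    size_t i;

    /* First pass: count the candidates of the requested kind. */
    for (i = 0; i < n; i++) {
        if (regions[i].kind == kind) {
            matches++;
        }
    }
    if (matches == 0) {
        return NULL;
    }

    /* Second pass: walk to the (index % matches)-th candidate. */
    target = index % matches;
    for (i = 0; i < n; i++) {
        if (regions[i].kind != kind) {
            continue;
        }
        if (target == 0) {
            return &regions[i];
        }
        target--;
    }
    return NULL; /* not reached when matches > 0 */
}
```

With the example array [PIO_1, MMIO_1, MMIO_2, PIO_2, PIO_3], index == 4 and an MMIO request, there are two candidates, so this sketch picks MMIO_1 (4 % 2 == 0). Counting candidates first means a single counter does the job that ind and candidate_regions share in the quoted loop.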
>> > +/*
>> > + * Here, we interpret random bytes from the fuzzer, as a sequence of commands.
>> > + * Our commands are variable-width, so we use a separator, SEPARATOR, to specify
>> > + * the boundaries between commands. This is just a random 32-bit value, which
>> > + * is easily identified by libfuzzer+AddressSanitizer, as long as we use
>> > + * memmem. It can also be included in the fuzzer's dictionary. More details
>> > + * here:
>> > + * https://github.com/google/fuzzing/blob/master/docs/split-inputs.md
>> > + *
>> > + * As a result, the stream of bytes is converted into a sequence of commands.
>> > + * In a simplified example where SEPARATOR is 0xFF:
>> > + * 00 01 02 FF 03 04 05 06 FF 01 FF ...
>> > + * becomes this sequence of commands:
>> > + * 00 01 02    -> op00 (0102)   -> in (0102, 2)
>> > + * 03 04 05 06 -> op03 (040506) -> write (040506, 3)
>> > + * 01          -> op01 (-,0)    -> out (-,0)
>> > + * ...
>> > + *
>> > + * Note here that it is the job of the individual opcode functions to check
>> > + * that enough data was provided. I.e. in the last command out (,0), out needs
>> > + * to check that there is not enough data provided to select an address/value
>> > + * for the operation.
>> > + */
>>
>> Out of curiosity, do any of our corpus actually make use of the FUZZ string,
>> or are we just falling back to always using the full size of the provided
>> input?
>>
>
> Do you mean if there is some case where "FUZZ" needs to be used as a
> real value, rather than a magical separator?
>
> Or are you asking whether the fuzzer actually generates inputs with the
> "FUZZ" separator?
> With ASan enabled, libfuzzer immediately figures out that "FUZZ" is an
> interesting string (because it instruments memmem) and starts inserting
> it all over the place. Without --enable-sanitizers, I add it to a fuzzer
> dictionary for the same effect (see bullet-point 1 in PATCH v2 00/15).
Should have responded to this, saw that you also used FUZZ later in the
patchset when I finally got there :)

Thanks,

Darren.
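
For anyone following the thread, the separator-based splitting described in the quoted comment could look roughly like the sketch below. It is not the code from the patch: struct cursor, find_bytes() and next_command() are hypothetical names, find_bytes() is a portable stand-in for memmem() (which is a non-standard extension), and skipping empty commands is my assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cursor over the not-yet-consumed part of the fuzzer input. */
struct cursor {
    const unsigned char *data;
    size_t size;
};

/* Portable stand-in for memmem(): first occurrence of needle in hay. */
static const unsigned char *find_bytes(const unsigned char *hay,
                                       size_t hay_len,
                                       const unsigned char *needle,
                                       size_t needle_len)
{
    size_t i;

    if (needle_len == 0 || hay_len < needle_len) {
        return NULL;
    }
    for (i = 0; i + needle_len <= hay_len; i++) {
        if (memcmp(hay + i, needle, needle_len) == 0) {
            return hay + i;
        }
    }
    return NULL;
}

/*
 * Fetch the next non-empty command. Returns 1 and sets *cmd and *cmd_len
 * on success, or 0 once the input is exhausted. Input after the last
 * separator, if any, is treated as a final command.
 */
static int next_command(struct cursor *c,
                        const unsigned char *sep, size_t sep_len,
                        const unsigned char **cmd, size_t *cmd_len)
{
    while (c->size > 0) {
        const unsigned char *start = c->data;
        const unsigned char *hit = find_bytes(start, c->size, sep, sep_len);
        size_t len = hit ? (size_t)(hit - start) : c->size;
        size_t consumed = hit ? len + sep_len : len;

        c->data += consumed;
        c->size -= consumed;

        if (len > 0) { /* skip empty commands (assumption) */
            *cmd = start;
            *cmd_len = len;
            return 1;
        }
    }
    return 0;
}
```

On the 0xFF example from the comment, this yields three commands of 3, 4 and 1 bytes. In each one the first byte would be the opcode, and the opcode handler is the one that has to check whether the remaining bytes are enough for an address/value.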