On Wed, 27 Dec 2017 19:49:28 -0800 Alexei Starovoitov <a...@fb.com> wrote:
> On 12/27/17 5:38 PM, Masami Hiramatsu wrote: > > On Wed, 27 Dec 2017 14:49:46 -0800 > > Alexei Starovoitov <a...@fb.com> wrote: > > > >> On 12/27/17 12:09 AM, Masami Hiramatsu wrote: > >>> On Tue, 26 Dec 2017 18:12:56 -0800 > >>> Alexei Starovoitov <alexei.starovoi...@gmail.com> wrote: > >>> > >>>> On Tue, Dec 26, 2017 at 04:48:25PM +0900, Masami Hiramatsu wrote: > >>>>> Support in-kernel fault-injection framework via debugfs. > >>>>> This allows you to inject a conditional error to specified > >>>>> function using debugfs interfaces. > >>>>> > >>>>> Signed-off-by: Masami Hiramatsu <mhira...@kernel.org> > >>>>> --- > >>>>> Documentation/fault-injection/fault-injection.txt | 5 + > >>>>> kernel/Makefile | 1 > >>>>> kernel/fail_function.c | 169 > >>>>> +++++++++++++++++++++ > >>>>> lib/Kconfig.debug | 10 + > >>>>> 4 files changed, 185 insertions(+) > >>>>> create mode 100644 kernel/fail_function.c > >>>>> > >>>>> diff --git a/Documentation/fault-injection/fault-injection.txt > >>>>> b/Documentation/fault-injection/fault-injection.txt > >>>>> index 918972babcd8..6243a588dd71 100644 > >>>>> --- a/Documentation/fault-injection/fault-injection.txt > >>>>> +++ b/Documentation/fault-injection/fault-injection.txt > >>>>> @@ -30,6 +30,11 @@ o fail_mmc_request > >>>>> injects MMC data errors on devices permitted by setting > >>>>> debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request > >>>>> > >>>>> +o fail_function > >>>>> + > >>>>> + injects error return on specific functions by setting debugfs entries > >>>>> + under /sys/kernel/debug/fail_function. No boot option supported. > >>>> > >>>> I like it. > >>>> Could you document it a bit better? > >>> > >>> Yes, I will do in next series. > >>> > >>>> In particular retval is configurable, but without an example no one > >>>> will be able to figure out how to use it. > >>> > >>> Ah, right. BTW, as I pointed in the covermail, should we store the > >>> expected error value range into the injectable list? e.g. > >>> > >>> ALLOW_ERROR_INJECTION(open_ctree, -1, -MAX_ERRNO) > >>> > >>> And provide APIs to check/get it. > >> > >> I'm afraid such check would be too costly. > >> Right now we have only two functions marked but I expect hundreds more > >> will be added in the near future as soon as developers realize the > >> potential of such error injection. > >> All of ALLOW_ERROR_INJECTION marks add 8 byte overhead each to .data. > >> Multiple by 1k and we have 8k of data spent on marks. > >> If we add max/min range marks that doubles it for very little use. > >> I think marking function only is enough. > > > > Sorry, I don't think so. > > Even if it takes 16 bytes more for each points, I don't think it is > > any overhead for machines in these days. Even if so, we can provide > > a kconfig to reduce it. > > I mean, we are living in GB-order memory are, and it will be bigger > > in the future. Why we have to worry about hundreds of 16bytes memory > > pieces? It will take a few KB, and even if we mark thousands of > > functions, it never reaches 1MB, in GB memory pool. :) > > > > Of course, for many small-footprint embedded devices (like having > > less than 128MB memory), this feature can be a overhead. But they > > can cut off the table by kconfig. > > I still disagree on wasting 16-byte * num_of_funcs of .data here. > The trade-off of usability vs memory just not worth it. Sorry. > Please focus on testing your changes instead. Then what happen if the user set invalid retval to those functions? even if we limit the injectable functions, it can cause a problem, for example, obj = func_return_object(); if (!obj) { handling_error...; } obj->field = x; In this case, obviously func_return_object() must return NULL if there is an error, not -ENOMEM. But without the correct retval information, how would you check the BPF code doesn't cause a trouble? Currently it seems you are expecting only the functions which return error code. ret = func_return_state(); if (ret < 0) { handling_error...; } But how we can distinguish those? If we have the error range for each function, we can ensure what is *correct* error code, NULL or errno, or any other error numbers. :) At least fail_function needs this feature because it can check return value when setting it up. Thank you, -- Masami Hiramatsu <mhira...@kernel.org>