Some comments inline.
> On Feb 27, 2019, at 6:08 PM, Christopher Collins <[email protected]> wrote:
>
> Hello all,
>
> In this email, I would like to discuss Mynewt's test facilities.
>
> ### INTRO (AKA TL;DR)
>
> Unit tests come in two flavors: 1) simulated, and 2) real hardware.
> Writing a test capable of running in both environments is wasted
> effort. Let's revisit Mynewt's test facilities with a clear
> understanding that they must solve two different use cases.
>
> In this email, I want to:
> 1. Clarify the differences between the two test environments.
> 2. Propose some high-level changes to Mynewt's test facilities.
>
> ### TEST ENVIRONMENTS
>
> The two test environments are summarized below.
>
> ##### SIMULATED
>
> Simulated tests ("self tests"):
> * Are executed with `newt test`.
> * Automatically execute a set of tests. No user intervention
> required.
> * Report results via stdout/stderr and exit status.
> * Terminate when the tests are complete.
>
> A typical self test is a very basic application. Usually, a self test
> contains the package-under-test, the testutil package, and not much
> else. Most self tests don't even include the OS scheduler; they are
> simply a single-threaded sequence of test cases. For these simple
> tests, `sysinit()` can be executed between each test case, obviating
> the need for "setup" and "teardown" code. Self tests are allowed to
> crash; such crashes get reported as test failures. Finally, self tests
> have effectively unlimited memory at their disposal.
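To make the single-threaded, reset-between-cases flow concrete, here is a rough
sketch in plain C. None of this is the actual testutil API; the `fake_sysinit()`
stub just stands in for Mynewt's package reinitialization:

```c
#include <assert.h>

/* Stand-in for sysinit(): in a real self test this would reinitialize
 * every package's state between test cases. */
static int g_pkg_state;

static void
fake_sysinit(void)
{
    g_pkg_state = 0;
}

/* Each case assumes a clean state; no setup/teardown code is needed
 * because the reset happens before the case body runs. */
static int
test_case_a(void)
{
    fake_sysinit();
    g_pkg_state++;
    return g_pkg_state == 1;    /* Passes regardless of prior cases. */
}

static int
test_case_b(void)
{
    fake_sysinit();
    g_pkg_state += 2;
    return g_pkg_state == 2;    /* Would fail without the reset. */
}

/* A self test is just a single-threaded sequence of cases; the overall
 * result maps onto the process exit status. */
static int
run_all(void)
{
    return test_case_a() && test_case_b();
}
```

The point of the sketch: because every case starts from a reset, the cases
compose in any order, which is exactly what makes self tests cheap to write.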
>
> ##### REAL HARDWARE
>
> Test apps that run on real hardware ("hw tests"):
> * Run a test server. The user executes tests on demand with the
> newtmgr `run` command.
> * Typically contain test suites from several packages.
> * Record results to a Mynewt log.
> * Run indefinitely. The device does not reset itself between
> tests.
>
> Hw tests are more complex than self tests. These test apps need to
> include the scheduler, the newtmgr server, a transport (e.g., BLE), and
> a BSP and all its drivers. Test code cannot assume a clean state at
> the start of each test. Instead, the code must perform manual clean up
> after each test case. Hw test cases must be idempotent and they must
> not crash. Finally, memory is constrained in this environment.
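The manual-cleanup/idempotency requirement can be sketched like this (plain C;
the pool and the names are invented for illustration, not runtest's API). On
hardware the persistent resource might be mbufs or flash areas, and the key
property is that state survives across cases because the device never resets:

```c
#include <assert.h>

/* Pretend resource pool whose state persists across test cases. */
#define POOL_SIZE 4
static int g_in_use;

static int
pool_acquire(void)
{
    if (g_in_use >= POOL_SIZE) {
        return -1;              /* Exhausted: some earlier case leaked. */
    }
    g_in_use++;
    return 0;
}

static void
pool_release(void)
{
    g_in_use--;
}

/* Idempotent hw test case: every acquired resource is released before
 * returning, so the case can be run on demand any number of times. */
static int
test_pool_case(void)
{
    if (pool_acquire() != 0) {
        return 0;               /* Fail, but leak nothing further. */
    }
    /* ... exercise the package under test here ... */
    pool_release();             /* Manual cleanup; no reset is coming. */
    return 1;
}
```

A case that forgot the `pool_release()` would pass a few times and then start
failing, which is the classic symptom of non-idempotent hw test code.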
>
> ##### PREFER SELF
>
> Clearly, self tests are easier to write than hw tests. Equally
> important, self tests are easier to run - they can run in automated CI
> systems without the need for a complicated test setup.
>
> So self tests should always be preferred over hw tests. Hw tests should
> only be used when necessary, i.e., when testing drivers or the system
> as a whole. I won't go so far as to say there is never a reason to run
> the same test in both environments, but it is so rarely needed that it
> is OK if it requires some extra effort from the developer.
>
> ### TEST FACILITIES
>
> Mynewt has two unit test libraries: `testutil` and `runtest`.
>
> ##### TESTUTIL
>
> Testutil is a primitive unit test framework. There really isn't
> anything novel about this library - suites, cases, pass, fail, you get
> the idea :). This library is used in both test environments.
>
> I have one concern about this library:
>
> In self tests, does `sysinit()` get called automatically at the
> start of each test case?
>
> Currently, the answer is no.
>
> ##### RUNTEST
>
> Runtest is a grab bag of "other" test functionality:
>
> 1. Command handler for the `run` newtmgr command.
> 2. API for logging test results.
> 3. Generic task creation facility for multithreaded tests.
>
> Features 1 and 2 are only needed by hw tests. Feature 3 is needed by
> both self tests and hw tests.
>
> ### PROPOSALS
>
> 1. (testutil): Modify `TEST_CASE()` to call
> `sysinit()` if `MYNEWT_VAL(SELFTEST)` is enabled.
Would we want every TEST_CASE() to call sysinit() if a single syscfg setting is
defined? Maybe for some tests you want it and for some you don't?
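One way to square proposal 1 with the "some tests want it and some don't"
concern is an opt-out variant of the macro. The sketch below is simplified and
hypothetical (the real TEST_CASE() also registers the case and handles
failures, and MYNEWT_VAL(SELFTEST) is modeled as a plain #define):

```c
#include <assert.h>

#define SELFTEST 1              /* Stand-in for MYNEWT_VAL(SELFTEST). */

static int g_sysinit_calls;

static void
sysinit_stub(void)              /* Stand-in for sysinit(). */
{
    g_sysinit_calls++;
}

/* Default: call sysinit() automatically when building as a self test. */
#define TEST_CASE(name)                         \
    static void name##_body(void);              \
    void name(void) {                           \
        if (SELFTEST) {                         \
            sysinit_stub();                     \
        }                                       \
        name##_body();                          \
    }                                           \
    static void name##_body(void)

/* Opt-out variant for cases that manage initialization themselves. */
#define TEST_CASE_NO_INIT(name)                 \
    static void name##_body(void);              \
    void name(void) { name##_body(); }          \
    static void name##_body(void)

TEST_CASE(case_auto) {
    /* Runs after an automatic sysinit(). */
}

TEST_CASE_NO_INIT(case_manual) {
    /* Runs with whatever state the test left in place. */
}
```

With this shape the syscfg setting only controls the default; individual cases
can still opt out, so a single setting doesn't force the behavior everywhere.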
>
> 2. (runtest): Remove the task-creation functionality from `runtest`.
> This functionality can go in a new library (call it "taskpool" for the
> sake of discussion).
>
> 3. (taskpool) Further, I think the taskpool functionality could be
> tailored specifically towards unit tests. The remainder of this email
> deals with this proposal.
>
> I will use an example as motivation for this library. Here is a simple
> test which uses our to-be-defined taskpool library. This example tests
> that a `cbmem` can handle multiple tasks writing to it at the same
> time. Note: this is not meant to be good test code :).
>
>     static struct cbmem cbm;
>
>     /**
>      * Low-priority task handler.  Writes 20 entries to the cbmem.
>      * Uses a rate of 1 OS tick per write.
>      */
>     static void
>     task_lo(void *arg)
>     {
>         int i;
>
>         for (i = 0; i < 20; i++) {
>             cbmem_append(&cbm, "hello from task_lo", 18);
>             os_time_delay(1);
>         }
>     }
>
>     /**
>      * High-priority task handler.  Writes 10 entries to the cbmem.
>      * Uses a rate of 2 OS ticks per write.
>      */
>     static void
>     task_hi(void *arg)
>     {
>         int i;
>
>         for (i = 0; i < 10; i++) {
>             cbmem_append(&cbm, "hello from task_hi", 18);
>             os_time_delay(2);
>         }
>     }
>
>     TEST_CASE(cbmem_thread_safety) {
>         /* Initialize cbmem. */
>         uint8_t buf[1000];
>         cbmem_init(&cbm, buf, sizeof buf);
>
>         /* Create and run the two test tasks. */
>         taskpool_create(task_hi, 10 /* Priority */);
>         taskpool_create(task_lo, 20 /* Priority */);
>
>         /* Wait for both tasks to complete.  Fail after three seconds. */
>         taskpool_wait(3 * OS_TICKS_PER_SEC);
>
>         /* Test result automatically reported here. */
>     }
>
> This example illustrates a few requirements of taskpool:
>
> 1) A taskpool task (tp-task) is allowed to "run off" the end of its
> handler function. Normally, it is a fatal error if a Mynewt
> task handler returns.
>
I'm not sure why this bothers me, but is there really a need to have a task
simply return?
> 2) Client code can block until all tp-tasks have terminated.
>
> In multithreaded test code, tasks are typically short-lived and simple.
> Mynewt's test facilities should make it easy to create such tasks.
>
> There is one more requirement that isn't expressed in the example, but
> which is important:
>
> 3) Client code can abort all tp-tasks before they return. This is
> needed when a fatal test failure occurs.
>
> I have some ideas for how this could be implemented, but I think that
> is best left for another email (or a PR!). Any concerns or ideas about
> implementation are welcome, but I am particularly interested in high
> level criticism:
>
> * Does this seem useful for unit testing?
> * Is any major functionality missing?
> * Any completely different ideas that are better than this one?
>
Seems fairly reasonable. Is taskpool_create() designed to save the test
developer some time creating tasks, or does it have some other functionality?
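My reading is that it is mostly boilerplate savings (stack allocation, task
bookkeeping) plus the wait/abort semantics from requirements 1-3. A rough
sketch of what I imagine, using pthreads to stand in for Mynewt tasks in the
simulated environment; none of these names are real Mynewt or runtest APIs,
and the signature is simplified (no priority argument, no wait timeout):

```c
#include <pthread.h>
#include <stdbool.h>
#include <assert.h>

#define TASKPOOL_MAX 8

static pthread_t g_tasks[TASKPOOL_MAX];
static int g_num_tasks;
static volatile bool g_abort;   /* Requirement 3: handlers poll this to
                                 * bail out early on fatal failure. */

struct taskpool_entry {
    void (*handler)(void *arg);
    void *arg;
};

/* Requirement 1: the wrapper absorbs the handler's return, so "running
 * off the end" is legal instead of a fatal error. */
static void *
taskpool_wrapper(void *arg)
{
    struct taskpool_entry *entry = arg;

    entry->handler(entry->arg);
    return NULL;
}

static int
taskpool_create(struct taskpool_entry *entry)
{
    if (g_num_tasks >= TASKPOOL_MAX) {
        return -1;
    }
    return pthread_create(&g_tasks[g_num_tasks++], NULL,
                          taskpool_wrapper, entry);
}

/* Requirement 2: block until all tp-tasks have terminated.  A real
 * implementation would also honor a timeout (e.g. via a condition
 * variable); omitted here for brevity. */
static void
taskpool_wait(void)
{
    int i;

    for (i = 0; i < g_num_tasks; i++) {
        pthread_join(g_tasks[i], NULL);
    }
    g_num_tasks = 0;
}

/* Demo handler: bumps a counter and runs off the end. */
static int g_counter;

static void
demo_handler(void *arg)
{
    (void)arg;
    __sync_fetch_and_add(&g_counter, 1);
}

static struct taskpool_entry g_demo_entries[2] = {
    { demo_handler, NULL },
    { demo_handler, NULL },
};
```

On real hardware the pool would also own the task stacks, which is where most
of the per-test boilerplate savings would come from.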
> Thanks,
> Chris