"Serdar Tumgoren" <zstumgo...@gmail.com> wrote
Simple enough in theory, but the problem I'm hitting is where to begin
with tests on data that is ALL over the place.

I've just spent the last 2 days in a workshop with more than 30 members of our company's end-to-end test team. These guys are professional testers; all they do is test software systems. Mostly they work on large-scale systems comprising many millions of lines of code on multiple OSes and physical servers/networks. It was very interesting working with them and learning more about the techniques they use. Some of these involve very esoteric (and expensive!) tools. However, much of it is applicable to smaller systems.

Two key principles they apply:

1) Define the "System Under Test" (SUT) and treat it as a black box.
This involves defining your test boundary and working out what all the inputs and all the outputs are. Then you create a matrix mapping every set of inputs (the input vector) to every corresponding set of outputs (the output vector). The set of functions which maps the inputs to the outputs is known as the transfer function matrix, and if you can define it mathematically it becomes possible to automate a complete test cycle. Unfortunately it's virtually never definable in real terms, so we come to point 2...
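
As a rough illustration of mapping input vectors to output vectors, here is a minimal Python sketch. The SUT (`classify_customer`), its rules and the table rows are all invented for the example; a real system would have far more inputs, which is exactly why the full matrix is rarely practical.

```python
# Hypothetical SUT: the test code below only sees inputs and outputs.
def classify_customer(age, is_member):
    """Return (category, discount_percent) for a customer."""
    category = "child" if age < 18 else "adult"
    discount = 10 if is_member else 0
    return category, discount


# Each row maps one input vector to its expected output vector.
TEST_MATRIX = [
    ((5, False), ("child", 0)),
    ((5, True), ("child", 10)),
    ((40, False), ("adult", 0)),
    ((40, True), ("adult", 10)),
]


def run_matrix(sut, matrix):
    """Run every row; return a list of (inputs, expected, actual) failures."""
    failures = []
    for inputs, expected in matrix:
        actual = sut(*inputs)
        if actual != expected:
            failures.append((inputs, expected, actual))
    return failures
```

Calling `run_matrix(classify_customer, TEST_MATRIX)` returns an empty list when every row matches, and the same table can be reused unchanged against a rewritten SUT.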

2) Use risk based testing
This means looking at what is most likely to break and focusing effort on those areas. Common breakage areas are poor data quality and faulty interfaces. So test the inputs and outputs thoroughly.

In such a case, should I be writing test cases for *expected* inputs
and then coding the parser portion of my program to handle the
myriad of possible "bad" data?

Or given the range of possible inputs, should I simply skip testing
for valid data at the parser level, and instead worry about flagging
(or otherwise handling) invalid input at the database-insertion layer
(and therefore write tests at that layer)?

The typical way of testing inputs with ranges is to test
just below the lower boundary, just above the lower boundary, the midpoint, just below the upper boundary, just above the upper boundary, known invalid values, and "wildly implausible" values.

Thus for an input that can accept values between 1 and 100 you would test 0, 1, 50, 100, 101, -50 and 'five', say.

It's not exhaustive but it covers a range of valid and invalid data points.
You could also test very large data values such as
12165231862471893479073407147235801787789578917897
which will check for buffer and overflow type problems.
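
The 1 to 100 example can be sketched in Python roughly as follows. The parser (`parse_value`) is a made-up stand-in for whatever your real parser does, and the assumption that it rejects bad data by raising ValueError is mine. Note that Python integers don't overflow by themselves, so the very large value mostly matters where the data later lands in a fixed-size field, such as a database column.

```python
def parse_value(raw, low=1, high=100):
    """Hypothetical parser: return an int in [low, high] or raise ValueError."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        raise ValueError(f"not a number: {raw!r}")
    if not low <= value <= high:
        raise ValueError(f"out of range: {value}")
    return value


def accepts(raw):
    """Return True if the parser accepts raw, False if it rejects it."""
    try:
        parse_value(raw)
        return True
    except ValueError:
        return False


# (input, should the parser accept it?)
CASES = [
    (0, False),       # just below the lower boundary
    (1, True),        # just above the lower boundary
    (50, True),       # midpoint
    (100, True),      # just below the upper boundary
    (101, False),     # just above the upper boundary
    (-50, False),     # known invalid value
    ("five", False),  # wildly implausible value
    (12165231862471893479073407147235801787789578917897, False),  # huge
]


def check_cases():
    """Return the cases where the parser disagreed with the expectation."""
    return [(raw, expected) for raw, expected in CASES
            if accepts(raw) != expected]
```

An empty list from `check_cases()` means every boundary case behaved as expected; anything else tells you which input to look at first.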

But the point is you apply intelligence to determine the most likely forms of data error and test those values, not every possible value.

Or should I not be testing data values at all, but rather the results
of actions performed on that data?

Data errors are a high-risk area and therefore should always be tested.
Look to automate the testing if at all possible, and write a program to generate the data sets used and ideally to generate the expected output data too. But that's hard, since presumably you need the SUT to do that!
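
For simple range rules the expected output can come from the rule itself rather than from the SUT. A minimal sketch of such a generator, with invented function names and an arbitrary choice of five extra random values:

```python
import random


def boundary_values(low, high):
    """Return boundary-style test points for an inclusive integer range."""
    return [low - 1, low, (low + high) // 2, high, high + 1]


def generate_dataset(low, high, extra_random=5, seed=0):
    """Return a list of (value, expected_valid) pairs.

    expected_valid is derived from the range rule, not from the SUT,
    so the data set can act as an independent check.
    """
    rng = random.Random(seed)  # fixed seed keeps the test run repeatable
    span = high - low + 1
    values = boundary_values(low, high)
    # Add a few random values from a wider window around the valid range.
    values += [rng.randint(low - span, high + span)
               for _ in range(extra_random)]
    return [(v, low <= v <= high) for v in values]
```

Each pair from `generate_dataset(1, 100)` can then be fed to the parser and the result compared with the expected flag. For anything more complicated than a range check the expected values usually have to be worked out by hand or by a second, independent implementation.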

It seems like these questions must be a subset of the issues in the
realm of testing. Can anyone recommend a resource on the types of
tests that should be applied to the various tasks and stages of the
development process?

A friend recommended The Art of Software Testing -- is that the type
of book that covers these issues?

Yes, and it is one of the standard texts.
But most general software engineering texts cover testing issues.
For example try Sommerville, Pressman, McConnell etc.

suitable alternative that costs less than $100?

Most of the mentioned authors have at least 1 chapter on testing.

HTH,

PS. It was also interesting to hear how testing has moved on from the days I spent as a tester at the beginning of my career in software engineering. In particular the challenges of Agile techniques for E2E testing and the move away from blanket testing to risk-based testing, as well as the change in emphasis from "try to break it" (the dominant advice in my day) to "check it breaks cleanly under real loads".

--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
