There are two broad kinds of test code: black box and white box. The latter is sometimes called glass box testing, although the two are distinguished here. Black-box unit tests exercise functionality only at the interface boundaries. Nearly all unit tests are structured as black-box tests, because doing so encourages software modularity and forces an emphasis on the interface of the module. White-box testing occurs when your tests can both observe and mutate state belonging to the software under test. These kinds of tests are strongly discouraged, because a bug in the test itself can corrupt that state and cause subtle failures. Glass-box testing occurs when your tests can only observe, but not mutate, the state belonging to the production code. Applications of glass-box testing include low-level verification of the internal state a function leaves behind. For example, verifying that a skip list's links are properly set is vital to confirming that a skip-list implementation operates correctly.
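The difference between a black-box and a glass-box test can be sketched in Python as follows. This is only an illustration under stated assumptions: `SortedLinkedList` is a hypothetical, much simpler stand-in for the skip list mentioned above, and the names `_head`, `insert` and the test functions are invented for the example.

```python
class SortedLinkedList:
    """Hypothetical stand-in for a skip list: a sorted singly linked list."""

    class _Node:
        def __init__(self, value, next_node=None):
            self.value = value
            self.next = next_node

    def __init__(self):
        self._head = None  # private state, not part of the published interface

    def insert(self, value):
        # Place the new node before the first node whose value is >= value.
        if self._head is None or value <= self._head.value:
            self._head = self._Node(value, self._head)
            return
        node = self._head
        while node.next is not None and node.next.value < value:
            node = node.next
        node.next = self._Node(value, node.next)

    def __contains__(self, value):
        node = self._head
        while node is not None and node.value <= value:
            if node.value == value:
                return True
            node = node.next
        return False


def test_black_box_membership():
    # Black box: only the published interface (insert, "in") is used.
    items = SortedLinkedList()
    for value in (3, 1, 2):
        items.insert(value)
    assert 2 in items
    assert 5 not in items


def test_glass_box_links_are_sorted():
    # Glass box: reads the private links but never modifies them.
    items = SortedLinkedList()
    for value in (3, 1, 2):
        items.insert(value)
    seen = []
    node = items._head
    while node is not None:
        seen.append(node.value)
        node = node.next
    assert seen == [1, 2, 3]
```

A white-box test would go one step further and assign to `_head` or to a node's `next` field, which is the practice the text discourages.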

Test-suite code clearly has to be able to access the code it is testing. In almost every case, this access occurs through the published interface of function, procedure, or method calls. The use of "mock objects" helps information hiding remain intact and supports a clean separation of concerns.

Unit test code for TDD is almost never written within the same project or module as the code being tested. By placing tests in a separate module or library, the production code remains pristine. Placing the TDD code inside the same module would fundamentally alter the production code, and stripping it out again with conditional compilation directives can introduce subtle bugs.

Some may object that strict black-box testing does not provide access to private data and methods. This is intentional: as the software evolves, you may find that the implementation of a class changes fundamentally. Remember that a critical step of test-driven development is to refactor. Refactoring may introduce changes that add or remove private members, or alter an existing member's type. These changes ought not to break existing tests. Unit tests that exploit glass-box testing are highly coupled to the production software; changing the implementation of a class or module may mean that you must also update or discard existing tests, which a pure refactoring should never require. For this reason, glass-box testing must be kept to the minimum possible. White-box testing should never be used in test-driven development.
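The coupling problem can be sketched as follows. The two classes are hypothetical: `WordCounterV1` represents an implementation before refactoring and `WordCounterV2` the same class afterwards, with its private list replaced by a dictionary. All names are invented for the illustration.

```python
class WordCounterV1:
    """Hypothetical original implementation: keeps every word in a list."""

    def __init__(self):
        self._words = []

    def add(self, word):
        self._words.append(word)

    def count(self, word):
        return self._words.count(word)


class WordCounterV2:
    """Hypothetical refactoring: the private list becomes a dict of tallies."""

    def __init__(self):
        self._tallies = {}

    def add(self, word):
        self._tallies[word] = self._tallies.get(word, 0) + 1

    def count(self, word):
        return self._tallies.get(word, 0)


def check_counts(counter_class):
    # Black box: uses only add() and count(), so it holds for both versions.
    counter = counter_class()
    for word in ("a", "b", "a"):
        counter.add(word)
    assert counter.count("a") == 2
    assert counter.count("c") == 0


def glass_box_check(counter_class):
    # Glass box: reads the private list, so it is tied to the V1 layout.
    counter = counter_class()
    counter.add("a")
    return counter._words == ["a"]
```

Under these assumptions `check_counts` passes unchanged for both classes, whereas `glass_box_check` raises `AttributeError` for `WordCounterV2` even though the observable behaviour is identical.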

In all cases, thought must be given to the question of deployment. The best approach is to develop your software so that it has three major components. The first is the unit test runner application framework itself. The second is the main entry module for the production logic. The third is a set of one or more libraries, each implementing some or all of the business logic under development, to which both of the other modules link (preferably dynamically). This arrangement keeps the components modular and makes the production code straightforward to deploy without the tests.

Fakes, mocks and integration tests

Unit tests are so named because they each test one unit of code. Whether a module of code has hundreds of unit tests or only five is irrelevant. A test suite should never cross process boundaries in a program, let alone network connections. Doing either introduces delays, which make tests run slowly, which in turn discourages developers from running the whole suite. Introducing dependencies on external modules or data also turns unit tests into integration tests. If one module misbehaves in a chain of inter-related modules, it may not be clear where to look for the cause of the failure.

When code under development relies on a database or a web service or any other external process or service, enforcing a unit-testable separation is an opportunity and a driving force to design more modular, more testable and more re-usable code[9]. Two steps are necessary:

  1. Whenever external access is going to be needed in the final design, an interface should be defined that describes the access that will be available.
  2. The interface should be implemented in two ways: one implementation really accesses the external process, and the other is a fake or mock object. Fake objects need do little more than add a message such as "Person object saved" to a trace-log or to the console. Mock objects differ in that they themselves contain test assertions that can make the test fail, for example, if the person's name and other data are inconsistent. Fake and mock object methods that return data, ostensibly from a data store or user, can help the test process by always returning the same, realistic data that tests can rely upon. They can also be set into pre-defined fault-modes so that error handling routines can be developed and reliably tested.
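The two steps above can be sketched in Python as follows. This is a minimal illustration, not a prescribed design: `Person`, `PersonStore`, `FakePersonStore`, `MockPersonStore` and `register` are all hypothetical names, the `fail` flag is one assumed way to model a pre-defined fault mode, and the implementation that really accesses the external process is omitted.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int


class PersonStore(ABC):
    """Step 1: an interface describing the external access the design needs."""

    @abstractmethod
    def save(self, person: Person) -> None:
        ...


class FakePersonStore(PersonStore):
    """Fake: records a trace message instead of touching a database."""

    def __init__(self, fail: bool = False):
        self.trace = []
        self.fail = fail  # assumed fault mode: simulate an unavailable store

    def save(self, person: Person) -> None:
        if self.fail:
            raise ConnectionError("simulated store outage")
        self.trace.append("Person object saved")


class MockPersonStore(PersonStore):
    """Mock: carries its own assertions about the data it receives."""

    def __init__(self):
        self.saved = []

    def save(self, person: Person) -> None:
        assert person.name.strip(), "name must not be blank"
        assert person.age >= 0, "age must not be negative"
        self.saved.append(person)


def register(store: PersonStore, name: str, age: int) -> bool:
    """Code under test: returns False if the store is unavailable."""
    try:
        store.save(Person(name.strip(), age))
    except ConnectionError:
        return False
    return True


def test_register_saves_person():
    store = FakePersonStore()
    assert register(store, " Ada ", 36)
    assert store.trace == ["Person object saved"]


def test_register_reports_outage():
    # The fault mode lets the error-handling path be tested reliably.
    assert register(FakePersonStore(fail=True), "Ada", 36) is False


def test_register_passes_clean_data():
    store = MockPersonStore()
    register(store, " Ada ", 36)
    assert store.saved == [Person("Ada", 36)]
```

Because `register` depends only on the `PersonStore` interface, none of these tests crosses a process boundary, and the real implementation can be substituted later without changing the code under test.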

A corollary of this approach is that the actual database or other external-access code is never tested by the TDD process itself. To cover this gap, other tests are needed that instantiate the test-driven code with the 'real' implementations of the interfaces discussed above. Many developers find it useful to keep these tests quite separate from the TDD unit tests, and refer to them as integration tests. There will be fewer of them, and they need to be run less often than the unit tests. They can nonetheless be implemented using the same testing framework, for example xUnit.
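An integration test of this kind might look like the sketch below. It rests on several assumptions: `SqlitePersonStore` and `register` are hypothetical names, Python's standard `sqlite3` module stands in for whatever data store a project really uses, and an in-memory database is chosen only to keep the example self-contained.

```python
import sqlite3


class SqlitePersonStore:
    """Hypothetical 'real' implementation backed by an SQLite connection."""

    def __init__(self, connection):
        self.connection = connection
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS person "
            "(name TEXT NOT NULL, age INTEGER NOT NULL)"
        )

    def save(self, name, age):
        self.connection.execute(
            "INSERT INTO person (name, age) VALUES (?, ?)", (name, age)
        )
        self.connection.commit()


def register(store, name, age):
    """Test-driven code: depends only on an object with a save() method."""
    store.save(name.strip(), age)


def test_register_writes_to_real_store():
    # Integration test: exercises register() together with the real store.
    connection = sqlite3.connect(":memory:")
    try:
        register(SqlitePersonStore(connection), " Ada ", 36)
        rows = connection.execute("SELECT name, age FROM person").fetchall()
        assert rows == [("Ada", 36)]
    finally:
        connection.close()
```

Keeping such a test in a separate suite lets the fast unit tests run on every change while this slower check runs less often.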

Integration tests that alter any persistent store or database should always be designed to leave it in a state ready for re-use, even if a test fails. This can be achieved using some combination of the following techniques where relevant and available to the developer.