It is becoming increasingly common for academics to engage in research into their own teaching, usually by introducing some form of innovation into one of their courses. When doing so, they are often inclined by disciplinary tradition, or urged by senior colleagues, to use experimental designs so that any observed outcomes can be attributed to the innovation itself. This article points out the problems with experimental designs for naturalistic studies of innovative teaching in higher education. A genuine control is impossible. Practical difficulties in separating groups often result in contamination of designs. Educational issues are complex, with many variables involved. Experimental designs with limited numbers of cells therefore oversimplify, because they deal with only a few of the relevant factors. Particular types of innovation are not precisely reproducible, so generalisation can be misleading. As an alternative, triangulation across multi-method evaluations from several sources is recommended, with the aim of establishing evidence beyond reasonable doubt. Comparison and synthesis across related projects appear to be a promising way to derive recommendations on how to formulate important aspects of innovations.