I wrote a program/library that I used to obtain results in an article. (Here it is, but my question is general.) I have tests that I run regularly using ctest (the suite takes a few minutes to run). To reproduce some of the tables or figures in the article, I have to write a script or a simple driver program that runs for maybe 10 minutes, sometimes more, so I don't want it to be part of the regular test suite. At the same time, I want to make sure that the results from the article:
- can be reproduced later
- still come out the same (and correct) as I keep developing the library
Currently I try to have a small driver program that I run as part of the regular test suite, and if I want to reproduce results from the article, I uncomment some lines in there. Of course, I never know exactly which lines to uncomment, or whether I have to tweak some other parameters to get precisely the same results as in the article.
I also tried to have a Python script that calculates the exact figures/tables from the article. Such a script typically stops working after an update to the library, because it is not run on a regular basis (it takes too much time).
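To make the "same results" requirement concrete, here is a minimal sketch of the kind of check such a script could end with. The case names, the reference numbers, and `compute_results` are hypothetical placeholders, not values or functions from my library; the tolerance is also just an assumption.

```python
import math

# Hypothetical reference values, as they would be copied from a table in the article.
REFERENCE = {"case_a": 1.0, "case_b": 0.5}


def compute_results():
    """Stand-in for the expensive library call (hypothetical)."""
    return {"case_a": 1.0 + 1e-12, "case_b": 0.5}


def find_mismatches(computed, reference, rel_tol=1e-8):
    """Return the names of entries that are missing or differ beyond rel_tol."""
    return [
        name
        for name, expected in reference.items()
        if name not in computed
        or not math.isclose(computed[name], expected, rel_tol=rel_tol)
    ]


if __name__ == "__main__":
    bad = find_mismatches(compute_results(), REFERENCE)
    if bad:
        # A non-zero exit status lets a test runner report the failure.
        raise SystemExit(f"results differ from the article: {bad}")
    print("all results match the article")
```

Comparing with a relative tolerance rather than exact equality is a choice made for this sketch, since floating-point results may change slightly between compilers or library versions.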
The best method that has occurred to me is to have a Fortran (or C/C++) example that is regularly compiled (as part of the library) but not run in the regular test suite. That way, at least I know that it compiles fine (and thus hopefully also runs). I would then test a simpler (smaller) example as part of the regular test suite.
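The "small example in the regular suite, full example on demand" idea could look roughly like the following sketch, written in Python for brevity rather than Fortran. The parameter sets, the environment variable name `ARTICLE_MODE`, and `run_example` (a toy midpoint-rule integral) are all made up for illustration.

```python
import os

# Hypothetical parameter sets: "quick" for the regular test suite,
# "full" for the parameters used in the article.
PARAMETERS = {
    "quick": {"n_points": 10},
    "full": {"n_points": 100_000},
}


def run_example(n_points):
    """Stand-in for the real calculation (hypothetical): midpoint-rule
    approximation of the integral of x**2 over [0, 1]."""
    h = 1.0 / n_points
    return sum(((i + 0.5) * h) ** 2 for i in range(n_points)) * h


def main(mode=None):
    # ARTICLE_MODE is a hypothetical environment variable name.
    mode = mode or os.environ.get("ARTICLE_MODE", "quick")
    return run_example(**PARAMETERS[mode])


if __name__ == "__main__":
    print(main())
```

Both modes go through the same code path, so the quick run in the regular suite at least exercises the driver that the full run depends on, and nothing has to be commented or uncommented by hand.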
What are optimal ways to handle this problem?