Enabling Reproducibility through Large-Scale Continuous Scientific Benchmarks for Applications in Rosetta