24 October 2019
Lets face it, developing software for science and engineering is really hard. First there are the garden variety bugs that you find in all software. As these are so common are many developer tools that help identify these errors. However, for science and engineering software there are a whole host of additional ways software can be incorrect - the physics might be messed up (Mars Climate Orbiter), errors may be present in the underlying mathematical methods or an inappropriate numerical method used in a given context, performance bugs, ML components that sometimes give unexpected output, etc. A single piece of software may cut across all these domains of expertise so it is incredibly difficult for a single individual to have enough time and knowledge to even understand all the different ways that the software might give incorrect results. The vital importance and complexity of code testing and verification is such that it needs to be an integral part of software development itself and a shared responsibility among all researchers involved.
Rigorous research software testing is often considered far too labor intensive, expensive and thankless of an activity. The whole incentive structure within academic research is heavily weighted towards maximizing the number of journal publications. There is rarely any real motivation for putting effort into making sure that research software is reusable by others (where ‘others’ may be their future selves). At the same time one of the most infuriating statements researchers learn to hate is ‘…but it works on my laptop’, and researchers can even struggling to reproduce their own results when we are revising a paper submitted 6 months before. This is not only bad for research but also greatly limits the potential social, and economic impact of the research.
Thankfully the rapid growth in digitalization and Cloud computing in recent years has brought with it a greatly enhanced software development ecosystem. Today fully automated software testing is for the most part free and relatively simple to set up. The technology barrier has been greatly lowed to the extent that it is now hard to justify not adding automated testing to your research software. The challenge now is to increase awareness throughout the research community that quality research software testing and reproducibility is shifted from being an idealistic goal to being an integral part of research that increases quality and productivity.
That is not to say software testing and reliability is a solved problem. We can always do more and better testing - and this means better, more reliable and reproducible research. Some areas present specific challenges for code verification and validation which will continue to be difficult topics. However, this does not detract from the fact that some level of automated testing and deployment can always be achieved as the outcome will be faster research rather than spending your days chasing down bugs or indeed trying to figure out why it only ‘works on my laptop’.
The overall aim of this sprint is to increase the reliability of software and reproducible data and computational research through the use of modern DevOps practices and identify how DevOps needs to be improved to better support the research community. —
We will be adopting an “unconference style” for this event. Meaning that we do not have a fixed agenda of the projects we will be working on but the community will decide the topics and tools they are more interested in discussing/exploring. Some topics that could be explored are:
Feel free to bring your own code to work on or join a team working on a code you are interested in. If you want to propose your own project/code then please open an issue on GitHub giving details of the software and what you would like help with (https://github.com/research-software-reactor/devops-for-research-sprint/issues). Similarly, browse github issues to see what existing projects you would like to participate in. The sprint will begin with an elevator pitch from each proposed project to recruit their dev team for the sprint.
Folks interested in improving the reliability of research software or interested in Continuous Integration, Continuous delivery and DevOps practices. You will have some coding experience and should be prepared to get involved in the sessions and discussion including leading topics.
Make sure to bring a laptop. You might need to install some packages so make sure you can do so.
No, you are free to use whatever infrastructure, tools and workflows you prefer. This is a community led event and the outcomes are meant to serve the community and help solve your problems and help others to have more efficient workflows.
Tea, cofee and lunch will be provided on both days.
You can propose a project by creating an issue in this repository https://github.com/research-software-reactor/devops-for-research-sprint/issues and attaching the label
project proposal 📃. We will then discuss the projects
on the day and break into teams.
Registration has not yet been opened. Please use the contact form on this site to register your interest.
All the attendees and organisers should abide by The Research Software Reactor Code of Conduct which can be found on this site.
Background training material and talks that will help you prepare for the day are listed below. Don’t worry if you don’t feel like a DevOps guru before you arrive as there will be lots of opportunities to ask questions of fellow practitioners and Azure experts over the course of the sprint. Depending on demand we may have break out sessions to walk through specific topics.
Tutorials and documentation:
Suggested reading on reproducibility in computational research: