There has been massive interest in reproducible research and data analysis pipelines over the last few years. But how can I, as a Python user, ensure that what I produce is reproducible? In this tutorial we take you on a journey down the rabbit hole of reproducibility, with a step-by-step approach to reproducible scientific development in Python. You get a crash course on version control, execution environments, testing, and continuous integration, plus a guide on how to integrate all of these into your software projects. By the end of the course we hope you will have the tools you need to make your Python workflows reproducible, whether you are starting a brand new project or already have one that is ready to be shared with the world.
This tutorial is appropriate for anyone using Python for research or data analysis. It is best suited to beginner and intermediate users aiming to enhance their code craftsmanship and learn more about software development best practices. By the end of the course, attendees will have learnt about best practices for reproducible scientific code development and should be able to apply these techniques in their day-to-day workflows.
- Introduction to reproducibility and its importance (20 mins): presentation
- How reproducible are my workflows? (20 mins): team discussion. Attendees will form teams and discuss whether they follow reproducible practices and how to ensure reproducibility
- Hands on:
  - Starting a project: early considerations for reproducibility
    - Introduction to the licensing, data, and software curation strategies that have to be considered in the early stages of the project. Students will start setting up a 'mock project' (30 mins)
  - How do I make sure my results are correct?
    - Testing in Python
    - How to develop software and test it at the same time. Students will acquire hands-on experience in writing test cases (30 mins)
  - How can I identify bugs more easily?
    - Students will learn about test automation and continuous integration (30 mins)
  - My test is ready... how can I share it? Measuring test quality, getting cited for your software, documentation, containers, and execution environments (40 mins)
- Closing up (10 mins)
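
To give a flavour of the "Testing in Python" session, here is a minimal sketch of the kind of test case attendees might write. It is not taken from the tutorial material: the function `bootstrap_mean`, its parameters, and both test functions are hypothetical names chosen for illustration, and the example uses only the standard library. The tests follow the naming convention that a runner such as pytest collects automatically.

```python
import random
import statistics


def bootstrap_mean(values, n_resamples=200, seed=0):
    """Estimate the mean of `values` by bootstrap resampling.

    Hypothetical analysis step, used here only for illustration.
    A dedicated, seeded random.Random instance makes the result
    repeatable and independent of the global random state.
    """
    rng = random.Random(seed)
    resampled_means = [
        # Draw len(values) items with replacement and average them.
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    ]
    return statistics.fmean(resampled_means)


def test_same_seed_gives_same_result():
    # Reproducibility check: identical inputs and seed, identical output.
    data = [1.0, 2.0, 3.0, 4.0]
    assert bootstrap_mean(data, seed=42) == bootstrap_mean(data, seed=42)


def test_estimate_stays_within_data_range():
    # Sanity check: an average of resampled values cannot leave the data range.
    data = [1.0, 2.0, 3.0, 4.0]
    assert min(data) <= bootstrap_mean(data) <= max(data)
```

Passing the seed as an explicit argument, rather than seeding the global generator, is one possible design choice: it keeps the randomness visible in the function signature, so a test (or a later reader) can pin it down.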