While access to an RSE to help with your R workflows can be useful, it’s even better to equip your team with skills to follow best practice in their own work from the start.
What I offer
- Course development or augmentation of existing materials according to your participants’ needs.
- I specialise in teaching R, particularly Reproducible Research in R from a Research Software Engineering perspective, which also includes topics like version control.
- Targeted training activities that are relevant to participants’ domains and workflows.
- Code-along training, where I code and participants follow.
- Remote or in-person teaching.
- Delivery on RStudio Cloud for better control of the computational environment.
- Course materials websites available beyond the course, including full setup instructions for participants’ own systems.
- Interactive materials using the
- Development and delivery of hackathons, especially ReproHacks
It’s much better to set your team up following best practice from the start. New PhD students especially can waste a lot of time trying to figure things out on their own, often unaware of how much simpler and more effective their scientific code and data workflows could be if they knew a bit more about their tools and had some guidance on the most effective way to use them. Once habits that depart from best practice form, it’s also often harder to motivate folks to invest in changing them. That’s why I’ve found working with PhD students especially fruitful. Having said that, it’s never too late to learn the full power of your tools and better ways of working, and I’ve worked with many enthusiastic participants at all career stages!
Capacity building of researchers continues to be my favourite part of being an RSE. I find it most rewarding to see people’s reactions when they realise the power of R and its associated ecosystem of tools, and of following best practice in code development and research data management. I’m a Certified Software Carpentries instructor and by now have a lot of experience developing and delivering creative training courses. From full-day to multi-day courses, through shorter and more targeted workshops and seminars, to hackathons, I can help your team identify and get access to the training they need to get the most out of working with research code and data.
Examples of training I offer
All examples of training materials below can be updated with newer information or modified to suit your specific needs. And of course, I am always interested in developing something new so please get in touch so we can discuss your needs!
I have developed and taught a number of full-day to multi-day courses. Here are a couple of examples of the types of courses I like to teach.
Reproducible Research Data and Project Management in R
Duration: 4 days
Level: Beginner to Intermediate
Teaching Platform: RStudio Cloud
This 4-day course was developed for the ACCE Doctoral Partnership program and delivered yearly from 2015 to 2021.
It focuses on data and project management through R and RStudio, introduces students to best practice and equips them with modern tools and techniques for managing data and computational workflows to their full potential. The course is designed to be relevant to students with a wide range of backgrounds, from those working with relatively small datasets collected from field or experimental observations to those taking a more computational approach and working with bigger datasets.
By the end of the workshop, participants will be able to:
- Understand the basics of good research data management and be able to produce clean datasets with appropriate metadata.
- Manage computational projects for reproducibility, reuse and collaboration.
- Use version control to track the evolution of research projects.
- Use R tools and conventions to document code and analyses and produce reproducible reports.
- Publish, share materials and collaborate through the web.
- Manage dependencies for their projects.
- Create a basic R package and implement tests for your functions.
- Understand why this all matters!
The full course materials, including a breakdown of topics covered each day, are freely available.
Git & GitHub through GitKraken Client - From Zero To Hero!
Duration: 1 day
Level: Beginner to Intermediate
Teaching Platform: GitKraken
In this course participants will learn about version control and collaboration through Git, GitHub & GitKraken Client. I developed the course as a staple training offering for the Sheffield RSE team and ran monthly training sessions for students and staff at the University of Sheffield, as well as sessions for specific groups.
The course covers:
- Version controlling your own project through Git & GitHub.
- Basic collaboration through forks on GitHub.
- Advanced team collaboration through branches on GitHub.
- Using GitKraken Client for a smooth version control experience.
Depending on the time available for training, a shorter workshop on a specific topic might be more appropriate. Here are a few examples of workshops I’ve developed and delivered in response to client requests.
Introduction to GIS in R
Duration: 3.5 hrs
Teaching Platform: RStudio Cloud
Developed for the Programming for Evolutionary Biology 2018 Conference, this half-day workshop introduces participants to the basics of Geographic Information Systems (GIS) in R.
Workshop aims and objectives:
- Understand the basics of GIS
- Understand spatial data types and formats
- Be able to work with, manipulate, combine and extract spatial data in R
- Be able to plot geospatial data
Duration: 1.5 hrs
The goal of this tutorial is to provide a practical exercise in creating metadata for an example field collected data product using package
By the end of the workshop, participants will:
- Understand basic metadata and why it is important.
- Understand where and how to collect and store them.
- Understand how they can feed into more complex metadata objects.
Reproducible Research in R with
Duration: 3 hrs
In this workshop we use materials associated with a published paper (text, data and code) to create a research compendium around it in R using package
By the end of the workshop, you should be able to:
- Create a Research Compendium to manage and share resources associated with an academic publication.
- Understand the basics of managing code as an R package.
- Produce a reproducible manuscript from a single rmarkdown document.
- Appreciate the power of convention!
Perhaps you just want someone to speak about best practice when working with research code and data in R! I’ve given many talks on topics regarding research reproducibility (have a look at our page on advocacy services for more details), but here’s an example of the most well-rounded talk on the topic.
Putting the R into Reproducible Research
Duration: 1 hr (+)
R and its ecosystem of packages offer a wide variety of statistical and graphical techniques, and R is increasing in popularity as the tool of choice for data analysis in academia. In addition to its powerful analytical features, the R ecosystem provides a large number of tools and conventions to help support more open, robust and reproducible research. This includes tools for managing research projects, building robust analysis workflows, documenting data and code, testing code, and disseminating and sharing analyses. In this talk we’ll take a whistle-stop tour of the breadth of available tools, demonstrating the ways R and the RStudio integrated development environment can be used to underpin more open, reproducible research and facilitate best practice.
I’ve given this talk a number of times and continue to update it every time. A recording of the last time I presented it at the RSE Sheffield Lunch Bytes seminar series is available.
I’ve been involved in many hackathons, both as participant and organiser, and find them extremely effective and productive learning environments. Whatever the learning experience you want to create and topic you want to base your hackathon around, I can help you design and facilitate all aspects of it.
You can get a taste for why I love hackathons from the slides that accompanied the invited talk I gave at the British Ecological Society Quantitative Ecology Special Interest Group Hackathon in 2019.
My speciality, however, as founder and core team member of the ReproHack team, is running ReproHacks! See below for more details on the Reproducibility Hackathons we run.
Duration: Usually 1 day
ReproHacks provide sandbox environments for practicing Reproducible Research. During a ReproHack, participants attempt to reproduce published research of their choice from a list of proposed papers using publicly available associated code and data. Participants get to work with other people’s material in a low pressure environment and record their experiences on a number of key aspects, including reproducibility, transparency and reusability of materials. Authors receive invaluable feedback on the reproducibility and reusability of their work.
The project has seen a lot of development since being the topic of my 2019 SSI fellowship. We’ve experimented a lot with the format, from in-person to remote, and from standalone events to series of events or sessions within larger conferences. In March 2022, in collaboration with the University of Warwick, we ran a successful first HPC ReproHack over 10 days on the Sulis Tier 2 HPC Platform for Ensemble Computing.
We also now have the ReproHack hub (https://www.reprohack.org/), which provides infrastructure for facilitating all aspects of an event. If you are interested in finding out more or running one yourselves, please have a look at the Hub. There are plenty of resources to get you started. However, should you want me to come and run one for you, please get in touch!