In the academic year 2021-2022 this course will take place online.
February 21 (Monday) 2022
February 22 (Tuesday) 2022
February 24 (Thursday) 2022
February 25 (Friday) 2022
In this workshop participants will learn how to organize and clean quantitative data and how to control the quality of such data. These data management skills are highly relevant and useful for all researchers, in all disciplines and faculties, who do empirical quantitative research.
Data Carpentry can be seen as a more in-depth follow-up to our Responsible Research Data Management course. Because it offers a basic introduction to R, it can also serve as preparation for our course Data Analysis with R.
For more information about the course, please consult the workshop website. Note: This is also the website where you can find the registration link (registration is not possible through the website of the EGSH, since this is a joint workshop by Leiden-Delft-Erasmus).
Data Carpentry develops and teaches workshops on the fundamental data skills needed to conduct research. Its target audience is researchers who do quantitative research and have little to no prior computational experience. It is domain-specific, building on learners' existing knowledge to enable them to quickly apply skills learned to their own research. Participants will be encouraged to help one another and to apply what they have learned to their own research problems. For more information on what we teach and why, please see our paper "Good Enough Practices in Scientific Computing".
After completion of this workshop, you will be able to:
- apply good data organization practices in spreadsheets;
- use OpenRefine to effectively clean and format data and automatically track any changes that you make;
- manipulate values, objects, functions, and arguments with R;
- organize and manipulate data within data frames in R, using the tidyr package;
- produce scatter plots, boxplots, and bar plots from data in R using the ggplot2 package;
- and create an R Markdown document containing R code, text, and plots.
The course is aimed at doctoral students and other researchers from Erasmus University Rotterdam, Leiden University, and Delft University of Technology. You do not need any previous knowledge of the tools that will be presented in the course.
Requirements: Participants must have access to a computer with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) on which they have administrative privileges. They should have a few specific software packages installed (listed on this webpage).
Accessibility: We are dedicated to providing a positive and accessible learning environment for all. Please notify the instructors in advance of the workshop if you require any accommodations or if there is anything we can do to make this workshop more accessible to you.
Pre-workshop survey: Please be sure to complete the pre-workshop survey before the workshop. You can find the survey link on the workshop website.
In session 1, you will learn how to implement data management in spreadsheet programs, and how to use the OpenRefine tool to effectively clean and format data and automatically track any changes that you make.
In sessions 2-4, the focus is on using R, RStudio, and R Markdown for data wrangling and data visualization. You will learn how to work with data types and variables in R. Using a real-life dataset, you will learn how to import data into R, organize and manipulate the data, and create custom plots based on the dataset.
The workshop will be taught by certified Carpentries instructors and a team of helpers from Leiden University, Delft University of Technology, and Erasmus University Rotterdam. The helpers support learners one-on-one if they get stuck installing software, understanding a certain line of code, or at any other point in the learning process. For the up-to-date list of helpers and instructors, see the workshop website.