Data carpentry for social sciences and humanities

Methodology courses and philosophy of science


Workshop information

ECTS: 2.5
Number of sessions: 2 or 4
Hours per session: 4 or 8
Course fee: free


In the academic year 2022-2023 the course will take place both online and offline.

Edition 1 (online)

October 24 2022
09.00-13.00

October 25 2022
09.00-13.00

October 27 2022
09.00-13.00

October 28 2022
09.00-13.00

Edition 2 (online)

February 20 2023
09.00-13.00

February 21 2023
09.00-13.00

February 23 2023
09.00-13.00

February 24 2023
09.00-13.00

Edition 3 (offline)

June 12 2023
Whole day

June 13 2023
Whole day

Please consult the course website for further information and registration. Registration is not possible through the website of the EGSH.


Introduction

In this workshop participants will learn how to organize and clean quantitative data and how to control the quality of such data. These data management skills are highly relevant and useful for all researchers, in all disciplines and faculties, who do empirical quantitative research.

Data Carpentry can be seen as a more in-depth follow-up to our Responsible Research Data Management course. Because it offers a basic introduction to R, it can also serve as preparation for our course Data Analysis with R.


Aims and working method

Data Carpentry develops and teaches workshops on the fundamental data skills needed to conduct research. Its target audience is researchers who do quantitative research and have little to no prior computational experience. The workshops are domain-specific, building on learners' existing knowledge so that they can quickly apply the skills they learn to their own research. Participants will be encouraged to help one another and to apply what they have learned to their own research problems. For more information on what we teach and why, please see our paper "Good Enough Practices for Scientific Computing".


Learning objectives

After completion of this workshop, you will be able to:

  • apply good data organization practices in spreadsheets;
  • use OpenRefine to effectively clean and format data and automatically track any changes that you make;
  • manipulate values, objects, functions, and arguments with R;
  • organize and manipulate data within data frames in R, using the tidyr package;
  • produce scatter plots, boxplots, and barplots from data in R using ggplot2;
  • and create an R Markdown document containing R code, text, and plots.

Entry level

The course is aimed at doctoral students and other researchers from Erasmus University Rotterdam, Leiden University, and Delft University of Technology. You do not need any previous knowledge of the tools presented in the course.


How to prepare

Requirements: Participants must have access to a computer running a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) on which they have administrative privileges. They should have a few specific software packages installed (listed on this webpage).

Accessibility: We are dedicated to providing a positive and accessible learning environment for all. Please notify the instructors in advance of the workshop if you require any accommodations or if there is anything we can do to make this workshop more accessible to you.

Pre-workshop survey: Please be sure to complete the pre-workshop survey before the workshop. You can find the survey link on the course website.


Sessions

You will first learn how to implement data management in spreadsheet programs, and how to use the OpenRefine tool to effectively clean and format data and automatically track any changes that you make.

Subsequently, the focus is on using R, RStudio, and R Markdown for data wrangling and data visualization. You will learn how to work with data types and variables in R. Using a real-life dataset, you will learn how to import data into R, organize and manipulate the data, and create custom plots based on the dataset.


About the instructors

The workshop will be taught by certified Carpentries instructors and a team of helpers from Leiden University, Delft University of Technology, Erasmus University Rotterdam, and VU Amsterdam. The helpers support learners one-on-one if they get stuck installing software, understanding a line of code, or at any other point in the learning process.