Quantitative and qualitative text analysis with MATLAB

Course information

ECTS: 2.5
Number of sessions: 4
Hours per session: 3
Course fee:

  • free for PhD candidates of the Graduate School
  • €525,- for non-members
  • consult our enrolment policy for more information


  • Enrolment- and course-related questions: enrolment@egsh.eur.nl
  • Course-related questions: Rob.grim@eur.nl

Telephone: +31 (0)10 4082607 (Graduate School).

In the academic year 2022-2023 this course will take place online.

Session 1
May 4 2023

Session 2
May 11 2023

Session 3
May 16 2023


Analysis of textual data (such as policy documents, social media content or in-depth interviews) is important in many research fields within the social sciences, humanities and other faculties. In this three-module course, students will acquire the necessary skills for pre-processing and analyzing textual data with the program MATLAB. Students will learn how to use MATLAB for both quantitative and qualitative text analysis, and are encouraged to bring their own dataset to work on!

Why learn MATLAB? According to Gartner (2021), MATLAB is one of today’s most versatile and leading learning platforms. Less well known is that MATLAB is also highly accessible for social (data) science research. One of its clear advantages over Python, for example, is that it is easier to get started with for people who are not familiar with programming. MATLAB makes applying advanced methods easy. Learning MATLAB is also very valuable for students who want to learn, or are already working with, the program R.

Aims and working method

This course follows a learning-by-doing approach with practical hands-on examples and interactive notebooks. After a brief introduction to MATLAB, students will learn to work with the Text Analytics Toolbox. Students will learn to create custom labelled and curated datasets and to apply various methods and functions for text analysis research, such as TF-IDF, bagOfWords, bagOfNgrams, text search, word embeddings and sentiment analysis.

Entry requirements

Participating in this course does not require any previous programming experience. The course can be attended by researchers who are not yet experienced with text analysis.

Entry level

The course is suitable for students who have no prior knowledge of or experience with MATLAB. Familiarity with a statistical package (SPSS, Stata, R, SAS) and/or a programming language (Python, R) is recommended.

Learning objectives

By completing this course, participants will:

  • Be able to install packages and add-ons relevant for text analysis in MATLAB;
  • Acquire essential data engineering skills to organize, structure and prepare text data for qualitative and quantitative analysis in MATLAB;
  • Be able to work independently with the MATLAB Text Analytics Toolbox, and to apply various text analysis research methods and functions in this program;
  • Be able to visualize text analysis results and produce high-quality graphics with MATLAB.


Session 1
In the first session, students are familiarised with the MATLAB user interface, interactive notebooks, MATLAB Drive and the installation of toolboxes (namely the Text Analytics Toolbox). Through hands-on examples, students learn to work with (among other things) chars, strings, tokenized documents, text search and simple regular expressions. Students will be introduced to visualizing text data in MATLAB using basic 2D scatter plots and word clouds. Take-home exercises are provided for further exploration and practice in working with text in MATLAB.
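
To give an impression of the kind of exercise covered in this session, the following is a minimal sketch and not actual course material. It assumes that the Text Analytics Toolbox is installed; the two example sentences and the variable names are invented for illustration.

```matlab
% Illustrative sketch only: the sentences below are made-up example data.
% Requires the Text Analytics Toolbox.
txt = ["The council approved the new housing policy."; ...
       "Residents discussed the housing policy online."];

% Simple regular expression: find words that start with "hous",
% ignoring case. The result is a cell array with one entry per sentence.
matches = regexp(txt, 'hous\w*', 'match', 'ignorecase');

% Convert the strings to tokenized documents and clean them up.
docs = tokenizedDocument(lower(txt));   % split lowercased text into tokens
docs = erasePunctuation(docs);          % drop punctuation
docs = removeStopWords(docs);           % drop common words such as "the"

% Visualize the remaining words as a word cloud.
figure
wordcloud(docs);
```

Strings are tokenized before cleaning here because functions such as removeStopWords operate on tokenized documents.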

Session 2
In the second session, students will master various pre-processing methods for text analysis, such as frequency counts, TF-IDF and custom labelled datasets, and learn to apply supporting functions for text analysis, such as bagOfWords, bagOfNgrams and word embeddings. The second session will also cover practical data management skills to handle and organize (large) collections of text data. Students are encouraged to bring their own dataset to work on.
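
As a rough illustration of the functions named above, the sketch below builds a bag-of-words model, a TF-IDF matrix and a bag of bigrams. It is an assumption-laden example rather than course material: it presumes the Text Analytics Toolbox is installed, and the example sentences and variable names are invented.

```matlab
% Illustrative sketch only: the sentences below are made-up example data.
% Requires the Text Analytics Toolbox.
txt = ["The council approved the new housing policy."; ...
       "Residents discussed the housing policy online."; ...
       "The policy was debated in the council."];

% Pre-process: lowercase, tokenize and remove punctuation.
docs = tokenizedDocument(lower(txt));
docs = erasePunctuation(docs);

% Frequency counts: a bag-of-words model stores word counts per document.
bag = bagOfWords(docs);
topWords = topkwords(bag, 5);   % table with the five most frequent words

% TF-IDF: sparse matrix with one row per document and one column per
% word in bag.Vocabulary.
M = tfidf(bag);

% N-grams: count two-word sequences (bigrams) instead of single words.
bigrams = bagOfNgrams(docs, 'NgramLengths', 2);
```

TF-IDF down-weights words that occur in many documents (here, for example, "policy") relative to words that are specific to a single document.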

Session 3
Each student presents a case study and shares lessons learned from working with text data and text analysis in MATLAB. Where relevant, MATLAB will be compared with other software for text analysis. Lastly, we will look into the further capabilities of MATLAB’s Text Analytics Toolbox for advanced modeling and visualization of text data.

About the instructor

Rob Grim has held positions as a Data Analyst, as a Research Data Specialist and as Head of Research Support. He currently works as Business/Economics & Data Librarian at the EUR and as a member of the Erasmus Data Service Centre (EDSC) team. Rob has extensive experience with research data management, data pre-processing and data analytics in various scientific disciplines. He has an interest in statistics, cognitive science and machine learning. Rob Grim has a background in Psychology.