
Text Mining the Great Unread – Data-intensive methods and digital tools for analysis of texts in the humanities and social sciences


Texts have always been essential to research and education in the humanities and social sciences. Close reading and detailed interpretation have traditionally constituted the standard approach to texts: we apply qualitative methods and theoretically motivated arguments to a small textual corpus in order to understand its meaning. However, the rapid expansion of digital full-text databases, ever faster computers, and advances in language technology are starting to change the standard approach by offering a new digital and data-intensive paradigm for the study of text. Humanities and social science researchers are beginning to ask new types of questions and propose novel solutions to old problems by using faster and more efficient methods to collect, analyze, and visualize texts.

Many students (as well as researchers) find that they lack the digital competences needed for text mining, that is, the application of tools and methods to analyze large sets of digitized texts. This is unfortunate because text mining 1) enables students to extract high-quality information and acquire new knowledge quickly and efficiently; and 2) strengthens students' qualifications for a data-driven job market that relies on the very same tools and methods. In addition, many tools and methods in text mining are in need of a thorough revision by academics who understand the importance of text meaning and context. Academia and industry alike are therefore in great need of students with text mining skills.

“Text Mining the Great Unread” is an introductory-level course on text mining tools and methods in the humanities and social sciences, which will supply participants with sufficient knowledge and experience to develop and implement their own text mining projects. The core of the course is a series of hands-on workshops supplemented by lectures and tutorials by international researchers and industry experts. Through the course, participants will become familiar with text mining methods and software for analyzing and visualizing texts. Participants will learn how to write their own text mining applications in R and Python. Through the workshops, participants will also be presented with a range of paradigmatic studies and go through research design, best practice, and reporting standards. It is possible to work with one’s own corpus, but historical and contemporary corpora (works of fiction, historical documents, and websites) are also available in class. Participants are not expected to have prior experience with text mining (i.e., programming, statistics, or visualization).
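To give a sense of what a first text mining application in Python might look like, here is a minimal sketch that counts the most frequent words in a short text. It is an illustration only and is not taken from the course material: the function name `top_terms`, the small stopword list, and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical helper for illustration; the stopword list is a tiny
# assumed sample, not a standard resource.
def top_terms(text, n=3, stopwords=frozenset({"the", "a", "of", "and", "to", "in"})):
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Count every token that is not a stopword.
    counts = Counter(t for t in tokens if t not in stopwords)
    # Return the n most frequent terms with their counts.
    return counts.most_common(n)

# Made-up sample text standing in for a real corpus.
sample = "The whale and the sea. The whale dives; the sea answers the whale."
print(top_terms(sample, 2))  # [('whale', 3), ('sea', 2)]
```

The same idea scales from one sentence to a large corpus: read each document, tokenize it, and aggregate the counts before analyzing or visualizing them.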

Course Description

Find the full course description in the course catalogue: Bachelor's Level and Master's Level.

Revised 19.02.2016