Ruprecht-Karls-Universität Heidelberg

Data and Text Mining


Major advances in sensor technology, the instrumentation of experiments, and computer-based simulations are driving a dramatic increase in the amount and diversity of data being collected. Organizations in government, industry, and the academic and private sectors have invested significantly in infrastructure to collect, manage, and analyze diverse types of data. These range from scientific data collected in the physical, Earth, and life sciences, to textual data collected by microblogging services and platforms, to the results of digitization efforts in the Digital Humanities.

The objective of the specialization "Data and Text Mining" in the International Masters Program of Scientific Computing is to study models, techniques, tools, and architectures that support managing and analyzing large-scale and diverse data sets. The focus ranges from traditional data mining concepts such as clustering, association rule mining, and classification to more advanced techniques for mining graph data, data and text streams, document collections, and social network data. Students also learn feature extraction approaches and probabilistic data analysis models, the latter playing an important role in analyzing text data. Applications include traditional frameworks such as scientific data warehouses employed in the natural sciences, the analysis of large-scale social networks, and the exploration of document collections, the latter being a prominent theme in the Digital Humanities.

Researchers in this field

  Prof. Michael Gertz

  Prof. Artur Andrzejak

Course Offerings / Tentative Study Plan:

Foundation Courses

These courses are offered as part of the Computer Science Curriculum in alternate years in the winter semester.

Other courses relevant to this specialization

Computer Science:


Seminars & Practicals

  • Seminars covering recent topics in data mining and text mining
  • Advanced practical addressing research in the areas of data mining and text mining

Application Fields (18 ECTS)

  • Geoinformatics
  • Computational Linguistics
  • Astrophysics

Sample Plan of Study

Winter Term 1

Summer Term 1

Winter Term 2

Summer Term 2
