I have recently adjusted the syllabus of my course on digital humanities and text analysis and wanted to share some of the methods, tools and resources that I currently use. For some great tutorials and exercises on computer-based text analysis, you might want to check out the following:
- Hermeneutica – Computer-Assisted Interpretation in the Humanities
- The Programming Historian
- Introduction to Text Analysis: A Coursebook
Preliminary remarks
The target audience for the course is students in the media informatics Master’s program, i.e. students with a background in programming and data modelling. The course is divided into three phases: during phases 1 and 2, students spend 2×2 hours per week in class and learn the fundamentals of digital humanities and text analysis. Phase 3 is a free project phase in which groups of students work on their own digital humanities projects and meet regularly with the lecturer to discuss their progress.
Phase 1: Introduction to Digital Humanities
Week 1a
- General introduction to the course
- Introducing the DSH reading challenge: throughout the course, each student will present a preselected paper from the Digital Scholarship in the Humanities journal by summarizing its basic research goals, methods and results in 5 minutes (+5 minutes of general discussion)
Reading assignment: Prensky, M. (2001). Digital Natives, Digital Immigrants Part I. On the Horizon, 9(6), 1–6.
Week 1b
- Discussing the “digital” and its implications, as in digital revolution, digital society, digital culture, digital natives, …
Reading assignment: Snow, C. P. (1959). The Two Cultures. London: Cambridge University Press.
Week 2a
- Discussing “humanities”
- What is special / challenging about digital humanities? Can the idea of digital natives and digital immigrants be applied to scholarly disciplines?
Reading assignment: Michel, J.-B., Shen, Y. K., Aiden, A. P., Veres, A., Gray, M. K., Pickett, J. P., … Aiden, E. L. (2011). Quantitative analysis of culture using millions of digitized books. Science (New York, N.Y.), 331(6014), 176–82.
Week 2b
- From Busa to Culturomics – A short history of the digital humanities
- Exercise: Building your own Index Thomisticus in 5 minutes (using Open Library and Voyant Tools)
- Exercise: Discussion of the Google Ngram Viewer (representativeness, OCR errors, etc.)
Reading assignment: Svensson, P. (2010). The Landscape of Digital Humanities. Digital Humanities Quarterly, 4(1), 1–31.
Week 3a
- Defining the digital humanities: Overview and discussion of existing definitions
- Working definition for the course:
(1) DH as use of digital tools / methods / resources in the humanities, and
(2) DH as humanities, investigating digital culture.
Week 3b
- Basic introduction to literary studies
- Overview of literary theories and typical research questions in that field
- Exercise: Analyzing “Every breath you take” in class
Reading assignment: Moretti, F. (2000). Conjectures on world literature. New Left Review, (Jan / Feb), 54–68.
Phase 2: How to do digital humanities? A hands-on introduction to computer-based text analysis.
Week 4a
- Introduction to (close and) distant reading
- Examples of distant reading approaches in practice
- Tools for distant reading (Voyant, To See or Not to See)
Week 4b
- Data acquisition – How to get digital texts?
- Overview of text repositories: Open Library, Project Gutenberg, Deutsches Textarchiv, TextGrid Repository, Folger Digital Texts, opensubtitles.com
- Tools: wget, Beautiful Soup
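As a rough illustration of the acquisition step, the following sketch pulls the paragraph text out of an HTML page that has already been downloaded. It uses Python's built-in `html.parser` as a stand-in so that it runs without extra packages; Beautiful Soup does the same job more conveniently. The class name `ParagraphExtractor` and the sample page are made up for this example.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text content of every <p> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            # Join the collected pieces and normalise internal whitespace.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._in_p = False

    def handle_data(self, data):
        # Text inside nested inline tags such as <i> is collected as well.
        if self._in_p:
            self._buffer.append(data)


# Invented sample page; a real one would come from wget or a similar tool.
SAMPLE_HTML = """
<html><body>
<h1>Hänsel und Gretel</h1>
<p>Vor einem großen Walde wohnte ein armer Holzhacker
   mit seiner Frau und seinen zwei Kindern.</p>
<p>Das Bübchen hieß <i>Hänsel</i> und das Mädchen <i>Gretel</i>.</p>
</body></html>
"""

parser = ParagraphExtractor()
parser.feed(SAMPLE_HTML)
for paragraph in parser.paragraphs:
    print(paragraph)
```

The heading is skipped on purpose: only `<p>` content is kept, which is often a reasonable first approximation of the running text on a repository page.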
Week 5a
- Introduction to data cleaning with the bash command line and regular expressions
- Tools: Unix shell, Cygwin
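The kind of clean-up done in this session can be sketched with a handful of regular expressions. The course itself works on the bash command line; the version below uses Python's `re` module instead, purely to keep the example self-contained. The function name `clean_text` and the sample string are assumptions for illustration.

```python
import re


def clean_text(raw: str) -> str:
    """Apply a few typical clean-up steps to OCR-like plain text."""
    # Rejoin words hyphenated across a line break: "ti-\nmes" -> "times".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Drop lines that contain nothing but a page number.
    text = re.sub(r"^[ \t]*\d+[ \t]*$\n?", "", text, flags=re.MULTILINE)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Reduce three or more newlines to a single blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


# Invented sample with a broken word, a stray page number and extra spaces.
sample = (
    "It was the best of  times, it was the\n"
    "worst of ti-\nmes.\n\n  42  \n\n\n"
    "It was the age of wisdom."
)
print(clean_text(sample))
```

Note that the first rule also joins genuine compounds that happen to break at a line end, so in practice the result is worth checking against a word list.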
Week 5b
- Beyond raw text – annotating with XML / TEI
- Utilizing document markup with XSLT
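To give an idea of what markup makes possible, here is a minimal sketch that counts verse lines per speaker in a TEI-encoded play. It uses Python's `xml.etree.ElementTree` rather than XSLT, and the tiny TEI fragment is an abridged, hand-made example, not taken from any real edition.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The TEI namespace; the prefix "tei" is chosen freely for the queries below.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hand-made miniature fragment; real TEI editions are far richer.
SAMPLE_TEI = """
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <sp who="#hamlet"><speaker>Hamlet</speaker>
      <l>To be, or not to be, that is the question:</l>
      <l>Whether 'tis nobler in the mind to suffer</l>
    </sp>
    <sp who="#ophelia"><speaker>Ophelia</speaker>
      <l>Good my lord,</l>
    </sp>
    <sp who="#hamlet"><speaker>Hamlet</speaker>
      <l>I humbly thank you; well, well, well.</l>
    </sp>
  </body></text>
</TEI>
"""

root = ET.fromstring(SAMPLE_TEI.strip())
lines_per_speaker = Counter()
for speech in root.iterfind(".//tei:sp", TEI_NS):
    # The who attribute points to a speaker ID such as "#hamlet".
    speaker = speech.get("who", "").lstrip("#")
    lines_per_speaker[speaker] += len(speech.findall("tei:l", TEI_NS))

for speaker, count in lines_per_speaker.most_common():
    print(speaker, count)
```

With raw text, the same question would require guessing where a speech starts and ends; with `<sp>` and `<l>` elements it becomes a simple query.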
Week 6a
- Introduction to natural language processing and automatic tagging
- Tools: WebLicht, TreeTagger, NLTK
Week 6b
Week 7a
- How to interpret frequencies – Introduction to statistics
- Tool: RStudio
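Two of the basic ideas behind interpreting frequencies can be sketched in a few lines: normalising raw counts by corpus size, and testing whether a difference between two corpora is larger than chance would suggest. The example uses Python instead of R, picks the log-likelihood (G²) statistic as one common choice for corpus comparison, and all counts are invented.

```python
import math


def per_thousand(count: int, total: int) -> float:
    """Relative frequency, normalised to occurrences per 1,000 tokens."""
    return 1000 * count / total


def log_likelihood(a: int, b: int, c: int, d: int) -> float:
    """G2 for a word seen a times in c tokens and b times in d tokens."""
    # Expected counts if the word were equally frequent in both corpora.
    expected_1 = c * (a + b) / (c + d)
    expected_2 = d * (a + b) / (c + d)
    g2 = 0.0
    for observed, expected in ((a, expected_1), (b, expected_2)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


# Invented counts: a word occurs 120 times in a 50,000-token corpus
# and 30 times in a 40,000-token corpus.
print(per_thousand(120, 50_000), per_thousand(30, 40_000))
print(log_likelihood(120, 30, 50_000, 40_000))
```

A G² value above 3.84 corresponds to p < 0.05 with one degree of freedom, which is the usual threshold quoted for this test; whether that threshold is meaningful for a given corpus is exactly the kind of question discussed in class.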
Week 7b
- Introduction to stylometry
- Tool: stylo (R package)
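One classic stylometric distance measure is Burrows's Delta, which the stylo package offers among others. The following is a simplified Python sketch of the idea, not a reimplementation of stylo: take the most frequent words of the corpus, z-score their relative frequencies across texts, and average the absolute differences between two texts. The function name `burrows_delta` and the toy corpus are assumptions, and the texts are far too short for any real attribution.

```python
import re
import statistics
from collections import Counter


def burrows_delta(texts, n_words=50):
    """Pairwise Delta distances for a dict {name: text}.

    Toy sketch: expects at least two texts, none of them empty.
    """
    rel_freqs = {}
    corpus_counts = Counter()
    for name, text in texts.items():
        tokens = re.findall(r"[^\W\d_]+", text.lower())
        counts = Counter(tokens)
        corpus_counts.update(counts)
        rel_freqs[name] = {w: c / len(tokens) for w, c in counts.items()}

    # Features: the most frequent words of the whole corpus.
    features = [word for word, _ in corpus_counts.most_common(n_words)]

    # z-score every feature across the texts.
    z_scores = {name: [] for name in texts}
    for word in features:
        values = [rel_freqs[name].get(word, 0.0) for name in texts]
        spread = statistics.stdev(values)
        if spread == 0:
            continue  # the word does not discriminate between the texts
        mean = statistics.mean(values)
        for name, value in zip(texts, values):
            z_scores[name].append((value - mean) / spread)

    # Delta: mean absolute difference between the z-scores of two texts.
    names = list(texts)
    deltas = {}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            diffs = [abs(a - b)
                     for a, b in zip(z_scores[first], z_scores[second])]
            deltas[(first, second)] = sum(diffs) / len(diffs) if diffs else 0.0
    return deltas


# Invented toy corpus, only to show the call.
toy_corpus = {
    "text_a": "the cat and the dog sat on the mat",
    "text_b": "the dog and the cat sat on the rug",
    "text_c": "a bird or a fish is not a cat",
}
for pair, distance in burrows_delta(toy_corpus).items():
    print(pair, round(distance, 3))
```

Smaller values mean more similar word usage. In practice the feature list would contain a few hundred function words and each text several thousand tokens.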
Week 8a
- Introduction to topic modeling
- Tool: MALLET
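To show what happens inside a topic model, here is a toy collapsed Gibbs sampler for LDA in plain Python. It is meant as a teaching sketch only: MALLET is a Java tool with a much faster sampler and many more options. The function name `lda_gibbs`, the parameter defaults and the four mini documents are assumptions made for this example.

```python
import random
from collections import Counter


def lda_gibbs(docs, n_topics=2, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Toy collapsed Gibbs sampler; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({word for doc in docs for word in doc})
    doc_topic = [[0] * n_topics for _ in docs]         # tokens per doc and topic
    topic_word = [Counter() for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics                       # tokens per topic
    assignments = []

    # Start with a random topic for every token.
    for d, doc in enumerate(docs):
        topics = []
        for word in doc:
            k = rng.randrange(n_topics)
            topics.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word] += 1
            topic_total[k] += 1
        assignments.append(topics)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][word] -= 1
                topic_total[k] -= 1
                # Weight of each topic given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][word] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][word] += 1
                topic_total[k] += 1

    top_words = [
        [word for word, count in topic_word[t].most_common(5) if count > 0]
        for t in range(n_topics)
    ]
    return top_words, doc_topic


# Invented mini corpus with two rough themes.
toy_docs = [
    "king queen castle crown king".split(),
    "queen crown throne castle".split(),
    "ship sea sailor storm ship".split(),
    "sea storm sailor harbour".split(),
]
top_words, doc_topic = lda_gibbs(toy_docs)
for t, words in enumerate(top_words):
    print("topic", t, words)
```

Because the sampler is random, the topics depend on the seed and on the number of iterations, which is a useful reminder that topic model output needs interpretation rather than being read as a fixed result.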
Week 8b
- Distant reading the DSH: throughout the course, students have “close read” and discussed typical articles from the DSH journal – at the end of the course, we apply frequency lists and topic modeling to a larger corpus of DSH articles.
Phase 3: DH projects
Students work on their own research projects and apply text analysis techniques
Example topics:
- Stylistic change across the different editions of the fairy tales by the Brothers Grimm
- Intertextuality: Shakespeare in literature / film / lyrics
- Quantitative analysis of drama according to speakers, structure, speech, etc.
- etc.