How to read 11 million books (All about the HathiTrust Research Center)
The HathiTrust (http://hathitrust.org) is a collection of 11 million electronic texts from research libraries, digitized by Google, and this hands-on class teaches participants how to use computers to analyze these materials. The approach is sometimes called “distant” or “scalable” reading, both forms of digital humanities research. The class demonstrates ways to count and tabulate the frequency of words in a text in order to find patterns and anomalies within it. From the resulting analysis, it is possible to learn what a corpus is about more quickly than by reading it without the aid of a computer. HathiTrust materials lend themselves readily to this sort of analysis. There are no prerequisites, but participants may want to bring their own laptops to the session.
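For a sense of what counting and tabulating word frequencies can look like, here is a minimal sketch in Python. It is not taken from the class materials: the function name `word_frequencies`, the simple tokenizing rule, and the toy sample sentence are all assumptions made for illustration, and a real analysis would read in a full text from the collection instead.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=5):
    """Return the top_n most frequent words in text as (word, count) pairs."""
    # Lowercase the text and treat each run of letters or apostrophes as a word.
    # This is a deliberately simple tokenizer (an assumption for illustration).
    words = re.findall(r"[a-z']+", text.lower())
    # Counter tallies each distinct word; most_common sorts by count, descending.
    return Counter(words).most_common(top_n)


# Toy sample text (hypothetical), standing in for a digitized book.
sample = "The whale swam. The whale dove. The ship followed the whale."

for word, count in word_frequencies(sample, top_n=3):
    print(f"{word}\t{count}")
```

Even on a short sample, the most frequent words hint at the subject of the text. Applied to a whole book or corpus, and usually after setting aside very common words such as "the", the same tabulation is one way to begin looking for patterns and anomalies.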
- Wednesday, March 1, 2017
- 12:00PM - 1:00PM
- Classroom 129 | Center for Digital Scholarship
- CDS | Text Mining & Analysis Workshops