Library Workshops and Events
How to read 14 million books (All about the HathiTrust Research Center)
The HathiTrust (http://hathitrust.org) is a collection of 14 million electronic texts from research libraries, digitized by Google. This hands-on workshop teaches participants how to use computers to analyze the materials available in the HathiTrust collections. The approach is sometimes called "distant" or "scalable" reading, a form of digital humanities research. The session demonstrates ways to count and tabulate the frequency of words in a text in order to find patterns and anomalies within it. With the resulting analysis, it is possible to learn what a corpus is about more quickly than by reading it without the aid of a computer. HathiTrust materials lend themselves well to this sort of analysis.
There are no prerequisites, but participants may want to bring their own laptop to the session.
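As a rough illustration of the word counting and tabulating described above, here is a minimal sketch in Python. It is not the workshop's own material or a HathiTrust tool: the function name `word_frequencies`, the simple letters-and-apostrophes tokenizer, and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=10):
    """Return the top_n most frequent words in text as (word, count) pairs.

    Hypothetical helper for illustration only; real corpora usually need
    more careful tokenization and stop-word handling.
    """
    # Lowercase the text and treat runs of letters/apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    # Counter tallies each word; most_common sorts by descending count.
    return Counter(words).most_common(top_n)


# Made-up sample text standing in for a digitized volume.
sample = "The whale swam. The whale dove. The sea was calm."

# Tabulate the three most frequent words, one per line.
for word, count in word_frequencies(sample, top_n=3):
    print(f"{word}\t{count}")
```

Run against a full book instead of one sentence, a table like this tends to surface the names and themes a text returns to most often, which is the kind of pattern the session looks for.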
- Thursday, March 1, 2018
- 12:00PM - 1:00PM
- Classroom 129
- CDS | Text Mining & Analysis Workshops