
Digital Scholarship

Python

Python and R are open source programming tools commonly used by researchers for text mining. R is a favorite among statisticians, while Python has more users in the humanities disciplines. Both have many add-on packages useful for mining text and creating visualizations.

Jupyter Notebook is a popular platform for using Python and makes it easy to share and explain code. Anaconda is a popular installation option for Python that includes both Python and Jupyter Notebook.

This example using Python code in Jupyter Notebook looks at the sentiment over the course of several animated movies; the code was adapted from Alice Zhao's Natural Language Processing in Python Tutorial.
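To give a feel for what "sentiment over the course of" a text means, here is a minimal sketch that uses only the Python standard library. It is not the code from the tutorial: the word lists, the function names, and the sample transcript are all hypothetical stand-ins, and a real project would likely use a sentiment package instead of a hand-made word list. The idea is simply to cut a transcript into consecutive chunks and score each chunk.

```python
import math

# Hypothetical toy word lists (an assumption for illustration only).
POSITIVE = {"happy", "love", "wonderful", "friend", "brave"}
NEGATIVE = {"sad", "fear", "lost", "angry", "alone"}

def split_words(text, n_parts):
    """Split text into consecutive chunks of roughly equal word count."""
    words = text.lower().split()
    size = max(1, math.ceil(len(words) / n_parts))
    return [words[i:i + size] for i in range(0, len(words), size)]

def score(words):
    """Return (positive count - negative count) divided by chunk length."""
    if not words:
        return 0.0
    cleaned = [w.strip(".,!?") for w in words]
    pos = sum(w in POSITIVE for w in cleaned)
    neg = sum(w in NEGATIVE for w in cleaned)
    return (pos - neg) / len(words)

# Made-up sample text standing in for a movie transcript.
transcript = (
    "I am lost and alone. I feel sad. "
    "Then a brave friend arrives. Now I am happy!"
)

chunks = split_words(transcript, n_parts=2)
scores = [score(chunk) for chunk in chunks]
print(scores)
```

With this sample, the first half scores below zero and the second half above zero, which is the kind of trend a sentiment plot makes visible across a whole movie.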

Python Training Videos

Python is a programming language, and as such its syntax takes a while to learn. While the last video focuses specifically on text mining, the earlier videos can help you understand what the code is doing so you can better adapt it for your own uses.

For the NLP videos, you can copy and paste the code from the related GitHub site to try it out yourself. The README on the first page has directions for installing Anaconda and the packages you will need for the example. You will need to make one change to the code on the "Exploratory Data Analysis" page: to match the Scraps from the Loft website, class_="post_content" should be updated to class_="elementor-element elementor-element-74af9a5b elementor-widget elementor-widget-theme-post-content".
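The class_ value is what tells the scraping code which part of the web page holds the transcript. The sketch below shows roughly how that lookup works, assuming the tutorial code uses the Beautiful Soup package (bs4). The HTML snippet is a made-up stand-in for a downloaded page, and the variable names are hypothetical; only the class string comes from the note above.

```python
from bs4 import BeautifulSoup

# Made-up stand-in for a downloaded transcript page (not the real site).
html = """
<div class="sidebar"><p>Related posts</p></div>
<div class="elementor-element elementor-element-74af9a5b elementor-widget elementor-widget-theme-post-content">
  <p>First line of the transcript.</p>
  <p>Second line of the transcript.</p>
</div>
"""

# The full class string from the note above, matched as one exact value.
NEW_CLASS = (
    "elementor-element elementor-element-74af9a5b "
    "elementor-widget elementor-widget-theme-post-content"
)

soup = BeautifulSoup(html, "html.parser")

# Find the block that holds the transcript, then collect its paragraphs.
container = soup.find(class_=NEW_CLASS)
paragraphs = [p.get_text() for p in container.find_all("p")]
print(paragraphs)
```

If the website changes its layout again, soup.find will return None and the next line will raise an error. In that case, open the page in your browser, inspect the element that wraps the transcript, and copy its current class value.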

Once you figure out the basics, you can try pulling text from other transcripts or other websites and rename things in the code to fit your material (for example, movies instead of comedians). GitHub has all kinds of code examples related to Python and text mining, so you can adapt code developed by others for your own uses.

Python Books