Bryan Tarpley, Ph.D.

Associate Research Scientist
Center of Digital Humanities Research
Texas A&M University

As a Research Scientist, I leverage my full-stack development skills to create and evolve digital humanities projects, whether by developing and maintaining shared infrastructure, training stakeholders to use digital tools, or building boutique solutions. I have more than two decades of experience in the information technology sector for higher education, as well as pedagogical experience and a terminal degree in the humanities, making me an insider to both the academic and technological fields.

Education

Doctor of Philosophy, English 2019
Texas A&M University
Dissertation: "Making Room for Affect: Zadie Smith, David Foster Wallace, and the Authenticating Human"
Committee: Professors Sally Robinson (Chair), Emily Johansen, Mikko Tuhkanen, and Tommy Curry
Master of Arts, English 2008
Stephen F. Austin State University
Thesis: "Cain"
Advisor: Professor John McDermott
Bachelor of Arts, Computer Science 2005
Harding University

Employment

Associate Research Scientist of Critical Infrastructure Studies 2021-present
Center of Digital Humanities Research
Texas A&M University
My role at the center largely continues as described under my previous position below. Beginning in 2023, however, the center's mandate pivoted such that we became chiefly concerned with project development for local TAMU faculty and graduate students.
Software Applications Developer III 2016-2021
Center of Digital Humanities Research
Texas A&M University
As the lead developer for the Center of Digital Humanities Research (CoDHR), I implemented technical solutions for large, international grant initiatives; taught the “Python for Humanists” courses as part of the Programming4Humanists continuing education program; carried out Summer Technical Assistance grants that incrementally developed projects for local faculty and graduate students; and managed the Humanities Visualization Space (a black-box room with a very large screen and surround-sound speakers).
Lead Software Applications Developer 2014-2016
Information Technology, Infrastructure, and Operations
Texas A&M University
I played pivotal roles in two major campus-wide initiatives. The first was engineering a self-service portal that allowed faculty, students, and staff to transition their “Zimbra” email inboxes, calendars, and “briefcases” to Google cloud services: Gmail, Google Calendar, and Google Drive. This involved creating a performant, enterprise-grade web application with an asynchronous job queue. With this application we migrated over 50,000 Zimbra accounts, and our team was named “Team of the Semester” by TAMU Computing and Information Services. The second initiative, carried out chiefly by me, was “TAMUDirect,” another enterprise web application that allowed faculty to access and configure Google Groups mailing lists for all of their courses.
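The heart of that migration portal was the asynchronous job-queue pattern. The following is a minimal sketch of the idea, assuming a Celery worker with a Redis broker; the production stack differed in its details, and all names below are illustrative:

    from celery import Celery

    app = Celery("migrations", broker="redis://localhost:6379/0")

    class TransientError(Exception):
        """Recoverable failure, e.g., a rate limit or timeout."""

    def copy_mailbox(net_id: str) -> None:
        """Stub standing in for the Zimbra-export / Google-import step."""
        ...

    @app.task(bind=True, max_retries=3, default_retry_delay=60)
    def migrate_account(self, net_id: str) -> str:
        """Migrate a single user's account as one queued job."""
        try:
            copy_mailbox(net_id)
        except TransientError as exc:
            raise self.retry(exc=exc)  # re-queue on transient failures
        return f"{net_id}: migrated"

    # The portal enqueues work without blocking the web request:
    # migrate_account.delay("jdoe")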
Graduate Assistant Researcher 2012-2014
Initiative for Digital Humanities, Media, and Culture
Texas A&M University
As a graduate assistant researcher, I created the MySQL data schema for the Early Modern OCR Project (eMOP), as well as a simple query builder in PHP for exploring that data. I also developed an application in C# that allowed eMOP researchers to select ideal instances of glyphs from early modern texts, create synthetic training images, and train the Tesseract 3 OCR engine used to transcribe 46 million page images.
Graduate Assistant Teacher 2011-2012
Department of English
Texas A&M University
As a graduate assistant teacher, I taught a 2/2 load of composition and rhetoric and literature survey courses, designing the literature survey courses myself. I also served as a summer intern for the writing program, developing the course modules for the online section of English 210 (Technical Writing), which the department used for several years.
Adjunct Faculty 2009-2011
Department of English
Stephen F. Austin State University
As an adjunct faculty member in the Department of English, I taught a 4/4 load of composition and rhetoric courses. I designed the courses I taught and participated in assessment.
WWW Specialist 2007-2009
Columbia Regional Geospatial Service Center
Stephen F. Austin State University
My role at the CRGSC was to design the center’s web presence, which involved creating a custom content management system (CMS) capable of serving the center’s unique needs. The CMS allowed staff to schedule and orchestrate GIS training sessions for the Texas State Guard for emergency preparedness. I also helped with emergency response: I was part of a two-person team that set up a server with GIS software at the staging area of the Galveston response team after Hurricane Rita.
Systems Administrator 2005-2007
Office of Instructional Technology
Stephen F. Austin State University
I administered the university’s Learning Management System, first WebCT and later Blackboard. I managed the hardware, which consisted of production and staging instances of Red Hat Linux servers (for WebCT) and later Windows servers (for Blackboard), a fiber-connected storage area network, a load balancer, tape backups, an uninterruptible power supply, and cooling systems. I also trained and supervised an assistant systems administrator.
Programmer/Analyst 2003-2005
Management Information Systems
Harding University
My role in this position was to create custom data schemas and automated business processes to augment the university’s Student Information System (Banner). This mostly involved working with Oracle databases to write complex queries, create views and tables, and develop subroutines in PL/SQL.

Other Appointments

Note: In all three cases below, the role of “Technical Editor” is intended to connote the status of “Editor,” but with the technological requirements and affordances of the project as my chief purview.
Technical Editor 2023-present
Technical Editor 2019-present
Technical Editor 2017-present

Major Projects

Corpora 2018-present
Principal Investigator: Myself
My Role: Lead Developer
Corpora is a web-based “dataset studio” for the digital humanities, allowing scholars to build, enhance, search, transform, and explore digital humanities project data. It is the culmination of my research into infrastructure studies and has come to serve a crucial role for the following major projects, among others:
Texas Art Project (TAP) 2023-present
Principal Investigator: Tianna Uchacz
My Role: Lead Developer
TAP showcases the work of Texas artists, currently featuring the work of E.M. “Buck” Schiwetz, with more forthcoming. The site provides filtered browsing and search of the artwork, zoomable images, and a mapping interface showing the locations of Schiwetz’s subjects. Project data is hosted in Corpora. For this project, I also developed a custom WordPress plugin that queries the Corpora API for project data, presents images via IIIF tiling, and plots GIS data using Leaflet.
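Although the plugin itself is written in PHP, the API-consumption pattern it relies on can be sketched in a few lines of Python; the host, endpoint path, and field names below are hypothetical rather than Corpora’s documented interface:

    import requests

    CORPORA_HOST = "https://corpora.example.edu"  # hypothetical host

    def fetch_artworks(corpus_id: str, artist: str, page: int = 1) -> list[dict]:
        """Return one page of artwork records filtered by artist."""
        resp = requests.get(
            f"{CORPORA_HOST}/api/corpus/{corpus_id}/ArtWork/",  # illustrative route
            params={"artist": artist, "page": page, "page_size": 50},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json().get("records", [])

The plugin renders records returned this way as IIIF image tiles and Leaflet map markers.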
Maria Edgeworth Letters Project (MELP) 2023-present
Principal Investigators: Susan Egenolf, Meredith Hale, Hilary Havens, Carrie Johnson, Jessica Richard, and Robin Runia
My Role: Lead Developer and Technical Editor
MELP makes available the collected letters of Maria Edgeworth, providing interfaces to browse, search, and explore them, including an IIIF viewer for the letter images themselves. Project data is hosted in Corpora. I wrote the logic to ingest the data from TEI-encoded letters and built a custom WordPress plugin that queries the Corpora API for project data and presents it. Of note is a custom letter viewer that places a zoomable letter image side by side with its transcription and keeps the image in sync with the transcription as the user scrolls.
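A simplified sketch of that ingestion logic, using lxml; the element paths reflect common TEI correspondence practice and may differ from MELP’s actual encoding:

    from lxml import etree

    TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

    def extract_letter(path: str) -> dict:
        """Pull basic metadata and a plain-text transcription from one TEI letter."""
        tree = etree.parse(path)

        def first(xpath: str) -> str:
            hits = tree.xpath(xpath, namespaces=TEI_NS)
            return str(hits[0]).strip() if hits else ""

        text_nodes = tree.xpath("//tei:text//text()", namespaces=TEI_NS)
        return {
            "title": first("//tei:titleStmt/tei:title/text()"),
            "sent": first("//tei:correspAction[@type='sent']/tei:date/@when"),
            "recipient": first("//tei:correspAction[@type='received']/tei:persName/text()"),
            "transcription": " ".join(" ".join(text_nodes).split()),
        }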
The Carlyle Letters Online (CLO) 2022-present
Principal Investigator: Brent Kinser
My Role: Lead Developer
The CLO makes available the collected letters and photographs of Thomas and Jane Carlyle, providing interfaces to browse, search, and explore them. This project came to me as an admirable yet unsustainable boutique web application; thankfully, its data was almost entirely TEI encoded. I wrote logic to extract the TEI data into Corpora and, much as with TAP and MELP above, built a custom WordPress plugin that queries the Corpora API and presents the data, matching the legacy application’s appearance. Over time, I also implemented an advanced search feature that queries and presents results from Corpora’s API.
The New Variorum Shakespeare (NVS) 2019-present
Principal Investigator: Robert Stagg
My Role: Lead Developer
This project seeks to digitally represent the play text, textual variations over time across major editions, and scholarly commentary for every Shakespeare play. Excepting some initial HTML/CSS development, my role has been to engineer the entirety of the digital NVS, from the ingestion tasks that extract data from TEI-encoded volumes to the presentation of that data via the web-based variorum and paratext viewers. The variorum viewer is an extremely complicated undertaking, involving the reconstruction of synthetic play lines from the editors’ textual notations of variants, a histogram visualization of changes over time across editions, the highlighting of commentary lemmata, and many other ongoing challenges, including backend tools for the creation of future NVS editions. Everything pertaining to the digital NVS, from the backend to the frontend, is hosted by Corpora.
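As a toy illustration of the synthetic-line idea, assuming a drastically simplified variant model (the real NVS notation handles partial lemmata, sigla ranges, and nested variants):

    def apply_variant(base_line: str, lemma: str, reading: str) -> str:
        """Substitute one edition's reading for the copy-text lemma."""
        if lemma not in base_line:
            raise ValueError(f"lemma {lemma!r} not found in line")
        return base_line.replace(lemma, reading, 1)

    # apply_variant("To be, or not to be, that is the question:",
    #               "that is the question:", "I there's the point.")
    # roughly reconstructs the First Quarto's reading of the line.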
Linked Infrastructure for Networked Cultural Scholarship (LINCS)
Principal Investigator: Susan Brown
My Role: Principal Investigator for TAMU Sub-award; Lead Developer
LINCS is a sprawling endeavor to schematize, convert, ingest, and explore large datasets consisting of linked open data, involving many institutions, chiefly Canadian ones. My role as principal investigator for Texas A&M’s contributions was twofold: first, to convert the ~2 million bibliographic metadata entries comprising the catalog of the Advanced Research Consortium (ARC) into linked open data for ingestion into the LINCS triplestore; second, to develop a “rich prospect browser” for visually exploring the LINCS dataset. Both objectives were accomplished using Corpora.
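A minimal sketch of the conversion step, using rdflib; the predicate choices and record layout here are illustrative, not the project’s actual mapping:

    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import DCTERMS, RDF

    BF = Namespace("http://id.loc.gov/ontologies/bibframe/")

    def record_to_triples(record: dict, graph: Graph) -> None:
        """Add triples describing a single ARC catalog entry."""
        work = URIRef(record["uri"])
        graph.add((work, RDF.type, BF.Work))
        graph.add((work, DCTERMS.title, Literal(record["title"])))
        graph.add((work, DCTERMS.date, Literal(record["date"])))

    g = Graph()
    record_to_triples(
        {"uri": "https://example.org/arc/item/1", "title": "An Essay", "date": "1798"},
        g,
    )
    print(g.serialize(format="turtle"))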
Beowulf’s Afterlives Bibliographic Database (BABD)
Principal Investigator: Britt Mize
My Role: Lead Developer
BABD is the most comprehensive record of texts, representations, and adaptations of Beowulf from 1705 to the present, in all languages, genres, and media forms. It provides interfaces for filtering, searching, and viewing bibliographic records. My role was to create the data schema and web application using MySQL and Django. I also created a unique interface, akin to a network graph, that allows users to explore influences between artifacts in the database across time.
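A condensed sketch of that schema as a Django model; the field names are illustrative, and the production schema is considerably larger:

    from django.db import models

    class Artifact(models.Model):
        """One Beowulf-related text, representation, or adaptation."""
        title = models.CharField(max_length=512)
        year = models.IntegerField()
        medium = models.CharField(max_length=128)
        language = models.CharField(max_length=128)
        # self-referential link powering the influence-graph interface
        influences = models.ManyToManyField(
            "self", symmetrical=False, related_name="influenced_by", blank=True
        )

        def __str__(self) -> str:
            return f"{self.title} ({self.year})"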
Reading the First Books: Multilingual, Early-Modern OCR for Primeros Libros 2016-2017
Principal Investigator: Hannah Alpert-Abrams
My Role: Lead Developer for TAMU contributions
This project was a two-year, multi-university effort to develop tools for the automatic transcription of early modern printed books. My role was to adapt the “eMOP Dashboard” (see the eMOP project below) so that a team of scholars from UT Austin could launch massively parallel OCR tasks on the Brazos Supercomputing Cluster. This eventually entailed a complete rewrite of the software, which became the basis for Corpora, and allowed the Primeros Libros team to perform OCR on every printed volume published in the Americas before 1601 using cutting-edge neural network technology.
The Early Modern OCR Project (eMOP) 2012-2014
Principal Investigator: Laura Mandell
My Role: Graduate Assistant Researcher
This ambitious project involved training the Tesseract 3 OCR engine to produce automated transcriptions of over 46 million page images from documents published between 1475 and 1800. I initially architected and populated the relational database schema for the project, which evolved over time to store information about each of those page images. I also developed a tool called Franken+ that allowed the eMOP team to train the Tesseract OCR engine using a visual interface.

Selected Publications

2023
Tarpley, Bryan, Nancy Sumpter, and Kayley Hart. "Helping Humanists Hack: A Tale of Program Coordination, Classroom Support, Adaptive Pedagogy, and Python." Digital Humanities Workshops: Lessons Learned, edited by Jennifer Guiliano and Laura Estill, Routledge, 2023.
2023
Burdick, Anne, Laura Mandell, Bryan Tarpley, and Katayoun Torabi. "Using Data and Design to Bring the New Variorum Shakespeare Online." The Routledge Handbook of Shakespeare and Interface, edited by Clifford Werier and Paul Budra, Routledge, 2023.
2013
Torabi, Katayoun, Jessica Durgan, and Bryan Tarpley. "Early modern OCR project (eMOP) at Texas A&M University: using Aletheia to train Tesseract." ACM Document Engineering Proceedings, September 2013, pp. 23-26.
2009
Tarpley, Bryan. "The Hopeful Midwife: Facing Epistemic Limitations." Journal of Faith and the Academy, vol. 2, no. 2, 2009.

Selected Presentations

2025
“Corpora: A Dataset Studio for the Digital Humanities.” Cultures of Correspondence Symposium at Texas A&M University in College Station, TX.
2024
“Corpora: A Dataset Studio for the Digital Humanities.” TxDH Symposium at Baylor University in Waco, TX.
2024
“Corpora: A Dataset Studio for the Digital Humanities.” DH Inside Out, a pre-conference workshop for Digital Humanities at the Roy Rosenzweig Center for History and New Media in Arlington, VA.
2023
"Stratocumulus: A Network Graph Interface for Browsing Big Data." Making Links, University of Guelph, Canada. Co-presented with Akseli Palén.
2022
"Cockyboo: Archiving Harvey Matusow’s Journey from Red Baiter to Mr. Rogers." American Literature Association, Chicago, IL. Co-presented with Nick Kocurek.
2019
"Introducing the ESTC21: Converting the English Short Title Catalogue to Linked Data, Original Goals and Lessons Learned." Consortium of European Research Libraries Annual Seminar, Göttingen, Germany. Co-presented with Brian Geiger.
2017
"'So yo then man what’s your story?': David Foster Wallace, Paul Ricoeur, and Narrative Identity." Ricoeur Studies, Boston, MA. Co-presented with Greg McKinzie.
2017
"Breakdowns in Machine Reading: Attempting to De-privilege Modern English Print with the Power of Supercomputing and the DH Dashboard." Digital Frontiers at University of North Texas in Denton, TX.
2017
"The Psalter Project: Providing Mediated Access to Religio-Political Subjects in Early Modern England." Digital Humanities at McGill University in Montreal, Canada. Co-presented with Dr. Nandra Perry.
2016
"Enabling Enterprise Web Services with Asynchronous Job Queues." Texas A&M Tech Summit in Galveston, TX.
2013
"Early Modern OCR Project (eMOP) at Texas A&M." Document Engineering in Florence, Italy. Co-presented with Katayoun Torabi.

Professional Development

2024
Completed course "NLP Coding Libraries and Network Analysis for Text Corpora" at the Digital Humanities Summer Institute at the University of Victoria in Victoria, Canada.
2018
Completed course "The Frontend: Modern JavaScript & CSS Development" at the Digital Humanities Summer Institute at the University of Victoria in Victoria, Canada.
2017
Completed course "Wrangling Big Data for DH" at the Digital Humanities Summer Institute at the University of Victoria in Victoria, Canada.

Pedagogy

I have two years of experience teaching a 4/4 load of undergraduate composition and literature survey courses as an adjunct faculty member in the Department of English at Stephen F. Austin State University, and an additional year of teaching similar courses as a Graduate Assistant Teacher at Texas A&M University.

I have seven years of experience teaching Python to humanists through the Programming4Humanists continuing education program at the Center of Digital Humanities Research at Texas A&M University. This involved teaching hybrid courses (in-person and online, synchronous and asynchronous) in which I provided weekly two-hour sessions of lecture, live coding, and troubleshooting for up to a semester at a time. These courses centered on humanities applications of Python, such as natural language processing, parsing and extracting data from XML, building OCR pipelines, and querying APIs.
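As a taste of the kind of exercise these sessions built toward, the following standard-library sketch tokenizes a plain-text novel and counts its most frequent words; the input file name is a placeholder:

    import re
    from collections import Counter

    def top_words(path: str, n: int = 10) -> list[tuple[str, int]]:
        """Return the n most common words in a UTF-8 text file."""
        with open(path, encoding="utf-8") as f:
            words = re.findall(r"[a-z']+", f.read().lower())
        return Counter(words).most_common(n)

    # top_words("persuasion.txt")  # hypothetical input file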

Technical Skills