GailBot: An Automatic Transcription System for Conversation Analysis

Contents:
Resources
Overview
Availability and Usage
Plugins
Video Tutorials
Feedback
Acknowledgements
Citation
Liability Notice
Publication Resources

Resources

News:
GailBot paper published in Dialogue and Discourse. 
News Coverage by Tufts University.
GailBot AMLAP 2020 Presentation and Abstract

Documentation:
GailBot User Manual

Forms
GailBot App. Request Form

Overview

Take a second to listen to the following conversation:

Most speech-to-text (STT) systems would produce a transcript that looks something like this:

Example Speech to Text transcription

This transcript is useful, but falls short of truly capturing the conversation. There are a few obvious omissions. Perhaps most importantly, there is a 4.1 second silence between lines 5 and 6. This gap does some heavy lifting: it shows that Speaker 2 expected Speaker 1 to understand their joke. In addition, STT systems typically ignore laughter. On line 2, Speaker 2 laughs, an early cue that they are joking. On line 7, Speaker 1 laughs, indicating that they finally understand the joke. To really understand the sequence, what we need is a transcript more like the one below, which marks paralanguage – everything that comes along with the words.

Example Speech to Text transcription with added markers for paralinguistic features

The system used to transcribe paralinguistic features – Jeffersonian transcription – was created by Gail Jefferson. Jeffersonian transcription is slow going. On one hand, the process of re-listening to the audio to add more and more details helps the researcher understand the mechanisms underlying the interaction. On the other hand, the laborious process limits the amount of data researchers can analyze. For example, it would be close to impossible to generate enough Jeffersonian transcripts to train a deep learning language model.

Enter: GailBot. We designed GailBot to create first-pass transcriptions of some paralinguistic features (speech rate, silences, overlaps and laughter). It interfaces with existing STT algorithms, and then applies post-processing modules that insert Jeffersonian symbols. All of this is useful, but GailBot’s most valuable characteristic is that it allows researchers to create, insert, and adjust plugins, which provide a standardized interface for applying custom algorithms to the GailBot pipeline.
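As a rough illustration of what a post-processing step of this kind might look like, the sketch below takes word-level timestamps (the sort of output many STT engines return) and inserts Jeffersonian silence markers: a timed pause in parentheses, such as (4.1), or a micropause written as (.). This is not GailBot's actual implementation. The Word class, the function name, and both thresholds are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One word from a hypothetical STT result (illustrative only)."""
    speaker: str
    text: str
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording


def add_silence_markers(words, min_gap=0.1, micropause=0.2):
    """Return transcript tokens with silence markers between words.

    Gaps below `min_gap` seconds are ignored, gaps below `micropause`
    become "(.)", and longer gaps are written as the duration in
    seconds with one decimal place. Both thresholds are assumed values.
    """
    tokens = []
    previous_end = None
    for word in words:
        if previous_end is not None:
            gap = word.start - previous_end
            if gap >= micropause:
                tokens.append(f"({gap:.1f})")
            elif gap >= min_gap:
                tokens.append("(.)")
        tokens.append(word.text)
        previous_end = word.end
    return tokens


words = [
    Word("SP1", "so", 0.0, 0.4),
    Word("SP1", "anyway", 0.55, 1.0),
    Word("SP2", "oh", 5.1, 5.3),
]
print(" ".join(add_silence_markers(words)))
```

A real pipeline would also need to handle speaker changes, overlapping speech, and timestamp errors from the STT engine, which this sketch ignores.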

Availability and Usage

GailBot is available both as a Graphical User Interface (GUI) based app for non-technical users and as a Python package for developers. The GailBot app is currently supported only on macOS and can be installed directly using .pkg bundles.

Use the app. request form to submit a request to use GailBot. Thank you!

GailBot is also available as a Python package that can be installed using pip, which allows users to call GailBot directly from their own Python applications.

Please note that we are a small development team with limited resources, which means that there may be bugs and issues with GailBot. We appreciate your patience as we resolve these.

Plugins

GailBot provides a framework for users to add custom algorithms to identify specific paralinguistic features. Plugins are wrapper objects (implemented as Python classes) that provide a standard API for these algorithms to interact with the core pipeline. This means that customizing and changing plugins does not require modifying GailBot’s source code. Plugins may or may not be dependent on each other.
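To make the idea of a plugin wrapper concrete, here is a minimal sketch of how a standard plugin interface with optional dependencies could be organized. It does not reproduce GailBot's real plugin API (see the user manual for that). The Plugin base class, the apply method, the depends_on attribute, the run_plugins function, and the utterance dictionaries are all hypothetical names chosen for illustration.

```python
from abc import ABC, abstractmethod


class Plugin(ABC):
    """Hypothetical plugin interface, not GailBot's real base class."""

    name = ""
    depends_on = ()  # names of plugins whose results this one needs

    @abstractmethod
    def apply(self, utterances, results):
        """Return this plugin's output given earlier plugin results."""


class OverlapPlugin(Plugin):
    """Finds adjacent utterances by different speakers that overlap."""

    name = "overlaps"

    def apply(self, utterances, results):
        pairs = []
        for prev, cur in zip(utterances, utterances[1:]):
            if cur["speaker"] != prev["speaker"] and cur["start"] < prev["end"]:
                pairs.append((prev["id"], cur["id"]))
        return pairs


class OverlapCountPlugin(Plugin):
    """Depends on OverlapPlugin and summarizes its output."""

    name = "overlap_count"
    depends_on = ("overlaps",)

    def apply(self, utterances, results):
        return len(results["overlaps"])


def run_plugins(plugins, utterances):
    """Run every plugin after the plugins it depends on."""
    pending = {p.name: p for p in plugins}
    results = {}
    while pending:
        ready = [p for p in pending.values()
                 if all(dep in results for dep in p.depends_on)]
        if not ready:
            raise ValueError("unresolvable plugin dependencies")
        for plugin in ready:
            results[plugin.name] = plugin.apply(utterances, results)
            del pending[plugin.name]
    return results
```

Because the core loop only relies on the shared interface, swapping in a different overlap detector would mean writing a new class rather than editing the pipeline itself, which is the property the paragraph above describes.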

While users can implement custom plugins, we support a number of ‘official’ plugins as listed below.

Video Tutorials

In addition to written documentation (linked above), we provide a number of tutorial videos for users to get started with GailBot. Note that video tutorials may use concepts defined in the user manual (also linked above).

Basic tutorial for uploading and transcribing sources using GailBot. Includes adding sources, applying settings profiles, and selecting plugin suites.
Tutorial for the creation of settings profiles, which define how a source is processed
Tutorial for creating an engine profile in GailBot, which defines which speech-to-text engine is used for transcription. Currently Watson, Google, Whisper and WhisperX are supported.
Tutorial for installing plugin suites in GailBot, which are used to transcribe paralinguistic features. Covers usage of officially supported and custom plugin suites.
Tutorial for creating a simple custom plugin suite for use with GailBot.

Feedback

Users can reach out with questions, feedback, or concerns to hilab-dev@elist.tufts.edu

Acknowledgements

GailBot has been made possible by a small but dedicated team of researchers and research assistants in the Human Interaction Lab at Tufts University.

Faculty, post-docs, and graduate student contributors:

Muhammad Umair

Umair is a PhD student in Computer Science: Human Robot Interaction in the School of Engineering at Tufts. He graduated with a bachelor’s degree in Computer Science from Tufts in 2021 and was involved in various research projects throughout his undergraduate studies.

He works in the Tufts Human Interaction Lab, where he is the lead developer for GailBot. He is interested in improving naturalistic turn-taking in spoken dialogue systems. His projects are geared towards collecting natural interaction data using both automated and manual methods and developing data-driven AI solutions to improve turn-taking, drawing inspiration from human-human interaction. Currently, he is developing models for detecting Transition Relevance Places (TRPs) to improve the timing of turn-taking in dialogue systems and exploring the use of Large Language Models (e.g., ChatGPT) in spoken interaction.

Dr. Julia Mertens

Julia is a cognitive scientist whose previous research covers biophysiology, facial expression ambiguity, autism spectrum disorders, first impressions, and eye gaze percentage. Her current research focuses on the mechanisms, causes and outcomes of miscommunication. She graduated from the University of Connecticut with a Bachelor of Science in Cognitive Science and Psychology in 2015, and worked as a Research Assistant at the Facial Affective and Communicative Expressions Laboratory at Emerson College from 2015 to 2017.

Julia is currently working as a Senior Scientist at Boston Fusion in Lexington, MA.

Portrait of Julia Mertens by J.P. De Ruiter

Dr. Saul Albert

Portrait of Saul Albert by J.P. De Ruiter

Saul initiated the GailBot Project with Umair and JP while he was a postdoc at Tufts, and he is working on finding a functional compromise between the demands of CA’s Jeffersonian transcription conventions and the constraints of GailBot’s automation and digital transcription formats.

Currently, Saul is an Assistant Professor of Social Science (Social Psychology) at Loughborough University’s Communication & Media division. He is working on a range of applied CA/AI projects including how disabled people and care assistants use and adapt virtual assistant technologies for ‘smart homecare’ use cases. He is also working on a book entitled ‘Conversation Analysis for Conversation Design’ with Cathy Pearl and Elizabeth Stokoe due out with Routledge in 2024/25.

Professor J.P. De Ruiter

Prof. De Ruiter is the director of the Tufts Human Interaction Lab and is the GailBot project PI.

Undergraduate Contributors:

Hannah Shader (June 2023 – Present)

Hannah is a Tufts Undergraduate student, majoring in Computer Science, with interests in Machine Learning and AI. They have contributed to GailBot so far by creating the refactored built-in Plugin Suite for the GailBot application and refining the GailBot API to make it more intuitive. Their future projects will include allowing user selection of plugins within a Plugin Suite and implementing Machine Learning algorithms to detect paralinguistic features with GailBot’s built-in Plugin Suite. 

Vivian (Yike) Li (Sept. 2022 – Present)

Vivian is an undergraduate at Tufts majoring in Computer Science and Cognitive and Brain Science. She was a core contributor to the development of the GailBot desktop app, from its inception to the beta release.

She has developed a desktop app to test smart speakers at Sonos, using Kotlin and Compose Multiplatform. Going forward, she hopes to further expand her knowledge of modern software frameworks and design patterns. Vivian is also a TA for CS105 – Programming Languages at Tufts.

Lakshita Jain (Sept. 2023 – Dec. 2023)

Lakshita is a Tufts Undergraduate student, majoring in Computer Science. She is a sophomore and will graduate in 2026. Her academic interests include software engineering, web development and machine learning.
On GailBot, she will be working on making the plug-in suites more dynamic and integrating machine learning. Besides being a research assistant for GailBot, Lakshita is on the board for Tufts GirlGains and enjoys spending her time reading, out in nature, or baking desserts.

Anya Bhatia (Sept. 2023 – Dec. 2023)

Anya is a junior at Tufts majoring in computer science with minors in math and cognitive and brain sciences. On GailBot, she’ll be working on enhancing plug-in suites and integrating machine learning. Outside of being an RA, you can find her tutoring at the StAAR Center, reading, or dancing.

Jason Wu (June 2023 – Sept. 2023)

Jason Wu is a Tufts undergraduate student majoring in Computer Science and International Relations. His contributions include developing and distributing the GailBot API and assisting in refactoring the built-in Plugin Suite. Outside of GailBot, he is a teaching assistant for CS40: Machine Structure and Assembly Language Programming. He enjoys traveling and attending Boston Symphony Orchestra concerts.

Jacob Boyar (June 2023 – Sept. 2023)

Jacob is a Tufts Undergraduate student majoring in Computer Science. He is a rising Junior and plans to graduate in 2025. His contributions to GailBot include assisting in refactoring the code and documentation. He has also created the video tutorial series for GailBot. Outside of GailBot, Jacob is a pixel art hobbyist who makes animations for fighting games. Currently, he is looking forward to his year abroad in Paris, France.

Siara Small (Sept. 2022 – June 2023)

Siara is an undergraduate student at Tufts University majoring in Computer Science and minoring in Physics and Entrepreneurship for Social Impact. She helped develop the graphical user interface for GailBot and refactor the backend code to support additional features and functionality for the desktop application. As an undergraduate, Siara is involved in Tufts JumboCode and works as a Teaching Assistant for CS11: Introduction to Computer Science. Her academic interests include artificial intelligence, cyberpolicy, software development for social good, and quantum computing.

Annika Tanner (Jan. 2022 – June 2022)

Annika Tanner is a rising senior at Tufts University, studying Computer Science in the School of Arts and Sciences. She has an interdisciplinary passion for linguistics, psychology, tech, English literature and biology.

During her time as a GailBot RA, she assisted in the redesign of GailBot’s backend by choosing and implementing time-efficient data structures to decrease program runtime. She also collaborated to create plugins that identify and transcribe additional paralinguistic features of talk, such as overlaps and laughter. 

Currently, Annika is wrapping up her Microsoft internship as a Technical Program Manager and is excited to embark on a new journey as a Teaching Fellow for CS15: Data Structures at Tufts! 

Muyin Yao (Jan. 2022 – June 2022)

Muyin Yao is a former student who worked on GailBot. She holds a Bachelor of Science in Mathematics and Computer Science from Tufts University.

Rosanna Vitiello (Jan. 2021 – June 2021)

Rosanna Vitiello is a former student who worked on GailBot.

Eva Denman (Sept. 2020 – Jan. 2021)

Eva Denman is a former student who worked on GailBot. She has a master’s degree in Human Factors Engineering. Eva worked on designing the GailBot GUI based on extensive testing with users. Currently, she is working at Fidelity Investments as a full-time UX Researcher.

Citation

Please cite GailBot using the following bibtex:

@article{umair2022gailbot,
  title={GailBot: An automatic transcription system for Conversation Analysis},
  author={Umair, Muhammad and Mertens, Julia Beret and Albert, Saul and de Ruiter, Jan P},
  journal={Dialogue \& Discourse},
  volume={13},
  number={1},
  pages={63--95},
  year={2022}
} 

Liability Notice

GailBot is a tool for generating specialized transcripts. However, it is not responsible for output quality. Generated transcripts are meant to be first drafts that can be manually improved. They are not meant to replace manual transcription.

GailBot may use external Speech-to-Text systems or third-party services. The development team is not responsible for any transactions between users and these services. Additionally, the development team does not guarantee the accuracy or correctness of any plugin. Plugins have been developed in good faith and we hope that they are accurate. However, users should always verify results.

By using GailBot, users agree to cite GailBot and the Tufts Human Interaction Lab in any publications or results that arise directly or indirectly from using GailBot.

Publication Resources

This section contains resources that were included as part of various publications.

Media Files