GailBot: An Automatic Transcription System for Conversation Analysis

News

Overview

Take a second to listen to the following conversation:

Most speech-to-text (STT) systems would produce a transcript that looks something like this:

[Image: transcript of the conversation as produced by a typical STT system]

This transcript is useful, but it falls short of truly capturing the conversation. There are a few obvious omissions. Perhaps most importantly, there is a 4.1-second silence between lines 5 and 6. This gap does some heavy lifting: it shows that Speaker 2 expected Speaker 1 to understand their joke. In addition, STT systems typically ignore laughter. On line 2, Speaker 2 laughs, an early cue that they are joking. On line 7, Speaker 1 laughs, indicating that they finally understand the joke. To understand the sequence, what we really need is a transcript more like the one below, which marks paralanguage – everything that comes along with the words.

[Image: Jeffersonian transcript of the same conversation, with paralanguage marked]

The system to transcribe paralinguistic features – Jeffersonian transcription – was created by Gail Jefferson. Jeffersonian transcribing is slow going. On one hand, the process of re-listening to the audio to add more and more details helps the researcher understand the mechanisms underlying the interaction. On the other hand, the laborious process limits the amount of data researchers can analyze. For example, it would be close to impossible to generate enough Jeffersonian transcripts to train a deep learning language model.

Gail Jefferson (photo from Wikipedia)

Enter: GailBot. We designed GailBot to create first-pass transcriptions of some paralinguistic features (speech rate, silences, overlaps, and laughter). It interfaces with existing STT algorithms and then applies post-processing modules that insert Jeffersonian symbols. All of this is useful, but GailBot’s most valuable characteristic is that it allows researchers to create, insert, and adjust plugins, which provide a standardized interface for adding customized processing to the GailBot pipeline.
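To make the idea of a post-processing module concrete, here is a minimal Python sketch of one such step: turning gaps between timestamped STT utterances into Jeffersonian timed-silence markers such as (4.1). This is an illustration only, not GailBot's actual code or plugin API. The `Utterance` class, the `insert_silences` function, the 0.2-second threshold, and the sample data are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Utterance:
    """One STT result: who spoke, what was said, and when (in seconds)."""
    speaker: str
    text: str
    start: float
    end: float


def insert_silences(utterances: List[Utterance], min_gap: float = 0.2) -> List[str]:
    """Return transcript lines with timed-silence markers between turns.

    A gap of at least `min_gap` seconds between the end of one utterance
    and the start of the next becomes its own line, e.g. "(4.1)".
    The default threshold is an assumption chosen for this sketch.
    """
    lines: List[str] = []
    previous_end: Optional[float] = None
    for utt in utterances:
        if previous_end is not None:
            gap = utt.start - previous_end
            if gap >= min_gap:
                # Jeffersonian convention: silence length in parentheses.
                lines.append(f"({gap:.1f})")
        lines.append(f"{utt.speaker}: {utt.text}")
        previous_end = utt.end
    return lines


if __name__ == "__main__":
    # Made-up timings and placeholder text, not taken from the recording above.
    sample = [
        Utterance("SP2", "first turn", 0.0, 1.0),
        Utterance("SP1", "second turn", 5.1, 6.0),
    ]
    for line in insert_silences(sample):
        print(line)
```

A real module would also need to handle overlaps, laughter, and speech rate, which require more information than start and end times alone.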

For more details, please refer to the GailBot paper published in Dialogue & Discourse.

Availability and Usage

As part of our efforts to expand the availability of GailBot to researchers from various fields, we have developed and intend to maintain a Graphical User Interface (GUI) based GailBot App.

The GailBot app has been developed for macOS and can be installed using either our .pkg or .dmg bundle. We recommend that users install the app using the .pkg bundle. The installation bundles are available below:

  • GailBot .pkg bundle – beta release
  • GailBot .dmg bundle – beta release

NOTE: We are currently distributing our beta release only to select testers. If you would like to be considered for testing, please fill out this Google form. Thank you!

Please follow our GailBot user manual for a step-by-step walkthrough of the installation process and a basic overview of the transcription process within the app.

We also have a PyPI package that can be installed using pip and allows users to use GailBot directly within their Python applications. However, please note that we are a small development team with limited resources, which means that we have not been able to maintain the PyPI package while developing our GUI. We will update it as soon as possible.

Below is a list of resources that users may find useful:

Users can reach out with questions, feedback, or concerns to this email.

Acknowledgements

GailBot has been made possible by a small but dedicated team of researchers and research assistants in the Human Interaction Lab at Tufts University.

Here is the list of faculty, post-docs, and graduate students who have contributed:

  1. J.P. de Ruiter
  2. Saul Albert
  3. Muhammad Umair
  4. Julia Mertens

Here is the list of undergraduate research assistants who have contributed:

  1. Siara Small – B.S. in Computer Science @Tufts
  2. Yike Li – B.S. in Computer Science @Tufts
  3. Annika Tanner – B.S. in Computer Science @Tufts
  4. Muyin Yao – B.S. in Computer Science @Tufts
  5. Rosana Vitiello – B.S. in Computer Science @Tufts
  6. Eva Denman – B.S. in Engineering Psychology @Tufts

Citation

Please cite GailBot using the following bibtex:

@article{umair2022gailbot,
  title={GailBot: An automatic transcription system for Conversation Analysis},
  author={Umair, Muhammad and Mertens, Julia Beret and Albert, Saul and de Ruiter, Jan P},
  journal={Dialogue \& Discourse},
  volume={13},
  number={1},
  pages={63--95},
  year={2022}
} 

Liability Notice

GailBot is a tool for generating specialized transcripts. However, it is not responsible for output quality. Generated transcripts are meant to be first drafts that can be manually improved. They are not meant to replace manual transcription.

GailBot may use external Speech-to-Text systems or third-party services. The development team is not responsible for any transactions between users and these services. Additionally, the development team does not guarantee the accuracy or correctness of any plugin. Plugins have been developed in good faith and we hope that they are accurate. However, users should always verify results.

By using GailBot, users agree to cite GailBot and the Tufts Human Interaction Lab in any publications or results that are a direct or indirect result of using GailBot.

Publication Resources

This section contains resources that were included as part of various publications.

Media Files