De Ruiter, J. P. (2000). The production of gesture and speech. In McNeill, D. (Ed.), Language and Gesture (pp. 284-311). Cambridge: Cambridge University Press.

Research topics in the field of speech-related gesture that have received considerable attention are the function of gesture, its synchronization with speech, and its semiotic properties. While the findings of these studies often have interesting implications for theories about the processing of gesture in the human brain, few studies have addressed this issue in the framework of information processing.

In this chapter, I will present a general processing architecture for gesture production. It can be used as a starting point for investigating the processes and representations involved in gesture and speech. For convenience, I will use the term ‘model’ when referring to ‘processing architecture’ throughout this chapter.

Since not every gesture researcher believes that information-processing models are an appropriate way of investigating gesture (see, e.g., McNeill 1992), I will first argue that information-processing models are essential theoretical tools for understanding the processing involved in gesture and speech. I will then proceed to formulate a new model for the production of gesture and speech, called the Sketch Model. It is an extension of Levelt’s (1989) model for speech production. The modifications and additions to Levelt’s model are discussed in detail. At the end of the section, the workings of the Sketch Model are demonstrated using a number of illustrative gesture/speech fragments as examples.

Subsequently, I will compare the Sketch Model with both McNeill’s (1992) growth-point theory and the information-processing model by Krauss, Chen & Gottesman (this volume). While the Sketch Model and the model by Krauss et al. are formulated within the same framework, they are based on fundamentally different assumptions. A comparison between the Sketch Model and growth-point theory is hard to make, since growth-point theory is not an information-processing theory. Nevertheless, the Sketch Model and growth-point theory share a number of fundamental assumptions.
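
To make the proposed architecture concrete, here is a minimal Python sketch of the processing flow the Sketch Model assumes: a single conceptualizer hands an imagistic sketch to a gesture planner and a preverbal message to Levelt’s formulator, which then operate in parallel. All data structures and function names below are illustrative assumptions, not part of the published model.

```python
# Minimal, illustrative sketch of the Sketch Model's processing flow
# (De Ruiter 2000). Module names follow Levelt (1989); the data
# structures and function signatures are assumptions for exposition.

from dataclasses import dataclass


@dataclass
class CommunicativeIntention:
    imagistic: dict       # spatio-motoric imagery, destined for gesture
    propositional: dict   # conceptual content, destined for speech


def conceptualizer(intention: CommunicativeIntention):
    """Split one intention into a sketch (for gesture) and a preverbal
    message (for speech): the central claim of the Sketch Model."""
    return intention.imagistic, intention.propositional


def gesture_planner(sketch: dict) -> str:
    return f"motor program expressing {sketch}"


def formulator(preverbal_message: dict) -> str:
    return f"phonetic plan encoding {preverbal_message}"


intention = CommunicativeIntention(
    imagistic={"trajectory": "upward spiral"},
    propositional={"event": "climb", "manner": "spiral"},
)
sketch, message = conceptualizer(intention)
print(gesture_planner(sketch))   # gesture and speech are planned...
print(formulator(message))       # ...in parallel from a common origin
```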

De Ruiter, J. P. (2017). The asymmetric redundancy of gesture and speech. In Church, R.B., Alibali, M.W., & Kelly, S.D. (Eds.), Why gesture? How the hands function in speaking, thinking and communicating. Amsterdam: John Benjamins Publishing Company.

A number of studies from recent decades have demonstrated that iconic gestures are shaped not only by our mental imagery but also, quite strongly, by structural properties of the accompanying speech. These findings are problematic for the central assumption in the Sketch Model (De Ruiter, 2000) about the function of representational gesture. I suggest a seemingly small but fundamental modification to the processing assumptions in the Sketch Model that not only accommodates the discussed empirical findings, but also explains many other well-known gesture phenomena. The new model also generates new and testable predictions regarding the relationship between gesture and speech.

De Ruiter, J. P. (2007). Postcards from the mind: the relationship between speech, imagistic gesture, and thought. Gesture, 7(1), 21-38.

In this paper, I compare three different assumptions about the relationship between speech, thought and gesture. These assumptions have profound consequences for theories about the representations and processing involved in gesture and speech production. I associate these assumptions with three simplified processing architectures. In the Window Architecture, gesture provides us with a ‘window into the mind’. In the Language Architecture, properties of language have an influence on gesture. In the Postcard Architecture, gesture and speech are planned by a single process to become one multimodal message. The popular Window Architecture is based on the assumption that gestures come, as it were, straight out of the mind. I argue that during the creation of overt imagistic gestures, many processes, especially those related to (a) recipient design and (b) effects of language structure, cause an observable gesture to be very different from the original thought that it expresses. The Language Architecture and the Postcard Architecture differ from the Window Architecture in that they both incorporate a central component which plans gesture and speech together; however, they differ from each other in the way they align gesture and speech. The Postcard Architecture assumes that the process creating a multimodal message involving both gesture and speech has access to the concepts that are available in speech, while the Language Architecture relies on interprocess communication to resolve potential conflicts between the content of gesture and speech.
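
The differences in information flow between the three architectures can be caricatured in a few lines of Python. None of this code comes from the paper; the function names and the dictionary layout of the ‘thought’ are hypothetical.

```python
# Hypothetical caricature of the three architectures compared in
# De Ruiter (2007). Only the direction of information flow matters here.

def plan_gesture(imagery, constraints=None):
    return ("gesture", imagery, constraints)

def plan_speech(proposition):
    return ("speech", proposition)

def window_architecture(thought):
    # Gesture reads imagery directly ("straight out of the mind");
    # speech is planned independently.
    return plan_gesture(thought["imagery"]), plan_speech(thought["proposition"])

def language_architecture(thought):
    # Speech is planned first; its structure constrains gesture via
    # interprocess communication.
    speech = plan_speech(thought["proposition"])
    return plan_gesture(thought["imagery"], constraints=speech), speech

def postcard_architecture(thought):
    # One central planner composes a single multimodal message, with
    # access to the concepts that are available for speech.
    gesture = plan_gesture(thought["imagery"],
                           constraints=thought["proposition"])
    speech = plan_speech(thought["proposition"])
    return gesture, speech

thought = {"imagery": "round shape", "proposition": "the ball"}
for arch in (window_architecture, language_architecture, postcard_architecture):
    print(arch.__name__, arch(thought))
```

The only difference between the last two functions is whether gesture planning sees the output of speech planning or the shared conceptual input, which is exactly the contrast between interprocess communication and a single multimodal planner.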

De Ruiter, J. P., Noordzij, M. L., Newman-Norlund, S. E., Newman-Norlund, R. D., Hagoort, P., Levinson, S. C., & Toni, I. (2010). Exploring the cognitive infrastructure of communication. Interaction Studies, 11, 51-77.

Human communication is often thought about in terms of transmitted messages in a conventional code like a language. But communication requires a specialized interactive intelligence. Senders have to be able to perform recipient design, while receivers need to be able to do intention recognition, knowing that recipient design has taken place. To study this interactive intelligence in the lab, we developed a new task that taps directly into the underlying abilities to communicate in the absence of a conventional code. We show that subjects are remarkably successful communicators under these conditions, especially when senders get feedback from receivers. Signaling is accomplished by the manner in which an instrumental action is performed, such that instrumentally dysfunctional components of an action are used to convey communicative intentions. The findings have important implications for the nature of the human communicative infrastructure, and the task opens up a line of experimentation on human communication.

On the origin of intentions

Any model of motor control or sensorimotor transformations starts from an intention to trigger a cascade of neural computations, yet how intentions themselves are generated remains a mystery. Part of the difficulty in dealing with this mystery might be related to the received wisdom of studying sensorimotor processes and intentions in individual agents. Here we explore the use of an alternative approach, focused on understanding how we induce intentions in other people. Under the assumption that generating intentions in a third person relies on similar mechanisms to those involved in generating first-person intentions, this alternative approach might shed light on the origin of our own intentions. Therefore, we focus on the cognitive and cerebral operations supporting the generation of communicative actions, i.e. actions designed (by a Sender) to trigger (in a Receiver) the recognition of a given communicative intention. We present empirical findings indicating that communication requires the Sender to select his behavior on the basis of a prediction of how the Receiver will interpret this behavior; and that there is spatial overlap between the neural structures supporting the generation of communicative actions and the generation of first-person intentions. These results support the hypothesis that the generation of intentions might be a particular instance of our ability to induce and attribute mental states to an agent. We suggest that motor intentions are retrodictive with respect to the neurophysiological mechanisms that generate a given action, while being predictive with respect to the potential intention attribution evoked by a given action in other agents.

De Ruiter, J. P., & Enfield, N. (2007, July 22-27). The BIC model: a blueprint for the communicator. In Stephanidis, C. (Ed.), Universal Access in Human-Computer Interaction: Applications and Services. Proceedings of the 4th International Conference on Universal Access in Human-Computer Interaction, Lecture Notes in Computer Science (pp. 251-258). Berlin: Springer.

In this paper, we outline a cognitive architecture for communicators, called the BIC model. The model consists of three main components. First, a (B)iological component, in which the genetic or built-in capacities of the communicator are specified. Second, an (I)nteraction Engine, which uses neo-Gricean mutual simulation to attribute communicative intentions to signals, and to create signals to convey communicative intentions to the I-system of other agents. The third component of the BIC model is a content-addressable database of (C)onventions, which is used to store form/meaning mappings that have been successfully computed by the I-system. These stored form/meaning mappings are indexed by types of communicative context, so they can be retrieved by the I-system to save computational resources. The model can be used both as a computational architecture for a communication module in an artificial agent and as a conceptual model of the human communicator.
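
As a rough illustration of how the C-component could cache the form/meaning mappings that the I-system computes, consider the following sketch; the class, its methods, and the dictionary-based store are invented for exposition.

```python
# Illustrative sketch of the BIC model's control flow (De Ruiter &
# Enfield 2007). The paper specifies the components, not this code;
# class and method names are invented.

class BICCommunicator:
    def __init__(self):
        # C-component: content-addressable store of form/meaning
        # mappings, indexed by type of communicative context.
        self.conventions = {}

    def interpret(self, signal, context):
        key = (context, signal)
        if key in self.conventions:        # cheap route: stored convention
            return self.conventions[key]
        meaning = self._interaction_engine(signal, context)
        self.conventions[key] = meaning    # cache to save later computation
        return meaning

    def _interaction_engine(self, signal, context):
        # I-component: stand-in for neo-Gricean mutual simulation,
        # which attributes a communicative intention to the signal.
        return f"intention behind {signal!r} given {context!r}"


agent = BICCommunicator()
agent.interpret("points at empty glass", context="dinner table")  # computed by I
agent.interpret("points at empty glass", context="dinner table")  # retrieved from C
```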

De Ruiter, J. P., & De Ruiter, L. E. (2017). Don’t shoot the giant on whose shoulders we are standing. Commentary on Branigan & Pickering. Behavioral and Brain Sciences.

Structural priming is a sufficient, but not a necessary, condition for proving the existence of representations. Absence of evidence is not evidence of absence. Cognitive science relies on the legitimacy of positing representations and processes without “proving” every component. Also, psycholinguistics relies on other methods, including acceptability judgments, to find the materials for priming experiments in the first place.

De Ruiter, J. P., & De Beer, C. (2013). A critical evaluation of models of gesture and speech production for understanding gesture in aphasia. Aphasiology, 27(9), 1015–1030.

Background: Aphasiologists have increasingly started to pay attention not only to the speech that people with aphasia produce, but also to their gestures. As there are a number of competing models of the production of gesture and speech in healthy subjects, it is important to evaluate whether, and if so how, these models can be used to guide research into gesture and speech, and the relationship between the two, in speakers with aphasia.

Aims: The aim of this study is to see how existing models of gesture and speech production accommodate the findings regarding the gesture and speech behaviour of speakers with aphasia, in the hope that (1) these models can shed light on the use of gesture by aphasic speakers, and potentially suggest new approaches to therapy for people with aphasia, and (2) the aphasia gesture data might help fundamental psycholinguistics evaluate the adequacy of existing gesture and speech models.

Methods & Procedures: The methodology here was theoretical. Four models of gesture and speech interaction were critiqued and we reviewed their ability to explain some of the central empirical findings in the area of gesture and speech in aphasia.
Outcomes & Results: The outcomes and results of this theoretical analysis were that, with respect to the relationship between gesture and speech in aphasia, (1) the four models under investigation could be reduced to two models, because three of the investigated models were based on the same core assumptions and (2) both of these models adequately explain these findings, but the Growth Point/Sketch/Interface Model is more satisfactory than the Lexical Access Model, because of the better fit with the experimental results on the use of gesture for facilitating word finding, and because it is more compatible with the finding that gestures are also used to enhance communicative efficiency by replacing speech.

Pickering, M. J., & Garrod, S. (2013). An integrated theory of language production and comprehension. Behavioral and Brain Sciences, 36(4), 329-347.

Currently, production and comprehension are regarded as quite distinct in accounts of language processing. In rejecting this dichotomy, we instead assert that producing and understanding are interwoven, and that this interweaving is what enables people to predict themselves and each other. We start by noting that production and comprehension are forms of action and action perception. We then consider the evidence for interweaving in action, action perception, and joint action, and explain such evidence in terms of prediction. Specifically, we assume that actors construct forward models of their actions before they execute those actions, and that perceivers of others’ actions covertly imitate those actions, then construct forward models of those actions. We use these accounts of action, action perception, and joint action to develop accounts of production, comprehension, and interactive language. Importantly, they incorporate well-defined levels of linguistic representation (such as semantics, syntax, and phonology). We show (a) how speakers and comprehenders use covert imitation and forward modeling to make predictions at these levels of representation, (b) how they interweave production and comprehension processes, and (c) how they use these predictions to monitor the upcoming utterances. We show how these accounts explain a range of behavioral and neuroscientific data on language processing and discuss some of the implications of our proposal.
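
As a toy illustration of the forward-model idea, the snippet below contrasts a fast, approximate internal prediction with slower actual production and uses the mismatch for monitoring; all numbers, including the error threshold, are invented.

```python
# Toy forward-model loop in the spirit of the proposal: predict the
# outcome of an action before executing it, then monitor by comparing
# prediction with the actual outcome. All values are invented.

def forward_model(command: float) -> float:
    """Fast but approximate internal simulation of production."""
    return command * 0.9

def produce(command: float) -> float:
    """Slower actual production (the 'true' mapping, for this toy)."""
    return command * 1.0

command = 5.0
predicted = forward_model(command)      # available before execution
actual = produce(command)
error = actual - predicted

# A large prediction error triggers monitoring and repair; a comprehender
# who covertly imitates the speaker can run the same loop to predict
# the speaker's upcoming utterance.
if abs(error) > 0.25:
    print(f"monitor flags mismatch of {error:+.2f}")
```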

De Ruiter, J. P., & Cummins, C. (2012, September 19-21). A model of intentional communication: AIRBUS (Asymmetric Intention Recognition with Bayesian Updating of Signals). In S. Brown- Schmidt, J. Ginzburg & S. Larsson (Eds.), Proceedings of SemDial 2012 (SeineDial): The 16th Workshop on the Semantics and Pragmatics of Dialogue. Workshop Series on the Semantics and Pragmatics of Dialogue (SEMDIAL-12), Paris, France (pp. 149-150).

The rapid and fluent nature of human communicative interactions strongly suggests the existence of an online mechanism for intention recognition. We motivate and outline a mathematical model that addresses these requirements. Our model provides a way of integrating knowledge about the relationship between linguistic expressions and communicative intentions, through a rapid process of Bayesian update. It enables us to frame predictions about the processes of intention recognition, utterance planning and other-repair mechanisms, and contributes towards a broader theory of communication.
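
The core computation, a Bayesian update of a probability distribution over candidate intentions as each signal arrives, can be sketched as follows; the candidate intentions, priors, and likelihoods are invented for the example and are not taken from the paper.

```python
# Illustrative Bayesian update over candidate communicative intentions,
# the kind of computation AIRBUS posits. Priors and likelihoods below
# are invented for the example.

def bayes_update(prior, likelihood, signal):
    """Return P(intention | signal) for each candidate intention."""
    unnormalized = {i: prior[i] * likelihood[i][signal] for i in prior}
    total = sum(unnormalized.values())
    return {i: p / total for i, p in unnormalized.items()}

prior = {"request": 0.5, "inform": 0.5}
likelihood = {                       # P(signal | intention)
    "request": {"can you": 0.8, "it is": 0.2},
    "inform":  {"can you": 0.1, "it is": 0.9},
}

posterior = bayes_update(prior, likelihood, "can you")
print(posterior)  # {'request': 0.888..., 'inform': 0.111...}

# Feeding each posterior back in as the next prior gives the rapid,
# incremental intention recognition that fluent interaction requires.
```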