Attention is Turing complete

Pérez, Jorge; Barceló Baeza, Pablo; Marinkovic, Javier

Files

TEXTO COMPLETO_Attention is Turing Complete.pdf (382.68 KB)

Date

2021

Authors

Pérez, Jorge

Barceló Baeza, Pablo

Marinkovic, Javier

Abstract

Alternatives to recurrent neural networks, in particular architectures based on self-attention, are gaining momentum for processing input sequences. In spite of their relevance, the computational properties of such networks have not yet been fully explored. We study the computational power of the Transformer, one of the most paradigmatic architectures exemplifying self-attention. We show that the Transformer with hard attention is Turing complete based exclusively on its capacity to compute and access internal dense representations of the data. Our study also reveals some minimal sets of elements needed to obtain this completeness result.

Keywords

Transformers, Turing completeness, Self-Attention, Neural networks, Arbitrary precision

URI

https://repositorio.uc.cl/handle/11534/75358

Collections

Journal articles
