Attention is Turing complete

dc.catalogador: yvc
dc.contributor.author: Pérez, Jorge
dc.contributor.author: Barceló Baeza, Pablo
dc.contributor.author: Marinkovic, Javier
dc.date.accessioned: 2023-11-19T02:34:22Z
dc.date.available: 2023-11-19T02:34:22Z
dc.date.issued: 2021
dc.description.abstract: Alternatives to recurrent neural networks, in particular architectures based on self-attention, are gaining momentum for processing input sequences. In spite of their relevance, the computational properties of such networks have not yet been fully explored. We study the computational power of the Transformer, one of the most paradigmatic architectures exemplifying self-attention. We show that the Transformer with hard attention is Turing complete, based exclusively on its capacity to compute and access internal dense representations of the data. Our study also reveals some minimal sets of elements needed to obtain this completeness result.
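
The abstract refers to hard attention, i.e., attention that selects a single maximizing position rather than a softmax mixture. Below is a minimal illustrative sketch (not taken from the paper) of that distinction, assuming dot-product scores and NumPy; the function names hard_attention and soft_attention are hypothetical. Note that the paper's completeness result additionally relies on arbitrary-precision internal representations, which this finite-precision sketch does not capture.

import numpy as np

def hard_attention(query, keys, values):
    # Hard attention: pick the single position with the maximal score q . k_j
    # and return its value vector unchanged.
    scores = keys @ query
    j = int(np.argmax(scores))
    return values[j]

def soft_attention(query, keys, values):
    # Standard softmax attention, shown for contrast: a weighted mixture.
    scores = keys @ query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ values

# Toy example: three stored positions, two-dimensional keys, scalar values.
keys = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
values = np.array([[10.0], [20.0], [30.0]])
query = np.array([0.9, 0.8])
print(hard_attention(query, keys, values))  # [30.] -- exactly one stored vector
print(soft_attention(query, keys, values))  # a blend of all three values

Because hard attention returns one stored vector exactly, the network can read back an internal representation without the blurring introduced by a softmax average; this is the property the abstract alludes to when it speaks of accessing internal dense representations.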
dc.fechaingreso.objetodigital: 2023-11-19
dc.fuente.origen: PREI
dc.identifier.issn: 1532-4435
dc.identifier.uri: https://repositorio.uc.cl/handle/11534/75358
dc.information.autoruc: Instituto de Ingeniería Matemática y Computacional ; Barceló Baeza, Pablo ; 0000-0003-2293-2653 ; 13516
dc.issue.numero: 75
dc.language.iso: en
dc.nota.acceso: Full text
dc.pagina.final: 35
dc.pagina.inicio: 1
dc.revista: Journal of Machine Learning Research
dc.rights: open access
dc.subject: Transformers
dc.subject: Turing completeness
dc.subject: Self-attention
dc.subject: Neural networks
dc.subject: Arbitrary precision
dc.subject.ddc: 500
dc.subject.dewey: Sciences
dc.title: Attention is Turing complete
dc.type: article
dc.volumen: 22
sipa.codpersvinculados: 13516
Files (Original bundle)
Name: TEXTO COMPLETO_Attention is Turing Complete.pdf
Size: 382.68 KB
Format: Adobe Portable Document Format