Kazuki Irie
2020 – today
2024
- [c34] Kazuki Irie, Róbert Csordás, Jürgen Schmidhuber: Self-organising Neural Discrete Representation Learning à la Kohonen. ICANN (1) 2024: 343-362
- [c33] Kazuki Irie, Anand Gopalakrishnan, Jürgen Schmidhuber: Exploring the Promise and Limits of Real-Time Recurrent Learning. ICLR 2024
- [i31] Lorenzo Tiberi, Francesca Mignacco, Kazuki Irie, Haim Sompolinsky: Dissecting the Interplay of Attention Paths in a Statistical Mechanics Theory of Transformers. CoRR abs/2405.15926 (2024)
- [i30] Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber, Christopher Potts, Christopher D. Manning: MoEUT: Mixture-of-Experts Universal Transformers. CoRR abs/2405.16039 (2024)
- [i29] Kazuki Irie, Brenden M. Lake: Neural networks that overcome classic challenges through practice. CoRR abs/2410.10596 (2024)

2023
- [j1] Anand Gopalakrishnan, Kazuki Irie, Jürgen Schmidhuber, Sjoerd van Steenkiste: Unsupervised Learning of Temporal Abstractions With Slot-Based Transformers. Neural Comput. 35(4): 593-626 (2023)
- [c32] Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber: Approximating Two-Layer Feedforward Networks for Efficient Transformers. EMNLP (Findings) 2023: 674-692
- [c31] Kazuki Irie, Róbert Csordás, Jürgen Schmidhuber: Practical Computational Power of Linear Transformers and Their Recurrent and Self-Referential Extensions. EMNLP 2023: 9455-9465
- [c30] Kazuki Irie, Jürgen Schmidhuber: Images as Weight Matrices: Sequential Image Generation Through Synaptic Learning Rules. ICLR 2023
- [c29] Aleksandar Stanic, Anand Gopalakrishnan, Kazuki Irie, Jürgen Schmidhuber: Contrastive Training of Complex-Valued Autoencoders for Object Discovery. NeurIPS 2023
- [i28] Kazuki Irie, Róbert Csordás, Jürgen Schmidhuber: Topological Neural Discrete Representation Learning à la Kohonen. CoRR abs/2302.07950 (2023)
- [i27] Kazuki Irie, Jürgen Schmidhuber: Accelerating Neural Self-Improvement via Bootstrapping. CoRR abs/2305.01547 (2023)
- [i26] Aleksandar Stanic, Anand Gopalakrishnan, Kazuki Irie, Jürgen Schmidhuber: Contrastive Training of Complex-Valued Autoencoders for Object Discovery. CoRR abs/2305.15001 (2023)
- [i25] Mingchen Zhuge, Haozhe Liu, Francesco Faccio, Dylan R. Ashley, Róbert Csordás, Anand Gopalakrishnan, Abdullah Hamdi, Hasan Abed Al Kader Hammoud, Vincent Herrmann, Kazuki Irie, Louis Kirsch, Bing Li, Guohao Li, Shuming Liu, Jinjie Mai, Piotr Piekos, Aditya A. Ramesh, Imanol Schlag, Weimin Shi, Aleksandar Stanic, Wenyi Wang, Yuhui Wang, Mengmeng Xu, Deng-Ping Fan, Bernard Ghanem, Jürgen Schmidhuber: Mindstorms in Natural Language-Based Societies of Mind. CoRR abs/2305.17066 (2023)
- [i24] Kazuki Irie, Anand Gopalakrishnan, Jürgen Schmidhuber: Exploring the Promise and Limits of Real-Time Recurrent Learning. CoRR abs/2305.19044 (2023)
- [i23] Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber: Approximating Two-Layer Feedforward Networks for Efficient Transformers. CoRR abs/2310.10837 (2023)
- [i22] Kazuki Irie, Róbert Csordás, Jürgen Schmidhuber: Practical Computational Power of Linear Transformers and Their Recurrent and Self-Referential Extensions. CoRR abs/2310.16076 (2023)
- [i21] Kazuki Irie, Róbert Csordás, Jürgen Schmidhuber: Automating Continual Learning. CoRR abs/2312.00276 (2023)
- [i20] Róbert Csordás, Piotr Piekos, Kazuki Irie, Jürgen Schmidhuber: SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention. CoRR abs/2312.07987 (2023)

2022
- [c28] Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber: CTL++: Evaluating Generalization on Never-Seen Compositional Patterns of Known Functions, and Compatibility of Neural Representations. EMNLP 2022: 9758-9767
- [c27] Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber: The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization. ICLR 2022
- [c26] Kazuki Irie, Róbert Csordás, Jürgen Schmidhuber: The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention. ICML 2022: 9639-9659
- [c25] Kazuki Irie, Imanol Schlag, Róbert Csordás, Jürgen Schmidhuber: A Modern Self-Referential Weight Matrix That Learns to Modify Itself. ICML 2022: 9660-9677
- [c24] Kazuki Irie, Francesco Faccio, Jürgen Schmidhuber: Neural Differential Equations for Learning to Program Neural Nets Through Continuous Learning Rules. NeurIPS 2022
- [i19] Kazuki Irie, Imanol Schlag, Róbert Csordás, Jürgen Schmidhuber: A Modern Self-Referential Weight Matrix That Learns to Modify Itself. CoRR abs/2202.05780 (2022)
- [i18] Kazuki Irie, Róbert Csordás, Jürgen Schmidhuber: The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention. CoRR abs/2202.05798 (2022)
- [i17] Anand Gopalakrishnan, Kazuki Irie, Jürgen Schmidhuber, Sjoerd van Steenkiste: Unsupervised Learning of Temporal Abstractions with Slot-based Transformers. CoRR abs/2203.13573 (2022)
- [i16] Kazuki Irie, Francesco Faccio, Jürgen Schmidhuber: Neural Differential Equations for Learning to Program Neural Nets Through Continuous Learning Rules. CoRR abs/2206.01649 (2022)
- [i15] Kazuki Irie, Jürgen Schmidhuber: Images as Weight Matrices: Sequential Image Generation Through Synaptic Learning Rules. CoRR abs/2210.06184 (2022)
- [i14] Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber: CTL++: Evaluating Generalization on Never-Seen Compositional Patterns of Known Functions, and Compatibility of Neural Representations. CoRR abs/2210.06350 (2022)
- [i13] Kazuki Irie, Jürgen Schmidhuber: Learning to Control Rapidly Changing Synaptic Connections: An Alternative Type of Memory in Sequence Processing Artificial Neural Networks. CoRR abs/2211.09440 (2022)

2021
- [c23] Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber: The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers. EMNLP (1) 2021: 619-634
- [c22] Imanol Schlag, Kazuki Irie, Jürgen Schmidhuber: Linear Transformers Are Secretly Fast Weight Programmers. ICML 2021: 9355-9366
- [c21] Kazuki Irie, Imanol Schlag, Róbert Csordás, Jürgen Schmidhuber: Going Beyond Linear Transformers with Recurrent Fast Weight Programmers. NeurIPS 2021: 7703-7717
- [i12] Imanol Schlag, Kazuki Irie, Jürgen Schmidhuber: Linear Transformers Are Secretly Fast Weight Memory Systems. CoRR abs/2102.11174 (2021)
- [i11] Kazuki Irie, Imanol Schlag, Róbert Csordás, Jürgen Schmidhuber: Going Beyond Linear Transformers with Recurrent Fast Weight Programmers. CoRR abs/2106.06295 (2021)
- [i10] Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber: The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers. CoRR abs/2108.12284 (2021)
- [i9] Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber: The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization. CoRR abs/2110.07732 (2021)
- [i8] Kazuki Irie, Jürgen Schmidhuber: Training and Generating Neural Networks in Compressed Weight Space. CoRR abs/2112.15545 (2021)
- [i7] Kazuki Irie, Imanol Schlag, Róbert Csordás, Jürgen Schmidhuber: Improving Baselines in the Wild. CoRR abs/2112.15550 (2021)

2020
- [b1] Kazuki Irie: Advancing neural language modeling in automatic speech recognition. RWTH Aachen University, Germany, 2020
- [c20] Kazuki Irie, Alexander Gerstenberger, Ralf Schlüter, Hermann Ney: How Much Self-Attention Do We Need? Trading Attention for Feed-Forward Layers. ICASSP 2020: 6154-6158
- [c19] Wei Zhou, Wilfried Michel, Kazuki Irie, Markus Kitza, Ralf Schlüter, Hermann Ney: The RWTH ASR System for TED-LIUM Release 2: Improving Hybrid HMM with SpecAugment. ICASSP 2020: 7839-7843
- [c18] Alexander Gerstenberger, Kazuki Irie, Pavel Golik, Eugen Beck, Hermann Ney: Domain Robust, Fast, and Compact Neural Language Models. ICASSP 2020: 7954-7958
- [i6] Wei Zhou, Wilfried Michel, Kazuki Irie, Markus Kitza, Ralf Schlüter, Hermann Ney: The RWTH ASR System for TED-LIUM Release 2: Improving Hybrid HMM with SpecAugment. CoRR abs/2004.00960 (2020)
2010 – 2019
2019
- [c17] Albert Zeyer, Parnia Bahar, Kazuki Irie, Ralf Schlüter, Hermann Ney: A Comparison of Transformer and LSTM Encoder Decoder Models for ASR. ASRU 2019: 8-15
- [c16] Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney: Training Language Models for Long-Span Cross-Sentence Evaluation. ASRU 2019: 419-426
- [c15] Christoph Lüscher, Eugen Beck, Kazuki Irie, Markus Kitza, Wilfried Michel, Albert Zeyer, Ralf Schlüter, Hermann Ney: RWTH ASR Systems for LibriSpeech: Hybrid vs Attention. INTERSPEECH 2019: 231-235
- [c14] Kazuki Irie, Rohit Prabhavalkar, Anjuli Kannan, Antoine Bruguier, David Rybach, Patrick Nguyen: On the Choice of Modeling Unit for Sequence-to-Sequence Speech Recognition. INTERSPEECH 2019: 3800-3804
- [c13] Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney: Language Modeling with Deep Transformers. INTERSPEECH 2019: 3905-3909
- [i5] Kazuki Irie, Rohit Prabhavalkar, Anjuli Kannan, Antoine Bruguier, David Rybach, Patrick Nguyen: Model Unit Exploration for Sequence-to-Sequence Speech Recognition. CoRR abs/1902.01955 (2019)
- [i4] Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia Xu Chen, Ye Jia, Anjuli Kannan, Tara N. Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George F. Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel Bacchiani, Thomas B. Jablin, Robert Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon: Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling. CoRR abs/1902.08295 (2019)
- [i3] Christoph Lüscher, Eugen Beck, Kazuki Irie, Markus Kitza, Wilfried Michel, Albert Zeyer, Ralf Schlüter, Hermann Ney: RWTH ASR Systems for LibriSpeech: Hybrid vs Attention - w/o Data Augmentation. CoRR abs/1905.03072 (2019)
- [i2] Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney: Language Modeling with Deep Transformers. CoRR abs/1905.04226 (2019)

2018
- [c12] Kazuki Irie, Shankar Kumar, Michael Nirschl, Hank Liao: RADMM: Recurrent Adaptive Mixture Model with Applications to Domain Robust Language Modeling. ICASSP 2018: 6079-6083
- [c11] Kazuki Irie, Zhihong Lei, Ralf Schlüter, Hermann Ney: Prediction of LSTM-RNN Full Context States as a Subtask for N-Gram Feedforward Language Models. ICASSP 2018: 6104-6108
- [c10] Albert Zeyer, Kazuki Irie, Ralf Schlüter, Hermann Ney: Improved Training of End-to-end Attention Models for Speech Recognition. INTERSPEECH 2018: 7-11
- [c9] Kazuki Irie, Zhihong Lei, Liuhui Deng, Ralf Schlüter, Hermann Ney: Investigation on Estimation of Sentence Probability by Combining Forward, Backward and Bi-directional LSTM-RNNs. INTERSPEECH 2018: 392-395
- [i1] Albert Zeyer, Kazuki Irie, Ralf Schlüter, Hermann Ney: Improved training of end-to-end attention models for speech recognition. CoRR abs/1805.03294 (2018)

2017
- [c8] Kazuki Irie, Pavel Golik, Ralf Schlüter, Hermann Ney: Investigations on byte-level convolutional neural networks for language modeling in low resource speech recognition. ICASSP 2017: 5740-5744
- [c7] Pavel Golik, Zoltán Tüske, Kazuki Irie, Eugen Beck, Ralf Schlüter, Hermann Ney: The 2016 RWTH Keyword Search System for Low-Resource Languages. SPECOM 2017: 719-730

2016
- [c6] Zoltán Tüske, Kazuki Irie, Ralf Schlüter, Hermann Ney: Investigation on log-linear interpolation of multi-domain neural network language model. ICASSP 2016: 6005-6009
- [c5] Kazuki Irie, Zoltán Tüske, Tamer Alkhouli, Ralf Schlüter, Hermann Ney: LSTM, GRU, Highway and a Bit of Attention: An Empirical Overview for Language Modeling in Speech Recognition. INTERSPEECH 2016: 3519-3523
- [c4] Ralf Schlüter, Patrick Doetsch, Pavel Golik, Markus Kitza, Tobias Menne, Kazuki Irie, Zoltán Tüske, Albert Zeyer: Automatic Speech Recognition Based on Neural Networks. SPECOM 2016: 3-17

2015
- [c3] Rami Botros, Kazuki Irie, Martin Sundermeyer, Hermann Ney: On efficient training of word classes and their application to recurrent neural network language models. INTERSPEECH 2015: 1443-1447
- [c2] Kazuki Irie, Ralf Schlüter, Hermann Ney: Bag-of-words input for long history representation in neural network-based language models for speech recognition. INTERSPEECH 2015: 2371-2375

2014
- [c1] Simon Wiesler, Kazuki Irie, Zoltán Tüske, Ralf Schlüter, Hermann Ney: The RWTH English lecture recognition system. ICASSP 2014: 3286-3290