The Transformer architecture, introduced by Vaswani et al. in 2017, serves as the backbone of contemporary language models. Over the years, numerous modifications to this architecture have been ...
Universal Transformer Memory uses neural networks to determine which tokens in the LLM's context window are useful or redundant.
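As a rough illustration of that concept, the sketch below scores cached tokens with a small neural network and keeps only the highest-scoring fraction of the KV cache. The `TokenScorer` module, the `prune_kv_cache` helper, and the fixed keep ratio are illustrative assumptions, not the actual Universal Transformer Memory implementation.

```python
import torch
import torch.nn as nn

class TokenScorer(nn.Module):
    """Scores each cached token representation; higher means more useful to keep."""
    def __init__(self, d_model: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_model // 4),
            nn.ReLU(),
            nn.Linear(d_model // 4, 1),
        )

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, d_model) -> scores: (batch, seq_len)
        return self.net(hidden).squeeze(-1)

def prune_kv_cache(keys, values, hidden, scorer, keep_ratio=0.5):
    """Keep only the top-scoring fraction of cached tokens (illustrative)."""
    scores = scorer(hidden)                                   # (batch, seq_len)
    k = max(1, int(scores.shape[1] * keep_ratio))
    keep = scores.topk(k, dim=1).indices.sort(dim=1).values   # preserve token order
    idx = keep.unsqueeze(-1).expand(-1, -1, keys.shape[-1])
    return keys.gather(1, idx), values.gather(1, idx)

# Example: halve a 128-token cache for a 64-dim toy model.
scorer = TokenScorer(d_model=64)
hidden = torch.randn(1, 128, 64)
keys, values = torch.randn(1, 128, 64), torch.randn(1, 128, 64)
small_k, small_v = prune_kv_cache(keys, values, hidden, scorer)   # (1, 64, 64) each
```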
Building on previous breakthroughs that showcased a transformer-based language model's capacity to predict adsorption energy ...
The STAR framework from Liquid AI uses evolutionary algorithms and a numerical encoding system to balance quality and efficiency in AI models.
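As a rough sketch of that recipe, the code below evolves fixed-length numerical genomes with mutation and crossover, ranking candidates by a placeholder fitness that mixes a quality term with an efficiency term. The genome layout, mutation scheme, and scoring here are invented for illustration and are not Liquid AI's actual STAR code.

```python
import random

GENOME_LEN = 16      # hypothetical: each gene picks an operator/width choice
POPULATION = 32
GENERATIONS = 20

def random_genome():
    return [random.randint(0, 7) for _ in range(GENOME_LEN)]

def mutate(genome, rate=0.1):
    return [random.randint(0, 7) if random.random() < rate else g for g in genome]

def crossover(a, b):
    cut = random.randrange(1, GENOME_LEN)
    return a[:cut] + b[cut:]

def fitness(genome):
    # Placeholder multi-objective score: in practice quality would come from
    # training/evaluating the decoded architecture, and efficiency from
    # measuring its cache size or latency.
    quality = sum(genome) / (7 * GENOME_LEN)
    efficiency = 1.0 - genome.count(7) / GENOME_LEN
    return 0.5 * quality + 0.5 * efficiency

population = [random_genome() for _ in range(POPULATION)]
for _ in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    parents = population[: POPULATION // 2]
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POPULATION - len(parents))]
    population = parents + children

best = max(population, key=fitness)
```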
Instead, they suggest, "it would be ideal for LLMs to have the freedom to reason without any language constraints and then translate their findings into language only when necessary." To achieve that ...
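One way to picture the idea, under the assumption that "reasoning without language constraints" means recycling hidden states instead of decoding a token at every step, is the toy sketch below: a stand-in model runs a few latent steps in which its last hidden state is appended directly to the input sequence, and only afterwards decodes into ordinary tokens. `TinyLM`, the step counts, and the greedy decoding loop are illustrative, not the authors' method.

```python
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    """Toy stand-in for a decoder-only LM (causal mask omitted for brevity)."""
    def __init__(self, vocab=100, d_model=64, n_layers=2, n_heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, vocab)

    def forward(self, x):            # x: (batch, seq, d_model) embeddings
        return self.backbone(x)      # hidden states, same shape

def generate(model: TinyLM, prompt_ids, latent_steps=4, text_steps=8):
    x = model.embed(prompt_ids)                            # (1, seq, d_model)
    # Latent phase: append the last hidden state as the next "thought"
    # embedding instead of decoding it into a token.
    for _ in range(latent_steps):
        h = model(x)
        x = torch.cat([x, h[:, -1:, :]], dim=1)
    # Verbal phase: translate the result into language via greedy decoding.
    out = []
    for _ in range(text_steps):
        h = model(x)
        tok = model.lm_head(h[:, -1]).argmax(-1)           # (1,)
        out.append(tok.item())
        x = torch.cat([x, model.embed(tok).unsqueeze(1)], dim=1)
    return out

tokens = generate(TinyLM(), torch.tensor([[1, 2, 3]]))
```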
Hymba is a family of small language models that employs a hybrid-head parallel architecture. By blending transformer attention mechanisms with state space models (SSMs), Hymba achieves superior efficiency and ...
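The sketch below is a simplified take on what a hybrid-head block might look like: an attention branch and an SSM-style branch run in parallel on the same normalized input, and their outputs are fused back into the residual stream. The GRU stand-in for the SSM branch and the simple averaging fusion are assumptions for illustration, not the actual Hymba architecture.

```python
import torch
import torch.nn as nn

class HybridHeadBlock(nn.Module):
    def __init__(self, d_model=256, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ssm = nn.GRU(d_model, d_model, batch_first=True)  # stand-in for an SSM head
        self.norm = nn.LayerNorm(d_model)
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, x):                       # x: (batch, seq, d_model)
        h = self.norm(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        ssm_out, _ = self.ssm(h)
        # Parallel fusion: both branches see the same input; their outputs
        # are averaged and projected back into the residual stream.
        return x + self.proj(0.5 * (attn_out + ssm_out))

x = torch.randn(2, 32, 256)
y = HybridHeadBlock()(x)      # (2, 32, 256)
```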
Google published the seminal transformer paper in 2017, which went on to launch the current AI revolution, but all its ...