A Miami-based startup called Subquadratic has made waves by claiming to have solved a mathematical bottleneck that has held back large language models (LLMs) for nearly a decade. The company's new model, SubQ, reportedly processes up to 12 times as much text at once as other models and matches the performance of models from giants like Google DeepMind on key tasks.
Subquadratic's initial announcement was met with skepticism because the company provided little evidence beyond self-published test scores. Following a third-party evaluation by Appen, however, the company now has some ground to stand on: the results validated its claims about speed and efficiency on certain data-heavy tasks.
Daniel Whedon, Subquadratic's chief technology officer, says the company is taking time to ensure that all future results are fully verified before sharing them publicly, and acknowledges that healthy skepticism was to be expected. The company hopes the breakthrough could change how LLMs are built, and Justin Dangel predicts a new age of efficiency within the next few years.
At the heart of most LLMs is the transformer neural network and its dense attention mechanism, which compares every token in the input with every other token, so the amount of computation grows with the square of the input length. Subquadratic's approach uses sparse attention, which drastically reduces the number of computations by multiplying only a selected subset of the numbers rather than all of them. This simple method could significantly cut costs while maintaining performance.
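To make the difference concrete, here is a minimal sketch in Python with NumPy. It is not Subquadratic's method, which the company has not detailed here. It assumes one simple and well-known form of sparse attention, a sliding window in which each token looks only at its nearest neighbours, and the function names, the `window` parameter and the input sizes are all illustrative choices.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def dense_attention(q, k, v):
    """Dense attention: every token is scored against every token."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    pairs = q.shape[0] * k.shape[0]  # n * n query-key dot products
    return softmax(scores) @ v, pairs

def windowed_attention(q, k, v, window):
    """Illustrative sparse attention (an assumption, not SubQ's design):
    each token is scored only against tokens within `window` positions."""
    n, d = q.shape
    out = np.zeros_like(v, dtype=float)
    pairs = 0
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        scores = q[i] @ k[lo:hi].T / np.sqrt(d)
        out[i] = softmax(scores) @ v[lo:hi]
        pairs += hi - lo  # only the selected query-key pairs are computed
    return out, pairs

rng = np.random.default_rng(0)
n, d = 512, 16  # hypothetical sequence length and head size
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))

dense_out, dense_pairs = dense_attention(q, k, v)
sparse_out, sparse_pairs = windowed_attention(q, k, v, window=8)
print(dense_pairs, sparse_pairs)
```

In this toy setup the dense version computes a score for every pair of tokens, so doubling the input length quadruples the work, while the windowed version's work grows only in proportion to the length. The trade-off is that a fixed window cannot see distant tokens at all, which is why choosing what to keep is the hard part of any sparse scheme.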
The question now is whether the new model will truly revolutionise LLMs or prove to be just another bold claim that fizzles out. It remains to be seen whether Subquadratic has indeed cracked the code on efficiency or whether its approach is merely a passing fad in the fast-moving world of artificial intelligence.







