Every time you ask ChatGPT a question, your request triggers a data relay race. Information leaves memory, passes through CPUs for preprocessing, travels to GPUs for heavy computation, and then makes its way back — that entire journey repeats for every single word the AI generates.
The bottleneck is structural: every request routes data through some of the most expensive and power-hungry chips in the industry. That inefficiency is exactly what XCENA, a startup with offices in South Korea and the U.S., is trying to solve with a chip design that places compute capabilities closer to DRAM.
If it works at scale, the implications for AI infrastructure costs could be significant. Indeed, XCENA just raised $135 million in a Series B at a valuation of $570 million, bringing its total raised to $185 million.
XCENA’s chip, the MX1, connects to the CPU through CXL (Compute Express Link), essentially a dedicated express lane between the processor and memory. It processes data inside the memory module, before that data ever needs to leave, bringing compute to the data instead of the other way around. The company claims that a workload that used to require 10 servers could potentially run on just one.
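To make the "compute to the data" idea concrete, here is a toy sketch in Python that counts how many bytes cross the processor-memory link when a simple filter runs on the host versus next to the memory. Everything in it is an assumption for illustration: the 64-byte record size, the function names, and the filter workload are hypothetical, and none of it represents XCENA's hardware, software, or measured performance.

```python
from dataclasses import dataclass

RECORD_BYTES = 64  # assumed fixed record size, for illustration only


@dataclass
class TransferReport:
    result: list[int]
    bytes_over_link: int


def host_side_filter(records: list[int], threshold: int) -> TransferReport:
    """Conventional path: every record crosses the link, then the host filters."""
    moved = len(records) * RECORD_BYTES
    result = [r for r in records if r > threshold]
    return TransferReport(result, moved)


def near_memory_filter(records: list[int], threshold: int) -> TransferReport:
    """Near-memory path: the filter runs beside the data, so only matches cross the link."""
    result = [r for r in records if r > threshold]
    moved = len(result) * RECORD_BYTES
    return TransferReport(result, moved)


# Hypothetical workload: 1,000 records, of which 100 pass the filter.
records = list(range(1000))
host = host_side_filter(records, threshold=899)
near = near_memory_filter(records, threshold=899)

assert host.result == near.result  # same answer either way
print(host.bytes_over_link, near.bytes_over_link)
```

Both paths return the same records; only the traffic differs. The size of the gap depends entirely on how selective the operation is, so the numbers here say nothing about the company's own server-consolidation claim.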
Global demand for memory solutions has surged since the second half of last year, and XCENA is targeting hyperscalers that spend tens of billions of dollars a year on AI infrastructure, where even a small gain in memory efficiency can mean hundreds of millions in savings. The MX1 is still a prototype, with mass-produced chips scheduled to roll off Samsung’s foundry lines by the end of 2026.







