Every time you ask ChatGPT a question, your request triggers a data relay race. Data leaves memory, passes through a CPU for preprocessing, travels to a GPU for heavy computation, and then makes its way back, and that whole trip repeats for every single word the AI generates.
The bottleneck is structural: it means routing through some of the most expensive and power-hungry chips in the industry on every single request. That inefficiency is exactly what XCENA, a startup with offices in South Korea and the U.S., is trying to solve. The four-year-old startup has designed a chip that places compute capabilities much closer to DRAM (the fast, short-term memory chips that store data a processor is actively using), allowing routine data operations to be handled near memory, without the costly round trips between CPUs, GPUs, and memory.
If it works at scale, the implications for AI infrastructure costs could be significant, which largely explains investor enthusiasm about the company. Indeed, XCENA just raised $135 million in a Series B at a valuation of $570 million, bringing its total raised to $185 million.
XCENA CEO Jin Kim co-founded the startup in 2022 alongside CTO Dohun Kim and CPO Harry Juhyun Kim, all veterans of Samsung and SK Hynix, the memory giants that supply chips powering Nvidia’s GPUs. “CPUs and GPUs have both gotten smarter over the decades. Memory never did. XCENA wants to change that,” Kim said in an interview with TechCrunch. “The recent rise in memory prices and related stocks points to a broader shift in AI infrastructure toward memory-centric architectures,” he added. (This month, the three companies that dominate the global memory chip market, Samsung, SK Hynix, and Micron, each crossed a trillion-dollar valuation for the first time.)
XCENA is betting its business on the thesis that “inference isn’t just a compute problem; it’s increasingly a memory scaling problem,” said Kim.
XCENA’s chip, the MX1, connects to the CPU through CXL (Compute Express Link), essentially a dedicated express lane between the processor and memory, and processes data before it ever needs to leave the memory module. It brings compute to the data, not the other way around. The company claims that what used to require 10 servers could potentially run on just one.
“While GPUs excel at matrix multiplication (the heavy math behind AI model training), much of the surrounding data orchestration, including preprocessing, KV cache management [the system that stores prior conversation context so a model doesn’t have to reprocess it], and data caching, still runs on CPUs. Our chip handles those tasks directly inside the memory module itself,” Kim said.
Demand for memory solutions has surged since the second half of last year, and the company believes the timing is working in its favor.
Conversations with several global memory vendors are in early stages, though Kim declined to name them. The company’s ideal customers are hyperscalers spending tens of billions a year on AI infrastructure, where even a small gain in memory efficiency can mean hundreds of millions in savings.
The MX1 is still a prototype. Mass production chips are scheduled to roll off Samsung’s foundry lines by the end of 2026, with the company expecting to generate revenue starting in 2027.
While neural processing unit (NPU) makers are competing to challenge Nvidia for training workloads, XCENA is targeting the memory-intensive layer that sits underneath all of it.
XCENA’s closest rivals include Astera Labs and Marvell, both Nasdaq-listed companies working on next-generation memory connectivity. Marvell is a large, established player already working in the same space, Kim said, adding that the differentiator comes down to intellectual property. “We have thousands of cores,” Kim said. Based on public specs, Marvell’s approach relies on a handful of general-purpose cores by comparison.
Those cores are built on RISC-V, an open-source chip design blueprint, and optimized specifically for data processing, with each core deliberately kept small and efficient. Beyond the cores themselves, XCENA designs its own internal memory hierarchy, interconnect bus, and DRAM controller, components that most chip companies, including larger rivals, typically outsource.
Seoul-based VC firms Altinum and IMM Investment co-led the Series B round, along with Corstone Asia and existing investors SBI Investment and Mirae Asset Capital. The company, which has more than 90 staff across offices in Pangyo, a tech hub outside Seoul, and in Sunnyvale, is also in conversations with global investors about further funding.
Kate Park is a reporter at TechCrunch, with a focus on technology, startups and venture capital in Asia. She previously was a financial journalist at Mergermarket covering M&A, private equity and venture capital.














