Data Center
Qualcomm® Data Center Solutions power racks, servers, and cards in data centers around the world, delivering leading performance per watt, high-performance density, and low total cost of ownership (TCO). We are committed to a data center roadmap with an annual cadence moving forward, focused on industry-leading AI inference performance, energy efficiency, and TCO.
From accelerator cards to racks
Our data center portfolio is built on our foundation of high-performance, low-power computing, leading AI, and systems expertise. This allows us to make AI inference efficient at scale. Our solutions support flexible and open deployment models, from accelerator cards to full AI inference racks, and integrate seamlessly with existing infrastructure.
Leading NPU
Qualcomm® Hexagon™ NPU technology is custom-designed, optimized, and scaled up for data center AI inference workloads, delivering leading efficiency and performance.
Superior memory capacity
Supports LPDDR memory for lower cost and increased memory capacity.
Low TCO
Rack-level focus and optimizations specifically for AI inference enable fast generative AI inference at high performance per dollar per watt.
Innovative memory architecture
Our memory architecture, based on near-memory computing, provides a generational leap in efficiency and performance for AI inference workloads by delivering significantly higher effective memory bandwidth and much lower power consumption.
Performance-per-watt leadership
Industry-leading efficiency for generative AI workloads.
Rich software stack
Our hyperscaler-grade AI software stack, which spans end to end from the application layer to the system software layer, is optimized for AI inference.
Open ecosystem
Supports leading machine learning (ML) frameworks like PyTorch and ONNX, inference engines like vLLM, generative AI frameworks like LangChain and CrewAI, and LLM/LMM inference optimization techniques like disaggregated serving.
Confidential Computing
Built-in security features for enterprise-grade deployments.
Data Center Offerings
Qualcomm® Cloud AI 100 Ultra
Optimized for generative AI inference, supporting models up to 100B parameters on a single 150W card.
Qualcomm® AI 200
A rack-level solution optimized for AI inference and TCO, featuring Hexagon NPU technology, direct liquid cooling, 768 GB of LPDDR memory per card, PCIe for scale up, Ethernet for scale out, confidential computing, and a rack-level power consumption of 160 kW. Commercial availability in 2026.
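As a rough illustration of what the figures above imply, the following sketch estimates whether a model's weights fit in one card's memory and how much energy a rack draws in a year. Only the 768 GB per card and 160 kW per rack values come from the description above. The function names, the 20% memory reserve for KV cache and activations, the utilization factor, and the electricity price are hypothetical assumptions for illustration, not Qualcomm specifications.

```python
# Back-of-the-envelope sizing sketch. Only CARD_MEMORY_GB and RACK_POWER_KW
# come from the product description; everything else is an assumption.

CARD_MEMORY_GB = 768   # LPDDR memory per card (from the description)
RACK_POWER_KW = 160    # rack-level power consumption (from the description)
HOURS_PER_YEAR = 24 * 365


def weight_footprint_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight size in GB, taking 1 GB as 1e9 bytes."""
    return params_billion * bytes_per_param


def fits_on_card(params_billion: float, bytes_per_param: float,
                 reserve_fraction: float = 0.2) -> bool:
    """Check whether weights fit after reserving memory for KV cache and
    activations. The 20% default reserve is an assumed placeholder."""
    usable_gb = CARD_MEMORY_GB * (1.0 - reserve_fraction)
    return weight_footprint_gb(params_billion, bytes_per_param) <= usable_gb


def annual_energy_kwh(rack_power_kw: float = RACK_POWER_KW,
                      utilization: float = 1.0) -> float:
    """Energy drawn by one rack in a year at the given average utilization."""
    return rack_power_kw * HOURS_PER_YEAR * utilization


def annual_energy_cost(price_per_kwh: float,
                       rack_power_kw: float = RACK_POWER_KW,
                       utilization: float = 1.0) -> float:
    """Yearly electricity cost; price_per_kwh is a user-supplied assumption."""
    return annual_energy_kwh(rack_power_kw, utilization) * price_per_kwh


if __name__ == "__main__":
    # A hypothetical 70B-parameter model stored at 2 bytes per parameter.
    print("70B @ 2 bytes/param fits:", fits_on_card(70, 2))
    print("Rack energy per year (kWh):", annual_energy_kwh())
    print("Cost at an assumed 0.10 per kWh:", annual_energy_cost(0.10))
```

Estimates like these cover only weight storage and electricity. A real TCO comparison would also need hardware cost, cooling overhead, and measured throughput.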
Qualcomm® AI 250
An advanced rack-level solution with an innovative memory architecture based on near-memory computing, providing a generational leap in efficiency and performance for AI inference workloads by delivering more than 10x higher effective memory bandwidth and much lower power consumption than Qualcomm AI200. Commercial availability in early 2027.
Qualcomm AI Inference Suite
A comprehensive set of ready-to-use AI applications, agents, tools, and libraries for developing and deploying AI inference on premises or via cloud deployments.
Server CPUs
Qualcomm Data Center Solutions is developing state-of-the-art data center CPU solutions. Stay tuned for more details.
Get started with these resources
Interested in reducing your inference cost and improving your performance per watt in data centers?
Evaluate the Qualcomm Cloud AI 100 Ultra
Try gen AI inferencing on Qualcomm Cloud AI 100 Ultra in our developer playgrounds from Cirrascale.
Cirrascale Inference Cloud Powered by Qualcomm Technologies
Scale your offerings efficiently with inference as a service powered by Qualcomm Cloud AI.
Run Inference On Premises
Certain applications need low-latency, secure, and private solutions at a low cost not found with current cloud inference providers.
AI Inference Use Cases
AI Chatbot
Deploy LLM-powered AI chatbots with retrieval-augmented generation (RAG) using custom data to enhance business functions from customer service, sales, and marketing to internal enterprise process automation.
Text-to-Image Applications
Generate pictures, diagrams, and graphics from text inputs.
Agentic Workflows
Simplify the deployment of multi-agent systems for enterprise and generative AI applications.
Text-to-Code Applications
Produce high-quality programming code from natural language conversation with generative AI models. CodeGen is the neural network model behind this process.
Featured platforms for Qualcomm Cloud AI 100 Ultra
Amazon EC2
Amazon EC2 DL2q instances, powered by Qualcomm Cloud AI 100 Standard accelerators, can be used to run popular gen AI applications.
Cirrascale
The Cirrascale AI Innovation Cloud uses the Qualcomm Cloud AI 100 Ultra inference accelerator so customers can test, run, and fully deploy large language model (LLM), natural language processing (NLP), or object detection workloads in the cloud.
ALLaM Playground
Access the ALLaM developer playground to evaluate the Qualcomm AI Inference Suite and take advantage of ready-to-use AI applications, agents, tools, and libraries for building cloud and hybrid AI-enabled applications.
Core42
Qualcomm Cloud AI 100 Ultra enables AI inference on Core42's Condor AI and HPC platform, Condor Galaxy. This helps Core42 bring cloud-scale inference to the data center and edge.
Ready to get started?
Contact our sales team to get a demonstration or additional documentation about the benefits of Qualcomm Cloud AI and the Qualcomm AI Inference Suite.
