Qualcomm Technologies partners with Meta and Ollama on the new quantized Llama 3.2 Models
The journey of revolutionizing on-device AI accuracy, performance and memory footprint continues to evolve. Meta's recent Llama 3.2 1B/3B has been a game-changer and they’re already getting new upgrades. Qualcomm Technologies has teamed up with Meta and Ollama once again to support the new Llama 3.2 quantized models.
This collaboration underscores our commitment to pushing the boundaries of what's possible with on-device AI and increasing the accessibility and effectiveness of the cutting-edge Llama models for a wide range of AI applications. Developers can integrate these optimized models into their applications seamlessly using Ollama on AI PCs powered by Snapdragon X Elite and Snapdragon X Plus processors.
The new quantized Llama models excel in use cases like knowledge retrieval, summarization, and instruction following, outperforming competitors in these areas. They provide a 2-3x speedup and a 45-60% reduction in model size and memory footprint compared to their original format.
We invite developers to explore these quantized models and explore developing apps with RAG and Function calling capabilities using Ollama for Snapdragon X series compute platforms and integrate them into their applications to unlock new possibilities in on-device AI.
Stay tuned for more updates as we continue to innovate and expand the capabilities of Llama models.

