Unlock AI Insights with Local Processing on Windows


Key Points:

• Microsoft is bringing NPU-optimized versions of the DeepSeek R1 models to Copilot+ PCs, starting with Qualcomm Snapdragon X devices, followed by Intel Core Ultra 200V and others.
• The models are optimized for Neural Processing Unit (NPU) on Copilot+ PCs, allowing developers to build and deploy AI-powered applications that run efficiently on-device.
• The first release, DeepSeek-R1-Distill-Qwen-1.5B, is available in AI Toolkit, with 7B and 14B variants arriving soon.

Microsoft is making headway in bringing AI to the edge: DeepSeek R1, already available cloud-hosted on Azure AI Foundry, now comes in NPU-optimized variants for Copilot+ PCs. This is a significant development, as it lets developers build and deploy AI-powered applications that run efficiently on-device, taking full advantage of the powerful Neural Processing Units (NPUs) in these machines.

The first release, DeepSeek-R1-Distill-Qwen-1.5B, is now available in AI Toolkit, with 7B and 14B variants arriving soon. The NPU on Copilot+ PCs offers a highly efficient engine for model inferencing, unlocking a paradigm in which generative AI can run not just when invoked, but power semi-continuous background services.

To achieve this, Microsoft applied several techniques learned from its work on Phi Silica: separating the model into distinct parts to strike the best trade-off between performance and efficiency, low-bit-rate quantization, and mapping transformer operations onto the NPU. The Windows Copilot Runtime (WCR), together with the ONNX QDQ format, lets these models scale across the diverse Windows hardware ecosystem.
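To give a feel for what low-bit quantization means in practice, here is a minimal NumPy sketch of symmetric per-channel int4 quantization. This is an illustrative toy, not Microsoft's actual pipeline (which produces ONNX QDQ models with calibration); the function names and the 4-bit range [-8, 7] are this sketch's own choices.

```python
import numpy as np

def quantize_int4(weights: np.ndarray):
    """Symmetric per-channel int4 quantization: map each row to [-8, 7]."""
    # One scale per output channel (row), chosen so the largest magnitude maps to 7.
    scales = np.abs(weights).max(axis=1, keepdims=True) / 7.0
    scales[scales == 0] = 1.0  # avoid division by zero for all-zero rows
    q = np.clip(np.round(weights / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize_int4(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Recover approximate float weights from int4 values and per-row scales."""
    return q.astype(np.float32) * scales

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)
q, s = quantize_int4(w)
w_hat = dequantize_int4(q, s)
print("max abs error:", np.abs(w - w_hat).max())
```

The appeal for NPUs is that the weights shrink to 4 bits apiece while the reconstruction error stays bounded by half a scale step per element, which is why low-bit formats like this are a standard lever for fitting language models into tight memory and power budgets.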

Get ready to play! To experience DeepSeek on your Copilot+ PC, simply download the AI Toolkit VS Code extension and access the DeepSeek model optimized in the ONNX QDQ format. You can download it locally by clicking the "Download" button, or try the cloud-hosted source model in Azure AI Foundry by clicking the "Try in Playground" button under "DeepSeek R1".

The Distilled Qwen 1.5B model consists of a tokenizer, an embedding layer, a context-processing model, a token-iteration model, a language-model head, and a de-tokenizer. With this release, developers can experiment with the model and get it ready for deployment using AI Toolkit's Playground. The model is also available in Azure AI Foundry for cloud-hosted deployment.
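The six stages above can be pictured as a simple generation loop: the prompt is tokenized, embedded, and consumed in one pass by the context-processing model (prefill), after which the token-iteration model and language-model head alternate to produce output tokens one at a time. The sketch below shows that control flow with toy stand-ins; in the real release each stage is a separately optimized ONNX subgraph, and all names here are hypothetical.

```python
import numpy as np

VOCAB = ["<pad>", "hello", "world", "!"]  # toy vocabulary

rng = np.random.default_rng(0)
EMBED = rng.standard_normal((len(VOCAB), 8)).astype(np.float32)
LM_HEAD = rng.standard_normal((8, len(VOCAB))).astype(np.float32)

def tokenize(text: str) -> list[int]:
    """Tokenizer: text -> token ids (toy whitespace lookup)."""
    return [VOCAB.index(t) for t in text.split() if t in VOCAB]

def embed(ids: list[int]) -> np.ndarray:
    """Embedding layer: token ids -> vectors."""
    return EMBED[ids]

def context_model(embeddings: np.ndarray) -> np.ndarray:
    """Context processing (prefill): consume the whole prompt at once
    and produce a hidden state (here just a mean over positions)."""
    return embeddings.mean(axis=0)

def iterate_token(hidden: np.ndarray, token_id: int) -> np.ndarray:
    """Token iteration (decode): fold one newly generated token into the state."""
    return 0.5 * hidden + 0.5 * EMBED[token_id]

def lm_head(hidden: np.ndarray) -> int:
    """Language-model head: hidden state -> next-token id (greedy argmax)."""
    return int(np.argmax(hidden @ LM_HEAD))

def detokenize(ids: list[int]) -> str:
    """De-tokenizer: token ids -> text."""
    return " ".join(VOCAB[i] for i in ids)

# Greedy generation: prefill the prompt, then decode three tokens.
prompt_ids = tokenize("hello world")
hidden = context_model(embed(prompt_ids))
out = []
for _ in range(3):
    nxt = lm_head(hidden)
    out.append(nxt)
    hidden = iterate_token(hidden, nxt)
print(detokenize(out))
```

Splitting prefill from per-token decode is what lets an NPU-targeted deployment schedule the two phases differently: the context model is a large one-shot batch workload, while the token-iteration loop is a small, latency-sensitive workload that runs once per generated token.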

Microsoft is also showcasing the results of these efforts: the NPU-optimized DeepSeek R1 model retains the reasoning ability of the original despite the on-device optimizations. Given the speed and power characteristics of the NPU-optimized version, users will be able to interact with these groundbreaking models entirely locally.

In the near future, we can expect to see AI-powered applications running more efficiently and effectively on-device, taking full advantage of the powerful NPUs on Copilot+ PCs. This is just the beginning of a new era of innovation in AI, and we can’t wait to see what the future holds for this technology.



