Key Points:
• AWS has released its next-gen Trainium2 chips. Instances built on them promise 20.8 peak petaflops of compute and, according to AWS, 30-40% better price/performance than current GPU-based instances.
• AWS is looking to reduce its reliance on outside infrastructure and establish itself in the AI chip game alongside Google, Microsoft, and others.
• The company has partnered with OpenAI rival Anthropic, which will use AWS’ platform to train and deploy its Claude models, a sign that the infrastructure can support intense workloads.
AWS has made a significant move in the AI infrastructure market with the release of its Trainium2 chips. The next-generation AI chip is designed to power AWS’ most powerful instances for generative AI, and the company is already teasing its successor, Trainium3, expected in late 2025.
AWS’ Trainium2 instances are each powered by 16 connected Trainium2 chips and provide 20.8 peak petaflops of compute, which AWS positions as ideal for training and deploying large language models (LLMs) with 100 billion-plus parameters. According to AWS, the instances offer 30-40% better price/performance than current GPU-based instances.
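To put the quoted figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers AWS cites above (16 chips, 20.8 peak petaflops, 30-40% better price/performance). The function names are illustrative, and the reading of "X% better price/performance" as "(1 + X) times the work per dollar" is an assumption, not AWS' stated definition.

```python
# Figures quoted by AWS in the article.
CHIPS_PER_INSTANCE = 16
INSTANCE_PEAK_PFLOPS = 20.8


def per_chip_peak_pflops(instance_pflops: float, chips: int) -> float:
    """Average peak compute per chip, assuming the instance total splits evenly."""
    return instance_pflops / chips


def relative_cost_for_same_work(price_perf_gain: float) -> float:
    """Cost of a fixed workload relative to the GPU baseline (1.0).

    Assumption: a gain of 0.30 means 1.30x as much work per dollar,
    so the same work costs 1 / 1.30 of the baseline.
    """
    return 1.0 / (1.0 + price_perf_gain)


if __name__ == "__main__":
    chip = per_chip_peak_pflops(INSTANCE_PEAK_PFLOPS, CHIPS_PER_INSTANCE)
    print(f"Peak per chip: {chip:.2f} petaflops")
    for gain in (0.30, 0.40):
        cost = relative_cost_for_same_work(gain)
        print(f"{gain:.0%} better price/performance -> {cost:.1%} of baseline cost")
```

Under these assumptions, each chip contributes about 1.3 peak petaflops, and a fixed training job would cost roughly 71-77% of what it costs on the GPU-based baseline. Peak figures are theoretical maximums, so real workloads will likely see less.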
The company is also expanding its partnership with OpenAI rival Anthropic, which will use AWS’ platform to train and deploy its Claude models. Serving one of the leading builders of AI today demonstrates that AWS’ infrastructure can handle intense AI workloads, rivaling the offerings of Nvidia, Google Cloud, and Microsoft Azure.
However, analysts say it is too early to tell whether AWS will move ahead of the pack: it is still relatively new to the AI chip game, and Nvidia holds roughly 80% of the market for AI chips. AWS’ focus on self-sufficiency through its own AI chip development is evident, but it remains to be seen how effective this strategy will be.
Although the Trainium2 instances are designed for high-performance computing, some experts question whether every company needs that much power. Smaller enterprises, for instance, may be better served by more specialized and scalable models that do not require such massive compute resources. Still, Trainium2’s potential lies in its ability to apply general knowledge, tying smaller models into larger ones to enhance capabilities, as well as in its utility for retrieval-augmented generation (RAG).