DeepSeek-R1 Model Now Available in Amazon Bedrock Marketplace and Amazon SageMaker JumpStart


Today, we are excited to announce that DeepSeek-R1 distilled Llama and Qwen models are available through Amazon Bedrock Marketplace and Amazon SageMaker JumpStart. With this launch, you can now deploy DeepSeek AI's first-generation frontier model, DeepSeek-R1, along with its distilled versions ranging from 1.5 to 70 billion parameters to build, experiment with, and responsibly scale your generative AI ideas on AWS.

In this post, we show how to get started with DeepSeek-R1 on Amazon Bedrock Marketplace and SageMaker JumpStart. You can follow similar steps to deploy the distilled versions of the models as well.
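As a preview of the SageMaker JumpStart path, the sketch below shows how a distilled model might be deployed with the SageMaker Python SDK. The model ID and instance type are placeholders, not confirmed identifiers; check the JumpStart model hub for the exact values before running.

```python
# Minimal sketch: deploy a DeepSeek-R1 distilled model via SageMaker JumpStart.
# The model_id and instance_type below are placeholders -- confirm the exact
# identifier and a supported instance type in the JumpStart model hub.
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(
    model_id="deepseek-llm-r1-distill-qwen-7b",  # placeholder ID; verify in JumpStart
)

# Deploy to a real-time endpoint; the instance type depends on the model size chosen.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",  # example choice for a ~7B distilled model
    accept_eula=True,
)

# Test invocation; the payload schema assumes the common "inputs"/"parameters" format
# used by text-generation containers.
response = predictor.predict({
    "inputs": "What is 17 * 24? Think step by step.",
    "parameters": {"max_new_tokens": 512, "temperature": 0.6},
})
print(response)
```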

Overview of DeepSeek-R1

DeepSeek-R1 is a large language model (LLM) developed by DeepSeek AI that uses reinforcement learning to enhance reasoning capabilities through a multi-stage training process starting from a DeepSeek-V3-Base foundation. A key distinguishing feature is its reinforcement learning (RL) step, which was used to refine the model's responses beyond the standard pre-training and fine-tuning process. By incorporating RL, DeepSeek-R1 can adapt more effectively to user feedback and objectives, ultimately improving both relevance and clarity. In addition, DeepSeek-R1 employs a chain-of-thought (CoT) approach, meaning it's equipped to break down complex queries and reason through them step by step. This guided reasoning process allows the model to produce more accurate, transparent, and detailed responses. The model combines RL-based fine-tuning with CoT capabilities, aiming to generate structured responses while focusing on interpretability and user interaction. With its wide-ranging capabilities, DeepSeek-R1 has captured the industry's attention as a versatile text-generation model that can be integrated into various workflows such as agents, logical reasoning, and data interpretation tasks.
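To illustrate the CoT behavior in practice, the sketch below sends a reasoning prompt to a DeepSeek-R1 model deployed through Amazon Bedrock Marketplace. The endpoint ARN is a placeholder, and the request body assumes the common "inputs"/"parameters" schema; adjust both to match your actual deployment.

```python
# Minimal sketch: invoke a Bedrock Marketplace deployment of DeepSeek-R1 with a
# step-by-step reasoning prompt. The endpoint ARN is a placeholder, and the
# payload format is an assumption based on typical text-generation containers.
import json
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Bedrock Marketplace deployments are addressed by their endpoint ARN.
endpoint_arn = "arn:aws:sagemaker:us-east-1:123456789012:endpoint/deepseek-r1-endpoint"  # placeholder

body = {
    "inputs": "A train travels 120 km in 1.5 hours. What is its average speed? Reason step by step.",
    "parameters": {"max_new_tokens": 512, "temperature": 0.6, "top_p": 0.9},
}

response = bedrock_runtime.invoke_model(
    modelId=endpoint_arn,
    body=json.dumps(body),
    contentType="application/json",
    accept="application/json",
)
print(json.loads(response["body"].read()))
```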

DeepSeek-R1 uses a Mixture of Experts (MoE) architecture and is 671 billion parameters in size. The MoE architecture activates only 37 billion parameters per token, enabling efficient inference by routing queries to the most relevant expert "clusters." This approach allows the model to specialize in different problem domains while maintaining overall efficiency. DeepSeek-R1 requires at least 800 GB of HBM memory in FP8 format for inference. In this post, we will use an ml.p5e.48xlarge instance to deploy the model. ml.p5e.48xlarge comes with 8 Nvidia H200 GPUs providing 1128 GB of GPU memory.
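As a rough sanity check on those figures: 671 billion parameters stored in FP8 (one byte per parameter) occupy about 671 GB for the weights alone, with the remaining headroom of the roughly 800 GB requirement going to the KV cache and activations. Each H200 provides 141 GB of HBM, so the 8 GPUs on ml.p5e.48xlarge supply 8 × 141 GB = 1128 GB, comfortably above that requirement.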

DeepSeek-R1 distilled models bring the reasoning capabilities of the main R1 model to more efficient architectures based on popular open models like Qwen (1.5B, 7B, 14B, and 32B) and Llama (8B and 70B).