Future of DeepSeek at AWS
by Justin Cook
Earlier this year, DeepSeek-R1 models launched on Amazon Bedrock, available both through Custom Model Import and the Amazon Bedrock Marketplace. I’ve tested the fully managed, serverless offering, and the best part is being able to use a single Bedrock API across a wide range of features and tools.
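As a quick, hedged illustration of that single-API experience, here is what invoking DeepSeek-R1 through the Bedrock Converse API could look like from the AWS CLI. The model ID below is an assumption; confirm the exact ID in the Bedrock model catalog for your account and region.

```bash
# Sketch: call DeepSeek-R1 via the Bedrock Converse API.
# The model ID is an assumption; verify it in the Bedrock console.
aws bedrock-runtime converse \
  --model-id "us.deepseek.r1-v1:0" \
  --messages '[{"role": "user", "content": [{"text": "Explain Mixture of Experts in one paragraph."}]}]'
```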
DeepSeek R1 is an AI model that learns through practice and rewards, much as children learn by trial and error. Its Mixture of Experts (MoE) architecture has 671 billion parameters but activates only 37 billion per task, making it both efficient and cost-effective. The model excels at complex tasks like math and coding, scoring nearly 80% on the AIME 2024 math benchmark and outperforming roughly 96% of human competitors in Codeforces coding challenges. There are two versions: R1-Zero, which learns through reinforcement learning alone, and R1, which combines reinforcement learning with supervised fine-tuning for clearer, human-readable explanations. Remarkably, DeepSeek R1 offers performance comparable to or better than top models like GPT-4 and Claude at roughly 27 times lower cost.

DeepSeek’s model is available under an MIT license and offers strong capabilities in reasoning, coding, natural language understanding, decision support, software development, mathematical problem-solving, scientific analysis, data insights, and knowledge management.
Security & Data Privacy
With Amazon Bedrock, you get enterprise-grade security features such as data encryption, fine-grained access controls, and secure connectivity options. Your inputs and model outputs are never shared with model providers. You also get compliance certifications to ensure secure data handling.
Content Filtering
You can customize safeguards for your needs with Amazon Bedrock Guardrails, including content filtering, protection against sensitive-information leaks, and measures to reduce model hallucinations. This keeps the model’s interactions in line with your policies and filters harmful content in generative AI applications.
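As a minimal sketch, creating a guardrail with a single content filter might look like the following; the parameter shapes here are simplified assumptions, so check the `create-guardrail` CLI reference for the full schema.

```bash
# Sketch: create a guardrail with one content filter.
# Filter types and strengths shown are simplified assumptions.
aws bedrock create-guardrail \
  --name "deepseek-demo-guardrail" \
  --blocked-input-messaging "Sorry, I can't help with that." \
  --blocked-outputs-messaging "Sorry, I can't share that." \
  --content-policy-config '{"filtersConfig": [{"type": "HATE", "inputStrength": "HIGH", "outputStrength": "HIGH"}]}'
```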
Model Evaluation
You can evaluate and compare models like DeepSeek-R1 in Amazon Bedrock to find the best fit for your use case, either through automatic evaluations using metrics like accuracy and robustness, or through human evaluations for more subjective criteria.
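Before setting up an evaluation, you can check which DeepSeek foundation models are visible in your region; here is a minimal sketch with the AWS CLI (the provider string is an assumption):

```bash
# Sketch: list DeepSeek model IDs available in Bedrock for this region.
aws bedrock list-foundation-models \
  --by-provider deepseek \
  --query 'modelSummaries[].modelId'
```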
Steps to Launch DeepSeek-R1 on AWS
1. Deploy DeepSeek-R1 in the Bedrock console: go to the Amazon Bedrock console, request access to DeepSeek-R1, and follow the steps to deploy.
2. Deploy via the Marketplace or SageMaker AI:
 - Amazon Bedrock: ideal for teams wanting fast integration with pre-trained models via APIs.
 - Amazon SageMaker: best for advanced customization, training, and deployment.
3. Deploy DeepSeek-R1-Distill models: you can also use Amazon EC2 with AWS Trainium or AWS Inferentia for cost-effective deployment of distilled models.
4. Additional deployment options: DeepSeek-V3 and Janus-Pro-7B can be deployed on EC2 instances.
How to Deploy on Amazon EKS Auto Mode (Simplified)
For a lighter deployment, you can use the DeepSeek-R1-Distill-Llama-8B model. This requires fewer resources compared to the full DeepSeek-R1 model.
- Install necessary tools: install `kubectl` and `Terraform` to manage the deployment process.
- Create EKS cluster & NodePool: use Terraform to provision an EKS cluster and a GPU-enabled NodePool.
- Deploy the model on Kubernetes: replace configuration placeholders with your model’s parameters, then apply the deployment using `kubectl`.
- Check pod status: monitor the pod’s status to make sure it is running, and check the logs for any startup errors.
- Interact with the model: set up a local proxy and use `curl` to send requests to the model.
Building a Chatbot UI
If you prefer a more user-friendly interface, you can build a chatbot UI to interact with DeepSeek-R1. You can use the pre-built source code from the GitHub repository to set it up, and push the image to your Amazon ECR repository.
Once the UI is deployed, you can access it via a load balancer and log in with the credentials stored in Kubernetes secrets.
By following these steps, you can efficiently deploy DeepSeek-R1 on AWS, taking advantage of flexible scaling and resource control to keep costs low while ensuring high performance. For more deployment patterns, visit the GitHub repository for detailed guides.
Last, I played with EKS Auto Mode and the DeepSeek-R1-Distill-Llama-8B distilled model, which requires far less GPU capacity than the full 671B-parameter DeepSeek-R1 model; it is a lighter, though less powerful, option. If you’d prefer to deploy the full DeepSeek-R1 model, replace the distilled model name in the vLLM configuration, as sketched below.
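For example, reusing the placeholder from the deployment step below, swapping in the full model could look like this. This only changes the model name; a 671B-parameter deployment would also need far larger GPU and memory requests.

```bash
# Sketch: point the vLLM manifest at the full model instead of the
# distilled one. Resource requests must be raised separately.
sed -i "s|__MODEL_NAME_AND_PARAMETERS__|deepseek-ai/DeepSeek-R1 --max-model-len 2048|g" manifests/deepseek-deployment-gpu.yaml
```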
Here’s a simplified version of the steps:
1. Install Prerequisites

We’ll use AWS CloudShell for the setup.

- Install kubectl:

```bash
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
```
- Install Terraform:

```bash
sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://rpm.releases.hashicorp.com/AmazonLinux/hashicorp.repo
sudo yum -y install terraform
```
2. Create EKS Cluster with Terraform

- Clone the repo:

```bash
git clone -b v0.1 https://github.com/aws-samples/deepseek-using-vllm-on-eks
cd deepseek-using-vllm-on-eks
```
- Initialize and apply Terraform:

```bash
terraform init
terraform apply -auto-approve
```
- Set up kubectl with the new EKS cluster:

```bash
$(terraform output configure_kubectl | jq -r)
```
3. Create EKS Auto Mode NodePool

- Create a NodePool for GPU support:

```bash
kubectl apply -f manifests/gpu-nodepool.yaml
```
- Check NodePool status:

```bash
kubectl get nodepool/gpu-nodepool
```
4. Deploy DeepSeek Model

- Update the model parameters (vLLM’s flag for context length is `--max-model-len`):

```bash
sed -i "s|__MODEL_NAME_AND_PARAMETERS__|deepseek-ai/DeepSeek-R1-Distill-Llama-8B --max-model-len 2048|g" manifests/deepseek-deployment-gpu.yaml
```
- Deploy the model on Kubernetes:

```bash
kubectl apply -f manifests/deepseek-deployment-gpu.yaml
```
- Check pod status:

```bash
kubectl get po -n deepseek
```
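If a pod stays pending or crash-looping, tailing the logs is the quickest way to spot startup errors such as model-download or GPU-scheduling failures. The label selector below is an assumption; list the pods first to confirm their labels.

```bash
# Sketch: tail logs from the model pods. The label selector is an
# assumption; confirm with 'kubectl get po -n deepseek --show-labels'.
kubectl logs -n deepseek -l app=deepseek --tail=50 -f
```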
5. Interact with DeepSeek Model

- Set up a local proxy:

```bash
kubectl port-forward svc/deepseek-svc -n deepseek 8080:80 > port-forward.log 2>&1 &
```
- Send a curl request:

```bash
curl -X POST "http://localhost:8080/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{"model": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B", "messages": [{"role": "user", "content": "What is Kubernetes?"}]}'
```
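Since vLLM serves an OpenAI-compatible endpoint, you can pipe the response through `jq` to print only the generated text; a minimal sketch:

```bash
# Sketch: extract just the assistant's reply from the
# OpenAI-compatible chat completions response.
curl -s -X POST "http://localhost:8080/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{"model": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B", "messages": [{"role": "user", "content": "What is Kubernetes?"}]}' \
  | jq -r '.choices[0].message.content'
```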
6. Build and Deploy Chatbot UI

- Get the ECR repository URI:

```bash
export ECR_REPO=$(terraform output ecr_repository_uri | jq -r)
```
- Build and push the chatbot UI image:

```bash
docker build -t $ECR_REPO:0.1 chatbot-ui/application/.
aws ecr get-login-password | docker login --username AWS --password-stdin $ECR_REPO
docker push $ECR_REPO:0.1
```
- Update the deployment manifest and deploy the UI:

```bash
sed -i "s#__IMAGE_DEEPSEEK_CHATBOT__#$ECR_REPO:0.1#g" chatbot-ui/manifests/deployment.yaml
kubectl apply -f chatbot-ui/manifests/ingress-class.yaml
kubectl apply -f chatbot-ui/manifests/deployment.yaml
```
- Get the URL to access the chatbot UI:

```bash
echo http://$(kubectl get ingress/deepseek-chatbot-ingress -n deepseek -o json | jq -r '.status.loadBalancer.ingress[0].hostname')
```
- Retrieve the UI login credentials:

```bash
echo -e "Username=$(kubectl get secret deepseek-chatbot-secrets -n deepseek -o jsonpath='{.data.admin-username}' | base64 --decode)\nPassword=$(kubectl get secret deepseek-chatbot-secrets -n deepseek -o jsonpath='{.data.admin-password}' | base64 --decode)"
```
After logging in, you can use the chatbot UI to interact with the model. These steps will help you deploy DeepSeek-R1 on AWS EKS, utilizing flexible scaling and resource control to keep costs low and performance high.
Overall, DeepSeek is a fun new way to leverage AWS and grow your AI expertise.