Overview

  • Founded Date 28 August 2022
  • Sectors Automotive Jobs

Company Description

DeepSeek’s First-generation Reasoning Models

DeepSeek’s first-generation reasoning models achieve performance comparable to OpenAI-o1 across math, code, and reasoning tasks.

Models

DeepSeek-R1

Distilled models

The DeepSeek team has demonstrated that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.

Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results show that the distilled smaller dense models perform exceptionally well on benchmarks.

DeepSeek-R1-Distill-Qwen-1.5B

DeepSeek-R1-Distill-Qwen-7B

DeepSeek-R1-Distill-Llama-8B

DeepSeek-R1-Distill-Qwen-14B

DeepSeek-R1-Distill-Qwen-32B

DeepSeek-R1-Distill-Llama-70B
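
The distillation described above amounts to supervised fine-tuning of a smaller model on reasoning traces written by a larger one. As a rough illustration only, the sketch below shows how teacher outputs might be packed into prompt/completion pairs before fine-tuning. The `TeacherSample` type, the `to_sft_example` and `build_dataset` helpers, the `<think>` wrapper layout, and the rule for skipping empty samples are all assumptions made for this example; they are not DeepSeek's actual data pipeline.

```python
from dataclasses import dataclass


@dataclass
class TeacherSample:
    """One output from the larger (teacher) model. Hypothetical structure."""
    question: str
    reasoning: str  # the teacher's step-by-step reasoning trace
    answer: str     # the teacher's final answer


def to_sft_example(sample: TeacherSample) -> dict:
    """Turn one teacher output into a prompt/completion pair for fine-tuning."""
    # Assumed layout: reasoning wrapped in <think> tags, followed by the answer.
    completion = (
        f"<think>\n{sample.reasoning.strip()}\n</think>\n{sample.answer.strip()}"
    )
    return {"prompt": sample.question.strip(), "completion": completion}


def build_dataset(samples: list[TeacherSample]) -> list[dict]:
    """Keep only samples that have both a reasoning trace and an answer."""
    # Assumed filter: a sample without reasoning teaches the student nothing
    # about the reasoning pattern, so it is dropped here.
    return [
        to_sft_example(s)
        for s in samples
        if s.reasoning.strip() and s.answer.strip()
    ]


samples = [
    TeacherSample("What is 12 * 7?", "12 * 7 = 84.", "84"),
    TeacherSample("Is 9 prime?", "", "No"),  # no reasoning trace, skipped
]
dataset = build_dataset(samples)
print(len(dataset))
```

The resulting list of pairs could then be passed to any standard supervised fine-tuning loop for the smaller dense model; that training step is outside the scope of this sketch.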

License

The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs.
