Categories
AI

DeepSeek-R1 (reasoning models)

DeepSeek’s first-generation reasoning models, with performance comparable to OpenAI’s o1.

Deepseek logo (2025-01-22)

What is a reasoning model?

In the words of OpenAI:

A new series of AI models designed to spend more time thinking before they respond. As AI becomes more advanced, it will solve increasingly complex and critical problems. It also takes significantly more compute to power these capabilities.

https://openai.com/o1/ (2025-01-22)

Or, as Wikipedia puts it:

It spends time “thinking” before it answers, making it better at complex reasoning tasks, science and programming…

https://en.wikipedia.org/wiki/OpenAI_o1 (2025-01-22)

Now we have two options: OpenAI’s o1 model and DeepSeek-R1. The main difference is that the OpenAI model is available only to paying customers, with tight usage limits. Thanks to its much lower computational requirements and pricing, you can use DeepSeek online on their website. And because it is open source, you can also run it locally with Ollama.

Try Online

DeepSeek-V3 achieves a significant breakthrough in inference speed over previous models.

It tops the leaderboard among open-source models and rivals the most advanced closed-source models globally.

https://www.deepseek.com (2025-01-22)


Try local with Ollama

https://ollama.com/library/deepseek-r1

1.5B Qwen DeepSeek R1

ollama pull deepseek-r1:1.5b

7B Qwen DeepSeek R1

ollama pull deepseek-r1:7b

8B Llama DeepSeek R1

ollama pull deepseek-r1:8b

14B Qwen DeepSeek R1

ollama pull deepseek-r1:14b

32B Qwen DeepSeek R1

ollama pull deepseek-r1:32b

70B Llama DeepSeek R1

ollama pull deepseek-r1:70b

671B DeepSeek R1

ollama pull deepseek-r1:671b
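
Once a model is pulled, you can talk to it straight from the terminal, either interactively or with a single one-shot prompt. A minimal sketch, assuming Ollama is installed and the 1.5B model has been pulled (the tag and prompt are just examples):

```shell
# Start an interactive chat session (type /bye to exit)
ollama run deepseek-r1:1.5b

# Or send a single prompt non-interactively and print the response
ollama run deepseek-r1:1.5b "Explain why the sky is blue."
```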

What I learned using DeepSeek-R1

Prompting reasoning models feels quite different. Much of the prompt engineering you know from instruct models no longer applies. Zero-shot prompting seems to work well for them, and asking the model to take all the time it needs, rather than pressing for fast results, apparently improves the output.

Reasoning models take notably longer to respond, but they deliver strong results in many use cases. In some benchmarks DeepSeek-R1 even beats OpenAI’s o1, at a fraction of the price (via API or on your own hardware). And since it is open source, you can self-host it, which is not possible with OpenAI’s o1.

Hallucination of browsing

In my example, I saw deepseek-r1:32b referring to looking things up online even though it has no way to do so. It believes it is looking things up, but all of that happens purely within its reasoning.

2025-02-01: deepseek-r1:32b “looking up” facts despite having no real web access.
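
This fake browsing lives entirely inside the model’s reasoning: in Ollama, the R1 models wrap their chain of thought in `<think>…</think>` tags before the actual reply. A small sketch of separating the two, using a made-up sample response in place of real model output:

```shell
# Made-up sample of an R1-style response: the "browsing" happens
# inside the <think> block, the real answer comes after it.
response='<think>Let me look this up online... (it cannot actually browse)</think>
The answer is 42.'

# Keep only the lines after the one containing the closing </think> tag
answer=$(printf '%s\n' "$response" | awk 'found { print } /<\/think>/ { found = 1 }')
echo "$answer"
```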