Artificial Intelligence

DeepSeek open-sources V4 large language model series


DeepSeek released a new series of open-source large language models called V4, comprising V4-Pro and V4-Flash and featuring a hybrid attention mechanism and mHC. V4-Pro outperformed several other frontier models across three benchmarks.

Chinese AI developer DeepSeek released the V4 series of open-source large language models, including V4-Pro and V4-Flash. V4-Pro has 1.6 trillion parameters and activates 49 billion parameters when answering user prompts, while V4-Flash contains 284 billion parameters and activates 13 billion. The models feature a hybrid attention mechanism that reduces KV cache memory usage by 90%. DeepSeek trained V4 using a 27 trillion token dataset and a two-step post-training workflow. V4-Pro outperformed several frontier models, including Claude Opus 4.6, across three benchmarks. V4-Pro and V4-Flash are available in preview on Hugging Face.
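The reported figures imply a sparsely activated (mixture-of-experts-style) design, where only a small fraction of each model's weights fire per prompt. A minimal back-of-the-envelope sketch, using only the numbers quoted above (the baseline KV-cache size is a hypothetical placeholder, not a figure from DeepSeek):

```python
def active_fraction(total_params, active_params):
    """Fraction of parameters activated per prompt in a sparsely activated model."""
    return active_params / total_params

# Parameter counts as reported in the article.
v4_pro_frac = active_fraction(1.6e12, 49e9)    # V4-Pro: 49B active of 1.6T total
v4_flash_frac = active_fraction(284e9, 13e9)   # V4-Flash: 13B active of 284B total

# A 90% KV-cache reduction means serving with one tenth of the baseline memory.
baseline_kv_gb = 100.0                          # hypothetical baseline, for illustration
reduced_kv_gb = baseline_kv_gb * (1 - 0.90)

print(f"V4-Pro activates {v4_pro_frac:.1%} of its parameters per prompt")
print(f"V4-Flash activates {v4_flash_frac:.1%} of its parameters per prompt")
print(f"KV cache: {baseline_kv_gb:.0f} GB -> {reduced_kv_gb:.0f} GB")
```

On these numbers, V4-Pro activates roughly 3% of its weights per prompt and V4-Flash roughly 5%, which is how a 1.6-trillion-parameter model can run with the per-prompt compute of a much smaller dense one.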

This content was automatically generated and/or translated by AI. It may contain inaccuracies. Please refer to the original sources for verification.
