Qwen2.5 3B

Introduction

Qwen2.5-3B-Instruct is a state-of-the-art multilingual large language model with 3 billion parameters. It is suited to a variety of language understanding and generation tasks, including coding and mathematics.

Supported Platforms

SC8380

Performance

Inference speed: 18 TPS (tokens per second)
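
To put the throughput figure in context, here is a rough sketch of how long a response might take to generate. It assumes that TPS is the token generation (decode) rate, that the rate stays constant over the whole response, and that prompt processing time is excluded. The function name `generation_seconds` is hypothetical and only used for illustration.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float = 18.0) -> float:
    """Estimate decode time for num_tokens at a constant rate.

    Assumption: 18.0 is the listed inference speed, treated here as a
    steady decode rate; real throughput can vary with context length.
    """
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    for n in (64, 256, 1024):
        print(f"{n} tokens: about {generation_seconds(n):.1f} s")
```

Under these assumptions, a 256-token reply would take a little over 14 seconds.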

Technical Details

Input sequence length for Prompt Processor: 128
Context length: 4096
Number of parameters: 3B
Precision: W4A16 (4-bit weights, 16-bit activations)
Key-value heads: The model uses Grouped-Query Attention (GQA).
Model parts: The model is split into 5 parts, with weights shared across model variants that use different auto-regression lengths (e.g., 128 and 32).
Supported languages: Multiple languages, including English and various European languages that use the Latin alphabet.
Minimum QNN SDK version required: 2.31
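
The figures above can be turned into two quick back-of-the-envelope calculations: how a prompt might be fed to the prompt processor in 128-token chunks, and roughly how much storage the 4-bit weights need. This is a minimal sketch, not the deployed runtime's actual logic. The helper names `split_prompt` and `weight_bytes` are hypothetical, the chunking scheme (plain slicing, no padding) is an assumption, and the size estimate ignores quantization scales, zero-points, and any tensors kept at higher precision.

```python
from typing import List, Sequence

PROMPT_CHUNK = 128      # input sequence length for the prompt processor
CONTEXT_LENGTH = 4096   # maximum context length


def split_prompt(token_ids: Sequence[int],
                 chunk: int = PROMPT_CHUNK,
                 context: int = CONTEXT_LENGTH) -> List[List[int]]:
    """Split a tokenized prompt into chunks of at most `chunk` tokens.

    Assumption: the prompt is processed chunk by chunk; the last chunk
    may be shorter (a real runtime may pad it instead).
    """
    if len(token_ids) > context:
        raise ValueError("prompt exceeds the context length")
    return [list(token_ids[i:i + chunk]) for i in range(0, len(token_ids), chunk)]


def weight_bytes(num_params: int, bits_per_weight: int = 4) -> int:
    """Approximate weight storage in bytes, ignoring quantization metadata."""
    return num_params * bits_per_weight // 8


if __name__ == "__main__":
    chunks = split_prompt(list(range(300)))
    print([len(c) for c in chunks])                  # chunk sizes for a 300-token prompt
    print(weight_bytes(3_000_000_000) / 1e9, "GB")   # nominal 3B parameters at 4 bits
```

With a nominal 3 billion parameters at 4 bits each, the weights alone come to roughly 1.5 GB; the actual download size may differ.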

Applications

Chat
Content generation
Customer support

Supported Platform Types

SC8380

License

Source Model: Apache 2.0
Deployable Model: Apache 2.0

Download

Click here to download