Blog1

Tag: Reasoning Models

There are 15 notes under this tag.

April 30, 2026
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention
April 30, 2026
gpt-oss-120b & gpt-oss-20b Model Card
April 30, 2026
Comparing Reasoning Model Training Methods: DeepSeek-R1 vs Kimi k1.5 vs Qwen3
April 30, 2026
GRPO: Group Relative Policy Optimization
April 30, 2026
Reasoning Models and Reinforcement Learning
April 30, 2026
Test-Time Compute Scaling
April 30, 2026
Knowledge Distillation
April 30, 2026
DeepSeek Model Series
April 30, 2026
Qwen3
April 30, 2026
Why MCTS Failed in LLM Reasoning
April 30, 2026
Knowledge Distillation vs RL: Which Is More Effective for Acquiring Reasoning Ability?
April 30, 2026
Kimi k1.5: Scaling Reinforcement Learning with LLMs
April 30, 2026
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention
April 30, 2026
Qwen3 Technical Report
April 30, 2026
gpt-oss-120b & gpt-oss-20b Model Card
