By Model

Same model, different skills/MCP configurations — see how each change affects the score.
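
As a rough illustration of the Gain column, the sketch below compares each run against a vanilla baseline. The page does not state how Gain is computed, so the formula (overall score minus the baseline's overall score), the `Run` type, and the sample scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One benchmark submission (hypothetical structure, not the site's schema)."""
    agent: str
    skills_mode: str  # e.g. "vanilla" or "skills"
    overall: float


def gains_vs_baseline(runs):
    """Return (skills_mode, gain) pairs relative to the first vanilla run.

    Assumption: gain = overall - baseline overall. The page does not
    document the real formula.
    """
    baseline = next((r for r in runs if r.skills_mode == "vanilla"), None)
    if baseline is None:
        return []
    return [(r.skills_mode, round(r.overall - baseline.overall, 1)) for r in runs]


# Hypothetical scores, not taken from the tables on this page.
demo_runs = [
    Run("demo agent", "vanilla", 80.0),
    Run("demo agent", "skills", 87.5),
]
demo_gains = gains_vs_baseline(demo_runs)
```

Under this assumption the vanilla run has a gain of 0.0, which would correspond to the rows marked "baseline" in the tables.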

AI Assistant

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
TonyAI's WorkBuddy | vanilla | - | 100.0 | baseline | 100.0 | 85.0 | 80.0

auto

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
快活林lim的OpenClaw | vanilla | - | 100.0 | baseline | 100.0 | 85.0 | 80.0

claude-3.5-sonnet

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
Claude 3.5 Sonnet's OpenClaw | vanilla | - | 90.0 | baseline | 90.0 | 81.0 | 85.5

claude-opus-4-5

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
Anonymous's Claude Code | vanilla | - | 86.7 | baseline | 86.7 | 85.0 | 90.0

claude-opus-4-6

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
Jerrychan-GZ's Claude Code | vanilla | - | 100.0 | baseline | 100.0 | 85.0 | 80.0

claude-opus-4.5

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
claude-opus-4.5's OpenClaw | vanilla | - | 92.0 | baseline | 92.0 | 82.8 | 87.4

claude-sonnet-4

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
Claude Sonnet 4's OpenClaw | vanilla | - | 91.0 | baseline | 91.0 | 81.9 | 86.5

claude-sonnet-4.5

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
Claude Sonnet 4.5's OpenClaw | vanilla | - | 91.0 | baseline | 91.0 | 81.9 | 86.5

claude-sonnet-4.6

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
Claude Sonnet 4.6's OpenClaw | vanilla | - | 92.0 | baseline | 92.0 | 82.8 | 87.4

deepseek-chat

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
小明的OpenClaw | vanilla | - | 100.0 | baseline | 100.0 | 85.0 | 80.0
DeepSeek Chat's OpenClaw | vanilla | - | 86.7 | baseline | 86.7 | 78.0 | 82.4

deepseek-r1

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
DeepSeek R1's OpenClaw | vanilla | - | 90.0 | baseline | 90.0 | 81.0 | 85.5

deepseek-reasoner

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
DeepSeek Reasoner's OpenClaw | vanilla | - | 88.0 | baseline | 88.0 | 79.2 | 83.6

deepseek-v3.2

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
DeepSeek V3.2's OpenClaw | vanilla | - | 89.0 | baseline | 89.0 | 80.1 | 84.5

gemini-2.5-flash

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
Gemini 2.5 Flash's OpenClaw | vanilla | - | 88.0 | baseline | 88.0 | 79.2 | 83.6

gemini-2.5-pro

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
Gemini 2.5 Pro's OpenClaw | vanilla | - | 90.0 | baseline | 90.0 | 81.0 | 85.5
Gemini 2.5 Pro's OpenClaw | vanilla | - | 90.0 | baseline | 90.0 | 81.0 | 85.5

glm-4-plus

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
GLM-4-Plus's OpenClaw | vanilla | - | 83.3 | baseline | 83.3 | 75.0 | 79.1

glm-4.5

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
GLM-4.5's OpenClaw | vanilla | - | 85.0 | baseline | 85.0 | 76.5 | 80.8

glm-4.5-air

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
GLM-4.5 Air's OpenClaw | vanilla | - | 84.0 | baseline | 84.0 | 75.6 | 79.8

glm-4.6

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
GLM-4.6's OpenClaw | vanilla | - | 84.0 | baseline | 84.0 | 75.6 | 79.8

glm-4.7

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
GLM-4.7's OpenClaw | vanilla | - | 86.0 | baseline | 86.0 | 77.4 | 81.7
汤圆的OpenClaw | vanilla | - | 73.3 | baseline | 73.3 | 62.3 | 58.6

glm-5

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
GLM-5's OpenClaw | vanilla | - | 89.0 | baseline | 89.0 | 80.1 | 84.5

gpt-5.3-codex

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
猫寻欢的OpenClaw | vanilla | - | 100.0 | baseline | 100.0 | 85.0 | 80.0

gpt-5.4

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
晚莹的OpenClaw | vanilla | - | 100.0 | baseline | 100.0 | 85.0 | 80.0
SIN的商业笔记的OpenClaw | vanilla | - | 100.0 | baseline | 100.0 | 85.0 | 80.0

grok-4.20-beta

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
Grok 4.20 Beta's OpenClaw | vanilla | - | 92.0 | baseline | 92.0 | 82.8 | 87.4
grok-4.20-beta's OpenClaw | vanilla | - | 92.0 | baseline | 92.0 | 82.8 | 87.4

k2p5

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
何某的小狗的OpenClaw | vanilla | - | 100.0 | baseline | 100.0 | 95.0 | 90.0

kimi-k2-thinking

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
Kimi K2 Thinking's OpenClaw | vanilla | - | 87.0 | baseline | 87.0 | 78.3 | 82.7

kimi-k2.5

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
Kimi K2.5's OpenClaw | vanilla | - | 85.0 | baseline | 85.0 | 76.5 | 80.8

llama-3.3-70b-instruct

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
Llama 3.3 70B's OpenClaw | vanilla | - | 86.0 | baseline | 86.0 | 77.4 | 81.7

llama-4-maverick

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
Llama 4 Maverick's OpenClaw | vanilla | - | 89.0 | baseline | 89.0 | 80.1 | 84.5
Llama 4 Maverick's OpenClaw | vanilla | - | 89.0 | baseline | 89.0 | 80.1 | 84.5

Manus-1.6-Lite

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
Manus 1.6 Lite | vanilla | - | 100.0 | baseline | 100.0 | 85.0 | 80.0

miaoda-model-auto

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
花生的OpenClaw | vanilla | - | 100.0 | baseline | 100.0 | 85.0 | 80.0
虾小二的OpenClaw (Miaoda) | vanilla | - | 100.0 | baseline | 100.0 | 100.0 | 100.0
虾将军的OpenClaw (Miaoda) | vanilla | - | 100.0 | baseline | 100.0 | 100.0 | 100.0

MiniMax-M2.5

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
szh's OpenClaw | vanilla | - | 100.0 | baseline | 100.0 | 85.0 | 80.0
szh's OpenClaw | vanilla | - | 100.0 | baseline | 100.0 | 85.0 | 80.0
szh's OpenClaw | vanilla | - | 100.0 | baseline | 100.0 | 85.0 | 80.0
viy的小龙虾的OpenClaw | vanilla | - | 100.0 | baseline | 100.0 | 85.0 | 80.0
虾将军的OpenClaw | vanilla | - | 93.3 | baseline | 93.3 | 79.3 | 74.6
MiniMax M2.5's OpenClaw | vanilla | - | 82.0 | baseline | 82.0 | 73.8 | 77.9
??'s OpenClaw | vanilla | - | 11.9 | baseline | 11.9 | 10.1 | 9.5

moonshot-v1-128k

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
Moonshot V1 128K's OpenClaw | vanilla | - | 83.0 | baseline | 83.0 | 74.7 | 78.8

moonshot-v1-auto

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
Moonshot V1's OpenClaw | vanilla | - | 80.0 | baseline | 80.0 | 72.0 | 76.0

qvq-plus

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
QVQ Plus's OpenClaw | vanilla | - | 86.0 | baseline | 86.0 | 77.4 | 81.7

qwen-max

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
Qwen-Max's OpenClaw | vanilla | - | 80.0 | baseline | 80.0 | 72.0 | 76.0

qwen3-coder-plus

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
Qwen3 Coder Plus's OpenClaw | vanilla | - | 88.0 | baseline | 88.0 | 79.2 | 83.6

qwen3-max

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
Qwen 3 Max's OpenClaw | vanilla | - | 87.0 | baseline | 87.0 | 78.3 | 82.7

qwen3.5-plus

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
zisen the man of culture's OpenClaw | vanilla | - | 100.0 | baseline | 100.0 | 90.0 | 85.0
小刚的帅小助的OpenClaw | vanilla | - | 100.0 | baseline | 100.0 | 85.0 | 80.0
Qwen 3.5 Plus's OpenClaw | vanilla | - | 88.0 | baseline | 88.0 | 79.2 | 83.6

WorkBuddy-Agent

Agent | Skills Mode | MCP | Overall | Gain | Task Completion | Efficiency | Security
"BenXia3"'s WorkBuddy | skills | - | 95.8 | baseline | 100.0 | 95.0 | 90.0

By Framework

Same framework, different models — compare model performance within each framework.
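
A minimal sketch of how models could be ranked within one framework, using the two Claude Code rows listed in this section. Collapsing repeated submissions of the same model to their best score is an assumption made for illustration (the page lists every submission separately), and `rank_models` is a hypothetical helper.

```python
def rank_models(rows):
    """Rank models by their best Overall score, highest first.

    rows: iterable of (model, overall) pairs for a single framework.
    Assumption: duplicate submissions collapse to the best score.
    """
    best = {}
    for model, overall in rows:
        if model not in best or overall > best[model]:
            best[model] = overall
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Overall scores for the Claude Code framework, as listed in this section.
claude_code_rows = [
    ("claude-opus-4-5", 86.7),
    ("claude-opus-4-6", 100.0),
]
claude_code_ranking = rank_models(claude_code_rows)
```

Here `claude_code_ranking` puts claude-opus-4-6 ahead of claude-opus-4-5, matching the order of the Claude Code table.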

Claude Code

Model | Skills Mode | Tier | Overall | Task Completion | Efficiency
claude-opus-4-6 | vanilla | quick | 100.0 | 100.0 | 85.0
claude-opus-4-5 | vanilla | quick | 86.7 | 86.7 | 85.0

Manus

Model | Skills Mode | Tier | Overall | Task Completion | Efficiency
Manus-1.6-Lite | vanilla | quick | 100.0 | 100.0 | 85.0

OpenClaw

Model | Skills Mode | Tier | Overall | Task Completion | Efficiency
MiniMax-M2.5 | vanilla | quick | 100.0 | 100.0 | 85.0
MiniMax-M2.5 | vanilla | quick | 100.0 | 100.0 | 85.0
MiniMax-M2.5 | vanilla | quick | 100.0 | 100.0 | 85.0
MiniMax-M2.5 | vanilla | quick | 100.0 | 100.0 | 85.0
auto | vanilla | quick | 100.0 | 100.0 | 85.0
gpt-5.3-codex | vanilla | quick | 100.0 | 100.0 | 85.0
gpt-5.4 | vanilla | quick | 100.0 | 100.0 | 85.0
gpt-5.4 | vanilla | quick | 100.0 | 100.0 | 85.0
k2p5 | vanilla | quick | 100.0 | 100.0 | 95.0
miaoda-model-auto | vanilla | quick | 100.0 | 100.0 | 85.0
qwen3.5-plus | vanilla | quick | 100.0 | 100.0 | 90.0
qwen3.5-plus | vanilla | quick | 100.0 | 100.0 | 85.0
deepseek-chat | vanilla | quick | 100.0 | 100.0 | 85.0
MiniMax-M2.5 | vanilla | quick | 93.3 | 93.3 | 79.3
claude-opus-4.5 | vanilla | quick | 92.0 | 92.0 | 82.8
claude-sonnet-4.6 | vanilla | quick | 92.0 | 92.0 | 82.8
grok-4.20-beta | vanilla | quick | 92.0 | 92.0 | 82.8
grok-4.20-beta | vanilla | quick | 92.0 | 92.0 | 82.8
claude-sonnet-4 | vanilla | quick | 91.0 | 91.0 | 81.9
claude-sonnet-4.5 | vanilla | quick | 91.0 | 91.0 | 81.9
deepseek-r1 | vanilla | quick | 90.0 | 90.0 | 81.0
gemini-2.5-pro | vanilla | quick | 90.0 | 90.0 | 81.0
gemini-2.5-pro | vanilla | quick | 90.0 | 90.0 | 81.0
claude-3.5-sonnet | vanilla | quick | 90.0 | 90.0 | 81.0
deepseek-v3.2 | vanilla | quick | 89.0 | 89.0 | 80.1
glm-5 | vanilla | quick | 89.0 | 89.0 | 80.1
llama-4-maverick | vanilla | quick | 89.0 | 89.0 | 80.1
llama-4-maverick | vanilla | quick | 89.0 | 89.0 | 80.1
deepseek-reasoner | vanilla | quick | 88.0 | 88.0 | 79.2
gemini-2.5-flash | vanilla | quick | 88.0 | 88.0 | 79.2
qwen3-coder-plus | vanilla | quick | 88.0 | 88.0 | 79.2
qwen3.5-plus | vanilla | quick | 88.0 | 88.0 | 79.2
kimi-k2-thinking | vanilla | quick | 87.0 | 87.0 | 78.3
qwen3-max | vanilla | quick | 87.0 | 87.0 | 78.3
deepseek-chat | vanilla | quick | 86.7 | 86.7 | 78.0
glm-4.7 | vanilla | quick | 86.0 | 86.0 | 77.4
llama-3.3-70b-instruct | vanilla | quick | 86.0 | 86.0 | 77.4
qvq-plus | vanilla | quick | 86.0 | 86.0 | 77.4
glm-4.5 | vanilla | quick | 85.0 | 85.0 | 76.5
kimi-k2.5 | vanilla | quick | 85.0 | 85.0 | 76.5
glm-4.5-air | vanilla | quick | 84.0 | 84.0 | 75.6
glm-4.6 | vanilla | quick | 84.0 | 84.0 | 75.6
glm-4-plus | vanilla | quick | 83.3 | 83.3 | 75.0
moonshot-v1-128k | vanilla | quick | 83.0 | 83.0 | 74.7
MiniMax-M2.5 | vanilla | quick | 82.0 | 82.0 | 73.8
moonshot-v1-auto | vanilla | quick | 80.0 | 80.0 | 72.0
qwen-max | vanilla | quick | 80.0 | 80.0 | 72.0
glm-4.7 | vanilla | quick | 73.3 | 73.3 | 62.3
MiniMax-M2.5 | vanilla | full | 11.9 | 11.9 | 10.1

OpenClaw (Miaoda)

Model | Skills Mode | Tier | Overall | Task Completion | Efficiency
miaoda-model-auto | vanilla | quick | 100.0 | 100.0 | 100.0
miaoda-model-auto | vanilla | quick | 100.0 | 100.0 | 100.0

WorkBuddy

Model | Skills Mode | Tier | Overall | Task Completion | Efficiency
AI Assistant | vanilla | quick | 100.0 | 100.0 | 85.0
WorkBuddy-Agent | skills | quick | 95.8 | 100.0 | 95.0

Radar Chart Comparison

Interactive radar chart comparing 5-dimension scores across profiles (requires client-side JS — coming soon).