One-shot Tool Call
Tool call benchmark: 18/18 passed (100% first-attempt accuracy), 2 skipped. The full suite covers Bash, File Ops, MCP, Skills, and Generation.