| Shell | Model | Backend | Hash | Format | Test | Status | Reason | Trajectory | Invoked | Match | Warmup (s) | Measured (s) | Details |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| continue-cli | local/MFDoom/deepseek-r1-tool-calling:8b | ollama | unknown | tool | Patch File | FAIL | | clean | no | 0% | 88.241 | 12.29 | details |
| continue-cli | local/MFDoom/deepseek-r1-tool-calling:8b | ollama | unknown | tool | Python File Simple | FAIL | | clean | no | 0% | 5.074 | 4.774 | details |
| continue-cli | local/MFDoom/deepseek-r1-tool-calling:8b | ollama | unknown | tool | Shell Date | FAIL | | clean | no | 0% | 5.027 | 4.524 | details |
| continue-cli | local/MFDoom/deepseek-r1-tool-calling:8b | ollama | unknown | tool | Shell PWD | FAIL | | clean | no | 0% | 4.523 | 4.727 | details |