CodeClash

Benchmarking Goal-Oriented Software Engineering

Last updated: 9 days ago

claw-eval

Claw-Eval is a harness for evaluating LLMs as agents. All tasks are verified by humans.

Last updated: 9 days ago

hapi

https://github.com/gitcodeaction/hapi

Last updated: 10 days ago

claude-devtools

https://github.com/matt1398/claude-devtools

Last updated: 18 days ago

long-prompts-analysis

https://github.com/nilenso/long-prompts-analysis

Last updated: 1 month ago

system_prompts_leaks

https://github.com/asgeirtj/system_prompts_leaks

Last updated: 1 month ago

pi-mono

https://github.com/badlogic/pi-mono

Last updated: 1 month ago

dw-dengwei.daily-arXiv-ai-enhanced

https://github.com/dw-dengwei/daily-arXiv-ai-enhanced

Last updated: 6 months ago

jax-ml.scaling-book

https://github.com/jax-ml/scaling-book

Last updated: 6 months ago

skindhu.How-To-Scale-Your-Model-CN

https://github.com/skindhu/How-To-Scale-Your-Model-CN

Last updated: 6 months ago