Benchmarking Goal-Oriented Software Engineering
Last updated: 9 days ago
Claw-Eval is a harness for evaluating LLMs as agents. All tasks are verified by humans.
Last updated: 9 days ago
https://github.com/gitcodeaction/hapi
Last updated: 10 days ago
https://github.com/matt1398/claude-devtools
Last updated: 18 days ago
https://github.com/nilenso/long-prompts-analysis
Last updated: 1 month ago
https://github.com/asgeirtj/system_prompts_leaks
Last updated: 1 month ago
https://github.com/badlogic/pi-mono
Last updated: 1 month ago
https://github.com/dw-dengwei/daily-arXiv-ai-enhanced
Last updated: 6 months ago
https://github.com/jax-ml/scaling-book
Last updated: 6 months ago
https://github.com/skindhu/How-To-Scale-Your-Model-CN
Last updated: 6 months ago