arxiv:2505.18102
Thanawat Lodkaew
skydddoogg
AI & ML interests
None yet
Recent Activity
upvoted a paper about 4 hours ago
How Can I Publish My LLM Benchmark Without Giving the True Answers Away?
upvoted a paper about 5 hours ago
Do Coding Agents Deceive Us? Detecting and Preventing Cheating via Capped Evaluation with Randomized Tests
submitted a paper about 5 hours ago
Do Coding Agents Deceive Us? Detecting and Preventing Cheating via Capped Evaluation with Randomized Tests