●builder: You can now benchmark code models across 12 languages using a contamination-aware framework, which is relevant if your product targets non-Python codebases.
●researcher: Enables contamination-controlled evaluation of multilingual code generation, useful for benchmarking how well models generalize to proprietary or low-resource languages.