Analysis Reveals Unreliable LLM-as-a-Judge Performance in Multilingual Settings | HACKOBAR_