A Comprehensive Vietnamese Evaluation Benchmark for Large Language Models (LLMs)
As the generative AI wave spread globally around 2023, with tens of thousands of new models emerging in just a few years, Vietnam was no exception. Many research groups and startups began experimenting with Vietnamese AI models. However, a common limitation quickly surfaced: there was no unified evaluation standard for the Vietnamese language.
Most well-resourced teams had to build their own evaluation sets, which fragmented the field and made results difficult to compare. To fill this gap, in November 2023, Zalo AI, in collaboration with the Japan Advanced Institute of Science and Technology (JAIST), introduced VMLU (Vietnamese Multitask Language Understanding), the first comprehensive Vietnamese evaluation benchmark for Large Language Models (LLMs).
A Comprehensive Benchmark, Open to the Community
VMLU consists of two main components: the dataset and the standardized evaluation tool.
The dataset contains 10,880 multiple-choice questions covering 58 subjects across 4 domains: STEM, Social Sciences, Humanities, and Miscellaneous.
Questions are categorized into 4 difficulty levels: Primary School, Secondary School, High School, and Professional (Undergraduate & Postgraduate).
The accompanying tool provides detailed instructions, making it easy for research groups to deploy, test, and compare results fairly.
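To make the idea of a shared scoring procedure concrete, here is a minimal sketch of how multiple-choice answers could be scored overall and per domain. It is an illustration only, not the VMLU tool itself: the function name accuracy_by_domain and the record fields "id", "domain", and "answer" are hypothetical, and the questions are toy data rather than real VMLU items.

```python
from collections import defaultdict


def accuracy_by_domain(questions, predictions):
    """Return (overall accuracy, per-domain accuracy) for multiple-choice answers.

    questions: list of dicts with the hypothetical keys "id", "domain", "answer".
    predictions: dict mapping a question id to the predicted choice letter.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for q in questions:
        total[q["domain"]] += 1
        # A missing prediction is counted as a wrong answer.
        if predictions.get(q["id"]) == q["answer"]:
            correct[q["domain"]] += 1
    per_domain = {d: correct[d] / total[d] for d in total}
    overall = sum(correct.values()) / sum(total.values()) if total else 0.0
    return overall, per_domain


# Toy data for illustration only; these are not real VMLU questions.
questions = [
    {"id": "q1", "domain": "STEM", "answer": "A"},
    {"id": "q2", "domain": "STEM", "answer": "C"},
    {"id": "q3", "domain": "Humanities", "answer": "B"},
    {"id": "q4", "domain": "Social Sciences", "answer": "D"},
]
predictions = {"q1": "A", "q2": "B", "q3": "B"}  # q4 is left unanswered

overall, per_domain = accuracy_by_domain(questions, predictions)
print(f"overall: {overall:.2f}")
for domain, acc in per_domain.items():
    print(f"{domain}: {acc:.2f}")
```

Scoring every model with the same function and the same question set is what makes the resulting numbers comparable across teams; the same grouping could be applied to difficulty levels instead of domains.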
Importantly, VMLU is open and free to the Vietnamese AI community. This enables independent developers, small startups, and research institutions alike to access and use a unified standard.
Momentum for the Vietnamese AI Wave
VMLU is not just a technical toolkit but also a catalyst for the research community. With a common standard, Vietnamese AI models can be compared, improved, and brought closer to international benchmarks.
Zalo AI has previously created venues for the community through the Zalo AI Challenge, Zalo AI Hackathon, and Zalo AI Summit, encouraging young engineers to apply AI to social problems. VMLU continues in this spirit, laying another important foundation for the Vietnamese AI ecosystem.
Recognized Achievements
Just one year after its announcement, VMLU has proven its practical value:
3,729 LLM evaluations were performed.
155 individuals and organizations submitted results.
45 domestic and international LLMs were officially announced on the platform.
This demonstrates that VMLU is not just a research project; it has truly become a common standard for the Vietnamese AI community, where developers learn, compare, and collaboratively enhance the quality of their models.