On the morning of March 14, the National Innovation Center (NIC) launched the 2025 Innovation Challenge Program with the goal of promoting the development of the artificial intelligence (AI) field in Vietnam.
The 2025 Innovation Challenge focuses on the ViGen project, an effort to create a high-quality open-source Vietnamese dataset for training and evaluating large language models (LLMs), and thereby improving their efficiency.
The Vietnamese dataset is intended to help AI models better understand Vietnamese culture, context, and expressions. The project is expected to increase the presence of Vietnamese in AI development and contribute to promoting the digital economy.

The ViGen project originated from a tripartite cooperation between Meta Group, NIC and the organization “AI for Vietnam”. Within this partnership, the National Innovation Center manages and coordinates the project and ensures that it is consistent with Vietnam’s national goals.
The mission of the ViGen project is to make AI models support Vietnamese naturally and comprehensively from the core to unlock the potential of AI applications in Vietnam.
ViGen will build large-scale, high-quality open-source Vietnamese datasets to train and evaluate the capabilities of AI models.
The ViGen project also contributes to ensuring that AI development in Vietnam is consistent with cultural values and ethical standards, aiming to build a locally relevant and responsible open-source AI ecosystem.
To support the project, Meta will contribute its open-source datasets, which include insights on mobility and social connectivity, as well as training data from AI-powered population maps.
According to Mr. Vo Xuan Hoai, Deputy Director of the National Innovation Center, AI is transforming the world. Therefore, developing large-scale, high-quality, open-source Vietnamese datasets for AI training and evaluation has become an urgent priority.
“The ViGen project is in line with Resolution 57 of the Politburo to promote breakthroughs in science, technology, innovation and national digital transformation. With the joint efforts of policymakers, researchers, developers, experts and users, we will turn AI into a powerful tool for all Vietnamese people and make Vietnam a global AI powerhouse,” said the Deputy Director of the National Innovation Center.

Vietnamese is used by more than 100 million people. However, the Vietnamese data used to train AI models currently accounts for only a very small proportion, less than 1%. As a result, the output of AI models has informational value but does not sound natural and does not fully convey the value of Vietnamese, which makes it less useful and less efficient.
Mr. Tran Viet Hung, founder and CEO of AI for Vietnam, shared: “The ViGen project will contribute large, high-quality Vietnamese datasets to the community, improving the current situation in which Vietnamese is considered a language with a very modest presence in AI”.
According to Mr. Tran Viet Hung, the ViGen project also shows the power and value of open-source models like Llama, which allow for the development of innovative solutions that take into account the context of the Vietnamese language.
In fact, Vietnamese virtual assistants built on the Llama large language model have already appeared in Vietnam, such as Misa's virtual assistant, which automates information retrieval, and Viettel's legal virtual assistant. These are early examples of AI being applied in Vietnamese people's lives, especially in the public sector.
