Nguyen (William) Nguyen

5+ years turning AI research into products adopted by global industrial leaders

Senior Applied Scientist at Aitomatic

Email CV LinkedIn Bio Google Scholar Twitter Github

5+ Years in AI

9 Publications at Top Venues

3 US Patent Applications

800+ GitHub Stars

About

I build domain-specific AI systems that solve real problems in industries where general-purpose models fall short. At Aitomatic, I led development of SemiKong and Llamarine, the first open-source LLMs for the semiconductor and maritime industries, now adopted by global industrial leaders including Tokyo Electron and Furuno. My multi-agent framework ProSEA achieved 93.2% accuracy on FinanceBench, outperforming established frameworks like LlamaIndex RAG and LangChain ReAct.

Before Aitomatic, I earned my master's degree from the University of Rochester (advisor: Prof. Chenliang Xu), where my work on object state understanding performed on par with the commercial SOTA model while using a smaller, faster model. I spent nearly 3 years at VinAI Research working under Professor Nguyen Minh Hoai, where I surpassed the state of the art on scene text recognition by 3-5% and reduced annotation costs by 50%. I completed my bachelor's degree with Distinction at Vietnam National University, where I worked with Professor Hoang Van Xiem.

My research spans NLP, computer vision, and machine learning, with publications at CVPR, NAACL, ACM MM, and AAAI. I currently focus on intelligent AI agents that learn from failure to self-improve, and on specialized LLMs for industries like semiconductors and maritime. SemiKong was featured by VentureBeat, Meta AI Blog, and Tom's Hardware, and shared by Yann LeCun.

I'm always happy to discuss ideas and collaborate. Feel free to drop me an email at nguyennm1024@gmail.com.

News

03/2025 One paper accepted at JSAI 2025.
12/2024 One paper accepted at OSAI4MU-25 Workshop, AAAI 2025.
07/2024 Our SemiKong work was shared by Yann LeCun and featured in VentureBeat, MSN, Meta AI Blog, Tom's Hardware, MarkTechPost, Digialps, Gadgets360, and many other media outlets.
07/2024 One paper accepted at ACM MM 2024.
07/2024 I joined Aitomatic as a Senior Applied Scientist.
03/2024 One paper accepted at NAACL 2024.
03/2024 I received research internship offers from Bosch AI Research and Amazon, USA.
07/2023 One paper accepted at AV4D Workshop, ICCV 2023.
08/2022 I started my journey at the University of Rochester in Fall 2022.
01/2022 I began working as an AI Research Engineer with the applied team at VinAI Research.
03/2021 One paper accepted at CVPR 2021.
07/2020 I graduated with Distinction from Vietnam National University.
12/2019 I started my journey with VinAI Research as an AI Research Resident.

Featured Work

Domain-specific AI systems that bring expert-level intelligence to industries where general-purpose models fall short.

ProSEA: Problem Solving via Exploration Agents

preprint, 2025

paper

93.2% accuracy on FinanceBench, outperforming LlamaIndex RAG (56.7%), LangChain ReAct (81.6%), and OpenAI Assistants (42.7%). Deployed in production for customers' complex problem-solving tasks. Learns from failures and replans adaptively.

Llamarine: Maritime Industry-specific LLM

JSAI, 2025

paper model

The first open-source maritime LLM. Adopted by Furuno for a navigation assistant achieving 100% accuracy on regulation-compliant actions.

SemiKong: Semiconductor Industry-Specific LLM

OSAI4MU-25 Workshop, AAAI, 2025

paper github

The first open-source semiconductor LLM. Adopted by Tokyo Electron for root cause analysis, reducing troubleshooting time by 30%. Featured by VentureBeat and shared by Yann LeCun.

Publications & Research

ProSEA: Problem Solving via Exploration Agents

William Nguyen, Vinh Luong, Christopher Nguyen

preprint, 2025

paper

Complex real-world problems require more than a single AI call; they demand iterative reasoning. ProSEA is a modular multi-agent framework where a Manager Agent orchestrates domain-specialized Expert Agents, enabling adaptive replanning based on structured feedback from failures and newly discovered constraints. Achieves state-of-the-art performance on the FinanceBench benchmark.

Llamarine: Open-source Maritime Industry-specific Large Language Model

William Nguyen, An Phan, Konobu Kimura, Hitoshi Maeno, Mika Tanaka, Quynh Le, William Poucher, Christopher Nguyen

Annual Conference of the Japanese Society for Artificial Intelligence, 2025

paper model

The first open-source domain-specific LLM for the maritime industry. Outperforms several commercial models, including GPT-4o-mini, GPT-4o, and Claude-3.5-Sonnet, as well as open-source models such as Llama3.1 8B, Llama3.1 70B, and Llama3.3 70B.

SemiKong: Curating, Training, and Evaluating A Semiconductor Industry-Specific Large Language Model

Christopher Nguyen, William Nguyen, Atsushi Suzuki, Daisuke Oku, Hong An Phan, Sang Dinh, Zooey Nguyen, Anh Hai Ha, Shruti Raghavan, Huy Vo, Thang Nguyen, Lan Nguyen, Yoshikuni Hirayama

OSAI4MU-25 Workshop, AAAI, 2025

paper github

The first open-source semiconductor-focused LLM. Outperforms several commercial models, including Claude-3.5-Sonnet, Haiku, Opus, and Command-R, as well as open-source models such as Llama3 70B. Featured by VentureBeat, Meta AI Blog, and Tom's Hardware, and shared by Yann LeCun.

DANA: Domain-Aware Neurosymbolic Agents for Consistency and Accuracy

Vinh Luong, Sang Dinh, Shruti Raghavan, William Nguyen, Zooey Nguyen, Quynh Le, Hung Vo, Kentaro Maegaito, Loc Nguyen, Thao Nguyen, Anh Hai Ha, Christopher Nguyen

preprint, 2024

project page github paper

A unified agentic framework that incorporates expert knowledge using neurosymbolic methods to create domain-specific agents with high consistency and accuracy, significantly outperforming ChatGPT assistant and ReAct agent baselines.

EAGLE: Egocentric AGgregated Language-video Engine

Jing Bi, Yunlong Tang, Luchuan Song, Ali Vosoughi, Nguyen Nguyen, Chenliang Xu

ACM MM, 2024

paper

A unified multimodal LLM designed to comprehensively solve tasks related to egocentric video understanding, enabling AI to interpret first-person video from the viewer's perspective.

OSCaR: Object State Captioning and State Change Representation

Nguyen Nguyen, Jing Bi, Ali Vosoughi, Yapeng Tian, Pooyan Fazli, Chenliang Xu

NAACL, 2024

github paper

Introduces a new task for understanding object states and how they change over time. The trained multimodal LLM significantly surpasses previous state-of-the-art models and performs on par with the commercial SOTA model in both GPT-4 and human evaluations.

Efficiently Leveraging Linguistics Knowledge for Scene Text Spotting

Nguyen Nguyen, Yapeng Tian, Chenliang Xu

arXiv

paper

A simple but effective approach that incorporates language knowledge from large text corpora to improve both text detection and recognition in natural scenes.

MISAR: A Multimodal Instructional System with Augmented Reality

Jing Bi*, Nguyen Nguyen*, Ali Vosoughi*, Chenliang Xu (* equal contribution)

AV4D Workshop, ICCV, 2023

github paper

A comprehensive system that guides humans to work more efficiently and accurately by leveraging LLMs to interpret and process information from visual, auditory, and contextual dimensions.

Dictionary-guided Scene Text Recognition

Nguyen Nguyen, Thu Nguyen, Vinh Tran, Minh Triet Tran, Thanh Duc Ngo, Thien Huu Nguyen, Minh Hoai Nguyen

CVPR, 2021

project page github paper

A novel approach to incorporate dictionary knowledge in both training and testing phases for scene text recognition. Also introduces VinText, the largest scene text dataset for Vietnamese.

Master's Thesis

State-aware Object Understanding

Nguyen Nguyen

University of Rochester, Department of Computer Science, 2024

thesis

Can AI truly understand what happens when you crack an egg or fold a shirt? This thesis tackles a fundamental challenge in vision-language intelligence: teaching models to perceive, describe, and reason about how objects change state. The resulting system performs on par with the commercial SOTA model using a smaller, faster architecture, pointing toward AI that understands not just what things are, but what happens to them.

Patents

Delivering Domain-Expert Agents and Models Using Synthetic Knowledge

Christopher Nguyen, Manh-Nguyen Nguyen, Hong An Phan, Zooey Nhu-Quynh Nguyen, The-Vinh Luong, Elise Nhu-Y Nguyen, Thomas Rasmussen, Anh Hai Ha, Phi-Hung Vo, Xuan-Sang Dinh, Huy-Thuan Bui, Anh-Quoc Dang, Timothy Michael Gerard Rozario

US Patent App. 63/726,322, 2024

Delivering Domain-Expert Agents for Improving Problem-Solving

Christopher Nguyen, Manh-Nguyen Nguyen, Hong An Phan, Zooey Nhu-Quynh Nguyen, The-Vinh Luong, Elise Nhu-Y Nguyen, Thomas Rasmussen, Anh Hai Ha, Phi-Hung Vo, Xuan-Sang Dinh, Huy-Thuan Bui, Anh-Quoc Dang

US Patent App. 63/721,419, 2024

Domain-Aware Neurosymbolic Agents For Improving Problem-Solving Accuracy And Consistency

Christopher Nguyen, The Vinh Luong, Xuan Sang Dinh, Zooey Nhu-Quynh Nguyen, Shruti Raghavan, Manh-Nguyen Nguyen, Quynh Thi-Tham Le, Phi Hung Vo, Tan Loc Nguyen, Anh Hai Ha, Phuong Thao Nguyen

US Patent App. 63/696,337, 2024

Experience

Aitomatic, July 2024 - present

Senior Applied Scientist
Domain-specific Foundation Models and Agents

University of Rochester, Aug. 2022 - May 2024

Research Assistant
Department of Computer Science

VinAI Research, Jan. 2022 - June 2022

AI Research Engineer
R&D Group

VinAI Research, Dec. 2019 - Dec. 2021

AI Research Resident
Computer Vision Research Group

Teko Vietnam, Apr. 2019 - Nov. 2019

AI Engineer Intern
Data Science Group

Vietnam National University, 2016 - 2020

B.S. Student
Information Technology Department

Academic Service

Reviewer

WACV 2022, CVPR 2023, CVPR 2024, ACM MM 2024, AAAI 2025, NAACL 2025, CVPR 2025, ACL 2025

Invited Speaker

VinAI 2021, OSAI4MU AAAI'25

Organizer & Competition Jury Member (2021)

Vietnamese Scene Text Challenge 2021 / Ho Chi Minh City AI Challenge 2021: Vietnamese Scene Text Recognition

Teaching Assistant (2018 - 2019)

Computer vision and machine learning courses for Samsung Display Vietnam's staff, University of Engineering and Technology - Vietnam National University