Personal Homepage

Hello! I am a Ph.D. student in AI at Yonsei University. I am fortunate to be advised by Prof. Jinyoung Yeo.

My research interest is in the field of NLP. I currently focus on Conversational AI, where knowledge, reasoning, and interaction come into play. In particular, I am interested in
(1) Personalized agents that can perform, develop, and be maintained in a lifelong manner,
(2) Code/Tool-learning,
(3) Efficient, scalable training/evaluation mechanisms under evolving environments.

Recent News

· [2025.02] I will be joining Microsoft Research Asia (MSRA) as an intern this summer.

· [2024.06] I am returning to Amazon AGI Foundational Models Group (the group name has changed) as an Applied Scientist intern.

· [2023.06] I am joining Amazon Alexa AI as an Applied Scientist intern. My primary mentors are Dr. Hann Wang, Dr. Nikolaos Malandrakis, and Dr. Nagesh Panyam, and my manager is Dr. Angeliki Metallinou. Can’t wait!

Education/Experience

Yonsei University, Seoul, Korea (2022.03 - Current)

Ph.D. student in Artificial Intelligence
Adviser: Prof. Jinyoung Yeo

Yonsei University, Seoul, Korea (2020.03 - 2022.02)

M.S. student in Artificial Intelligence
Adviser: Prof. Jinyoung Yeo

University of California, Santa Cruz, The United States of America (2018.01 - 2018.07)

Exchange student in Computer Science and Engineering

R.O.K. Army, Seoul, Korea (2015.10 - 2017.07)

2 years of compulsory military service. Served as a sergeant.

Yonsei University, Seoul, Korea (2014.03 - 2020.02)

B.S. student in Computer Science

Publications (* equal contribution)

Work-In-Progress

Enhancing the Robustness of Tool-Learners to API Evolution

Under-review/Preprints

ToolHaystack: Stress-Testing Tool-Augmented Language Models in Realistic Long-Term Interactions
Beong-woo Kwak, Minju Kim, Dongha Lim, Hyungjoo Chae, Dongjin Kang, Sunghwan Kim, Dongil Yang, Jinyoung Yeo

Embodied Agents Meet Personalization: Exploring Memory Utilization for Personalized Assistance
{Taeyoon Kwon, Dongwook Choi}*, Sunghwan Kim, Hyojun Kim, Seungjun Moon, Beong-woo Kwak, Kuan-Hao Huang, Jinyoung Yeo

Web-Shepherd: Advancing PRMs for Reinforcing Web Agents
Hyungjoo Chae, Sunghwan Kim, Junhee Cho, Seungone Kim, Seungjun Moon, Gyeom Hwangbo, Dongha Lim, Minjin Kim, Yeonjun Hwang, Minju Gwak, Dongwook Choi, Minseok Kang, Gwanhoon Im, ByeongUng Cho, Hyojun Kim, Jun Hee Han, Taeyoon Kwon, Minju Kim, Beong-woo Kwak, Dongjin Kang, Jinyoung Yeo

Conference (AAAI: 2, ACL: 4, EMNLP: 2, NAACL: 2)

LLM Meets Scene Graph: Can Large Language Models Understand and Generate Scene Graphs? A Benchmark and Empirical Study
Dongil Yang, Minjin Kim, Sunghwan Kim, Beong-woo Kwak, Minjun Park, Jinseok Hong, Woontack Woo, Jinyoung Yeo
ACL 2025

Can You Share Your Story? Modeling Clients’ Metacognition and Openness for LLM Therapist Evaluation
Minju Kim, Dongje Yoo, Yeonjun Hwang, Minseok Kang, Namyoung Kim, Minju Gwak, Beong-woo Kwak, Hyungjoo Chae, Harim Kim, Yunjoong Lee, Min Hee Kim, Dayi Jung, Kyong-Mee Chung, Jinyoung Yeo
ACL 2025 (Findings)

One Missing Piece for Open-Source Reasoning Models: A Dataset to Mitigate Cold-Starting Short CoT LLMs in RL
Hyungjoo Chae, Dongjin Kang, Jihyuk Kim, Beong-woo Kwak, Sunghyun Park, Haeju Park, Jinyoung Yeo, Moontae Lee, Kyungjae Lee
ACL 2025 (Industry Track)

Do LLMs Have Distinct and Consistent Personality? TRAIT: Personality Testset Designed for LLMs with Psychometrics
Seungbeen Lee, Seungwon Lim, Seungju Han, Giyeong Oh, Hyungjoo Chae, Jiwan Chung, Minju Kim, Beong-woo Kwak, Yeonsoo Lee, Dongha Lee, Jinyoung Yeo, Youngjae Yu
NAACL 2025 (Findings)

Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models
Hyungjoo Chae, Yeonghyeon Kim, Seungone Kim, Kai Tzu-iunn Ong, Beong-woo Kwak, Moohyeon Kim, Seonghwan Kim, Taeyoon Kwon, Jiwan Chung, Youngjae Yu, Jinyoung Yeo
EMNLP 2024

Coffee-Gym: An Environment for Evaluating and Improving Natural Language Feedback on Erroneous Code
Hyungjoo Chae, Taeyoon Kwon, Seungjun Moon, Yongho Song, Dongjin Kang, Kai Tzu-iunn Ong, Beong-woo Kwak, Seonghyeon Bae, Seung-won Hwang, Jinyoung Yeo
EMNLP 2024

Pearl: A Review-driven Persona-Knowledge Grounded Conversational Recommendation Dataset
Minjin Kim, Minju Kim, Hana Kim, Beong-woo Kwak, Soyeon Chun, Hyunseo Kim, SeongKu Kang, Youngjae Yu, Jinyoung Yeo, Dongha Lee
ACL 2024 (Findings)

Modularized Transfer Learning with Multiple Knowledge Graphs for Zero-shot Commonsense Reasoning
Yu Jin Kim, Beong-woo Kwak, Youngwook Kim, Reinald Kim Amplayo, Seung-won Hwang, Jinyoung Yeo
NAACL 2022 Oral presentation

TrustAL: Trustworthy Active Learning using Knowledge Distillation
Beong-woo Kwak, Youngwook Kim, Yu Jin Kim, Seung-won Hwang, Jinyoung Yeo
AAAI 2022 Poster presentation (acceptance rate: 15%)

Dual Task Framework for Improving Persona-grounded Dialogue Dataset
{Minju Kim, Beong-woo Kwak}*, Youngwook Kim, Hong-in Lee, Seung-won Hwang, Jinyoung Yeo
AAAI 2022 Oral presentation (acceptance rate: 4%)
