Beong-woo Kwak

Hello! I am a Ph.D. student in AI at Yonsei University, advised by Prof. Jinyoung Yeo.

My research interest is in NLP, focusing on Conversational AI where knowledge, reasoning, and interaction come into play. In particular, I’m interested in:

  1. Personalized agents that can perform, develop, and be maintained in a lifelong manner,
  2. Code / Tool-learning,
  3. Efficient, scalable training/evaluation under evolving environments.

News

Sep 18, 2025 :tada: Our “Web-Shepherd” on Process Reward Models for Web Agents has been accepted to NeurIPS 2025 as a Spotlight.
Aug 21, 2025 :tada: Our “ToolHaystack” on the Long-term Interaction of Tool-augmented Language Models has been accepted to EMNLP 2025. See you in Suzhou!
Jun 01, 2025 :sparkles: I will be joining Microsoft Research Asia (MSRA) as an intern this summer, under the supervision of Dr. Liang Wang, Dr. Nan Yang, and Dr. Xingxing Zhang.
May 16, 2025 :tada: Three papers about Long-CoT, Scene Graph Generation, and LLM Simulation of Psychological Patients have been accepted to ACL 2025.
Jan 23, 2025 :tada: Our paper TRAIT on LLM Personality Evaluation has been accepted to NAACL 2025.
Sep 20, 2024 :tada: Two papers about Algorithmic Reasoning in LLMs and NL Feedback on Coding Agents have been accepted to EMNLP 2024.
Jun 23, 2024 :tada: Our “Pearl” on Review-driven, Persona-grounded Conversational Recommendation has been accepted to ACL 2024.
May 16, 2024 :sparkles: I am returning to Amazon AGI Foundational Models Group (the group name has changed) as an Applied Scientist intern.
Jun 27, 2023 :sparkles: I am joining Amazon Alexa AI as an Applied Scientist intern. My primary mentors are Dr. Hann Wang, Dr. Nikolaos Malandrakis, and Dr. Nagesh Panyam, and my manager is Dr. Angeliki Metallinou. Can’t wait!

Experience & Educational Timeline

  • Jun 2025 - Dec 2025
    Research Scientist Intern Microsoft Research Asia (MSRA)
    @Beijing, China
    Worked on self-evolving tool agents and agentic RL
    Mentors: Dr. Liang Wang, Dr. Nan Yang, Dr. Xingxing Zhang
    Government-funded collaborative research program (IITP)
  • Jun 2024 - Sep 2024
    Applied Scientist Intern Amazon AGI
    @Sunnyvale, CA, USA
    Return Internship
    Worked on Code LLMs
  • Jun 2023 - Sep 2023
    Applied Scientist Intern Amazon Alexa AI
    @Sunnyvale, CA, USA
    Worked on Tool-augmented LLMs
    Mentors: Dr. Hann Wang, Dr. Nikolaos Malandrakis,
    Dr. Nagesh Panyam
  • Mar 2022 - Current
    Ph.D. Student, Artificial Intelligence Yonsei University
    @Seoul, Republic of Korea
    Advised by Prof. Jinyoung Yeo
  • Mar 2020 - Feb 2022
    M.S., Artificial Intelligence Yonsei University
    @Seoul, Republic of Korea
    Advised by Prof. Jinyoung Yeo
  • Jan 2018 - Jul 2018
    Exchange Student University of California
    @Santa Cruz, CA, USA
    International exchange program
  • Oct 2015 - Jul 2017
    Sergeant Republic of Korea Army
    @Seoul, Republic of Korea
    Compulsory military service during B.S.
    in the Republic of Korea Army
  • Mar 2014 - Feb 2022
    B.S., Computer Science Yonsei University
    @Seoul, Republic of Korea

Activities

  • Reviewer
    • AAAI, EMNLP, ACL, NAACL
  • Teaching Experiences
    • NVIDIA: Teaching Assistant on NLP
    • Teaching Assistant at Yonsei Univ: Text and Language Understanding, Big Data, Natural Language Processing
  • Company Experiences
    • Microsoft Research Asia (MSRA), Beijing, China - Research Scientist Intern (Jun 2025 - Dec 2025)
    • Amazon AGI Foundational Models Group, Sunnyvale, CA, USA - Applied Scientist Intern (Jun 2024 - Sep 2024)
    • Amazon Alexa AI, Sunnyvale, CA, USA - Applied Scientist Intern (Jun 2023 - Sep 2023)
  • Honors and Awards
    • Encouragement Prize, Korea Capstone Design Fair (as a representative of Yonsei University) - Ministry of Trade, Industry and Energy, Korea
    • Top Prize, Software Capstone Design - Computer Science Department, Yonsei Univ
    • Top Presentation Prize, Yonsei Creative Exhibition Presentation - College of Engineering, Yonsei Univ
    • Student Teaching Assistant Scholarship (Head Professor TA) - Department of Artificial Intelligence, Yonsei Univ

Publications

  1. Embodied Agents Meet Personalization: Exploring Memory Utilization for Personalized Assistance
    Taeyoon Kwon, Dongwook Choi, Sunghwan Kim, Hyojun Kim, Seungjun Moon, Beong-woo Kwak, Kuan-Hao Huang, and Jinyoung Yeo
    arXiv, 2025
  2. Web-Shepherd: Advancing PRMs for Reinforcing Web Agents
    Hyungjoo Chae, Sunghwan Kim, Junhee Cho, Seungone Kim, Seungjun Moon, Gyeom Hwangbo, Dongha Lim, Minjin Kim, Yeonjun Hwang, Minju Gwak, Dongwook Choi, Minseok Kang, Gwanhoon Im, ByeongUng Cho, Hyojun Kim, Jun Hee Han, Taeyoon Kwon, Minju Kim, Beong-woo Kwak, Dongjin Kang, and Jinyoung Yeo
    In NeurIPS Spotlight, 2025
  3. ToolHaystack: Stress-Testing Tool-Augmented Language Models in Realistic Long-Term Interactions
    Beong-woo Kwak, Minju Kim, Dongha Lim, Hyungjoo Chae, Dongjin Kang, Sunghwan Kim, Dongil Yang, and Jinyoung Yeo
    In Findings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
  4. One Missing Piece for Open-Source Reasoning Models: A Dataset to Mitigate Cold-Starting Short CoT LLMs in RL
    Hyungjoo Chae, Dongjin Kang, Jihyuk Kim, Beong-woo Kwak, Sunghyun Park, Haeju Park, Jinyoung Yeo, Moontae Lee, and Kyungjae Lee
    In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track), Jul 2025
  5. LLM Meets Scene Graph: Can Large Language Models Understand and Generate Scene Graphs? A Benchmark and Empirical Study
    Dongil Yang, Minjin Kim, Sunghwan Kim, Beong-woo Kwak, Minjun Park, Jinseok Hong, Woontack Woo, and Jinyoung Yeo
    In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Jul 2025
  6. Can You Share Your Story? Modeling Clients’ Metacognition and Openness for LLM Therapist Evaluation
    Minju Kim, Dongje Yoo, Yeonjun Hwang, Minseok Kang, Namyoung Kim, Minju Gwak, Beong-woo Kwak, Hyungjoo Chae, Harim Kim, Yunjoong Lee, Min Hee Kim, Dayi Jung, Kyong-Mee Chung, and Jinyoung Yeo
    In Findings of the Association for Computational Linguistics: ACL 2025, Jul 2025
  7. Do LLMs Have Distinct and Consistent Personality? TRAIT: Personality Testset designed for LLMs with Psychometrics
    Seungbeen Lee, Seungwon Lim, Seungju Han, Giyeong Oh, Hyungjoo Chae, Jiwan Chung, Minju Kim, Beong-woo Kwak, Yeonsoo Lee, Dongha Lee, Jinyoung Yeo, and Youngjae Yu
    In Findings of the Association for Computational Linguistics: NAACL 2025, Jul 2025
  8. Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models
    Hyungjoo Chae, Yeonghyeon Kim, Seungone Kim, Kai Tzu-iunn Ong, Beong-woo Kwak, Moohyeon Kim, Seonghwan Kim, Taeyoon Kwon, Jiwan Chung, Youngjae Yu, and Jinyoung Yeo
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Jul 2024
  9. Coffee-Gym: An Environment for Evaluating and Improving Natural Language Feedback on Erroneous Code
    Hyungjoo Chae, Taeyoon Kwon, Seungjun Moon, Yongho Song, Dongjin Kang, Kai Ong, Beong-woo Kwak, Seonghyeon Bae, Seung-won Hwang, and Jinyoung Yeo
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Jul 2024
  10. Pearl: A Review-driven Persona-Knowledge Grounded Conversational Recommendation Dataset
    Minjin Kim, Minju Kim, Hana Kim, Beong-woo Kwak, SeongKu Kang, Youngjae Yu, Jinyoung Yeo, and Dongha Lee
    In Findings of the Association for Computational Linguistics: ACL 2024, Jul 2024
  11. Modularized Transfer Learning with Multiple Knowledge Graphs for Zero-shot Commonsense Reasoning
    Yu Jin Kim, Beong-woo Kwak, Youngwook Kim, Reinald Kim Amplayo, Seung-won Hwang, and Jinyoung Yeo
    In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Jul 2022
  12. Dual Task Framework for Improving Persona-grounded Dialogue Dataset
    Minju Kim, Beong-woo Kwak, Youngwook Kim, Hong-in Lee, Seung-won Hwang, and Jinyoung Yeo
    In Proceedings of the AAAI Conference on Artificial Intelligence, Jul 2022
  13. TrustAL: Trustworthy Active Learning using Knowledge Distillation
    Beong-woo Kwak, Youngwook Kim, Yu Jin Kim, Seung-won Hwang, and Jinyoung Yeo
    In Proceedings of the AAAI Conference on Artificial Intelligence, Jul 2022