21jun/README.md

Wonjun Lee

Ph.D./M.S. student at the POSTECH NLP Lab, advised by Prof. Gary Geunbae Lee

Research Interests: Automatic Speech Recognition (ASR), Large Language Models (LLMs), MLOps

Google Scholar


Education

2021.02 – Present
POSTECH (Pohang, Korea)
Ph.D./M.S. in Computer Science and Engineering

2017 – 2021
Sejong University (Seoul, Korea)
B.S. in Software Engineering & Data Science


Selected Works

Kanary 1B is a pre-release ~0.95B-parameter Korean ASR model built on NVIDIA’s canary-1b-v2, extending it with prompt-based control over punctuation, inverse text normalization, and foreign-word rendering tailored for Korean.


Awards and Honors

Korean AI Contest (Speech Recognition)

  • Grand Prize (2nd), Minister’s Award, Ministry of Science and ICT, 2022

  • 5th Place, Director’s Award, National Information Society Agency (NIA), 2023

  • 4th Place, Director’s Award, National Information Society Agency (NIA), 2021

ICT Challenge 2025

  • Global Innovation Award, Minister of Science and ICT, 2025
    Project: Multi-modal Multi-session Counseling System

Conference Recognition

  • Best Paper Nominee, SIGDial 2024
    “Enhancing Dialogue Speech Recognition with Robust Contextual Awareness via Noise Representation Learning”

Publications

International Conferences

Speak & Spell: LLM-Driven Controllable Phonetic Error Augmentation for Robust Dialogue State Tracking
Jihyun Lee, Solee Im, Wonjun Lee, Gary Geunbae Lee
IJCNLP-AACL 2025, Mumbai, India

DeRAGEC: Denoising Named Entity Candidates with Synthetic Rationale for ASR Error Correction
Wonjun Lee*, Solee Im*, Jinmyeong An, Yunsu Kim, Jungseul Ok, Gary Geunbae Lee
ACL Findings 2025, Vienna, Austria

DyPCL: Dynamic Phoneme-level Contrastive Learning for Dysarthric Speech Recognition
Wonjun Lee*, Solee Im*, Heejin Do, Yunsu Kim, Jungseul Ok, Gary Geunbae Lee
NAACL 2025, Albuquerque, USA

Enhancing Dialogue Speech Recognition with Robust Contextual Awareness via Noise Representation Learning
Wonjun Lee*, San Kim*, Gary Geunbae Lee
SIGDial 2024, Kyoto, Japan

An Investigation Into Explainable Audio Hate Speech Detection
Wonjun Lee*, Jinmyeong An*, Yejin Jeon, Jungseul Ok, Yunsu Kim, Gary Geunbae Lee
SIGDial 2024, Kyoto, Japan

Acoustic Feature Mixup for Balanced Multi-aspect Pronunciation Assessment
Heejin Do, Wonjun Lee, Gary Geunbae Lee
INTERSPEECH 2024, Kos Island, Greece

Optimizing Two-Pass Cross-Lingual Transfer Learning: Phoneme Recognition and Phoneme to Grapheme Translation
Wonjun Lee, Yunsu Kim, Gary Geunbae Lee
ASRU 2023, Taipei, Taiwan

Exploring the Viability of Synthetic Audio Data for Audio-Based Dialogue State Tracking
Jihyun Lee*, Yejin Jeon*, Wonjun Lee, Yunsu Kim, Gary Geunbae Lee
ASRU 2023, Taipei, Taiwan


Domestic Conferences

How to Use Speech Related Digital Biomarkers in Patients With Depressive Disorder
Wonjun Lee*, Seungyeon Seo*, Hyun Jeong Kim
Digital Health Research (Korean Society of Digital Health)

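A Neural Network-Based Jamo Merging Method for Post-Correction of Korean Jamo-Level Speech Recognition Results (한국어 자모단위 음성인식 결과 후보정을 위한 신경망 기반 자모 병합 방법론)
Solee Im*, Wonjun Lee*, Gary Geunbae Lee, Yunsu Kim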
HCLT 2023, Jeju, Korea

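Wav2Vec2.0 with Language-Specific Output Layers for Multilingual Speech Recognition (다국어 음성인식을 위한 언어별 출력 계층 구조 Wav2Vec2.0)
Wonjun Lee, Gary Geunbae Lee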
HCLT 2021


Patents

Method and Apparatus for Multilingual Speech Recognition based on Artificial Intelligence Models
U.S. Patent

