Yonsei MLSys Student Group

Welcome to the Yonsei MLSys Student Group Webpage!

We are dedicated to exploring diverse aspects of machine learning and computational systems while fostering a vibrant social community for graduate and undergraduate students interested in computer science at Yonsei University.

Feel free to reach out with any questions.

We are currently sponsored by ASO Lab, CIP Lab, CoreLab and MICV Lab.


Uncovering the Intriguing Properties of Vision Transformers: Insights and Applications

Chanyoung Kim - MICV Lab, AI, Yonsei University

Vision Transformers (ViTs) have become prominent models in computer vision, offering competitive performance across a variety of tasks. Beyond their effectiveness, ViTs exhibit a range of interesting and non-obvious characteristics. For example, different layers capture varying types of information, each contributing uniquely to the overall representation. Furthermore, attention patterns show that some image patches focus more on local features, while others prioritize global context—even within a single input. These observations point to an organized yet adaptable way in which ViTs handle visual information. In this seminar, we examine these hidden properties of ViTs through empirical findings and theoretical perspectives. We also explore how a deeper understanding of such characteristics can inform the design of downstream applications. By investigating the internal workings of ViTs, we aim to provide insights that can lead to more effective and targeted use of transformer-based models in visual learning tasks.
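
As a taste of the kind of analysis the talk covers, here is a minimal sketch of the mean-attention-distance measurement often used to distinguish local from global heads; the random weights stand in for a real model's attention maps, and all names and shapes are illustrative assumptions rather than code from the talk.

```python
# Illustrative sketch: mean attention distance per head for one ViT layer.
# `attn` is assumed to be a softmaxed attention map of shape
# (heads, tokens, tokens) over a 14x14 patch grid (CLS token omitted).
import torch

def mean_attention_distance(attn: torch.Tensor, grid: int = 14) -> torch.Tensor:
    ys, xs = torch.meshgrid(torch.arange(grid), torch.arange(grid), indexing="ij")
    coords = torch.stack([ys.flatten(), xs.flatten()], dim=-1).float()  # (tokens, 2)
    # Pairwise Euclidean distances between patch centers, in patch units.
    dist = torch.cdist(coords, coords)                                  # (tokens, tokens)
    # Expected distance each query attends over, averaged across queries.
    return (attn * dist).sum(-1).mean(-1)                               # (heads,)

heads, tokens = 12, 14 * 14
attn = torch.rand(heads, tokens, tokens).softmax(dim=-1)  # stand-in for real weights
print(mean_attention_distance(attn))  # low values = local heads, high = global heads
```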

CV

Catering Courtesy of ASO Lab

Kernel Optimization

Kiung Jung - ASO Lab, CS, Yonsei University

Recent deep learning models execute tens of thousands of GPU kernels. While early optimization efforts focused primarily on the computation itself, advances in both hardware and software have made non-computational overheads increasingly prominent. In this talk, we discuss the impact of such overheads on real-world models and explore optimization strategies to mitigate them, with a particular focus on kernel launch overhead.
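
For context, the sketch below shows one common mitigation in this space: capturing a sequence of small kernels with PyTorch's CUDA Graphs API so they can be replayed with a single launch. It is an illustrative example, not code from the talk, and requires a CUDA GPU.

```python
# A minimal sketch of amortizing kernel launch overhead with CUDA Graphs
# in PyTorch: many tiny kernels are captured once and replayed together.
import torch

device = "cuda"
x = torch.randn(1024, 1024, device=device)

def many_small_kernels(t):
    for _ in range(50):          # 50 separate pointwise kernels when run eagerly
        t = t * 1.0001 + 0.0001
    return t

# Warm up on a side stream before capture, as the CUDA Graphs docs recommend.
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
    many_small_kernels(x)
torch.cuda.current_stream().wait_stream(s)

g = torch.cuda.CUDAGraph()
with torch.cuda.graph(g):
    y = many_small_kernels(x)    # capture: kernels are recorded, not just run

g.replay()                       # a single launch replays all 50 kernels
```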

CV

Catering Courtesy of ASO Lab

Character Talking Head Generation

Sejong Yang - CIP Lab, CS, Yonsei University

In this presentation, Sejong Yang shares his journey toward character talking head generation. The first part introduces his life experiences and the motivations that led him to pursue this topic as his PhD goal. He also presents four related research projects, offering insights into recent trends and studies in face modeling and talking head generation. Although still a work in progress, this slide deck—also used in his recent job talks—illustrates his ongoing efforts to extend talking head video generation beyond humans to encompass a wide range of characters throughout his PhD journey.

PPT CV

Catering Courtesy of MICV Lab

Introduction to Jason Lee's Career and Research

Jason Lee - CIP Lab, AI, Yonsei University

This presentation introduces the career journey and research interests of Jaeseong Lee (Jason Lee). It begins with an overview of the AI startup ecosystem, including how AI startups generate revenue and the roles of research scientists and engineers. Jason then shares insights from his experiences at three different AI startups, offering a more detailed perspective on their operations and culture. Finally, he briefly discusses his motivation for starting a PhD and outlines the focus of his upcoming thesis research.

PPT CV

Catering Courtesy of MICV Lab

Enhancing Coarse-Grained Pipeline Architecture in High-Level Synthesis via Program Slicing-based Decoupled Software Pipelining

Kunmo Jeong - CoreLab, EE, Yonsei University

High-Level Synthesis (HLS) has emerged as a key methodology for bridging the gap between software design and hardware implementation. Among various optimization strategies, Coarse-Grained Pipeline Architecture (CGPA) has shown promise in improving throughput by partitioning programs into pipeline stages. However, achieving effective parallelism remains challenging due to control and data dependencies. This work investigates the integration of Program Slicing-based Decoupled Software Pipelining (PS-DSWP) into CGPA to improve stage-level parallelism and performance. By applying program slicing techniques prior to software pipelining, the proposed approach enhances the decoupling of stages and reduces inter-stage communication overhead. Experimental results on representative benchmarks demonstrate notable improvements in resource utilization and latency reduction compared to baseline CGPA implementations. This study highlights the potential of PS-DSWP as a complementary technique for advancing pipeline-based HLS designs.
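
As a rough software analogue of the decoupling this work builds on, the sketch below splits a loop into two pipeline stages connected by a bounded queue so the stages overlap in time. In PS-DSWP the stage boundary comes from program slicing; here it is fixed by hand purely for illustration, and nothing below is taken from the talk.

```python
# Toy decoupled software pipelining: two loop stages run concurrently and
# communicate through a bounded queue instead of executing back-to-back.
import threading
import queue

q: "queue.Queue[int | None]" = queue.Queue(maxsize=8)  # bounded inter-stage buffer

def stage1(data):                 # e.g., the traversal/control slice
    for item in data:
        q.put(item * 2)           # produce into the pipeline
    q.put(None)                   # end-of-stream token

def stage2(results):              # e.g., the computation slice
    while (item := q.get()) is not None:
        results.append(item + 1)  # consume and finish each iteration's work

results: list[int] = []
t1 = threading.Thread(target=stage1, args=(range(10),))
t2 = threading.Thread(target=stage2, args=(results,))
t1.start(); t2.start(); t1.join(); t2.join()
print(results)  # [1, 3, 5, ..., 19]
```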

PPT CV

Catering Courtesy of CoreLab

Toward Large Language Reasoning Models

Junyoung Jang - AIPR, School of Computing, KAIST

This talk covers post-training techniques aimed at enhancing the reasoning capabilities of Large Language Models (LLMs). Key topics include foundational reinforcement learning methodologies as well as advanced approaches such as PPO, DPO, GRPO, and DAPO. We analyze how these methods affect model reasoning performance and response consistency. Furthermore, we discuss the effectiveness and limitations of each technique through representative evaluation benchmarks and experimental results, offering insights into future research directions.
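
As a concrete reference point for one of the methods above, here is a minimal sketch of the DPO objective; the tensors are random stand-ins for real per-response log-probabilities, and the code is illustrative rather than material from the talk.

```python
# A minimal sketch of the DPO loss. Inputs are summed log-probabilities of
# chosen/rejected responses under the policy and a frozen reference model.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    # Implicit reward of a response: beta * (policy logp - reference logp).
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # Maximize the chosen-vs-rejected margin via a logistic loss.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()

# Toy usage with random log-probs standing in for real model outputs.
lp = [torch.randn(4) for _ in range(4)]
print(dpo_loss(*lp))
```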

PPT CV

Catering Courtesy of CoreLab

Invited Talk

Architecture-Specific Compilation for Quantum Machine Learning

Enhyeok Jang - eSCaL, EE, Yonsei University

The racetrack architecture based on trapped-ion qubits recently proposed by Quantinuum, and the zoned architecture using Rydberg atoms from QuEra, have hardware characteristics that differ from conventional superconducting-based architectures with static topologies. These qubit systems, controlled by optical lasers, offer flexible connectivity for entangling operations, but they also introduce a new compilation challenge: shuttling overhead. In such systems, ions or neutral atoms must physically move to designated gate operation zones, adding a new layer of complexity to qubit re-alignment. In this presentation, we explore compilation methodologies for designing variational quantum programs so that quantum machine learning applications can be trained effectively on these emerging architectures.
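
To ground the contrast with static topologies, the sketch below uses Qiskit to map a small circuit onto a fixed linear coupling map, where non-adjacent interactions are resolved by inserted SWAPs; shuttling-based trapped-ion and neutral-atom systems instead resolve them by physically moving qubits, which is the overhead the talk addresses. The example is illustrative and unrelated to the talk's actual toolchain.

```python
# Architecture-aware compilation on a static topology with Qiskit.
from qiskit import QuantumCircuit, transpile
from qiskit.transpiler import CouplingMap

qc = QuantumCircuit(4)
qc.h(0)
qc.cx(0, 3)   # qubits 0 and 3 are not adjacent on a line of 4

line = CouplingMap.from_line(4)                  # static topology: 0-1-2-3
compiled = transpile(qc, coupling_map=line, optimization_level=2)
print(compiled.count_ops())                      # routing adds SWAP/CX gates
```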

PPT CV

Invited Talk

Breaking the Memory Wall: Toward Efficient LLM Inference from GPU Architectures to Real-World PIM Systems

Hyeoncheol Kim - ASO Lab, CS, Yonsei University

As the data demands of large language models (LLMs) continue to grow, challenges such as the memory wall have become increasingly prominent in modern AI accelerators. These limitations lead to constrained scalability, underutilized compute resources, and increased energy consumption. This talk presents research conducted by Hyeoncheol Kim, which explores system-level techniques to improve the efficiency of LLM inference across both GPU-based systems and real-world Processing-In-Memory (PIM) hardware. The presentation covers architectural considerations and compiler-level integration strategies designed to reduce data movement overhead and enhance execution throughput.
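
For readers new to the memory wall, a back-of-the-envelope calculation makes the point: a decode-step matrix-vector product reads each weight exactly once, so its arithmetic intensity is about one FLOP per byte and throughput is capped by bandwidth rather than compute. The numbers below are illustrative assumptions, not measurements from the talk.

```python
# Why single-token LLM decoding hits the memory wall: a GEMV moves roughly
# as many bytes as it performs FLOPs, so bandwidth is the binding constraint.
d = 4096                               # hidden size of a hypothetical layer
bytes_per_param = 2                    # fp16 weights

flops = 2 * d * d                      # multiply-add per weight in a d x d GEMV
bytes_moved = d * d * bytes_per_param  # every weight is read once

intensity = flops / bytes_moved        # FLOPs per byte
print(f"arithmetic intensity: {intensity:.1f} FLOP/byte")   # 1.0 FLOP/byte

# At, say, 1000 GB/s of bandwidth this GEMV is capped near 1 TFLOP/s no matter
# how much peak compute the chip has; PIM attacks exactly this gap by
# computing where the data lives.
print(f"bandwidth-bound ceiling: {intensity * 1000:.0f} GFLOP/s at 1000 GB/s")
```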

PPT CV

Catering Courtesy of CIP Lab

How LLMs Are Changing the Way We Code

Yejin Lee - ToC Lab, CS, Yonsei University

Large Language Models (LLMs) are making coding easier and more efficient by helping with writing, debugging, and understanding code. Recent research has focused on improving their accuracy, reasoning, and ability to work with different programming languages. This presentation will cover key advancements, challenges, and what the future holds for LLMs in software development.

PPT CV

Catering Courtesy of CIP Lab

Introduction to Kyobin Choo's Products and Research

Kyobin Choo - MICV Lab, CS, Yonsei University

This presentation introduces the company projects and research endeavors led by Kyobin Choo, highlighting key challenges encountered while developing a positron emission tomography (PET) quantification solution for the early diagnosis of Alzheimer’s disease. These challenges naturally expanded into broader research in medical imaging, encompassing diverse computer vision tasks such as volumetric image segmentation, deformable image registration, denoising, and cross-modality image synthesis. In particular, this talk explores how diffusion models have been effectively leveraged to address medical image-specific challenges—distinct from those found in natural images—within generative tasks such as image-to-image translation.

PPT CV

Catering Courtesy of MICV Lab

From Data to ML: Opportunities in Data Systems for the ML Lifecycle

Chaerim Lim - BDAI Lab, CS, Yonsei University

With the rapid advancement of machine learning (ML) and increasing industry adoption, organizations are exploring how to seamlessly integrate ML with their existing data and systems. As ML applications expand, a critical challenge arises: how can organizations leverage their well-established data infrastructure to support ML workflows effectively? This talk explores the intersection of ML and data management, focusing on how ML pipelines—from training to inference and deployment—can be optimized from a data-centric perspective. Additionally, we will discuss how data management systems can better support ML-driven applications, particularly in scenarios like Retrieval-Augmented Generation (RAG), and highlight ongoing research in this domain.
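
As a concrete anchor for the RAG scenario mentioned above, the sketch below implements the retrieval step with brute-force cosine similarity over random stand-in embeddings; a production system would swap in a learned encoder and a vector index, which is precisely where the data-systems questions arise.

```python
# Minimal sketch of RAG retrieval: embed, rank by cosine similarity,
# and prepend the retrieved passages to the prompt.
import numpy as np

rng = np.random.default_rng(0)
docs = ["doc about indexing", "doc about training", "doc about serving"]
doc_vecs = rng.normal(size=(len(docs), 64))      # pretend embeddings

def retrieve(query_vec: np.ndarray, k: int = 2) -> list[str]:
    # Cosine similarity = dot product of L2-normalized vectors.
    dv = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    qv = query_vec / np.linalg.norm(query_vec)
    top = np.argsort(dv @ qv)[::-1][:k]
    return [docs[i] for i in top]

context = retrieve(rng.normal(size=64))
prompt = f"Context: {context}\nQuestion: how is serving optimized?"
print(prompt)  # retrieved passages go to the LLM along with the question
```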

PPT CV

Catering Courtesy of MICV Lab

Utilization and Suggestions for Frameworks in Deep Learning Research

Sejong Yang - CIP Lab, CS, Yonsei University

This presentation explores the objectives pursued in deep learning research and the tools utilized to achieve them. It provides practical tips for machine learning researchers and offers system researchers insights into the workflow of machine learning research. The tools introduced include Slurm, Docker, Conda, Git, WandB, PyTorch, and PyTorch-Lightning.
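
For a flavor of the tooling covered, here is a minimal PyTorch-Lightning module: the framework owns the training loop, device placement, and logger plumbing (WandB attaches as a logger), leaving the research logic in two methods. The model and data are toy stand-ins, not examples from the talk.

```python
# Minimal PyTorch-Lightning training setup.
import torch
import pytorch_lightning as pl
from torch.utils.data import DataLoader, TensorDataset

class LinearRegression(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.model = torch.nn.Linear(10, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.mse_loss(self.model(x), y)
        self.log("train_loss", loss)   # picked up by any attached logger
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

if __name__ == "__main__":
    data = TensorDataset(torch.randn(256, 10), torch.randn(256, 1))
    pl.Trainer(max_epochs=1).fit(LinearRegression(), DataLoader(data, batch_size=32))
```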

PPT CV

Catering Courtesy of ASO Lab

Short Journeys of Exploring Computer Vision Research

Jinwoo Kim - CIP Lab, CS, Yonsei University

In this seminar, I will share my journey through computer vision research, from early explorations to recent developments. Starting with my first steps into video instance segmentation during my 2020 internship, I will walk through key milestones: my first ICCV paper in 2021, my experience with workshops, and the path to my first co-first author paper at CVPR 2022. I will also discuss my research on object-centric learning, which led to another CVPR paper in 2023, and my latest work on an object-centric dataset. Through these experiences, I will reflect on how each project began, the challenges faced, and the insights gained along the way.

PPT CV

Catering Courtesy of ASO Lab

Optimizing Large Language Models: Advances in System Software

Heelim Choi - CoreLab, EE, Yonsei University

Large Language Models (LLMs) have revolutionized AI applications, yet their high computational and memory demands pose significant challenges in inference and deployment. This talk explores system software optimizations that enhance the efficiency of LLM execution. We discuss key innovations such as PagedAttention (vLLM) for memory-efficient inference, GPU-accelerated serving with TensorRT-LLM, and distributed execution strategies in DeepSpeed and Megatron-LM. These approaches enable higher throughput, lower latency, and reduced hardware costs, making LLM deployment more scalable. Finally, we examine the real-world impact of these techniques and future directions in LLM system optimization.
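
To make the PagedAttention idea concrete, the toy sketch below maps logical token positions through a per-sequence block table onto non-contiguous physical blocks, allocating only on demand; real vLLM implements this on the GPU per layer and head, so everything here is a simplified illustration.

```python
# Toy sketch of paged KV-cache bookkeeping: logical token positions resolve
# through a block table to physical blocks allocated lazily from a pool,
# instead of reserving memory for the maximum sequence length up front.
BLOCK = 16                                   # tokens per physical block

free_blocks = list(range(64))                # physical block pool
block_table: list[int] = []                  # one sequence's logical->physical map

def slot_for(token_pos: int) -> tuple[int, int]:
    """Return (physical_block, offset) for a token, allocating lazily."""
    logical_block, offset = divmod(token_pos, BLOCK)
    while len(block_table) <= logical_block:
        block_table.append(free_blocks.pop())   # allocate only when needed
    return block_table[logical_block], offset

for pos in range(40):                        # caching 40 tokens uses 3 blocks
    blk, off = slot_for(pos)
print(block_table)                           # e.g. [63, 62, 61]: non-contiguous
```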

PPT CV

Catering Courtesy of ASO Lab

Accelerating Your Research Journey with Triton DSL

Kiung Jung - ASO Lab, CS, Yonsei University

In the rapidly evolving landscape of deep learning and high-performance computing, Triton emerges as a powerful Python-based domain-specific language (DSL) for writing highly optimized GPU kernels with ease. By abstracting away much of the complexity of traditional CUDA programming, Triton enables researchers and developers to write efficient, high-performance code with minimal effort.

Many state-of-the-art AI models, including DeepSeek and FlashAttention, leverage Triton DSL to develop custom kernels, significantly enhancing computational efficiency. By utilizing Triton, researchers can optimize memory access patterns, effectively parallelize computations, and outperform standard implementations.

For those seeking to accelerate their machine learning workloads, Triton provides an intuitive yet highly expressive approach to kernel development, bridging the gap between productivity and performance. Whether you’re optimizing deep learning frameworks or developing custom operations, Triton DSL empowers you to push the boundaries of computational efficiency.
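
For a first impression of what Triton code looks like, here is a minimal elementwise-add kernel; it requires the triton package and a CUDA GPU, and the block size is an arbitrary illustrative choice.

```python
# A minimal Triton kernel: Triton handles vectorization and memory coalescing,
# while the user writes block-level index and masking logic in Python.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n, BLOCK: tl.constexpr):
    pid = tl.program_id(axis=0)
    offs = pid * BLOCK + tl.arange(0, BLOCK)   # this program's element indices
    mask = offs < n                            # guard the ragged final block
    x = tl.load(x_ptr + offs, mask=mask)
    y = tl.load(y_ptr + offs, mask=mask)
    tl.store(out_ptr + offs, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)             # one program per 1024 elements
    add_kernel[grid](x, y, out, n, BLOCK=1024)
    return out

x = torch.randn(4096, device="cuda")
y = torch.randn(4096, device="cuda")
assert torch.allclose(add(x, y), x + y)
```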

PPT CV

Catering Courtesy of ASO Lab

Chanyoung Kim’s Research Introduction and Analysis of ViT Attention Mechanism

Chanyoung Kim - MICV Lab, AI, Yonsei University

This presentation provides a brief introduction to the research conducted by Chanyoung Kim. His work spans diverse areas, including audio-visual representation, image semantic segmentation, and medical imaging. Additionally, it addresses fundamental questions regarding the ViT attention mechanism and includes an in-depth analysis of these aspects.

PPT CV

Catering Courtesy of ASO Lab

Introduction to the Research and Projects of Sejong Yang

Sejong Yang - CIP Lab, CS, Yonsei University

This presentation introduces the various research projects and tasks undertaken by Sejong Yang in the field of computer vision. It provides a concise overview of foundational theories related to models employed in diverse computer vision tasks. Special attention is given to the basic concepts underlying generative models, including their foundational theories and methods for leveraging generative model priors effectively.

PPT CV

Introduction to Compositional Understanding in Computer Vision

Jinwoo Kim - CIP Lab, CS, Yonsei University

This presentation provides a brief introduction to a research domain called “compositional understanding in computer vision,” which focuses on comprehending what objects are and how they interact with one another in given visual inputs.

PPT CV

Optimizing Across Boundaries: IR and PIM for Enhanced Computational Efficiency

Heelim Choi - CoreLab, EE, Yonsei University

This presentation provides a concise introduction to compiler optimization and the importance of Intermediate Representation (IR) in supporting various hardware and software platforms. It explores applications in areas such as graph processing, homomorphic encryption, and deep learning, demonstrating how optimization techniques can enhance performance. To address memory bottlenecks, the concept of Processing In Memory (PIM) technology is introduced, with a focus on DRAM-based PIM hardware. The presentation outlines the design of PIM-friendly IR and its integration into compilers as a cohesive strategy for improving efficiency in memory-intensive tasks.

PPT CV

Hello DL Compiler

Kiung Jung - ASO Lab, CS, Yonsei University

A compiler is a software tool that acts as an interface between the user and the hardware. DL models, like classical programs, cannot be executed directly on GPUs without the assistance of compilers. In this talk, we will explore various DL compiler optimizations, such as kernel fusion, and delve into the PyTorch software stack. Additionally, we will examine how Triton programs are lowered from a Python-embedded domain-specific language (DSL) to PTX.
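
As a small taste of kernel fusion, the sketch below compiles a chain of pointwise ops with torch.compile; TorchInductor can emit a single fused (Triton-generated) kernel where eager execution would launch one kernel per op. It is illustrative, not material from the talk.

```python
# Kernel fusion via torch.compile: three pointwise ops become one kernel.
import torch

def fused_candidate(x: torch.Tensor) -> torch.Tensor:
    return torch.relu(x * 2.0 + 1.0)   # three pointwise ops, fusible into one

compiled = torch.compile(fused_candidate)

x = torch.randn(1 << 20, device="cuda" if torch.cuda.is_available() else "cpu")
assert torch.allclose(compiled(x), fused_candidate(x))
# With TORCH_LOGS="output_code", Inductor prints the generated kernel source.
```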

PPT CV