Seminar

Date:
-
Location:
MDS 220
Speaker(s) / Presenter(s):
Dr. Hongyi Wang

Title: LLM360: Towards Fully Transparent Open-Source LLMs

Abstract: The recent surge in "open-weight" Large Language Models (LLMs), such as LLaMA and Mistral, provides diverse options for AI practitioners and researchers. However, most of these models are released with only partial artifacts, such as final model weights or inference code, while their technical reports increasingly limit themselves to high-level design choices and superficial statistics. These choices hinder progress by reducing transparency in LLM training and forcing teams to rediscover critical details of the training process. In this talk, I will introduce LLM360, an initiative to fully open-source LLMs by making all training code, data, model checkpoints, and intermediate results available to the community. The goal of LLM360 is to foster open and collaborative AI research by making the entire LLM training process transparent and reproducible for everyone. I will discuss in detail the LLMs we have pre-trained and are currently training from scratch, which achieve leading benchmark performance. Additionally, I will present our fully open and transparent LLM pre-training dataset, TxT360.

Bio: Hongyi Wang will join the CS department at Rutgers University as a tenure-track Assistant Professor in Fall 2025. He spent two years as a postdoctoral fellow at CMU, working with Prof. Eric Xing. His research focuses on large-scale machine learning algorithms and systems. He obtained his Ph.D. from the Department of Computer Sciences at the University of Wisconsin-Madison. Dr. Wang has received several accolades, including the Rising Stars Award from the Conference on Parsimony and Learning in 2024, the NAACL 2024 Best Demo Award runner-up, and the Baidu Best Paper Award at the SpicyFL workshop at NeurIPS 2020.