Gonzalez Group

Enthusiastically Advancing AI Systems at UC Berkeley

Joseph E. Gonzalez

Joey Gonzalez's group at UC Berkeley focuses on advancing AI systems, including LLM agents, data systems, efficient ML training and inference, and generative model evaluation. His group has spawned many successful projects, such as Gorilla, Chatbot Arena, vLLM, MemGPT (now Letta), SGLang, and more.
Visit the Google Scholar page for more of our current publications. His students also have a podcast on PhD life here.

DS SERVE: A Framework for Efficient and Scalable Neural Retrieval

Jinjian Liu and Yichuan Wang and Xinxi Lyu and Rulin Shao and Joseph E Gonzalez and Matei Zaharia and Sewon Min

S*: Test time scaling for code generation

Dacheng Li and Shiyi Cao and Chengkun Cao and Xiuyu Li and Shangyin Tan and Kurt Keutzer and Jiarong Xing and Joseph E Gonzalez and Ion Stoica

arXiv preprint arXiv:2502.14382

RedunCut: Measurement-Driven Sampling and Accuracy Performance Modeling for Low-Cost Live Video Analytics

Gur-Eyal Sela and Kumar Krishna Agrawal and Bharathan Balaji and Joseph Gonzalez and Ion Stoica

arXiv preprint arXiv:2512.24386

BARE: Leveraging Base Language Models for Few-Shot Synthetic Data Generation

Alan Zhu and Parth Asawa and Jared Quincy Davis and Lingjiao Chen and Boris Hanin and Ion Stoica and Joseph E Gonzalez and Matei Zaharia

arXiv preprint arXiv:2502.01697

Adaptive Semantic Prompt Caching with VectorQ

Luis Gaspar Schroeder and Shu Liu and Alejandro Cuadron and Mark Zhao and Stephan Krusche and Alfons Kemper and Matei Zaharia and Joseph E Gonzalez

arXiv e-prints

Why do multi-agent LLM systems fail?

Mert Cemri and Melissa Z Pan and Shuyi Yang and Lakshya A Agrawal and Bhavya Chopra and Rishabh Tiwari and Kurt Keutzer and Aditya Parameswaran and Dan Klein and Kannan Ramchandran and Matei Zaharia and Joseph E Gonzalez and Ion Stoica

arXiv preprint arXiv:2503.13657

TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times

Jintao Zhang and Kaiwen Zheng and Kai Jiang and Haoxu Wang and Ion Stoica and Joseph E Gonzalez and Jianfei Chen and Jun Zhu

arXiv preprint arXiv:2512.16093

LLMs Can Easily Learn to Reason from Demonstrations: Structure, not content, is what matters!

Dacheng Li and Shiyi Cao and Tyler Griggs and Shu Liu and Xiangxi Mo and Eric Tang and Sumanth Hegde and Kourosh Hakhamaneshi and Shishir G Patil and Matei Zaharia and Joseph E Gonzalez and Ion Stoica

arXiv preprint arXiv:2502.07374

SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention

Jintao Zhang and Haoxu Wang and Kai Jiang and Shuo Yang and Kaiwen Zheng and Haocheng Xi and Ziteng Wang and Hongzhou Zhu and Min Zhao and Ion Stoica and Joseph E Gonzalez and Jun Zhu and Jianfei Chen

arXiv preprint arXiv:2509.24006