Under Submission
FlexRAG: Understanding and Optimizing Retrieval-Augmented Generation Serving
Wenqi Jiang, Suvinay Subramanian, Cat Graves, Gustavo Alonso, Amir Yazdanbakhsh, and Vidushi Dadu
Accelerating Graph-based Vector Search via Delayed-Synchronization Traversal [Paper]
Wenqi Jiang, Hang Hu, Torsten Hoefler, and Gustavo Alonso
SwiftSpatial: Spatial Joins on Modern Hardware [Paper]
Wenqi Jiang, Martin Parvanov, and Gustavo Alonso
Conference Papers
[VLDB’25] Chameleon: a Heterogeneous and Disaggregated Accelerator System for Retrieval-Augmented Language Models [Paper]
Wenqi Jiang, Marco Zeller, Roger Waleffe, Torsten Hoefler, and Gustavo Alonso
Proceedings of the VLDB Endowment
[KDD’25] PipeRAG: Fast Retrieval-Augmented Generation via Algorithm-System Co-design [Paper]
Wenqi Jiang, Shuai Zhang, Boran Han, Jie Wang, Bernie Wang, and Tim Kraska
Proceedings of the 31st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
[WWW’24] MS MARCO Web Search: A Large-scale Information-rich Web Dataset with Millions of Real Click Labels [Paper]
Qi Chen, Xiubo Geng, Corby Rosset, Carolyn Buractaon, Jingwen Lu, Tao Shen, Kun Zhou, Chenyan Xiong, Yeyun Gong, Paul Bennett, Nick Craswell, Xing Xie, Fan Yang, Bryan Tower, Nikhil Rao, Anlei Dong, Wenqi Jiang, Zheng Liu, Mingqin Li, Chuanjie Liu, Zengzhong Li, Rangan Majumder, Jennifer Neville, Andy Oakley, Knut Magne Risvik, Harsha Vardhan Simhadri, Manik Varma, Yujing Wang, Linjun Yang, Mao Yang, and Ce Zhang
International World Wide Web Conference
[NeurIPS’23] Data-Informed Geometric Space Selection [Paper]
Shuai Zhang and Wenqi Jiang
Thirty-seventh Conference on Neural Information Processing Systems
[SC’23] Co-design Hardware and Algorithm for Vector Search [Paper] [Code]
Wenqi Jiang, Shigang Li, Yu Zhu, Johannes de Fine Licht, Zhenhao He, Runbin Shi, Cedric Renggli, Shuai Zhang, Theodoros Rekatsinas, Torsten Hoefler, and Gustavo Alonso
The International Conference for High Performance Computing, Networking, Storage and Analysis
[KDD’21] FleetRec: Large-Scale Recommendation Inference on Hybrid GPU-FPGA Clusters [Paper] [Talk] [Code]
Wenqi Jiang*, Zhenhao He*, Shuai Zhang, Kai Zeng, Liang Feng, Jiansong Zhang, Tongxuan Liu, Yong Li, Jingren Zhou, Ce Zhang, and Gustavo Alonso
Proceedings of the 27th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
[FPL’21] Distributed Recommendation Inference on FPGA Clusters [Paper] [Code]
Yu Zhu, Zhenhao He, Wenqi Jiang, Kai Zeng, Jingren Zhou, and Gustavo Alonso
31st International Conference on Field-Programmable Logic and Applications
[MLSys’21] MicroRec: Efficient Recommendation Inference by Hardware and Data Structure Solutions [Paper] [Talk] [Code]
Wenqi Jiang, Zhenhao He, Shuai Zhang, Thomas B. Preußer, Kai Zeng, Liang Feng, Jiansong Zhang, Tongxuan Liu, Yong Li, Jingren Zhou, Ce Zhang, and Gustavo Alonso
4th Conference on Machine Learning and Systems
Journal Papers
Dynamic Sampling and Selective Masking for Communication-Efficient Federated Learning [Paper]
Shaoxiong Ji, Wenqi Jiang, Anwar Walid, and Xue Li
IEEE Intelligent Systems
Tutorials
[SIGMOD’23] Data Processing with FPGAs on Modern Architectures [Paper] [Website]
Wenqi Jiang, Dario Korolija, and Gustavo Alonso
International Conference on Management of Data