Lei Cao
- Assistant Professor, Computer Science
- Member of the Graduate Faculty
Contact
- (520) 621-4632
- Gould-Simpson, Rm. 721
- Tucson, AZ 85721
- caolei@arizona.edu
Biography
I am an Assistant Professor in the Department of Computer Science at the University of Arizona. I also hold a research affiliation at MIT CSAIL, where I spent several years as a Postdoctoral Associate and then a Research Scientist, actively collaborating with Prof. Samuel Madden, Prof. Michael Stonebraker, and Dr. Michael Cafarella. Before that, I worked at the IBM T.J. Watson Research Center as a Research Staff Member. I have conducted research in the broad areas of data systems and data science, ranging from low-level core database performance optimization to the design of high-level, application-specific machine learning techniques. My recent research falls in the emerging area of "Systems for AI and AI for Systems" and focuses on building data management and analytics tools that satisfy the SAUL properties: Scalable, Automatic, Human-in-the-loop.
Degrees
- Ph.D. Computer Science
- Worcester Polytechnic Institute
Work Experience
- MIT (2021 - 2022)
- MIT (2017 - 2021)
- IBM T.J. Watson Research Center (2015 - 2016)
Interests
Teaching
Databases, Data Systems
Research
Data management, Cloud databases, Data cleaning and integration, Anomaly Detection
Courses
2024-25 Courses
- Database Sys Implement, CSC 560 (Spring 2025)
- Thesis, CSC 910 (Spring 2025)
- Database Design, CSC 460 (Fall 2024)
- Directed Research, CSC 492 (Fall 2024)
- Research, CSC 900 (Fall 2024)
2023-24 Courses
- Database Sys Implement, CSC 560 (Spring 2024)
- Research, CSC 900 (Spring 2024)
- Research, CSC 900 (Fall 2023)
- Thesis, CSC 910 (Fall 2023)
2022-23 Courses
- Adv Tpc Data Systems, CSC 696J (Spring 2023)
- Research, CSC 900 (Spring 2023)
- Independent Study, CSC 599 (Fall 2022)
- Research, CSC 900 (Fall 2022)
Scholarly Contributions
Proceedings Publications
- Chen, Z., Gu, Z., Cao, L., Fan, J., Madden, S., & Tang, N. (2023). Symphony: Towards Natural Language Query Answering over Multi-modal Data Lakes. In CIDR.
- Chen, Z., Cao, L., & Madden, S. (2022). RoTaR: Efficient Row-Based Table Representation Learning via Teacher-Student Training. In NeurIPS Workshop.
- Hofmann, D., Van Nostrand, P., Cao, L., Madden, S., & Rundensteiner, E. (2022). A Demonstration of AutoOD: A Self-Tuning Anomaly Detection System. In PVLDB.
- Tang, J., Zuo, Y., Cao, L., & Madden, S. (2022). Generic Entity Resolution Models. In NeurIPS Workshop.