"Sylvia" Ziyu Zhang

   

Email: sylziyuz[at]mit[dot]edu

I am a third-year PhD student at MIT CSAIL. I am co-advised by Prof. Julian Shun in the Parallel Algorithms Group and Prof. Michael Cafarella in the Data Systems Group. My general research interest is tackling data management problems, especially those involving high-dimensional vectors and unstructured data, with techniques from both algorithm design and systems implementation.

Previously, I worked at OtterTune with Prof. Andy Pavlo on database optimization. Before that, I completed my undergraduate studies at Carnegie Mellon University School of Computer Science, where I was advised by Prof. David Woodruff.


Acknowledgements

In addition to the support from my advisors, I am grateful for the support from an NSF Graduate Research Fellowship and an MIT Jacobs Presidential Graduate Fellowship.


Research

My research interests currently revolve around graph-based algorithms for high-dimensional vector data, in particular clustering and approximate nearest neighbor search. I care about designing highly scalable algorithms, aligning algorithm design objectives with the roles these algorithms play in data management systems, and understanding why they work or fail across different datasets and compute settings.

In exploring the intersection of graph algorithms and high-dimensional vector or unstructured data, I have also worked on refining causal graphs with observational data and on dimensionality reduction algorithms for general tensor networks.

Publications

CleANN: Efficient Full Dynamism in Graph-based Approximate Nearest Neighbor Search
Ziyu Zhang, Yuanhao Wei, Josh Engels, and Julian Shun.
Preprint.
KramaBench: A Benchmark for AI Systems on Data-to-Insight Pipelines over Data Lakes
Eugenie Lai, Gerardo Vitagliano, Ziyu Zhang, Sivaprasad Sudhir, Om Chabra, Anna Zeng, Anton A. Zabreyko, Chenning Li, Ferdi Kossmann, Jialin Ding, Jun Chen, Markos Markakis, Matthew Russo, Weiyang Wang, Ziniu Wu, Michael J. Cafarella, Lei Cao, Sam Madden, and Tim Kraska.
Preprint.
From Logs to Causal Inference: Diagnosing Large Systems
Markos Markakis, Brit Youngmann, Trinity Gao, Ziyu Zhang, Rana Shahout, Peter Baile Chen, Chunwei Liu, Ibrahim Sabek, and Michael Cafarella.
VLDB 2025.
Press ECCS to Doubt (Your Causal Graph)
Markos Markakis, Ziyu Zhang, Rana Shahout, Trinity Gao, Chunwei Liu, Ibrahim Sabek, and Michael Cafarella.
SIGMOD 2024 GUIDE-AI (Best Paper Award).
Sawmill: From Logs to Causal Diagnosis of Large Systems
Markos Markakis, An Bo Chen, Brit Youngmann, Trinity Gao, Ziyu Zhang, Rana Shahout, Peter Baile Chen, Chunwei Liu, Ibrahim Sabek, and Michael Cafarella.
SIGMOD 2024 Demo.
Near-Linear Time and Fixed-Parameter Tractable Algorithms for Tensor Decompositions(*)
Arvind V. Mahankali, David Woodruff, and Ziyu Zhang.
ITCS 2024.

Teaching

  • Fall 2024 - Teaching Assistant for Database Systems at MIT
  • 2021 - Winter 2022 - (Head) Teaching Assistant for 15-451 Algorithm Design and Analysis at CMU
  • Fall 2019 - Fall 2020 - Teaching Assistant for 15-151 Mathematical Foundations for Computer Science at CMU
  • Spring 2019 - Teaching Assistant for 15-122 Principles of Imperative Computation at CMU

Non-academic Stuff

  • I am originally from Northeastern China.
  • In my free time, I dance with the MIT Ballroom Dance Team and actively attend collegiate competitions with my partner Edan.
  • I enjoy making awesome (we think) food with Prashanti.
  • I am also an aspiring mountaineer, thanks to Fan Pu and our other friends.
  • Lastly, I am quantum entangled with the wonderful Abigale Kim.
  • Website template is due to Prashanti Anderson.

News


Under Construction.