Yi Zhou (周奕)

Research Scientist at ByteDance Seed

About Me

I am a research scientist at ByteDance Seed, with a focus on AI for Science and Artificial General Intelligence. I am currently developing biomolecular foundation models to bridge the gap between structure prediction, conformational sampling, and protein understanding and design. Our latest work, SeedFold (along with SeedProteo), is the first model to outperform AlphaFold3, enabled by scaling laws and linear attention.

Besides that, I am experienced in deep generative modeling, "old-school" NLP, and large language models. I established the cryo-EM research team at ByteDance Research, and our first work, CryoSTAR, was accepted by Nature Methods.

I was part of the Fudan NLP group and earned my Master's degree from Fudan University (2017~2021), under the mentorship of Prof. Xiaoqing Zheng. In 2020, I gained invaluable experience visiting the University of California, Los Angeles, where I collaborated with Prof. Cho-Jui Hsieh and Prof. Kaiwei Chang, focusing primarily on model security in NLP. I joined the ByteDance AI Lab in 2021, where I have worked closely with Prof. Hao Zhou (2021~2022) and Prof. Quanquan Gu (2023~).

Email: zhouyi.naive[-]bytedance[-]com / dugu9sword[-]gmail[-]com / yizhou17[-]fudan[-]edu[-]cn

We Are Hiring

We are looking for talented and self-motivated students to join us in building the next generation of biomolecular foundation models. Candidates may have experience in cryo-EM, protein dynamics, protein design, protein language models, or related areas. If you are passionate about AI for Science, please send me your resume.

Selected Publications

(A full list can be found on Google Scholar.)

Biomolecular Foundation Models

  • SeedFold: Scaling Biomolecular Structure Prediction (ByteDance Seed Technical Report, 2025)
  • SeedProteo: Accurate De Novo All-Atom Design of Protein Binders (ByteDance Seed Technical Report, 2025)
  • Structure-informed Language Models Are Protein Designers (ICML 2023, oral)
  • Regularized Molecular Conformation Fields (NeurIPS 2022, spotlight)
  • Zero-Shot 3D Drug Design by Sketching and Generating (NeurIPS 2022, spotlight)

Cryo-EM

  • CryoSTAR: Leveraging Structural Prior and Constraints for Cryo-EM Heterogeneous Reconstruction (Nature Methods)
  • CryoFM: A Flow-based Foundation Model for Cryo-EM Densities (ICLR 2025)
  • A Generative Foundation Model for Cryo-EM Densities (bioRxiv)

NLP

  • The Volctrans GLAT System: Non-autoregressive Translation Meets WMT21 (EMNLP-WMT 2021)
  • Defense against Synonym Substitution-based Adversarial Attacks via Dirichlet Neighborhood Ensemble (ACL 2021)
  • Evaluating and Enhancing the Robustness of Neural Network-based Dependency Parsing Models with Adversarial Examples (ACL 2020)

Software

Academic Services