
Bio (CV)
I am a research scientist at Facebook AI Research (FAIR). Previously, I was an instructor at Princeton and the IAS. In 2017, I completed a PhD in computer science at Stanford, where I worked on machine learning and natural language processing, co-advised by Chris Manning and Percy Liang. Before that, I earned a BASc at the University of Toronto and worked on capsules with Geoffrey Hinton.
Contact: sidawxyz [at] gmail.com
Recently, I have worked on code LLMs: Self-Play SWE-RL, CWM, AST-FIM, SWE-RL, eval-arena, LEVER, Coder-reviewer, InCoder, MBR-exec; and on code benchmarks: SWE-bench M, Spider 2.0, LiveCodeBench, SAFIM, CRUXEval, DS-1000.
Selected papers (Google Scholar)
Measuring all the noises of LLM Evals
Sida Wang. 2025. [proj] [post]
Toward Training Superintelligent Software Agents through Self-Play SWE-RL
Yuxiang Wei, Zhiqing Sun, Emily McMilin, Jonas Gehring, David Zhang, Gabriel Synnaeve, Daniel Fried, Lingming Zhang, Sida Wang. 2025.
CWM: An Open-Weights LLM for Research on Code Generation with World Models
FAIR CodeGen team. 2025.
Structure-Aware Fill-in-the-Middle Pretraining for Code
Linyuan Gong, Alvin Cheung, Mostafa Elhoushi, Sida Wang. 2025.
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
Yuxiang Wei, Olivier Duchenne, Jade Copet, Quentin Carbonneaux, Lingming Zhang, Daniel Fried, Gabriel Synnaeve, Rishabh Singh, Sida I. Wang. NeurIPS, 2025.
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution
Alex Gu, Baptiste Rozière, Hugh Leather, Armando Solar-Lezama, Gabriel Synnaeve, Sida I. Wang. ICML, 2024.
Accessing higher dimensions for unsupervised word translation
Sida Wang. NeurIPS, 2023.
Coder reviewer reranking for code generation
Tianyi Zhang, Tao Yu, Tatsunori B Hashimoto, Mike Lewis, Wen-tau Yih, Daniel Fried, Sida I. Wang. ICML, 2023.
DS-1000: a natural and reliable benchmark for data science code generation
Yuhang Lai, Chengxi Li, Yiming Wang, Tianyi Zhang, Ruiqi Zhong, Luke Zettlemoyer, Scott Wen-tau Yih, Daniel Fried, Sida Wang, Tao Yu. ICML, 2023.
Natural language to code translation with execution
Freda Shi, Daniel Fried, Marjan Ghazvininejad, Luke Zettlemoyer, Sida I. Wang. EMNLP, 2022.
Bilingual lexicon induction via unsupervised bitext construction and word alignment
Haoyue Shi, Luke Zettlemoyer, Sida I. Wang. ACL, 2021.
Learning adaptive language interfaces through interaction
Sida I. Wang. Stanford University, 2017. [thesis]
Naturalizing a programming language via interactive learning
Sida I. Wang, Samuel Ginn, Percy Liang, Christopher D Manning. ACL, 2017. [slides] [project]
Data noising as smoothing in neural network language models
Ziang Xie, Sida I. Wang, Jiwei Li, Daniel Lévy, Aiming Nie, Dan Jurafsky, Andrew Y. Ng. ICLR, 2017.
Learning language games through interaction
Sida I. Wang, Percy Liang, Chris Manning. ACL, 2016. [slides] [project]
Estimating mixture models via mixture of polynomials
Sida I. Wang, Arun Chaganty, Percy Liang. NIPS, 2015. [poster] [code]
Fast and adaptive online training of feature-rich translation models
Spence Green, Sida Wang, Dan Cer, Chris Manning. ACL, 2013.
Dropout training as adaptive regularization
Stefan Wager, Sida I. Wang, Percy Liang. NIPS, 2013. [slides] [poster] [code]
Fast dropout training
Sida I. Wang, Christopher D. Manning. ICML, 2013. [slides] [talk] [code]
Baselines and bigrams: simple, good sentiment and text classification
Sida I. Wang, Chris Manning. ACL, 2012. [code]
Object recognition using capsules
Geoffrey E. Hinton, Alex Krizhevsky, Sida I. Wang. ICANN, 2011.
Service and teaching
- Area chair for NeurIPS, ICLR, ICML
- Reviewer for TACL, ARR, NeurIPS, ICLR, EMNLP, JMLR, ICML
- Spring 2018: COS495 Natural language processing
- Fall 2015: Head TA for CS224 Natural language processing
- Winter 2013: TA for CS229T Statistical machine learning