
Andrew Gordon Wilson

UPDATE: I have moved to NYU. If you are a student wanting to work with me, I encourage you to apply to NYU Courant CS, Math, or the Center for Data Science (CDS), and to list me in your application. You will be redirected to my new website shortly.

I lead a machine learning group at Cornell where I advise students in ORIE, Computer Science, Statistics, and CAM. You can check out some of our work here. I also teach classes on Bayesian machine learning and information theory. I organized the NIPS 2017 symposium on Interpretable Machine Learning.

I am interested in developing flexible, interpretable, and scalable machine learning models, often involving kernel learning, deep learning, and Gaussian processes.  I am particularly excited about probabilistic approaches. My work has been applied to time series, vision, NLP, spatial statistics, public policy, medicine, and physics.
Outside of work, I am a classical pianist who particularly enjoys Glenn Gould's playing of Bach.

I can be reached by email and on Twitter @andrewgwils.

Andrew Gordon Wilson
Assistant Professor
235 Rhodes Hall
Cornell University

I received an Amazon Research Award for "New Directions in Non-Convex Optimization for Deep Learning". Thank you for the support, Amazon!

Check out our new code and group pages!

Three new papers appearing at NIPS 2018!

I am an Area Chair/SPC for AAAI 2018, AISTATS 2018, UAI 2018, NeurIPS 2018, AISTATS 2019, ICML 2019, UAI 2019, NeurIPS 2019, ICLR 2020.

Upcoming talks: I am giving invited talks at CMStatistics 2017, MSR Cambridge, Cambridge University, UCL Gatsby, Banff International Research Centre (Interface of Statistics and Machine Learning) 2018, DALI 2018, SIAM ALA (Applied Linear Algebra) 2018, Allerton 2018, and the Toronto Deep Learning Summer School!

My thesis provides an introduction to probabilistic non-parametric model construction, Gaussian processes and kernel design, and a vision for scalable and automatic kernel learning, with ideas for future directions.

Covariance kernels for fast automatic pattern discovery and extrapolation with Gaussian processes
Andrew Gordon Wilson
PhD Thesis, January 2014.
[PDF, BibTeX]

Google scholar page

A Simple Baseline for Bayesian Uncertainty in Deep Learning
Wesley Maddox, Timur Garipov, Pavel Izmailov, Andrew Gordon Wilson
To appear in Advances in Neural Information Processing Systems (NeurIPS), 2019
[PDF, arXiv, code, BibTeX]

Function-Space Distributions over Kernels
Greg Benton, Jayson Salkey, Wesley Maddox, Julio Albinati, Andrew Gordon Wilson
To appear in Advances in Neural Information Processing Systems (NeurIPS), 2019
[PDF, arXiv, code, BibTeX, to come!]

Exact Gaussian Processes on a Million Data Points
Ke Alexander Wang, Geoff Pleiss, Jake Gardner, Stephen Tyree, Kilian Weinberger, Andrew Gordon Wilson
To appear in Advances in Neural Information Processing Systems (NeurIPS), 2019
[PDF, arXiv, code, example notebook, BibTeX]

Subspace Inference for Bayesian Deep Learning
Pavel Izmailov*, Wesley Maddox*, Polina Kirichenko*, Timur Garipov*, Dmitry Vetrov, Andrew Gordon Wilson
Uncertainty in Artificial Intelligence (UAI), 2019
[PDF, arXiv, code, BibTeX]

Practical Multi-fidelity Bayesian Optimization for Hyperparameter Tuning
Jian Wu, Saul Toscano-Palmerin, Peter I. Frazier, Andrew Gordon Wilson
Uncertainty in Artificial Intelligence (UAI), 2019
[PDF, arXiv, BibTeX]

SWALP: Stochastic Weight Averaging in Low Precision Training
Guandao Yang, Tianyi Zhang, Polina Kirichenko, Junwen Bai, Andrew Gordon Wilson, Christopher De Sa
International Conference on Machine Learning (ICML), 2019
[PDF, arXiv, code, BibTeX]

SysML: The New Frontier of Machine Learning Systems
A. Ratner et al., 2019
[PDF, arXiv, BibTeX]

Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning
Ruqi Zhang, Chunyuan Li, Jianyi Zhang, Changyou Chen, Andrew Gordon Wilson
arXiv pre-print, 2019
[PDF, arXiv, code, BibTeX]

There Are Many Consistent Explanations of Unlabeled Data: Why You Should Average
Ben Athiwaratkun, Marc Finzi, Pavel Izmailov, Andrew Gordon Wilson
International Conference on Learning Representations (ICLR), 2019
[PDF, arXiv, code, BibTeX]

Change Surfaces for Expressive Multidimensional Changepoints and Counterfactual Prediction
William Herlands, Daniel B. Neill, Hannes Nickisch, Andrew Gordon Wilson
To appear in the Journal of Machine Learning Research (JMLR), 2019
[PDF, arXiv, BibTeX]

GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration
Jake Gardner, Geoff Pleiss, David Bindel, Kilian Weinberger, Andrew Gordon Wilson
Neural Information Processing Systems (NIPS), 2018
[PDF, arXiv, GPyTorch website, GPyTorch repository, BibTeX]

Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs
Timur Garipov*, Pavel Izmailov*, Dmitrii Podoprikhin*, Dmitry Vetrov, Andrew Gordon Wilson
Neural Information Processing Systems (NIPS), 2018
[PDF, arXiv, code, BibTeX]

Scaling Gaussian Process Regression with Derivatives
David Eriksson, Kun Dong, Eric Lee, David Bindel, Andrew Gordon Wilson
Neural Information Processing Systems (NIPS), 2018
[PDF, arXiv, code, BibTeX]

Probabilistic FastText for Multisense Word Embeddings
Ben Athiwaratkun, Andrew Gordon Wilson, Anima Anandkumar
Association for Computational Linguistics (ACL), 2018
Oral presentation
[PDF, arXiv, code, BibTeX]

Averaging Weights Leads to Wider Optima and Better Generalization
Pavel Izmailov*, Dmitrii Podoprikhin*, Timur Garipov*, Dmitry Vetrov, Andrew Gordon Wilson
Uncertainty in Artificial Intelligence (UAI), 2018.
Oral presentation
[PDF, arXiv, code, BibTeX]

Automated Local Regression Discontinuity Design Discovery
William Herlands, Ed McFowland III, Andrew Gordon Wilson, Daniel B. Neill
Knowledge Discovery and Data Mining (KDD), 2018
[PDF, Video, BibTeX]

Constant-Time Predictive Distributions for Gaussian Processes
Geoff Pleiss, Jacob Gardner, Kilian Q. Weinberger, Andrew Gordon Wilson
International Conference on Machine Learning (ICML), 2018
[PDF, arXiv, code, BibTeX]

Hierarchical Density Order Embeddings
Ben Athiwaratkun, Andrew Gordon Wilson
International Conference on Learning Representations (ICLR), 2018.
[PDF, arXiv, code, BibTeX]

Product Kernel Interpolation for Scalable Gaussian Processes
Jacob Gardner, Geoff Pleiss, Ruihan Wu, Kilian Weinberger, Andrew Gordon Wilson
Artificial Intelligence and Statistics (AISTATS), 2018.
[PDF, arXiv, code, BibTeX]

Gaussian Process Subset Scanning for Anomalous Pattern Detection in Non-iid Data
William Herlands, Ed McFowland, Andrew Gordon Wilson, Daniel B. Neill
Artificial Intelligence and Statistics (AISTATS), 2018.
[PDF, arXiv, BibTeX]

Bayesian GAN
Yunus Saatchi and Andrew Gordon Wilson
Neural Information Processing Systems (NIPS), 2017
[PDF, arXiv, code, Short Video, BibTeX]

Bayesian Optimization with Gradients
Jian Wu, Matthias Poloczek, Andrew Gordon Wilson, Peter I Frazier
Neural Information Processing Systems (NIPS), 2017
Oral Presentation
[PDF, arXiv, Code, NIPS Oral Presentation, BibTeX]

Scalable Log Determinants for Gaussian Process Kernel Learning
Kun Dong, David Eriksson, Hannes Nickisch, David Bindel, Andrew Gordon Wilson
Neural Information Processing Systems (NIPS), 2017
[PDF, arXiv, Code, Video (from David E), BibTeX]

Scalable Lévy Process Priors for Spectral Kernel Learning
Andrew Loeb, Phillip Jang, Matthew Davidow, and Andrew Gordon Wilson
Neural Information Processing Systems (NIPS), 2017
[PDF, arXiv, Code, Video (from Phillip J), BibTeX]

Multimodal Word Distributions
Ben Athiwaratkun and Andrew Gordon Wilson
Association for Computational Linguistics (ACL), 2017
[PDF, arXiv, Code, BibTeX]

Learning Scalable Deep Kernels with Recurrent Structure
Maruan Al-Shedivat, Andrew Gordon Wilson, Yunus Saatchi, Zhiting Hu, Eric P. Xing
To appear in the Journal of Machine Learning Research (JMLR), 2017.
[PDF, arXiv, code [PyTorch, more recent], code [Keras+Matlab], BibTeX]

Stochastic Variational Deep Kernel Learning
Andrew Gordon Wilson*, Zhiting Hu*, Ruslan Salakhutdinov, and Eric P. Xing
Neural Information Processing Systems (NIPS), 2016
[PDF, arXiv, Video, code (GPyTorch, more recent), code (Caffe+Matlab), BibTeX]

Deep Kernel Learning
Andrew Gordon Wilson*, Zhiting Hu*, Ruslan Salakhutdinov, and Eric P. Xing
Artificial Intelligence and Statistics (AISTATS), 2016
[PDF, arXiv, code (GPyTorch, more recent), code (Keras+Matlab), code (Caffe+Matlab), BibTeX]

Thoughts on Massively Scalable Gaussian Processes
Andrew Gordon Wilson, Christoph Dann, and Hannes Nickisch
arXiv pre-print, 2015
(See KISS-GP and Deep Kernel Learning for more empirical demonstrations).
[PDF, arXiv, code (GPyTorch, more recent), code (older tutorials), BibTeX, Music]

Scalable Gaussian Processes for Characterizing Multidimensional Change Surfaces
William Herlands, Andrew Gordon Wilson, Seth Flaxman, Daniel Neill, Wilbert van Panhuis, and Eric P. Xing
Artificial Intelligence and Statistics (AISTATS), 2016
[PDF, BibTeX]

Bayesian nonparametric kernel learning
Junier Oliva*, Avinava Dubey*, Andrew Gordon Wilson, Barnabas Poczos, Jeff Schneider, and Eric P. Xing
Artificial Intelligence and Statistics (AISTATS), 2016
[PDF, BibTeX]

The human kernel
Andrew Gordon Wilson, Christoph Dann, Christopher G. Lucas, and Eric P. Xing
Neural Information Processing Systems (NIPS), 2015
[PDF, arXiv, Supplement, BibTeX]

Kernel interpolation for scalable structured Gaussian processes (KISS-GP)
Andrew Gordon Wilson and Hannes Nickisch
International Conference on Machine Learning (ICML), 2015
Oral Presentation
[PDF, Supplement, arXiv, code (GPyTorch, newer), code (tutorials, older), BibTeX, Theme Song, Video Lecture]

Fast Kronecker inference in Gaussian processes with non-Gaussian likelihoods
Seth Flaxman, Andrew Gordon Wilson, Daniel Neill, Hannes Nickisch, and Alexander J. Smola
International Conference on Machine Learning (ICML), 2015
Oral Presentation
[PDF, Supplement, BibTeX, Code, Video Lecture]

À la carte - learning fast kernels
Zichao Yang, Alexander J. Smola, Le Song, and Andrew Gordon Wilson
Artificial Intelligence and Statistics (AISTATS), 2015
Oral Presentation
[PDF, BibTeX]

Fast kernel learning for multidimensional pattern extrapolation
Andrew Gordon Wilson*, Elad Gilboa*, Arye Nehorai, and John P. Cunningham
Advances in Neural Information Processing Systems (NIPS) 2014
[PDF, BibTeX, Code, Slides]

Variational inference for latent variable modelling of correlation structure
Mark van der Wilk, Andrew Gordon Wilson, Carl Edward Rasmussen
NIPS Workshop on Advances in Variational Inference, 2014
[PDF, BibTeX]

A Bayesian method to quantifying chemical composition using NMR: application to porous media systems
Yuting Wu, Daniel J. Holland, Mick D. Mantle, Andrew Gordon Wilson, Sebastian Nowozin, Andrew Blake, and Lynn F. Gladden
European Signal Processing Conference (EUSIPCO), 2014

Bayesian inference for NMR spectroscopy with applications to chemical quantification
Andrew Gordon Wilson, Yuting Wu, Daniel J. Holland, Sebastian Nowozin, Mick D. Mantle, Lynn F. Gladden, and Andrew Blake
In Submission. February 14, 2014
[arXiv, PDF, BibTeX]

Covariance kernels for fast automatic pattern discovery and extrapolation with Gaussian processes
Andrew Gordon Wilson
PhD Thesis, January 2014
[PDF, BibTeX]

Student-t processes as alternatives to Gaussian processes
Amar Shah, Andrew Gordon Wilson, and Zoubin Ghahramani
Artificial Intelligence and Statistics, 2014
[arXiv, PDF, Supplementary, BibTeX]

The change point kernel
Andrew Gordon Wilson
Technical Report (Note), University of Cambridge.
November 2013.
[PDF, BibTeX]

GPatt: Fast multidimensional pattern extrapolation with Gaussian processes
Andrew Gordon Wilson, Elad Gilboa, Arye Nehorai, and John P. Cunningham
October 21, 2013.   In Submission.
[arXiv, PDF, BibTeX, Resources and Tutorial]

Bayesian optimization using Student-t processes
Amar Shah, Andrew Gordon Wilson, and Zoubin Ghahramani
NIPS Workshop on Bayesian Optimisation, 2013.
[PDF, BibTeX]

Gaussian process kernels for pattern discovery and extrapolation
Andrew Gordon Wilson and Ryan Prescott Adams
International Conference on Machine Learning (ICML), 2013.
Oral Presentation
[arXiv, PDF, Correction, Supplementary, BibTeX, Slides, Resources and Tutorial, GPyTorch implementation, Video Lecture]

Modelling input varying correlations between multiple responses
Andrew Gordon Wilson and Zoubin Ghahramani
European Conference on Machine Learning (ECML), 2012
Nectar Track for "significant machine learning results"
Oral Presentation
[PDF, BibTeX]

A process over all stationary covariance kernels
Andrew Gordon Wilson
Technical Report, University of Cambridge.
June 2012.
[PDF, BibTeX]

Gaussian process regression networks
Andrew Gordon Wilson, David A. Knowles, and Zoubin Ghahramani
International Conference on Machine Learning (ICML), 2012.
Oral Presentation
[PDF, BibTeX, Slides, Supplementary, Video Lecture, Original Code, New Code]

Generalised Wishart processes
Andrew Gordon Wilson and Zoubin Ghahramani
Uncertainty in Artificial Intelligence (UAI), 2011.
Best Student Paper Award
[PDF, BibTeX]

Copula processes
Andrew Gordon Wilson and Zoubin Ghahramani
Advances in Neural Information Processing Systems (NIPS), 2010.
[PDF, BibTeX, Slides, Video Lecture]