Task Force on Statistics and Machine Learning



Charge

Many of the 21st century’s most important discoveries — in fields ranging from astrophysics to finance, and from neuroscience to public policy — will emerge from sophisticated and innovative analyses of massive data sets. The field of statistics and machine learning, which develops the theories and techniques that enable scholars to extract meaning from data, will catalyze research agendas across multiple disciplines and departments. Excellence in this evolving science of data will be essential to any university that aspires to produce world-class education and research.

Princeton University has a distinctive history in statistics and a unique opportunity for creative leadership. Princeton alumni and faculty members (including Alonzo Church '24 *27, Alan Turing *38, John von Neumann and John Tukey *39) pioneered mathematical and computational ideas that made possible the development of modern data science. Despite its historical strength in the field, Princeton eliminated its Department of Statistics in 1985. Today, Princeton has a vibrant collection of outstanding statisticians in multiple departments, and it has the freedom to design an interdisciplinary center that is unfettered by prior departmental structures and so can be tailored to seize the opportunities presented by new developments in the field.

To do so effectively, Princeton will have to plan carefully. It must aim for the highest levels of quality. It must develop a vibrant and cohesive community in statistics and machine learning while also nurturing the field’s connections to research and teaching throughout the University. It must ensure that students and faculty in the field have access to the infrastructure and data sets they need to do their work well. And the University must figure out whether it should concentrate its efforts on particular data sets or topics, and, if so, which ones.

Earlier this year, Princeton took an initial step forward in this critical field by launching its new Center for Statistics and Machine Learning. As Princeton prepares to increase the scope of the center’s activity, I would ask this task force to consider the following questions:

  1. What are Princeton's current strengths and weaknesses in statistics and machine learning? How does Princeton’s position compare to that of other leading research universities?
  2. What existing resources — including faculty positions, graduate student slots and facilities — are now supporting the University's efforts in statistics and machine learning, or can be repurposed to do so?
  3. How should Princeton define the intellectual and scholarly core of its Center for Statistics and Machine Learning? To what extent must Princeton focus on specific kinds of data, or specific areas of research, in order to ensure that its center’s work achieves world-class eminence?
  4. Which departments are the key partners for the Center for Statistics and Machine Learning? How should the University decide on the right mix of center-only faculty lines (if any) versus joint appointments?
  5. What graduate and undergraduate curriculum should the center offer, and what degree programs should it support? What student constituencies should the center serve, and how can it do so effectively? What faculty resources would the center require to staff its teaching agenda?
  6. What infrastructure — such as research staff, computational facilities and data sets — would the center require in order to achieve world-class eminence in its teaching and research?
  7. How should the University evaluate the success of investments that it makes in the field of statistics and machine learning?

Members

Chair

  • John Storey, William R. Harman '63 and Mary-Love Harman Professor in Genomics; Professor of Molecular Biology and the Lewis-Sigler Institute for Integrative Genomics; Director, Center for Statistics and Machine Learning

Faculty members

  • Jonathan Cohen, Robert Bendheim and Lynn Bendheim Thoman Professor in Neuroscience; Professor of Psychology and the Princeton Neuroscience Institute; Co-Director, Princeton Neuroscience Institute
  • Jay Dominick, Vice President for Information Technology and Chief Information Officer
  • Jianqing Fan, Frederick L. Moore, Class of 1918, Professor in Finance; Professor of Operations Research and Financial Engineering; Chair, Department of Operations Research and Financial Engineering; Director, Committee for Statistical Studies
  • Kosuke Imai, Professor of Politics; Director, Program in Statistics and Machine Learning
  • Matthew Salganik, Professor of Sociology
  • Christopher Sims, John J. F. Sherrerd '52 University Professor of Economics
  • James Stone, Professor of Astrophysical Sciences and Applied and Computational Mathematics; Director, Princeton Institute for Computational Science and Engineering; Director, Fund for Canadian Studies
  • Olga Troyanskaya, Professor of Computer Science and the Lewis-Sigler Institute for Integrative Genomics

Staff members

  • Kara Dolinski, Director, Genome Databases, Lewis-Sigler Institute for Integrative Genomics
