What the sold-out NeurIPS conference says about the state of AI

On September 4, the Neural Information Processing Systems Foundation released between 2,000 and 2,500 tickets to its annual conference, the premier conference in the field of machine learning, now known by the acronym NeurIPS.

They sold out in 12 minutes.

The conference organizers are expecting 8,000 attendees at the Palais des Congrès de Montréal the first week of December, and the majority of the tickets were held in reserve for presenters, sponsors, reviewers, and other conference participants. Still, the quick sellout of the first block of publicly released tickets is a testament to the enthusiasm that machine learning generates.

“AI research was somewhat stuck until people noticed that by using huge datasets, serious computing infrastructure, and state-of-the-art machine learning methods, very real progress can be made,” says Bernhard Schölkopf, chief machine learning scientist for Amazon’s retail division and a member of the NeurIPS advisory board. “This has transformed NeurIPS from the premier machine learning conference to the hottest AI conference.”

“The poster sessions are probably the best part of NeurIPS, due to the higher degree of engagement you can have with the authors and discuss technical details,” says Alex Smola, machine learning director for Amazon Web Services’ Deep Engine group and an area chair at NeurIPS. “This is quite unsurpassed. Likewise, the workshops are quite valuable. They’ve turned into mini-conferences — actually, not quite so mini, with hundreds of people attending the largest ones.”

Last year, Dilek Hakkani-Tür, a senior principal scientist in the Alexa AI group, was an invited speaker at the first NeurIPS Workshop on Conversational AI. “The number of people was amazing,” she says. “It was one of the biggest workshop rooms, and it was one of the most crowded workshops. This year, I imagine it’s going to be even more crowded.”

Hakkani-Tür is one of the organizers of this year’s workshop, which also has eight Amazon scientists on its program committee. “Paper submissions are more than 50 percent higher this year,” she says. “There has been a lot of work on task-oriented dialogue for a long time, but for the past few years now, there’s a lot of work on more chitchat-based or social-bot types of systems. And there are many submissions to this workshop about those.”

“We have seen a lot of advancements in image processing and speech and in many machine learning problems,” she adds. “Within dialogue, it’s still so hard to have machines that can learn how to converse in an open domain. I think that’s why people want to work on the problem.”

Amazon is one of the conference’s top-tier (diamond) sponsors and is also a sponsor of the Women in Machine Learning and Black in AI workshops. Ten Amazon researchers are members of NeurIPS boards and committees, and 15 are coauthors on eight conference papers.

Amazon will also have a booth in the exhibition hall — recognizable by the enormous (working) replica of an Echo device inside it.

Amazon researchers’ involvement with NeurIPS gives them a valuable vantage point from which to assess the conference program and what it says about the state of machine learning research.

“Five years ago, it felt like 90 percent of the conference was around deep learning,” says Ralf Herbrich, the director of machine learning science for Amazon’s Core AI group and the demonstrations and competitions chair at NeurIPS. “There is now a trend towards a diversification of approaches to solving that pattern-recognition and prediction-of-future-events problem that we call machine learning. I find that very healthy.”

Two approaches in particular stand out to Herbrich. One is the revival of a technique called Bayesian learning. Conventional machine learning attempts to discern statistical patterns that describe the data in a data set. Bayesian learning attempts to discern the same types of patterns — but it also estimates the probability that the patterns it discerns are correct.

That means that Bayesian systems can learn more efficiently than conventional machine learning systems. When a Bayesian system assigns a low probability to the accuracy of its statistical model, new data can modify the model dramatically. As the probable accuracy increases, however, the model becomes more resistant to change.
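The effect described above can be illustrated with a small sketch. It assumes the simplest possible Bayesian model, a Beta prior over a success rate, which is chosen here only for illustration and is not drawn from any Amazon system. The function names and the numbers are likewise hypothetical.

```python
def update_beta(alpha, beta, successes, failures):
    """Conjugate Beta-Bernoulli update: add the observed counts to the prior."""
    return alpha + successes, beta + failures


def mean(alpha, beta):
    """Posterior mean of the success rate."""
    return alpha / (alpha + beta)


def variance(alpha, beta):
    """Posterior variance: smaller means the model is more certain."""
    n = alpha + beta
    return alpha * beta / (n * n * (n + 1))


# Two models that both currently estimate a 50 percent success rate.
uncertain = (1.0, 1.0)      # has seen almost no data (illustrative values)
confident = (100.0, 100.0)  # has seen a lot of data (illustrative values)

# Both receive the same new evidence: 8 successes and 2 failures.
uncertain_post = update_beta(*uncertain, 8, 2)
confident_post = update_beta(*confident, 8, 2)

uncertain_shift = mean(*uncertain_post) - mean(*uncertain)
confident_shift = mean(*confident_post) - mean(*confident)

print(f"uncertain model moved by {uncertain_shift:.3f}")
print(f"confident model moved by {confident_shift:.3f}")
```

The same ten observations move the uncertain model's estimate far more than the confident model's, which is the sense in which a Bayesian model grows more resistant to change as its confidence increases.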

“For Amazon, it’s one of the most interesting problems, because we believe that more selection is better for customers,” Herbrich says. “But more selection also means that you have a longer tail of entities, whether that’s books or fashion items or groceries. The tail is by definition a sparse set of observations. What you need there is something that is closer to human reasoning, which can learn from very few examples.”

The other machine learning technique that Herbrich sees drawing increased attention is something called a spiking neural network. Deep neural networks, which in recent years have come to dominate the field of machine learning, consist of multiple layers of virtual “neurons” that take in data from the layers below, process it, and pass it to the layers above. For any given input to the network, many if not most of the computations performed by individual neurons will prove irrelevant to the final output and get filtered out. But they’re performed nonetheless.

In a spiking neural network, which more faithfully simulates the human brain, a neuron “fires” only when it receives the particular input it’s been trained to recognize — when, that is, its contribution to the output matters. Because, for any given input, only a small fraction of neurons fire, spiking neural networks are much more power efficient than conventional deep neural networks.
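The firing behavior described above can be sketched with a leaky integrate-and-fire neuron, a common textbook simplification that is not necessarily the model Herbrich has in mind. The function name, threshold, leak factor, and input values below are all illustrative assumptions.

```python
def simulate_lif(inputs, threshold=1.0, leak=0.9):
    """Simulate one leaky integrate-and-fire neuron.

    Returns a list with 1 at each time step where the neuron fires, else 0.
    """
    potential = 0.0
    spikes = []
    for current in inputs:
        # The stored potential decays a little, then the new input is added.
        potential = leak * potential + current
        if potential >= threshold:
            spikes.append(1)
            potential = 0.0  # reset after firing
        else:
            spikes.append(0)
    return spikes


# Weak background input never pushes the potential over the threshold.
background = [0.05] * 20

# The same background with a brief strong stimulus at steps 10 and 11.
stimulus = [0.05] * 10 + [0.6, 0.6] + [0.05] * 8

print(sum(simulate_lif(background)))  # number of spikes for background input
print(simulate_lif(stimulus))         # spike train for the stimulus input
```

In this toy setting the neuron stays silent for background input and emits a single spike for the strong stimulus. If downstream computation happens only when a spike arrives, most time steps cost nothing, which is the intuition behind the power-efficiency argument.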

“Think of a device like the Echo,” Herbrich says. “We run pretty intelligent compute on it, which is detecting wake word. Now think of a use case where we run it on a battery. As more and more compute moves to the edge, energy efficiency becomes an important criterion because it constrains how often and where you can use it.”

Inderjit Dhillon, an Amazon fellow in the Search Technologies group and a professor of computer science at the University of Texas at Austin, was a senior area chair for NeurIPS, overseeing the review of around 5,000 paper submissions. One trend he observed in those submissions is an increased concentration on so-called adversarial methods.

“I think there is a genuine concern that if automakers start deploying neural networks in, for example, self-driving cars, how easy is it for the cars to misread a sign?” Dhillon says. “You can try to make your neural network more robust by generating adversarial examples. You look at the way neural networks learn, and then you try to construct adversarial examples where the neural network might get fooled, and then to make the network more robust, you re-train it to make the right prediction on those examples.”
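The loop Dhillon describes can be sketched on a deliberately tiny model. The example below assumes a two-feature logistic classifier and uses the fast gradient sign method to build the adversarial example; that method is one well-known option, chosen here for brevity, and nothing in the article says it is the one used in the submissions. The weights, inputs, and function names are hypothetical.

```python
import math


def predict_proba(w, b, x):
    """Probability of class 1 under a logistic model."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-score))


def predict(w, b, x):
    return 1 if predict_proba(w, b, x) >= 0.5 else 0


def sign(g):
    return (g > 0) - (g < 0)


def fgsm(w, b, x, y, epsilon):
    """Shift each feature by epsilon in the direction that raises the loss."""
    p = predict_proba(w, b, x)
    grad_x = [(p - y) * wi for wi in w]  # gradient of the logistic loss w.r.t. x
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad_x)]


def train(w, b, examples, lr=0.5, steps=50):
    """Plain gradient descent on the logistic loss."""
    w = list(w)
    for _ in range(steps):
        for x, y in examples:
            err = predict_proba(w, b, x) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


w, b = [2.0, -1.0], 0.0          # illustrative starting model
x, y = [0.3, 0.2], 1             # an input the model classifies correctly

x_adv = fgsm(w, b, x, y, epsilon=0.2)
print(predict(w, b, x), predict(w, b, x_adv))   # before retraining

# Retrain on the clean input plus the adversarial one, both with the true label.
w2, b2 = train(w, b, [(x, y), (x_adv, y)])
print(predict(w2, b2, x), predict(w2, b2, x_adv))  # after retraining
```

A small, targeted change to the input flips the original model's prediction, and retraining on the perturbed input restores the correct answer. Because this toy retrains on a single class, it says nothing about overall accuracy; a realistic setup would generate adversarial examples across a full training set covering every class.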

For example, Prem Natarajan, a vice president of Alexa AI and director of the Alexa Natural Understanding group, is senior author on a NeurIPS paper titled “Unsupervised Adversarial Invariance,” which examines adversarial techniques for purging data sets of “nuisance factors” that can mislead machine learning systems.

“I also found that there was a thread on societal implications of AI, and fairness in AI,” Dhillon adds. “What struck me is that it seemed to be more at the level of the invited talks and workshops, but less in terms of the actual technical papers. These are very important societal problems, and the hope is that more and more people start thinking about them with a technical eye.”

NeurIPS papers with Amazon coauthors

"Scalable Hyperparameter Transfer Learning" | Authors: Valerio Perrone, Rodolphe Jenatton, Matthias Seeger, Cedric Archambeau

"Deep State Space Models for Time Series Forecasting" | Authors: Syama Sundar Rangapuram, Matthias Seeger, Jan Gasthaus, Lorenzo Stella, Bernie Wang, Tim Januschowski

"Unsupervised Adversarial Invariance" | Authors: Ayush Jaiswal, Rex Yuu Wu, Wael Abd-Almageed, Prem Natarajan

"Informative Features for Model Comparison" | Authors: Wittawat Jitkrittum, Heishiro Kanagawa, Patsorn Sangkloy, James Hays, Bernhard Schölkopf, Arthur Gretton

"Temporal abstraction for recurrent dynamical models" | Authors: Alexander Neitz, Giambattista Parascandolo, Stefan Bauer, Bernhard Schölkopf

"Does mitigating ML's impact disparity require treatment disparity?" | Authors: Zachary Lipton, Julian McAuley, Alexandra Chouldechova

"Understanding the Role of Adaptivity in Machine Teaching: The Case of Version Space Learners" | Authors: Yuxin Chen, Adish Singla, Oisin Mac Aodha, Pietro Perona, Yisong Yue

"Privacy Amplification by Subsampling: Tight Analyses via Couplings and Divergences" | Authors: Borja Balle, Gilles Barthe, Marco Gaboardi

Workshops with Amazon participants

Conversational AI:
Amazon organizer: Dilek Hakkani-Tür
Program Committee: Amina Shabbeer, Chandra Khatri, Kevin Small, Lambert Mathias, Parminder Bhatia, Praveen Bodigutla, Ravi Jain, Yan Yang