Here’s another article on advances in neural network design.

A new area in artificial intelligence involves using algorithms to automatically design machine-learning systems known as neural networks, which are more accurate and efficient than those developed by human engineers. But this so-called neural architecture search (NAS) technique is computationally expensive. A state-of-the-art NAS algorithm recently developed by Google […]

Neural networks have become a hot topic over the last few years, but determining the most efficient way to build one is still more art than science. In fact, it’s more trial and error than art. However, MIT may have solved that problem.

The NAS (Neural Architecture Search, in this context) algorithm they developed “can directly learn specialized convolutional neural networks (CNNs) for target hardware platforms — when run on a massive image dataset — in only 200 GPU hours,” MIT News reports. This is a massive improvement over the 48,000 GPU hours Google reported its state-of-the-art NAS algorithm for image classification required. The researchers’ goal is to democratize AI by allowing researchers to experiment with various aspects of CNN design without needing enormous GPU arrays to do the front-end work. If finding state-of-the-art approaches requires 48,000 GPU hours, precious few people, even at large institutions, will ever have the opportunity to try.
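To make the idea concrete, here’s a minimal sketch of architecture search in its simplest form: random search over a space of CNN configurations. This is an illustration, not MIT’s method — the search space, the `build_cnn` helper, and the `evaluate` scoring function are all hypothetical stand-ins. A real hardware-aware NAS would train each candidate and score it on validation accuracy and on-device latency.

```python
import random
import torch
import torch.nn as nn

# Hypothetical search space: each candidate CNN is a list of
# (out_channels, kernel_size) conv layers of varying depth.
SEARCH_SPACE = {"channels": [16, 32, 64], "kernels": [3, 5], "depth": [2, 3, 4]}

def sample_architecture():
    """Randomly sample one CNN configuration from the search space."""
    depth = random.choice(SEARCH_SPACE["depth"])
    return [(random.choice(SEARCH_SPACE["channels"]),
             random.choice(SEARCH_SPACE["kernels"])) for _ in range(depth)]

def build_cnn(arch, in_channels=3, num_classes=10):
    """Materialize a sampled architecture as a PyTorch model."""
    layers, c = [], in_channels
    for out_c, k in arch:
        layers += [nn.Conv2d(c, out_c, k, padding=k // 2), nn.ReLU()]
        c = out_c
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(c, num_classes)]
    return nn.Sequential(*layers)

def evaluate(model):
    """Placeholder fitness. A real NAS loop would train the candidate
    briefly and measure validation accuracy (and latency on the target
    hardware); this stand-in just exercises the forward pass."""
    x = torch.randn(8, 3, 32, 32)
    return -model(x).var().item()  # not a real metric

best_arch, best_score = None, float("-inf")
for _ in range(20):  # tiny budget; real searches evaluate far more candidates
    arch = sample_architecture()
    score = evaluate(build_cnn(arch))
    if score > best_score:
        best_arch, best_score = arch, score
print("best architecture:", best_arch)
```

The expensive part is `evaluate`: naive NAS trains every candidate, which is where the tens of thousands of GPU hours go. The point of the MIT work is to avoid most of that cost, bringing the search down to roughly 200 GPU hours.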

Here’s an interesting talk from the Microsoft Research YouTube channel by Yujia Li about Gated Graph Sequence Neural Networks. Details about the presentation and a link to the paper are below the video.

Link to paper

From the description:

Graph-structured data appears frequently in domains including chemistry, natural language semantics, social networks, and knowledge bases. In this work, we study feature learning techniques for graph-structured inputs. Our starting point is previous work on Graph Neural Networks (Scarselli et al., 2009), which we modify to use gated recurrent units and modern optimization techniques and then extend to output sequences. The result is a flexible and broadly useful class of neural network models that has favorable inductive biases relative to purely sequence-based models (e.g., LSTMs) when the problem is graph-structured. We demonstrate the capabilities on some simple AI (bAbI) and graph algorithm learning tasks. We then show it achieves state-of-the-art performance on a problem from program verification, in which subgraphs need to be matched to abstract data structures.
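For a sense of what “modify to use gated recurrent units” means in practice, here’s a minimal sketch of one GGNN-style propagation step in PyTorch (the framework choice is mine, not the paper’s). Each node sums transformed messages from its neighbors, and a GRU cell then updates the node’s state, treating the aggregated message as its input. Edge types, node annotations, and the sequence-output model from the paper are omitted.

```python
import torch
import torch.nn as nn

class GGNNLayer(nn.Module):
    """Minimal gated graph propagation step, after Li et al.:
    aggregate neighbor messages, then update node states with a GRU cell."""

    def __init__(self, hidden_dim):
        super().__init__()
        self.message = nn.Linear(hidden_dim, hidden_dim)  # per-edge transform
        self.gru = nn.GRUCell(hidden_dim, hidden_dim)     # gated state update

    def forward(self, h, adj):
        # h:   (num_nodes, hidden_dim) node states
        # adj: (num_nodes, num_nodes) adjacency matrix
        messages = adj @ self.message(h)  # sum incoming neighbor messages
        return self.gru(messages, h)      # GRU: messages as input, h as state

# Toy usage: 4 nodes on a cycle, 8-dimensional states, 5 propagation steps.
adj = torch.tensor([[0, 1, 0, 1],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [1, 0, 1, 0]], dtype=torch.float32)
h = torch.randn(4, 8)
layer = GGNNLayer(8)
for _ in range(5):
    h = layer(h, adj)
print(h.shape)  # torch.Size([4, 8])
```

Running several propagation steps lets information flow across the graph, which is what gives these models their edge over purely sequence-based models like LSTMs on graph-structured problems.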