Neural Networks for Data Science Applications

Master Degree in Data Science

Year 2019/2020 (6 credits)

Mailing list

A Google Group is active to receive all info on the course (organization, material, …), ask questions, and discuss with other students:

General overview

The course will introduce neural networks in the context of data science applications. After a brief overview of supervised learning and numerical optimization, the course will describe recent techniques and algorithms (going under the broad name of “deep learning” or differentiable programming) that allow neural networks to be applied successfully to a wide range of problems, e.g., in computer vision and natural language processing.

Students will first be introduced to convolutional networks (for image analysis), and then to recurrent neural networks for processing sequential data and tackling so-called ‘sequence-to-sequence’ problems (e.g., machine translation). Optional topics include attention-based models and embedding techniques for other types of data (e.g., graph-structured data).

Each topic will be supplemented by a practical laboratory where all concepts will be developed on realistic use cases using the TensorFlow 2.0 library.

Slides and notebooks

  • 1 (TBA): Intro to the course
  • 2 (TBA): Supervised learning and numerical optimization
  • Lab (TBA): Automatic differentiation in TF with GradientTape
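As a small preview of the GradientTape lab, here is a minimal sketch of automatic differentiation in TensorFlow 2.x (the function being differentiated is an illustrative choice, not taken from the course material):

```python
import tensorflow as tf

# A trainable scalar; GradientTape records operations on tf.Variable objects.
x = tf.Variable(3.0)

with tf.GradientTape() as tape:
    y = x ** 2 + 2.0 * x  # f(x) = x^2 + 2x

# Reverse-mode AD: df/dx = 2x + 2, which is 8 at x = 3.
grad = tape.gradient(y, x)
print(grad.numpy())  # 8.0
```

The same pattern (record a forward computation inside the tape, then ask for gradients) generalizes directly to training full networks.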

Environment setup

Students are invited to bring their own laptop for the lab sessions. In order to have a working Python installation with all prerequisites, you can install the Anaconda distribution from the web (Python 3.7 version).

We will use the TensorFlow 2.0 release candidate in the course, which you can install from the Anaconda prompt (more information is available from the installation link, especially for the alternative GPU-enabled installation):
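The install command itself did not survive here; at the time of the 2.0 release candidates it typically looked like the following (the exact version pin is an assumption, so check the installation link for the current one):

```shell
pip install tensorflow==2.0.0rc0
```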

Alternatively, you can run all notebooks freely using the Google Colaboratory service (accessible with a standard Gmail account).

Reading material

The main reference book for the course is Deep Learning (MIT Press, 2016), which you can view for free in HTML form or buy following any of the links on the website. A more up-to-date reference, available for free as a set of Jupyter notebooks, is Dive into Deep Learning, which is however based on the MXNet deep learning framework.

Interesting additional links

Blog posts:

  • On the idea of differentiable programming, I wrote a small article on Medium. Interesting work in DP is now being done outside the Python ecosystem, e.g., see this post on the Flux blog (a Julia framework) or the manifesto of Swift for TensorFlow, arguing for automatic differentiation as a first-class construct in the programming language. Another interesting perspective is the Software 2.0 blog post by Andrej Karpathy.
  • To learn more about reverse-mode AD, you can check this blog post with a working example in Rust, or a two-part tutorial I wrote (in Italian) with a toy implementation in Python (part 1 and part 2).
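In the same spirit as those tutorials, a toy implementation of reverse-mode AD fits in a few lines of plain Python (this sketch is my own illustration, not the code from the linked posts):

```python
class Node:
    """A scalar in a computational graph, tracking how to backpropagate."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (parent_node, local_gradient)
        self.grad = 0.0

    def __add__(self, other):
        return Node(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Node(self.value * other.value,
                    [(self, other.value), (other, self.value)])

    def backward(self, upstream=1.0):
        # Accumulate the upstream gradient, then apply the chain rule
        # along every edge of the graph.
        self.grad += upstream
        for parent, local_grad in self.parents:
            parent.backward(upstream * local_grad)

x = Node(3.0)
y = x * x + x      # f(x) = x^2 + x, so f'(3) = 2*3 + 1 = 7
y.backward()
print(x.grad)      # 7.0
```

Gradients are summed over all paths from the output back to each input, which is exactly what `tf.GradientTape` does (far more efficiently) under the hood.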
  • Practically anything coming from is worth reading and exploring if you are into deep learning.
  • More generally, the old blog and the new blog of Andrej Karpathy have several interesting pieces, including this full ‘recipe’ for training neural networks and why you should understand backpropagation.
  • Another interesting blog you can check out is the one by Lilian Weng (link), currently a research scientist at OpenAI.

Scientific articles: