Neural Networks for Data Science Applications
Year 2019/2020 (6 credits)
A Google Group is active, where you can receive all info on the course (organization, material, …), ask questions, and discuss with other students:
The course will introduce neural networks in the context of data science applications. After a brief overview of supervised learning and numerical optimization, the course will describe recent techniques and algorithms (going under the broad name of “deep learning” or differentiable programming) that allow neural networks to be successfully applied to a wide range of problems, e.g., in computer vision and natural language processing.
Students will first be introduced to convolutional networks (for image analysis), and then to recurrent neural networks for processing sequential data and tackling so-called ‘sequence-to-sequence’ problems (e.g., machine translation). Optional topics include attention-based models and embedding techniques for other types of data (e.g., graph-structured data).
Each topic will be supplemented by a practical laboratory where all concepts will be developed on realistic use cases through the use of the TensorFlow 2.0 library.
Slides and notebooks
|#|Date|Topic|
|---|---|---|
|1|TBA|Intro to the course|
|2|TBA|Supervised learning and numerical optimization|
|Lab|TBA|Automatic differentiation in TF with GradientTape|
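As a small preview of the lab on automatic differentiation, here is a minimal sketch (not part of the official lab material) of how `tf.GradientTape` records operations on a variable and computes a gradient in TensorFlow 2.0:

```python
import tensorflow as tf  # assumes a TensorFlow 2.x installation

# Differentiate f(x) = x^2 + 3x at x = 2.
x = tf.Variable(2.0)

with tf.GradientTape() as tape:
    # Operations on tf.Variable objects are recorded on the tape.
    y = x ** 2 + 3.0 * x

# Reverse-mode differentiation of y with respect to x: f'(x) = 2x + 3.
dy_dx = tape.gradient(y, x)
print(float(dy_dx))  # 7.0
```

The same mechanism scales to the gradients of a full neural network loss with respect to all its weights, which is what the lab will build on.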
Students are invited to bring their own laptop for the lab sessions. In order to have a working Python installation with all prerequisites, you can install the Anaconda distribution from the web (Python 3.7 version).
We will use the TensorFlow 2.0 release candidate in the course, which you can install from the Anaconda prompt simply as follows (more information is available from the installation link, especially for the alternative GPU-enabled installation):
$ pip install tensorflow==2.0.0-rc0
Alternatively, you can run all notebooks freely using the Google Colaboratory service (which you can access with a standard Gmail account or the uniroma1.it account).
The main reference book for the course is Deep Learning (MIT Press, 2016), which you can view for free in HTML form or buy by following any of the links on the website. A more up-to-date reference, available for free as a set of Jupyter notebooks, is Dive into Deep Learning, which is however based on the MXNet deep learning framework.
Interesting additional links
- On the idea of differentiable programming, I wrote a small article on Medium. Interesting work in DP is now being done outside the Python ecosystem, e.g., see this blog post in the Flux blog (a Julia framework) or the manifesto of Swift for TensorFlow, arguing for automatic differentiation as a first-class construct in the programming language. Another interesting perspective is the Software 2.0 blog post by Andrej Karpathy.
- To learn more about reverse-mode AD, you can check this blog post with a working example in Rust, or a two-part tutorial I wrote (in Italian) with a toy implementation in Python (part 1 and part 2).
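To give a flavor of what those tutorials build, here is a self-contained toy sketch of reverse-mode AD in Python (the `Var` class and its methods are illustrative, not taken from any of the linked posts): each node stores its children together with the local partial derivatives, and `backward()` propagates adjoints through the graph via the chain rule.

```python
class Var:
    """A toy scalar node for reverse-mode automatic differentiation."""

    def __init__(self, value, children=()):
        self.value = value
        self.children = children  # pairs of (child node, local gradient)
        self.adjoint = 0.0        # accumulated d(output)/d(this node)

    def __add__(self, other):
        # d(a+b)/da = 1, d(a+b)/db = 1
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        # d(a*b)/da = b, d(a*b)/db = a
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def backward(self, adjoint=1.0):
        # Accumulate the incoming adjoint, then push it to each child
        # weighted by the local partial derivative (chain rule).
        self.adjoint += adjoint
        for child, local_grad in self.children:
            child.backward(adjoint * local_grad)


x = Var(2.0)
y = Var(3.0)
z = x * y + x       # z = xy + x, so dz/dx = y + 1 = 4, dz/dy = x = 2
z.backward()
print(x.adjoint, y.adjoint)  # 4.0 2.0
```

This naive recursive traversal re-visits shared nodes once per path, which is exponentially wasteful on large graphs; real implementations (including the one behind `tf.GradientTape`) instead propagate adjoints in a single reverse topological pass.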
- Practically anything coming from Distill.pub is worth reading and exploring if you are into deep learning.
- More generally, the old blog and the new blog of Andrej Karpathy have several interesting pieces, including this full ‘recipe’ for training neural networks and why you should understand backpropagation.
- Another interesting blog you can check out is that of Lilian Weng (link), currently a research scientist at OpenAI.
- Baydin, A.G. et al., Automatic differentiation in machine learning: a survey, JMLR, 2018 [absolute best introduction to AD].
- Bottou, et al., Optimization methods for large-scale machine learning. SIAM Review, 2018 [very good starting point for optimization in ML].
- Schmidhuber, J., Deep learning in neural networks: An overview. Neural Networks, 2015 [if you are into the history of DL].