Module 5: consolidations and electives
Electives
Python: Python for programmers - Data science with Python and Pandas - Data analytics with NumPy and Pandas - Deeper dive into Python - Statistics with Python - Scalable analytics with Python (DASK) - Image processing with Python - Machine Learning with Python
Machine learning: Machine Learning with Python - Machine learning with Dataiku - Mathematical methods for data science and machine learning - Machine learning review - Deep networks - Deep learning: a practical approach in MATLAB
Other: Introduction to MATLAB - LaTeX: introduction and further typesetting techniques - Working with containers - Advanced HPC topics
Python for programmers
- slides (html, unformatted pdf, GitHub repository used during the course)
- official tutorial
- official documentation
- other references:
- old courses from “Site du zéro”: Apprenez à programmer en Python, Pratiques avancées et méconnues en Python, Utilisation avancée des listes en Python, Le pattern Decorator en Python, La programmation scientifique avec Python
- A whirlwind tour of python by Jake VanderPlas (pdf, jupyter notebooks, with plotting notebook by Kinga Sipos)
- Tutorials from Zeste de savoir: Apprendre à programmer avec Python 3, Notions de Python avancées, La programmation orientée objet en Python, Les slices en Python, Variables, scopes et closures en Python
Data science with Python and Pandas
- GitHub repository of the course
- pdf or html of all notebooks
- notebooks:
- Pandas introduction (pdf, html)
- Pandas objects (pdf, html)
- importing excel files (pdf, html)
- operations with Pandas objects (pdf, html)
- combining information in Pandas (pdf, html)
- splitting data (pdf, html)
- advanced plotting (pdf, html)
- insight into Machine Learning (pdf, html)
- exercises (pdf, html)
- solutions to exercises (pdf, html)
Data analytics with NumPy and Pandas
- GitHub repository of the course
- pdf or html of all notebooks
- notebooks:
- Numpy array creation (pdf, html)
- Numpy array and maths (pdf, html)
- Numpy and matplotlib (pdf, html)
- Numpy indexing (pdf, html)
- Numpy combining arrays (pdf, html)
- Pandas introduction (pdf, html)
- Pandas structures (pdf, html)
- Pandas import plotting (pdf, html)
- Pandas operation (pdf, html)
- Pandas combine (pdf, html)
- Pandas splitting (pdf, html)
- Pandas real-world (pdf, html)
- Exercises
Deeper dive into Python
- Beyond notebooks
- Python (Jupyter) notebooks - What are notebooks good at? - What are notebooks bad at? - What is needed to run Python code? - How to run python code? - Setting up VSC Live Share
- Objects and scope
- Everything is an object - Object identity - The mystery of small integers - Object type - Object value - Mutable vs. immutable objects - Accessing objects
- Scopes (local, non-local, global, builtin) - Scope gotchas
- What can you do with objects: attributes (or properties) - method - standard functions and operations
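The points on identity, mutability and scope can be illustrated with a minimal sketch. All names below (`a`, `b`, `shadow`, `make_counter`, ...) are hypothetical and chosen for illustration only; they are not taken from the course material.

```python
# Identity vs. equality: two names can refer to the same object.
a = [1, 2, 3]
b = a            # b is another name for the same list object
c = list(a)      # c is a new list with equal contents

same_object = a is b                         # identical objects
equal_not_same = (a == c) and (a is not c)   # equal values, different objects

# Lists are mutable: a change made through one name is visible through the other.
b.append(4)      # a is now [1, 2, 3, 4]; c is still [1, 2, 3]

# Tuples are immutable: "+=" builds a new tuple and rebinds the name.
t = (1, 2)
t_alias = t
t += (3,)        # t is now (1, 2, 3); t_alias still refers to (1, 2)

# Small integers: CPython typically caches them, so this is usually True there,
# but the language does not guarantee it (never rely on "is" for numbers).
small_cached = int("100") is int("100")

# Scope gotcha: assigning inside a function creates a new local name.
level = "global"

def shadow():
    level = "local"   # the global name "level" is untouched
    return level

# nonlocal rebinds a name of the enclosing function instead of creating a local.
def make_counter():
    count = 0
    def increment():
        nonlocal count
        count += 1
        return count
    return increment

increment = make_counter()
increment()
increment()      # the enclosed count is now 2
```

The `is` operator compares identity (what `id()` reports), whereas `==` compares values; mixing the two up is a common source of bugs with mutable objects.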
- Classes (doc)
- What are classes? (instance) - Class syntax - Simplest class - Non-declared goods
- __dict__ property to store instance properties; __slots__ to enforce a specific set of properties
- Adding a method - Aside: f-strings (Python >= 3.6) - Methods of a class (the 1st argument automatically receives a reference to the instance itself)
- self to store data - hasattr(object, name) to check for attributes - __init__ to always have data and meaningful constructor
- Aside: “dunder” or “magic” methods (fixed names through which objects interface with specific language features; surrounded by double underscores; do not create your own new names of this format!)
- __str__(self) is used whenever print is called on the object
- Class variables (property shared by all instances)
- Parameter defaults gotchas - Class variables gotchas
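The two gotchas named above are often illustrated as follows. This is a hedged sketch: the functions and classes (`append_bad`, `append_good`, `Basket`, `SafeBasket`) are hypothetical examples, not the ones used in the course.

```python
def append_bad(item, bucket=[]):
    # Gotcha: the default list is created once, when the function is defined,
    # and is then shared by every call that omits "bucket".
    bucket.append(item)
    return bucket

def append_good(item, bucket=None):
    # Common idiom: use None as a sentinel and create a fresh list per call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

class Basket:
    items = []                     # class variable: one list shared by all instances

    def add(self, item):
        self.items.append(item)    # mutates the shared list

class SafeBasket:
    def __init__(self):
        self.items = []            # instance variable: one list per instance

    def add(self, item):
        self.items.append(item)

first = append_bad(1)
second = append_bad(2)             # same list object as "first"

b1, b2 = Basket(), Basket()
b1.add("apple")                    # also visible through b2.items

s1, s2 = SafeBasket(), SafeBasket()
s1.add("apple")                    # s2.items stays empty
```

In both cases the surprise comes from a single mutable object being shared where a fresh one per call or per instance was intended.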
- Usage counters - Example: inventory system - Container class - Implementation details
- Private attributes (convention: _)
- Implementing the container - Implementing an iterator - Implementing other things
- Class inheritance (super() to access the base class method if overridden)
- Multiple inheritance
- Aside: getters (functions returning a value as a property; @property) and setters (function that receives the value when a property is written to; @<funcname>.setter)
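The class features listed in this part (constructor, `__str__`, `__slots__`, private attributes by convention, a container with an iterator, inheritance with `super()`, and a property with a setter) can be combined in one small sketch. It loosely follows the inventory theme mentioned above, but the classes `Item`, `PerishableItem` and `Inventory` are assumptions made for illustration, not the course's implementation.

```python
class Item:
    """A hypothetical inventory item."""
    __slots__ = ("name", "_quantity")    # only these instance attributes are allowed

    def __init__(self, name, quantity=0):
        self.name = name
        self.quantity = quantity         # goes through the property setter below

    @property
    def quantity(self):                  # getter: read like a plain attribute
        return self._quantity

    @quantity.setter
    def quantity(self, value):           # setter: called on "item.quantity = ..."
        if value < 0:
            raise ValueError("quantity cannot be negative")
        self._quantity = value

    def __str__(self):                   # used by print() and str()
        return f"{self.name} x{self.quantity}"


class PerishableItem(Item):
    __slots__ = ("days_left",)

    def __init__(self, name, quantity=0, days_left=7):
        super().__init__(name, quantity)     # reuse the base class constructor
        self.days_left = days_left

    def __str__(self):
        return f"{super().__str__()} ({self.days_left} days left)"


class Inventory:
    """A minimal container: supports len() and iteration."""

    def __init__(self):
        self._items = []                 # leading underscore: private by convention

    def add(self, item):
        self._items.append(item)

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)


inventory = Inventory()
inventory.add(Item("bolt", 3))
inventory.add(PerishableItem("milk", 2, days_left=5))
labels = [str(item) for item in inventory]
```

Because `Item` declares `__slots__`, assigning an undeclared attribute such as `item.color` raises `AttributeError`; without `__slots__` the attribute would silently be stored in the instance's `__dict__`.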
- Functional programming
- “Pure” functions: do not depend on external or internal state; no “side effects”
- Functional programming and immutability (best way: use immutable data structures)
- Aside: shallow and deep copies (<object>.copy() makes a shallow copy; for objects containing potentially mutable objects, a recursive “deep” copy is needed: deepcopy(<obj>) from the copy module)
- “Higher-order” functions: functions that accept functions as parameters (in Python, functions are treated as values)
- Standard primitives for functional programming: some built-in, some in the functools module (e.g. reduce)
- Generators and lazy evaluation (e.g. map returns a lazy iterator, i.e. an iterable that is evaluated one value at a time when iterating, an example of lazy evaluation)
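A short sketch of the functional-programming ideas listed above: a pure function, a higher-order function, `reduce`, lazy evaluation with `map`, and shallow versus deep copies. The function names and data (`add_tax`, `apply_to_all`, `prices`, `nested`) are hypothetical.

```python
from copy import deepcopy
from functools import reduce

def add_tax(price, rate):
    """Pure function: the result depends only on the arguments, no side effects."""
    return round(price * (1 + rate), 2)

def apply_to_all(func, values):
    """Higher-order function: receives another function as a parameter."""
    return [func(v) for v in values]

prices = [10.0, 20.0, 30.0]
with_tax = apply_to_all(lambda p: add_tax(p, 0.2), prices)   # prices is not modified
total = reduce(lambda acc, p: acc + p, prices, 0.0)          # folds the list into one value

# Lazy evaluation: map does no work until values are requested.
lazy = map(lambda p: p * 2, prices)
first = next(lazy)          # computes only the first value
rest = list(lazy)           # computes the remaining values

# Shallow vs. deep copy of a structure that contains a mutable object.
nested = {"tags": ["a", "b"]}
shallow = nested.copy()     # new dict, but the inner list is shared
deep = deepcopy(nested)     # inner list is copied as well
nested["tags"].append("c")  # visible through "shallow", not through "deep"
```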
- Decorators
- e.g. @mydec on the line before the definition of myfunc is equivalent to myfunc = mydec(myfunc)
- Dealing with arguments and return values - Decorator parameters
- A practical example
- Stacking decorators (the function is wrapped in the inverse order of decorators)
- Decorators and function identity (after being wrapped by a decorator, the function’s __name__ becomes that of the wrapper; if this is not desirable, use wraps from the functools module)
- Tracing recursive calls
- Memoization (optimize the code by caching intermediate results)
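The decorator topics above (arguments and return values, decorator parameters, stacking order, `functools.wraps`, memoization) can be sketched as follows. The decorators `tag` and `memoize` and the functions `greet` and `fib` are hypothetical examples written for this outline; for production code, the standard library's `functools.lru_cache` is a ready-made alternative to a hand-written memoizer.

```python
from functools import wraps

def tag(name):
    """Decorator factory: returns a decorator configured with "name"."""
    def decorator(func):
        @wraps(func)                          # keeps func.__name__ and its docstring
        def wrapper(*args, **kwargs):         # forwards any arguments
            return f"<{name}>{func(*args, **kwargs)}</{name}>"
        return wrapper
    return decorator

@tag("b")        # applied second: outermost wrapper
@tag("i")        # applied first: closest to the function
def greet(person):
    return f"Hello {person}"

def memoize(func):
    """Cache results by positional arguments (which must be hashable)."""
    cache = {}

    @wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    wrapper.cache = cache                     # exposed only for inspection
    return wrapper

@memoize
def fib(n):
    # Recursive calls go through the memoized name "fib", so each n is computed once.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

greeting = greet("Ada")
fib_30 = fib(30)
```

Without `@wraps`, `greet.__name__` would be `"wrapper"`, which makes tracebacks and debugging output harder to read.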
- Modules and packages
- What are modules and packages? (modules = mechanism for sharing code across multiple Python source files; packages = a way to organize multiple modules together in a tree-like structure of submodules)
- Importing code (import and star-import; the latter is discouraged as it modifies the global namespace in an unpredictable way)
- Module example
- What happens on import? - What else happens on import? (__pycache__ folder)
- __name__: at runtime, contains the name of the current module; set to __main__ if executed directly
- Where does Python search for modules? sys.path from sys module (directory containing the input script or current directory; PYTHONPATH environment variable; the installation-dependent default), can be modified at runtime
- Packages (organized in subfolder; when published, contain additional metadata)
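A minimal sketch of the import mechanics described above: it writes a throwaway module to a temporary folder, adds that folder to `sys.path` at runtime, and imports the module. The module name `hello_mod` and its contents are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source of a tiny module. The __name__ guard runs only when the file
# is executed directly, not when it is imported.
module_source = '''
def greet():
    return "hello from " + __name__

if __name__ == "__main__":
    print(greet())
'''

with tempfile.TemporaryDirectory() as folder:
    Path(folder, "hello_mod.py").write_text(module_source)

    sys.path.insert(0, folder)          # sys.path can be modified at runtime
    importlib.invalidate_caches()       # make sure the new file is noticed
    try:
        hello_mod = importlib.import_module("hello_mod")   # same as "import hello_mod"
        message = hello_mod.greet()     # inside the module, __name__ is "hello_mod"
    finally:
        sys.path.remove(folder)         # leave the search path as we found it
```

On import, Python executes the module's top-level code once and stores the resulting module object in `sys.modules`, so later imports of the same name reuse it.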
- Python environments
- a specific Python version plus a collection of extra packages
- What problem do environments solve? (working with different package versions) - What’s needed for a virtual environment? (a package manager to install packages in the environment, e.g. pip; a virtual environment manager, e.g. virtualenv)
- Virtual environment workflow: creating, activating, installing dependencies, running code - Virtual environment example
- Tricky parts of Python package management (non-python code and libraries; pip supports a binary format called “wheels”)
- What about (Ana)conda? (conda = package / virtual environment manager part of the Anaconda distribution; can manage Python packages; has more support for binary packages)
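As a hedged illustration of the workflow above, the sketch below checks from Python whether the interpreter is running inside a virtual environment. It assumes an environment created with the standard-library `venv` module (or a recent virtualenv), which is a substitute for whichever tool the course used; the shell commands in the comments are one typical sequence on Linux/macOS, and the function name `in_virtual_environment` is hypothetical.

```python
import sys

# Typical shell workflow (assumption: standard-library venv, POSIX shell):
#   python -m venv .venv              # create the environment
#   source .venv/bin/activate         # activate it
#   python -m pip install <package>   # install dependencies into it
#   python my_script.py               # run code with the environment's interpreter

def in_virtual_environment():
    """Return True if this interpreter runs inside a virtual environment.

    In a venv, sys.prefix points to the environment folder, while
    sys.base_prefix points to the Python installation it was created from.
    """
    return sys.prefix != sys.base_prefix

active = in_virtual_environment()
```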
Statistics with Python
Scalable analytics with Python (DASK)
Image processing with Python
- GitHub repository of the course
- pdf or html of all notebooks
- notebooks:
- introduction (pdf, html)
- Numpy with images (pdf, html)
- image import/export (pdf, html)
- basic image processing: filtering, scaling, thresholding (pdf, html)
- binary operations, regions (pdf, html)
- applications: Satellite image (pdf, html)
- functions (pdf, html)
- pattern matching, local maxima (pdf, html)
- watershed algorithm (pdf, html)
- 3D case (pdf, html)
- create a short complete analysis (pdf, html)
- image registration (pdf, html)
- pixel classification (pdf, html)
- image classification by machine learning: Optical text recognition (pdf, html)
- deep learning (pdf, html)
- image classification using deep learning (pdf, html)
- semantic segmentation: GitHub resources (pdf, html)
- application: DICOM (pdf, html)
Machine Learning with Python
Machine learning with Dataiku
- introduction to Dataiku (slides)
- data preparation and machine learning models (slides)
- image classification model (deep learning) (slides)
- how to get a certificate (slides)
- personal reading notes from internet about Dataiku concepts (v1, v2)
Mathematical methods for data science and machine learning
- GitHub repository of the course
- pdf of all notebooks
- notebooks:
- introduction (pdf, html) / day 1 (pdf, html)
- linear algebra (pdf, html)
- vector operations - matrix operations - projection and the dot product - orthogonal matrices - change of basis - eigenvalues and eigenvectors of matrices
- calculus (pdf, html)
- differentiation of univariate functions - rules of differentiation - differentiation of multivariate functions (the Jacobian, the Hessian) - chain rule for univariate and multivariate functions - the Taylor approximation - the Newton-Raphson method - gradient descent method - backpropagation
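Two of the methods listed under calculus, Newton-Raphson and gradient descent, can be sketched in a few lines of plain Python. The function names, tolerances and test functions below are assumptions chosen for illustration and are not taken from the course notebooks.

```python
def newton_raphson(f, df, x0, tol=1e-10, max_iter=50):
    """Find a root of f by iterating x <- x - f(x) / f'(x)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)      # assumes df(x) != 0 along the way
        x -= step
        if abs(step) < tol:      # stop once the update is negligible
            break
    return x

def gradient_descent(grad, x0, learning_rate=0.1, n_steps=200):
    """Minimise a function by repeatedly stepping against its gradient."""
    x = list(x0)
    for _ in range(n_steps):
        g = grad(x)
        x = [xi - learning_rate * gi for xi, gi in zip(x, g)]
    return x

# Root of f(x) = x^2 - 2, i.e. the square root of 2.
root = newton_raphson(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0)

# Minimum of f(x, y) = (x - 3)^2 + (y + 1)^2, whose gradient is (2(x - 3), 2(y + 1)).
minimum = gradient_descent(lambda p: [2 * (p[0] - 3), 2 * (p[1] + 1)], [0.0, 0.0])
```

Newton-Raphson uses the derivative to jump towards a root and converges very quickly near it, while gradient descent only needs the gradient and a step size, which is why it scales to the many-parameter problems met in machine learning.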
- other resources:
Machine learning review
- folder with all notebooks
- pdf of all notebooks
- day 1
- day 2
- slides:
- regression (linear, polynomial; ridge, LASSO, elastic net; performance evaluation) - classification (logistic regression; naïve Bayes; k-nearest neighbors; performance evaluation) - Support Vector Machines (SVM; regression/classification) (SVC; SVR) - ensemble methods (regression/classification) (decision trees; random forests; bagging, boosting)
- notebook (pdf, html)
- chapter 2: end-to-end machine learning project
- day 3
- slides
- clustering (k-means; hierarchical clustering) - density estimation (Gaussian Mixture Model (GMM)) - sequence prediction (HMM; RNN; LSTM) - feature extraction (PCA; kernel PCA; manifold learning)
- part 1: notebook (pdf, html)
- Chapter 8: dimensionality reduction
- part 2: notebook (pdf, html)
- Chapter 9: unsupervised learning
- day 4
- slides
- neural networks - training the NN - activation functions - loss functions - faster optimizers than gradient descent (momentum optimization; RMSprop; adaptive moment estimation (Adam)) - NN as an alternative to other ML algorithms
- notebook (pdf, html)
- Chapter 10: introduction to artificial NN with Keras
Deep networks
- folder with all notebooks
- pdf of all notebooks
- day 1: deep forward networks
- slides
- What are Deep Forward Networks? - regularization for deep learning - training and optimization for deep models
- notebook (pdf, html)
- Chapter 11: training deep NN
- day 2: convolutional neural networks
- slides
- CNN components - most important architectures - object detection - face detection
- notebook - part 1 (pdf, html)
- Chapter 14: deep computer vision using convolutional NN
- notebook - part 2 (pdf, html)
- day 3: recurrent NN
- slides
- recurrent NN components - training RNNs - optimization techniques - examples - Natural Language Processing (NLP)
- notebook - part 1 (pdf, html)
- text generation with an RNN
- notebook - part 2 (pdf, html)
- day 4: state-of-the-art machine learning
- summary video: pdf, pptx
- BERT Applications - AdaNet: AutoML with Ensembles - AutoAugment - Synthetic Data - Polygon-RNN++ - DAWNBench - Video-to-video synthesis - Semantic segmentation - Symmetric semantic segmentation - AlphaGo - AlphaZero - OpenAI vs. Humans on Dota 2 - Deep Learning Frameworks
- surveys: 1, 2, 3
Introduction to MATLAB
Deep learning: a practical approach in MATLAB
- slides
- course material: deep learning in 6 lines, pretrained model exercise, MNIST exercise, transfer learning exercise, GPU coder, ECG signal classification, improving network accuracy, speech command recognition, reinforcement learning
LaTeX: introduction and further typesetting techniques
Working with containers
Advanced HPC topics
Questions, comments or suggestions? Don’t hesitate to contact me!