Syllabus

Course Number 0368-4238-01

Course Name Advanced Topics in Machine Learning for Computational Biology

Academic Unit The Raymond and Beverly Sackler Faculty of Exact Sciences - Computer Science

Lecturer Dr. Jerome Tubiana

Contact Email: jeromet@tauex.tau.ac.il

Office Hours By appointment

Mode of Instruction Lecture

Credit Hours 3

Semester 2024/2

Day Sun

Hours 15:00-18:00

Building Ornstein - Chemistry

Room 102

Course is taught in English

Short Course Description

Data-driven science is a ubiquitous paradigm in modern biology: heaps of data are collected on a biological system of interest, from which one must discover qualitative scientific insights or build accurate quantitative models.

This course is an overview of advanced machine learning algorithms commonly used in modern computational biology research. The goals are threefold: i) learn the mathematical principles underlying these algorithms; ii) learn how and when to use them for scientific purposes; iii) understand their limitations. The algorithms will be illustrated on various biological systems (brain recordings, single-cell data, protein sequences, molecules, etc.). The tentative syllabus below is subject to change.

Topic 1: Linear models and extensions (GLM, GAM, LASSO) for Tabular Data.

Topic 2: Decision trees and extensions (Decision rules; boosted trees) for Tabular Data.

Topic 3: Interpretable Machine Learning with Model-agnostic Explanations.
a) Partial Dependence and Accumulated Local Effects plots.
b) Permutation Feature Importance.
c) LIME and SHapley Additive exPlanations (SHAP).

Topic 4: Deep Learning Architectures for Unstructured Data.
a) Convolutional Neural Networks.
b) Graph Neural Networks.
c) Transformers.
d) Saliency maps.

Topic 5: Data visualization with low-dimensional embeddings (PCA, t-SNE, UMAP).

Topic 6: Meaningful feature extraction with Matrix Factorization Algorithms (K-means; Non-negative Matrix Factorization; Sparse PCA; Sparse Dictionary Learning).

Topic 7: Deep Generative Models.
a) Autoregressive Generative Models.
b) Variational Inference & Variational Autoencoders.
c) Denoising Diffusion Generative Models.

Topic 8: Developing and Troubleshooting Deep Learning models.

Language: The course will be given in English.
Prerequisites: Introduction to Machine Learning (0368-3235) or Introduction to Statistical Learning (0365.3130) or another equivalent course. No prior knowledge of biology is required.
Evaluation: Grading will be based on home assignments (theoretical and case studies on biological data, 40%), an oral presentation of a research article (50%), and a written report on another article (10%).

Optional Reading Material:
a) An Introduction to Statistical Learning by James, Witten, Hastie, Tibshirani & Taylor. https://www.statlearning.com
b) Interpretable Machine Learning by Molnar. https://christophm.github.io/interpretable-ml-book/
c) Understanding Deep Learning by Prince. https://udlbook.github.io/udlbook/

Full Syllabus

Course Requirements

Students may be required to submit additional assignments
Full requirements as stated in full syllabus

Prerequisite Introduction to Machine L (03683235)

The specific prerequisites of the course, according to the study program, appear on the program page of the handbook