Machine Learning and Data Science

This page contains links to some of my projects that showcase my interests and skills in machine learning and bioinformatics. The projects are grouped into:

Research Projects
Topics in Machine Learning
Topics in Bioinformatics

Research Projects

The following projects cover some of the coding work behind my research.

Antimicrobial Resistance Prediction

This work uses multimodal learning to combine clinical proteomics (MALDI-TOF) with chemical features in order to improve the prediction of antimicrobial resistance.

GitHub

Personalized Epigenomic Imputation

The research project on large-scale imputation of reference human epigenomes culminated in the application of the eDICE model to impute individual-specific epigenetic patterns, a case study which is one of the first applications of deep learning to personalized epigenomics.

GitHub

Network Propagation for GWAS Analysis

The use of information diffusion methodologies enhances the statistical power of GWA studies by including domain knowledge in the form of molecular networks.

GitHub
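
As a rough illustration of the idea behind network propagation, the sketch below diffuses per-gene scores over a small network using a random walk with restart. This is one common diffusion scheme and not necessarily the one used in the repository; the gene names, the toy network and the `alpha` parameter are made up for the example.

```python
def propagate(adjacency, scores, alpha=0.5, tol=1e-9, max_iter=1000):
    """Random walk with restart on an undirected network.

    adjacency: dict mapping each node to a list of its neighbours.
    scores: dict mapping each node to its initial (e.g. association) score.
    alpha: fraction of the score that diffuses at each step; the rest
           restarts from the initial scores.
    """
    degree = {node: len(neighbours) for node, neighbours in adjacency.items()}
    current = dict(scores)
    for _ in range(max_iter):
        updated = {}
        for node, neighbours in adjacency.items():
            # Each neighbour splits its score evenly among its own neighbours.
            incoming = sum(current[v] / degree[v] for v in neighbours)
            updated[node] = alpha * incoming + (1 - alpha) * scores[node]
        delta = max(abs(updated[n] - current[n]) for n in adjacency)
        current = updated
        if delta < tol:
            break
    return current


# Hypothetical toy network: a chain g1 - g2 - g3 - g4 where only g1 has signal.
network = {"g1": ["g2"], "g2": ["g1", "g3"], "g3": ["g2", "g4"], "g4": ["g3"]}
initial = {"g1": 1.0, "g2": 0.0, "g3": 0.0, "g4": 0.0}
smoothed = propagate(network, initial)
for gene in network:
    print(gene, round(smoothed[gene], 3))
```

Genes close to g1 in the network pick up part of its score, which is how prior network knowledge can lift weak signals that sit next to strong ones.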

Representation Learning of Cancer Somatic Mutations Profiles

The repository contains the results of some research work that aimed to learn meaningful representations for individual profiles of cancer somatic mutations.

GitHub

Topics in Machine Learning

The following projects span a wide variety of machine learning topics and are available on GitHub as Jupyter notebooks.

Hierarchical Clustering of Features

This notebook explores the use of seriation to find an ordering of the features of a dataset that highlights a block structure in the correlation matrix (blockmodeling). The approach shown here is based on the Pearson correlation coefficient, but it can serve as a basis for other correlation measures (e.g. distance correlation), or simply be used to reorder a distance matrix.
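
A minimal sketch of the idea, assuming average-linkage agglomerative clustering on the distance 1 - r and reading the feature order off the merge sequence. The notebook may use a different linkage or a library routine (SciPy's hierarchical clustering would normally replace the hand-written loop); the synthetic data and function names below are illustrative only.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def seriate(corr):
    """Order features by average-linkage clustering on the distance 1 - r."""
    clusters = [[i] for i in range(len(corr))]

    def dist(a, b):
        # Average pairwise distance between the members of two clusters.
        return sum(1 - corr[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        pairs = [(dist(clusters[a], clusters[b]), a, b)
                 for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        _, a, b = min(pairs)
        # Concatenating the merged clusters keeps their members adjacent.
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0]


# Synthetic data: even-numbered features follow z1, odd-numbered ones follow z2.
random.seed(0)
n_samples = 200
z1 = [random.gauss(0, 1) for _ in range(n_samples)]
z2 = [random.gauss(0, 1) for _ in range(n_samples)]
features = [[(z1 if j % 2 == 0 else z2)[k] + random.gauss(0, 0.1)
             for k in range(n_samples)] for j in range(6)]

corr = [[pearson(a, b) for b in features] for a in features]
order = seriate(corr)
reordered = [[corr[i][j] for j in order] for i in order]
print(order)
```

In `reordered`, the two groups of correlated features sit next to each other, so the correlation matrix shows two diagonal blocks instead of a checkerboard.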

Dimensionality Reduction

This notebook presents the background on the curse of dimensionality and explores a few dimensionality reduction techniques and their inner workings.
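
One facet of the curse of dimensionality can be sketched in a few lines: as the number of dimensions grows, the distances from a query point to random points become nearly identical, so "nearest" and "farthest" lose meaning. The example below is an illustration under simple assumptions (uniform points in the unit hypercube, Euclidean distance) and is not taken from the notebook.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """(max - min) / min of the distances from a random query to random points."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        distances.append(math.dist(query, point))
    return (max(distances) - min(distances)) / min(distances)


for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 2))
```

The printed contrast shrinks as `dim` increases, which is one motivation for reducing dimensionality before running distance-based methods.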

Variational Inference and Variational Autoencoders

Variational Autoencoders (VAEs) are among the most prominent generative models in the machine learning literature. This notebook explores and simulates the core mechanism behind VAEs, variational inference, and implements a VAE example.
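
A small sketch of two ingredients of variational inference as used in VAEs: the KL divergence between a Gaussian approximate posterior and a standard normal prior, and the reparameterization z = mu + sigma * eps used to sample from that posterior. It assumes a one-dimensional latent variable and compares the closed-form KL with a Monte Carlo estimate; the function names are illustrative and not the notebook's.

```python
import math
import random


def kl_closed_form(mu, sigma):
    """KL( N(mu, sigma^2) || N(0, 1) ) in closed form."""
    return 0.5 * (mu ** 2 + sigma ** 2 - 1.0 - math.log(sigma ** 2))


def log_normal_pdf(z, mu, sigma):
    """Log-density of N(mu, sigma^2) evaluated at z."""
    return (-0.5 * math.log(2 * math.pi) - math.log(sigma)
            - 0.5 * ((z - mu) / sigma) ** 2)


def kl_monte_carlo(mu, sigma, n_samples=20000, seed=0):
    """Estimate the same KL by sampling z with the reparameterization trick."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)
        z = mu + sigma * eps  # sample from q as a deterministic function of eps
        total += log_normal_pdf(z, mu, sigma) - log_normal_pdf(z, 0.0, 1.0)
    return total / n_samples


print(kl_closed_form(1.0, 0.5), kl_monte_carlo(1.0, 0.5))
```

In a VAE this KL term is combined with a reconstruction term to form the training objective, and writing z as a function of `eps` is what lets gradients flow through the sampling step.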

Topics in Bioinformatics

The following projects cover topics in bioinformatics and are available on GitHub as Jupyter notebooks.

Compartmental Models for Infectious Diseases

One of the simplest classes of mathematical models that describe the spread of infectious diseases is that of compartmental models. This notebook presents the basic concepts and some simulations to highlight their inner workings.
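
As an illustration, the sketch below simulates the classic SIR (susceptible, infected, recovered) model with a simple forward Euler scheme. The notebook may cover other compartmental models or use a proper ODE solver; the parameter values here are arbitrary and chosen only to produce a visible epidemic curve.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate the SIR equations with forward Euler steps of size dt.

    beta: transmission rate, gamma: recovery rate (both per day).
    Returns a list of (time, S, I, R) tuples.
    """
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r  # total population, constant in this model
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((k * dt, s, i, r))
    return history


# Illustrative parameters: basic reproduction number beta / gamma = 3.
history = simulate_sir(beta=0.3, gamma=0.1, s0=990, i0=10, r0=0, days=160)
peak_time, _, peak_infected, _ = max(history, key=lambda row: row[2])
print("peak of", round(peak_infected), "infected on day", round(peak_time))
```

With beta / gamma above 1 the infected compartment first grows and then declines as susceptibles are depleted; with a ratio below 1 the outbreak dies out without a peak.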