About Me

I am a bioinformatician working to design and test broadly protective vaccines. To do this, I apply machine learning to public sequence databases to generate insights that carry over to specific problems with smaller datasets. Using medical statistics, I analyse pre-clinical data to identify candidates to take forward to clinical trial.

Here I post useful snippets of code, demonstrations, and explorations of statistical techniques. If you would like to contact me, please get in touch through LinkedIn.


Posts

Confusion matrix

When reporting binary classification models, accuracy alone doesn't give the full picture. You also need sensitivity & specificity, or precision & recall.
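As a minimal sketch of why accuracy can mislead, the snippet below computes the four confusion matrix counts and the derived metrics in plain Python. The function names (`confusion_counts`, `classification_summary`) and the toy labels are made up for illustration and are not taken from the post itself.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def safe_ratio(numerator, denominator):
    """Return NaN instead of raising when a metric is undefined."""
    return numerator / denominator if denominator else float("nan")


def classification_summary(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": safe_ratio(tp + tn, tp + fp + fn + tn),
        "sensitivity": safe_ratio(tp, tp + fn),  # also called recall
        "specificity": safe_ratio(tn, tn + fp),
        "precision": safe_ratio(tp, tp + fp),
    }


# Toy, imbalanced example: 5 positives, 95 negatives,
# and a model that only finds one of the positives.
y_true = [1] * 5 + [0] * 95
y_pred = [1] * 1 + [0] * 4 + [0] * 95
summary = classification_summary(y_true, y_pred)
print(summary)
```

In this toy case the accuracy is 0.96 while the sensitivity is only 0.2, so the model looks good by accuracy alone despite missing four of the five positives.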

Ordinal responses

Ordinal GLM

When a response variable consists of ordered categories, you can predict all categories using a single proportional odds model instead of separate logistic regression models.
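To illustrate the idea, here is a small sketch of how one proportional odds model yields a probability for every category. It assumes one common parameterisation, logit P(Y <= k) = cutpoint_k - x·beta, and the coefficient and cutpoint values are invented for the example rather than fitted to any data.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(x, beta, cutpoints):
    """Category probabilities under a proportional odds model.

    x         : list of predictor values
    beta      : one shared coefficient per predictor (hypothetical values)
    cutpoints : increasing thresholds, one fewer than the number of categories
    """
    eta = sum(b * xi for b, xi in zip(beta, x))
    # Cumulative probabilities P(Y <= k); the last category always reaches 1.
    cumulative = [logistic(c - eta) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = k) = P(Y <= k) - P(Y <= k-1)
        previous = c
    return probs


beta = [0.8]             # assumed effect of a single predictor
cutpoints = [-1.0, 1.0]  # assumed thresholds for three ordered categories

probs_low = category_probabilities([0.0], beta, cutpoints)
probs_high = category_probabilities([2.0], beta, cutpoints)
print(probs_low)
print(probs_high)
```

The same `beta` is shared across every threshold, which is what lets a single model replace a separate logistic regression per category. Increasing the predictor moves probability mass towards the higher categories.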

Phylogenetic relatedness

Phylogenetic mixed modelling

Simulate data for a phylogenetic mixed model using a phylogenetic tree to impose the phylogenetic covariance structure.

Log likelihood difference

Gaussian mixture

Unsupervised classification based on bivariate data using a Gaussian mixture model.

Power & resolution

When binning a continuous variable, e.g. into high and low or positive and negative, you lose information and statistical power.
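A quick simulation can make this loss visible. The sketch below is only illustrative: the effect size, sample size, and seed are arbitrary assumptions. It compares the correlation between an outcome and a continuous predictor with the correlation after that predictor is split at its median.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient, written out for clarity."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


random.seed(1)  # arbitrary seed so the example is reproducible
n = 2000        # arbitrary sample size

# Simulated data: the outcome depends linearly on a continuous predictor.
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [0.5 * xi + random.gauss(0.0, 1.0) for xi in x]

# Bin the predictor into "low" (0) and "high" (1) at the median.
median_x = statistics.median(x)
x_binned = [1.0 if xi > median_x else 0.0 for xi in x]

r_continuous = pearson(x, y)
r_binned = pearson(x_binned, y)
print(r_continuous, r_binned)
```

The binned predictor shows a weaker association with the outcome than the continuous one, so a larger sample would be needed to detect the same underlying effect.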

Correlation Matrix

Additive mixed modelling

Fitting an additive mixed model to account for both long- and short-term effects of time while fitting an autocorrelation structure to the residuals.

Time series data

Gaussian process

Fitting a Gaussian process to time series data. Using this approach, domain knowledge can be used to choose an appropriate covariance function.
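As a rough sketch of what choosing a covariance function means in practice, the snippet below defines two standard kernels, a squared exponential for smooth trends and a periodic kernel for seasonal structure. The hyperparameter values, including the period of 12 time units, are assumptions made for the example; it builds the covariance matrices only and does not fit a model.

```python
import math


def rbf_kernel(t1, t2, variance=1.0, length_scale=1.0):
    """Squared exponential kernel: nearby time points are strongly correlated."""
    d = abs(t1 - t2)
    return variance * math.exp(-d ** 2 / (2.0 * length_scale ** 2))


def periodic_kernel(t1, t2, variance=1.0, length_scale=1.0, period=12.0):
    """Periodic kernel: points a whole number of periods apart are strongly correlated."""
    d = abs(t1 - t2)
    return variance * math.exp(
        -2.0 * math.sin(math.pi * d / period) ** 2 / length_scale ** 2
    )


def covariance_matrix(times, kernel):
    """Evaluate the kernel on every pair of time points."""
    return [[kernel(a, b) for b in times] for a in times]


times = [0.0, 1.0, 6.0, 12.0]  # e.g. months, assuming a yearly cycle
K_smooth = covariance_matrix(times, rbf_kernel)
K_seasonal = covariance_matrix(times, periodic_kernel)

# Covariance between the first and last time points, one full period apart.
print(K_smooth[0][3], K_seasonal[0][3])
```

Under the squared exponential kernel, observations twelve units apart are essentially uncorrelated, while the periodic kernel treats them as almost perfectly correlated. If the data are known to be seasonal, that knowledge favours the periodic kernel, or a combination of the two.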

Graphic Cite

Web scraping, analysis, and interactive visualisation of data on scientific publishing.

The Denatured

A pop science outreach website produced by a small team of writers and vloggers which I organise.

Jupyter Slides

Easily turn your code, workflow, and results into a presentation. This is great for sharing results with or without your code.