I am a bioinformatician working to design and test broadly protective vaccines. To do this, I apply machine learning to public sequence databases to generate insights that
can be applied to specific problems with smaller datasets. Using medical statistics, I analyse pre-clinical data to identify candidates to take to clinical trial.
Here I post useful snippets of code, demonstrations, and explorations of statistical techniques.
If you would like to contact me, please get in touch through LinkedIn.
When reporting binary classification models, just looking at accuracy doesn't give the full picture. You need sensitivity & specificity, or precision & recall.
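As a minimal sketch of why accuracy alone can mislead, the snippet below computes the four metrics from a pair of label lists. The function name `classification_metrics` and the toy data are illustrative assumptions, not taken from the post itself.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, specificity and precision for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Undefined ratios (zero denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # also called recall
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }


# Hypothetical imbalanced data: 5 positives, 95 negatives,
# and a "model" that predicts negative for everything.
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this toy case the accuracy is 0.95 even though the sensitivity is 0, because no positive case is ever detected.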
When the response variable consists of ordered categories, you can predict all categories using a single proportional odds model instead of separate logistic regression models.
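The sketch below shows how a proportional odds model turns one linear predictor into probabilities for every category. It assumes the common parameterisation logit P(Y <= k) = cutpoint_k - eta; the function name, the cutpoints, and the value of `eta` are made up for illustration.

```python
import math


def category_probabilities(eta, cutpoints):
    """Category probabilities under a proportional odds model.

    Assumes logit P(Y <= k) = cutpoints[k] - eta, with cutpoints in
    increasing order. K cutpoints define K + 1 ordered categories.
    """
    # Cumulative probabilities P(Y <= k); the last category always reaches 1.
    cumulative = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints] + [1.0]
    # Differences of successive cumulative probabilities give P(Y = k).
    probs = [cumulative[0]]
    for k in range(1, len(cumulative)):
        probs.append(cumulative[k] - cumulative[k - 1])
    return probs


# Hypothetical values: three cutpoints, so four ordered categories.
probs = category_probabilities(eta=0.8, cutpoints=[-1.0, 0.5, 2.0])
print(probs)
```

The same `eta` shifts every cumulative probability, which is why a single set of coefficients is enough for all categories.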
Phylogenetic mixed modelling
Simulate data for a phylogenetic mixed model using a phylogenetic tree to impose the phylogenetic covariance structure.
Unsupervised classification of bivariate data using a Gaussian mixture model.
Power & resolution
When binning a continuous variable, e.g. into high and low, positive and negative, you lose information and statistical power.
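A small simulation can illustrate the information lost by binning. This is a sketch under assumed settings (sample size, effect size, seed, and a median split are all arbitrary choices): it compares the correlation with the outcome before and after the predictor is split into "high" and "low".

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(1)  # fixed seed so the run is reproducible
n = 2000

# Simulated continuous predictor and an outcome that depends on it linearly.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.5 * a + rng.gauss(0, 1) for a in x]

# Median split: "high" (1.0) versus "low" (0.0).
cut = statistics.median(x)
x_binned = [1.0 if a > cut else 0.0 for a in x]

r_continuous = pearson(x, y)
r_binned = pearson(x_binned, y)
print(f"continuous r = {r_continuous:.3f}, binned r = {r_binned:.3f}")
```

The binned predictor shows a weaker correlation with the outcome than the continuous one, so a larger sample would be needed to detect the same effect.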
Additive mixed modelling
Fitting an additive mixed model to account for both long- and short-term effects of time while fitting an autocorrelation structure to the residuals.
Fitting a Gaussian process to time series data. With this approach, domain knowledge can be used to choose an appropriate covariance function.
Web scraping, analysis, and interactive visualisation of data on scientific publishing.
A pop-science outreach website produced by a small team of writers and vloggers, which I organise.
Easily turn your code, workflow, and results into a presentation. This is great for sharing results with or without your code.