About Me

👋 Hi, I'm @DAWells
🦠 I'm interested in using multiomics and AI to treat disease
👨‍💻 I'm using long/short read NGS and knowledge graphs

Here I post useful snippets of code, demonstrations, and explorations of statistical techniques. If you would like to contact me, please get in touch through LinkedIn.


Posts

Comparison of averages

By any means

The impact of different averages on rankings.
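As a minimal sketch of the idea (the two items and their scores are invented for illustration and are not taken from the post), the same data can rank differently depending on which average is used:

```python
from statistics import mean, median, geometric_mean

# Hypothetical scores: A has one very high value, B is consistent
scores = {"A": [1, 1, 10], "B": [3, 3, 3]}

def rank_by(average):
    """Return item names ordered from highest to lowest average score."""
    return sorted(scores, key=lambda name: average(scores[name]), reverse=True)

ranking_mean = rank_by(mean)              # arithmetic mean rewards A's outlier
ranking_median = rank_by(median)          # median ignores the outlier
ranking_geo = rank_by(geometric_mean)     # geometric mean damps the outlier

print(ranking_mean, ranking_median, ranking_geo)
```

Here the arithmetic mean puts A first, while the median and the geometric mean both put B first.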

Clustering cancer types

Exploring cancer types with neo4j

How to identify and visualise clusters in knowledge graphs.

Algorithmic DNA

Codon pair optimisation

Optimising genes with a genetic algorithm.

Confusion matrix

Matplotlib cmap

Simple demos of matplotlib's cmaps.

Flu treemap

Flu subtypes

How is the flu sequence data distributed across subtypes, hosts and countries in the NCBI database?

Confusion matrix

Confusion matrix

When reporting binary classification models, just looking at accuracy doesn't give the full picture. You need sensitivity & specificity, or precision & recall.
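A small sketch of why accuracy alone can mislead, assuming a hypothetical imbalanced test set (the counts and the helper name `classification_metrics` are invented for illustration):

```python
def classification_metrics(tp, fp, fn, tn):
    """Summarise a 2x2 confusion matrix given its four cell counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # also called recall
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }

# Hypothetical test set: 10 true positives in the data, 990 true negatives.
# The model finds only 1 of the 10 positives and raises no false alarms.
metrics = classification_metrics(tp=1, fp=0, fn=9, tn=990)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the accuracy is above 99% even though the model misses 9 of the 10 positive cases, which only the sensitivity reveals.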

ggtree examples

ggtree samples

Code snippets for various ggtree examples.

Ordinal responses

Ordinal GLM

When a response variable consists of ordered categories, you can predict all categories using a single proportional odds model instead of separate logistic regression models.

Phylogenetic relatedness

Phylogenetic mixed modelling

Simulate data for a phylogenetic mixed model using a phylogenetic tree to impose the phylogenetic covariance structure.

Log likelihood difference

Gaussian mixture

Unsupervised classification based on bivariate data using a Gaussian mixture model.

Power & resolution

Power & resolution

When binning a continuous variable, e.g. into high and low or positive and negative, you lose information and statistical power.
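One way to see the loss of information is to compare how strongly a response correlates with a predictor before and after the predictor is split into two bins. This is only a sketch on simulated data; the sample size, noise level, seed and cut point are all assumptions made for illustration:

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the simulation is repeatable

# Simulated continuous predictor and a noisy response that depends on it
x = [rng.gauss(0, 1) for _ in range(2000)]
y = [xi + rng.gauss(0, 1) for xi in x]

# Bin the predictor into "low" (0) and "high" (1) at zero
x_binned = [1.0 if xi > 0 else 0.0 for xi in x]

r_continuous = pearson(x, y)
r_binned = pearson(x_binned, y)

print(f"continuous predictor: r = {r_continuous:.2f}")
print(f"binned predictor:     r = {r_binned:.2f}")
```

The binned predictor retains the direction of the effect but gives a weaker correlation, so detecting the same effect needs a larger sample.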

Correlation Matrix

Additive mixed modelling

Fitting an additive mixed model to account for both long- and short-term effects of time while fitting an autocorrelation structure to the residuals.

Time series data

Gaussian process

Fitting a Gaussian process to time series data. With this approach, domain knowledge can be used to choose an appropriate covariance function.

GraphicCite

Graphic Cite

Web scraping, analysis, and interactive visualisation of data on scientific publishing.

The Denatured

The Denatured

A pop science outreach website produced by a small team of writers and vloggers which I organise.

Jupyter Slides

Jupyter Slides

Easily turn your code, workflow, and results into a presentation. This is great for sharing results with or without your code.