
Jupyter Notebooks

Links to some of my notebooks in HTML format


Teaching Notebooks

The following notebooks are from a two-day environmental DNA (eDNA) workshop I gave in June at the Auckland campus of the University of Otago. It was attended by university students and researchers from across New Zealand. The first half of the first day was team-taught by Ngoni Faya (Genomics Aotearoa) and Dinindu Senanayake (NeSI). I prepared and taught the remaining 1.5 days, using material developed by Gert-Jan Jeunen (University of Otago) and myself. All of these lessons were taught using JupyterHub on the NeSI server, which will now be the model for future workshops through Otago Carpentries. A full schedule and more course material are available on the course web page.

Metabarcoding Basics

Demultiplexing and Trimming

Denoising and Clustering

Introduction to Qiime

Taxonomy Assignment

Importing Metabarcoding Data into R

Getting Started Exploring Data

Graphing Data with R


Finding Haplotypes in Alignments

The following three notebooks were used to create scripts for a project to determine the frequencies of closely related haplotypes in a mock community experiment, in order to evaluate the efficacy of denoising/clustering algorithms. The first two also serve to illustrate my approach to script development. For larger scripts, my first step is usually to work out the individual steps, using one or two example files or samples. Then I often create functions from these initial steps and troubleshoot them further in a subsequent notebook. This makes it easier to reuse any of these steps in other projects, if needed. Finally, if the steps need to be used by a student or other researchers, I export the notebook as a Python script and add argparse arguments for command-line use.

Find haplotypes

Find haplotypes using functions
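
As a rough illustration of the last stage of that workflow (functions first, then a command-line wrapper), here is a minimal sketch. It is not taken from the notebooks: the function names, the `-i`/`-r` options, and the one-sequence-per-line input format are all assumptions chosen to keep the example short.

```python
import argparse
from collections import Counter


def count_haplotypes(sequences, haplotypes):
    """Count how many sequences exactly match each known haplotype."""
    counts = Counter(seq for seq in sequences if seq in haplotypes)
    # Report every haplotype, including those with zero matches.
    return {hap: counts.get(hap, 0) for hap in haplotypes}


def read_sequences(path):
    """Read one sequence per line from a text file (assumed format), skipping blanks."""
    with open(path) as handle:
        return [line.strip().upper() for line in handle if line.strip()]


def build_parser():
    """Command-line interface; option names are hypothetical."""
    parser = argparse.ArgumentParser(
        description="Count haplotype frequencies (illustrative sketch)."
    )
    parser.add_argument("-i", "--input", required=True,
                        help="text file with one read sequence per line")
    parser.add_argument("-r", "--reference", required=True,
                        help="text file with one haplotype sequence per line")
    return parser


def main(argv=None):
    """Parse arguments, count haplotypes, and print a tab-separated table."""
    args = build_parser().parse_args(argv)
    counts = count_haplotypes(read_sequences(args.input),
                              read_sequences(args.reference))
    for haplotype, n in counts.items():
        print(f"{haplotype}\t{n}")
```

Because the logic lives in plain functions, `count_haplotypes` can still be imported into a notebook for troubleshooting, while `main()` can be called from the usual `if __name__ == "__main__":` guard once the notebook is exported as a script.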

The last notebook was used to develop a script for extracting and counting the frequency of haplotypes in the data, using the outputs from the first script.

Extract haplotypes from NGS data


Converting Tables

Below are a few example notebooks for converting between table formats.

Convert Obitools tab to Qiime2

Create sqlite DB from taxonomy table

Create reference sequence from BLAST results


What the gff

Here are a few notebooks for converting or modifying annotation files in GFF3 (and some GTF) format. The first one converts a GTF file for use in SeqMonk so that gene names can be visualised more easily.

Add gene names to GTF for SeqMonk

Change values in column of gff

Convert range data to gff with diff categories

Convert dN-gmap to dex-compatible map gff

Finding overlapping ranges between two gff files