Code

(See also: GitHub profile)

For a full listing of my public software projects, and to find out what I've been doing lately, see my profile on GitHub.

I've highlighted some projects below, roughly grouped by area of application.


Reproducible research-software deployments with Ansible

I look after many devops aspects of our research group's software environment, ensuring that group members have access to both stable and development versions of the tools we're building. Partly this is achieved through deployment of Python packages, but for the wider range of requirements (e.g. webservers, dependency compilation) I've been building a collection of Ansible roles and configurations. As a result, much of our cluster setup can be reproduced or reconfigured on demand using the code stored in our group GitHub repositories.


Components for real-time astronomical transient alerting and response

voeventdb.server & voeventdb.remote

Built with: Python, SQLAlchemy, Flask, Pytest, PostgreSQL, Apache, Ansible

voeventdb.server is a database-store and accompanying RESTful query API for archiving and retrieving VOEvent packets.

voeventdb.remote is an accompanying Python client-library for end-users (i.e. astronomers).

Together, these tools serve two main purposes:

  • They allow people distributing or monitoring VOEvent packets to ‘catch up’ with any missed data, in the event of a network or systems outage.
  • They allow astronomers to search through the archive of VOEvents. This can be useful for planning future observations, or looking for related events in a particular region of sky, or mapping the distribution of detected events, etc.

Comes with a complete set of docs including Jupyter notebook tutorials and demos.

voevent-parse

Built with: Python, LXML.

A lightweight library for parsing, manipulating, and generating VOEvent XML packets (i.e. machine-readable transient alerts), built on lxml. The aim of this library was to make working with VOEvent packets simpler and more intuitive - take a look at the usage examples or even the tutorial and judge for yourself. It's reasonably well documented. As of version 0.8 (Jan 2015) voevent-parse is fully Python 3 compatible.

Anyone trying to use lxml.objectify and struggling with namespace handling, PyType annotations etc. might be interested in the first few function definitions here.
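To give a flavour of the namespace wrinkle, here is a minimal sketch using only the Python standard library (xml.etree.ElementTree), not voevent-parse or lxml.objectify. The sample packet, the example.org identifiers and the summarise_packet helper are all made up for illustration. It relies on the usual VOEvent 2.0 serialisation, where the root element is namespace-qualified but its children are not.

```python
import xml.etree.ElementTree as ET

VOEVENT_NS = "http://www.ivoa.net/xml/VOEvent/v2.0"

# A deliberately tiny, made-up packet (hypothetical IVORN and parameter).
SAMPLE_PACKET = """\
<voe:VOEvent xmlns:voe="http://www.ivoa.net/xml/VOEvent/v2.0"
             ivorn="ivo://example.org/demo#1" role="test" version="2.0">
  <Who>
    <AuthorIVORN>ivo://example.org/demo</AuthorIVORN>
    <Date>2015-01-01T00:00:00</Date>
  </Who>
  <What>
    <Param name="peak_flux" value="1.5" unit="Jy"/>
  </What>
</voe:VOEvent>
"""


def summarise_packet(xml_text):
    """Return a small dict describing a VOEvent packet (illustrative helper)."""
    root = ET.fromstring(xml_text)
    # The root tag carries the namespace in ElementTree's {uri}name form...
    if root.tag != "{%s}VOEvent" % VOEVENT_NS:
        raise ValueError("Not a VOEvent 2.0 packet: %r" % root.tag)
    # ...but the child elements are unqualified, so plain paths work here.
    params = {p.get("name"): p.get("value") for p in root.findall("What/Param")}
    return {
        "ivorn": root.get("ivorn"),
        "role": root.get("role"),
        "author": root.findtext("Who/AuthorIVORN"),
        "params": params,
    }


summary = summarise_packet(SAMPLE_PACKET)
```

Values come back as strings here; converting them to typed Python objects is the sort of chore a dedicated library takes off your hands.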

fourpiskytools

Built with: Python, voevent-parse, Comet.

A 'quick-start' template to help astronomers get started sending or receiving VOEvents. Essentially provides a stock configuration for starting Comet and connecting it to some scripts built with voevent-parse. This allows you to connect to our VOEvent broker and get desktop notifications when a VOEvent arrives - more info, and some background on VOEvents, can be found here.


Transient detection and image-cataloguing

TKP TraP

Built with: Numpy/Scipy, PostgreSQL, MongoDB, Django

An astronomical transient-detection pipeline for ingesting radio-synthesis images. In a nutshell, we extract source intensities from images, then build a lightcurve catalog and search it for variability. This requires some fairly involved NumPy routines for the source-extraction (all in-house Python, at least for the first edition), and some hairy SQL queries for building and searching the catalogs. TraP received its first open release in Feb 2015, with an accompanying paper providing an extensive reference on the underlying algorithms.
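As a rough illustration of what 'searching a lightcurve for variability' can mean, here is a sketch of two textbook statistics: a coefficient of variation and a reduced chi-squared against a constant-flux model. This is plain Python written for this page, with function names of my own choosing; it is not lifted from TraP, whose own implementation and exact definitions are described in the accompanying paper.

```python
import math


def weighted_mean(fluxes, errors):
    """Inverse-variance weighted mean of the flux measurements."""
    weights = [1.0 / err ** 2 for err in errors]
    return sum(w * f for w, f in zip(weights, fluxes)) / sum(weights)


def variability_coefficient(fluxes):
    """Sample standard deviation divided by the mean flux."""
    n = len(fluxes)
    mean = sum(fluxes) / n
    variance = sum((f - mean) ** 2 for f in fluxes) / (n - 1)
    return math.sqrt(variance) / mean


def reduced_chi_squared(fluxes, errors):
    """Reduced chi-squared of the fluxes about their weighted mean.

    Values much larger than 1 suggest the scatter is not explained by the
    measurement errors alone, i.e. the source may be variable.
    """
    n = len(fluxes)
    centre = weighted_mean(fluxes, errors)
    chi_sq = sum(((f - centre) / err) ** 2 for f, err in zip(fluxes, errors))
    return chi_sq / (n - 1)


# Made-up lightcurves: one steady source, one with a single bright epoch.
steady = [1.0, 1.0, 1.0, 1.0]
flaring = [1.0, 1.0, 5.0, 1.0]
errors = [0.1, 0.1, 0.1, 0.1]
```

Both functions need at least two measurements, and in practice you would want to guard against zero or negative mean fluxes before dividing.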

I wrote a short summary piece on TraP which you can read here, and I've also given a couple of short summary talks introducing it.

See also...

Banana

A Django-based web interface for exploring and visualising TraP's results. It gives astronomers a fluid, visual way into their data, without requiring local installations or specialised knowledge (e.g. of SQL queries).

Demo-instance deployment scripts

I recently put together a set of Ansible deployment scripts for installing TraP and the Banana data-exploration interface, so if you'd like to see a demo-version running on a cloud-instance this can be arranged - just drop me a line.


Automating radio-astronomy data reduction

drive-ami / drive-casa

Built with: Python, pexpect.

Two interface libraries which make heavy use of pexpect to enable complex scripting of astronomical data reduction tools from Python. The CASA package is quite widely used in the radio astronomy community, so I've put up some basic docs to help others get started.

Update: drive-casa has seen some user-uptake, with active feedback and contributions from a handful of users, so it's nice to know I wasn't crazy to bother documenting what is effectively a very niche tool.

amisurvey / chimenea

Built with: Python, drive-ami, drive-casa.

These packages represent a (telescope-specific) end-to-end data-reduction pipeline and the more generally applicable data-reduction algorithm used therein. Both build on the interfacing packages described above, introducing various data-structures to allow a higher-level view of the data-flow.

Now fully written up and published!


High-performance data reduction for lucky imaging

Coelacanth

Built with: C++, CMake, Boost, TBB, Minuit2, UNURAN, UnitTest++.

"Codes for EMCCD and Lucky-Imaging Analysis", around 15K lines of C++ code that grew out of my PhD project. The data volumes involved, and the limited processing power available on-site (i.e. while up a mountain observing), required a set of high-performance codes for specialized data-reduction. Part of the challenge was to implement complex algorithms such as Drizzle in C++ in a maintainable fashion - you can see the results here. The final pipeline made use of the Threading Building Blocks pipeline pattern to achieve excellent throughput.
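For anyone unfamiliar with the pipeline pattern, here is a toy sketch of the idea in Python, using one thread per stage and bounded queues between stages. It is not the Coelacanth code and does not use TBB; the names run_stage and run_pipeline and the max_in_flight parameter are my own inventions for this example. Python threads will not give the CPU parallelism you get in C++, so treat it as an illustration of the structure only.

```python
import queue
import threading

_STOP = object()  # Sentinel marking the end of the data stream.


def run_stage(func, inbox, outbox):
    """Apply func to each item from inbox and pass the result downstream."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # Propagate shutdown to the next stage.
            break
        outbox.put(func(item))


def run_pipeline(items, stages, max_in_flight=4):
    """Push items through the stage functions in order and collect results.

    Bounded queues cap how many items sit between stages at once, which keeps
    memory use in check when frames arrive faster than they can be reduced.
    No error handling: an exception inside a stage would stall this sketch.
    """
    queues = [queue.Queue(maxsize=max_in_flight) for _ in range(len(stages) + 1)]

    def feed():
        for item in items:
            queues[0].put(item)
        queues[0].put(_STOP)

    workers = [threading.Thread(target=feed)]
    for i, func in enumerate(stages):
        workers.append(
            threading.Thread(target=run_stage, args=(func, queues[i], queues[i + 1]))
        )
    for worker in workers:
        worker.start()

    # Drain the final queue on the calling thread so the stages never block
    # indefinitely on a full output queue.
    results = []
    while True:
        item = queues[-1].get()
        if item is _STOP:
            break
        results.append(item)
    for worker in workers:
        worker.join()
    return results


# Stand-in stages: imagine 'calibrate' then 'combine' instead of arithmetic.
doubled = run_pipeline(range(5), [lambda x: x + 1, lambda x: x * 2])
```

Because each stage runs in a single thread and the queues are first-in, first-out, results come out in the same order the items went in.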


Bric-a-brac

Less substantial, but possibly still useful:

  • autocrunch
    A Python script demonstrating how to use pyinotify to monitor a local directory for files that have been transferred with rsync, then process them in a parallel fashion using a multiprocessing pool (via whatever Python reduction process you care to define). This has been road-tested quite a bit, and includes decent logging and error handling.
  • pyds9_ex
    The DS9 FITS file viewer is fully scriptable, but only has a low-level interface. This wrapper provides some convenience routines around that low-level functionality.
  • python-imap-monitoring
    Procmail for the Python + Gmail generation. Monitor your Gmail inbox and trigger Python scripts in response to special emails.
  • FSlint for humans
    Some scripts to parse the output from FSlint into a useful CSV summary, allowing the user to locate the largest duplicate files, and find out who owns them, on a multi-user cluster with lots of disk space (hundreds of TB).