A lifetime ambition: Barcoding UK Collembola

Peter Shaw
University of Roehampton, UK

Carly Benefer
Plymouth University


This blog post is not so much a write-up of work done as a wish list of work to do! Here we describe the background to, and early stages of, a long-term project to answer an apparently simple question by using molecular barcodes: “What species of Collembola occur in Britain?”

At the moment, we have three separate programmes collecting these springtail barcodes. One focuses specifically on just two genera (Entomobrya and Lepidocyrtus) and is overseen by Brent Emerson of UEA and La Laguna, Tenerife. A second forms part of PhD work by Stephanie Bird, co-funded by the Royal Horticultural Society. The third programme is more ad hoc and is overseen by Carly Benefer (Plymouth); it is this third programme that forms the basis of this blog. We should also mention work by Jonathan Ellis of Manchester Metropolitan University, who (with MSc students) has been barcoding inter-tidal Anurida.

The barcodes in question are based on comparing sequences of mitochondrial cytochrome oxidase I (COI), a gene that is rapidly becoming the international standard for identifying animal species. It has several convenient features: being mitochondrial, there is only one haplotype per body (against two alleles for most nuclear genes), and it is unaffected by sexual recombination. For the same reason there are many copies of each sequence per cell, making extraction easier. It also happens to evolve at a rate that corresponds, roughly, with species separation (though not in plants, curiously: we still lack a universal botanical barcode). There is a dedicated online database for this gene: the Barcode Of Life Database (BOLD).

Personal Experience – A Story of Optimism

When marking student write-ups, one normally scrawls red pen around anything written in the first person. One of my commonest comments on essays is “Avoid ‘I and We’”. Please therefore forgive my extensive use of the first person in the next few paragraphs: since this is a blog I would like to convey a personal perspective on my increasing awareness of my own ignorance!

I inherited the role of UK recorder for the Collembola from Steve Hopkin in 2006. I now maintain a database of all publications naming UK Collembola and of field collections of the group, and (latterly) gather some records from online photographs. All that I want to do is to know what to call them!

My journey into the Collembola has been characterised by progressive reductions in confidence, each time that the scale of the task became clearer! When I started my PhD on the springtails and fungi of lodgepole pine plantations (under Michael Usher and John Dighton) back in 1982, I expected to know most of the UK’s Collembola by my writing-up stage in about 1985. By 1985 I realised that this was way too optimistic, but I thought that I could at least name the common species of pine forests reliably using Arne Fjellberg’s 1982 Norwegian key, with a little dusting off of Gisin’s 1960s German work.

This happy delusion lasted until I started meeting Steve Hopkin while he was preparing the FSC key, when I realised that I was still too optimistic, and that the UK Collembolan fauna was far richer than Norway’s. Happily, Steve finished his draft manuscript before being killed in a car crash, and I was able to help a little by proofreading the final manuscripts for the FSC. Then I felt confident that I could name UK Collembola, until I realised I was again being too optimistic when the photomicrographic community started turning up multiple unrecognisable Symphypleona, quite widely in the southern UK.

So now, to name Collembola in the UK, we had a dedicated key, plus photos of some aliens. Oh, and a native groundwater community that was previously unknown and has probably changed little since the ice age, and that happens to include at least one springtail new to the UK: Hymenaphorura nova.

I could, at last, feel confident about naming UK springtails.

You can perhaps guess what I’m going to say next?  Yes, of course:  I was being way too optimistic.  Recent work by people (including Antonio Carapelli, David Porco, Francesco Cicconardi, Brent Emerson, Felipe Soto-Adames, Aron Katz, Mark Stevens – forgive me for many omissions here) has shown that many well-known “species” of springtail are in fact multiple clades that seem as genetically isolated as are true species.  This is the problem of ‘cryptic’ species, and it raises a whole new set of taxonomic questions.  

Cryptic Species

To take one example from many: the most widely recorded springtail in the UK, the one that turns up in virtually every inland collection (urban, grassland, woodland, even some montane) is Parisotoma notabilis, formerly Isotoma notabilis.

(Right: “Parisotoma notabilis”, 1 mm, collected in Spadeadam Forest, 20.iv.2012. Its clade is not yet known.)

Recent work by David Porco and others shows that this “species” conceals at least four clades whose mitochondrial (COI) and nuclear (28S rRNA) sequences differ between clades and are co-inherited: in other words, at least four cryptic species.
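To make “co-inherited” a little more concrete, here is a minimal Python sketch of the underlying idea: if every specimen's mitochondrial clade always goes with the same nuclear clade (and vice versa), the two markers are telling the same story. The function name `markers_concordant`, the specimen IDs and the clade labels are all invented for illustration; this is not the analysis used in the published work, which relies on proper phylogenetic methods.

```python
from collections import defaultdict


def markers_concordant(specimens):
    """Return True if COI clades and 28S clades map one-to-one.

    `specimens` is an iterable of (specimen_id, coi_clade, nuclear_clade)
    tuples. All labels here are hypothetical.
    """
    coi_to_nuclear = defaultdict(set)
    nuclear_to_coi = defaultdict(set)
    for _specimen_id, coi_clade, nuclear_clade in specimens:
        coi_to_nuclear[coi_clade].add(nuclear_clade)
        nuclear_to_coi[nuclear_clade].add(coi_clade)
    # Concordant only if each clade on one marker pairs with exactly
    # one clade on the other marker, in both directions.
    return (all(len(v) == 1 for v in coi_to_nuclear.values())
            and all(len(v) == 1 for v in nuclear_to_coi.values()))


# Invented example data: two markers that agree, then two that do not.
concordant_set = [
    ("s1", "COI-A", "28S-1"),
    ("s2", "COI-A", "28S-1"),
    ("s3", "COI-B", "28S-2"),
]
discordant_set = [
    ("s1", "COI-A", "28S-1"),
    ("s2", "COI-A", "28S-2"),  # same COI clade, different nuclear clade
    ("s3", "COI-B", "28S-2"),
]

print(markers_concordant(concordant_set))   # True
print(markers_concordant(discordant_set))   # False
```

Agreement between an independently inherited mitochondrial and nuclear marker is what makes the “separate species” reading persuasive; a mismatch like the second data set would instead suggest the clades are still exchanging genes.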

This example shows how far we still are from being able to give definitive names to collections. Parisotoma notabilis was first described by Schäffer in 1896, so the taxon defined by his type specimen must be the one to bear the name “notabilis”. It is not obvious how we can ever be sure about the true identity of Schäffer’s type specimen, since its DNA will have degraded hopelessly. In this case the type locality was re-sampled, in the hope that the population there is unchanged since 1896. The other three clades should then be given new names. (It seems that the type specimen is from the main clade that also occurs in Britain, which would be a relief if confirmed.)

From the viewpoint of a biogeographic recorder this is catastrophic, since not only are all the existing (150 years of) records invalidated, but so also are the great majority of new records when no DNA sequencing is done. 

Aside: This sort of problem is not confined to Collembola – see for example Project Waxtongue for a project posing a similar set of questions about the waxcap fungi.

So, the only way to actually know with any confidence what genetic “species” of Collembola we have in the UK is to obtain fresh samples of as many taxa as is feasible, and to then barcode them.  The majority will probably prove to be clades found widely elsewhere in the world, but we should have a few endemic lines too.

A Collaborative Effort

This is the background to a collaboration between the Universities of Roehampton (Peter Shaw) and Plymouth (Carly Benefer). Peter has collected, photographed and identified Collembola from various ad hoc collections, and sent them to Plymouth for extraction, PCR and sequencing. So far we have been able to use a small ‘seedcorn’ internal fund to generate barcodes for 44 species, mainly collected from the London area, but also from northern England and from one area of Caledonian forest north of Loch Ness.

In some cases the results are so simple that the molecular approach seems like overkill. Neanura muscorum is one of the commonest and most widespread Collembola in Europe, so finding that a collection from Northumberland showed a >98% match to a mainland European Neanura muscorum was reassuring but unsurprising. Likewise, collections of Allacma fusca, Dicyrtomina ornata and Isotomurus maculatus matched international collections almost perfectly. These are all relatively large and visually distinctive species.
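For readers wondering what a “98% match” actually measures, here is a minimal Python sketch, assuming the two sequences have already been aligned to the same length. The function name `percent_identity` and the short toy sequences are assumptions made up for illustration (they are not real COI data), and this is not the pipeline used at Plymouth; in practice the comparison is done by the search tools of the sequence databases.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical bases between two pre-aligned sequences.

    Positions where either sequence has a gap ('-') or an ambiguous
    base ('N') are skipped rather than counted as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy 20-base fragments, invented for illustration (one base differs).
uk_sample = "ATGGCATTAGCCGGAACTGC"
reference = "ATGGCATTAGCTGGAACTGC"
print(f"{percent_identity(uk_sample, reference):.1f}% identity")  # 95.0% identity
```

A real COI barcode is several hundred bases long, so a 98% match still allows a handful of differences between the two animals.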

(Left: Isotomurus maculatus from Digby Stuart College, Roehampton, London, which turns out to have close relations on Marion Island in the sub-Antarctic.)

Most collections of the Tomocerids also matched their nominate species quite well, especially Pogonognathellus longicornis, the biggest species in the UK. 

(Clockwise from top left: Dicyrtomina ornata, Neanura muscorum, Allacma fusca, Pogonognathellus longicornis)

One sequence has corrected a taxonomic misapprehension: Peter collected Xenylla from the strandline on Lindisfarne, and (seeing a dividing line between the mucro and dens) called it Xenylla maritima, whose mucro is separate from the dens. The sequence came back as a close match to Xenylla humicola from Manitoba (whose key features include the mucro being fused to the dens). Re-checking the Lindisfarne collection showed that the mucro was indeed partially fused to the dens but with a dividing line visible for half its width, which turns out to be correct for X. humicola but is not quite what the diagram in the FSC key shows.

(Right: Xenylla humicola mucro-dens junction)

The early results show some clades to be international, with suggestions of European lineages causing unseen invasions.  There was a perfect match between a water-loving springtail Isotomurus maculatus in Shaw’s college gardens (London) and collections from Marion Island in the sub-Antarctic!  This was presumably a recent and accidental introduction. 

Similarly, a clade of Tomocerus minor from woods in Surrey matched 100% to a collection from Victoria in Australia, while a Hypogastrura purpurescens from the college gardens was a 99% match to a collection from central Chile. Orchesella cincta and O. villosa are both large, mobile, surface-active and common Collembola, and UK collections proved to be a 100% match to barcodes of these species from Canada, France and Poland.

The work by Jonathan Ellis and Michelle Davies on Anurida maritima around the UK (admittedly using a different sequence, 28S rRNA) found no evidence for cryptic speciation, despite long-standing claims that two distinct forms of this common inter-tidal springtail co-exist around the UK.  

Such simplification is always welcome!

Some of the results suggested errors in the databases, notably a repeated observation that springtails in the genus Tomocerus match very closely to a couple of sequences attributed to nemertean worms. Although usually thought of as marine flatworms (to be pedantic, maybe closer to molluscs and annelids, but definitely nowhere near anything in the Arthropoda), some nemerteans live in damp soil on land, where they prey on leaf litter invertebrates. These supposedly nemertean sequences may therefore really be springtail DNA, picked up as contamination from gut contents (suggesting that the nemerteans in question had been eating Tomocerids recently), although more collections will be needed to verify this.

A final observation is that several common springtails did not match closely to uploaded sequences, notably collections of Pseudisotoma sensibilis. This “species” is common and widespread throughout the UK, though with a preference for tree bark and acidic soils. Its diagnostic features include clavate hairs by its feet for attachment, and it comes in several colour morphs (white/pale yellow or dark blue are commonest, also sometimes grey). During my PhD work the two colour morphs (shown below: Pseudisotoma sensibilis, two “colour morphs”) were so consistently found in different microhabitats that I analysed them as different species (European Journal of Soil Biology 32, 89-97). We have now barcoded three colour morphs: they did indeed differ slightly, but didn’t match closely to anything on BOLD or GenBank, despite this species having been sequenced in Canada and France.

So much more to be done!