INHS Reports September-October 1998

Database Gathers Taxonomic Information

Before the computer, scientists organized their data in notebooks, on note cards, or in their heads. If we were lucky, these observations were published in scientific journals or monographs for future generations to read and use as a reference. When taxonomists write papers describing the insects they study, they may examine as few as two specimens or as many as several hundred, even thousands.

The information about insect specimens caught, pinned, and placed in museums is recorded on minute labels attached to the pin impaling the specimen. It often requires either excellent eyesight or a magnifying glass to read these labels, and at a minimum a good dissecting microscope to discern the features distinguishing one species from another. Now, with computers on the desks of most scientists, this suite of information can be logged into a database for easier retrieval. A database stores information about a group of things (in our case, the fly family Therevidae) in records, one record per fly. We tag each insect with a unique number and add that label to those already on the pin so that future researchers may retrieve the information from the database for their studies. To date, undergraduate students working with our project have cataloged over 50,000 therevids.

Each record is composed of many fields, or pieces of information that we may want to retrieve about that fly. Where was the fly collected? When? By whom? Under what conditions and with what equipment? Was it collected as an adult or reared to that stage? Was it associated with a plant or another organism such as a predator? What is the scientific name of the specimen? Has anyone ever called it anything else? Is it on loan from a museum or collection? Has it been illustrated? What is its sex? Is the specimen missing body parts? Has it been ground up for molecular studies? Has it been dissected? Has someone written about it in the literature? Is it a type specimen, used as a model for the description of the species? Most of this information can be gathered by looking at the specimen and the labels attached to its pin.
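As a rough illustration of the idea of a record made up of fields, the questions above can be sketched as a small Python data class. This is only a sketch and not how our FileMaker Pro files are actually defined: the class name, the field names, and the helper unrecorded_fields are all hypothetical, chosen to mirror the questions in the text.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class SpecimenRecord:
    """One record per fly. All field names here are hypothetical."""
    specimen_id: str                      # unique number added to the pin
    taxon_name: str = ""                  # scientific name of the specimen
    other_names: List[str] = field(default_factory=list)  # names used by others
    locality: str = ""                    # where the fly was collected
    date_collected: str = ""              # when
    collector: str = ""                   # by whom
    collecting_method: str = ""           # conditions and equipment
    reared: Optional[bool] = None         # reared to adult (True) or caught as adult
    associated_organism: str = ""         # plant, predator, or other organism
    on_loan_from: str = ""                # lending museum or collection
    illustrated: Optional[bool] = None
    sex: str = ""
    missing_body_parts: Optional[bool] = None
    used_for_molecular_study: Optional[bool] = None
    dissected: Optional[bool] = None
    literature_citations: List[str] = field(default_factory=list)
    is_type_specimen: Optional[bool] = None


def unrecorded_fields(record: SpecimenRecord) -> List[str]:
    """Return the names of fields that have not been filled in yet."""
    empty_values = (None, "", [])
    return [f.name for f in fields(record)
            if getattr(record, f.name) in empty_values]


# Illustrative data only; the identifier and names are made up.
record = SpecimenRecord(
    specimen_id="EX-000001",
    taxon_name="Therevidae sp.",
    collector="A. Collector",
    sex="female",
    dissected=False,
)
print(unrecorded_fields(record))
```

Yes/no questions are stored as None until someone answers them, so that "not yet examined" can be told apart from "no".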


Specimen collection tray containing therevid species in the INHS Insect Collection.

The National Science Foundation (NSF) recognized the importance of getting taxonomists to record their data electronically in its granting program "Partnerships for Enhancing Expertise in Taxonomy" (PEET). In addition to training the next generation of taxonomists and providing funds for the study of little-known groups of organisms, NSF has emphasized the need to electronically catalog and distribute information about the organisms we study. As part of our project on the fly family Therevidae (stiletto flies), we examined existing systems for cataloging specimen data and decided that many existing customized systems were at once too complex (requiring a dedicated computer programmer to create and maintain) and not responsive enough to our need for special features.

Many scientists avoid cataloging their work because they think they must be computer geniuses to create a database. With today's database applications, this no longer has to be the case. Three years ago, we chose an off-the-shelf database engine, FileMaker(TM) Pro, that was known for its ease of use and its ability to work in both Macintosh(TM) and Windows(TM) environments. With a rudimentary knowledge of this database engine (we are entomologists, not computer programmers) we developed, in collaboration with taxonomists, five files (specimens, taxa, lots, museums, people) that were connected by lookups and formed the original basis for recording information about therevid specimens. Over the last three years, the static lookups have become dynamic relations, where changing the data in one place changes it everywhere those data are referenced. The number of related files has swelled from 5 to 24, and in 1998 the database structure finally gained a name: Mandala. The word suggests interconnectedness and typifies these interrelated files, which are linked by specimen number, taxon name, and/or literature citation. The system includes context-sensitive help at both the file and field level (clicking on the help button while in a field takes the user to specific help for that field). There is also a system for electronically recording and tracking questions and their resolution.
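The difference between a static lookup and a dynamic relation can be sketched in a few lines of Python. This is an analogy under stated assumptions, not FileMaker Pro code: the dictionary taxa, the key "T1", the placeholder names, and all function names are hypothetical. A lookup is modeled as copying a value at data-entry time, and a relation as storing only a key that is resolved each time the record is read.

```python
# Hypothetical "taxa" file: one entry per taxon, keyed by a taxon identifier.
taxa = {"T1": {"name": "Examplegenus alpha"}}


def specimen_with_lookup(specimen_id, taxon_key):
    """Static lookup: copy the taxon name into the specimen record once."""
    return {"specimen_id": specimen_id,
            "taxon_name": taxa[taxon_key]["name"]}


def specimen_with_relation(specimen_id, taxon_key):
    """Dynamic relation: store only the key that links to the taxa file."""
    return {"specimen_id": specimen_id, "taxon_key": taxon_key}


def related_taxon_name(specimen):
    """Resolve the taxon name through the relation at read time."""
    return taxa[specimen["taxon_key"]]["name"]


copied = specimen_with_lookup("EX-000001", "T1")
linked = specimen_with_relation("EX-000002", "T1")

# The name is corrected in one place, the taxa file.
taxa["T1"]["name"] = "Examplegenus beta"

print(copied["taxon_name"])        # the copied value is unchanged
print(related_taxon_name(linked))  # the related value reflects the correction
```

With the lookup, every specimen record holding the old name would have to be updated separately; with the relation, the single change is seen everywhere the taxon is referenced.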

Do scientists need everything we have developed over the last three years to do their systematic research? It is likely they could use more, not less. The system is flexible enough that parts of it may be used and others ignored. If a better system comes along, the data can be exported to it. Specific fields, customized for the user, can be exported for reports or monographs. Additional fields may be easily defined, and the intent of a field may be modified (e.g., elevation to depth, as we did for another NSF PEET project focused on tiny deep sea mollusks called aplacophorans). Pop-up lists may also be modified to reflect different collecting methods and geographical reference points (oceanographic basins rather than geopolitical units). When modifying the database to work with another group of flies, acrocerids that are parasitic on spiders, we found we needed to document more than one specimen associated with another (e.g., one spider with many parasites). This addition may not be immediately useful to our project, but it no doubt provides a general improvement to our database structure.
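Two of the points above, exporting only selected fields and recording several specimens associated with one host, can be sketched in Python as follows. This is a minimal sketch under assumptions, not a description of our files: the identifiers, the field names (including depth_m), and the functions associates_of and export_fields are all hypothetical.

```python
import csv
import io

# Hypothetical association table: each row links one host specimen
# to one associated specimen, so a single spider can appear many times.
associations = [
    ("SPIDER-001", "FLY-101", "parasite"),
    ("SPIDER-001", "FLY-102", "parasite"),
    ("SPIDER-002", "FLY-103", "parasite"),
]


def associates_of(host_id, rows):
    """Return every specimen recorded as associated with the given host."""
    return [associate for host, associate, _kind in rows if host == host_id]


def export_fields(records, field_names):
    """Write only the chosen fields of each record as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=field_names,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Illustrative records in which "elevation" has been redefined as depth.
records = [
    {"specimen_id": "MOLLUSK-001", "depth_m": 2500, "collector": "A. Collector"},
    {"specimen_id": "MOLLUSK-002", "depth_m": 3100, "collector": "B. Collector"},
]

print(associates_of("SPIDER-001", associations))
print(export_fields(records, ["specimen_id", "depth_m"]))
```

Keeping associations in their own table, rather than in a single field on the specimen record, is what allows one spider to be linked to any number of parasites.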

Why is it important to have all of these data on the computer? Specimens are often scattered in museums around the world. Scientists studying them may have temporary access, but must return the specimens after study. Once the data are recorded electronically, it is easier to retrieve only those pieces of information that are relevant to the questions being asked. It is also easier to make these data available on CD-ROM or on the World Wide Web to a wider audience, such as those working on biodiversity or agroecology issues. Later this year we hope to have some of the published information from our databases searchable on the Web.

http://www.inhs.uiuc.edu/cee/wwwtest/therevid/stiletto_fly.html

Gail E. Kampmeier and Michael E. Irwin, Center for Economic Entomology



