We considered this group to be “expert users” who could provide insight into the limitations of available natural history database interfaces (including Symbiota portals) for managing collections and showing collections to non-expert visitors. Not all herbaria use Symbiota-based databases for managing their collections. However, all the herbarium professionals who participated in our meeting were familiar with the structure and functionality of Symbiota portals.
Herbarium professionals use collection databases in a variety of ways. Among other things, these include: tracking specimen loans and exchanges; reporting collection activities to upper institutional administration; reporting to funding agencies; data entry for new and existing specimens; and displaying and exhibiting collections to visitors. The discussions during this meeting centered around three main themes for an improved interface that would help herbarium operations:
- Easier management of herbarium data including loan activity and generating reports
- Improved options for data entry by students and volunteers
- Enhanced search functionality for non-expert visitors to the herbarium (e.g., external groups, donors, undergraduate classes for non-majors)
Need for tools to enhance and improve data management
Herbarium specimens are often loaned to and received from other institutions for research purposes. New specimens are also sent and received as gifts or as part of exchange agreements. Tracking these activities is an important part of herbarium professionals’ duties. In the past, these records were kept in log books, but nowadays they are mostly stored electronically with direct links to the specimen records via the accession numbers.
For loan activity, meeting participants discussed the utility of using a database interface to find subsets of specimens for loans. This could be the actual specimens or a “loan” of electronic records. The ability to sort, pile, and tag digitized specimens in groups for re-determination after they come back from loans would also be a useful feature. In many cases, specimens sent out on loan come back with new identifications based on expert assessment by the person requesting the loan. The ability to group the records and batch update the determinations would be a useful feature for herbarium professionals.
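The grouping and batch redetermination described above can be sketched in a few lines. This is only an illustration, not how any existing portal works: the record layout and the field names (`tags`, `scientific_name`, `determined_by`, `determination_history`) are hypothetical.

```python
def batch_redetermine(records, tag, new_name, determiner):
    """Apply a new determination to every record carrying `tag`.

    The previous name is kept in a history list so the change can be audited.
    All field names are hypothetical, chosen for illustration only.
    """
    updated = 0
    for rec in records:
        if tag in rec["tags"]:
            rec.setdefault("determination_history", []).append(rec["scientific_name"])
            rec["scientific_name"] = new_name
            rec["determined_by"] = determiner
            updated += 1
    return updated


# Hypothetical specimens tagged as belonging to a returned loan.
specimens = [
    {"accession": "H-0001", "scientific_name": "Quercus sp.", "tags": {"loan-2023-07"}},
    {"accession": "H-0002", "scientific_name": "Quercus sp.", "tags": {"loan-2023-07"}},
    {"accession": "H-0003", "scientific_name": "Acer rubrum", "tags": set()},
]
n_updated = batch_redetermine(specimens, "loan-2023-07", "Quercus macrocarpa", "Loan requester")
```

Keeping the earlier name in a history list, rather than overwriting it, reflects the need expressed elsewhere in the meeting to track edits to existing records.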
Herbarium professionals also provide detailed reports to upper-level administrators. These reports might cover the number of loans, gifts, and exchanges, as well as the growth of the collection in a particular region or taxonomic group or across the entire collection.
Also, there is a need for tracking specimen citations in academic journals to illustrate the value of the collection to the broader research community.
Herbarium professionals also need to track changes and edits to existing electronic specimen records and record the number of new records added to the database. The ability to perform these functions and generate activity reports with a slider to define a date range would be a particularly useful feature.
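As a rough sketch of the kind of activity report described here, the date range chosen with a slider could be passed to a function that tallies edits per person. The edit log structure, editor names, and action labels below are assumptions made for illustration.

```python
from collections import Counter
from datetime import date

# Hypothetical edit log: (date of edit, editor, record identifier, action).
edit_log = [
    (date(2023, 1, 10), "student_a", "H-0001", "create"),
    (date(2023, 1, 12), "student_a", "H-0002", "update"),
    (date(2023, 2, 3), "student_b", "H-0001", "update"),
    (date(2023, 3, 20), "student_b", "H-0003", "create"),
]


def activity_report(log, start, end):
    """Count actions per (editor, action) pair between start and end, inclusive."""
    counts = Counter()
    for when, editor, _record_id, action in log:
        if start <= when <= end:
            counts[(editor, action)] += 1
    return counts


# The two dates stand in for the ends of a date-range slider.
report = activity_report(edit_log, date(2023, 1, 1), date(2023, 2, 28))
```

A report of this kind would let a curator see what was changed and by whom in a single step.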
There was also discussion about the utility of linking to other resources containing data obtained from particular specimens (e.g., DNA sequence data from a particular specimen that is stored in the GenBank database).
The following quotes from meeting participants illustrate these points:
“So it’s a little a little limiting as I’d have to … go through six steps to figure out what they [students working on data entry] changed.”
“One of the questions that was asked in an external review was what percentage of your collection is actually being used? Well, that’s a difficult question to answer, you can look at the amount of loans that are out there and that gives you a percentage … But then there’s the rest of the collection that is valuable because it’s used for comparative purposes [but this may not be tracked] … But we don’t really have a good way of determining in some cases who’s actually using it”
Need for tools to increase efficiency of data entry
Over the past 10-15 years, herbarium collections have increasingly been digitized and made available online. This increased access has broadened the utility of specimens and their associated collection data, making them relevant to a much wider audience. Digitizing specimens, particularly the data transcription part of the procedure, is slow. Improving efficiency was discussed as an important aspect of an improved database interface. Some key features include:
- The ability to update and annotate records from expert determinations and comments that are received online. Being able to make batch annotations and redeterminations would be particularly useful in this regard.
- Herbarium specimens are often collected in duplicate sets that are distributed to several different herbaria. An enhanced ability to search for and “ingest” existing digitized records of duplicate specimens held at other herbaria would allow digitizers to take advantage of work already completed.
- The work of digitizing is often carried out by undergraduate students and volunteers who may have limited experience and knowledge of botany. Designing a data entry form that is easier for non-expert users would cut back on training time and make the entire process more efficient.
- Batch uploads of existing records that share some provenance (e.g., taxonomy, collection locality) would also greatly improve data entry efficiency. An example quote from one meeting participant illustrates this point: “I used to work at the New York Botanical Gardens and with EMU [the NYBG database] you could put in the genus and species scan the first barcode scan the last barcode and, bam, it made 20 records.”
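The barcode-range workflow described in the quote above might look something like the following sketch. It assumes barcodes consist of a fixed prefix followed by a zero-padded number; the barcode format, function names, and field names are hypothetical and do not describe any particular database.

```python
import re


def split_barcode(barcode):
    """Split a barcode such as 'HB000101' into its prefix and trailing digits."""
    match = re.fullmatch(r"(.*?)(\d+)", barcode)
    if match is None:
        raise ValueError(f"barcode has no trailing number: {barcode!r}")
    return match.group(1), match.group(2)


def records_from_barcode_range(first, last, shared_fields):
    """Create one skeleton record per barcode between first and last, inclusive."""
    prefix, first_digits = split_barcode(first)
    last_prefix, last_digits = split_barcode(last)
    if prefix != last_prefix or len(first_digits) != len(last_digits):
        raise ValueError("first and last barcodes do not belong to the same series")
    low, high = int(first_digits), int(last_digits)
    if high < low:
        raise ValueError("last barcode precedes first barcode")
    width = len(first_digits)
    return [
        {"barcode": f"{prefix}{number:0{width}d}", **shared_fields}
        for number in range(low, high + 1)
    ]


# Shared taxonomy is entered once and copied to every new record.
new_records = records_from_barcode_range(
    "HB000101", "HB000120", {"genus": "Quercus", "species": "macrocarpa"}
)
```

Here 20 skeleton records are created from two scans plus one shared taxonomic entry, leaving only the specimen-specific fields to be transcribed.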
The importance of an enhanced search functionality
In the same way that finding a particular book title in a library database is not always easy, finding specimen records in existing databases can be challenging, particularly if precise search terms are not known.
One example is the spelling of a collector’s name. If the exact spelling is not known, a “sounds like” search that does not require an exact match would be useful.
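One well-known way to implement a “sounds like” search is a phonetic code such as Soundex, which maps similar-sounding names to the same short key. The sketch below is an illustration of that general technique, not a feature that participants specified; the function names and the list of collector names are assumptions.

```python
# Soundex digit groups: consonants that sound alike share a digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(name):
    """Return the four-character American Soundex code for a name."""
    letters = [ch for ch in name.lower() if ch.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != previous:
            result += code
        previous = code  # vowels reset the previous code to ""
    return (result + "000")[:4]


def sounds_like(query, names):
    """Return the names whose Soundex code matches that of the query."""
    target = soundex(query)
    return [name for name in names if soundex(name) == target]


# A misspelled query still finds the intended collector.
matches = sounds_like("Stiermark", ["Steyermark", "Standley", "Smith"])
```

Soundex was designed for English surnames, so a production system would likely need a phonetic scheme better suited to the range of languages found in collector names.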
In other situations, it might be necessary to find specimens from a certain habitat type (e.g., oak forest). Most herbarium databases have a field for habitat but in some cases, the information is mistakenly added to the locality field.
Being able to search across fields for particular search terms would make it possible to find specimen records even when information was entered in the wrong field.
In addition, non-expert users may often want to access the data by searching for visual features such as flower or fruit color.
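A minimal sketch of such a cross-field search follows. It checks every field of a record for the search term, so a habitat description entered under locality is still found. The record layout and field names are hypothetical.

```python
def search_all_fields(records, term, fields=None):
    """Return records in which any field (or any of `fields`) contains the term.

    Matching is case-insensitive substring matching, the simplest possible choice.
    """
    needle = term.casefold()
    hits = []
    for rec in records:
        if fields is None:
            values = rec.values()
        else:
            values = (rec.get(field, "") for field in fields)
        if any(needle in str(value).casefold() for value in values):
            hits.append(rec)
    return hits


# In the second record the habitat was mistakenly typed into the locality field.
catalog = [
    {"id": 1, "habitat": "Oak forest on north-facing slope", "locality": "5 km W of town"},
    {"id": 2, "habitat": "", "locality": "Oak forest near river crossing"},
    {"id": 3, "habitat": "Wet prairie", "locality": "County park"},
]
oak_hits = search_all_fields(catalog, "oak forest")
habitat_only_hits = search_all_fields(catalog, "oak forest", fields=["habitat"])
```

Searching only the habitat field misses the second record, while the cross-field search returns both.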
After searching for a particular set of specimens, the ability to display them in a novel or engaging way was also mentioned by meeting participants as important. One example is to place specimens on a phylogenetic tree to illustrate their evolutionary relationships. Another is to place points or thumbnails on a map so users can understand species distributions. This would be particularly powerful if a time axis were incorporated to show changes in distribution.
Many users are more familiar with common names than scientific names so being able to search for “bur oak” might be preferable to searching for “Quercus macrocarpa.” Connecting to outside databases (e.g., USDA Plants Database) that include common names was an important feature discussed by meeting participants.
Visual searches were also mentioned as being more compelling to non-expert visitors to herbaria. One example was to present a screen full of thumbnail images, with a sidebar listing specimen categories that could be selected by the user to filter the image cloud. Gradually, the cloud would shrink as more filter terms were applied. Eventually, the user could zoom in on individual thumbnails for a high resolution view of a particular specimen.
Using machine learning to score features such as flower color or leaf shape would also help non-expert herbarium visitors find specimens of interest. Most of these characters are not part of specimen metadata, so they would not be part of the database record. If a machine learning algorithm could find specimens that look like other specimens, this would help non-expert users. A similar feature is used by Amazon, Netflix, and Spotify: you searched for this item/film/song, so you might like these alternatives that share similar characteristics.
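One common way to build a “specimens that look like this one” feature is to represent each image as a numeric feature vector and rank the other specimens by cosine similarity. The sketch below assumes such vectors already exist, for example from an image model; the three-number vectors and specimen identifiers are invented purely for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(query_id, features, k=3):
    """Return the ids of the k specimens whose vectors are closest to the query's."""
    query = features[query_id]
    scored = [
        (cosine_similarity(query, vector), specimen_id)
        for specimen_id, vector in features.items()
        if specimen_id != query_id
    ]
    scored.sort(reverse=True)
    return [specimen_id for _score, specimen_id in scored[:k]]


# Hypothetical feature vectors; real ones would have many more dimensions.
image_features = {
    "H-0001": [1.0, 0.0, 0.0],
    "H-0002": [0.9, 0.1, 0.0],
    "H-0003": [0.0, 1.0, 0.0],
    "H-0004": [0.0, 0.0, 1.0],
}
lookalikes = most_similar("H-0001", image_features, k=1)
```

The quality of such recommendations depends entirely on how well the feature vectors capture characters like flower color or leaf shape, which this sketch does not address.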
Visitors are often interested in specific groups of specimens for non-traditional uses, e.g., art, searching for specimens by their chemistry for medicinal uses, or understanding interactions with other trophic levels, e.g., insect pests.
The following quotes from meeting participants illustrate some of these points:
“visual piles … that can be manipulated and zoomed in on and zoomed out that can be created in all kinds of different ways based on dragging and dropping, or based on a search criteria. Some machine learning algorithm presents bins or piles … that you can save and share. I think that sort of mimics what you do in a [physical] collection that to my knowledge, you can’t really do easily with any online interface right now”
“Along those lines … being able to put things into different groups or bins or piles … for comparison”