Bringing the whole zoo to Wikidata
In 2022, Wikimedia Australia offered three $1000 (AUD) Wikidata Fellowships to curate a data set, develop a prototype or undertake an investigation using Wikidata, each supported by an experienced mentor. Fellow Dr Margaret Donald is a statistician and prolific Wikimedian who has coordinated Wiki Loves Earth in recent years. Supported by mentor Toby Hudson with Annie Reynolds, Margaret describes her project creating new mix’n’match catalogues to populate Wikidata with Australian animals.
When I ran Wiki Loves Earth 2021, I tried to determine how many Australian animals and plants had no images in Wikimedia Commons, only to discover that most Australian animals were not in Wikidata at all. A mix’n’match catalogue for the Australian Faunal Directory (AFD) is remedying this, but Wikidata held taxonomic information for only a small share of Australian fauna, just 9.73%.
I used my Wikidata Fellowship to create new mix’n’match catalogues and update older ones. As a result, 95% of the animals listed in the AFD are now in Wikidata. If an animal has an English Wikipedia article, anyone reading that page can see that the animal is found in Australia via the AFD item in the taxonbar, the information bar at the bottom of most faunal pages that links to various biological and taxonomic databases. So while everyone knows that koalas are native to Australia, they would not have known that Batrachomatus nannup (formerly Allomatus nannup) is too. This little predaceous diving beetle, which lives upstream of Nannup in Western Australia, also has pages in other language Wikipedias, including cebwiki, minwiki, nlwiki, svwiki and viwiki. Because viwiki embraces tools, Vietnamese readers can see that this beetle is Australian via the viwiki taxonbar, and they have a link to the original description (added via the automated citation manager cite Q).
A number of matched Wikidata items (perhaps more than 100) had no statement beyond the one recording that the item had an article (in plwiki, nlwiki, nowiki, …). (See anatomy of a Wikidata taxon item below for further context.) This meant that no bot had touched them since their creation, perhaps two or more years ago. However, the moment these items stated that they were instances of a taxon, they attracted considerable work by bots, making the information contained in the external databases more readily available. The significance of this contribution can be seen from the fact that 376 wikis hold 656,790 pages describing taxa with AFD identifiers; Cebuano Wikipedia has 100,599 of these pages and English Wikipedia 38,944 (queried 15 June 2022).
Hundreds of English Wikipedia articles have been updated with taxonbars that are no longer empty. Many authorities for these taxa have now been disambiguated, and the pages include links to their first descriptions. As a result of our work, we have been able to add hundreds of pre-existing articles to the Australian project by updating talk pages.
As of 15 June there were 159,242 AFD identifiers attached to QIDs (or Q numbers), the unique identifier of a data item on Wikidata, an increase of 24,179 from the start of the fellowship. Of these QIDs, 130,556 have an additional external id (a Wikidata property that corresponds to the taxon name/s), which is an increase of 4,514. This is due to fellowship work on the Insecta catalogue and the two new working catalogues we created for parts of the Interim Register of Marine and Non-marine Genera (IRMNG), and to Annie Reynolds’ persistent work putting up matching IRMNG and Global Biodiversity Information Facility identifiers.
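The figures above imply the starting counts and the share of AFD-linked items that now carry a second external identifier. A minimal sketch of that arithmetic follows; the variable names are illustrative, and the only inputs are the four numbers quoted in the text.

```python
# Counts quoted in the text (as of 15 June 2022).
afd_ids_now = 159_242          # QIDs with an AFD identifier
afd_ids_gain = 24_179          # increase since the start of the fellowship
with_other_id_now = 130_556    # of those, QIDs with an additional external id
with_other_id_gain = 4_514     # increase since the start of the fellowship

# Implied counts at the start of the fellowship.
afd_ids_before = afd_ids_now - afd_ids_gain
with_other_id_before = with_other_id_now - with_other_id_gain

# Share of AFD-linked QIDs that now have an additional external id (roughly 82%).
share_now = 100 * with_other_id_now / afd_ids_now

print(f"AFD identifiers at start: {afd_ids_before:,}")
print(f"With additional external id at start: {with_other_id_before:,}")
print(f"Share with additional external id now: {share_now:.1f}%")
```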
In all, I have now added 40,202 Australian Faunal Directory identifiers to Wikidata, with most of these being new items. This involved reconciling all taxon names against Wikidata to try to ensure that duplicate items were not added. Each new taxon added required at least five Wikidata statements, while for genera I uploaded six statements.
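The reconciliation step can be illustrated with a small sketch. This is not the workflow actually used in the fellowship; it only shows the general idea of checking candidate names against names already present before creating anything new. The function names, the placeholder QID and the made-up taxon "Examplus fictus" are all hypothetical.

```python
import re


def normalise(name: str) -> str:
    """Collapse runs of whitespace and lowercase a taxon name for comparison."""
    return re.sub(r"\s+", " ", name.strip()).lower()


def reconcile(candidates, existing):
    """Split candidate names into those already present and those that are new.

    `existing` maps taxon names already in the target database to their
    identifiers. Returns (matched, new): `matched` maps each candidate that
    was found to its identifier, and `new` lists each unmatched name once.
    """
    index = {normalise(name): qid for name, qid in existing.items()}
    matched, new, seen = {}, [], set()
    for name in candidates:
        key = normalise(name)
        if key in index:
            matched[name] = index[key]
        elif key not in seen:
            new.append(name)  # first time we meet this unmatched name
        seen.add(key)
    return matched, new


# Hypothetical data: the QID is a placeholder, not a real Wikidata identifier.
existing = {"Batrachomatus nannup": "Q-PLACEHOLDER-1"}
candidates = ["Batrachomatus  nannup", "Examplus fictus", "Examplus fictus"]
matched, new = reconcile(candidates, existing)
print(matched)
print(new)
```

Exact-name matching of this kind would miss synonyms (for example, a species listed under a former genus name), so a real reconciliation needs additional checks and human review.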
Thanks to Toby, I created csv-based mix’n’match catalogues, updated mix’n’match catalogues and downloaded them. And thanks to this fellowship, together we have made a major contribution to the representation of Australian biota in Wikidata.
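For readers unfamiliar with csv-based catalogues, the sketch below writes a tiny catalogue file using Python's standard csv module. The three column names (id, name, description) are an assumption about a typical import layout, not a statement of what mix’n’match requires, so check the tool's own import documentation before using it. The entry id and description are made up for illustration.

```python
import csv
import io


def build_catalogue_csv(entries):
    """Serialise catalogue entries to CSV text with a header row.

    Each entry is a dict with the (assumed) keys: id, name, description.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "name", "description"])
    for entry in entries:
        writer.writerow([entry["id"], entry["name"], entry["description"]])
    return buffer.getvalue()


# Hypothetical entry: the id is a placeholder, not a real AFD identifier.
entries = [
    {
        "id": "example-0001",
        "name": "Batrachomatus nannup",
        "description": "species of diving beetle, Western Australia",
    }
]
catalogue_text = build_catalogue_csv(entries)
print(catalogue_text)
```

Using csv.writer rather than joining strings by hand means that commas or quotes inside a description are escaped correctly.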
Anatomy of a Wikidata taxon item
This is a typical Wikidata entry for a taxon, showing four statements and one external identifier.