Writing Women Back into History Through Wikidata

Brodie Hoare's Wikidata Fellow Report, 2023
Ali Smith. Keywords: Wikidata

In 2023 Wikimedia Australia partnered with Wikimedia Aotearoa New Zealand to offer several Wikidata fellowships. The fellows were asked to curate a data set, develop a prototype or undertake an investigation using Wikidata. Brodie Hoare, Online Collections Data Analyst for Tāmaki Paenga Hira, Auckland War Memorial Museum, investigated a proof-of-concept bot for batch-uploading to Wikidata. The dataset she worked on consisted of notable publications written by women from within the Museum's archives.


Here is her report:

Writing Women Back into History Through Wikidata

by Brodie Hoare

Online Collections Data Analyst for Tāmaki Paenga Hira, Auckland War Memorial Museum

Archival science, and the librarians and information professionals who have contributed to this discipline, can (to the uninitiated) summon solely images of dusty shelves, neglected filofaxes and boxes of forgotten tape cartridges. However, archiving can in fact be a powerful political act: it can be joyful, optimistic, radical, and it can even be... feminist.

The practice of archiving stories regarding women in media goes back to the 19th century. The purposes driving the study and creation of Women’s Archives are myriad: these collections are created to promote gender equality, to catalogue journalistic efforts to effectively “write women into history” (Severson, 2018), and to give women’s histories a “room of one’s own” in archival spaces that have historically neglected the stories of women (Mason & Zanish-Belcher, 1999). The inherently political nature of an archive is best captured by Freeland & Von Hodenberg (2023), who write: “The archive is not only a physical space where records are collected, cared for, labelled, ordered, and sometimes discarded. It is also a conceptual realm of narrative, memory, and memorialisation. To archive items means taking part in a process of shaping and evidencing histories that is entwined in epistemic hierarchies of power.” These conceptual realms of narrative, memory and memorialisation are not confined to physical archives. There has been a renewed push to raise the profile and visibility of women in history, this time extending those efforts into digital landscapes. Herein lies the opportunity to do so on the back of a well-developed, high-traffic history-keeping platform such as Wikimedia, along with its underlying data repository, Wikidata.

In 1955 Levin librarian Enid Roberts, along with her friend Betty Holt, began to collect and supply material on the "status and way of life of New Zealand women". The newspaper clippings and carefully curated magazine articles soon grew into a collection large enough to need more hands to care for it: Enid Evans, chief librarian at Auckland Museum, acquired the collection, seeing the value in preserving women's stories at a time when other research and archival institutions did not. Nearly 8000 articles and short biographies were amassed over the 30 years the archive was active, with a crowd-sourced curation strategy based on the submission of articles from the public, alongside weekly sorting and archival maintenance each Tuesday by the Women's Archive Committee. These records were painstakingly stored and catalogued, with little way of knowing what would become possible with the advent of the Information Age and the internet, and with only optimism guiding the effort to capture notable women at a time when gender precluded women from entering the history books. The leap of faith that all the women involved with this archive had to take, hoping that their collective (and considerable) efforts would have some impact in the future, is moving to say the least, and I jumped at the chance to bring this collection to Wikidata when the opportunity arose.

As most Wikimedians will know, "notability" is a prerequisite for the creation of articles on Wikipedia: evidence from a secondary source is needed to prove "significant coverage" of the person and/or topic. The women behind the Women's Archive had unwittingly created a data goldmine. Although these librarians had no idea at the time what Wikipedia or Wikidata, or even the internet, would be, their efforts ensured that these women's names would eventually be able to be recorded in the largest encyclopaedia in the world.

Every subject on Wikidata that has structured data is called an entity, and every entity has a page. Each entity’s page has a data model consisting of a label, a description, and finally statements, which are broken down into properties, property values and qualifiers. While constructing the data model for these ‘Person’ records, I couldn’t help but notice some possible pitfalls in representing the Women’s Archive data in Wikidata. For instance, names are recorded in statements using properties such as “name in native language”, “birth name”, “married name”, “given name” and “family name”. While cleaning my dataset I found instances of women who (due to a lack of detail in the secondary source) were recorded with just a surname, with no indication of whether they were married and had kept their own name, had taken their spouse’s name, or were simply unmarried with a maiden name. In these instances, it would perhaps be most appropriate to have a more neutral “surname” or “previous surname” property that didn’t indicate marital status – as surname seems to be used interchangeably with “family name”, I can see potential for confusion, since “family name” could also reasonably be assumed to describe the maiden name of the subject. Accurately reflecting the personhood, history and identity of women on Wikipedia seems to take a lot more intentionality than it does for people of other genders, for whom changing surname upon marriage has been less socially common. I can imagine that the current conceptualisation of the surname properties is intended to ensure accuracy of information. However, despite what may have seemed an innocuous decision in wording, it is these subtle, seemingly insignificant details that add another barrier to meaningfully representing women on Wikidata.
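As a rough illustration of the data model described above, the following Python sketch represents an entity as a label, a description and a list of statements, each with a property, a value and optional qualifiers. It is a simplified assumption rather than the bot or data model used in this project: the class names, the `surname_statement` helper and the example person are hypothetical, values are kept as plain strings (on Wikidata they are typed, for example items or monolingual text), and the property IDs should be checked against Wikidata before reuse.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Property IDs believed to match Wikidata; verify before reuse.
NAME_PROPERTIES = {
    "birth name": "P1477",
    "married name": "P2562",
    "family name": "P734",
}


@dataclass
class Statement:
    property_id: str                      # e.g. "P31" (instance of)
    value: str                            # simplified: plain string or Q-id
    qualifiers: Dict[str, str] = field(default_factory=dict)


@dataclass
class Entity:
    label: str
    description: str
    statements: List[Statement] = field(default_factory=list)


def surname_statement(surname: str, name_status: Optional[str]) -> Optional[Statement]:
    """Build a name statement only when the source says what kind of surname it is.

    Hypothetical helper: returns None when the status is unknown, so the
    record can be set aside for manual review instead of guessing.
    """
    if name_status is None:
        return None
    return Statement(NAME_PROPERTIES[name_status], surname)


# Hypothetical person, not taken from the Women's Archive.
person = Entity(label="Jane Example", description="hypothetical person for illustration")
person.statements.append(Statement("P31", "Q5"))  # instance of: human

# The source gives only a surname, with no hint of marital status.
unresolved = surname_statement("Example", None)
if unresolved is not None:
    person.statements.append(unresolved)
```

Returning `None` for an unknown status is one possible design choice: it keeps the ambiguity visible rather than silently filing the surname under a property that implies a marital status.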

Other examples of slight quirks I found include property values for “employment”: some of the recorded roles and employment or career positions simply are not the correct fit for this column, and had to be listed under the “significant event” property instead. However, it can be difficult to capture concepts such as “wife of a Methodist minister” or “first of three women to sign the Treaty of Waitangi”, and having to work within the value-based fields without the ability to abandon ship and use free text at times felt limiting. These archival records have not been digitised, so at this point the data will be the only source of truth for some of these women on the internet. It felt imperative to me to get it right, but I must admit there were instances where nuance was lost in the rigidity of the data model and its lack of accommodation for the state of occupation and employment before major women’s rights movements allowed women to enter the formalised workforce en masse.

Finally, a major encumbrance that I would be remiss not to mention is the restrictive nature of the “notability” prerequisite. Although my subjects from the Women’s Archive meet the requirements through the heroic efforts of Enid Roberts, Enid Evans, and every other woman from the Women’s Archive Committee, much will have been missed because the historic pieces they collected were limited to print media. Freeland & Von Hodenberg (2023) assert that “Stories, songs, images, activist spaces, and homes all needed to be taken seriously as sites of history if women’s stories were to be told”, and indeed trying to fit women into a database made up solely of subjects with secondary sources can feel like fitting a square peg into a round hole. They suggest that by “attending to those inhabiting the margins of the archive... historians became cognizant of its horizons, wary of its distortions, sceptical of its truth-claims”. Enid Roberts’s dream of recording a history of the "status and way of life of New Zealand women" will indeed have been distorted by the biases inherent in which women were represented in media at the time, and by whether their stories now fit neatly enough into pre-existing data models within the Wiki ecosystem. Although no solution is perfect, a project like this is far from done with the uploading of just one dataset – it is only by stress-testing and using existing systems, in order to fine-tune, improve and accommodate, that we can truly work towards accurate representation for every person on Wikidata no matter their gender, so that women can finally be written back into history.


Can History be Open Source? Wikipedia and the Future of the Past – Roy Rosenzweig Center for History and New Media. (n.d.). https://rrchnm.org/essays/can-history-be-open-source-wikipedia-and-the-future-of-the-past/

Freeland, J., & Von Hodenberg, C. (2023). Archiving, exhibiting, and curating the history of feminisms in the global twentieth century: an introduction. Women's History Review, 1–6. https://doi.org/10.1080/09612025.2023.2208401

Mason, K. M., & Zanish-Belcher, T. (1999). A room of one’s own: Women’s archives in the year 2000. Archival Issues, 24(1), 37–54. http://www.jstor.org/stable/41102006

Severson, P. (2018). The politics of women’s digital archives and its significance for the history of journalism. Digital Journalism, 6(9), 1222–1238. https://doi.org/10.1080/21670811.2018.1513336
