Research data are considered the primary result and output of scientific research, and sharing and reusing data are key aspects of the transition to open science at the European level. Still, many questions about research data management remain unanswered, such as which data might be shared and under what conditions, who might share them, or even what research data are (Borgman, 2012). Research data in the humanities are more diverse than in any other scientific discipline, because almost any record of human activity, such as newspapers or photographs, can be considered research data; as a result, the boundary between data and publication is very vague (Borgman, 2008; Thoegersen, 2018).
The paper aims to answer several research questions related to data sets in the humanities. The first research question asks what types of research data are represented in the humanities (1). No consensus on a definition has yet been reached in the literature, and "research data in the humanities" is often used as an umbrella term that covers different types of research sources. DARIAH-DE defines research data in the humanities as all those sources/materials and results collected, written, described, and/or evaluated in the context of research and a research question in the field of human and cultural sciences, and held in machine-readable form for the purpose of archiving, citation, and further processing. The purpose of this definition is to take into account the particular characteristics of humanities research and the resulting heterogeneity of the underlying data (DARIAH-DE, 2021). For this research, the authors adopt the DARIAH-DE definition of research data as the basis for determining the types of data represented in the humanities. The research covers the humanities fields of philosophy, theology, philology, history, art history, art science, archaeology, ethnology and anthropology, religious studies, and interdisciplinary humanities. Data sets published in institutional and thematic repositories in the humanities in Croatia and Europe were analysed using Zenodo, Digital Academic Archives and Repositories (DABAR), CLARIN, CROSSDA, and DARIAH, with the aim of determining the most prevalent data types in the humanities found in repositories. In their research on the storage of humanities research data in repositories, Buddenbohm et al. (2016) note that a culture of sharing and reusing research data has not yet been established; although research data are to some extent stored in repositories, they are difficult to find.
In light of this, the second research question examines the extent to which data sets in the humanities are represented in repositories in open access, and under which licence (2). The third research question examines the extent to which research data align with the FAIR principles (3). The FAIRness of data sets will be evaluated using the FAIR guiding principles of Wilkinson et al. (2016) and the Routledge Open Research data guidelines, which Grant (2022) discusses in relation to FAIR humanities data. In addition, based on the FAIR guidelines, the authors will provide a practical framework for managing research data in the humanities, which can be used to assess the quality of research data sets. The paper will present the types of research data in the humanities that are represented in repositories and assess their level of openness, licensing, and alignment with the FAIR data principles.
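
To illustrate what a checklist-based FAIRness assessment of a single repository record could look like, the following is a minimal sketch in Python. It is not the framework proposed in this paper: the checklist fields (`identifier`, `access_rights`, `metadata_schema`, and so on), the function name `fair_scores`, and the example record are all hypothetical and chosen only for illustration. The sketch simply reports, for each FAIR principle, the share of assumed metadata fields that are present and non-empty.

```python
# Hypothetical checklist: each FAIR principle is mapped to metadata fields
# that should be present and non-empty. The field names are assumptions
# made for illustration, not the criteria used in the study.
FAIR_CHECKLIST = {
    "findable": ["identifier", "title", "description", "keywords"],
    "accessible": ["access_url", "access_rights"],
    "interoperable": ["file_format", "metadata_schema"],
    "reusable": ["license", "provenance"],
}


def fair_scores(record: dict) -> dict:
    """Return, per principle, the fraction of checklist fields that are filled in."""
    scores = {}
    for principle, fields in FAIR_CHECKLIST.items():
        # A field counts as filled only if it exists and is not empty.
        filled = sum(1 for name in fields if record.get(name))
        scores[principle] = filled / len(fields)
    return scores


# Made-up example record; every value here is a placeholder.
example = {
    "identifier": "doi:10.0000/example",
    "title": "Digitised newspaper corpus",
    "description": "",  # empty, so it does not count as filled
    "keywords": ["history", "newspapers"],
    "access_url": "https://example.org/dataset/1",
    "access_rights": "open",
    "file_format": "text/xml",
    "license": "CC BY 4.0",
}

scores = fair_scores(example)
for principle, value in scores.items():
    print(f"{principle}: {value:.2f}")
```

A presence check of this kind can only flag missing metadata; judging whether a licence actually permits reuse, or whether a format is genuinely interoperable, would still require manual review against the guidelines cited above.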