Open Science for Arts, Design and Music/Guidelines/Before


When you are starting a new project, it is important to design and plan the implementation of open access, open data and, more broadly, open science.

Phase 1: Planning and preparing research - Data Management Plan (DMP) for Arts, Design and Music / step-by-step "How to" guidelines[edit]

Prepare your Data Management Plan (DMP) for Arts, Design and Music

Data Management Plan (DMP) for Arts, Design and Music[edit]

The value of Data Management Planning[edit]

Even if the term data does not always resonate well with the communities of cultural practitioners, creative artists and scholars working in the arts domain, keeping a data management plan (DMP) as a living and evolving document throughout the lifetime of a research project is a useful practice. It gives you the opportunity to systematically think through, make decisions about and document the inputs, throughputs and outputs of the project, and to highlight their important provenance details, legal or ethical challenges or sources of uncertainty. It is a roadmap that facilitates a shared understanding within the project (even in the case of a solo project, such as writing a dissertation) of which resources will be used, curated and produced during the project; how to back them up and store them securely; how to select resources for publication and in which forms and formats to publish them; which outputs need long-term preservation and sustaining, and how to document them; and what the associated costs and efforts are.
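A DMP does not have to stay in prose form: the same decisions can also be captured in a structured, machine-readable way (so-called machine-actionable DMPs). The sketch below is purely illustrative and only loosely inspired by the RDA DMP Common Standard; the field names and values are assumptions to be adapted to your project and to your funder's template, such as the four sections of the SNSF DMP.

    import json

    # Illustrative machine-actionable DMP sketch; field names are assumptions,
    # loosely inspired by the RDA DMP Common Standard.
    dmp = {
        "title": "DMP for an illustrative arts research project",
        "datasets": [
            {
                "title": "Interview recordings with performers",
                "personal_data": True,        # triggers ethics and GDPR measures
                "sensitive_data": True,
                "preservation": "anonymised transcripts archived for 10 years",
                "distribution": {"access": "restricted", "license": None,
                                 "host": "institutional secure storage"},
            },
            {
                "title": "Digitised concert programmes",
                "personal_data": False,
                "sensitive_data": False,
                "distribution": {"access": "open", "license": "CC BY 4.0",
                                 "host": "Zenodo"},
            },
        ],
        "costs": [
            {"description": "Data curation and documentation", "type": "staff time"},
            {"description": "Long-term repository storage", "type": "monetary"},
        ],
    }

    print(json.dumps(dmp, indent=2))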

Data Management Plans as a funding requirement[edit]

Increasingly, keeping a DMP alongside a research project is becoming a condition of funding for many research funders, such as the SNSF or Horizon Europe.

Data Management Plans required by SNSF[edit]

Since October 2017, the submission of a data management plan (DMP) has been mandatory in most funding instruments. The SNSF also expects that data produced during the research work will subsequently be publicly accessible in digital databases, provided there are no legal, ethical, copyright or other clauses to the contrary (source). The SNSF DMP is relatively short: its expected length is about 2 pages of text in total, answering 10 questions, plus 2 checkboxes (source). It consists of four sections: (1) data collection and documentation, (2) ethics, legal and security issues, (3) data storage and preservation, and (4) data sharing and reuse.

You can find further information, FAQs, guidelines and contacts for support services under the following links: https://www.snf.ch/en/FAiWVH4WvpKvohw9/topic/research-policies

https://www.snf.ch/de/dMILj9t4LNk8NwyR/thema/open-research-data

Video support broken down section by section:

Data Management Plans required by Horizon Europe[edit]

Open Science for Arts, Design and Music/Guidelines/Horizon Europe/DMP

Data Management Plans templates[edit]

Checklist of the most important milestones throughout the research workflow that enable Open Access (and FAIR) sharing of research results and accompanying resources[edit]

  • Terms and conditions requested by grantmakers (examples from the SNSF, European research programmes)
  • Terms and conditions of your institutions (open access policy, regulations related to copyright, code of conduct, examples from the schools)
  • Terms and conditions of the partner organisations (examples from museums, other universities, NGOs, companies…)
  • Available data storage and other data infrastructure (cloud services such as NextCloud, data repository, OMEKA instance etc.)

Tips[edit]

  • Active involvement of all the project partners is key to a successful implementation of the DMP.
  • Doing research is usually not a linear process. If DMPs are expected to be updated throughout the project lifetime, it is perfectly fine to indicate uncertainties and decisions to be made later in the initial versions of the DMP.
  • Indicating the costs associated with the preservation, publication and sustaining of project outputs is a key component of DMPs. This includes the time and effort spent on, for example, documentation and preparing resources for publication and, in some cases, also monetary costs. This resource guides you through the cost estimation aspect of DMPs: https://zenodo.org/record/4518901#.YxXxw7RBxD8

Data Management Plans tools[edit]

You do not have to use specific tools when working on project DMPs, apart from an empty document, but there are DMP creator resources available that systematically guide you through the process, such as:

General approach[edit]

Open by default: use open licenses on all content where possible (CC0 for data; CC BY for texts, video and images; CC BY-SA for collaborative projects involving citizens) - OVERVIEW CHART of the CC licenses: which license allows what, and for which research scenario

  • Getting started with rights management FLOWCHART (ongoing project in the middle; backwards: identification of the rights holders, authorisation where needed etc.; forward: OA to the resulting publications, copyright retention and all the other OA-related issues that are to be negotiated with the publisher)
  • Include the rights management in the project (template)
  • Agreement with institutions - Heritage Data Reuse Charter (template)
  • Agreement with all the project team (including researchers, students, citizens, participants…)
  • Ethics of research and protection of privacy, sensitive data and GDPR

Checklist of the most important milestones throughout the research workflow that enable Open Access (and FAIR) sharing of research results and accompanying resources[edit]

  • (Re)using copyrighted materials
  • Identifying points in the workflow where sensitive data comes into the picture
  • Documentation and capturing resources, preparing them for publication
  • Publication decisions (see below under 3.2. and 4)

Phase 2: Discovering and selecting relevant source materials to reuse (cultural heritage resources and also academic publications as literature)[edit]

Using open content[edit]

Content I can use:

  • Content under open licenses;
  • Content in public domain;
  • My own content.

Where can I find open content (content I can freely use)[edit]

Find open content in open repositories

Open repositories[edit]

Open-access and free resources can be found in open repositories.

Repositories of research data[edit]

Repositories of art[edit]

Repositories for art works (in general)

Repositories for images

Repositories of music[edit]

Catalogs (collections of databases):

  • The MusoW database brings together openly available music and musicology resources from across the Internet. It serves as a catalogue of databases. You can browse it along different search criteria here: https://projects.dharc.unibo.it/musow/records
  • You can use the Audio, Video search of the ProQuest database
  • See a full list of music-related research data repositories here: https://www.re3data.org/search?query=music (the search results can be further refined by country and repository features)

Databases, collections (a small selection of MusoW resources):

Open-upon-request music resources from proprietary providers


Using content that is not openly available[edit]

Checklist

  • Who made the work I want to use - Authors
    • you
    • others
    • you and co-authors (joint ownership)
  • Is there already a license on the work
  • Where is the work I want to use - Inside an institution, in a private property, outside in public view
  • Reusing sources from cultural heritage institutions - a checklist (this one can serve as the basis and to be
  • Who commissioned the work - other rights or copyright owner
  • Did the production of the work involve other people, institutions, companies (publishers, company producing the work, music labels)
  • Are there contracts or agreements signed
  • Reuse agreement templates
  • Who is inside the work - Participants, actors, volunteers, citizens, children…
  • What is inside the work - Monuments, heritage sites, public art, other works…
  • Is the work a digital reproduction
  • How about the font?
  • Are you using texts from others? Is it a citation or more? - add the source and the attribution

Citation guidelines[edit]

  • Acknowledging all contributors involved in knowledge creation

Citations are hard currency in academia, a currency that you will never run out of as a re-user. This is a circumstance to keep in mind when deciding which contributions to your work to cite and which only to mention in a first footnote. The following guidelines provide step-by-step citation guidance: https://dariahopen.hypotheses.org/747. They cover both the perspective of the re-user, who wishes to keep a clear provenance trail and acknowledge all contributions and contributors to their work, and the perspective of the creator/author, who wants to make sure that their work is properly cited.

If you work in a team, adding the CRediT (Contributor Roles Taxonomy) to your work can make role distributions and responsibilities regarding your output clear.

  • Citing digital resources that are not papers or books (with examples)

Although scholarly work in the 21st century goes way beyond what can be placed on a bookshelf, in reality such digital scholarly objects are still largely out of sight of research evaluation and recognition. Giving them proper citations is a first, essential step to change this for the better. An obvious golden rule is to follow the "Cite as" information on the landing page. In many cases, digital tools or software are still "gift-wrapped" into a research paper, as in the following case: Csaba Oravecz, Tamás Váradi, Bálint Sass: The Hungarian Gigaword Corpus. In: Proceedings of LREC 2014, 2014.

In the absence of a "Cite as" notice, you can follow the examples below. Importantly, always add a Persistent Identifier to the citation where possible, so that the citation can be tracked.

Citation templates for digital scholarly content types:

  • Blog posts. Schema: [Authors], "Title" in [Name of the blog], [Date], [URL]. Example: Erzsébet Tóth-Czifra, "10 practical tips to fight against the culture of non-citation in the humanities," in DARIAH Open, 29/02/2020, https://dariahopen.hypotheses.org/747.
  • Data (in a broad sense). Schema: [Creators]. (Year of creation). [Title] (version number, if relevant) [Content type] [Name of the repository], [URL, including the Persistent Identifier]. Example: Toscano, Maurizio, & Aitor Díaz. (2020). Mapping digital humanities in Spain - 1993-2019 (v1.0). [Data-set] Zenodo. https://doi.org/10.5281/zenodo.3893546
  • Corpus. Schema: [Creators]. (Year of creation). [Title] (version number, if relevant) [Content type] [Name of the repository], [URL, including the Persistent Identifier]. Example: Truan, Naomi. 2016. Parliamentary Debates on Europe at the House of Commons (1998-2015) [Corpus]. ORTOLANG (Open Resources and TOols for LANGuage). https://hdl.handle.net/11403/uk-parl
  • Software. Schema: [Creators]. (Year of creation). [Title] (version number (important!)) [Content type] [Name of the repository], [URL, including the Persistent Identifier]. Example: Strupler, Néhémie. (2018). Project Panormos Archaeological Survey: Data Visualisation Code (survey-analysis) (0.1.0). Zenodo. https://doi.org/10.5281/zenodo.1185024
  • Training material. Schema: [Creators]. (Year of creation). [Title] (version number, if relevant) [Content type] [Name of the repository], [URL, including the Persistent Identifier]. Example: Marilena Daquino (2022). Polifonia - Making sense of musical heritage on the web. Version 1.0.0. DARIAH-Campus. [Webinar recording]. https://campus.dariah.eu/id/oPe9gFztuJQhQYwGhOtJF
  • Digital cultural heritage resource. Schema: [Creators], (Date). [Title] [Holding institution, collection, subcollection as detailed as possible] [Found in (if relevant)] [URL, including the PID or URL + PID]. Example: Gertrude Käsebier, 1905. Happy Days. [Photograph.] Museum für Kunst und Gewerbe Hamburg, Sammlung Juhl. Found in Europeana Collection 2048429_Ag_DE_DDB_MKG. CC0, Source: https://www.europeana.eu/portal/en/record/2048429/item_36FQJ2YT6YSX4LYPEPBEBVEFT6P6G23G.html?q=%22Gertrude+K%C3%A4sebier%22#dcId=1583140820017&p=2

Digital assets are worth citing even if not all components of the citation data are known. For instance, we do not always know the creators of certain research tools. In this case the citation can look as follows: OpenCOLLADA. [Research tool] Retrieved Oct 14, 2022 from https://marketplace.sshopencloud.eu/tool-or-service/5mpQoK
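If the resource has a DOI, you can also retrieve a ready-made, formatted citation for it through DOI content negotiation, a service supported by the DOI registration agencies (Crossref and DataCite). A minimal sketch, using the Zenodo data set from the table above; the citation style is just an example and any other CSL style name can be requested:

    import requests

    # Ask the DOI resolver for a formatted citation instead of the landing page.
    doi = "https://doi.org/10.5281/zenodo.3893546"  # data set from the table above
    headers = {"Accept": "text/x-bibliography; style=apa"}

    response = requests.get(doi, headers=headers, timeout=30)
    response.raise_for_status()
    response.encoding = "utf-8"  # the citation is returned as UTF-8 plain text
    print(response.text.strip())
    # e.g. "Toscano, M., & Díaz, A. (2020). Mapping digital humanities in Spain - 1993-2019 ..."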

  • How does all this translate to citation styles?

Needless to say, all citation styles have their own detailed guidelines covering a rich variety of content types to be cited. By using the open source reference management software Zotero, you can convert and format all your citations from one style to another with one click, or by following its manual.

Phase 3: Producing Open content[edit]


Checklist to produce open content (visual checklist)[edit]

  • I have checked the legal and ethical conditions of sharing and have an agreement with all people and institutions involved - the license is included in the project, we have signed an agreement, or the grantmaker already imposes the license
  • I have selected which resources are to be shared, checked for sensitive information and anonymised where needed --> see the section Preparing sensitive data and resources for publication below. Even if not all your resources can be shared, or shared openly, indicate deletions, closed access locations and the reasons why certain parts cannot be (openly) shared in the documentation → see the ‘Levels of accessibility’ table below. Metadata (i.e. the description of your resources) should be openly available.
  • I have added the open license on all content (except where differently noted)
  • I have selected a data repository for my resources --> link to the repository finder. If you wish to publish them on multiple platforms (e.g. both on your institutional website and in a repository), make sure to interlink these platforms.
  • I have described my resources either in rich metadata or in a README file, including file structure, provenance information, contributors, circumstances of collection, limitations, license, ‘cite as’ information etc. (see the README sketch after this checklist)
  • I documented the software environment in which I’ve carried out the enrichment/analysis process
  • I have cited all sources and added attributions to all content also in public domain (author, title, date, license, photographer, owner)
  • I have made my content available in open formats - this is especially important if you are working in a proprietary software environment
  • I have added how I want my content to be cited/attributed
  • I have made available this content on compatible repositories
  • I have extracted open content and uploaded it on open repositories
  • The journal I have selected is Open Access
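As a complement to the README item above, the minimal sketch below generates a README skeleton covering the fields listed in this checklist. The file name and the field list are only suggestions; adapt them to your project and your repository's conventions.

    from datetime import date
    from pathlib import Path

    # Section headings taken from the checklist above; extend or trim as needed.
    fields = [
        "Title",
        "Creators / contributors (with roles)",
        "Date and circumstances of collection",
        "File structure and formats",
        "Provenance of source materials",
        "Known limitations",
        "License",
        "Cite as",
    ]

    lines = [f"README (created {date.today().isoformat()})", ""]
    for field in fields:
        lines += [field, "TODO: fill in", ""]

    Path("README.txt").write_text("\n".join(lines), encoding="utf-8")
    print("Wrote README.txt with", len(fields), "sections to fill in.")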

Ownership and authorship of content created with and by Citizen Scientists[edit]

Involving the public in research projects can take many forms, from opening up research agendas for local communities or society at large to shape and set priorities, through the many forms of co-creation, to simply relying on user-generated content as research data, such as book reviews on Goodreads. These collaboration dynamics define good practices in dealing with intellectual property issues.

To start with the scenario that falls closest to simply reusing resources generated by third parties, the following guide on copyright and user-generated content is a good starting point.

For more co-creation based scenarios, you can find advice and good examples regarding how to decide on sharing ownership and authorship as well as their ethical entailments in the PARTHENOS Citizen Science training module here and here.

Phase 4: Preparing data, resources, research outputs for publication[edit]


Preparing sensitive data and resources for publication[edit]

Working digitally, scholars face a complex set of legal issues and ethical dilemmas whenever they want to store, use, publish and share data collected from or about human participants. On top of such dilemmas, researchers are also challenged by the perceived conflict between the open research culture and the proliferation of ethical review procedures and legal requirements for data protection. In OS-ADM, we see the doctrine "as open as possible, as closed as necessary" and responsible research integrity (including ethical conduct) as different sides of the same coin: a set of key values of present-day knowledge creation, such as fairness, transparency, equality and increased rigour and accountability in scholarly activities. The resources below guide you through the identification of sensitive data, the associated changes in the research workflow, the selection of data/resources for publication, anonymization, pseudonymization and other good practices in data protection that help ensure the strictest ethical and legal conduct but also maximize the proportion of data/resources that are openly shareable.

  • The OpenAIRE "How to deal with sensitive data" guide gives an overview of what qualifies as sensitive data and how to prepare sensitive data for storage and sharing.
  • The "Protect" chapter of the CESSDA Data Management Expert Guide covers protocols for ethical review, processing personal data, anonymization and collecting informed consent. The guide had been written with Social Scientists in mind but is useful for anyone working with personal data.
  • The DARIAH ELDAH Consent Form Wizard supports arts and humanities researchers in obtaining GDPR-compliant, valid consent for data processing in the context of their specific professional activity.

This tool will guide you through a questionnaire that will consequently generate a GDPR-compliant form for obtaining consent from data subjects, tailored to your specific purpose and the data categories you intend to collect.

  • Good practices coming from arts disciplines
  • summary chart on the most frequent scenarios plus their conduct throughout the research workflow
  • Publishing closed access in institutional repositories or under embargo --> link to the rainbow chart
  • Institutional national data support services in Switzerland


Where to store and publish my research outputs[edit]

Where to store and publish my research outputs during the project (storage and backup)[edit]

During the project's lifetime, while your collaboration and the research process are still active and your outputs are still in the making, it is best to store resources in a shared space with authentication and authorisation protocols in place, from where all authorised contributors can access, modify and version them. In most cases, research performing organisations and their IT support have such cloud-based, networked drives in place, offering ample storage space, data security and automatic backups for most purposes. These can be proprietary services like Microsoft OneDrive or Google Drive, or open source ones such as NextCloud or ShareDocs. If you work with sensitive data - for example, personal data or copyrighted materials - it is worth enquiring with your institution's research support staff whether your intended storage solution meets your institution's data security policy.

Where to store and publish my research outputs once research outputs are ready to be shared[edit]

Taking full advantage of the available technologies and data sharing mandates, we can open up a much wider part of our research process than what is published in the final book, book chapter, research paper or essay. Sharing resources beyond publications in the traditional sense can take place in multiple ways and on multiple platforms: by publishing your outputs on your institutional or project website, in an institutional or thematic repository, or as part of shared, global and structured databases and knowledge graphs (link), such as Wikidata. (A flowchart comes here too.)


Data repositories[edit]

Why data repositories?[edit]

Repositories are the safest homes for research outputs: they provide long-term availability guarantees, secure access protocols, standard licensing and versioning options, and come with many benefits compared to simply publishing your research outputs or artefacts on a website or leaving them in a GitHub repository. By sharing your resources in a trusted repository,

  • You can make sure that others (humans and machines alike) can find your work, beyond accidentally stumbling across it on your project website. Repositories usually have their own discovery platforms and are harvested by other scholarly databases.
  • You can make sure that your data will be preserved over a long term, and therefore remain available for future research or verification.
  • You make your data uniquely identifiable and citable and visible for machines and information management systems too. Assigning a persistent identifier to your data will point to their exact location, even if that changes over time - no broken links! You can read more about the value of persistent identifiers below.
  • You add rich metadata and a standard license to your work when publishing them in a repository so that reuse conditions are clear.
  • You can control who has access to your outputs by using the authentication control and - if needed - embargo features of the repository.
  • You have the possibility to add different versions of your outputs and clearly indicate the latest one.
  • You can clearly define how to cite your work, or can follow the citation standards of the repository.
  • Sharing your data in a repository inevitably brings quality improvement. By depositing them, you will clean them, document them to make them understandable for third parties and add the metadata necessary for their deposit.
  • Last but not least: you comply with funder requirements as they usually require sharing resources associated with the research project in a trusted repository.

(adapted from here: https://campus.dariah.eu/resource/posts/dariah-pathfinder-to-data-management-best-practices-in-the-humanities)

If your research team still wishes to display the project outputs and resources on an institutional or project website, that is of course also possible. For that, consult the "What if I wish to display them elsewhere too (i.e. on a project/institutional website)?" section below. (to link!)

How to select the most suitable repository for my resources?[edit]

Depending on the scope of your project and the volume of your data, depositing can be a complex, iterative process with a strong emphasis on precise data quality assessment, but it can also be simple and easy. As a first step, check which kinds of support structures are available in your institution. If your institution has solid data repository facilities and/or data stewards who can help your work, you are lucky! It is well worth involving them from the project planning phase onwards and discussing the solutions that work well for your research and are also compliant with the standards, specifications and protocols along which the repository operates. Repository staff can also assist you with understanding any specific data management requirements and associated costs. If your funder requires a data management plan, this information should be included in it.

In deciding where to store your data, you may have a number of choices about who will look after it and how its findability and potential can be maximised.

Where can I find repositories for my resources?[edit]

You can use generic repositories such as Zenodo, which will take almost any data set, but you can also use more specific ones such as the Swiss National Data and Service Center for the Humanities (DaSCH) or the TextGrid Repository, which primarily serves text-based humanities disciplines but also contains arts and music resources. If your institution has data sharing policies, it is advisable to use your own institutional repository. The main standard for assessing the quality of data repositories is the CoreTrustSeal certificate. Choose CoreTrustSeal-certified repositories to make sure that your data will remain available in the future in a secure, sustainably maintained and curated environment.

You can find suitable research data repositories that best match the technical or legal requirements of your research data in the re3data.org database, which you can browse by country, discipline, CoreTrustSeal compliance and many other parameters.

(adapted from here: https://campus.dariah.eu/resource/posts/dariah-pathfinder-to-data-management-best-practices-in-the-humanities)

Where can I find repositories for legal copies of my publications (pre-prints or post-prints)?[edit]

If you are looking for repositories for your publications (pre-prints and post-prints), you can find the most suitable one via the Registry of Open Access Repositories.

Where can I find repositories for research software?[edit]

Serving as an environment for experiments, visualisations, installations or research analysis, research software can be a valuable research output in its own right and, as such, is also worth sharing alongside other resources. Currently, there are two major repositories in Europe that are committed to software archiving, preservation and citation: Zenodo and Software Heritage.

1.) Sharing and archiving research software via Zenodo:

Relevant GitHub repositories can easily be connected to the project’s Zenodo collection to deposit software releases from GitHub to Zenodo, together with the provision of appropriate metadata, providing contextual information about the software. Zenodo mints DOIs for each released version of the software, and also creates a concept DOI which refers to all versions of a given software. This way, a PID will be assigned to all versions and specific deployments of source codes. This resource guides you through how to set up automated or semi-automated Github --> Zenodo or GitLab --> Zenodo exports: https://genr.eu/wp/cite/#Authorise
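Besides the GitHub integration described above, deposits can also be made directly through the Zenodo REST API. The sketch below outlines the documented create-upload-describe-publish flow; the access token, file name and metadata are placeholders, and it is worth experimenting against the Zenodo sandbox (sandbox.zenodo.org) before making real deposits.

    import requests

    TOKEN = "YOUR-ZENODO-TOKEN"  # personal access token from your Zenodo account settings
    BASE = "https://sandbox.zenodo.org/api"  # switch to https://zenodo.org/api for real deposits
    params = {"access_token": TOKEN}

    # 1) Create an empty deposition.
    deposition = requests.post(f"{BASE}/deposit/depositions", params=params, json={}).json()
    dep_id, bucket = deposition["id"], deposition["links"]["bucket"]

    # 2) Upload the release archive into the deposition's file bucket.
    with open("my-software-0.1.0.zip", "rb") as fp:
        requests.put(f"{bucket}/my-software-0.1.0.zip", data=fp, params=params)

    # 3) Describe the deposit with metadata.
    metadata = {"metadata": {
        "title": "My research software",
        "upload_type": "software",
        "description": "Analysis code for the project (illustrative example).",
        "creators": [{"name": "Doe, Jane", "affiliation": "Example University"}],
    }}
    requests.put(f"{BASE}/deposit/depositions/{dep_id}", params=params, json=metadata)

    # 4) Publish: Zenodo mints a version DOI and a concept DOI covering all versions.
    requests.post(f"{BASE}/deposit/depositions/{dep_id}/actions/publish", params=params)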

2) Sharing and archiving research Software via Software Heritage

Software Heritage is another publicly funded, European open archive that harvests all public GitHub repositories to ensure long-term availability, traceability and citability of source codes of research software. You can get started here: https://www.softwareheritage.org/faq/


Storing, publishing large volumes of data[edit]

Criteria to select a data storage[edit]

The volume and granularity of your data define the optimal storage methods and locations, both during and after the analysis phase. To make an informed decision, it is worth keeping a few parameters in mind, such as:

  • The size and nature of your data (mega- or gigabytes vs. tera- or petabytes)
  • Whether the data is hot or cold (i.e. whether it belongs to an ongoing or a concluded research activity)
  • Number of partners who need simultaneous access
  • Desired types of access to the data (through landing pages or APIs, or a SPARQL endpoint etc.)
  • Privacy concerns
  • Optimal transfer and retrieval time
  • The costs of storage
  • Length of storage

Institutional cloud storage for actively curated data[edit]

During the active phases of the project, a good practice is to use one’s institutional cloud storage solutions (this can be an institutional OneDrive, Dropbox or other solutions) to store and backup large volumes of data.

Data repositories and limitations in terms of volume[edit]

When it comes to the publication and long-term archiving of research data, institutional, thematic or generic data repositories usually have an upper limit on the size of each data record. The default generic data repository, Zenodo, caps this at 50 GB per data record, with the possibility of splitting larger datasets into smaller units deposited as individual data records (https://help.zenodo.org/), while the Swiss National Data and Service Center for the Humanities (DaSCH) repository service, which is free of charge for national research projects or those with Swiss participation, requires annual cost sharing by the project or its hosting institution for data volumes exceeding 500 GB (https://www.dasch.swiss/pricing).

Dedicated storage for large volumes of data[edit]

In some cases, institutional, national or thematic data centers offer storage and archiving services specifically for projects that require large quantities of research data (typically terabytes), usually alongside supercomputing facilities. These offer a bigger data container than the average data record unit of a data repository. A case in point from the arts and humanities domain is the French Huma-Num Box (https://documentation.huma-num.fr/humanum-en/). It offers secure and long-term storage for data sets, mainly large ones (several hundred terabytes in total). The device uses magnetic disks and magnetic tapes to store data. Data deposits can be either "warm" or "cold", including digitised cultural heritage collections, photos, audio recordings, maps, videos and 3D models. Importantly, and to enhance the discoverability, reusability and general user-friendliness of the deposited data volumes, such Huma-Num Boxes can be easily connected to web-based publishing and web application systems such as Omeka or IIIF. A full description of the service is available (in French) here: https://documentation.huma-num.fr/humanum-box/ For further assistance, please contact: assistance@huma-num.fr.

For further reading, see: Nicolas Larrousse, Joël Marchand. A Techno-Human Mesh for Humanities in France: Dealing with preservation complexity. DH 2019, Jul 2019, Utrecht, Netherlands. ⟨hal-02153016⟩

What if I wish to display them elsewhere too (i.e. on a project/institutional website)?[edit]

If your research team still wishes to display the project outputs and resources on an institutional or project website, that is of course also possible. An emerging good practice is to publish them via repositories and then link them back on the respective website. You can see an example here: https://arkeogis.org/home/ or here under 'Datasets': https://p3.snf.ch/project-179755 . This way, you can avoid the confusion and inaccuracies caused by generating multiple copies on multiple platforms.


  • Overarching questions nr. 2. OVERVIEW CHART of the CC licenses - which license what is allowed, which research scenario - to be moved to somewhere else?
  • Overarching questions nr. 3: documentation essentials - to be moved to somewhere else?

Open Science for Arts, Design and Music/Guidelines/Data publications such as Wikidata


Sharing data on social media - dos and don'ts[edit]

Regardless of where you share your data, the same legal and privacy concerns listed under "Phase 3: Producing Open content" and "Phase 4: Preparing data, resources, research outputs for publication" are to be kept in mind and adhered to. For instance, if you share your data under a CC BY 4.0 license, the same reuse conditions apply everywhere, regardless of whether you publish them through a trusted repository or Wikidata, on a random website on the Internet, on social media, or hand them to someone on a USB stick. However, in chapter 4.2, Where to store and publish my research outputs, we saw that in terms of findability, accessibility and sustainability, selecting a publication venue has enormous consequences. In a similar vein, in this section we touch upon the possible ethical implications of selecting a publication venue for your data or other scholarly content, with a special focus on social media.

Sharing your work on social media is repeatedly suggested in advocacy workshops, especially those addressing early career researchers, as a way to make a bigger research and societal impact, improve the outreach of your work and gain more recognition. Below you can find pointers and checklists that will help you to do so in a sustainable and fair manner.

1. Check the basics

- Are you the copyright holder of the materials to be shared? If not (because they are owned by your institution or another third party), does the license allow open sharing?
- Are all the legal, GDPR-related and ethical barriers listed above in Phase 4 sorted out? 

2. Ownership is crucial: prefer decentralised and community-controlled platforms over proprietary black boxes

Recently, the change in the ownership and curation policies of Twitter and the associated global issues and controversies eloquently showcased how exposed a critical mass of users, including scholarly communities, is to such proprietary platforms, and urged societies at large to look for more transparent, more community-controlled and fairer alternatives. The fedigov movement (collective?) raises awareness of these issues in an easily accessible manner and offers more sustainable alternatives to the most commonly used social media platforms.

3. Linking instead of copying

The most sustainable way of sharing your work on social media is simply to link to its Persistent Identifier or Wikidata ID instead of republishing it. This allows you to keep versioning clear and even to track its citation and usage metrics.

4. How about academic social media platforms?

The popularity of platforms like ResearchGate and Academia.edu seems to have remained steady over the years, despite the fact that the same ownership and transparency issues define their operation as we saw with Twitter. This blog post explains it all and gives you ideas for alternatives. In addition to the latter, we recommend discovering Scholia, a scholarly discovery service that creates visual scholarly profiles for topics, people, organizations, species, chemicals, etc., using bibliographic and other information in Wikidata.
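Scholia builds these profiles from Wikidata, which you can also query yourself through its public SPARQL endpoint. A minimal sketch, which lists scholarly articles linked in Wikidata to an author identified by an ORCID iD (the example iD is the one used in the PID table below; how many results you get depends entirely on what has been recorded in Wikidata):

    import requests

    # P496 = ORCID iD, P50 = author, P31 = instance of, Q13442814 = scholarly article.
    query = """
    SELECT ?article ?articleLabel WHERE {
      ?person wdt:P496 "0000-0001-7794-0218" .
      ?article wdt:P50 ?person ;
               wdt:P31 wd:Q13442814 .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    LIMIT 10
    """

    response = requests.get(
        "https://query.wikidata.org/sparql",
        params={"query": query, "format": "json"},
        headers={"User-Agent": "open-science-guidelines-example/0.1"},
        timeout=60,
    )
    for row in response.json()["results"]["bindings"]:
        print(row["articleLabel"]["value"], "->", row["article"]["value"])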

5. Academic blogs

Solid evidence suggests that blogging about your research not only makes your work (often ongoing work) easily accessible to audiences of different kinds, but also improves your academic writing. The Hypotheses platform offers blog spaces free of charge to academics and brings them together in a catalogue broken down by language and topic.


What are Persistent Identifiers and why are they important for visibility, citability of my research?[edit]

What is a PID and what is the value of PIDs?

PIDs are sometimes described as a social security number for a research object (source). They ensure the unambiguous identification and secure location of research outputs even if their associated URL changes over time. PIDs and the metadata associated with them are visible to machines and humans alike and help both to describe the type of resource, where to find it, and how to reuse it. It is not only research outputs such as publications or data sets that can have PIDs assigned, but also content creators (scholars), organisations or even funding bodies. With their help, we can persistently link and connect articles with the underlying data, software and funding information across the research lifecycle, support the reproducibility of research and maintain the provenance footprint of each output across all its versions.

PIDs that are most frequently used in scholarly communication (a non-exhaustive overview):

  • DOI. Description: DOIs are digital identifiers for objects (whether digital, physical or abstract) which can be assigned by organisations in membership of one of the DOI Registration Agencies; the two best known ones are CrossRef, for journal articles and some other scholarly publications, and DataCite, for a wide range of data objects. As well as the object identifier, DOI has a system infrastructure to ensure a URL resolves to the correct location for that object (source). Typical content types: most typically research articles and books, but also other digital content types. Where to get one: (1) from publishers with CrossRef or DataCite membership; (2) from institutional libraries with DataCite membership; (3) from data repositories. Examples: paper: https://doi.org/10.3390/app12052426; book: https://doi.org/10.11647/OBP.0192; book chapter: https://doi.org/10.11647/obp.0192.05; data set: https://doi.org/10.5281/zenodo.3893546
  • ORCID iD. Description: Your ORCID iD is a unique, open digital identifier that distinguishes you from every other researcher with the same or a similar name to you (source and further reading). Typical content type: persons. Where to get one: from ORCID. Example: https://orcid.org/0000-0001-7794-0218
  • HDL (Handle). Description: Handles are unique and persistent identifiers for Internet resources, with a central registry to resolve URLs to the current location. Each Handle identifies a single resource and the organisation which created or now maintains the resource. The Handle system also underpins the technical infrastructure of DOIs, which are a special type of Handle (source). Typical content types: data sets (finished or unfinished) and other digital scholarly objects. Where to get one: from repositories or, to implement a Handle system, directly from the Handle.Net Registry. Example: https://hdl.handle.net/11378/0000-000B-BE89-5
  • ARK. Description: ARK is an identifier scheme conceived by the California Digital Library (CDL), aiming to identify objects in a persistent way. The scheme was designed on the basis that persistence "is purely a matter of service and is neither inherent in an object nor conferred on it by a particular naming syntax" (source). Typical content types: archival resources, data sets at all levels of granularity (large collections vs. pieces of a single document). Where to get one: from repositories (DaSCH also uses ARKs as PIDs); to implement an ARK system, contact the California Digital Library (CDL). Example: data set: http://ark.dasch.swiss/ark:/72163/1/081C
  • Wikidata ID. Description: Each Wikidata entity is identified by an entity ID prefixed with Q (e.g. Q12345); properties are prefixed with P (e.g. P569) and lexemes with L (e.g. L1) (source). Typical content types: entities, properties, lexemes. Where to get one: Wikidata. Example: https://www.wikidata.org/wiki/Q1593269

The overview above does not include other PID types that are primarily used by the library and archival domain, such as URNs, PURLs or VIAF IDs, or PIDs that are specifically designed for one preprint repository, such as the arXiv ID or the HAL ID.

For a more detailed guide (with decision trees) to selecting PIDs for different content types, with different budgets etc., see: https://zenodo.org/record/4192174#.Y0bwa0xBy00

PID graphs[edit]

One of the most powerful aspects of PIDs is that they enable persistent and machine-readable linking between different entities, expressing different relationships within the research landscape, such as: linking publications with underlying datasets, source materials, software or other relevant digital outputs; linking authors to their publications (see ORCID iDs); or linking research funders to the projects or outputs they are funding. Or even all at once! You can see examples here: https://github.com/datacite/freya/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+label%3A%22PID+Graph%22++label%3A%22user+story%22+ or embedded in a discovery service here: https://dariah.openaire.eu/search/publication?articleId=od_______166::2f23ea37884763e88efff618d3f33688
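One way to see such links in practice is to ask the PID infrastructure itself: DataCite's public REST API returns, for every DOI it has registered, the related identifiers recorded in its metadata (versions, datasets, software, supplements and so on). A minimal sketch, using the Zenodo data set DOI cited earlier in this guide; which links come back depends on what the depositor recorded:

    import requests

    doi = "10.5281/zenodo.3893546"  # data set DOI used as an example earlier in this guide
    record = requests.get(f"https://api.datacite.org/dois/{doi}", timeout=30).json()

    attributes = record["data"]["attributes"]
    print("Title:", attributes["titles"][0]["title"])
    # Each related identifier is one edge of the PID graph around this data set.
    for related in attributes.get("relatedIdentifiers", []):
        print(related["relationType"], related["relatedIdentifierType"], related["relatedIdentifier"])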


Further reading:

  • https://de.wikipedia.org/wiki/Persistent_Identifier
  • https://en.wikipedia.org/wiki/Persistent_identifier
  • https://support.orcid.org/hc/en-us/articles/360006971013-What-are-persistent-identifiers-PIDs-
  • https://www.dpconline.org/handbook/technical-solutions-and-tools/persistent-identifiers
  • https://de.dariah.eu/en/persistent-identifiers
  • https://www.pidforum.org/t/why-use-persistent-identifiers/714/4
  • https://dariahopen.hypotheses.org/1210

Phase 5: Open Access publications (articles, books, book chapters, pre-prints)[edit]

What kind of publication[edit]

Models of open access books[edit]

Books are publications which include edited volumes, monographs and exhibition catalogues.

The network specialised in open access books is the Open Access Book Network, which provides resources for publishers, librarians, authors and funders.

The Open Access Book Toolkit provides information about open access book publishing, organised along the following stages:

  • Planning and Funding
  • Conduct Research
  • Consider Publishing Options
  • Write & submit manuscript
  • Peer review
  • Book contract and License
  • Book is published & disseminated
  • Research is reused

To make academic books available in open access, you can refer to OAPEN: Online Library and Publication Platform.

OpenEdition Books is a web platform for books in the humanities and social sciences.

The Directory of Open Access Books (DOAB) is a directory where you can find books published in open access.

Models of innovative publications accommodating multimedia in open access[edit]

For more publishing innovations from arts and humanities fields, check:

Creating a self-publication in open access[edit]

Articles and books can be self-published. Self-publications can be created and shared on:

Publishing an open access scholarly blog[edit]

Hypotheses hosts academic blogs. Researchers can apply to open a Hypotheses blog.

Hypotheses is a platform managed by OpenEdition which provides electronic resources and academic information in the humanities and social sciences.



How to fund Open Access?[edit]

A fundamental difference from the traditional closed access (or paywalled) models is that when publishing Open Access, the costs of publication and dissemination are moved from the demand/reader side to the supply/author side. This, however, does not mean that authors need to pay Article or Book Processing Charges out of their own pockets.

In reality, these costs are covered either by funders of external research grants (such as Horizon Europe or the Swiss National Science Foundation) or by research performing institutions. In the former case, the costs of Open Access publications are included in the budget of the funded project (see the example of the SNSF here). In the latter case, universities and other research institutions have transformative agreements in place (see the Swiss examples above) and, in addition to them, institutional Open Access publication funds. This, however, does not always fully cover the Open Access publication of each and every affiliated or loosely affiliated scholarly work. To mitigate gaps in the funding landscape, organisations like DARIAH ERIC have dedicated Open Access book publication grants in place. In yet another funding model, probably the most sustainable one, academic institutions and their libraries pay into a shared pot to collectively fund Open Access publishing venues that are important for their communities and where their authors are free to publish. This way, these institutions take back ownership of and control over the publishing infrastructure instead of simply paying publication prices defined by for-profit publishers. The Open Library of Humanities is probably the best-known example of these. [Swiss examples to be added.]

To get up-to-date information on the Open Access funding available to you or your research team, we recommend contacting your institutional library or your national research funder (SNSF, DFG, NWO etc.).

In many cases, scholars without institutional affiliation and project funding or researchers coming from lesser resourced countries can apply for APC waivers.

For books, the OAPEN Open Access Books Toolkit curates a regularly updated list of available funding for the Open Access Publication of academic books across Europe here: https://oabooks-toolkit.org/lifecycle/10944589-planning-funding/article/9012512-overview-of-available-funding

Further information about Open Access book funding is available on the dedicated discussion board of the Open Access Book Network here: https://hcommons.org/groups/open-access-books-network/forum/topic/developments-for-open-access-book-funding-policies/


Open Access funding opportunities in Switzerland[edit]

Information, funding opportunities and an Open Access journal finder for Swiss scholars are available on the SNSF's website here: https://oa100.snf.ch/en/news-en/open-access-simple-and-efficient-publishing-with-chronoshub/ and here: https://oa100.snf.ch/en/funding/. Further, in collaboration with the Swiss National Science Foundation (SNSF), swissuniversities offers scientists and scholars practical and financial support for the Open Access publication of their work.

Currently, several Swiss universities offer free-to-publish, free-to-read Open Access publishing platforms for journals, such as the Hauptbibliothek Open Publishing Environment (HOPE), which provides researchers of the University of Zurich with a platform for publishing Open Access journals. Initiated by the swissuniversities alliance, the PLATO project (2022-2024) is developing a sustainable funding model that enables collaborative, community-driven and high-quality Open Access publishing in Switzerland.

The Consortium of Swiss Academic Libraries is responsible for negotiating, providing and administering Open Access transformative agreements with traditionally closed access publishers across the country. You can read more about them here: https://consortium.ch/vertraege-konditionen/?lang=en


Where to find and select Open Access journals and publishers?[edit]

Finding journals and publishers for your work[edit]

As a first step, we recommend consulting your institutional librarian about fully Open Access publication venues, partnerships, transformative agreements, publication grants and other funding opportunities available to your institution.

For a bigger picture, you can browse the biggest, trusted catalogs of Open Access publication venues, such as the Directory of Open Access Books (DOAB) and the Directory of Open Access Journals (DOAJ).

Selecting journal and publishers for your work[edit]

Before selecting a publication venue for your work, it is important to make your expectations towards the journal/publisher clear. Beyond audience, outreach and the prestige of the journal/book series/publisher in your field, here are a couple of additional criteria.

  • Does the publication venue transparently communicate its peer review process, publication fees and Open Access policy?
  • Is this policy compliant with your institution's or funder's mandate? (Note, for example, that many research funders do not allow publishing in hybrid Open Access journals.)
  • Do they assign rich metadata and Persistent Identifiers (such as a DOI) to their publications?
  • Are they indexed by scholarly discovery services such as Google Scholar, Project MUSE or the OpenAIRE Research Graph?

The last two aspects are important for the discoverability and visibility of your research in the increasingly noisy scholarly communication landscape.

Many Swiss universities have Open Access transformative agreements in place with big publishers such as Taylor and Francis, De Gruyter or Springer Nature. Still, we recommend selecting an Open Access publication venue with the principles of FAIR Open Access in mind. There are a number of community-owned Open Access publishers and journals that make their services available on a free-to-read, free-to-publish basis. For instance, the OpenEdition platform gives an overview of such Open Access journals and book publishers mainly (but not exclusively) in French-speaking areas. In this blog post, you can find further examples.

Tool to select journals:

Tools to select books series, book publishers:

How can we check and guarantee the quality of Open Access content using a peer review process?[edit]

Peer review has a critical importance in scholarly communication. It is a practice that carries an enormous weight in terms of gatekeeping; shaping disciplines, publication patterns and power relations; and governing the (re)distribution of resources such as research grants, promotions, tenure and even larger institutional budgets. From the second half of the 20th century on, peer review (open or blind, single or double) gradually became a gold standard to guarantee the quality control of scholarly publication venues. Still, peer review is an elusive practice, usually taking place in closed black boxes.

Below you can find pointers (regarding identifying peer review criteria and beyond) that can guide you when selecting a new publication venue and identifying your expectations towards journals or book series.

How to build trust and clarify your expectation towards scholarly journals[edit]

In the ideal case, the primary emphasis falls on the term ‘scholarly’. That said, recognising the names of your peers on the editorial board or among the authors is already a good indicator of trustworthiness.

If a journal is still unknown to you, and especially in the case of newly established venues, you can follow the checklist described by Think. Check. Submit., a well-established tool for researchers who aren’t sure about the legitimacy of a journal. The more transparent a journal is regarding its editorial, peer review and pricing policies, the easier it is to build trust in it.

As a first step, it is recommended to look up the journal in the DOAJ list of indexed open access journals.

One indicator of predatory publishers: you and your colleagues may receive standardised emails from time to time inviting you to publish previously unpublished works of yours, presented at conferences or on blogs, in venues that come with a very generic scope. If the scope of the collection or the topic of the special volume is not specified, and you cannot find information on the editorial and pricing policies, you have every right to remain suspicious.

The publication below gives further context to the issue of predatory publishers: https://blogs.lse.ac.uk/impactofsocialsciences/2018/09/25/the-problem-of-predatory-publishing-remains-a-relatively-small-one-and-should-not-be-allowed-to-defame-open-access/

How to build trust and clarify your expectation towards scholarly books[edit]

Although peer review, whichever flavour it takes, serves as a quasi-standard quality assessment mechanism for journals, this is not necessarily the case for academic books, where editorial review is also an established practice, or there may be other alternative selection and quality assurance mechanisms in place defined by publishers. Adding to the complexity, peer review of books in itself exhibits a great diversity: it can happen at the level of the book proposal, at chapter level, or at the level of the whole manuscript. It can take place internally (among the authors of an edited volume) or externally, and it can be open or closed.

In order to see whether an open access book publisher conducts peer review, the best course of action is to check the publisher's or series editor's policy on this. To bring more transparency to such policies, the OPERAS Peer Review Information Service for Monographs (PRISM) provides information to the Directory of Open Access Books (DOAB) from a growing number of publishers who have already implemented this service.

The blog post below gives a useful checklist of the Open Access related questions you may want to ask your publisher: https://blogs.openbookpublishers.com/what-should-i-ask-a-publisher-about-open-access/

How to be compliant with my funder’s or my institution’s Open Access policy[edit]

[Link to the Open Access in Switzerland subpage]

To make sure that you are compliant with your funder’s and/or institutional Open Access mandate, we recommend first visiting their website. As an alternative, you can browse the ROARMAP, the Registry of Open Access Repository Mandates and Policies.

Open Access policies usually outline two main ways to comply with them: either by publishing your works in an Open Access journal or book series, or via self-archiving, that is, depositing a copy of your work in a trusted repository. Even if you choose to publish your work in an Open Access journal or book, deposit the full text of the author's version in an open archive as well. This guarantees free and sustainable access to your work in case the dissemination policy of your publisher changes in the future.

Below is a brief overview of the different routes to Open Access publishing and the most frequently used terms associated with them.

[overview chart comes here]

[Here to cite, embed flowchart from here: Matthias, Lisa; Tennant, Jon (2018): How to make your research open access? For free and legally.. figshare. Dataset. https://doi.org/10.6084/m9.figshare.5285512.v3 https://figshare.com/articles/dataset/How_to_make_your_research_open_access_For_free_and_legally_/5285512]

To find out whether the journal you selected to publish your work Open Access is compliant with your funder's or institution's mandates, and to learn exactly which version of your paper you are allowed to share via self-archiving, you can consult Sherpa Romeo (https://v2.sherpa.ac.uk/romeo/).

The life cycle of a publication

  • Author version: corresponds to any text with content that is directly produced by its authors. This includes the initial manuscript and its subsequent versions, the manuscript submitted for review to a conference or a journal.
  • Initial manuscript: first form of the author version disseminated in open access.
  • Submitted manuscript: first form of the author version sent to a conference or a journal for peer review.
  • Author Accepted Manuscript (AAM)/Revised manuscript after review: last author version as transmitted to the conference or journal after peer review.
  • Publisher's version/Version of record: document possibly formatted by the conference or journal publisher and distributed by the latter. If the copyright is owned by the publisher, authors are not allowed to share this version.

Unfortunately, a version of this tool optimised for academic books does not exist yet. We recommend asking your publisher directly about the possibilities for self-archiving.

If you feel unsure where to deposit copies of your publications in order to share them openly, we recommend using OpenDOAR, a quality-assured global directory of academic open access repositories.

How having an ORCID ID helps compliance with funder requirements

Not only publications and other research outputs can have a Persistent Identifier assigned, but also their creators. In academia, the Open Researcher and Contributor iD (ORCID) is the most widely used identifier for that purpose. Once you have registered with ORCID and have your iD, all the information you enter into your profile (CV, grants, publications, projects you are involved in) becomes visible to your funders without your having to copy it to different platforms and forms. By adding your ORCID iD to your publications, email signature, applications and other academic works, this information becomes visible at a glance. Linking your ORCID iD to your works (mainly publications and data sets), institutions and projects immediately and semantically connects them to the rest of your scholarly profile. This information can be exchanged effectively across databases, countries and academic disciplines. It is the de facto standard when submitting a research article or grant application, or depositing research data.
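Because an ORCID iD is itself a PID, the information attached to it can also be read by machines through ORCID's public API, which is how funders and repositories pull in profile data. A minimal sketch, assuming the public (unauthenticated) v3.0 endpoint and using the example iD from the PID table above; the response structure shown is an assumption based on ORCID's documented works summary format:

    import requests

    ORCID_ID = "0000-0001-7794-0218"  # example iD from the PID table above
    headers = {"Accept": "application/json"}

    # The public, read-only ORCID API does not require authentication.
    works = requests.get(
        f"https://pub.orcid.org/v3.0/{ORCID_ID}/works", headers=headers, timeout=30
    ).json()

    # Each "group" bundles duplicate records of the same work; print the first title of each.
    for group in works.get("group", [])[:10]:
        summary = group["work-summary"][0]
        print(summary["title"]["title"]["value"])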

To learn more about how having an ORCID ID can make your life easier, read Alice Meadows's blog post Six Things to do now you have an ORCID iD.

Funding statements

In many cases, complying with your funder's requirements also includes adding a funding statement (basically, a citation of your funding body) to your publication. This usually includes the grant number and, if your institution or funder has a funder ID, that as well. This makes it easier for both humans and machines, such as bibliometric services, to track your work and the impact of your funding. The first step is to check whether your funder has a specific template for such statements. For examples and more details, you can consult this guide: https://www.cwauthors.com/article/how-to-cite-funding-in-research
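If you are unsure whether your funder has a registered funder ID, the hedged sketch below may help: it queries the Crossref Funder Registry, which many metadata schemas and funding statements reference. The query string is only an example; substitute your own funder's name.

<syntaxhighlight lang="python">
# Minimal sketch: looking up a funder's persistent identifier in the Crossref
# Funder Registry. Requires the `requests` package.
import requests

query = "Swiss National Science Foundation"  # example query string
response = requests.get(
    "https://api.crossref.org/funders",
    params={"query": query, "rows": 5},
    timeout=30,
)
response.raise_for_status()

for funder in response.json()["message"]["items"]:
    # `uri` is the resolvable form of the registry identifier.
    print(funder["name"], "-", funder.get("uri", funder["id"]))
</syntaxhighlight>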

Licensing

A publication cannot be truly Open Access without defining the conditions of access and reuse through clear licensing information. In academic publishing (and in many other domains), Creative Commons licenses are the de facto standard. It is worth knowing that although all of them are considered open licenses, they come in different flavours, ranging from the most permissive options, CC0 and CC-BY, to quite restrictive ones, such as CC-BY-NC-ND. (link) Picking the most appropriate license for your work means finding a good balance between controlling its accessibility, its reuse (for instance, restricting it to non-commercial reuse) or its integrity in reuse (the no-derivatives (ND) clause does not allow reuse of modified or partial versions of your work, beyond fair citation) on the one hand, and enabling the widest possible reuse on the other. Importantly, using permissive Creative Commons licenses such as CC-BY does not mean giving away your work: copyright remains yours and reusers have to attribute you as the creator of the resource in question. The value of scholarly accountability is thus protected, while provenance and credit-giving are also clearly stated.

The chart below gives an overview of the different Creative Commons licenses (p. 10 of this guide: http://eprints.hud.ac.uk/id/eprint/17828/1/CC_Guide_0613.pdf).


In case the use of third-party material does not allow for open licensing of every part of your work, we recommend using the least restrictive license applicable to your content. You can exclude third-party material from the license provision of your publication, but make sure to mark these exceptions clearly. This way the rights of the original copyright holder are respected while you are free to release your own publication under an open license (source).

Before selecting a license for your work, make sure that it is compliant with your funder's or institution's policy. In many cases, research funders require a CC-BY license for funded publications.

Tools: To choose the most suitable license for your publications (as well as for your data, software and other types of research outputs), you can consult the following license selector tools: the Creative Commons license chooser for publications, data or multimedia outputs, or https://choosealicense.com/ for open source software.

Useful thoughts and blog posts that further help in selecting the most appropriate license for your work:

Further reading:

How to negotiate with a publisher (templates and argumentation)[edit]

Before publishing your work, you will need to sign a contract with your publisher to agree on terms and conditions. This usually covers details of ownership, copyright management and licensing of your work and therefore reading it carefully and negotiating where needed is crucial.

How to retain your intellectual property rights (fully or partially)[edit]

A serious advantage of Open Access publishing is that ownership of the work stays with the authors, who therefore fully control its reuse conditions. However, if you have published or are planning to publish in non-open-access journals, they might ask you to sign the copyright over to them as a condition of publication, so that they can disseminate the work exclusively and thereby maximise its profitability. This could prevent you from reusing your own work, e.g. for teaching, republishing it elsewhere or openly sharing a version of it in an institutional repository.

Luckily, as Open Access publishing is becoming the norm in academia, there is a growing amount of support available to help you retain copyright, or transfer only a limited set of your rights to the publisher. Below you can find a step-by-step guide, tools and further information.

1. Be aware of the ownership status of your publications. As their creator, you are their copyright holder by default. Still, in some countries, such as Ireland or Austria, some of the usage rights of your work are transferred to your employer, while copyright and attribution remain with the author.

2. Check the conditions of licensing, copyright transfer and the termination-of-transfer provisions in your publishing contract with possible reuse scenarios (like the ones above) in mind.

3. Open Access policies are in place to protect the interests of authors in keeping ownership of their work and to mitigate authors' exposure to publishing houses in this respect. To negotiate with publishers on ownership and licensing, simply point to your institutional or funder's policy. In this respect, cOAlition S funders (a group of national research funders, European and international organisations and charitable foundations, including the SNSF) put an especially strong emphasis on rights retention. https://www.coalition-s.org/rights-retention-strategy/

4. Use one of the tools below.

Tools: The Termination of Transfer tool or the SPARC Author Addendum are useful tools that help you to legally share your work and terminate or modify restrictive licensing arrangements you have made with publishers.

A model agreement with publishers: https://deepblue.lib.umich.edu/handle/2027.42/138828

Further information on publication contracts and success stories: https://www.authorsalliance.org/resources/publication-contracts/

A useful overview of the key issues in copyright negotiation: https://blogs.openbookpublishers.com/copyright-and-licensing-what-do-i-need-to-know/

[+ CoalitionS tools to add here once published]

Negotiations for Open Access

The first step is to check whether the journal or publisher has an Open Access policy and whether it allows depositing a copy of your work in an institutional repository (self-archiving or green Open Access, see above). If you do not find this information in SherpaRomeo or on the publisher's website, it is again worth pointing to your institutional or funder's policies to request permission to do so.


* Checklist to produce open content (visual checklist)

  • What I have to do to produce content in Open Access
  • Levels of accessibility


Checklist - Agreements for Open Science[edit]

Arguments related to Open Access to printed and digital publications

What are we aiming for: a digital copy of publications which can be uploaded and shared in preprint repositories and institutional/project websites. The printed copy can have a different distribution and a different value related to the object and its design.

  • Learning about publishers’ Open Access policies (https://v2.sherpa.ac.uk/romeo/) and checking whether they are compliant with your funders policies and institutional policies (the latter might mandate the use of an institutional repository).
  • Making sure you can have a digital copy (possibly the final publication, usually called the version of record)
  • Making sure the digital copy can be archived in your institutional repository (Green Open Access)
  • Allowing the digital copy to be archived on open repositories (CC BY)
  • Avoiding giving away exclusive rights to publishers and retaining authors' copyright (rights retention)
  • Defining the licenses of the printed publication and of the digital edition
  • Making sure you pay a reasonable fee for the service

https://www2.supsi.ch/cms/openscience/open-science/come-pubblicare-open-access/cosa-devo-controllare-in-un-contratto/

Checklist - How to obtain an authorization (with template)[edit]

  • Requests to copyright owners (artists, writers, musicians, designers, companies…)
  • Requests to heritage institutions preserving and managing content
  • Requests to owners (owners of works, databases, collections…)
  • Requests to participants
  • Paying the rights - Pro Litteris

Checklist - How to negotiate Open Access (with templates)[edit]

  • Request to journals. What to check in a contract and to eventually ask to modify
  • Request to publishers (template)
  • Use of content when it is not possible to contact the copyright owners
  • Partnerships with cultural institutions
  • Request to aggregate content from different sources in an institutional website
  • Establishing a date after which it is possible to publish data (with publishers, companies…)
  • Retaining your copyright as an author

How to ensure the findability of Open Access content (publications and data)?[edit]

The "F" of the FAIR principles (standing for findability) is broken down into the following sub-principles:

F1. (Meta)data are assigned a globally unique and persistent identifier

F2. Data are described with rich metadata (defined by R1 below)

F3. Metadata clearly and explicitly include the identifier of the data they describe

F4. (Meta)data are registered or indexed in a searchable resource

As researchers, we have less control over how the infrastructures we use to share our work (data or preprint repositories, publishing platforms and services) fulfil these requirements, but we can nonetheless select them carefully and define our expectations towards them. Below you can find a checklist for each.

CHECKLIST when selecting publishers for your papers and books:

Does your publisher provide a Persistent Identifier (PID) for your outputs? (A DOI, typically. This is a minimum requirement for making your publications visible not only to humans but also to machines, including discovery platforms and other academic databases.)

Does your publisher have long-term archiving workflows in place? (This is a service usually provided by 3rd party digital preservation services, such as CLOCKSS or Portico.)

Does your publisher allow you to share a copy of your work in a repository or elsewhere? (This way you can take control over and enhance the findability of your work.)

Does your publisher provide rich metadata with your papers or book chapters in which it is possible to make reference to associated datasets and other resources? (This is a plus. A  good starting point to check metadata standards of your publisher is to look them up in Crossref.org.) 
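As a practical illustration of the last point, the sketch below looks up the metadata registered for a DOI through the Crossref REST API and prints the fields most relevant to this checklist (title, journal, license, funders). The DOI used is that of the Edmond and Morselli article listed under Further reading; any other Crossref-registered DOI works the same way.

<syntaxhighlight lang="python">
# Minimal sketch: inspecting the metadata a publisher has registered for a DOI
# through the Crossref REST API. Requires the `requests` package.
import requests

doi = "10.1108/JD-12-2019-0232"  # article cited in the Further reading section
response = requests.get(f"https://api.crossref.org/works/{doi}", timeout=30)
response.raise_for_status()
record = response.json()["message"]

print("Title:   ", (record.get("title") or ["(no title)"])[0])
print("Journal: ", (record.get("container-title") or ["n/a"])[0])
print("License: ", [lic["URL"] for lic in record.get("license", [])] or "not registered")
print("Funders: ", [f.get("name") for f in record.get("funder", [])] or "not registered")
</syntaxhighlight>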

CHECKLIST when selecting a repository for your research data or other outputs:

Is your repository owned and sustainably governed by a public body (a research institution, public research infrastructure, CERN, etc.)?

Does your repository have a robust PID and versioning policy implemented?

Does your repository have an OAI-PMH endpoint or other mechanisms enabling discovery services to harvest its content? (See the sketch after this checklist.)

Does your repository have a CoreTrustSeal certification? (This is a plus.)

Does your repository allow interlinking different outputs belonging to the same project (through related PIDs or through other semantic web technologies)? (This is a plus.)
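As an illustration of the harvesting point above, the following sketch sends the standard OAI-PMH Identify and ListMetadataFormats requests to a repository endpoint. Zenodo's endpoint is used here as an assumed example base URL; replace it with the endpoint of the repository you are evaluating.

<syntaxhighlight lang="python">
# Minimal sketch: checking whether a repository exposes an OAI-PMH endpoint
# that harvesters and discovery services can use. Requires the `requests`
# package; XML parsing uses the standard library.
import xml.etree.ElementTree as ET
import requests

OAI_BASE = "https://zenodo.org/oai2d"  # assumed example endpoint
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# The standard `Identify` verb returns basic facts about the repository.
response = requests.get(OAI_BASE, params={"verb": "Identify"}, timeout=30)
response.raise_for_status()
root = ET.fromstring(response.content)
identify = root.find("oai:Identify", OAI_NS)
print("Repository:", identify.findtext("oai:repositoryName", default="?", namespaces=OAI_NS))
print("Protocol:  ", identify.findtext("oai:protocolVersion", default="?", namespaces=OAI_NS))

# `ListMetadataFormats` shows which metadata schemas (e.g. oai_dc) can be harvested.
response = requests.get(OAI_BASE, params={"verb": "ListMetadataFormats"}, timeout=30)
root = ET.fromstring(response.content)
for fmt in root.iter("{http://www.openarchives.org/OAI/2.0/}metadataPrefix"):
    print("Available metadata format:", fmt.text)
</syntaxhighlight>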

The original source of these checklists provides a case study of how European discovery services allow for bringing together different outputs belonging to the same project from across the web.

Checklist - Publishing innovations: Multimedia online publications on institutional website AND repository of choice (alternative form of publication)[edit]

What are we aiming for: A digital multimedia publication, peer-reviewed, including content with different licenses and rights (open licenses by default), stored on an institutional website (compliant with Green Open Access) and produced possibly in collaboration with a publisher.

  • All content produced by researchers available with open licenses (CC0 for data, CC BY or CC BY-SA for content produced with the involvement of citizens).
  • Possibility to include content with different licenses and restrictions.
  • Agreement based on a public institutional repository (institutional website)
  • Possibility to include multimedia content better than with ebooks (websites allow for videos, links, audio, embedding of other websites…)
  • Necessity to publish peer-reviewed content to make it significant from an academic perspective
  • How to organize peer review for innovative publication models (e.g. as part of institutional websites, peer review of multimedia, data, other digital scholarly objects)
  • Possibility to produce it in collaboration with a publisher (for the peer-review, the design and the editorial expertise)
  • Access to all multimedia content, but compliant with the requests of other copyright owners (institutions often allow only the publication of content on the institutional website)
  • Possibility to aggregate content from different sources (as links or uploads) and to agree with copyright owners for using them on the institutional website only
  • Possibility to export a selection of the content (only the open content) and to upload it on open repositories
  • Managing the database of the website to separate open content from content under copyright or other licenses (to allow the export of open content on open repositories)
  • Not a specific website but an institutional website maintained by the institution and with an expected longevity

Possibility to have a backup of the website on the Internet Archive and in national archiving systems for longer-term preservation.
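As a modest illustration, the sketch below asks the Internet Archive's Save Page Now service to capture a single page of a project website. The page URL is a placeholder; for whole sites or firm longevity guarantees, coordinate with your institution or a national web-archiving service rather than relying on ad-hoc captures.

<syntaxhighlight lang="python">
# Minimal sketch: requesting a capture of one page via the Internet Archive's
# "Save Page Now" service. Requires the `requests` package.
import requests

page = "https://example.org/my-project/"  # placeholder project page

response = requests.get(f"https://web.archive.org/save/{page}", timeout=120)
if response.ok:
    # The Wayback Machine redirects to (or serves) the archived snapshot.
    print("Capture requested, snapshot at:", response.url)
else:
    print("Capture failed with status", response.status_code)
</syntaxhighlight>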

Phase 6: Sustainability when/after the project ends[edit]

  • Sustainability/archiving decisions
  • Post-project financing
  • Interlinking publications to underlying sources (multimedia content etc.)

Project planning with its afterlife in mind[edit]

In the arts and humanities, digital content creation is still expensive, challenging and time-consuming. An explicit motivation behind policy drives and expanding research funder mandates that call for data sharing and data management plans is to make a better return on our collective investments and to increase the genuine accessibility and reusability of research outputs. In reality, enabling that requires time, effort and money, and a general rule of thumb is that the earlier sustainability decisions are made during the project's lifetime, the easier they are to implement. Therefore, we recommend asking the following questions already during the planning phase:

What will be the key outputs of the project and how widely are you planning to share them?

In the case of traditional dissemination formats, such as a book or an article, the publisher will take care of their longevity, but in all other cases you need to seek ways to make your website, web artifact, data, digital exhibition or multimedia library future-proof. Be aware of your resources (e.g. a PhD researcher working alone vs. a research team).

What are the support structures available within and beyond your institution and how much do they cost?

Examples include strictly financial aspects such as: If you communicate your results on your website, who is hosting it, for how long, and how much does it cost? If you need storage for large volumes of data, or wish to publish your resources yourself and ensure their long-term availability, how much does it cost? If you wish to publish your book gold Open Access, how much does the Book Processing Charge cost, and can you pay the bill upfront so that it fits within the project lifetime?

However, the required effort in terms of manpower and working hours is just as important a part of the cost calculation. For instance, how much time do documentation, data cleaning and anonymisation take to enable data sharing, and is there personnel available for this purpose? In this respect, the type and volume of the project (e.g. someone working on their dissertation alone vs. a big EU project) can make crucial differences and may require compromises.

Further, thinking about the post-project funding period: is there a contact person who remains responsible for a given service or for running a website, or an editorial team around the outputs of the project? If so, how will they be compensated or incentivised? How much of this sustainability effort can institutional staff (e.g. repository managers) take over?

Documentation to enable others (including your future self) to follow and make sense of the research outcomes[edit]

Capturing not only your final outputs (with rich metadata and/or a readme file, as described elsewhere in these guidelines) but also their context and the processes leading to their creation (e.g. how your data has been 'cooked') is a crucial step towards enabling their accessibility and reusability in the long term. Ideally, the structure of your resources and their description allow newcomers (be it a new colleague joining the ongoing project or someone developing interest after the project is concluded) to interpret what they see, even if they are completely unfamiliar with the inception of the project. For instance, defining and staying consistent with file naming conventions is an important step in this direction.
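As a small illustration, the sketch below checks a folder against one possible (entirely hypothetical) file naming convention of the form projectacronym_YYYY-MM-DD_description_v01.ext; adapt the pattern to whatever convention your team has agreed on.

<syntaxhighlight lang="python">
# Minimal sketch: flagging files that do not follow an agreed (hypothetical)
# naming convention, so that provenance can be read from the name alone.
import re
from pathlib import Path

PATTERN = re.compile(r"^[a-z0-9]+_\d{4}-\d{2}-\d{2}_[a-z0-9-]+_v\d{2}\.[a-z0-9]+$")

def check_names(folder: str) -> None:
    """Print every file whose name does not follow the agreed convention."""
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and not PATTERN.match(path.name):
            print("Does not follow the convention:", path)

# Example: check_names("data")
# flags "IMG_0042.JPG", accepts "osadm_2023-05-01_interview-transcripts_v01.csv"
</syntaxhighlight>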

Making your resources available in open and preservation-friendly formats[edit]

Even if you work in proprietary environments, such as InDesign, ATLAS.ti or MaxQDA, at the end of your workflow you always have the chance to convert the outcomes to open and more stable file formats - still an easier task than building a time machine!

The OpenAIRE preservation guide (https://www.openaire.eu/data-formats-preservation-guide) sums up the preferable file formats in an overview chart. The chart is in line with the Library of Congress Recommended Formats Statement (https://www.loc.gov/preservation/resources/rfs/).
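The sketch below shows one simple way to act on such recommendations: it flags files whose extensions are not on a preferred-formats list. The list shown is a small, illustrative subset rather than an authoritative one; consult the two guides above for recommendations per content type.

<syntaxhighlight lang="python">
# Minimal sketch: flagging files in a project folder whose formats are not on
# a preferred-formats list. The PREFERRED set is an illustrative subset only.
from pathlib import Path

PREFERRED = {".csv", ".txt", ".xml", ".json", ".pdf", ".tif", ".png", ".flac", ".wav"}

def flag_risky_formats(folder: str) -> None:
    """List files that should probably be converted before archiving."""
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and path.suffix.lower() not in PREFERRED:
            print("Consider converting:", path)

# Example: flag_risky_formats("outputs")  # e.g. flags .indd or .docx files
</syntaxhighlight>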

Sustainability scenarios: make decisions on what to keep[edit]

Thinking of and funding research in terms of projects generally does not provide the opportunity to fund activities and regular maintenance work reaching beyond the project lifetime, such as updating interfaces, adding new content to a database, or refactoring or migrating an aging codebase. In light of that, the best strategy is to select the maintenance scenario that is most realistic for your situation. The following scenarios are adapted from the King's College Digital Library project sustainability unit.

1. Your institution or one of the project partner institutes can guarantee maintenance of your resources in their original format of deployment, including hosting, maintaining functionalities (such as search functionality or hyperlinks) and visual identity (with or without manpower dedicated to sustain the output). In this case, you may set up a service level agreement with the hosting institution to clarify and agree on details of maintenance.

2. Your institution or one of the project partner institutes, or a third party organisation can guarantee maintenance of your resources via migration with compromises regarding their original format of deployment. This includes the possibility that your project output becomes part of a bigger collection hosted elsewhere.

3. ‘Boxing’ and archiving the different components and layers of the project in trusted repositories and archives, e.g. the data, multimedia, illustrations, software layers to a data repository, associated publications to a preprint repository, interfaces to web archiving services etc. (See guidance on that elsewhere in these guidelines and see also the overview chart below.) Interlinking the different content types via persistent identifiers (see an example here under 'Related identifiers') significantly improves their contextualisation and re-usability. Although this solution comes with a serious loss of functionality and aesthetics of the outputs, especially in the case of complex web artefacts, it still guarantees long-term and stable accessibility, citability, and if well documented, reusability of the hard-earned resources. In this case, exploring how your institutions or external data infrastructures (national, thematic, or global) can support the export, ingestion and archiving process is key.
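To give a concrete flavour of such interlinking, the sketch below drafts a Zenodo deposition whose metadata points to the project's other archived components via related identifiers. The DOIs, access token and titles are placeholders, and the exact field names and call sequence should be checked against Zenodo's current API documentation; this is a sketch, not a reference implementation.

<syntaxhighlight lang="python">
# Minimal sketch: describing one "boxed" component (here, a dataset) in Zenodo
# deposit metadata so that it points, via persistent identifiers, to the other
# archived layers of the same project. All identifiers below are placeholders.
import requests

ZENODO_TOKEN = "YOUR-ACCESS-TOKEN"  # placeholder personal access token

metadata = {
    "metadata": {
        "title": "Project X interview dataset (archived component)",
        "upload_type": "dataset",
        "description": "Dataset 'boxed' at the end of Project X; see related identifiers.",
        "creators": [{"name": "Doe, Jane"}],
        # Interlinking with the other archived layers of the project:
        "related_identifiers": [
            {"relation": "isSupplementTo", "identifier": "10.1234/preprint-doi"},
            {"relation": "isDocumentedBy", "identifier": "10.1234/documentation-doi"},
        ],
    }
}

# Create a draft deposition carrying this metadata (files are uploaded in a
# separate step, as described in Zenodo's API documentation).
response = requests.post(
    "https://zenodo.org/api/deposit/depositions",
    params={"access_token": ZENODO_TOKEN},
    json=metadata,
    timeout=30,
)
print(response.status_code, response.json().get("links", {}).get("html"))
</syntaxhighlight>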

Overview chart on which content type where to archive, https://docs.google.com/document/d/1Mv8fahujsXK7mV9cpbhpulRfk_IaoN1G/edit?usp=sharing&ouid=109238419008996680964&rtpof=true&sd=true

Examples to each scenario (from around DARIAH):[edit]

Scenario 1: The SSH Open Marketplace has been among the flagship outputs of the Social Sciences and Humanities Open Cloud (SSHOC) Horizon 2020 project. By the time the project concluded in April 2022, the partners had agreed on the commitments for the long-term maintenance of the Marketplace. In particular, DARIAH, CLARIN and CESSDA signed a collaborative and binding agreement to provide resources (hosting capacities, a technical helpdesk, budgeting for manpower ensuring the population and social life of the service, etc.) to maintain and further develop the SSH Open Marketplace after the end of the SSHOC project. You can read about the details here: https://zenodo.org/record/6034593#.Y1ZRjNdBxD8 and here: https://doi.org/10.5281/zenodo.5608486. Following the SSHOC project, the SSH Open Cluster serves as a framework to collaboratively develop thematic branches of the European Open Science Cloud.

Scenario 2: The Standardization Survival Kit has been among the flagship outputs of the PARTHENOS H2020 project. After the project concluded in 2019, a dedicated DARIAH working group, the Guidelines and Standards Working Group, was put in charge of maintaining and further populating the service. To complement the voluntary effort of the working group, in 2022 the Standardization Survival Kit was included in the SSH Open Marketplace as one of its key components (https://marketplace.sshopencloud.eu/search?order=score&categories=workflow&f.source=Standardization+Survival+Kit).

Scenario 3: Within the framework of the same PARTHENOS project, a set of Training Suites was developed. Once the project concluded, the partners in charge of the development of the training materials exported them from the website and archived them on Zenodo in formats that are easy to reuse and re-adapt to different teaching/learning contexts: https://zenodo.org/communities/parthenos-training/?page=1&size=20 The pages of the website have also been submitted to the Internet Archive.

Finally, the following is a good example of how, even when no external grant support is available, it is still possible to significantly strengthen the sustainability of research outputs. This case study describes the linguist Dr. Naomi Truan's strategy (and results!): https://halshs.archives-ouvertes.fr/halshs-03366486

Institutional good practices[edit]

King’s Digital Lab:

King's College is clearly among the most important enablers of digital scholarship in Europe. The volume and richness of the digital outputs and collections produced within the institution - be it research software, exhibitions, visualisations, cultural heritage data enrichment or multilingual publications - made the relevant departments well aware of the pressing sustainability needs and constraints. In response, King's Digital Lab set up a strategy and put resources in place to strategically document and save over 100 digital humanities projects. You can read more about how they select, document, enrich, evaluate and store the different components here: https://kdl.kcl.ac.uk/blog/legacy-project-datasets/

Centre for Digital Humanities at Princeton:

Similarly to the practice of Data Management Planning, the Centre for Digital Humanities at Princeton sets up charters with project teams at an early stage to figure out the distribution of roles and responsibilities around keeping project outputs alive. In their approach to building capacity for sustaining DH projects and preserving access to data and software, they view projects as collaborative and process-based scholarship. Therefore, their focus is on implementing project management workflows and documentation tools that can be flexibly applied to projects of different scopes and sizes and allow for further refinement where needed. By sharing these resources together with their real-life use cases in DH projects, they aim to benefit other scholarly communities and sustain a broader conversation about sustainability issues. You can read more about their practices here: https://cdh.princeton.edu/projects/princeton-ethiopian-miracles-mary-project/

Further reading[edit]

Jennifer Edmond and Francesca Morselli, Sustainability of Digital Humanities Projects as a Publication and Documentation Challenge, Journal of Documentation, 76, 2, 2020. DOI 10.1108/JD-12-2019-0232

Fitzpatrick, Kathleen. 2011. Planned Obsolescence: Publishing, Technology, and the Future of the Academy. New York: New York University Press. (see also the book's Wikipedia summary here: https://en.wikipedia.org/wiki/Planned_Obsolescence_(book))