This page aims to collect all available information about digitization projects within the Wikimedia movement: the tools they use, the hardware and software they need, best practices, and learning patterns, among other topics. Its creation was decided at the first meeting of the Wikimedia Digitization User Group.
The purpose of this page is not to provide in-depth technical information but to allow non-experts to understand the basics of digitization, how to do it, and what kinds of decisions they need to make if they plan to set up a digitization program. Anyone who wants to go into the deep technical details can consult the Wikipedia pages on each of the concepts or read the suggested bibliography.
It is also important to note that there are multiple ways to run a digitization program. You can do it yourself, inside your institution, making all the decisions. Alternatively, you can outsource it to a company or partner with another institution that has the right equipment. Such a partner may do the job for you (this is, for example, the model adopted at the Boston Public Library), charge you for it (like the Internet Archive), or provide you with all the equipment and make the crucial decisions for you (again, like the Internet Archive, or like several Wikimedia chapters that are carrying out digitization projects).
How you run your digitization program, and what it covers, is entirely up to you, your institution or community, and your own policies. If you decide to partner with another institution, you probably won't need any of this information. If you decide to set up your own digitization program, you will find most of it useful in one way or another. Much of this information already exists, scattered around the web and across Wikipedia pages. This page is an effort to systematize the information needed for digitization.
- 1 Capture
- 2 Planning a digitization project
- 3 Processing
- 4 Information extraction
- 5 Sharing and availability
- 6 Wikimedia Digitization Projects
- 7 Grants for digitization
- 8 Glossaries
- 9 Additional Resources
Capture
This section is devoted to the main factors that you need to consider whenever you make a digital image of maps, books, photographs, certain artworks (two-dimensional artworks), negatives, microfiche, or microfilm. These are general principles that apply regardless of the format, size, state of preservation, etc., of the analog material. For each type of material, we have also outlined the specific considerations that you need to take into account to do a good job, but the general principles still apply.
- General principles
- Specifics by type of material
Audiovisual material and moving images
Planning a digitization project
- Selection and preparation of materials
- Provision of access to digital files
- Long-term sustainability
One-day scan-a-thon
Going to a local institution one day and scanning their material. Things to consider.
When going to distant places, things that you need to consider.
Long-term digitization project
Information extraction
If scans are uploaded to Commons and set up with Index pages on Wikisource, the Wikisource:Google OCR tool can be used to extract text from images.
Sharing and availability
Wikimedia Digitization Projects
Grants for digitization
Here is a list of available grants, with a short explanation of each.
- Endangered Archives Program
Additional Resources
Here are the resources organized by categories.
Comprehensive resources for the whole digitization process
- Remote Capture: Digitising Documentary Heritage in Challenging Locations. April 2018. (PDF)
- File creation help on English Wikisource
- digitize.archiveteam.org, a wiki with information about how to digitize different objects, including books, audio, and video.
- List of resources for Digitization Project Management, collected by Our Digital World. Some of these resources are somewhat outdated but still useful.
- List of resources for Digitizing Newspapers, organized by Our Digital World.
- New Self-Guided Curriculum for Digitization, by DPLA.
- Digital Access to Collections - Digitise: digitization documentation for smaller collecting institutions.
- Preservation in the Age of Large-Scale Digitization: A white paper.
- Atlas of Living Australia Digitisation guidance. Note: it also has a section on using volunteers for digitisation.
- ISO 12233/FDIS, ISO/TC42. 1999. Photography-Electronic Still Picture Cameras-Resolution Measurements.
- ISO 12234-2/DIS, ISO/TC42. November 1998. Photography-Electronic Still Picture Imaging-Removable Memory-Part 2: Image Data Format-TIFF/EPS.
- ISO 14524/FDIS, ISO/TC42. 1999 (January). Photography-Electronic Still Picture Cameras-Methods for Measuring Opto-Electronic Conversion Functions (OECFs).
- ISO 15739/CD, ISO/TC42. 1999 (June). Photography-Electronic Still Picture Cameras-Noise Measurements.
- ISO 16067-1. 2003. Photography - Spatial resolution measurements of electronic scanners for photographic images - Part 1: Scanners for reflective media
- ISO 16067-2. 2004. Photography - Electronic scanners for photographic images - Spatial resolution measurements - Part 2: Film scanners
- ISO 17321?
- ISO 18937. 2014. Imaging materials – Photographic reflection prints – Methods for measuring indoor light stability
- ISO 18937-4. (In progress). Imaging materials – Photographic reflection prints – Methods for measuring indoor light stability – Part 4: LED Illumination.