WMDE Technical Wishes/Tables in PDFs

The wish to show tables in PDFs

Currently, when you download an article as a PDF, all tables of the article are omitted. This is what wish #9 of the 2015 German-speaking Technical Wishes Survey wants to change.

The WMDE team aims to fulfill the request in two steps:

  • First, they will introduce a notice on the page from which the PDF can be downloaded. The notice informs users that tables are currently not part of PDFs.
  • Second, they will look deeper into the reasons why tables are omitted from PDFs, solving the problem as far as possible in a reasonable time frame.

Status

Notice about omitted tables

Done: A notice on the PDF download page that lets users know that they will receive a PDF without tables has been implemented and deployed.

Solution to include tables in PDF

  • There will soon be an option to download articles as PDFs that include tables and templates. The new service will first be deployed to German Wikipedia, Meta and MediaWiki.org in December 2016. The new option will be provided to all other wikis in early 2017.

Research, feedback and implementation plan

Pre-development research to add tables to PDFs

The WMDE team started looking into this in May 2016. Currently, PDFs are generated through LaTeX. LaTeX does not handle tables well when they are too big for the paper size. As a result, there has so far been no guarantee that a table in a PDF will not break the layout or, in the worst case, prevent the whole PDF from rendering. This is why tables have been omitted so far. See also the Phabricator investigation ticket.

Related tasks for the Wikimania Hackathon 2016

The development team suggests two tasks for the hackathon to solve the challenge of the wish:

  • Add table appendix to PDF: The idea is not to bother too much with the combination of text and table, but to focus only on displaying tables. As a first step, it would be great to have a version that can display tables that don't pose width/height challenges.
  • Improve patch to show tables in PDF: cscott has done some great work starting with a patch to include tables in PDFs. However, some aspects are still left out (e.g. colspan and width span attributes). This task is about adding those aspects to the patch.

Hackathon Outcome: PDFs that look like the website?

PDF created with LaTeX (current version)
PDF created with the browser-based rendering service "Electron" (new/additional version)

During the Wikimania Hackathon, the WMDE team discussed the PDF issue with developers from the WMF as well as members of the community. Furthermore, the team used the hackathon to look into two options for solving the issue:

  • Adding table support to the current PDF code.
  • Implementing a solution that creates a PDF out of the website.

The results were the following: The current PDF code creates a LaTeX document. It is only possible to add tables for a very limited set of cases, which would be baffling for users, since it would sometimes work and sometimes not. We could probably make some improvements, but we would not be able to solve the whole problem in a reasonable amount of time.

However, it would be possible to offer a PDF download that is basically a printout of the website. This would not look as nicely designed as the LaTeX version, but it would be complete. The team therefore decided to propose this option as a second download choice when you click on the "Download as PDF" link in the sidebar. As of August 2016, the team is discussing with the German-speaking community whether they would like to see this option implemented. Feedback is also possible here. The thumbnails on the side show both the current LaTeX version and the suggested additional option of browser-based rendering.

Community Feedback: PDFs that look like the website!

The feedback from the German-speaking community was clear: They want to see the solution the team suggested implemented. Therefore, the team will now work on adding another link to the page that you reach when you click on "Download as PDF". Next to the LaTeX version, users will find the option to download a PDF that is produced through the Electron service, making it look more or less like the website. As a start, the team intends to implement the functionality for single articles, not yet for books and collections.

Product Ownership and Development

Product Ownership and responsibilities

The WMF Services team provides and maintains the Electron PDF render service. The WMDE team is writing the extension that provides access to that service. The WMF Reading team has taken over responsibility for the long-term maintenance and planning of PDF rendering.

Development and deployment roadmap

Screenshot ElectronPdfService SpecialPage: Option to choose between 2 layouts after clicking on "Download as PDF"

Main ticket for the wish in Phabricator: T135643

Functionality and usage

  • The extension adds a new "Download as PDF" link to the MediaWiki sidebar.
  • You can click it to render the current page as a PDF using the Electron PDF service.
  • If the Collection extension, which provides the "old" LaTeX-based method for PDF rendering, is installed, clicking the "Download as PDF" link will show a selection screen to choose between the Electron-rendered PDF and the rendering provided by the Collection extension (see screenshot). When the Collection extension is not installed on a wiki, the "Download as PDF" link will directly render the page via the Electron PDF service.
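
The behaviour of the "Download as PDF" link described above can be summarised in a small sketch. This is only an illustration of the decision, not the extension's actual code: the names `PdfAction` and `action_for_download_link` are hypothetical, and the real extension is a MediaWiki extension, not a Python program.

```python
from enum import Enum


class PdfAction(Enum):
    """Hypothetical names for the two outcomes of clicking the link."""
    SHOW_SELECTION_SCREEN = "selection"  # choose between Electron and Collection
    RENDER_WITH_ELECTRON = "electron"    # render the page directly via Electron


def action_for_download_link(collection_installed: bool) -> PdfAction:
    """Return what a click on "Download as PDF" does on a given wiki.

    Assumption: the only input that matters is whether the Collection
    extension (the LaTeX-based renderer) is installed on the wiki.
    """
    if collection_installed:
        # Both renderers exist, so the user is asked to pick one.
        return PdfAction.SHOW_SELECTION_SCREEN
    # Only the Electron service is available, so no choice is offered.
    return PdfAction.RENDER_WITH_ELECTRON
```

On a wiki with the Collection extension, `action_for_download_link(True)` yields the selection screen; without it, the page is rendered directly through the Electron PDF service.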

Timeline

  • The WMDE team started implementing the proposed solution in September 2016: mw:Extension:ElectronPdfService. At the same time, the Services team at the WMF started to set up the Electron PDF service.
  • In December 2016, the Electron PDF service was deployed to production and the WMDE team finished their work on mw:Extension:ElectronPdfService. Since December 5, 2016, the extension has been available on test wikis. Since February 2, 2017, the new option has been available on dewiki and Meta.
  • The deployment to all other wikis can follow after the deployments to Meta and dewiki.

See also