There has been discussion on the Village Pump about the uploading of video files on Wikimedia. How better to explain field hockey or the gait of a kangaroo?

According to Brion, "Video is certainly permitted, but there is presently a hard size limit of [2 megabytes] on uploads to the server. If you have a larger video file that is appropriate as a supplement to an article, please host it offsite and provide a URL link. Concerns on format, accessibility, and copyright apply as usual." [Now more like 100 MB, as of late 2009]

I think that the desire to add video content to the Wikipedia will gradually grow, and we should have some policy to deal with it.

There are several issues involved in using video on Wikipedia:

  • Freeness.
  • Cost.
  • Accessibility.
  • Legalities.
  • Utility, given above constraints.

Until we come up with a policy that covers the above issues, I think that there shouldn't be any great push to add material.


As an open content encyclopedia and a poor not-for-profit, we simply don't have the money to pay to license video streaming servers or codecs, and with proprietary codecs we have no assurance that we won't have to pay at some stage. Not to mention the fact that non-free codecs may not be playable on free operating systems...

At this moment, there is precisely one open-source patent-unencumbered video format available. MPEG has the Fraunhofer patents on the audio layer - which is why we prefer Ogg audio. MPEG-4 has patents on the video encoding. These patents apply to Germany and the US, but not elsewhere — but of course the Wikipedia is headquartered in the US. The various other codecs are all completely proprietary and thus totally unsuitable.

The only way out of this morass is probably Ogg Theora, which is an in-development, free, high-quality video codec. Ogg Vorbis will be used for the audio. Theora is covered by patents, but these have been licensed by On2 Technologies, who donated the technology of Theora to Ogg. It is possible that these methods also infringe Fraunhofer patents, but Fraunhofer hasn't tried to sue yet, even when Theora's predecessor VP3 was available for commercial sale (and I gather that Theora is definitely a better option). Longer term, there is a project by the BBC called Dirac underway, but it is nowhere near usability, let alone maturity.

Ogg Theora (libtheora) is at version 1.1.1 as of May 2010.

Therefore, I recommend that all video on Wikipedia be encoded using the Theora encoder.


Assuming we use free codecs, the cost of the software to the Wikimedia Foundation to serve video is zero. The cost of the disk storage to store it and the bandwidth to serve it up is not zero.

Assuming Ogg Theora can encode "near DVD-quality" video at about the same bitrate as MPEG-4, it is reasonable to assume that it would take approximately six megabytes per minute. Let's assume hard disk space costs 1 USD per gigabyte. Therefore, for 1 USD we can store about 166 minutes of high-quality video. For 1000 USD, we can store nearly four months of continuous video footage. Even multiplying the cost by a factor of three or four to allow for backups, it is clear that storage cost is not going to be a factor that stops us putting video on the Wikipedia. Similarly, I doubt that machines to feed that data out to the world are going to be a major issue once we can attract donations.
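The storage arithmetic above can be checked with a few lines of Python. This is only a sketch under the figures assumed in this essay (six megabytes per minute, 1 USD per gigabyte); treating a gigabyte as 1000 megabytes and the function names are my own assumptions, not measurements.

```python
# Back-of-the-envelope storage estimate using the essay's assumed figures.
MB_PER_MINUTE = 6.0   # assumed size of "near DVD-quality" video
USD_PER_GB = 1.0      # assumed disk price
MB_PER_GB = 1000.0    # decimal gigabytes (an assumption of this sketch)


def minutes_per_dollar(mb_per_minute=MB_PER_MINUTE, usd_per_gb=USD_PER_GB):
    """Minutes of video that one dollar of disk space can hold."""
    return (MB_PER_GB / usd_per_gb) / mb_per_minute


def days_of_video(budget_usd, redundancy=1):
    """Days of continuous footage a budget can store.

    `redundancy` is the number of copies kept (e.g. 4 to allow for backups).
    """
    minutes = budget_usd * minutes_per_dollar() / redundancy
    return minutes / (60 * 24)


if __name__ == "__main__":
    print(f"{minutes_per_dollar():.0f} minutes per USD")
    print(f"{days_of_video(1000):.0f} days for 1000 USD")
    print(f"{days_of_video(1000, redundancy=4):.0f} days with four copies")
```

Under these assumptions, 1000 USD holds roughly 115 days of footage, and still about a month if every file is stored four times.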

However, bandwidth costs are another issue entirely. I know very little about the economics of buying bandwidth, but I'm sure Jimbo can comment here. I strongly suspect that this might be a distinctly nontrivial cost if we are too cavalier about video.

There have been some technologies developed to help reduce the bandwidth requirements of delivering video and other bulky things, but they are not suitable for our purposes. Peer-to-peer technologies are great for distributing stuff, but they don't exactly have the kind of interface we want! BitTorrent does have a reasonable interface, and it would be quite easy to integrate, but it's only useful when lots of people are downloading the same large file simultaneously; again, not a close match to Wikipedia's usage pattern.

What we'd like is a distributed, transparent media caching system so that video requests could be handed off to local servers. If, for instance, a Wikipedia media mirror could be located on AARnet, the cost of internal traffic on that network is zero. Basically, we'd like a free, voluntary workalike of Akamai's on-demand streaming setup, though obviously it doesn't need to be nearly as complex. It seems that such a thing now exists with FreeCache [1], but who knows how well it will work with Wikipedia traffic patterns?


For accessibility purposes, the issues are mainly to do with the size of the video files we use. The ability for computers to play the files is pretty much out of our hands. I would expect that, shortly after the stable release of Theora, players for all the major platforms will be freely available (note: this has already happened). It is a shame that Microsoft and Apple probably won't include the codecs as standard, but maybe all that cool stuff on the Wikipedia will result in them deciding to bundle the codecs to get access to it :)

Much as we'd all like to have high-speed broadband internet connections, there is a substantial fraction of the internet populace still using dialup, or at the end of long, slow international pipes. For these people, six megabytes per minute is going to be really annoying if all they want is a quick look at something. However, there may be occasions where they may really really want a piece of high-quality video, so we should make it easy to get that piece. For everybody, regardless of the speed of their internet connection, it makes sense to allow people to download only the bits they want to see.
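To put a rough number on "really annoying", here is a small sketch of the download time for one minute of video at the six megabytes per minute assumed earlier. The link speeds (a nominal 56 kbit/s modem and a 512 kbit/s broadband line) are illustrative assumptions of mine, and the calculation ignores protocol overhead and line noise, so real transfers would be slower.

```python
# Rough download-time estimate; link speeds are illustrative assumptions.
def download_seconds(size_mb, link_kbit_per_s):
    """Seconds to transfer `size_mb` megabytes over a link of the given speed.

    Uses decimal units (1 MB = 8000 kbit) and ignores protocol overhead.
    """
    if link_kbit_per_s <= 0:
        raise ValueError("link speed must be positive")
    return size_mb * 8000.0 / link_kbit_per_s


if __name__ == "__main__":
    one_minute_mb = 6.0  # the essay's assumed size of one minute of video
    for name, speed in [("56k dialup", 56), ("512k broadband", 512)]:
        secs = download_seconds(one_minute_mb, speed)
        print(f"{name}: {secs / 60:.1f} minutes to fetch one minute of video")
```

On these assumptions a dialup user waits about a quarter of an hour for each minute of footage, which is the case for offering a smaller version alongside the high-quality one.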

Another issue to be considered is Wikipedia 1.0, which Jimbo has discussed. A single-CD Wikipedia 1.0 will have virtually no room for video, given the size of the English text repository and the size of the existing image collection. It is possible, if there is enough material worthy of it, that a second "Media CD" (which would also have audio clips) could be made available. Even so, there's only room for 90 minutes of high-quality video on a CD.

For all these reasons, it would seem highly sensible to request that video on Wikipedia be offered in a lower-resolution, highly compressed format (maybe 320x240 might be appropriate) as well as in good quality, and that it generally be chopped up into relatively small pieces so that viewers and CD-compilers can select the bits that they want and skip the bits that they don't.
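As a sketch of what "chopped up into relatively small pieces" could mean in practice, the following works out cut points for a clip given a maximum piece length. The function name and the 60-second limit in the example are hypothetical choices for illustration only; no particular piece length is being proposed here.

```python
# Hypothetical helper: split a clip into pieces of at most a given length.
def plan_pieces(total_seconds, max_piece_seconds):
    """Return (start, end) times in seconds covering the whole clip."""
    if total_seconds < 0 or max_piece_seconds <= 0:
        raise ValueError("durations must be non-negative and piece length positive")
    pieces = []
    start = 0
    while start < total_seconds:
        end = min(start + max_piece_seconds, total_seconds)
        pieces.append((start, end))
        start = end
    return pieces


if __name__ == "__main__":
    # A 150-second clip cut into pieces of at most 60 seconds.
    for start, end in plan_pieces(150, 60):
        print(f"piece from {start}s to {end}s")
```

Each piece could then be encoded and uploaded separately, so that a viewer or CD-compiler only fetches the parts of interest.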


Obviously, copyright law applies to video as it does to text and images, and privacy laws also apply in some cases — again, the same as with still images. IANAL. Some further guidance as to fair use of copyrighted video is required.

One additional issue might be if people start uploading sexually explicit material. I believe that there are laws in the United States beyond copyright laws that pose very specific consent requirements regarding this sort of thing, and I doubt that investing time in ensuring compliance with these laws is the best use of Wikipedia's time. Also, given the bandwidth load that hosting something that would probably be viewed by a lot of people as pornography would impose, perhaps the easiest thing to do is simply impose a moratorium on such clips.

Priorities for video coverage

We are, of course, constrained by what is already available in the public domain (e.g. a lot of material at ), and by what interests people enough to point a camera at it.

However, where we have limited resources, we should encourage (or select) subjects where video is particularly helpful in illustration (for instance, sports or hobbies), and where there isn't lots of other material readily available. I'm not convinced movie trailers, for instance, are a good use of Wikipedia resources.

Because of video's costs, we should be prepared to discard video that doesn't add something that an image or unaccompanied audio wouldn't.

Summary of recommendations

So, in summary, I recommend that:

  • Ogg Theora should be the codec of choice
  • Clips should be short
  • Low-bandwidth and high-bandwidth versions are desirable
  • Further examination of copyright as it applies to video is required
  • A no-porn rule should be examined
  • We should particularly encourage video coverage where it can illustrate things that text or still images can't, and where alternative sources are unavailable, and discourage video where still images are adequate

This is only a summary of my thoughts on the matter. If you have opinions which you'd like to share, please use the talk page! I intend to work further on this on the basis of suggestions, and ultimately refine a video policy to become a Wikipedia guideline.

--Robert Merkel


See also