Wikimedia Futures Lab/Dashboard/Experiment Tracker/AI generated video summaries for articles
- Experiment owner(s) - add your usernames:
- Olubusola Afolabi (User:Ifedohlapo)
- Andrew Lih (User:Fuzheado)
- Tochiprecious (talk)
- Experiment title: AI generated video summaries for articles
- Brief description of experiment: Use AI and VideoWiki to make video summaries of Wikipedia articles, by generating a script and identifying Commons images
- Anticipated completion date:
- Link to a page where people can follow progress - this can be your own userpage / sandbox:
- Example of VideoWiki script from Gemini:
- Example of VideoWiki script from Claude:
- https://en.wikipedia.org/wiki/Wikipedia:VideoWiki/Abuja_Nigeria
- https://en.wikipedia.org/wiki/Wikipedia:VideoWiki/Birthday_cake
- https://en.wikipedia.org/wiki/Wikipedia:VideoWiki/Ice_cream_cone
- https://en.wikipedia.org/wiki/Wikipedia:VideoWiki/History_of_the_Philippines_(1898%E2%80%931946)
- https://en.wikipedia.org/wiki/Wikipedia:VideoWiki/Adire_(textile_art) (The generated audio pronounces "Adire" as an English word rather than using the correct Yoruba pronunciation (/ah-dee-reh/). It would be worth exploring whether the video script generator can reference embedded audio files or IPA notation so that subject words are spoken correctly. This article does not currently include either, so this should be investigated further with an article that does. Alternatively, a pronunciation could be added to this article and then tested in the video script pipeline.)
- https://en.wikipedia.org/wiki/Wikipedia:Videowiki/Olimpia_Ajakaiye (The source Wikipedia article includes IPA pronunciation templates for the subject's name in both English and Polish. However, this pronunciation cannot be reproduced as audio in the VideoWiki script. VideoWiki renders slides as video, meaning IPA templates only display as visible text and do not generate audio. No spoken pronunciation audio file (ogg/mp3) exists on Wikimedia Commons for this name, so there is no file to embed via a Media tag)
- https://en.wikipedia.org/wiki/Wikipedia:Videowiki/Maharana_Pratap
- Observed issues
- Claude cannot access the raw wikitext when it is given the main URL of a Wikipedia article. This may be a bot-protection measure on the Wikimedia side or a limitation on Claude's side.
- Images are distorted when their aspect ratios are not preserved. The AI may need explicit instructions to handle this.
- With the second prompt, ChatGPT creates an excellent script but uses the same image on every slide.
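The pronunciation notes on the Adire and Olimpia Ajakaiye examples suggest a pre-check: before generating a script, scan the article's wikitext for IPA templates and audio files. The following is a minimal sketch of such a check, not part of VideoWiki or any Wikimedia tool. The function name `pronunciation_hints` and the list of audio extensions are assumptions made for illustration, and the matching is a heuristic that will miss unusual template forms.

```python
import re

# Assumption: any template whose name starts with "IPA" (IPA, IPAc-en, IPA-pl, ...)
# is treated as a pronunciation template.
IPA_TEMPLATE = re.compile(r"\{\{\s*(IPA[^|{}]*)\|([^{}]*)\}\}", re.IGNORECASE)
# Assumption: these extensions cover the audio files of interest.
AUDIO_EXTENSIONS = (".ogg", ".oga", ".mp3", ".wav")


def pronunciation_hints(wikitext):
    """Return the IPA templates and audio filenames found in raw wikitext."""
    ipa = []
    for name, args in IPA_TEMPLATE.findall(wikitext):
        ipa.append((name.strip(), [part.strip() for part in args.split("|")]))

    audio = []
    # Split on wikitext delimiters so template arguments and link targets
    # become separate tokens, then keep the ones that look like audio files.
    for token in re.split(r"[|\[\]{}=\n]", wikitext):
        token = re.sub(r"^\s*(?:File|Media)\s*:\s*", "", token, flags=re.IGNORECASE).strip()
        if token.lower().endswith(AUDIO_EXTENSIONS) and token not in audio:
            audio.append(token)

    return {"ipa": ipa, "audio": audio}
```

An article that returns empty lists, as the Adire article currently would, has nothing the script generator could draw on. An article that returns IPA entries but no audio matches the Olimpia Ajakaiye case described above.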
Prompt example given to Claude and Gemini
* VideoWiki & Wikimedia Standards
** VideoWiki Syntax: When asked for a video summary, always use VideoWiki script syntax: {{videowiki}} at the top, == Section == for scenes, and [[File:Filename.jpg|100px|left]] followed by {{clear}} for media.
** Sourcing Requirement: Use ONLY Wikimedia Commons for images/media.
** Verification Protocol: You must manually verify that every filename exists on Wikimedia Commons. Do not provide placeholders, "red links," or hallucinated filenames.
** Visual Preference: Prioritize images already used in the specific Wikipedia article being summarized. If those are insufficient, search the relevant "Category" on Commons for verified alternatives.
Example skill file developed in Claude
# VideoWiki Script Generation Skill
## Purpose
Generate VideoWiki markup scripts from Wikipedia articles with proper image sourcing.
## Critical Requirements
### Image Sourcing Priority (MOST IMPORTANT)
1. **FIRST**: Get raw wikitext from Wikipedia using pattern:
`https://en.wikipedia.org/w/index.php?title=[ARTICLE_NAME]&action=raw`
2. **SECOND**: Extract ALL image filenames from:
- Main article images (top of article)
- Section images throughout article
- Gallery section (==Gallery==)
3. **THIRD**: Verify each filename exists on Wikimedia Commons
4. **ONLY IF NEEDED**: Search Commons for additional relevant images
5. **NEVER**: Use placeholder images or invented filenames
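The first two steps above can be sketched in Python. This is an illustration only: the helper names `raw_wikitext_url` and `extract_image_filenames` are hypothetical, and the regular expressions are heuristics that assume images appear as `[[File:...]]` links, as `image = ...` infobox parameters, or as lines inside `<gallery>` tags. Fetching the URL is left out so the sketch runs without network access.

```python
import re
from urllib.parse import quote

IMAGE_EXT = r"(?:jpe?g|png|gif|svg|tiff?|webp)"
INLINE = re.compile(r"\[\[\s*(?:File|Image)\s*:\s*([^|\]\n]+)", re.IGNORECASE)
INFOBOX = re.compile(
    r"^[ \t]*\|[ \t]*image\w*[ \t]*=[ \t]*([^|\n{}\[\]]+?\." + IMAGE_EXT + r")[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)
GALLERY = re.compile(r"<gallery[^>]*>(.*?)</gallery>", re.IGNORECASE | re.DOTALL)
PREFIX = re.compile(r"^\s*(?:File|Image)\s*:\s*", re.IGNORECASE)


def raw_wikitext_url(title, host="en.wikipedia.org"):
    """Build the action=raw URL for an article title."""
    return f"https://{host}/w/index.php?title={quote(title.replace(' ', '_'), safe='')}&action=raw"


def extract_image_filenames(wikitext):
    """Return filenames in priority order: article body first, gallery last."""
    # Infobox and inline images, kept in the order they appear in the article.
    hits = [(m.start(), m.group(1)) for m in INLINE.finditer(wikitext)]
    hits += [(m.start(), m.group(1)) for m in INFOBOX.finditer(wikitext)]
    ordered = [name for _, name in sorted(hits)]

    # Gallery entries look like "File:Name.jpg|caption" or "Name.jpg|caption".
    for block in GALLERY.findall(wikitext):
        for line in block.splitlines():
            name = line.split("|", 1)[0].strip()
            if name:
                ordered.append(name)

    # Normalise and drop duplicates while preserving order.
    result = []
    for name in ordered:
        name = PREFIX.sub("", name).strip().replace("_", " ")
        if name and name not in result:
            result.append(name)
    return result
```

The returned names are only candidates. Each one still has to be verified against Wikimedia Commons, as step 3 requires.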
### VideoWiki Format
```
{{Videowiki}}
==Section Title==
One sentence of text with citations.<ref>citation</ref>
[[File:Exact_Filename.jpg|100px|left]]
{{clear}}
```
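As a rough sketch, the format above can also be produced programmatically once the sentences and verified filenames are known. The `Slide` class and `render_videowiki` function are hypothetical names introduced for this example. The 100px width and left alignment simply mirror the format shown above.

```python
from dataclasses import dataclass


@dataclass
class Slide:
    title: str      # section heading for the scene
    sentence: str   # one sentence, including any <ref> tags
    filename: str   # a filename already verified on Wikimedia Commons


def render_videowiki(slides, width_px=100):
    """Render slides as VideoWiki markup, one section per slide."""
    lines = ["{{Videowiki}}"]
    for slide in slides:
        lines.append(f"=={slide.title}==")
        lines.append(slide.sentence)
        lines.append(f"[[File:{slide.filename}|{width_px}px|left]]")
        lines.append("{{clear}}")
    return "\n".join(lines) + "\n"
```

Keeping the markup generation in code, and asking the AI only for the sentences and image choices, is one possible way to avoid syntax drift between runs.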
### Rules
- Maximum one sentence per slide
- Minimum 3 slides, adapt to article length
- Preserve <ref> tags from original article
- Use actual Wikipedia article images first
- Each slide needs {{clear}} after the image
- Start with {{Videowiki}} tag
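Several of these rules can be checked mechanically on a generated script before it is saved. The checker below is a minimal sketch under stated assumptions: `check_script` is a hypothetical name, slides are taken to be `==Heading==` sections, and the sentence count is a naive split on end punctuation that will miscount abbreviations such as "Dr.". It does not check whether `<ref>` tags were preserved or whether the images come from the article.

```python
import re

REF = re.compile(r"<ref[^>/]*/>|<ref[^>]*>.*?</ref>", re.IGNORECASE | re.DOTALL)
IMAGE = re.compile(r"\[\[\s*File:[^\]]+\]\]", re.IGNORECASE)
IMAGE_THEN_CLEAR = re.compile(r"\[\[\s*File:[^\]]+\]\]\s*\{\{\s*clear\s*\}\}", re.IGNORECASE)
TEMPLATE = re.compile(r"\{\{[^{}]*\}\}")
HEADING = re.compile(r"^==\s*(.+?)\s*==[ \t]*$", re.MULTILINE)


def check_script(script, min_slides=3):
    """Return a list of rule violations; an empty list means no problems found."""
    problems = []
    if not script.lstrip().lower().startswith("{{videowiki}}"):
        problems.append("missing {{Videowiki}} header")

    # re.split with one capture group yields [preamble, title, body, title, body, ...].
    parts = HEADING.split(script)
    slides = list(zip(parts[1::2], parts[2::2]))
    if len(slides) < min_slides:
        problems.append(f"only {len(slides)} slide(s), expected at least {min_slides}")

    for title, body in slides:
        if len(IMAGE.findall(body)) != len(IMAGE_THEN_CLEAR.findall(body)):
            problems.append(title + ": image without {{clear}} after it")
        # Remove citations, images and templates so only the spoken text remains.
        text = TEMPLATE.sub("", IMAGE.sub("", REF.sub("", body))).strip()
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
        if len(sentences) > 1:
            problems.append(f"{title}: {len(sentences)} sentences, expected at most one")
    return problems
```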
### Workflow
1. User provides Wikipedia article URL
2. Fetch raw wikitext: `https://en.wikipedia.org/w/index.php?title=ARTICLE&action=raw`
3. Parse for all File: references
4. Verify images exist on Commons
5. Create script using verified images
6. Use gallery images to add variety
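Step 4 is where hallucinated filenames tend to slip through, so it may help to check existence against the Commons API rather than trusting the model. The sketch below builds a MediaWiki Action API query for a batch of filenames and interprets a response that has already been fetched and decoded from JSON. The function names are hypothetical, the network call is deliberately left out, and the response handling assumes `formatversion=2`, where a page that does not exist carries a `missing` flag.

```python
from urllib.parse import urlencode

COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def commons_query_url(filenames):
    """Build one query URL for a batch of filenames (the API caps titles per request)."""
    params = {
        "action": "query",
        "titles": "|".join(f"File:{name}" for name in filenames),
        "format": "json",
        "formatversion": "2",
    }
    return f"{COMMONS_API}?{urlencode(params)}"


def existing_titles(api_response):
    """Return the page titles in a decoded response that are not flagged missing or invalid."""
    pages = api_response.get("query", {}).get("pages", [])
    return {
        page["title"]
        for page in pages
        if not page.get("missing") and not page.get("invalid")
    }
```

The API may return titles in a normalised form, for example with underscores replaced by spaces, so comparisons against the extracted filenames should normalise both sides the same way.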
### Example
See: Birthday cake article as reference implementation
## 2. **Use Projects Feature** (If Available)
If you have access to Claude's Projects feature, you can:
- Create a "VideoWiki Generator" project
- Add this conversation as project knowledge
- Add the skill file as custom instructions
## 3. **Create a Prompt Template**
Save this as a text file you can paste:
```
Create a VideoWiki script for: [WIKIPEDIA ARTICLE URL]
Instructions:
1. Fetch raw wikitext from: https://en.wikipedia.org/w/index.php?title=[ARTICLE]&action=raw
2. Extract all File: references from the wikitext
3. Prioritize images from: main article image, section images, then gallery
4. Verify all filenames exist on Wikimedia Commons
5. Create slides with max 1 sentence each
6. Use VideoWiki format with {{Videowiki}} header
7. Include <ref> tags from original article
8. Ensure each image has {{clear}} after it
Minimum 3 slides, adapt length to article.
```
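Filling the two placeholders by hand is error-prone for titles with special characters, so the substitution could be scripted. This is a small sketch with hypothetical function names. It assumes the article URL uses the standard `/wiki/Title` path and reuses the path segment as it appears in the URL, so any percent-encoding is carried over unchanged.

```python
from urllib.parse import urlparse


def article_slug_from_url(article_url):
    """Return the title segment of a /wiki/ URL, keeping its percent-encoding."""
    path = urlparse(article_url).path
    prefix = "/wiki/"
    if not path.startswith(prefix) or len(path) == len(prefix):
        raise ValueError(f"not a /wiki/ article URL: {article_url}")
    return path[len(prefix):]


def fill_prompt(template, article_url):
    """Substitute the template placeholders for one article."""
    slug = article_slug_from_url(article_url)
    return (
        template
        .replace("[WIKIPEDIA ARTICLE URL]", article_url)
        .replace("[ARTICLE]", slug)
    )
```

Plain string replacement is used instead of `str.format` because the template contains literal `{{Videowiki}}` and `{{clear}}` braces.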