Talk:Data dumps

From Meta, a Wikimedia project coordination wiki

Note that this page is not monitored by those who can resolve all such problems. Please subscribe to and send mail about these issues to the appropriate mailing list, xmldatadumps-l.

Download entire wiki "Talk" pages[edit]

When will the entire wiki "Talk" pages be available for download?

Incorrect wiktionary dumps in Turkish[edit]

Turkish Wiktionary titles are converted to uppercase using incorrect collation in the MySQL dumps. Turkish has different casing for i and ı (dotless i), which convert to İ and I respectively. However, in the dumps the title "İngilizce" is converted to "İNGILIZCE" instead of "İNGİLİZCE". This makes almost all of the data useless.
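For anyone post-processing the dumps, the broken titles can at least be detected or regenerated client-side. A minimal sketch of Turkish-aware uppercasing (an illustration of the casing rule, not the MediaWiki/MySQL collation code):

```python
# Turkish has two distinct "i" letters: dotted i/İ and dotless ı/I.
# A locale-unaware uppercase maps i -> I, which is exactly the broken
# form seen in the dumps ("İNGILIZCE"). Correct Turkish uppercasing
# must map i -> İ (U+0130) and ı -> I before the generic pass.

def turkish_upper(s: str) -> str:
    """Uppercase a string using Turkish casing rules for i and ı."""
    return s.replace("i", "İ").replace("ı", "I").upper()

print(turkish_upper("İngilizce"))  # İNGİLİZCE -- not the broken İNGILIZCE
```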


Download Ireland articles and all templates[edit]

Does anyone know how to download all Ireland-related English Wikipedia articles and all templates? I just want something like the navbox and all the pages in the Ireland category.

enwiki-20080103-pages-meta-history.xml.bz2[edit]

Arrrgggghhhh!!! After 148 hours of downloading, I was 97% done with enwiki-20080103-pages-meta-history.xml.bz2 when someone 404'd it!!! Now we are back to having NO complete Wiki dumps available. Is this a secret policy, or what?

It is available in the Internet Archive, exactly here (133 GB). Emijrp 18:32, 12 August 2010 (UTC)
The .7z version of that file is listed at 17.3 GB. Surely that can't really be the same data—can it? MoreThings 00:18, 11 September 2010 (UTC)

Frequent abort / fail[edit]

Dumps frequently fail and then it takes quite a long time until a new one is prepared.

Also, many dumps often fail one after another and a lot of red lines appear at http://download.wikimedia.org/backup-index.html . I don't know how the dumping works, but maybe there's one bug that causes them all to fail. If one dump fails, then maybe the problem that caused it to fail causes the subsequent ones to fail and they are not retried until the next cycle.

All these observations are very amateur, so feel free to correct me.

If it cannot be fixed right away, can it at least be explained here at the main page, Data dumps?

I don't know about other projects, but on the Hebrew Wikipedia we frequently use it for analyzing and improving interwiki links (see en:Wikipedia:WikiProject Interlanguage Links/Ideas from the Hebrew Wikipedia) and for other purposes.

Thanks in advance. --Amir E. Aharoni 15:37, 30 July 2008 (UTC)

Well, dumps failed on 2008-08-01, and now it is 2008-08-26. I think it's embarrassing for Wikimedia. :( --Ragimiri 16:10, 26 August 2008 (UTC)

What's worse (from my point of view at least) is that the "small" dumps work fine, but happen (or don't happen) at the mercy of the largest dumps, which as noted very often fail right away, run for a long time and then fail, or now and again, run for a very long time and actually succeed. It's a real shame we can't have these run every month (say), on a particular date, separately from the large dumps. However, it's probably entirely in vain to comment and complain here: I don't think the server admins/devs monitor this page. Whether they'd pay any attention to requests on wikitech-l remains to be seen. Alai 18:53, 4 September 2008 (UTC)


What's even worse is that I offered, years ago, to take the dumps and run with them, as it were. Instead a whole load of dev time went into smartening them up, but they cannot be a high priority for them. Rich Farmbrough 22:04 4 October 2008 (GMT). 22:04, 4 October 2008 (UTC)

en dump has "ETA 2009-07-25"?[edit]

Would I be going way out on a limb, were I to speculate that this might be yet another failure mode for the full en dump that we're currently in? Alai 07:06, 6 November 2008 (UTC)

No, it's not a failure; it's just a bad estimate. The full history dump does take a really long time, but (assuming it's allowed to run to completion) it'll finish well before July. Already it's estimating completion in May, so that's something. --Sapphic 04:21, 1 December 2008 (UTC)
Oh, that's all right, then. </sarcasm>. The "long time" the full history dump typically takes is around six weeks, not the thick end of a year. It's very clear that something is very badly broken here. Alai 19:14, 7 January 2009 (UTC)

ImportDump.php killed[edit]

First, I am sorry for my English.

I am trying to import the dump of the Ukrainian Wikipedia. After 5 minutes of importing I receive the message "Killed". I have changed the file php.ini and set the following parameters:

upload_max_filesize = 20M

post_max_size = 20M

max_execution_time = 1000

max_input_time = 1000

But I still receive the same message "Killed" after 5 minutes of importing (it imports 8,000 pages at most). The web host's support has no idea what is going wrong.

Please, help me! Thank you! --93.180.231.55 20:58, 26 December 2008 (UTC)

This means that someone or something explicitly killed the process, probably because it consumed too many resources. Many shared hosting places, universities, etc., kill processes automatically after a couple of minutes. Please talk to your local system admin. -- 81.163.107.36 10:31, 27 December 2008 (UTC)

Stub dumps[edit]

I just added a mention of the stub dumps, which I believe contains correct information. The stub dumps are useful for research purposes-- and MUCH easier to work with size-wise-- so I hope they will continue to be generated. 209.137.177.15 06:40, 3 March 2009 (UTC)

Problem with split stub dumps ?[edit]

I don't know if this is the right forum for this request, but frwiki, dewiki and even enwiki seem to repeatedly fail or take too long and get killed, apparently as a result of the long delay required for dumping "split stubs". Would it be possible to reorder the dumps so that key dumps like pages-articles.xml.bz2 are produced before these split-stub dumps? --66.131.214.76 21:49, 11 March 2009 (UTC) Laddo talk

Any updates? answers? 87.68.112.255 08:20, 29 March 2009 (UTC)

eswiki-pages-articles articles and templates[edit]

I downloaded eswiki-20090615-pages-articles and imported it with mwdumper.jar. Now, when I go to see an article, there are articles in the place of the templates, and I see those articles on many other pages. Last year I followed the same procedure and the main page looked right; now it does not. In the place of my templates I have articles. I have run rebuildall.php and the problem continues --Enriluis 18:43, 30 June 2009 (UTC)

'current' symlinks to the latest dumps[edit]

Could we get static symlinks to the latest dumps? Many tool developers at the Toolserver (but not only them) would probably benefit from an address like http://download.wikimedia.org/plwiki/20090809/plwiki-current-pages-articles.xml.bz2 instead of http://download.wikimedia.org/plwiki/20090809/plwiki-20090809-pages-articles.xml.bz2. We could use it to perform periodic jobs by downloading the newest dumps without guessing or looking at the download page. That would certainly help us automate these jobs. An example of such a job is http://toolserver.org/~holek/stats/bad-dates.php, which presents dates in articles that are written improperly with respect to Polish grammar. Hołek ҉ 16:21, 10 August 2009 (UTC)
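In the meantime, the date juggling can be scripted away: fetch the list of available dump dates once (e.g. from the backup index) and build the URL from the newest one. A small sketch -- the wiki name, dates, and file suffix here are just examples:

```python
# Build the URL of the newest available dump from a list of dump
# dates, so periodic jobs need no hard-coded date. The URL pattern
# matches the dated links quoted above.

BASE = "http://download.wikimedia.org"

def latest_dump_url(wiki, dates, suffix="pages-articles.xml.bz2"):
    """dates are YYYYMMDD strings, so max() picks the newest."""
    newest = max(dates)
    return f"{BASE}/{wiki}/{newest}/{wiki}-{newest}-{suffix}"
```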

Missing IDs in abstract.xml[edit]

Is there a reason why the article-ID isn't contained in the "abstract.xml"-files? - in contrast for example to the pages-articles.xml. Ecki 12:25, 16 December 2009 (UTC)

mislinked md5sums file for latest wiki[edit]

I'm not sure this is the right place to point this out, but I noticed that the md5sums file for http://download.wikimedia.org/enwiki/latest/ is mislinked. It points to the md5sums from an older dump (2009-Oct-31) while the dump is from 2009-Nov-28. This is a problem for tools that fetch the latest dump and use the checksum to check for corruption.

More exact figures on uncompressed dump size?[edit]

It says that the dumps can uncompress to 20 times their size. Can anyone give a more exact figure on what the uncompressed size of the enwiki would be? (I need to know how big a drive to buy to hold it.) Thanks, Tisane 23:34, 3 March 2010 (UTC)

Depending on what you want to do with the data, there may not be any need to uncompress it. If you're writing a tool to scan through the file and extract metadata, you can simply uncompress the data in memory as you're reading the file, and only write the (presumably much smaller) metadata to disk. If you're looking to import the data into a database, you're going to run into bigger problems than storage space (the full history dump far exceeds the capacity of most database software.. the Wikimedia foundation itself splits the data across several MySQL databases.) --Oski Jr 18:17, 5 March 2010 (UTC)

How big would image bundles be if they existed[edit]

How large would commons and enwiki image bundles be uncompressed if they existed? 71.198.176.22 19:02, 16 May 2010 (UTC)

Enwiki: 200 GB, Commons: 6 TB -- Daniel Kinzler (WMDE) 11:28, 17 May 2010 (UTC)
And growing. It would be nice to make tarballs of thumbnails. Emijrp 18:24, 12 August 2010 (UTC)

Oldest Wikipedia dump available?[edit]

At, for example, http://dumps.wikimedia.org/itwiki/ I see there are 6 dumps available, the latest 2010-Jun-27, the oldest 2010-Mar-02.

My question is: are older dumps stored somewhere? Is it possible to get them? Even via a script on the database... Thanks! --Phauly 15:42, 29 June 2010 (UTC)

No, because new dumps contain all info of previous dumps, except things they shouldn't contain (deleted private data and so on). --Nemo 21:11, 11 November 2010 (UTC)
Actually that is only true of the full history dump. I found the older dumps extremely useful; however, space constraints mean that I can only keep a couple. Rich Farmbrough 18:52 7 January 2011 (GMT).

How could I download only Category Pages ?[edit]

I wish to utilise wikipedias category structure, could I just download the category pages? Chendy 22:23, 9 August 2010 (UTC)
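There is no category-only dump, but category pages can be pulled out of the regular pages-articles dump by title prefix -- "Category:" on the English Wikipedia; other languages use their own prefix. A rough sketch:

```python
# Extract only the Category: pages from a pages-articles XML stream.
# The dump's <title> carries the namespace prefix, so a title check
# is enough for a quick extraction.

import io
import xml.etree.ElementTree as ET

def category_pages(xml_stream):
    out = []
    for _, elem in ET.iterparse(xml_stream):
        tag = elem.tag.rsplit("}", 1)[-1]   # strip any xmlns prefix
        if tag == "title" and elem.text and elem.text.startswith("Category:"):
            out.append(elem.text)
        if tag == "page":
            elem.clear()                    # keep memory flat on big dumps
    return out
```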

Dump server still down?[edit]

--eom-- अभय नातू 09:45, 24 December 2010 (UTC)

Seems like the 'Dump process is idle.'. I'm waiting for it to start again. EmilS 19:02, 27 December 2010 (UTC)
My pet AI is starving. Two months without juicy xml dumps!
Yes, it is extremely unfortunate. I thought new hardware was being sourced. There's a bugzilla entry on the failure. Rich Farmbrough 10:26 4 January 2011 (GMT).
When the dump page was completely down there was a page with updates, does anyone have the link to that page? Rich, do you have a link to that bugzilla? EmilS 10:37, 4 January 2011 (UTC)
It was wikitech:Dataset1. Anyway there's a mailing list for updates, mail:xmldatadumps-l. --Nemo 19:08, 4 January 2011 (UTC)

The only entry in the mailing list for 2011 says

-- cut here --

Hello,

first a happy new year to everybody. Are there any news about generating new dumps?

Best regards Andreas

-- cut here--

Rich Farmbrough 18:38 7 January 2011 (GMT).
There's some hints at http://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28technical%29#Dumps. Rich Farmbrough 18:39 7 January 2011 (GMT).
A 14 December posting says "We have also been working on deploying a server to run one round of dumps in the interim." I wonder what happened to that. Rich Farmbrough 18:42 7 January 2011 (GMT).
And now we are back in business! :) EmilS 13:58, 11 January 2011 (UTC)

queue?[edit]

Why are some wikis being dumped for the second time in January, while pl.wiki (and 5 others: it, ja, ru, nl and pt) have had no new copy since November? Malarz pl 19:14, 18 January 2011 (UTC)

Locked/closed project dumps?[edit]

Is there any reason that locked projects continue to get database dumps, when it's obvious that nothing has changed (since the database for a locked project is read-only)? The Simple English Wikiquote's most recent database dump was 04-04-2011, despite the fact that the project has been closed since at least February 2010. I don't know how resource-intensive it is to make a database dump of a wiki, but surely there's no point in releasing new dumps if nothing has changed. Avicennasis 18:41, 7 April 2011 (UTC)

I asked the people managing the dumps about this, and they said that the reason we keep dumps is because if someone wants a copy they have a right to it. We could change formats or add new bits of data, and most of the wikis still allow changes from users like stewards if necessary. They're happy to keep running new dumps, and it's not really a big issue.
They also pointed out that future feedback should probably be sent to xmldatadumps-l and not on wiki talk pages. :-) Cbrown1023 talk 20:45, 8 April 2011 (UTC)
Ah. I knew others still have a right to the data, since it's released under our favorite licenses, CC-BY-SA 3.0/GFDL. I just didn't see the point in producing a new dump when, assuming no changes could be made, the older dumps would be identical. It hadn't occurred to me that there might still be changes at a much higher level.
As a side note, I didn't think to ask on the mailing list, since it's mentioned on the content page for "trouble with the dump files", and I didn't think this was trouble. Thank you, however, for reaching out to the right people, and for the quick response! Avicennasis 11:01, 9 April 2011 (UTC)

Dumps page sorting[edit]

Hello, can someone tell me how the dumps are now sorted on this page: http://dumps.wikimedia.org/backup-index.html

Is it due to a problem? (Before, the page was sorted by last-modified date.) Thanks, Jona 07:22, 1 October 2011 (UTC)

The dumps are sorted by the date the dump was started. Sometimes one part of a dump will be re-run much later and so the date of the status file, which is the timestamp you see in the index.html page, reflects that. This is the reason you see an order that sometimes looks a bit odd. Typically the larger wikis (de fr it pt en) are the ones that might have a step redone. I would recommend you get on the xmldatadumps-l mailing list (see link at top of this page), where you can ask questions like this and get a much more timely response :-) -- ArielGlenn 09:03, 2 December 2011 (UTC)

rebuildImages.php fails[edit]

I copied all images to a new MediaWiki installation. I have done an importDump.php. Then I want all images to be recreated with rebuildImages.php --missing, but it fails!

Some of the copied files have strange characters in their filenames. I think the script tries to rename the file, but it cannot find the method ImageBuilder::renameFile().

Can somebody help? Thx a lot. --Rolze 13:30, 10 January 2012 (UTC)

C:\PROGRAMS\xampp\htdocs\wiki\maintenance>..\..\..\php\php.exe rebuildImages.php --missing
Fatal error: Call to undefined method ImageBuilder::renameFile() in C:\PROGRAMS\xampp\htdocs\wiki\maintenance\rebuildImages.php on line 203

Notes from a crazy person[edit]

I threw some notes down at mw:Laxative about ways to make scanning database dumps very fast. I'm not sure anything will ever come of it, but just so the page isn't completely orphaned, there's now a link from Meta-Wiki. :-) --MZMcBride (talk) 22:10, 12 March 2012 (UTC)

7-zip actually is better for non-full dumps[edit]

This page claims that "SevenZip's LZMA compression produces significantly smaller files for the full-history dumps, but doesn't do better than bzip2 for our other files." In an experiment I discovered that most of the gzip dumps can be made about 40% smaller by 7-zipping with maximum compression settings, and the bzip2 dumps made about 20% smaller. For example, enwiki-20120307-stub-meta-history.xml.gz is 20.2 GB, but compressed with 7z is 11.7 GB. I realise that 7-zipping everything with maximum compression settings would consume a lot of CPU and make it harder to turn out dumps on time, but I think the saved bandwidth would justify it - many overseas downloaders have to pay high rates per gigabyte for bandwidth. 7-zip is able to take advantage of multiple cores, which are rapidly becoming more plentiful. At the least, for downloaders who do need smaller dumps, we should let them know it's possible for them to recompress after downloading to save storage space. Dcoetzee (talk) 15:08, 1 April 2012 (UTC)

how to create "compatible" dumps[edit]

Hi, I run www.ameisenwiki.de and I want to create dumps for WikiTaxi. I need the pages-articles.xml.bz2 format. Currently I try this with php dumpBackup.php --full > /var/www/wiki/dump/pages-articles.xml and create the .bz2 file afterwards. WikiTaxi is not able to import it (parser error). If I use dumpgenerator.py I also get an incompatible XML file. Which tool is used to create these exports?

A 7z compression test[edit]

I tried to achieve better 7z compression of the dumps than the current one. As a test, I downloaded the English Wikipedia full dumps from 2013-06-04 and recompressed them with the following 7z options:

-mx9 -md=30 -mfb=270 -mlc=4

The original size of the full dumps was 70,187,341K. The size after the recompression was 57,441,684K - about 22% smaller.

The upsides are:

  • disk space economy
  • bandwidth economy

The downsides are:

  • significantly more CPU usage: the recompression took about 80 days on my AMD A4-4000
  • significantly more memory usage: with these options 7z requires about 10.5 GB of RAM.

Depending on the current CPU/RAM situation of the Wikimedia hardware and the cost analysis, it may be justified or not to use this stronger compression mode.

Also, grouping exported pages not by page IDs but by a category system may additionally decrease the compressed file size somewhat. However, I wasn't able to formulate an algorithm for choosing the best category hierarchy to follow. -- Григор Гачев (talk) 13:11, 28 September 2013 (UTC)
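For reference, the 7z switches above map directly onto the LZMA parameters exposed by Python's lzma module, which makes it easy to experiment on a sample before committing 80 days of CPU. The dictionary size below is deliberately reduced: 2^30 (as in -md=30) needs roughly 10.5 GB to compress, so this sketch uses 2^24 to stay runnable:

```python
# The 7z switches map onto LZMA filter parameters:
#   -md -> dict_size, -mfb -> nice_len, -mlc -> lc.

import lzma

FILTERS = [{
    "id": lzma.FILTER_LZMA2,
    "dict_size": 1 << 24,   # -md=30 would be 1 << 30 (~10.5 GB to compress)
    "nice_len": 270,        # -mfb=270
    "lc": 4,                # -mlc=4
}]

def recompress(data: bytes) -> bytes:
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=FILTERS)
```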

Would it help to use Efficient XML Interchange (EXI)?[edit]

XML files often have a large overhead due to the length of the tags, though of course good compression can alleviate that. This is addressed via EXI - see the Efficient XML Interchange Working Group public page. Has anyone looked at whether publishing dumps in EXI would help with the time or space issues? Nealmcb (talk) 17:17, 3 July 2015 (UTC)

Documentation of content of XML files[edit]

Where is there documentation of the content of the various dumps? The names are not fully self-explanatory. For example, what are "primary meta-pages"? Are we just supposed to know enough XML to figure it out by downloading and processing each type of dump file? Specifically, which dumps have pages from the Appendix namespace? Or enwikt's Citations namespace? Presumably that differs by project. Should we have specific documentation for each project that expresses an interest? DCDuring (talk) 15:08, 4 July 2015 (UTC)

Wikispecies species download[edit]

Hello everyone,

I already posted this question on the wikispecies discussion page, but maybe you could give me an answer to this too:

Is there a possibility to download the core species data behind the Wikispecies project? Or, rephrased: is there a dump that contains just the main pages (assuming they are species) and their respective links to successive species pages?

I'm aware that you can download the entire Wikispecies, which leaves you with a 5 GB file. Converting that is naturally something I'd like to avoid, as I am basically just looking for the core relational data.--Taeping5347 (talk) 13:17, 18 April 2016 (UTC)

If links are all you need, a pagelinks table dump or even database query may be easier. Extracting more information may require an alternative parser like mediawiki-utilities. Nemo 17:15, 18 April 2016 (UTC)
Thank you for your hint. I will give the pagelinks a try, but I suspect it'll consist of a lot of pages that are not actually of interest. It would be great if this specific kind of dump could be implemented for the Wikispecies project, where the actual relation between pages carries very real information.--Taeping5347 (talk) 17:49, 18 April 2016 (UTC)
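For what it's worth, the pagelinks dump can be filtered without loading it into a database: each batch in the .sql dump is one long INSERT statement, and the (pl_from, pl_namespace, pl_title) tuples can be pulled out with a regex and then restricted to the titles of interest. A rough sketch over a synthetic statement (the exact schema may differ between dump versions):

```python
# Pull (page_id, namespace, title) rows out of a pagelinks .sql dump
# without a database. Titles in the dump are single-quoted with
# backslash escapes; the regex below tolerates escaped quotes.

import re

ROW = re.compile(r"\((\d+),(\d+),'((?:[^'\\]|\\.)*)'\)")

def parse_pagelinks(sql):
    return [(int(a), int(b), t.replace("\\'", "'"))
            for a, b, t in ROW.findall(sql)]
```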

Talk pages?[edit]

Where can I download the talk pages for EN? Thanks. -- Green Cardamom (talk) 04:51, 12 August 2016 (UTC)

pages-meta-current (related: phabricator:T99483). Nemo 07:18, 12 August 2016 (UTC)

Download all files in one go.[edit]

The dump site should offer packages which provide a full download of everything -- pages, talk pages, images, etc. -- of a Wikimedia project in one go. This could even provide a total backup of an entire project in one folder, allowing users to download it more easily. Wetitpig0 (talk) 10:51, 5 September 2016 (UTC)

What encoding of sha1 in dump xml[edit]

Which part of the revision is used to calculate the sha1? And what encoding (base-what?) is used to convert the sha1 to text?

...
{{Внешние ссылки нежелательны}}</text>
      <sha1>7oe1255dhbgmwbyh4bla8qh0zlq7m2o</sha1>
    </revision>
  </page>
...

I found it in the ruwiki-20170501-pages-articles.xml.bz2 dump. This is the last revision of RU:Magnet-ссылка. Ivan386 (talk) 15:56, 12 May 2017 (UTC)
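As far as I can tell from the MediaWiki source, the value is the SHA-1 of the revision's text content (the wikitext between the text tags, UTF-8 encoded), re-encoded in base 36 and left-padded with zeros to 31 digits -- which is why it is 31 lowercase letters and digits rather than 40 hex characters. A sketch that reproduces the format:

```python
# Reproduce MediaWiki's <sha1> field: SHA-1 of the revision text,
# written in base 36 (digits 0-9 then a-z) and zero-padded to 31
# digits. 36^31 > 2^160, so 31 digits always suffice.

import hashlib
import string

DIGITS = string.digits + string.ascii_lowercase  # "0123456789abc...z"

def mediawiki_sha1(text: str) -> str:
    n = int(hashlib.sha1(text.encode("utf-8")).hexdigest(), 16)
    out = ""
    while n:
        n, r = divmod(n, 36)
        out = DIGITS[r] + out
    return out.rjust(31, "0")
```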