It's very nice that you found a way to import old translations into the new system automatically, saving translators so much time. I hope you can share something about it at some point, however rough it may be. Thank you very much, Nemo 12:27, 21 August 2012 (UTC)
- Yes, I'm planning to release the code somewhere eventually, but it indeed needs cleanup - feel free to remind me a few weeks from now.
- Basically, the importer is a Python script that takes the source text and applies quite a few regexes to remove extraneous formatting and separate the text into translation units. It then rearranges them according to a multivalued mapping (because some translated text may be reused in several translation units), arriving at an output file that is then fed into a slightly modified version of pagefromfile.py. Regards, Tbayer (WMF) (talk) 23:29, 25 August 2012 (UTC)
- Hello Tbayer (WMF), as you know we're now working on this, so it would be nice if you could email the unpolished code to BPositive by June 7. I'm not sure how much effort we'll be able to spend on the markup cleanup and page preparation/tagging, but if you already have a set of regexes it would be silly not to try them. :) --Nemo 15:40, 31 May 2014 (UTC)
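The pipeline described above could look roughly like the following. This is only a minimal sketch, not the actual importer: the two regexes, the blank-line splitting rule, and the names `clean`, `split_units`, and `build_pagefile` are all hypothetical. It assumes the default `{{-start-}}` / `{{-stop-}}` page markers and `'''Title'''` title line of the stock pagefromfile.py (the modified version mentioned above may differ), and it assumes the Translate extension's `Translations:<page>/<unit>/<language>` naming for unit pages.

```python
import re

# Default page delimiters of the stock pagefromfile.py (assumed here;
# the modified version used for the real import may differ).
START, STOP = "{{-start-}}", "{{-stop-}}"


def clean(text):
    """Strip some formatting. These regexes are illustrative only."""
    text = re.sub(r"<[^>]+>", "", text)  # drop HTML tags
    text = re.sub(r"'{2,}", "", text)    # drop wiki bold/italic quote runs
    return text.strip()


def split_units(source):
    """Split the source on blank lines and clean each piece."""
    pieces = (clean(p) for p in re.split(r"\n\s*\n", source))
    return [p for p in pieces if p]


def build_pagefile(segments, mapping, page, lang):
    """Emit one page block per (segment, unit) pair.

    `mapping` is multivalued: segment index -> list of unit numbers,
    because one translated segment may be reused in several units.
    """
    blocks = []
    for seg_index, unit_ids in sorted(mapping.items()):
        for unit_id in unit_ids:
            title = f"Translations:{page}/{unit_id}/{lang}"
            body = segments[seg_index]
            blocks.append(f"{START}\n'''{title}'''\n{body}\n{STOP}")
    return "\n".join(blocks)


source = "<p>Bonjour</p>\n\n''Merci''\n"
segments = split_units(source)
# Hypothetical mapping: the second segment is reused in units 2 and 5.
mapping = {0: [1], 1: [2, 5]}
output = build_pagefile(segments, mapping, "Survey", "fr")
```

Writing `output` to a file would give something that can be passed to pagefromfile.py. As far as I know, the stock script keeps the title line in the page text unless its `-notitle` option is given, so that is worth checking before a real run.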
- Hi HaeBot! Congratulations on your work, but we would appreciate accented vowels instead of strings in French, unless you can tell us how to change them automatically. Thanks. --Cquoi (talk) 15:16, 25 August 2012 (UTC)
- Hi Cquoi, I assume you are talking about HTML entities such as "&amp;egrave;" for "è". I agree that they are a bit harder to read in the source wikitext, but they actually display fine both here on Meta and in Qualtrics (the survey platform we are using), so I didn't include code to convert them from the import source this time, and you don't need to convert them manually. Regards, Tbayer (WMF) (talk) 23:29, 25 August 2012 (UTC)
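For anyone who does want to convert such entities automatically, here is one possible approach, sketched with Python's standard `html.unescape`. It is not part of the importer discussed above. The function name `decode_letter_entities` and the choice to decode only entities that resolve to a single letter are assumptions made for this example, so that markup-significant entities such as `&amp;` or `&lt;` stay untouched in the wikitext.

```python
import html
import re

# Matches named, decimal, and hexadecimal character references.
ENTITY = re.compile(r"&(?:#\d+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def decode_letter_entities(wikitext):
    """Replace entities that decode to a single letter; keep the rest."""

    def replace(match):
        decoded = html.unescape(match.group(0))
        # Only accept results like "è"; leave "&amp;", "&lt;", and
        # unknown names exactly as they were.
        if len(decoded) == 1 and decoded.isalpha():
            return decoded
        return match.group(0)

    return ENTITY.sub(replace, wikitext)


sample = "Tr&egrave;s bien &amp; &lt;br&gt; en &#233;t&eacute;"
converted = decode_letter_entities(sample)
```

Here `converted` keeps `&amp;` and `&lt;br&gt;` as they are and only turns the accented-letter entities into characters.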