Talk:Community Tech/SVG translation

Prototype

Hello everyone! Our fantastic UX designer, PSaxena (WMF) made nice prototypes for you all to look at to get an idea of what the tool will look like. You can take a look at it using the links below which show some different images in the designs. Please note that it's a prototype and does not actually work. You will not be able to add translations to files or upload them to Commons. There are only a fixed number of files (linked below) that you can look at. Features that are in the prototype may or may not be present in the final tool - they are only meant to show the design. Your feedback would be welcome.

Pinging all the usual suspects - @Waldir, Strainu, Ruthven, Dvorapa, Glrx, and JoKalliauer:. Looking forward to hearing your feedback on the general design of the tool and any concerns (or appreciations!) you might have. I'll note again that everything you see here is only for design purposes and won't actually work. Also none of this is final and it can be changed based on feedback. Thank you so much. -- NKohli (WMF) (talk) 00:08, 10 November 2018 (UTC)

Hi NKohli, great job! The interface looks simple and very intuitive to use.
Doing some tests, I've noticed a couple of possible improvements:
  • it is not possible to blank one word, when this can be useful in certain cases (e.g. a densely annotated map);
  • certain special characters are not read (e.g. &);
  • it could be useful to be able to reorder the labels (e.g. ATLANTIC/OCEAN -> OCEAN/ATLANTIC), because in certain languages the order is different (e.g. Italian: Oceano/Atlantico). Of course "Oceano" can be written instead of ATLANTIC, if the semantic value of an individual label will never be considered.
I'll keep testing. Thanks again! --Ruthven (msg) 07:45, 10 November 2018 (UTC)
Thanks for your comments, Ruthven! To the points you raised - you can blank a word by entering empty space. This is so people don't accidentally forget to enter translations. Does that seem like a good idea? About special characters not being read - it will work in the final tool, the prototype has bugs. :) About moving label order - that's an interesting idea and I will talk to the engineers about it. There are challenges on how we will be able to remember that information in the SVG file because there is no good way that SVG format allows us to store different label-ordering for different languages, unfortunately. -- NKohli (WMF) (talk) 21:35, 13 November 2018 (UTC)
I have just 2 more things in addition to what Ruthven said:
  • The list of languages is huge; it would be great to have a hierarchical list like the one we have for Wikipedia interwiki links.
  • There seems to be no place for suggestions. I know this is a non-mvp feature, but changing the design a second time seems a bit useless.

Thanks for the open development process, I really appreciate it. Strainu (talk) 09:18, 10 November 2018 (UTC)

Thanks for the feedback, Strainu. We are going to integrate ULS in the tool (File:ULS-GeoIP.png) so language selection is easier. I don't understand the second point you made - can you elaborate on it? Thank you. -- NKohli (WMF) (talk) 21:35, 13 November 2018 (UTC)
@NKohli (WMF): I meant that having the image on the right of the translations leaves no place for the translation suggestions. Based on previous experiences in the Wikimedia world I would have expected the image to be on top of the strings (and stay there) and the strings to have an interface much like translatewiki, expanding and compressing as the user goes to the translation.--Strainu (talk) 23:28, 13 November 2018 (UTC)
@Strainu: I see what you mean now. Thanks for clarifying. Keeping the strings on top of the image will work on desktop but probably not on mobile and we will run into issues with larger images. Good point about suggestions though. I will discuss that with the designer and see if we can make the design work better for suggestions (though suggestions are not a part of the MVP list of features, sorry). - NKohli (WMF) (talk) 23:51, 13 November 2018 (UTC)
Wow, it looks really simple, but does the job really well. I've found only some minor and some major issues. Minor: First, the name of the Czech language contains a typo (čEština -> Čeština). Second, if I change a language, the dialog popup shows all my translations will be lost, but after clicking OK only the first translation is lost. Third, on the first load no buttons, texts, or input fields were shown, needed to reload. Major:
  • I love the interface (how the image has a fixed position, but the page scrolls). I would just make the header narrower, because I don't like having to scroll so much to reach the actual image. Just a few pixels would probably do.
  • I still miss what I've already mentioned: an upload button for translated files (from Commons). For many images there are already translated files uploaded to Commons with a title like "Original_image-pt.svg". It would be super useful to just upload the existing file rather than work out or copy texts from the old file (if they have not been converted to curves). The text is usually in the same place; it just has to be extracted (if possible) and merged into the original.
--Dvorapa (talk) 14:22, 10 November 2018 (UTC)
@Dvorapa: Thanks for the feedback! All of the minor points you wrote will be fixed in the actual tool. They are bugs in the prototype which we are not working on fixing because it is only for design purposes. About the major points - good point about making the header narrower, I will convey that to the product designer. For the second part - I don't know if we will be able to get around to doing the merging anytime soon but it is still on my optimistic list of non-MVP features. I appreciate your feedback. -- NKohli (WMF) (talk) 21:35, 13 November 2018 (UTC)
numbers are very much off (where does 4 and 5 belong, they are in the middle of two arrows)
@NKohli (WMF):I have some questions:
I know those are features that might still have to be implemented, but you could already add them to the graphical user interface now, maybe with the comment (feature in progress), and just link them to their real function later.
 — Johannes Kalliauer - Talk | Contributions 18:56, 10 November 2018 (UTC)
@JoKalliauer: Re-arranging text is not in the initial list of features so you cannot do that yet. See the non-MVP feature list. We are focusing on adding the basic functionality first. For choosing a language - you wouldn't necessarily have to. But if you are adding translations for Punjabi and want to use Hindi to compare with, it is easier than using only numbers. By default, you will get the Default option in the language dropdown so you don't have to pick. If the prototype doesn't do that, you can ignore it. It is only for demonstration purposes. Lastly, the tool will only create language-switch files. All translations will be added to the existing file and then uploaded to Commons. The tool will not create separate files for all translations. Thanks for your feedback. Much appreciated. -- NKohli (WMF) (talk) 21:52, 13 November 2018 (UTC)

SVG serving in wiki language by default

Hello everyone! I have good news. After our discussions in the past few weeks, we realized that serving SVGs in English by default is a big problem and spent some time working on it. We have been working on it (task T205040) for the last few weeks and I am glad to say that we are close to launching it. The feature is available to test on the beta cluster. I have been testing it with this file which has a lot of switch translations. I tested it by embedding it on the German beta wikipedia (here) and Hebrew beta wikipedia (here). As you can see, in both cases the file renders in the wiki languages (German and Hebrew) without the need for a `lang` parameter in the syntax. We will be launching this soon on all wikis after we have tested it more thoroughly. I hope you all will take some time to test it too and tell me what you think of it. I will wait for feedback before we do a rollout to all projects. -- NKohli (WMF) (talk) 00:25, 1 November 2018 (UTC)

@Waldir, Strainu, Ruthven, Dvorapa, Glrx, and JoKalliauer: Flagging the above for you all in case it got missed. :) Thank you. -- NKohli (WMF) (talk) 19:15, 1 November 2018 (UTC)
Wonderful news. MW's librsvg has problems with hyphenated langtags (e.g., "en-US" and "zh-Hans"), but let's assume that will be fixed soon (Gnome claims the issue is resolved). MW does automatic script conversion for some wikis: e.g., Chinese (zh-Hans / zh-Hant) and Serbian (sr-Cyrl(sr-EC) / sr-Latn(sr-EL)). Does the script conversion also select the appropriate langtag for the SVGs? For example
Glrx (talk) 19:47, 1 November 2018 (UTC)
Thanks for replying Glrx. I do not know the answer for sure. Pinging MaxSem (WMF) to provide more information on this as he worked on the implementation. -- NKohli (WMF) (talk) 21:56, 1 November 2018 (UTC)
@Glrx:, let's say it will be no more broken than it currently is for language variants specified explicitly via the lang parameter. There are definitely edge cases with variant handling that we may have to address later. Max Semenik (talk) 22:44, 8 November 2018 (UTC)
The automatic language selection is a pretty nice and useful feature! Does it take the user's interface language into account, or does it only use the wiki's language? I think giving precedence to the user's language (if different from the wiki language) would be the ideal behavior, but others may disagree since images are part of the content, rather than the interface. --Waldir (talk) 19:46, 1 November 2018 (UTC)
The current semantics are l10n to the wiki's language rather than i18n to the user's language. If MW starts serving SVG directly (rather than a PNG), then the l10n/i18n issue will need to be sorted out. If |lang= is present, it is an instruction to localize. The current change allows MW to avoid l10n instructions. Glrx (talk) 19:53, 1 November 2018 (UTC)
Glrx is correct. While loading the image in the user's language can be useful, I don't think it's easy to do it with our current infrastructure. So for now this feature only shows the image in the wiki language. -- NKohli (WMF) (talk) 21:56, 1 November 2018 (UTC)
Good news, I didn't know about the task. Please don't forget about alternative languages. If an SVG does not contain a translation for pt-br, don't forget to try to load pt-pt first before you finally load English. This is exactly how MediaWiki messages work. The same goes e.g. for cs-CZ and sk-SK. Finally, thank you as always for mentioning us; I wouldn't notice anything otherwise. --Dvorapa (talk) 20:38, 1 November 2018 (UTC)
Dvorapa Thank you for bringing this up. I believe currently we cannot do this. The image is always loaded in the wiki language and if it's not available, it is displayed in English. What you proposed will be quite useful. I will discuss this possibility with the engineers. Thanks for your feedback. -- NKohli (WMF) (talk) 21:56, 1 November 2018 (UTC)
I see. Hopefully the MediaWiki message system can offer some hints for an implementation, but unlike MediaWiki messages, these alternative-language choices would need to be resolved live. I understand that this is a huge obstacle to overcome. --Dvorapa (talk) 22:11, 1 November 2018 (UTC)
@Dvorapa:. I do not know how the current system works or if anything special is done on the pt.WP, but the system should do something reasonable if any pt-* langtags are present in the SVG. I'll presume the pt.WP works just like the en.WP: the users of pt.WP may see either pt-PT or pt-BR content: the pt.WP does not try to provide content tailored to a particular language variant. Consequently, the pt.WP should just ask whether the SVG file has systemLanguage attributes that are satisfied by the langtag pt. The answer is yes if the SVG uses the pt langtag or any pt-* langtag (such as pt-BR, pt-PT, or even pt-CV). In other words, if an SVG file uses only pt-BR (Portuguese/Brazil) langtags and no other Portuguese langtags, then the pt.WP should show the SVG's pt-BR material. If not, then something is wrong. Similarly, if an SVG file's only English langtags are en-JM (English/Jamaica), the en.WP should display Jamaican English material even if the user is located in Great Britain (en-GB) or India (en-IN). Glrx (talk) 16:59, 3 November 2018 (UTC)
Okay, but that is only part of the problem. What about cs-CZ and sk-SK? These two are almost the same language, and both nations understand each other in 100% of cases, but they do not share the same langtag (cs vs. sk). --Dvorapa (talk) 17:12, 3 November 2018 (UTC)
The code would have to follow a fallback list. That was contemplated, but I do not know if it happened. A workaround would be to make the SVG file have both cs-CZ and sk-SK translations/langtags. Glrx (talk) 21:22, 3 November 2018 (UTC)
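
As a rough illustration of the matching and fallback behaviour discussed above, here is a minimal Python sketch. It is not the MediaWiki or librsvg implementation. The function names and the `FALLBACKS` table are hypothetical, and the cs/sk and pt-br entries simply encode the examples raised in this thread.

```python
# Hypothetical fallback table (an assumption for illustration, not MediaWiki's real list).
FALLBACKS = {"cs": ["sk"], "sk": ["cs"], "pt-br": ["pt"]}


def matches(wanted, system_language):
    """True if a systemLanguage value is satisfied by the wanted langtag.

    A wanted tag "pt" is satisfied by "pt" itself or by any "pt-*" tag.
    """
    wanted = wanted.lower()
    for tag in system_language.split(","):
        tag = tag.strip().lower()
        if tag == wanted or tag.startswith(wanted + "-"):
            return True
    return False


def pick_language(wiki_lang, available, default="en"):
    """Pick the langtag to render: wiki language, then fallbacks, then default."""
    chain = [wiki_lang] + FALLBACKS.get(wiki_lang.lower(), []) + [default]
    for candidate in chain:
        for tag in available:
            if matches(candidate, tag):
                return tag
    return None  # nothing suitable; the renderer would use the untagged text


print(pick_language("pt", ["en", "pt-BR"]))
print(pick_language("cs", ["sk-SK", "en"]))
```

With a table like this, a Czech wiki would show Slovak labels before falling back to English, which is the behaviour Dvorapa asked about.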

Removing the download feature

@Waldir, Strainu, Ruthven, Dvorapa, Glrx, and JoKalliauer: Question for you all - do you think it makes sense to drop these two features on the new tool -

  • Ability to import any image using the image URL
  • Download the image with translations

Both of these features exist in the current tool but it seems like they are not used as much. If these features do not seem useful, we can remove them and give the users a better experience. Looking to hear your thoughts on this. Thanks in advance. -- NKohli (WMF) (talk) 22:55, 12 October 2018 (UTC)

Can you elaborate what exactly does "give the users a better experience" mean?
As long as there is a simpler path to get the images on commons, it makes sense that these features won't be used much by Wikimedians. However, they could be seen as a service from our community to 3rd parties, such as other wikis (wikia, private wikis) or even random users in need of an image translation tool. On the other hand, this upload-translate-download workflow should be kept separate from the commons-translate-commons one, as we don't want to end up with an easy way to "wash-up" non-free svgs (i.e. no upload-translate-commons)
I guess this is all a cost vs. opportunity decision that should be taken by the team. If implementing the features doesn't take too much engineering effort (say, based on absolutely nothing, one man-week for dev and maybe a couple of days for testing) they would be nice to have. If not, I doubt they will significantly affect the usage of the tool.
If they need to be prioritized, I would say the download is much more interesting than the upload, since some Wikimedians might try the tool, dislike it, then continue the translation offline (commons-translate-download workflow). If they can't download, they will lose their work.--Strainu (talk) 23:12, 12 October 2018 (UTC)
I think similarly: it is not really a technical question, it is more a political one:
  • If you let every image be processed, you are offering a tool for the whole WWW (and you would need more server resources if it became well known)
  • If you only let pictures from commons.wikimedia.org and .wikipedia.org be processed, then educationally useless pictures may get uploaded just to use the tool.
I think neither is very problematic, but if I have an SVG (e.g. created on my own, or downloaded from the WWW) and I want to make it multilingual, I first have to upload it and then use the tool, and if I then want to modify some things, I have to create at least three files:
  • upload before the tool
  • created by the tool
  • modification after the tool (e.g. add comments before the switch elements)
That might create a huge version history (better for understanding the copyright history, but bad because of the long list of useless files stored on the servers).
However, images with `xlink:href` to external sources should not be uploadable by the tool (as with all other upload tools, phab:T5537).
As Strainu said, it should be a decision for the WMF team and the developers.
 — Johannes Kalliauer - Talk | Contributions 06:31, 13 October 2018 (UTC)
I am just afraid that the "Ability to import any image using the image URL" will become an open door for copyright violation. So, imho, no big deal in dropping the feature. To "Download the image with translations" should be maintained because we're talking about a translation tool, and because it would be helpful for third parties, as said above. Thanks for this feedback anyways! --Ruthven (msg) 06:39, 13 October 2018 (UTC)
Agree with Ruthven and others on this issue. --Dvorapa (talk) 08:13, 13 October 2018 (UTC)
Thanks everyone. What I meant by give users a better experience is that if we simplify the tool interface, it is easier to use for new people. What I am hearing is that it's okay to remove the Ability to import any image using the image URL feature because that can open a door for potential copyright violations but Download feature seems much more useful and should be maintained. If there are no objections, I would go ahead and modify the MVP. Thank you so much for your input. This was very helpful. -- NKohli (WMF) (talk) 17:33, 16 October 2018 (UTC)

Recruiting for usability testing the prototype

Hey o/ I am the designer for the SVG Translate tool, and I've been working on a prototype based on the wireframes and the feedback we've received for it. We'll be doing a round of usability testing using these prototypes and we'd love to get your feedback on it.

We'll be using usertesting.com to get structured feedback and reactions on the prototype. We'll also be posting the prototype here for open feedback. If you're interested in participating in the usability test (it would be really helpful for us if you do), you can leave your email address using this form. We'll use it only to invite you to the usertesting.com test. --PSaxena (WMF) (talk) 08:18, 10 September 2018 (UTC)

Proposed solution - MVP phase

I've refined the Proposed solution section on the project page to identify which are the most important features that we will be working on adding in the first iteration of the tool. This lists the features that the tool should have and the gadget that should live on-wiki to make it easy for people to find this tool. I want to get feedback from everyone on what they think of this list as the requirements for the first version of this tool. @Waldir, Strainu, Ruthven, and Dvorapa: I would like to hear what you think. Thank you. -- NKohli (WMF) (talk) 02:20, 23 August 2018 (UTC)

  • The prioritization is clear from the document. However, the collaboration and login workflow are unclear to me. You mention the user can login with OAuth at any time to hold on to their translations, but they can only upload to commons after the user has finished translating all strings. This suggests that it will not be possible for another user to take over an unfinished translation. Can they at least restart the translation independently? How will races be handled? Also, what happens with the file if the user does not login at all?
    Also, you do not handle problem 7, but I believe this is intended. Is there at least a ticket for that issue?--Strainu (talk) 06:22, 23 August 2018 (UTC)
    @Strainu: My understanding is that showing the user half-completed translations is not a great idea because the user experience while seeing a half-English half-Russian labeled image is sub-optimal. In that case, it also leads to the SVG file and the tool itself being confused on what is already translated and what remains to be. Also most files seem to have a fairly low number of labels (less than 20) so completing a translation in one go should not be a problem. Do you think we should allow users to upload semi-complete translations? Another user can at any time reload existing translations and modify them.
    The current tool (svgtranslate) supports the use case where a user can supply any file (does not have to be on Commons) and translate it and download it back. I don't have data to know how popular this use case is but it's easy to support it. So if a user does not login, they can still translate the file and download it for their purposes.
    Good question about race conditions. Here are my immediate thoughts: The tool may be able to show a popup in a corner when another user is modifying the same file. Flagging this for the project designer PSaxena (WMF). -- NKohli (WMF) (talk) 16:56, 23 August 2018 (UTC)
    We definitely don't want to show half-completed translations, I was referring to the fact that the wording suggested collaboration was not possible. It's good that someone else can load an existing translation. You might want to also consider translation reviews during the design phase (even if you don't plan to implement them). AFAIK major languages with many translators actively use this feature in tw.org.--Strainu (talk) 19:23, 23 August 2018 (UTC)
    The idea is that the tool does not hold translations but rather they are always on Commons. Having a review interface where people can see which files recently got translated in the language of their choice can certainly be handy. I will add this to the non-MVP list of features. Also I realized I did not answer your question about problem #7. We are going to do an investigation (phab:T202181) on that issue and see how technically challenging it is to include it to be a part of this project. I will update the proposed solution section on the project page based on the investigation results. Thank you. -- NKohli (WMF) (talk) 22:50, 23 August 2018 (UTC)
    (conflicted) Hi NKohli (WMF). Thank you for synthesising the most important features. To me they all look important (in the MVP), but I see the following as less important:
    • Allows a user to find a file on Commons: there is already a category and a template that, in theory, point to the translatable files.
    • Allows a user to download the file with the new translation: yes, but isn't it embedded in Commons already?
    Then I have a couple of questions on the tool. Translations will be added to the same SVG file using switch syntax, fine, but how do we select the specific language on Wikipedia? We also said that "Long translations do not fit": often I have to move the text labels to make a translation fit, and I think it's essential to have a readable map. Will this be possible with the tool or should the file be edited externally? In that case, wouldn't it be better to have separate files for each language version? --Ruthven (msg) 06:32, 23 August 2018 (UTC)
    @Ruthven: The find feature will be an input box for the user to directly input a file name or URL. This is because the user may want to translate a file that is not on Commons yet. This is a feature the current tool (svgtranslate) supports and we want to continue supporting it to see if people find it useful. The same reason why Download feature is included.
    To the question on how to select the specific language, Commons already offers a dropdown under the image called Render this image in which allows you to pick from the translations the file contains. See File:First_Ionization_Energy.svg for example.
    For long translations, the MVP version of the tool will allow you to Preview a file to see what it looks like with the translations. They can use shorter words if the translation does not fit. In the next version of the tool, we want to allow for changing the positioning of the labels and modify the font size so the user can fit the translations better. Once that feature is included, do you think it is okay to have translations in the same file instead of creating separate language versions? A user can definitely create their own language version by downloading the file and uploading it themselves if they don't want to upload a new version of the existing file. -- NKohli (WMF) (talk) 18:26, 23 August 2018 (UTC)
  • I think the MVP is sufficiently detailed and describes functionality that would satisfy the primary use cases for the tool. I could point out some minor details that could be clearer, such as disambiguating when you're referring to in-progress translation drafts under a user's tool-specific account, or finalized translations already present in Commons. Another minor point of confusion for me was what you meant with "and leads user to commons where they can input the description/file changes", as that didn't make clear how much guidance the user would have regarding what such additional changes should be and how to perform them (this could be important for beginner users). But I'm sure more detailed mockups/storyboards will flesh out these details.
    My main suggestion would be to, in the entry "Allows a user to preview the file", consider the option to have a near-instant, auto-updated preview, rather than a manually updated one. Perhaps this could be achieved by re-rendering only the text content of the file, atop a static raster rendering of the underlying image as a background. Such a double-layered approach sounds to me like it would enable live updates without much performance cost.
    Waldir (talk) 16:22, 23 August 2018 (UTC)
    Good points Waldir. I am going to post some mocks soon and that should clarify some things. About the auto-preview, it is something that I will be talking to the engineers about. I totally agree that it will be useful to have that. I want the tool to get into a usable state as soon as possible so we can add nice-to-have features after the MVP is built. If it's fairly easy to do auto-previews compared to on-button-click previews, I would favor doing auto-previews. -- NKohli (WMF) (talk) 18:32, 23 August 2018 (UTC)
I read the Proposed solution and User workflows sections and it seems almost ok to me. To the user workflows: There will be some images partially translated. Perhaps there should be some progress indicator/bar, how many % of the file is already translated for each file? This could be useful as a motivation to translate. To the last user workflow: We know there already are some files and their translated duplicates. I would imagine a part of the tool which:
  • allows the user to tick files in the same categories that are dupes in a different language
  • allows the user to input the filenames of files that are dupes in a different language
    • suggests some files the tool thinks could be dupes, but leave the decision to the user
  • after all dupes are successfully found and selected, it extracts the text from them and try to estimate the language (from filename, desc, extracted text), possibly ask the user to select one
  • matches extracted text to the fields
  • allows the user to check/fix/rematch extracted text
  • allows the user to exclude previously selected file (if the output of the tool is not correct or if the user doesn't dare to translate e.g. non-latin scripts)
  • allows the user to change dupes to redirs/mark dupes for deletion (?) after the newly translated SVG is saved
    • this should be allowed only if the SVG is 100 % translated to that language
    • this should be possible even if the dupe was previously excluded (the file can be dupe, but the tool can fail to extract the text from it/the file can be corrupted)
    • this should be possible also for PNG (etc.) files (sometimes the SVG file has a local variant in PNG format, the tool should help to merge them too even if it can not extract any text from them)
I also think this would be better not as a part of the tool directly, but as a separate gadget/tool/script in a post-MVP phase. --Dvorapa (talk) 15:23, 26 August 2018 (UTC)
We are planning on adding an indicator for how many messages are untranslated. Like you said, it can serve as a good motivation factor. The outline you describe for the last user workflow is interesting. I agree that it should be an extension of this tool or a separate gadget/tool. -- NKohli (WMF) (talk) 22:49, 28 August 2018 (UTC)
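
For illustration only, the progress indicator mentioned above could be computed roughly as in the following Python sketch. It assumes that every translatable label is wrapped in an SVG switch element whose children carry systemLanguage attributes; the function name and the sample file are hypothetical and not taken from the actual tool.

```python
import xml.etree.ElementTree as ET

SVG = "{http://www.w3.org/2000/svg}"


def translation_progress(svg_source, lang):
    """Percentage of switch elements that already have a child for `lang`."""
    root = ET.fromstring(svg_source)
    switches = list(root.iter(SVG + "switch"))
    if not switches:
        return 0.0
    done = 0
    for switch in switches:
        for child in switch:
            # systemLanguage may hold a comma-separated list of langtags.
            value = child.get("systemLanguage", "")
            tags = [t.strip().lower() for t in value.split(",")]
            if lang.lower() in tags:
                done += 1
                break
    return 100.0 * done / len(switches)


SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg">
  <switch>
    <text systemLanguage="de">Ozean</text>
    <text>Ocean</text>
  </switch>
  <switch>
    <text>Atlantic</text>
  </switch>
</svg>"""

print(translation_progress(SAMPLE, "de"))  # one of the two labels has a German variant
```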

Wireframes

@Strainu, Waldir, and Ruthven: I have added some mockups to the project page. They represent the basic functionality in the tool which is outlined in the proposed MVP solution. Would love to get your thoughts on them. Thanks. -- NKohli (WMF) (talk) 21:59, 23 August 2018 (UTC)
Also pinging JoKalliauer as the proposer of the project. -- NKohli (WMF) (talk) 22:33, 23 August 2018 (UTC)

@NKohli (WMF): Thanks for pinging. I have been following/reading every edit here since February, and have been subscribed to phab:T201207 for two weeks. I am interested in simple, cheap support for the "dumbest assumable user", such as ‘SVG Translate’, so that new users do not have to ask the graphics labs and are able to translate files on their own (I originally thought only about fixing the upload of phab:T164275). I prefer switch tags (TranslateSVG extension). Since Glrx (and others) have more knowledge and experience about it (see e.g. Grants:Project/Glrx/SVG_i18n (the luxury version)), I will only contribute if I can contribute something useful. As said before, a simple solution would be enough for me, but I'm also happy about more sophisticated versions.  — Johannes Kalliauer - Talk | Contributions 15:22, 24 August 2018 (UTC)
@JoKalliauer: Thank you. :) -- NKohli (WMF) (talk) 17:46, 24 August 2018 (UTC)
@JoKalliauer: has a lot to offer; don't let him undersell himself. Glrx (talk) 21:48, 24 August 2018 (UTC)
Haha. JoKalliauer, you heard Glrx! You're not to undersell yourself. :) I'm sure there are so many things we haven't thought of yet and someone with your experience will be able to guide us in the right direction. Please do chime in when you have something to add to these discussions and on Phabricator too. Even things like "that sounds fine" is very helpful in validating if what we're thinking/proposing is sensible from a user point of view. I appreciate it. Thanks. -- NKohli (WMF) (talk) 22:33, 24 August 2018 (UTC)
@NKohli (WMF): I added some features I would like to have in Talk:Community_Tech/SVG_translation#whished_additional_features  — Johannes Kalliauer - Talk | Contributions 20:49, 30 September 2018 (UTC)
@JoKalliauer: thanks for the detailed list! We are about to begin development on the MVP features for the tool and I will keep the project page up to date as we make progress on it. Thank you. -- NKohli (WMF) (talk) 22:20, 1 October 2018 (UTC)

Glrx I'd also like to solicit your input on the proposed MVP and the wireframes. Thank you. -- NKohli (WMF) (talk) 17:46, 24 August 2018 (UTC)

Readjusting the user interface page isn't a big deal; it can be done after many users comment. In many ways, it depends on user preferences. You might consider scrolling the source/target underneath the picture. That allows a larger picture and can keep the text being translated close to the picture. I might do source/target as two lines rather than side by side. Labels might be single words, but they also may be longer such as "low pass filter". If you are going to display possible translations, there needs to be a lot of space for them.
A server-side application can discover suitable translation languages from the Accept-Language header. A client-side application can discover the same information from the browser.
Each translation unit should have a check box to say do-not-translate. Those TUs can be hidden from the user unless she clicks show all TUs. The checkbox might be disabled and the TU grayed if the tool recognizes it is too complicated to translate (e.g., tspan with baseline shifts or font changes). The DNT can be stored in the SVG as its:translate="no" or translate="no".
Glrx (talk) 21:48, 24 August 2018 (UTC)
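
To make the server-side suggestion above concrete, here is a minimal Python sketch of reading candidate languages from an Accept-Language header value. The function name is hypothetical and the parsing is deliberately simplified (it only understands the q parameter), so treat it as an illustration rather than a complete implementation of the header grammar.

```python
def parse_accept_language(header):
    """Return the langtags of an Accept-Language value, highest q first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        tag, _, params = part.strip().partition(";")
        tag = tag.strip()
        params = params.strip()
        quality = 1.0  # a missing q parameter means q=1
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # unparsable weight: treat the entry as not acceptable
        if tag and quality > 0:
            # Sort by descending quality, keeping header order for ties.
            entries.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(entries)]


print(parse_accept_language("pa, hi;q=0.8, en;q=0.5"))
```

A tool could offer the first entries of this list as the default target and comparison languages, as in the Punjabi/Hindi example mentioned earlier on this page.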
Regarding the scrolling the source/target labels under the image - that is a good point. Same for language selection. We've been playing with ideas for what's the best way to handle that. I will talk with the project designer about this. About the do-not-translate tag - will this be on a per-language basis or for all languages? Can you give me a couple examples of when a user may not want to translate a label? -- NKohli (WMF) (talk) 22:38, 24 August 2018 (UTC)
See w:International Tag Set.
The its:translate="no" attribute applies to all languages and is inherited in the document tree. It says the text in this portion of the tree does not need to be translated (unless countermanded further down).
An illustration teaching the English language is an example. It might have an English phrase "Good morning" and offer the meaning in another text element that would be translated into de, fr, it, pt, .... The English would never be translated, but its meaning would be.
If an illustration quotes Caesar saying veni, vidi, vici in Latin, then the phrase would not be translated. The same illustration may have labels for Italy and Rubicon that might be translated. The SVG would have something like
<text its:translate="no">veni, vidi, vici</text>
An illustration of the cipher in Poe's "The Gold Bug" would not be translated; neither would its 1-for-1 plaintext.
A diagram that showed letters on the 26 plug connections, 26 keyboard keys, and 26 output lamps of an Enigma machine would probably not be translated. That avoids 78 lines of translation, but still allows words such as "lamp", "keyboard", "plugboard" and "rotor" to be translated. Some labels are quotations about the machine, while other labels are descriptions appropriate for other languages.
For many languages, formulas such as F=ma or H2SO4 + Cu → H2 + CuSO4 do not need to be translated. The element symbols in chemical structure diagrams (e.g., N, H, S) probably do not need to be translated but other terms in the diagram such as "transcription", "DNA", "base pair", or "peptide synthesis" would be.
For the purposes here, I see it as a quick way to mark particular text as do not bother to translate.
Glrx (talk) 04:25, 25 August 2018 (UTC)
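The inheritance rule described above (a value set on an element applies to its subtree unless overridden further down) can be sketched in Python with the standard library. This is only an illustration under stated assumptions: the sample SVG and the function name translatable_texts are made up, and the text of a whole text element is treated as one unit.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ITS_NS = "http://www.w3.org/2005/11/its"

# Made-up sample: one Latin quotation, one group of formulas, one place name.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:its="http://www.w3.org/2005/11/its">
  <text its:translate="no">veni, vidi, vici</text>
  <g its:translate="no">
    <text>H2SO4</text>
    <text its:translate="yes">sulfuric acid</text>
  </g>
  <text>Rubicon</text>
</svg>"""


def translatable_texts(element, inherited=True):
    """Collect the text elements that are not marked do-not-translate."""
    # Accept either spelling of the attribute; no attribute means "inherit".
    flag = element.get("{%s}translate" % ITS_NS)
    if flag is None:
        flag = element.get("translate")
    translate = inherited if flag is None else (flag == "yes")

    found = []
    if element.tag == "{%s}text" % SVG_NS and translate:
        found.append("".join(element.itertext()).strip())
    for child in element:
        found.extend(translatable_texts(child, translate))
    return found


labels = translatable_texts(ET.fromstring(SAMPLE))
```

Here only "sulfuric acid" and "Rubicon" would be offered for translation; the quotation and the formula stay hidden unless the user asks to see all text.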
Thanks for the detailed explanation Glrx. I have a couple thoughts on this:
  • What if a user comes in and marks a label as non-translatable but it's incorrect? Since it applies to all languages, we'd need to come up with a way to ensure that even if user A marks something as non-translatable by accident, user B can come in and easily access those labels and fix the checkbox. There might be a conflict where user A and user B don't agree on whether a label chunk is translatable or not.
  • The its:translate="no" attribute is not supported by SVG 1.1, correct? So for now the only use of this will be made by the tool, right? I wonder if it can throw off some image editors which don't understand what's going on. This needs more investigation.
Do you have an example image on Commons in mind that I can use to describe the intended behavior better with the rest of the team? I tried looking for one but didn't see any. -- NKohli (WMF) (talk) 23:05, 28 August 2018 (UTC)
NKohli (WMF)
Think about how an ordinary user would use the tool to translate a diagram. The tool would extract all the text elements from the diagram and present them in a form for translation. The user would see "123", "veni, vidi, vici" or "H2SO4", realize they should not be translated, and click a checkbox that says do-not-translate. That text can now be hidden from view.
If the user wants to check all the text that has been marked do-not-translate, she can ask the tool to display ALL the text (along with their do-not-translate checkboxes). She can then uncheck the box. (There's a feature here, too. The do-not-translate check can be used to exploit switch default processing. Latin (*-Latn) languages do not need to translate quantities such as "14 kg", but Cyrillic (*-Cyrl) languages may want to use "14 кг" and Hebrew may want to use "14 ק"ג".)
its:translate is not supported by SVG in the same way that translate is not "supported" in SVG 2.0 even though it exists in the SVG 2.0 specification. The same is true of anything in SVG 1.0/1.1/2.0's metadata element such as the widely used RDF. The XML elements and attributes are there, but they have no influence on rendering the SVG. Both Adobe Illustrator and Inkscape add XML that is not in the SVG specification. User agents such as browsers and librsvg will routinely ignore the extra detail and paint the image. The issue as far as translation goes is whether such foreign extensions get stripped by graphic editors such as Inkscape, Adobe Illustrator or Corel Draw. It's likely that only Inkscape will preserve the annotations -- but the other editors will probably toss the translations, too, so losing the its:translate attribute is a minor worry.
I don't know what you mean by an illustration showing intended behavior. File:Galvanic cell with no cation flow.png is a PNG, but it shows chemical formulas that need not be translated along with phrases such as "zinc anode", "copper cathode", "porous disk", and "anion flow".
Glrx (talk) 00:26, 29 August 2018 (UTC)
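The switch default processing mentioned above can be illustrated with a small Python sketch. It assumes a simplified reading of the SVG 1.1 systemLanguage rule (a match is an exact tag, or a listed tag that starts with the user's language followed by "-"); real renderers such as librsvg may behave differently, and the sample SVG and function names are made up.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Made-up sample: Cyrillic-script languages get their own unit label,
# everything else falls through to the untranslated default.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg">
  <switch>
    <text systemLanguage="ru,uk">14 кг</text>
    <text>14 kg</text>
  </switch>
</svg>"""


def matches(system_language, user_language):
    """Simplified systemLanguage test against one user language."""
    user = user_language.lower()
    for tag in system_language.split(","):
        tag = tag.strip().lower()
        if tag == user or tag.startswith(user + "-"):
            return True
    return False


def pick_switch_child(switch, user_language):
    """Return the first child that matches, or the first one without a test."""
    for child in switch:
        system_language = child.get("systemLanguage")
        if system_language is None or matches(system_language, user_language):
            return child
    return None


switch = ET.fromstring(SAMPLE).find("{%s}switch" % SVG_NS)
```

With this layout, a label marked do-not-translate only needs the default child, while languages that do want a different rendering add their own child before it.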
Thanks for the explanation, Glrx. I have added that to the list of non-MVP features. -- NKohli (WMF) (talk) 17:54, 29 August 2018 (UTC)
File:Map Tenerife Disaster EN.svg is an example where taxiway numbers and names "KLM" and "PanAm" probably do not need translation. Glrx (talk) 18:19, 29 August 2018 (UTC)
Great, that helps a lot. Thank you! -- NKohli (WMF) (talk) 18:30, 29 August 2018 (UTC)

Very nice mockups, thanks NKohli (WMF)! Besides what was said above, we could also propose a default file name (as is generally done on Commons for translations). It should be Filename - lang.ext, where lang is the language code. Ex: Earth and moon - en.svg. Ruthven (msg) 18:13, 30 August 2018 (UTC)

Thanks for the feedback, Ruthven. Since we are planning to use switch-translations to add translations to the same file, suggesting file names would not be required as it uses the same file name. We're including automatic translation previews so users can see what the translated file would look like and adjust the labels, if needed. -- NKohli (WMF) (talk) 21:07, 30 August 2018 (UTC)

Open questions[edit]

Hello all. I'm Niharika and I'm working as the product manager for this project. I'm in the initial phase of my research and am digging into the user workflows around translating SVGs on Commons. Doing this will help us start creating rough designs, which I will share to gather feedback. After several discussions, we have decided to build a new tool for this project which we will be maintaining. We are looking forward to collaborating with other developers who are interested in contributing to the project. If you'd like to get involved, you can join us in the discussions on phabricator (phab:T201207).
I have a few open questions that I am looking for answers to, which will enable me to better understand the requirements of this project:
Thank you. -- NKohli (WMF) (talk) 22:37, 15 August 2018 (UTC)

Q: User workflows[edit]

I have documented the two workflows I have discovered that exist on the project page. Are there any other workflows that I have missed listing? For example, do translators commonly want to find files they have translated in the past to add or edit translations? -- NKohli (WMF) (talk) 20:01, 16 August 2018 (UTC)

  • I think the proposed workflow for the first use case ("This user would like to input the image URL in the tool to translate it or have the tool auto-complete the image name when they start typing.") is not sufficient. The tool should be amenable for integration via a link on the sidebar of File: namespace pages on Wikimedia wikis, so that one could simply click a "translate this image" link on the sidebar and be taken to the translation interface, with no typing necessary. For a user interested in an image in particular, I'd say this workflow would be even more convenient than having to visit the tool page and inputting the image name manually. Having both options would be great, but I'd prioritize the wiki-integrated one. --Waldir (talk) 08:19, 17 August 2018 (UTC)
Thanks for pointing it out. That is an important point and I have clarified the workflow to indicate that. What I am aiming to capture in that section is how many different ways that tool can be used. -- NKohli (WMF) (talk) 17:09, 17 August 2018 (UTC)
  • I believe there's a significant usage pattern that is missing from that list: maintenance-focused editors who may not be interested in particular languages, but would like to convert existing sets of translated images into a single image with a switch statement. I would place myself in this group. Is this something the tool could facilitate? --Waldir (talk) 08:22, 17 August 2018 (UTC)
That's an interesting point. I can think of a few potential ideas, but I have no idea if they are technically feasible, so they're purely hypothetical until we talk to the engineers:
1. The user gives the tool the files they need to combine (maybe the tool automatically fetches the linked "Other versions" files).
2. For each file, the tool opens up the translations from all the files along with image thumbnails so the user can see which string belongs where.
3. The tool opens the SVG code for the file the user provided initially (in a side panel) and allows the user to add the switch statements. I'm not sure if the tool can automatically do this but I will pose that as a question to the engineers.
Does this sound like it'll be useful, Waldir? I also have a few questions for you -
* Do separate images get deleted once they are combined? What about the pages where those images were transcluded?
* How do editors currently find the images that need combining and how do they combine them?
* Can we safely assume that everyone wants to move towards single-file versions of images or are there still users who want to have separate files for translations?
Thank you. -- NKohli (WMF) (talk) 17:09, 17 August 2018 (UTC)
I'm not sure whether the three options you mentioned were meant as mutually exclusive alternatives, or as potentially composable steps. Generally, the more automated and visual the workflow could be, the better.
As for your questions:
  • I think it should be safe to delete the multiple images as long as they're identical copies with only the text being different. Of course, exceptions may exist, but in such cases the community is capable of handling them. In any case, I don't think it needs to be within the scope of this tool to provide a way to delete the copies once they're merged. It would be a nice addition for convenience, but certainly as an extra, not a fundamental feature.
  • I'm not aware of any reusable workflow to locate such sets of images and do that sort of combining. Perhaps others will be more informed, but I don't think it's common practice, precisely because it's quite cumbersome to do manually.
  • I can't speak for the rest of the community -- as a WikiGnome, I tend to work mostly behind the scenes doing small, uncontroversial edits, and therefore don't coordinate much with other editors. From my POV, merging translated SVGs into a single image is a reasonable position, but others may have insightful comments that haven't occurred to me.
Hope this helps! --Waldir (talk) 16:13, 18 August 2018 (UTC)
Thanks, Waldir. I intended the steps to be part of the same workflow. We will consider adding functionality for allowing users to do this type of maintenance, depending on technical feasibility and how quickly we can get the basic tool up and running. I have added this use case to the user workflows section. -- NKohli (WMF) (talk) 18:56, 20 August 2018 (UTC)
Remerging split files should be a separate project; the tasks are different. I've added a lot of {{Other versions}} templates to files to get a notion of what is going on, but it raises some thorny issues. An editor may upload a CC0 file, and then another editor may translate but upload the translation as a CC-BY-SA file. For the files to be remerged requires the new rights to be either respected or challenged. I would encourage the tool to require translation contributions be CC0.
A typical course is to leave old files around to preserve their edit histories. The files can be marked as superseded by other files.
It is not clear that single file versions are "safe". We should find out what happens to them in Inkscape, Illustrator, CorelDraw, and any other major graphics editors. @JoKalliauer: can probably tell us what happens with Inkscape. We don't want to turn files into objects that only wizards can edit.
Glrx (talk) 20:52, 24 August 2018 (UTC)

Q: Basic proposed solution[edit]

Does the proposed solution section on the project page look accurate as far as basic functionality for the tool goes? Are there any important pieces missing? -- NKohli (WMF) (talk) 20:01, 16 August 2018 (UTC)

  • It would be good to consider integration with existing tools such as CX or the VisualEditor. Also, the section does not state that it will be a ToolForge project, you need to go in other sections to find that out.--Strainu (talk) 22:04, 16 August 2018 (UTC)
Translatewiki suggestion feature.png
I was not sure if everyone would know what ToolForge is but I have updated the top of the project page to reflect that. We are planning on showing translation suggestions from TranslateWiki (see screenshot). I'm not sure if ContentTranslation offers APIs for integration but we'll check. Thanks! -- NKohli (WMF) (talk) 22:20, 16 August 2018 (UTC)
How about reverse integration? E.g. click on a svg in CX and be sent to the tool or something like that.--Strainu (talk) 11:43, 17 August 2018 (UTC)
That's an interesting idea. The reason it will be tricky to do is that the code has to figure out that the SVG file has text labels and is translatable. This is not natively handled by MediaWiki or CX and will have to be added to it, which is a great deal of work. The most straightforward way would be to link to the tool from the Commons image file page. This would work via a gadget (perhaps default) that lives on Commons and produces a link to the tool. The gadget can also potentially be used on other wikis to add direct translation links to SVG files so they don't have to go through Commons to access the tool. -- NKohli (WMF) (talk) 17:17, 17 August 2018 (UTC)
  • I am not sure. It states some basic goals, but I don't get a good sense for computation or mechanism. Statements such as
  • tspan tag labels in a text tag are merged and presented to the user as one translation unit.
raise questions. tspan is used for line breaking, but it is also used for font changes, subscripts, and color changes. Just grabbing text.textContent can be the wrong thing.
I'm hearing ToolForge and gadget, but I'm not getting a notion of server and/or client processing.
Glrx (talk) 21:13, 24 August 2018 (UTC)
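To make the tspan concern above concrete, here is a minimal Python sketch of one possible heuristic. It assumes (this is not something the project has decided) that a tspan carrying only positioning attributes is a line break and that any other attribute signals styling that should not be flattened; the names POSITION_ONLY, is_simple_label and merged_label are hypothetical.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Assumption: tspans with only these attributes are used for line breaking.
POSITION_ONLY = {"x", "y", "dx", "dy"}


def is_simple_label(text_element):
    """True if every tspan in the element looks like a plain line break."""
    for tspan in text_element.iter("{%s}tspan" % SVG_NS):
        if set(tspan.attrib) - POSITION_ONLY:
            return False  # e.g. baseline-shift, font or fill changes
    return True


def merged_label(text_element):
    """Join the pieces of a simple label into one translation unit."""
    pieces = (piece.strip() for piece in text_element.itertext())
    return " ".join(piece for piece in pieces if piece)


line_broken = ET.fromstring(
    '<text xmlns="http://www.w3.org/2000/svg">'
    '<tspan x="0" dy="1em">low pass</tspan>'
    '<tspan x="0" dy="1em">filter</tspan></text>'
)
formula = ET.fromstring(
    '<text xmlns="http://www.w3.org/2000/svg">'
    'H<tspan baseline-shift="sub">2</tspan>O</text>'
)
```

The first label merges cleanly into "low pass filter", while the second is flagged as not simple, which is the case where blindly taking the text content would lose the subscript.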
@Glrx: good question. The proposed solution represents the basic functionality of the tool from a user's perspective. It does not yet say on a granular level how everything will function as the team is still working out the technical details. Specifically about line-breaking, I have a ticket (task T202771) for the engineers to compare different solutions and decide on the best way to implement this. As you know a lot more about how SVGs work than I do, your input on that ticket will be invaluable. -- NKohli (WMF) (talk) 21:43, 24 August 2018 (UTC)

Q: Frustrations with existing solutions[edit]

I have outlined several problems that were brought up on phabricator (phab:T201207) in the Problem statement section on the project page. If you have used the svgtranslate tool in the past, what are your frustrations with it? What other solutions do you use and what problems do you run into? -- NKohli (WMF) (talk) 20:01, 16 August 2018 (UTC)

  • No way to easily find translated files (with different name) even if they exist.--Strainu (talk) 22:07, 16 August 2018 (UTC)
That's an interesting problem. Commons does have a feature to tell you if it finds a similar file when a user tries to upload a duplicate file. It should be possible to use the same code to allow users to discover same files with different names. I added this to the separate file versions problem section. -- NKohli (WMF) (talk) 22:26, 16 August 2018 (UTC)
I believe the feature you refer to uses the SHA1 checksum of the image, in which case it's useless if the contents change (like they do for translations). I might be wrong though.--Strainu (talk) 11:26, 17 August 2018 (UTC)
SVGTranslate relied on DerivativeFX to make a backpointer link on the Commons page. SVGTranslate did not enforce a convention that translating File:My Pretty Picture.svg to Klingon should upload the translated file to File:My Pretty Picture - tlh.svg. That would make it easier to find files. I look for similar names and all linked categories but still do not find all the descendants.
The backpointer should also have been added as CC RDF to the SVG's metadata. Yes, changing the content changes the SHA1 digest, but the tuples should stick around. Glrx (talk) 22:14, 24 August 2018 (UTC)
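Two points from this thread can be shown in a few lines of Python: a checksum comparison cannot link a translation to its source, and a naming convention could. This is a sketch under assumptions; the sample SVG bytes and the helper translated_title are made up, and the " - lang" pattern is the convention suggested in this discussion, not an enforced rule.

```python
import hashlib

# Made-up file contents: the translation differs only in the label text.
original = b'<svg xmlns="http://www.w3.org/2000/svg"><text>Earth</text></svg>'
translated = original.replace(b"Earth", b"Terra")


def sha1_hex(data):
    """Hex SHA-1 digest of the raw file bytes."""
    return hashlib.sha1(data).hexdigest()


def translated_title(title, lang):
    """Insert ' - <lang>' before the file extension of a page title."""
    stem, dot, extension = title.rpartition(".")
    if not dot:
        return "%s - %s" % (title, lang)
    return "%s - %s.%s" % (stem, lang, extension)


same_digest = sha1_hex(original) == sha1_hex(translated)
klingon_title = translated_title("File:My Pretty Picture.svg", "tlh")
```

Any change to the labels gives a different digest, so duplicate detection by checksum will not find translated copies, whereas a predictable title can be searched for directly.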

whished additional features[edit]

Sorry if some of those features were already wished for somewhere else, or if I added them in the wrong place (you can edit/delete/add points here, or move my edit); some of them are probably just "pie in the sky".  — Johannes Kalliauer - Talk | Contributions 21:01, 16 September 2018 (UTC)

Q: Ways to make the tool more discoverable[edit]

External tools are hard to discover for people who are on wiki. How can we give these tools more visibility? -- NKohli (WMF) (talk) 20:01, 16 August 2018 (UTC)

I would (as a user) expect a link to the tool from a Commons file page, local image description page, also probably from local image for translation templates. --Dvorapa (talk) 02:46, 17 August 2018 (UTC)
Just under the SVG image, activable from Preferences menu. Exactly like the "Rotation tool". --Ruthven (msg) 05:59, 17 August 2018 (UTC)
@Dvorapa and Ruthven: Adding a gadget on Commons (and other wikis) that links to the tool from the image page should work. For local templates, it could be a link that the community agrees to add. Do you think the links should only appear on SVG images in the Translation possible - SVG category or on all SVG files? How up to date is that category? -- NKohli (WMF) (talk) 03:23, 18 August 2018 (UTC)
@NKohli (WMF): Logically it should appear on SVG images in the Translation possible - SVG category only. It wouldn't be very useful for a beautiful SVG coat-of-arms for instance, where there is nothing to translate. --Ruthven (msg) 06:15, 18 August 2018 (UTC)
Ruthven, thanks. To clarify, my question was whether that category is kept sufficiently up to date. It sounds like that is indeed the case. -- NKohli (WMF) (talk) 18:59, 20 August 2018 (UTC)
@NKohli (WMF): Well, the files that aren't tagged as translatable are difficult to discover. We will surely have a lot of SVG files that can be translated but are not marked as such. However, besides manually checking the uploads, there isn't much to do, unless we automatically check whether there is embedded text in SVG files. --Ruthven (msg) 21:46, 20 August 2018 (UTC)
@Ruthven: Understood, thanks. I think that this is something which a bot can probably do and maybe someone from the community will volunteer to do that. -- NKohli (WMF) (talk) 22:12, 20 August 2018 (UTC)
We could possibly ask bot operators for help. --Dvorapa (talk) 10:22, 24 August 2018 (UTC)
Image.php displays the "render this image in" dropdown box if the SVG has switch translations. A nearby button could offer to edit or add to the file's translations. Glrx (talk) 21:18, 24 August 2018 (UTC)
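The automatic check for embedded text discussed above could be as simple as the following Python sketch, for instance inside a bot. It assumes the bot already has the SVG source as a string; has_embedded_text is a hypothetical name, and text that was converted to paths would not be detected.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def has_embedded_text(svg_source):
    """True if the SVG contains at least one non-empty text element."""
    try:
        root = ET.fromstring(svg_source)
    except ET.ParseError:
        return False  # not well-formed XML, nothing to offer for translation
    for text in root.iter("{%s}text" % SVG_NS):
        if "".join(text.itertext()).strip():
            return True
    return False


labelled_map = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    "<text>ATLANTIC OCEAN</text></svg>"
)
coat_of_arms = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<path d="M0 0L10 10"/></svg>'
)
```

Files for which the check succeeds could then be tagged or shown the translate link, leaving images such as a coat of arms untouched.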

Translate and wikilink through Wikidata: use QID[edit]

Take File:Simple Periodic Table Chart-en.svg as an example. It has the English words "group" and "period". The plan is to add, for each chemical element cell, the element's name (title) in en and a wikilink to the en:wiki article. As we can see, the "en" qualifier is added to the filename (title). There is also a derivative (code fork), File:Periodic Table Chart-sr.png, for srwiki (Serbian, in Cyrillic script BTW). Of course in SVG, we can add multiple languages in one file, and set the lang switch in the respective wikis.

I'd suggest adding this option: optional QIDs (for example, QID=Q925 for mercury, Mercury (element)). The cell "Hg" then has, for lang=en (!): title=mercury, link to = en:mercury (element) (note: chemical symbols like "Hg" are never translated). -DePiep (talk) 10:11, 18 December 2017 (UTC)

Okay, thanks for the suggestion! -- DannyH (WMF) (talk) 22:03, 18 December 2017 (UTC)
First, the indicated Chart has converted its text to paths, so it is a huge file (421 kB) rather than its initial small size (21 kB). The text should not use paths. We should encourage MW to directly serve small SVGs. (Switch translation works for a few strings in a few languages; for many strings in many languages, it can make the file large; in such situations, XSLT can be used to localize the file just as librsvg localizes SVG to PNG.) There should also be more space available for the "Group" string.
Second, I'd borrow the ITS / SVG 2.0 translate="no" attribute to mark most strings as "do not translate". That would leave only "group" and "period" for translation. (W3C validation will complain about the SVG 2.0 attribute.) There's no reason for tools such as SVG Translate to ask the user to translate "Hg" to "Hg"; tools should obey a do-not-translate instruction. (I'm not sure that is an absolute yet, but many languages would not translate them.)
Third, I'd mark the relevant strings with the attribute data-wd-q="QID". (W3C validation will complain.) "Group" would get group (Q83306) and "period" would get period (Q101843). The QIDs have names and aliases in dozens of languages that may be translations. Period in Korean is 주기율표 주기 . The elements would have their respective QIDs. (I'd also add the lanthanides and actinides.)
Fourth, linking won't work for the served PNG; the user would have to click through to the SVG. Instead of linking to the en.Wikipedia article, I'd link to either the Wikidata entry (the user can choose the wiki article) or go to an endpoint that will look at the user's ACCEPT-LANGUAGE request header[1] and redirect to the wiki article in the user's preferred language.
Glrx (talk) 22:45, 8 February 2018 (UTC)
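The second and third points above can be combined in a short Python sketch that gathers Wikidata hints only for strings that still need translation. The data-wd-q attribute is the one proposed in this comment; the sample SVG and the function name wikidata_hints are made up for illustration, and no Wikidata lookup is performed.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Made-up sample using the QIDs mentioned in this thread.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg">
  <text data-wd-q="Q83306">Group</text>
  <text data-wd-q="Q101843">Period</text>
  <text translate="no" data-wd-q="Q925">Hg</text>
</svg>"""


def wikidata_hints(root):
    """Map each translatable label to the QID that may supply translations."""
    hints = {}
    for text in root.iter("{%s}text" % SVG_NS):
        qid = text.get("data-wd-q")
        # Labels marked do-not-translate keep their QID for linking only.
        if qid and text.get("translate") != "no":
            hints["".join(text.itertext()).strip()] = qid
    return hints


hints = wikidata_hints(ET.fromstring(SAMPLE))
```

A tool could then look up the labels and aliases of each QID in the target language and offer them as suggestions, while "Hg" is never presented for translation.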

Project proposal[edit]

Please see Grants:Project/Glrx/SVG i18n. Its intention is to use SVG's switch element and Wikidata for translation hints. Glrx (talk) 21:43, 8 February 2018 (UTC)

Oh, what an in-depth project page, thank you for sharing! We'll certainly look into this as part of our technical investigation in the next few weeks. (phab:T184310) — Trevor Bolliger, WMF Product Manager 🗨 22:28, 8 February 2018 (UTC)

JavaScript required?[edit]

Didn't work for me at all. When I pointed it to an image (I first had to manually hunt on Commons), it just said 404 to me. 0/5 would not translate again. --2001:14BA:804D:9F00:0:0:0:5D9 19:52, 26 October 2018 (UTC)

Hi. Which tool did you use? As I mentioned on the project and talk page, we don't have a tool currently. We only have a prototype, for illustration/design purposes. It is not supposed to work. -- NKohli (WMF) (talk) 21:10, 13 November 2018 (UTC)