Abstract Wikipedia/Ideas
Structured Comments, Attributes and Decorators
Structured comments, attributes and decorators could be utilized by Wikifunctions to provide features such as: function metadata, facilitating searching for functions, and the automatic generation of documentation. Structured comments are programming language comments which utilize syntactic patterns or XML so that the contents of the comments can be mechanically processed.
- https://docs.microsoft.com/en-us/dotnet/csharp/codedoc
- https://docs.microsoft.com/en-us/visualstudio/ide/xml-documentation-comments-javascript?view=vs-2015
Attributes can be attached to functions and their parameters in programming languages such as C# and Java (where they are called annotations).
Decorators are a proposed JavaScript language feature.
- https://github.com/tc39/proposal-decorators
- https://github.com/tc39/proposal-decorators#metadata
- https://tc39.es/proposal-decorators/
— AdamSobieski (Discussion) 22:15, 15 July 2020 (UTC)
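As a rough illustration of the idea, here is a minimal Python sketch in which a decorator attaches metadata to a function so that it can later be searched and documented mechanically. The names `metadata`, `REGISTRY`, `search` and `document` are hypothetical and are not part of Wikifunctions; the metadata fields are likewise only assumptions.

```python
# Hypothetical in-memory registry of decorated functions.
REGISTRY = {}

def metadata(**fields):
    """Attach descriptive metadata to a function and register it by name."""
    def wrap(func):
        func.metadata = dict(fields)
        REGISTRY[func.__name__] = func
        return func
    return wrap

@metadata(summary="Join two strings with a space.", tags=["string", "text"])
def join_words(left, right):
    return f"{left} {right}"

def search(tag):
    """Return the names of registered functions carrying the given tag."""
    return sorted(
        name for name, func in REGISTRY.items()
        if tag in func.metadata.get("tags", [])
    )

def document(name):
    """Generate a one-line documentation entry from the attached metadata."""
    func = REGISTRY[name]
    return f"{name}: {func.metadata['summary']}"
```

Calling `search("string")` would list `join_words`, and `document("join_words")` would build its documentation line from the same metadata, so one annotation serves both search and documentation.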
Versioning
Versioning-related metadata, expressed through structured comments, attributes, or decorators, could simplify versioning scenarios on evolving crowdsourced resources.
— AdamSobieski (talk) 22:15, 15 July 2020 (UTC)
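One possible shape for such versioning metadata, sketched in Python under the assumption that versions are simple (major, minor) pairs: the `version` decorator, the `VERSIONS` table and `resolve` are hypothetical names used only for illustration.

```python
# Hypothetical table: function name -> {(major, minor): implementation}.
VERSIONS = {}

def version(major, minor):
    """Record an implementation under its name and version number."""
    def wrap(func):
        VERSIONS.setdefault(func.__name__, {})[(major, minor)] = func
        return func
    return wrap

@version(1, 0)
def greet(name):
    return "Hello " + name

@version(1, 1)
def greet(name):  # newer revision of the same function
    return "Hello, " + name + "!"

def resolve(name, major=None):
    """Return the newest implementation, optionally pinned to a major version."""
    candidates = {
        number: func for number, func in VERSIONS[name].items()
        if major is None or number[0] == major
    }
    return candidates[max(candidates)]
```

Older revisions stay reachable through `VERSIONS["greet"][(1, 0)]`, so callers that depend on an earlier behavior are not broken when a newer revision is contributed.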
Namespaces and Modules
Namespaces and modules can be useful when organizing large collections of functions. With namespaces or modules, multiple paradigms or ecosystems of functions could more readily coexist in a crowdsourced resource.
- https://v8.dev/features/modules
- https://www.typescriptlang.org/docs/handbook/namespaces-and-modules.html
— AdamSobieski (Discussion) 22:15, 15 July 2020 (UTC)
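A small Python sketch of how dotted namespaces could let functions with the same short name coexist. The `register` decorator, the `FUNCTIONS` table and the namespace names are hypothetical, and the pluralization rules are deliberately naive placeholders.

```python
# Hypothetical table keyed by fully qualified name, e.g. "english.plural".
FUNCTIONS = {}

def register(qualified_name):
    """Register a function under a namespace-qualified name."""
    def wrap(func):
        if qualified_name in FUNCTIONS:
            raise ValueError(f"{qualified_name} is already defined")
        FUNCTIONS[qualified_name] = func
        return func
    return wrap

@register("english.plural")
def english_plural(noun):
    return noun + "s"  # naive placeholder rule

@register("german.plural")
def german_plural(noun):
    return noun + "en"  # naive placeholder rule

def members(namespace):
    """List the qualified names that live directly under a namespace prefix."""
    prefix = namespace + "."
    return sorted(name for name in FUNCTIONS if name.startswith(prefix))
```

Both `plural` functions can be looked up without conflict, and `members("german")` lists only the functions belonging to that ecosystem.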
Scripting Environments for Natural Language Generation
With modern scripting engines such as V8, it is relatively easy to create and provide scripting environments.
Resembling how Web browsers provide scripting environments and APIs for Web scenarios, we can envision providing scripting environments and APIs for natural language generation scenarios.
Discussion topics pertaining to scripting environments for renderers include APIs and object models for: (1) accessing and working with input Wikidata data, (2) accessing and working with the rendering context, (3) accessing and working with intermediate knowledge representations, and (4) generating output natural language content.
- https://v8.dev/
- https://developer.mozilla.org/en-US/docs/Web/API
- https://developer.mozilla.org/en-US/docs/Web/API/Window
- https://developer.mozilla.org/en-US/docs/Web/API/Document
- https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model
— AdamSobieski (Discussion) 22:15, 15 July 2020 (UTC)
Provenance
As Wikidata is a sourced knowledge base, APIs and object models should include means for annotating any intermediate representations and portions of natural language with sources. In automatically-generated articles, statements’ sources could appear as referenced materials in articles’ “References” sections with numbered citations appearing inline, near relevant content.
Should Wikidata come to include support for automated reasoning, any reasoning, argumentation, derivations and/or proofs supporting statements could similarly appear in articles’ “References” sections with numbered citations appearing inline, near relevant content. Readers could click on hyperlinks to navigate to automatically-generated documents which indicate supporting reasoning, argumentation, derivations and/or proofs for one or more statements.
- https://www.wikidata.org/wiki/Help:Sources
- https://www.wikidata.org/wiki/Wikidata:WikiProject_Reasoning
- https://en.wikipedia.org/wiki/Wikipedia:Citing_sources
— AdamSobieski (Discussion) 22:15, 15 July 2020 (UTC)
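To make the numbered-citation idea concrete, here is a minimal Python sketch that carries sources alongside text fragments and emits inline markers plus a reference list. The `Fragment` class and `render` function are hypothetical, and the source labels are placeholders rather than real references.

```python
from dataclasses import dataclass, field

@dataclass
class Fragment:
    """A piece of generated text together with the sources that support it."""
    text: str
    sources: list = field(default_factory=list)

def render(fragments):
    """Return (body, references) with numbered inline citation markers."""
    references = []
    parts = []
    for fragment in fragments:
        markers = ""
        for source in fragment.sources:
            if source not in references:
                references.append(source)  # first mention gets the next number
            markers += f"[{references.index(source) + 1}]"
        parts.append(fragment.text + markers)
    return " ".join(parts), references

body, references = render([
    Fragment("Marie Curie was born in Warsaw.", ["placeholder source A"]),
    Fragment("She won two Nobel Prizes.",
             ["placeholder source B", "placeholder source A"]),
])
```

A source cited by several fragments keeps a single number, which is the behavior one would expect from a “References” section.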
Output Streams, Logging and Diagnostic Events
When editing/developing Wikifunctions content for use on Abstract Wikipedia, it would be convenient to be able to output to multiple streams, to log, and/or to raise typed events. Such features would be part of the scripting environment provided to functions.
It would also be useful to be able to aggregate, organize and view diagnostic outputs with a configurable granularity or verbosity.
— AdamSobieski (Discussion) 22:15, 15 July 2020 (UTC)
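Python's standard `logging` module already illustrates multiple output streams with configurable verbosity. The sketch below is only an analogy for what a Wikifunctions scripting environment might offer; the logger name `renderer.example` and the two in-memory streams are assumptions made for the example.

```python
import io
import logging

def attach(logger, stream, level):
    """Send records at or above `level` from the logger to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setLevel(level)
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    logger.addHandler(handler)

log = logging.getLogger("renderer.example")
log.setLevel(logging.DEBUG)
log.propagate = False  # keep the example's output out of the root logger

verbose = io.StringIO()  # everything, for developers
summary = io.StringIO()  # warnings and above only
attach(log, verbose, logging.DEBUG)
attach(log, summary, logging.WARNING)

log.debug("selected template")
log.warning("missing label")
```

The same two events end up in both streams or only one, depending on each stream's verbosity threshold.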
Editor/Developer Experiences
Editors/developers could have a means of toggling a “developer mode” or “debugging mode” on Abstract Wikipedia so that they could, while viewing articles, either:
- hover over portions of natural language to view relevant traces of computation and diagnostic messages in hoverboxes,
- view visual indicators for traces of computation and diagnostic messages in a margin so that they could then interact with the visual indicators to view expanded data, or
- otherwise select or indicate portions of natural language content to view relevant traces of computation and diagnostic messages.
— AdamSobieski (Discussion) 22:15, 15 July 2020 (UTC)
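The hover variant could be approximated, as a sketch, by wrapping each generated span in an element whose `title` attribute carries the trace when a debugging mode is on. The function `render_span` and the `trace` class name are hypothetical.

```python
from html import escape

def render_span(text, trace, debug=False):
    """Render generated text; in debug mode, expose its trace as a tooltip."""
    if not debug:
        return escape(text)
    return f'<span class="trace" title="{escape(trace)}">{escape(text)}</span>'
```

With `debug=False` readers get plain text, while editors who switch the mode on get the same text with the computation trace available on hover.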
Reader Experiences
We can consider adding feedback mechanisms for Abstract Wikipedia readers such as commenting upon, liking, upvoting, or otherwise providing feedback with respect to specific portions of automatically-generated natural language content.
Readers could also “post-edit” automatically-generated content. For automatically-generated articles, there could be wiki versions of the articles for purposes of crowdsourcing their fine-tuning. These “wiki post-edited” versions could be navigated to via tab user-interface elements. Data from this variety of crowdsourced feedback, “wiki post-editing”, could be collected and aggregated for use by Wikifunctions editors/developers.
- https://en.wikipedia.org/wiki/Postediting
- https://www.aclweb.org/anthology/W05-1615.pdf
- https://ehudreiter.com/2020/06/08/human-editing-of-nlg-texts/
— AdamSobieski (Discussion) 22:15, 15 July 2020 (UTC)
The Automatic Evaluation of Natural Language
Software tools in the categories of automatic essay scoring, grammar checking, readability measurement, and/or natural language evaluation could be of use for automatically measuring articles in a number of ways. Coh-Metrix 3.0, for instance, measures natural language on 108 indices.
Perhaps bots could measure articles as they are updated and report their data to editors/developers using a platform API.
— AdamSobieski (Discussion) 22:15, 15 July 2020 (UTC)
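As a much simpler stand-in for tools like Coh-Metrix, a bot could compute a single readability score. The Python sketch below uses the Automated Readability Index (4.71 × characters per word + 0.5 × words per sentence − 21.43); the tokenization rules are simplifying assumptions, and the function name is hypothetical.

```python
import re

def automated_readability_index(text):
    """Compute the Automated Readability Index with naive tokenization."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z0-9']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    # Count letters and digits only; apostrophes are not characters here.
    characters = sum(len(word.replace("'", "")) for word in words)
    return 4.71 * characters / len(words) + 0.5 * len(words) / len(sentences) - 21.43

simple = automated_readability_index("The cat sat. The dog ran.")
dense = automated_readability_index(
    "Multilingual encyclopedias necessitate sophisticated computational infrastructure."
)
```

Short words and short sentences yield a lower score than long words in a long sentence, so a bot could track the score across article revisions and report regressions.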
Generating Articles in Response to Users’ Questions
Resembling question-answering systems, articles could be generated for Wikidata queries or after users navigate to articles from Web searches. Beyond highlighting relevant content, articles could be generated while utilizing this context data.
— AdamSobieski (Discussion) 22:15, 15 July 2020 (UTC)
Generating Follow-up Questions for Use in Articles
Resembling hypertext-based dialogue systems, one could place follow-up questions which might interest a reader in a section near the bottom of articles, each being a hyperlink to another article. One could hyperlink to articles which could be dynamically generated, if they are not already created and cached.
— AdamSobieski (Discussion) 22:15, 15 July 2020 (UTC)
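The generate-on-first-visit behavior can be sketched in Python with a cache in front of a generator. Everything here is hypothetical: the `/generate?q=` URL scheme, the function names, and the placeholder article text.

```python
from functools import lru_cache
from html import escape
from urllib.parse import quote

generated = []  # records which questions actually triggered generation

@lru_cache(maxsize=None)
def open_article(question):
    """Generate the article for a question once; later visits hit the cache."""
    generated.append(question)
    return f"Generated article answering: {question}"

def follow_up_links(questions):
    """Build hyperlinks for a follow-up section without generating anything yet."""
    return [
        f'<a href="/generate?q={quote(question)}">{escape(question)}</a>'
        for question in questions
    ]

links = follow_up_links(["Where was she born?"])
open_article("Where was she born?")
open_article("Where was she born?")  # served from the cache
```

Building the links costs nothing, and the article behind a link is generated only when a reader first follows it.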
Speech Synthesis and Hypertext
There exists a CSS Speech Module W3C Candidate Recommendation.
With respect to pronunciation, one can utilize pronunciation lexicons with hypertext documents. Also, resembling EPUB3, one can utilize SSML-based attributes on generated hypertext outputs to provide pronunciation data.
- https://www.w3.org/TR/css-speech-1/
- https://www.w3.org/TR/pronunciation-lexicon/
- https://www.w3.org/publishing/epub3/epub-contentdocs.html#sec-xhtml-ssml-attrib
- https://www.w3.org/TR/speech-synthesis11/
— AdamSobieski (Discussion) 22:15, 15 July 2020 (UTC)
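As an illustration of attaching pronunciation data to generated output, the following Python sketch wraps words found in a small lexicon in SSML `phoneme` elements. The function `to_ssml` and the lexicon format are hypothetical; for brevity the sketch omits the SSML namespace declaration that a complete SSML 1.1 document would carry, and it emits standalone SSML rather than the EPUB3-style attributes on hypertext.

```python
import xml.etree.ElementTree as ET

def to_ssml(tokens, lexicon):
    """Build a <speak> element, adding IPA pronunciations where known."""
    speak = ET.Element("speak", {"version": "1.1", "xml:lang": "en"})
    previous = None  # last child element, whose tail receives plain text
    for token in tokens:
        if token in lexicon:
            phoneme = ET.SubElement(
                speak, "phoneme", {"alphabet": "ipa", "ph": lexicon[token]}
            )
            phoneme.text = token
            phoneme.tail = " "
            previous = phoneme
        elif previous is None:
            speak.text = (speak.text or "") + token + " "
        else:
            previous.tail += token + " "
    return ET.tostring(speak, encoding="unicode")

ssml = to_ssml(["I", "like", "tomato", "soup"], {"tomato": "təˈmɑːtəʊ"})
```

Words without a lexicon entry pass through as plain text, so the renderer only needs pronunciation data for the terms a speech synthesizer is likely to get wrong.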
Use GPT-3 style APIs
Use GPT-3 style APIs to automatically translate ordinary natural language into the syntax of Abstract Wikipedia. — ChristianKl (Discussion) 16:15, 16 January 2021 (UTC)
Function contracts
Wikifunctions should support function contracts, as already provided by different programming languages (like Eiffel, Spark-Ada, or JML; or even Python with different packages), that is:
- Preconditions: A new section after the function arguments with a list of boolean predicates indicating the conditions required by the arguments before calling this function (e.g. list cannot be empty, or first parameter must be greater than the second one)
- Postconditions: A new section after the function arguments with a list of boolean predicates indicating the conditions that are guaranteed by the result of the function (e.g. result is between zero and the first parameter, or the length of the output list is the length of the input list plus one)
- Type invariants: A new section in the data type page with a list of boolean predicates indicating the conditions met by every value of this data type (e.g. value must be strictly greater than zero, or is a prime number)
Probably the simplest approach is for each predicate to be just a function in the same namespace as the rest of the functions. Programming languages usually allow extra operations in contract predicates (like the "for all" or "there is at least one" quantifiers), but this may be optional. Preconditions would only need to reference the function arguments, while postconditions will also need a way to reference the function result. As I understand it, arguments cannot be modified, so postconditions have no need to reference the original value of a parameter at function start. All predicates in the list must be true (i.e. a logical AND of all the predicates), and preconditions can be as detailed as needed, probably being also useful for renderers, e.g. the argument must be a Wikidata item of a human who is also already dead.
Besides formally documenting the function for implementers and users (e.g. if a precondition fails, the user will get a unified and clear error message, without the need to handle the different errors inside each function implementation), postconditions are very useful for automatically checking the results during tests. It would be nice if the platform generated a report of constraint violations for each implementation whenever a result doesn't fulfill the postcondition for a set of input parameters (so the implementation or postcondition can be corrected). Type invariants would be implicit preconditions of every function with arguments of that type, and implicit postconditions of every function with a result of that data type.
Another advantage usually obtained is the potential to simplify implementations: some defensive code can be removed thanks to the preconditions, and there is no need to handle exceptional situations in a way that is compatible across all supported languages. Maybe "robustness tests" should also be provided, i.e. special tests for a function checking that certain parameters are rejected by its current preconditions.
I hope this helps in the design process, and thank you very much again for this awesome project. — surueña (Discussion) 06:59, 3 April 2021 (UTC)
- See here. — DVrandecic (WMF) (Discussion) 22:59, 19 April 2021 (UTC)
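The contract proposal above can be sketched in Python with a decorator, assuming that each predicate is an ordinary function, that preconditions receive the arguments, and that postconditions receive the result followed by the arguments. The names `contract` and `ContractViolation` are hypothetical, and this is not how Wikifunctions is implemented.

```python
import functools

class ContractViolation(Exception):
    """Raised with a uniform message when a predicate does not hold."""

def contract(pre=(), post=()):
    """Check every precondition before the call and every postcondition after."""
    def wrap(func):
        @functools.wraps(func)
        def checked(*args):
            for predicate in pre:
                if not predicate(*args):
                    raise ContractViolation(
                        f"precondition {predicate.__name__} failed for {func.__name__}"
                    )
            result = func(*args)
            for predicate in post:
                if not predicate(result, *args):
                    raise ContractViolation(
                        f"postcondition {predicate.__name__} failed for {func.__name__}"
                    )
            return result
        return checked
    return wrap

def non_empty(items):
    return len(items) > 0

def result_is_member(result, items):
    return result in items

@contract(pre=[non_empty], post=[result_is_member])
def largest(items):
    return max(items)  # no defensive check needed: the precondition covers it
```

A "robustness test" in the sense described above would then assert that `largest([])` raises `ContractViolation` instead of reaching the implementation.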
Beta testing with useful material and a sponsored team of translators
Service manuals are knowledge about how to properly use something. Because manufacturers would benefit from multilingual translations of their service manuals, and because mass producers like Sony, Stihl, Bayer, and Suzuki have gigantic resources, they can afford to sponsor a team of translators provided by Wiki but paid by them. Translators would double-check whether Abstract properly carried the content from one language to another. Hosting fees for the translated manuals could be fully covered by the mass producers who agree to trust the team here.
Having a multilingual service manual library about how to use a chainsaw, or achieve a bokeh, that’s great, but this is not the goal as I see it. It’s just a happy collateral benefit.
The main goal is to beta test the machine. Translators will not fix the errors; they will point them out for us so we can understand why the machine failed. And if the machine is perfect, the translation team will have proved its reliability.
As for the translation team, it could be recruited in different ways:
- Fiverr
- School teachers
- Catholic Church
I think the idea of the Catholic Church may be surprising and seem out of place, or a devout move, so here is the thinking behind it.
First, let’s clear the clouds: even though there are sadistic teachers, murderous police officers, and a lot of bad people in (maybe) every organisation, we cannot condemn them all as bad people. And I’m not talking about forgiveness, not at all. I’m talking about collaborating with the better part, the good people there. Their pitch is about love and sharing, and there’s no need to believe in any faith or miracle. Wiki is secular, not invested in paranormal activities or message spreading.
But (again, a double win): the Catholic Church is everywhere on the planet, in every language, and full of language scholars. Yes, they want to talk about love (and unfortunately sometimes about why their love is better), but they do want to translate the Bible into every language on earth. It’s a core belief, since Jesus said (so they say): Spread the word. I’m not up to date on the financial resources of Pope Francis and friends, but I know they have a network of scholars who speak local languages or know who is really reliable for the job. They have had bad press and are now looking for an answer about what place the Church can have, and how they can help people by spreading our words and their Word. And whatever the belief, the Bible is a book that has shaped mankind for the last two millennia. Translating it into every creole language would show peace and respect, that’s it.
And what about the other big religions? Bring them in! We can send invitations simultaneously, as the scope is clearly defined (so Wiki stays secular and everyone is at ease). Every church is founded on the reading and interpretation of literature; this is really a place of knowledge and philology. And their hierarchy can confirm the credibility of the translators. We don’t want someone like the fake sign language interpreter who was interpreting Barack Obama’s speech at the memorial service for Nelson Mandela.
—The preceding unsigned comment was added by François Jourdain (talk) 04:35, 23 February 2022 (UTC)
- Possibly an interesting idea. Sponsored contributions always have to be considered thoroughly and together with the community, but this could be a possibility in case organic growth is too stagnant. Let's see! That's something we should think about in a year or so. --DVrandecic (WMF) (talk) 00:11, 5 March 2022 (UTC)