Talk:Abstract Wikipedia/Tasks

From Meta, a Wikimedia project coordination wiki

Abstract Wikipedia & Wikidata

Task P2.3: Abstract Wikipedia Create a new namespace in Wikidata and allow Content to be created and maintained there.

I don't think having Abstract Wikipedia within a Wikidata namespace is a good idea. Abstract Wikipedia has different concerns when it comes to notability than Wikidata. It would be better if it had its own domain, policies and admins. ChristianKl 22:21, 17 May 2020 (UTC)

@ChristianKl: Wouldn't the notability policy of Wikidata always be more inclusive than that of any Wikipedia, including the Abstract Wikipedia content? --denny (talk) 01:25, 18 May 2020 (UTC)
@Denny: That's the point: Wikidata's notability policy is more inclusive than what would be good for Abstract Wikipedia. As a result there will be a problem when Abstract Wikipedia tries to reuse Wikidata's notability policy. I don't think writing the policy for Abstract Wikipedia would be a good job for Wikidata. I think it needs its own jurisdiction. ChristianKl 21:05, 19 May 2020 (UTC)
@ChristianKl: Ah, I understand your point. Because that would, for any wiki that does implicit integration, mean they would transfer their decision on whether to have an article on a topic to Wikidata, instead of a new project that would start fresh to make that decision.
We have the whole first project year to decide whether Abstract Wikipedia should be part of Wikidata or not, so I don't think we should decide it now, but why do you think it would not be a good idea to leave this decision to the Wikidata community? I don't really see the arguments of why a new project or Wikilambda would be better places. --denny (talk) 21:43, 19 May 2020 (UTC)
I don't have a problem with making the decision via a Wikidata RfC. As a Wikidata admin, I would however be surprised if the RfC ends up resulting in Wikidata wanting to be the place for managing Abstract Wikipedia.
Abstract Wikipedia likely wants to write their policies with Wikilambda, which needs specific skills most in the Wikidata community won't have and won't be interested in.
It's very unlikely that EnWiki will display any free text from Abstract Wikipedia in the next ten years. It's still good to have those articles on a website people can find via Google. ChristianKl 18:40, 20 May 2020 (UTC)
@ChristianKl: Writing policies with Wikilambda is probably a bit further away than writing content (but a pretty neat idea, in fact for all the multilingual projects, including Wikidata, Meta, and Commons, eventually!)
Having an RfC on Wikidata within the first year of the project is a good idea, to decide about whether the Content should be stored in Wikidata or not. I think it is a good idea, and you gave enough reasons that warrant a wider discussion. I made two changes to the proposal to capture this. --denny (talk) 22:39, 24 May 2020 (UTC)

Project order

I'm confused to see that coding is planned to start right away, before any research on, for instance, the linguistic aspects is performed. Does this mean we consider the research field to be "done" with providing a theoretical foundation for this idea? I'm not sure what the budget for this project is, but I suppose several million dollars. It might be useful to offer some sort of "theoretical bug bounty", or some other way to stimulate the production of papers by experts of the field to suggest how (their?) past research can be applied to the project. There is no single "correct" answer, but you can get some ideas. Even Denny might have overlooked something in the literature. ;-) Nemo 19:21, 2 July 2020 (UTC)

Even accepting the current general architecture, my personal opinion is that the very early work on the design of the initial varieties of constructors/renderers should be closer to the start of the development timeline and tracked. The usage of constructors/renderers is where much of the ambitiousness of the project originates. --Chris.Cooley (talk) 08:05, 3 July 2020 (UTC)

Development Part PP2 Suggestion

I suggest we add a series of tasks in a new development part named PP2 that would start concurrently with part P1. Part PP2 would be about producing a broad spectrum of ideas and alternatives from which a program for natural language generation from abstract descriptions would be selected.

I include a draft part PP2 below. The draft is adapted from Pre-Phase A and other material from the NASA Systems Engineering Handbook.[1]

As I am sure there are many problems with this very initial draft, I ask the community to edit the draft and provide comments, objections, etc. as they see fit. --Chris.Cooley (talk) 07:34, 7 July 2020 (UTC)

Part PP2: Concept Studies for Abstract Wikipedia

Part PP2 is about producing a broad spectrum of ideas and alternatives from which a program for natural language generation from abstract descriptions can be selected ("the Program"). This part will start concurrently with P1.

Task PP2.1: Review initial requirements, scope of work, background, and implicit assumptions and constraints

Here we review the following items to the extent they are relevant to the Program.

  • Initial requirements[2]
  • Scope of work[3]
  • Background[4]
  • Implicit assumptions and constraints

Output:

  • Baseline requirements, scope of work, background, and assumptions and constraints

Task PP2.2: Identify and involve users and other stakeholders

Here we identify groups and individuals that are affected by or have a stake in the Program. Identified stakeholders shall be involved in all development parts of the Program.

Outline the needs, goals, and objectives for the Program for every primary stakeholder according to their expectations. Define measures of effectiveness for meeting these expectations.

Example output:

  • Wikimedia Foundation
  • Readers
  • Abstract content editors (Wikidata, prospective)
  • Current content editors (Wikipedias, Wiktionaries, Wikispecies, etc.)
  • Constructor/renderer developers
  • Domain experts
    • in natural language generation
    • in semantics
    • in construction grammar[5]
    • in frame semantics[6]
    • in typology
    • in minority languages
    • in previous attempts in natural language generation from abstract descriptions (e.g., Universal Networking Language[7], Abstract Meaning Representation[8])
    • in knowledge representation
  • Corresponding needs, goals, objectives, and measures of effectiveness for the above stakeholders

Task PP2.3: Review overview of the envisioned system

Here we review the architecture[9] of the Program at a high level, without making specific decisions on constructor/renderer implementations.

Output:

  • Baseline overview of the envisioned system

Task PP2.4: Collect documents

Here we conduct a literature review of relevant research and reference material in the domains of the domain expert stakeholders. We also collect applicable documents created for use in the development of Abstract Wikipedia.

Output:

  • Baseline collection of documents

Task PP2.5: Complete analysis of alternatives for theoretical bases for the Program

There are many semantic theories that wish to describe meaning at some level that is presumed to be universal, and therefore would conceivably be a candidate for an abstract description language.

Here we analyze various alternative theoretical bases for the Program. A theoretical basis shall be chosen, and the assumption that another alternative should have been selected shall be refuted.

Output:

  • Analysis of alternatives for theoretical bases for the Program
  • Baseline selected theoretical basis for the Program

Task PP2.6: Describe the envisioned system

Here we provide a more detailed description of the envisioned system. For example, a description of a specific constructor/renderer implementation making use of the selected theoretical basis shall be given.

Output:

  • Baseline description of the envisioned system

Task PP2.7: Describe use cases for the envisioned system

Here we describe use cases for the envisioned system. These are the conditions in which the envisioned system shall function.

Example output:

  • Use cases from project goals[10]
    • Reusable and well-tested natural language generation
    • Allowing more people to read more content in their language
    • Allowing more people to contribute content for more readers, and thus increasing the reach of underrepresented contributors

Task PP2.8: Identify initial technical risks

Here we describe any risks and potential issues associated with the development and operations of the envisioned system. Also include any concerns and risks with the project schedule and implementation approach.

Output:

  • Baseline technical risks

Task PP2.9: Develop plan

Here we develop a technical plan for the creation of the envisioned system to occur in task P2.1 below.

Identify the roles and responsibilities of those involved in the technical plan.

Output:

  • Baseline technical plan

Task PP2.10: Prepare sub-proposal for the Program

Here we prepare a sub-proposal for the Program that will include the outputs of all the above tasks. This proposal must be approved by consensus before work on the Program in task P2.1 below can begin.

Output:

  • Sub-proposal

Notes

  1. National Aeronautics and Space Administration, NASA Systems Engineering Handbook (Washington, DC: National Aeronautics and Space Administration, 2016).
  2. See primary goals in Abstract_Wikipedia/Plan#Goals.
  3. See Abstract_Wikipedia/Requirements and Abstract_Wikipedia/Architecture.
  4. See Abstract_Wikipedia#Background_/_supporting_material_/_existing_discussion.
  5. This domain is explicitly present in this list because of initial interest.
  6. See note 5 above.
  7. See en:Universal_Networking_Language.
  8. See en:Abstract_Meaning_Representation.
  9. See Abstract_Wikipedia/Architecture.
  10. See Abstract_Wikipedia/Goals.

Comments

@GrounderUK:

With respect to your question about adding signatures when editing a draft document, I think it would make sense to not do so, and I removed your question and signature accordingly.

Good move!

I also removed "Template developers" and "in system performance" from the list of example stakeholders as I felt that they had significantly less stake in the Abstract Wikipedia side of the project than the other listed stakeholders.

If you disagree with any of the above, I think it would be great to start a discussion here! --Chris.Cooley (talk) 20:52, 8 July 2020 (UTC)

@Chris.Cooley: Speaking for "template developers" and "systems performance" experts (and I am neither), my experience (not of Wikimedia projects) has been that these are the types of stakeholder who often end up as "victims" of innovative development. You are probably right that performance is much more an issue for Wikilambda than for Abstract Wikipedia, and as this happens first, I'm happy to leave them to consider what level of ongoing involvement is appropriate.
I feel rather differently about "Template Developers". What I have in mind is that contributors who do more than just edit will have different concerns and priorities than ordinary editors like myself. I think that template developers (for local Wikipedias &c) will end up sitting between editors and Renderers. That may not be the vision, but it is my expectation. And in the early days, I expect that a rendered (no longer) abstract article's Wikitext will include more characters within {{}} than is currently typical. I wonder, also, whether language-neutral abstract templates are within scope (I think they should be). In any event, if template developers are not the most experienced current users of Wikidata content, who is?--GrounderUK (talk) 22:46, 8 July 2020 (UTC)
@GrounderUK:
Would it make sense to change "abstract content editors" and "current content editors" to "abstract content/template editors" and "current content/template editors," respectively? --Chris.Cooley (talk) 11:06, 10 July 2020 (UTC)
@Chris.Cooley:
Sorry, Chris. I note that we are discussing the Example Output from a task that has not yet started. Further, it seems to be just the two of us at this stage. I'm happy to get involved with keeping these things moving along, but I feel we are still a long way away from what I've been thinking the requirements suggest. So, if I may take you back to PP2.1:Requirements..., we need to be doing those things (at a high level, initially) in order to get a feel for the best approach to PP2.2:...Stakeholders. I see no reason to delay PP2.1:Requirements, but when I went back to the requirements, it seemed to me that the scope of PP2.2...Stakeholders needs some thinking about, given the hundreds of Communities and our many languages and cultures. I feel the need to get some degree of traceability in place, but it will be a few days before I can step through PP2.1:Requirements... Basically, I'm just thinking of a one-dimensional array of stakeholder types for each paragraph. Other suggestions always welcome, of course...--GrounderUK (talk) 19:50, 10 July 2020 (UTC)
@GrounderUK:
I am well aware of this. For the moment, I am just trying to be responsive to immediate criticism of the draft, which was the intent of my comments above. --Chris.Cooley (talk) 23:37, 10 July 2020 (UTC)
@Chris.Cooley: No criticisms from me, just constructive engagement. You asked for comments and amendments; I was just explaining that it will take me some days to work out what I think the output might look like, taking into account any feedback there may be in the meantime.--GrounderUK (talk) 00:18, 11 July 2020 (UTC)
@GrounderUK: I understand. Thank you.

@Chris.Cooley: And feedback came there none. Well, it's early days but the general level of activity here is causing me some disquiet. (To be clear, I'm not talking about you here, Chris, nor about any individual in particular, just a general "where is everybody?")

On the stakeholder front, my ideas are still developing. I'm wondering whether we can do a Wikispore thing? It might link in with the Related and previous work. Then, any individual or group could have a page with links to their papers, software, blogs etc and here they (or we) could outline our shared interests and (shall we say) where we have less in common. The same would apply to different Wikimedia projects and contributors. This should support genuine stakeholder identification as well as our all-important gap analysis.

Since "we" are ourselves stakeholders, I assert that the union of all intersections between stakeholders and the project is a definition of the project. The gap analysis ought, therefore, to be quantitative rather than existential. In a more pragmatic sense, the Project Director is personally responsible for any aspect of the project that is not explicitly delegated and, in this context, a complete absence of any aspect of the project from the union of intersections is a defect of delegation (or, realistically, a defect in the documentation). Our quantitative gap analysis considers the alignment between the recorded stakeholder/project intersections and some multi-dimensional standard of adequacy. Which is to say, do we have too little, too much, or just enough. (I make no presumption here that the project is fully defined a priori. On the contrary, I hold it to be true that a project can only ever exist as a manifestation of a subset of stakeholder concerns, but I do not defend that thesis here or elsewhere.)
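
As a rough illustration only, the union-of-intersections idea can be sketched with plain Python sets. Everything named here is an assumption made for the example: the aspect names, the stakeholder names, and their concerns are invented and are not an agreed list. The sketch treats a gap as an aspect that no stakeholder touches, and the per-aspect count as one possible quantitative view.

```python
# Hypothetical data: all names below are invented for illustration.
project_aspects = {"constructors", "renderers", "function wiki", "policies", "documentation"}

stakeholder_concerns = {
    "readers": {"renderers", "documentation", "search ranking"},
    "abstract content editors": {"constructors", "policies"},
    "domain experts": {"constructors", "renderers"},
}


def covered_aspects(aspects, concerns):
    """Union of the intersections between each stakeholder's concerns and the project."""
    covered = set()
    for interests in concerns.values():
        covered |= interests & aspects
    return covered


def coverage_counts(aspects, concerns):
    """How many stakeholders are engaged with each aspect (a quantitative view)."""
    return {
        aspect: sum(aspect in interests for interests in concerns.values())
        for aspect in aspects
    }


covered = covered_aspects(project_aspects, stakeholder_concerns)
gaps = project_aspects - covered  # aspects with no recorded stakeholder at all
counts = coverage_counts(project_aspects, stakeholder_concerns)

for aspect in sorted(project_aspects):
    print(f"{aspect}: {counts[aspect]} stakeholder(s)")
print("gaps:", sorted(gaps))
```

A concern such as "search ranking" falls outside the project set and so drops out of the union, while an aspect with a count of zero would be read as a defect of delegation or documentation in the sense described above. What counts as "too little" or "too much" would still need a separately agreed standard of adequacy.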

Our quantitative gap analysis should not be thought of as something we conclude at some particular point in time; it is ongoing; the project definition and the stakeholder intersections co-evolve. While the project is immature, I think the problem is principally a deficiency in documentation. As the documentation grows (on Wikispore, for example), the real (and temporary) deficiencies in stakeholder engagement are made manifest. If attention paid to temporary deficiencies reduces the deficiency (by increasing stakeholder engagement) the project and its stakeholder engagement are better aligned. If the stakeholder engagement is not increased, the project scope is reduced (temporarily, at least, and not necessarily formally... "it's not our highest priority at this time" is our watchword). I leave the problem of excessive stakeholder engagement for another time, but it's filed under Goldratt's Theory of constraints along with the delightful Engpasskonzentrierte Strategie. @Chris.Cooley: Apparently ping doesn't work if you forget to sign your comments, or so someone said in 2017. Pinging you again in case that's true; don't feel pestered!--GrounderUK (talk) 16:27, 28 July 2020 (UTC)

@GrounderUK: I really do apologize for the delay. I do not have anything intelligent to add to the topic of stakeholders, but I appreciate you adding good discussion here. --Chris.Cooley (talk) 23:47, 28 August 2020 (UTC)
@Chris.Cooley: That's okay, I still have hopes for a Wikispore, but maybe we'll get our full Wiki sooner rather than later. Denny mentioned 400 languages (as opposed to 7,000) on Talk:Abstract Wikipedia; presumably there's a list somewhere (some subset of these plus some others?). If @Quiddity: happened to be passing, he might have some thoughts on how we provide a "Start Here" for each of the favored four hundred... --GrounderUK (talk) 10:05, 1 September 2020 (UTC)