Talk:Abstract Wikipedia/Function model

Typo under Z8/Functions?

Has a sentence:

Using literals instead of persistent ZObjects for the arguments, this would look as follows. Note that we are creating the literals using the Z70/positive integer as a constructor. All Z8/Types can be called like this, providing a value for each of their keys. This is not a Z7/Function call, but a notation for the object of the given Z4/Type.

Shouldn't that instead say ... "All Z4/Types can be called like this, ..." ?

Or was it supposed to say ... "All Z8/Function types can be called like this, ..." ? Thadguidry (talk) 22:03, 27 July 2020 (UTC)

Yes, you're right, it should have been Z4/Types! Good catch, thanks! Fixed. --denny (talk) 22:50, 27 July 2020 (UTC)

Internationalization concerns

  1. Bidirectionality (or multi-directionality, generally): if strings in a natural language are sequenced right-to-left, can the sequence of JSON strings have the same sequence ("NOSJ")? I think JSON objects are unordered name—value pairs; can they be value—name pairs? "2" =: "Z70K1" (or should that be "2" =: "1K70Z", for example?)
  2. Punctuation: You'll note that I directionalized "=" to "=:", for effect. What about other controlled punctuation? Double quotes are a useful translingual feature, but do we want to accept that constraint?
  3. Numerals: Any support for alternative numerals? "2" is just the name for the corresponding integer, isn't it?
  4. Identifiers: They can't help being meaningful, in that they acquire meaning through use, but this composite form, "K1" appended to "Z70" (or is it the other way round): is it inherently unidirectional? Is it a name—value pair in disguise? Or a pair of name—value pairs? I prefer it to just using arabic-numeral string representations of integers, or random alphanumeric strings, but I think a thoughtful selection of codes is worth exploring. Not so much for Z2/persistent objects, but for their required methods (like "key", unless we need leftKey, rightKey...). Maybe we could require Ks to be symmetrical, or symmetrical integers, or symmetrical alphanumerics?--GrounderUK (talk) 10:57, 28 July 2020 (UTC)[reply]
That is a great question! I always assumed that JSON already solved the issue of RTL/LTR representation. Was I too naïve? Since the same issue has already been solved in the Translate extension which also stores RTL/LTR texts, I hope that @Amire80: may help here. Also possibly relevant: W3C discussion note. --DVrandecic (WMF) (talk) 22:23, 4 August 2020 (UTC)[reply]
Regarding the numerals - yes, other types can be introduced to support other numerals natively if so wished, and also the numerals can be used as a basis to render them in different ways. --DVrandecic (WMF) (talk) 22:23, 4 August 2020 (UTC)

@ABaso (WMF): Further to your question on Talk:Abstract_Wikipedia/Early_mockups, on reflection, I suggest we:

  1. Adopt the JSON formalism as an explicit constraint on the project
  2. Make explicit reference to internationalization concerns implied by the left-to-right formalism of JSON
  3. Commit to exploring generic re-sequencing solutions ("JSON isomorphisms") during the initial development, working with these principal classes of stakeholders:
     1. programmers whose principal (first and working) language is not written from left to right
     2. non-programmers who contribute to Wikipedias (primarily), especially to the smaller ones
     3. people who do not contribute to Wikipedias, especially if they use languages that are not written from left to right.

We may want to involve researchers in the field, but I believe that engaging directly with these stakeholders is more important even though it presents many practical difficulties. Or for precisely that reason.

I would not be thinking of considering identifiers during this engagement. Once we have JSON as a constraint, propagating its left-to-right bias into our identifiers seems perfectly sensible or, at least, justifiable on the ground of consistency. JSON isomorphisms can be constrained to leave identifiers as-is, as if they were single tokens.--GrounderUK (talk) 10:54, 31 July 2020 (UTC)[reply]

Yes, JSON is an explicit requirement. That is done by saying that ZObjects are serialized as a subset of JSON. Regarding the RTL issues, as discussed above - these indeed need to be resolved. I was seriously under the impression that this was already the case, and we need to check on that. --DVrandecic (WMF) (talk) 22:32, 4 August 2020 (UTC)[reply]

DVrandecic (WMF), JSON is not exactly code, but it is certainly plain text, and it has some characteristics of code.

Plain text and code have issues for all languages that need letters beyond plain ASCII. This may even be an issue for big Latin-alphabet languages like French or Vietnamese, and it gets much worse for languages that need right to left or vertical text or advanced letter shaping, like Arabic or Hindi. If anyone is interested, I can go into details of possible problems, but I'll skip it for now.

The important takeaway is that storing stuff in JSON is fine for computers, but editing stuff in JSON should never be a requirement for humans. Ideally, it shouldn't be a requirement even in English! There is at least one good example of this: the TemplateData extension has a TemplateData editor, which allows doing (almost) everything through a form. I'm writing "almost" because the last time I checked (many months ago), there were some TemplateData fields that the form could not handle and they had to be written manually in raw JSON. I know that at least some of these issues were fixed recently, and if such fields still exist, they must be fixed, too.

The Translate extension also helps remove the burden of raw JSON editing in almost all cases for i18n/*.json files in core MediaWiki and extensions. There are only two cases in which raw JSON has to be edited:

  1. When adding new messages, the English text has to be added to en.json, and its context documentation must be added to qqq.json.
  2. When changing an existing message, it must be changed in en.json. Documentation in qqq.json may be updated if needed, but it can also be updated by editing the corresponding page on translatewiki.net.

All the translations to other languages are added and modified through the web interface on translatewiki.net. Even the documentation in qqq can be edited after the message key is added. There is never a reason to edit raw JSON in other languages. The files are updated automatically by scripts and bots that the translatewiki staff runs regularly (more or less every day).

Using a form to edit JSON also goes a long way to prevent syntax errors, and to help people focus on the content instead of remembering the valid key names.

And this should be the situation with all other cases where people need to edit JSON or other code or plain text with human-readable strings, even if they are short. --Amir E. Aharoni (talk) 10:36, 5 August 2020 (UTC)[reply]

@Amire80: Yes, fully agreed! We won't have people edit JSON files directly, it will always be through a UX. It seems that resolves most of the issues. Thanks, Amir! --DVrandecic (WMF) (talk) 20:56, 6 August 2020 (UTC)
Just a note - I wrote some Vue components to allow form-style editing of ZObjects for Denny's draft version - it's pull request #23. I absolutely agree people shouldn't be editing raw JSON! ArthurPSmith (talk) 13:12, 5 August 2020 (UTC)
@ArthurPSmith: Whaat?? Why did I never see that pull request! Oh goodness, there has been quite some activity on that repo. I need to get back to that, sorry! I am a terrible person, thank you! I haven't seen that at all! --DVrandecic (WMF) (talk) 20:52, 6 August 2020 (UTC)
@DVrandecic (WMF): It's ok, I figured you'd been busy! It was mostly an experiment, but maybe useful as a starting point! ArthurPSmith (talk) 21:41, 6 August 2020 (UTC)
@ArthurPSmith: Thank you so much for this, I will take a look at it soon! --DVrandecic (WMF) (talk) 22:12, 6 August 2020 (UTC)

JSON isomorphisms

We can move this section to a different page if we agree to explore it further.

Approaches to consider would begin with the familiar (names invented)

  • Reverse calculator
A certain smartphone has a very simple calculator in portrait orientation. The digits are entered from left to right to form a simple decimal number, but whether the operator then entered is to the left or to the right is in the mind of the operator - I mean, user. One can even think of the numbers as being in a column, from bottom to top, say, if one chooses.
  • High-school algebra (PEMDAS)
  • Generic spreadsheet
There are inconsistencies across different offerings and, of course, within them. For arithmetical functions, the use of operators is from left to right. Ranges can be entered from right to left or from bottom to top but they are presented as top left to bottom right... (maybe there's a setting to change that; it might even follow country settings!)
  • Simple MW template (ordered list)
  • Infobox template (headed unordered pair list)
Notice how the template wizard presents the two types of template in a common format. The pairs may be added in any order, but the wizard outputs wikitext in the order the template determines.--GrounderUK (talk) 14:50, 31 July 2020 (UTC)[reply]
Sorry, I am confused by what you are saying here. Could you please rephrase? --DVrandecic (WMF) (talk) 22:30, 4 August 2020 (UTC)
@DVrandecic (WMF): I can't say I find your confusion surprising; I wrote the above rather quickly in an attempt to clarify what I had in mind when I said (at 3, above) "Commit to exploring generic re-sequencing solutions ("JSON isomorphisms")..." So, my starting point is how to end up with JSON ZObjects, given a contributor's assumed lack of programming experience and without assuming experience of left-to-right expression. (And I guess there's an implicit assumption that programmers will find their own routes, having already overcome left-to-right bias in some programming language.) With all that in mind, I imagined how one might begin to explore the idea of functions with such stakeholders through their previous function-like experiences, such as using a calculator or a spreadsheet. It seems to me that these might also be rather unfriendly to some classes of stakeholder, which is why I suggested a "reverse calculator" (which, to an r-to-l user, would seem to be more natural and, in any event, suggests to me that this might be more of an issue than it might appear – just try entering your sums backwards, not caring about the incorrect answers...). Anyway, that's all that was about: how do we allow our least experienced contributors to specify a function in a way that makes sense to them (with minimal compromise) and which we can automatically parse into a well-formed JSON ZObject?--GrounderUK (talk) 14:12, 26 August 2020 (UTC)[reply]

Syntax

It appears that empty strings and lists are well-formed ZObjects, but empty records are not. If this is correct, should the "subset of JSON..." be amended to make this difference explicit? --GrounderUK (talk) 23:09, 30 July 2020 (UTC)

That is correct. The grammar already makes that explicit, by requiring the Z1K1 on the Record rule. --DVrandecic (WMF) (talk) 22:27, 4 August 2020 (UTC)
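
The distinction discussed above can be sketched in Python. This is only an illustration under stated assumptions: the function name `is_well_formed` and the key pattern (an optional Z-number followed by a K-number) are hypothetical choices made for this example, not part of the function model text.

```python
import re

# Assumed key shape: global keys such as "Z1K1" or local keys such as "K1".
KEY_PATTERN = re.compile(r"(Z[1-9][0-9]*)?K[1-9][0-9]*")


def is_well_formed(value):
    """Return True for a string, a list of well-formed values, or a record with Z1K1."""
    if isinstance(value, str):
        return True  # any string, including the empty string
    if isinstance(value, list):
        # an empty list passes because all() of nothing is True
        return all(is_well_formed(item) for item in value)
    if isinstance(value, dict):
        if "Z1K1" not in value:
            return False  # this also rejects the empty record {}
        return all(
            KEY_PATTERN.fullmatch(key) is not None and is_well_formed(item)
            for key, item in value.items()
        )
    return False  # numbers, booleans and null are outside the subset


print(is_well_formed(""))   # True
print(is_well_formed([]))   # True
print(is_well_formed({}))   # False
```

The empty record is rejected for the same reason the grammar rejects it: the record case requires a Z1K1 key, while the string and list cases have no mandatory content.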

Is it intended that the first pair in a ZObject record must be the Z1K1/type of the record, as the canonical form asserts? The text says only that the ZObject must have a Z1K1/type.--GrounderUK (talk) 23:47, 31 July 2020 (UTC)

I'd say yes. In fact, I would sort all the keys. But given that they are unique, in most cases this doesn't matter. I would like a canonical version where they are sorted - but there is really no need to have a ZObject be rejected because of that. Hmm. This probably should be captured somewhere. --DVrandecic (WMF) (talk) 22:27, 4 August 2020 (UTC)[reply]
I'm not sure ordering should be imposed on the keys; at least in some languages hash/map objects are not ordered. Though I guess if it's only a matter of how it's displayed "pretty-print" style that could be stated without a problem. Note that json objects can have arbitrary numbers of space-type characters all over the place, so they won't be consistent as strings unless some standard style is asserted. ArthurPSmith (talk) 13:19, 5 August 2020 (UTC)[reply]
Good point, agreed. It shouldn't require it for the implementations, but it could define some canonical serialization (also to avoid semantic-free diffs in the wiki). --DVrandecic (WMF) (talk) 21:27, 5 August 2020 (UTC)
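
One possible shape for such a canonical serialization, sketched in Python. The ordering rule used here (Z1K1 first, then global keys by Z-number and K-number, then local keys) and the names `key_order`, `canonicalize` and `canonical_json` are assumptions made for illustration; the thread does not define a specific order. A plain string sort is avoided because it would place "Z142K1" before "Z1K1".

```python
import json
import re

KEY_PARTS = re.compile(r"(?:Z([0-9]+))?K([0-9]+)")


def key_order(key):
    """Sort key: Z1K1 first, then global keys numerically, then local keys."""
    if key == "Z1K1":
        return (0, 0, 0)
    match = KEY_PARTS.fullmatch(key)
    if match is None:
        return (3, 0, 0)  # unexpected key shapes go last (assumption)
    z_number, k_number = match.groups()
    if z_number is None:
        return (2, 0, int(k_number))  # local key such as "K1"
    return (1, int(z_number), int(k_number))


def canonicalize(value):
    """Return a copy of value with the keys of every record in canonical order."""
    if isinstance(value, dict):
        return {k: canonicalize(value[k]) for k in sorted(value, key=key_order)}
    if isinstance(value, list):
        return [canonicalize(item) for item in value]
    return value


def canonical_json(value):
    """Serialize with fixed key order and no optional whitespace."""
    return json.dumps(canonicalize(value), separators=(",", ":"), ensure_ascii=False)


a = json.loads('{"Z7K1": "Z142",   "Z142K2": "data", "Z1K1": "Z7", "Z142K1": "Wiki"}')
b = json.loads('{"Z1K1":"Z7","Z142K1":"Wiki","Z7K1":"Z142","Z142K2":"data"}')
print(canonical_json(a) == canonical_json(b))  # True
```

Implementations can still accept keys in any order; only the stored text is normalized, which is what keeps diffs free of changes that carry no meaning.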

"The goal is to provide an easy UX to allow the creation and manipulation of ZObjects through a wiki interface"

What "easy UX" are you referring to? Editing those JSON objects? Or is there a simpler editing UX envisioned and not described here?

There's nothing simple and accessible about giant JSON structures for non-programmers. Even if the wiki JSON editor warns them, they need to learn balancing brackets and curly braces, the difference between them, why a comma-separated list can't finish with a comma, etc. Short of picking XML, expecting people to learn writing code in that syntax has to be the worst introduction to programming one could ever have.

And seeing the Eiffel tower structures the most advanced examples shown here (which are not advanced at all!) already turn into, this choice of data structure is reminiscent of 90s XML horrors that is certain to drive away people with enough programming experience to actually understand the content.

In short, these JSON structures are complex enough to be a serious barrier to entry to non-programmers. Higher than actually learning programming, I'd say, because you're writing code inside a JSON tree structure which is designed to be readable by machines first, not humans. And they're inadequate enough to drive away real programmers. This is unfriendly to all potential contributor audiences.

Which drives me to the elephant in the room, that this document in no way addresses: Why do we need a wikilambda? Why do we need ZObjects? Like everything else I've seen about this project, the implementation is discussed as if it's the only way to solve whatever problems the overall project is trying to solve, without answering any of the fundamental questions about why this architecture makes sense. Once you have a collection of functions editable as a wiki, what problems does it solve that a software library couldn't? If the answer is "the community", then the complexity of this ZObject function model is certainly ensuring that you will have the smallest contribution base one could ever have for such a project. A subset of users who are both highly fluent in wiki contribution and computer science and are not driven away by an incredibly inadequate interface to programming.

--GDubuc (WMF) (talk) 07:42, 10 August 2020 (UTC)

@GDubuc (WMF): Here's a snapshot of the editing UI I put together for Denny's abstracttext prototype (based on Wikimedia software using the new Vue component capabilities).
[Image: Draft editing UI for ZObjects]
No, we're not expecting people to generally edit raw JSON. The point of this is to have a framework for completely language-independent software creation. If you change the user language to German (say), all the fields immediately are replaced by their German labels (if those exist). ArthurPSmith (talk) 18:51, 10 August 2020 (UTC)[reply]
If that UI is supposed to be an example of better UX than editing JSON, it fails spectacularly at it. You need some serious UX and audience research for this project before getting into these specifics. I don't see how the concepts introduced like ZObjects, the JSON storage, this UI layer on top of it are in any way accessible to a wider audience than a classic open source project would be. This is so unusable and impossible to comprehend, it couldn't be further away from the proclaimed goals of making something more user-friendly. Outside UX, usability and audience research is desperately needed before going full steam ahead with this architecture. There is still time to go back to the drawing board and take a good look at the desired audience and the reality of the current design. As for making programming multilingual, it has been tried many times over the last 50 years and has never caught on. The diversity in contributors to the FLOSS world should be demonstration enough that English keywords are not a barrier to entry to programming. You're basically trying to solve visual programming and multilingual programming, endeavours that have failed repeatedly in the past, on top of what the project is actually trying to do, which is to have a repository of language generation functions. This is destined for failure, one hard problem is enough, no need to pile on 2 other ones that are strictly unnecessary just for kicks, based on the bizarre highly subjective notion that these overly complex concepts and UIs introduced here are somehow more accessible than programming. --GDubuc (WMF) (talk) 11:46, 11 August 2020 (UTC)
@GDubuc (WMF): Well, it directly addresses your original concerns about brackets, braces and commas - the experimental draft UI here takes care of all of those for you. Sure, "serious UX and audience research" would be a great idea, let's do it! It's absolutely not intended to be polished at this point. The larger question of whether this is a good starting point or not is obviously a good one though and ought to be addressed first. Aside from language-independence, which I think is a highlight of the ZObject approach, the larger goal here is allowing wiki-based development of (specific types of) software. What makes wiki-editing work for natural language but more difficult for code? Or is it just that the software community prefers other tools? Git etc. track changes and attribution, but they're not as open for anyone to edit as wikis. ArthurPSmith (talk) 12:39, 11 August 2020 (UTC)
I don't think the software community "prefers" code. In 50+ years of its existence, about every visual programming idea you can imagine has been tried and it just doesn't work for general purpose programming. For editing natural language, WYSIWYG has clearly demonstrated its superiority and accessibility. Wikitext has made wiki contribution less accessible and the contributor base less diverse and more limited than it would have been if Wikipedia had started with a visual editor. Even with a decent visual editor now, we still don't have a diverse contributor base representative of the general population. Yes, learning to write code and use version control is less accessible than WYSIWYG wiki editing and would also limit the audience. But a WYSIWYG experience is not what's being proposed here at all! Everything I've seen here is shoving programming languages (who's going to write the python/JS/language of your choice snippets inside the function ZObjects? They sure need to know programming to do that part), shoved inside a JSON structure about as elegantly as the 90s let's-put-everything-in-XML disaster philosophy. And since that wasn't complex enough, a completely foreign UI has been slapped on top of it to make the JSON completely impossible to read or parse visually in any way. Given the density of information and the crazy tree structure that's been created here, you can never simplify this to an elegant UI. There is just too much information overhead. It's always going to be a monstrous giant tree that you can't even display on a single page for anything semi-complex. The amount of information overhead is just too high to make this accessible to anyone. Once again: the way this is designed is a complete distraction from the purpose, which is to collaboratively build a repository of programming functions to process and generate natural language. 
If you think that restricting such a collaborative project to a FLOSS community would limit its contributors base, this proposal (ZObjects/JSON/UI) is going to be a tiny subset of that. You still need to know or learn programming to make any meaningful contribution, yet none of the classic code collaboration tools, tutorials and resources are available to you to help with this task. And you need to learn completely foreign abstractions that will limit the contributor base to this project to a very tiny group of nerds with severe diversity issues. This attempt to widen the audience because FLOSS contribution is thought to be too inaccessible has the opposite effect, in a very severe way. I'm not saying that there isn't a third way, but this proposal completely fails at making things more accessible than a FLOSS code project. It narrows the potential audience greatly instead of widening it. And you can hope for better UIs as much as you want, the amount of information overhead introduced by ZObjects/JSON cannot be overcome. You have to display that overhead and make it editable somehow, which is why it's fundamentally flawed and the specifics of the prototype UI are irrelevant. --GDubuc (WMF) (talk) 14:32, 11 August 2020 (UTC)[reply]
@GDubuc (WMF): I don't think "collaboratively build a repository of programming functions to process and generate natural language" is a correct description of the purpose. Denny's background whitepapers on this (here for example) describe two categories of functions needed: "constructors", basically like Mediawiki templates - language independent units that can be put together to write a language-independent article (the "content"), and "renderers" which generate natural language from the sequence of "constructors" in a given piece of content, for any desired language. I can barely envision a "WYSIWYG" environment where a user could create new abstract content that is immediately rendered into their own language where existing constructors and renderers exist. But how would one manage it if there wasn't a renderer yet in their language for a given type of constructor, or if they needed to come up with a new constructor to create the content they wanted? There is inevitably a lot of "information overhead" going on here, because we're trying to create language-independent narrative text. It would probably be nice to have at least a bare-bones demonstration that this is workable, and some sort of evidence that the ZObject approach is helpful (or not). ArthurPSmith (talk) 20:11, 11 August 2020 (UTC)[reply]
I do share some of the concerns here. Wikidata, for example, is fairly friendly for newcomers, even those who have no idea of semantic web, ontology, triples or RDF, can quickly grasp the main idea behind statements and properties. Of course, the model of Abstract Wikipedia/Wikilambda would be much more complicated, but I think it is critical that, people can quickly and easily understand what a function is supposed to do from UI, at least for those who have some basic programming knowledge. The draft UI, however, just simply draws the JSON tree structure using web elements. Editing through this UI is not much different than editing raw ZObject/JSON, except it shows labels and takes care of potential syntax errors. But I would still feel like I have to program inside a JSON object, which is counterintuitive. To be clear, I'm not opposing the ZObject approach per se, but if we adopt this model, ZObject/JSON should only be used for the stored raw data. UI should be simple and decoupled with the raw format, and only be driven based on UX. Maybe Scratch-like programming interface? I think it would still be okay even if some advanced editing cannot be done through this UI. People can use it to construct and modify some simple functions, and more advanced users can use the fully-featured JSON-based editing UI instead. --Stevenliuyi (talk) 05:17, 7 September 2020 (UTC)[reply]

Make Z7 more uniform

Tracked in Phabricator: Task T266242

Currently, a function call is represented as follows (assume, Z142 is the concatenation function):

 Z1K1: Z7
 Z7K1: Z142
 Z142K1: "Wiki"
 Z142K2: "data"

If we use global keys, it would look like this:

 Z1K1: Z7
 Z7K1: Z142
 K1: "Wiki"
 K2: "data"

The local keys in this case get expanded against the Z7K1 value, not the Z1K1 value, as is the case for all other local keys. This makes it very different than all the other objects, and requires special handling.
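
The special case described above can be sketched in Python as follows. The function name `expand_local_keys` is hypothetical, and the sketch assumes that Z1K1 and Z7K1 hold plain string references such as "Z7" or "Z142"; it only handles the top level of a single object.

```python
def expand_local_keys(zobject):
    """Return a copy with top-level local keys (K1, K2, ...) turned into global keys."""
    if zobject.get("Z1K1") == "Z7":
        # Function call: local keys are expanded against the called function.
        prefix = zobject["Z7K1"]
    else:
        # Every other object: local keys are expanded against its type.
        prefix = zobject["Z1K1"]
    return {
        (prefix + key if key.startswith("K") else key): value
        for key, value in zobject.items()
    }


call = {"Z1K1": "Z7", "Z7K1": "Z142", "K1": "Wiki", "K2": "data"}
print(expand_local_keys(call))
# {'Z1K1': 'Z7', 'Z7K1': 'Z142', 'Z142K1': 'Wiki', 'Z142K2': 'data'}

number = {"Z1K1": "Z70", "K1": "2"}
print(expand_local_keys(number))
# {'Z1K1': 'Z70', 'Z70K1': '2'}
```

The extra branch for Z7 is the special handling in question: it is the only place where the prefix does not come from Z1K1.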

The suggestion is to change the representation of function calls and make them more unified compared to the other entries, i.e. like this:

 Z1K1: Z7
 Z7K1:
   Z1K1: Z142
   Z142K1: "Wiki"
   Z142K2: "data"

So, instead of pulling the values into the Z7 object, we basically instantiate a function just like any other type, and wrap it into a Z7 to say that this is a function call. This needs one extra object, but it leads to much more uniform handling of objects.

Any thoughts? --DVrandecic (WMF) (talk) 23:16, 14 October 2020 (UTC)

Why not have a Z7K2?
{
  "type": "function call",
  "identity": "concatenate",
  "args": [
    {
      "arg1": "Wiki",
      "arg2": "data"
    }
  ]
}
{
  "Z1K1": "Z7",
  "Z7K1": "Z142",
  "Z7K2": [
    {
      "Z142K1": "Wiki",
      "Z142K2": "data"
    }
  ]
}
--GrounderUK (talk) 11:04, 21 October 2020 (UTC)
Because in this case the embedded object is not a valid ZObject (lacking Z1K1).--GZWDer (talk) 11:06, 21 October 2020 (UTC)
I did have a Z1K1 in there originally but I wasn't sure whether it would be a Z3 or something else. I suppose one Z7/function call on a Z8/function differs from any other call in its arguments, so it's logical to refer to the arguments as Z3/keys. That would make a Z7 suspiciously similar to a Z4/type, which is no great surprise, I imagine. GrounderUK (talk) 16:48, 21 October 2020 (UTC)
It seems what's needed is a hash/map/dictionary ZObject type then? ArthurPSmith (talk) 20:15, 21 October 2020 (UTC)
I find it all rather confusing! I would expect a Z7/call to have a form that resembles a Z8/function but the Z8 is currently defined with just "K1" for its "arguments". Is there some reason why K1/arguments is not "Z8K1" (or "Z8K2", if "Z8K1" could be used for K5/identity)? Anyway, by analogy with Z8, I'd be looking for the equivalent of Z17/argument declaration, which would be Znn/argument (or whatever). That would give (very roughly, mostly omitting quotes) Z7K2/arguments:[{Z1K1/type:Znn/argument, ZnnK1/argument type:Z6/string, ZnnK2/key id:Z142K1, ZnnK3/value:"Wiki"}, {Z1K1/type:Znn/argument, ZnnK1/argument type:Z6/string, ZnnK2/key id:Z142K2, ZnnK3/value:"data"}]. (The ZnnK1/argument type appears to be redundant but is included for consistency; in contrast, I have omitted "label" as being a documentary component of Z8 that is not relevant to a Z7.)--GrounderUK (talk) 21:47, 21 October 2020 (UTC)[reply]

OK, I have been thinking more about this, and also reading the comments here, which have been very enlightening and sketched out a good solution (basically the one Adam hinted at in the email-thread too).

So the function call we used as a running example could look like this:

Z1K1: Z7
Z7K1: Z142
Z7K2:
  Z1K1: Z824(Z142)
  Z142K1: "Wiki"
  Z142K2: "data"

Z7K1 points to the Function to be invoked and has Z8/Function as its value type, so we can use Z142, which is a Function, as the value. Z7K2 points to the arguments. The type of Z7K2 is the result of applying the Z824 Function to the Function. Z824 basically takes a Z8/Function and returns a Z4/Type (basically turning the Z17/argument declarations to Z3/keys, mostly).

There's a little bit of repetition (Z142 is mentioned twice, as the value of Z7K1 as well as in the Z1K1 of the Z7K2), but that also allows us to break any potential recursion by materializing the result of Z824 (which might be necessary to express Z824(Z824) (or not, we'll see)).

I think that should solve this conundrum.

Again thanks everyone. If there are no complaints here or on-wiki, I will update the Function Model after the weekend. --DVrandecic (WMF) (talk) 01:08, 23 October 2020 (UTC)

I have no expertise in this field and my comprehension of the Function Model is probably no higher than 60%, despite reading some sections repeatedly and sometimes even carefully! I don't believe it is explicitly stated anywhere that a Z7 is restricted to Z8s that are Z2s; perhaps that is either obvious or untrue. I don't see how the proposed approach would work with a "transient function", if such a beast is permitted (as Z8/Functions states it is). For persistent functions, my only misgiving is the repetition of the Z8/Function's identifier. I think the rationale for that will need to be challenged very robustly. My thinking at the moment is that the evaluator of a Z7/Function call only really wants/needs a list of key–value pairs (with English labels [function:concatenate, string1:"Wiki", string2:"data"]), and that's pretty close to where many humans might begin ('concatenate "Wiki" and "data"!' or 'what's "Wiki" and "data" when concatenated?'...). That's rendering and parsing covered! Now, of course, we need to consider the abstract content, the normalized form of the Z7/Function call (which might be better labelled "evaluation", if we are more interested in the "magic" than in the casting of the spell). Here, I can only guess at what Z824 might deliver... But, since it clearly receives a Z9/Reference ("Z142"), it is not obvious to me that it should not deliver it straight back. Well... I didn't start out thinking this is where I would end up, but is this a viable alternative?
Z1K1: Z7
Z7K1:
  Z1K1: Z824(Z142)
  Z142K1: "Wiki"
  Z142K2: "data"
--GrounderUK (talk) 14:48, 23 October 2020 (UTC)
@DVrandecic (WMF): This looks reasonable, but I have one (potentially stupid) question – what is “Z824(Z142)” here? It looks like syntactic sugar for a function call, but wouldn’t that make the function call syntax recursive? (Maybe it’s generic and I haven’t looked into our generics enough yet.) --Lucas Werkmeister (talk) 21:46, 3 December 2020 (UTC)[reply]
There's something you've not seen (and the modeling using indented syntax is not equivalent): the value of args (Z7K1) is an ordered table, containing several types of invocations of the function, depending on parameters, to select an implementation; and Z824 here would be one of the implementations, I think: you can select the implementation that matches the input/output parameter types (but besides that there's no other selector, such as the language or target engine/server with its actual implementation).
The indented syntax just above strips one level, allowing a single implementation (indicated by Z1K1, it would be Z824 here, and taking the parameter types indicated by Z142); if I translate back your indented syntax to JSON, I get:
{
  "Z1K1": "Z7",
  "Z7K1": "Z142",
  "Z7K2": {
    "Z142K1": "Wiki",
    "Z142K2": "data"
  }
}
but not:
{
  "Z1K1": "Z7",
  "Z7K1": "Z142",
  "Z7K2": [
    {
      "Z142K1": "Wiki",
      "Z142K2": "data"
    }
  ]
}
verdy_p (talk) 22:44, 3 December 2020 (UTC)
@Lucas Werkmeister, Verdy p, and DVrandecic (WMF): Z824 would be a function and that does imply a recursion, but is that a problem? I think the important thing is to realise that the invocation and the evaluation of a function are quite distinct, but the expression "function call" can refer to either. The evaluation engine is supposed to receive a normalized JSON representation but it is not the responsibility of the calling function to provide this. Arguably, we want to minimize the content of the initial invocation and maximize the content immediately prior to evaluation. I suspect there is more than one function between the start and the end of this pipeline, the first of which is here identified as Z824. Whether it is helpful to make this function explicit, I rather doubt. Instead, going back to "something like LISP in JSON", the initial invocation of a persistent function could be little more than a list that identifies the function and supplies its arguments. Unlike LISP, the elements need not be ordered (assuming they are key–value pairs) and the function identifier might be a separate argument, but we should be able to construct a Z7 from a simple list of (function and) arguments like ["Z142", "Wiki", "data"]. The LISPy constructor function evaluates these arguments and returns a "proper" Z7 (however defined). A function like Z824 takes a "proper" Z7 (referencing a Z8) and, in effect, expands or resolves it into a valid transient function call (a "proper" Z7 embedding a valid transient function). The pre-evaluation normalization function (assumed to be common to transient and persistent functions) expands the function call to its final, maximal form. (In the case of a persistent function, this final, normalized form also includes a reference to a Z8, but perhaps not to the same Z8 as was referenced in the original function. I'm guessing a transient function must also ultimately reference a Z8, somehow.)--GrounderUK (talk) 14:40, 17 December 2020 (UTC)
Yes, I think that's right. Function calls are indeed both the invocation and the evaluation, and that's what's confusing me. But I think you are right, it is probably not a problem. Since the evaluation order does not matter (modulo termination), it should be consistent one way or the other. Thanks! --DVrandecic (WMF) (talk) 22:28, 17 December 2020 (UTC)
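
As a rough illustration of the "LISPy constructor" idea discussed in this thread, the following Python sketch builds a Z7 from a plain list such as ["Z142", "Wiki", "data"]. The name make_function_call is hypothetical, and mapping positional arguments to keys of the form <function ZID>K<n> is an assumption rather than something settled here. The sketch only rearranges the arguments; it does not evaluate them.

```python
def make_function_call(items):
    """Build a Z7-style object from [function_zid, arg1, arg2, ...].

    Assumption: the n-th positional argument is stored under the key
    "<function_zid>K<n>". Arguments are copied as-is, not evaluated.
    """
    if not items:
        raise ValueError("need at least a function identifier")
    function_zid, *args = items
    call = {"Z1K1": "Z7", "Z7K1": function_zid}
    for position, arg in enumerate(args, start=1):
        call[f"{function_zid}K{position}"] = arg
    return call


wikidata_call = make_function_call(["Z142", "Wiki", "data"])
```

Here wikidata_call holds the type, the function reference, and the two arguments under Z142K1 and Z142K2. Resolving the referenced Z8 and normalizing the call would be separate, later steps in the pipeline described above.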

ZObject definition

A ZObject consists of a list of Key/value pairs.

Every ZObject must have a key Z1K1/type with a value that evaluates to a Z4/Type

ZObject := String | List | Record

It seems that there is a slight inconsistency here. A ZObject is defined to be a string, list or record but I think it has a Z1K1/type and is a list of Key/value pairs only if it is a “record”. In general, the word “ZObject” is used on the main page to refer only to Z1s, so I think it’s the syntax box that needs fixing. Eliminating the word “record” and using “???ZObject” as a placeholder for a replacement term for “string, list or Z1/ZObject” (the expression used in ECMA-404 is “JSON value”, or just “value”, which I hesitate to adopt), I suggest it might be

== Syntax ==

The canonical representation of a ???ZObject is a subset of JSON. A well-formed ???ZObject has the following syntax:

???ZObject := (String | List | ZObject)
String  := "Character*" // to be specific, as in JSON / ECMA-404
List  := [ ( ???ZObject ( , ???ZObject )* )? ]
ZObject  := { "Z1K1": ???ZObject ( , "Key": ???ZObject )* }
Key  := ZNumber KNumber | KNumber
ZNumber := Z Number
KNumber := K Number
Number  := [1-9][0-9]* // a positive non-zero decimal integer

where:

  • bold characters are terminal symbols;
  • italic characters are used for non-terminal symbols;
  • ( ) are used to surround a group which may contain one or several alternatives separated by |;
  • a group can be repeated using * which means repeat 0..n times, or + which means repeat 1..n times;
  • a group followed by ? is optional;
  • whitespaces can be used as in JSON.

That results in the subset of JSON without numbers, null, booleans, and with limited keys.

In order to be well-formed, the Z1K1 key must have a value that may evaluate to a Z4 Type, i.e. either is a well-formed literal of type Z4, a reference, or a function call.

I rather doubt that a Z1K1 can be a list, so it might be better to say “ { "Z1K1": (string | ZObject)”, but presumably it can only be a string if that string is a Z9/Reference (whether or not the referenced Z2 exists), a ZNumber... and now we’re back to the final sentence.--GrounderUK (talk) 11:43, 8 March 2021 (UTC)
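
To make the proposed grammar concrete, here is a minimal Python sketch of a well-formedness check over parsed JSON. The names KEY_PATTERN and is_well_formed are hypothetical. Because JSON objects are unordered, the sketch only requires that "Z1K1" is present rather than first, and it does not attempt the semantic check that Z1K1 evaluates to a Z4/Type.

```python
import json
import re

# Key := ZNumber KNumber | KNumber, with Number := [1-9][0-9]*
KEY_PATTERN = re.compile(r"(Z[1-9][0-9]*)?K[1-9][0-9]*")


def is_well_formed(value):
    """Check a parsed JSON value against the proposed syntax (syntax only)."""
    if isinstance(value, str):
        return True
    if isinstance(value, list):
        return all(is_well_formed(item) for item in value)
    if isinstance(value, dict):
        if "Z1K1" not in value:
            return False
        return all(
            KEY_PATTERN.fullmatch(key) is not None and is_well_formed(item)
            for key, item in value.items()
        )
    # Numbers, booleans and null are outside the subset.
    return False


example = json.loads('{"Z1K1": "Z7", "Z7K1": "Z142", "K1": ["Wiki", "data"]}')
example_ok = is_well_formed(example)
```

A value such as {"Z1K1": "Z6", "Z6K1": 2} is rejected because of the number, and {"Z7K1": "Z142"} is rejected because it lacks a Z1K1.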

There is a bit of ambiguity, sorry for that, regarding the term ZObject as a syntactic element, as described in the Syntax section and the box you quote, and a ZObject semantically, as in the rest of the document. So, indeed, a ZObject can be represented by a String, a List, or a Record, in JSON. Syntactically, only a Record would have an explicit Z1K1. Semantically, all of them do (if it is a String, the Z1K1 is Z6/String or Z9/Reference, depending on the string, if it is a List, it is Z10/List of the type of elements).
The more of the semantic restrictions we try to pull into the syntax, the messier the grammar gets. I decided to make the cut as it is, but really it is a bit arbitrary and it could have tried to pull in more semantics, or even less.
The easiest way to fix that would be to not overload ZObject in the Syntax section to also mean the syntactic representation of a ZObject, but I can't think of a good name. ZObjectRepresentation is a bit of a mouthful :), but it would reduce the ambiguity that you have found here. DVrandecic (WMF) (talk) 20:50, 10 March 2021 (UTC)
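
The distinction between the syntactic representation and the semantic Z1K1 can be sketched in Python as follows. The function name implied_type is hypothetical, and treating any string that looks like a ZID (for example "Z142") as a Z9/Reference is an assumption about what "depending on the string" means. For lists the sketch returns only "Z10" and leaves out the element type.

```python
import re

# Assumption: a string shaped like a ZID is read as a reference.
REFERENCE_PATTERN = re.compile(r"Z[1-9][0-9]*")


def implied_type(value):
    """Return the Z1K1 a parsed JSON representation stands for."""
    if isinstance(value, str):
        return "Z9" if REFERENCE_PATTERN.fullmatch(value) else "Z6"
    if isinstance(value, list):
        return "Z10"  # element type omitted in this sketch
    if isinstance(value, dict):
        return value["Z1K1"]  # only a record carries an explicit Z1K1
    raise TypeError("not a String, List or Record")
```

Under these assumptions, "Wiki" is a Z6/String, "Z142" is a Z9/Reference, and a record reports whatever its own Z1K1 says.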
@DVrandecic (WMF): Thank you for your response; I think we disagree. My view is that the string "ZObject" should never be used to refer to anything that is not represented as a JSON object (and, more specifically, has a "Z1K1"). Perhaps I could insert notes on the main page, where you have used the string in the broader sense? For example,
  • Every value in a Key/value pair is a ZObject.[demo 1]

References

  1. That is, a String, a List or a Z1/ZObject.

--GrounderUK (talk) 14:19, 11 March 2021 (UTC)

Yes, that sounds like a good way forward. And once they are identified, we/I can go through them. This way we disentangle what I think of as "ZObjects, semantically" and "syntactic representations of ZObjects", and we can identify whether there is substantial disagreement. Thanks! --DVrandecic (WMF) (talk) 18:21, 11 March 2021 (UTC)
@DVrandecic (WMF): In the end I annotated every use of “ZObject” outside of the Syntax box. I deliberately avoided a default assumption, so I must have had my reasons, at the time, for tagging the use in the way that I did. That said, I did change my mind in a few cases and, on reflection, it might have been helpful to identify these separately. I can probably identify most of them, but I’ll leave it for now. It was mainly (only?) in the (de-)serialization/normalization area, where I may have been confused about the level of the object referred to: whether it is the whole object or the objects within it (including, ultimately, a “simple” string). Enjoy! --GrounderUK (talk) 11:14, 15 March 2021 (UTC)

Function types

The final sentence in the Function types section seems incorrect:

The validator on Z4/Function ensures that the types as given on the Z7/function call to Z8/Function are consistent with the types given on the Z4K2/key declaration.

I tried to fix this but it became too complicated. We must be talking about validating the Z8/Function, but that doesn’t have a Z4K2/keys until after its generic type is returned by the function call. So I would expect that we would want to validate that types listed in the Z8K2 [here] are consistent with the argument types in the Z8/Function’s own Z17/argument declarations. We also want to validate the Z8K1 [here] against the function’s own return type. Currently, Z8K1 is “arguments” and Z8K2 is “return type”, but I was aiming for internal consistency first.

The validator for a Z8/Function ensures that the argument types as given in its Z8K2/arguments to the Z7/function call to Z8/Function are consistent with the Z17K1/argument types in the function’s own Z8K2/arguments. The Z8K1/return type in the Z7/function call must also be the same as the function’s own Z8K1/return type.

When a few other changes are made to the rest of the document, this would become:

The validator for a Z8/Function ensures that the [argument] types as given in its Z8K1/arguments to the Z7/function call to Z8/Function are consistent with the Z17K1/[argument] types in the function’s own Z8K1/arguments. The Z8K2/return type in the Z7/function call must also be the same as the function’s own Z8K2/return type.

Here, I leave open the question of whether the English label for Z17K1 should be “argument type” (as on the main page) or just “type” (as currently defined). GrounderUK (talk) 10:52, 11 August 2022 (UTC)

You are right, something is off with that text. Ah, the whole idea of function types, we never implemented it. I need to figure out how to move forward with this. I think it would be good to go through the whole function model and make sure that it is actually current and true to the implementation. --DVrandecic (WMF) (talk) 18:44, 14 July 2023 (UTC)
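
Purely as an illustration of the reworded rule proposed in this thread (using the later numbering, with Z8K1 for arguments and Z8K2 for return type), a validator could look roughly like the Python sketch below. Since function types were never implemented, everything here is hypothetical: the name function_type_errors, the assumption that the function's Z1K1 holds the Z7/function call to Z8/Function, and the reading of "consistent" as plain equality.

```python
def function_type_errors(function):
    """List mismatches between a function and its declared function type.

    Assumptions: function["Z1K1"] is a Z7 call to Z8 whose Z8K1 is a list
    of argument types and whose Z8K2 is the return type; the function's
    own Z8K1 is a list of Z17 argument declarations.
    """
    declared = function["Z1K1"]
    own_argument_types = [argument["Z17K1"] for argument in function["Z8K1"]]
    errors = []
    if declared["Z8K1"] != own_argument_types:
        errors.append("argument types differ")
    if declared["Z8K2"] != function["Z8K2"]:
        errors.append("return type differs")
    return errors


concatenate = {
    "Z1K1": {"Z1K1": "Z7", "Z7K1": "Z8", "Z8K1": ["Z6", "Z6"], "Z8K2": "Z6"},
    "Z8K1": [
        {"Z1K1": "Z17", "Z17K1": "Z6", "Z17K2": "Z142K1"},
        {"Z1K1": "Z17", "Z17K1": "Z6", "Z17K2": "Z142K2"},
    ],
    "Z8K2": "Z6",
}
concatenate_errors = function_type_errors(concatenate)
```

For this example the two string arguments and the string return type agree with the declared type, so the list of errors is empty.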