Talk:Abstract Wikipedia/Error messages on ZObjects

From Meta, a Wikimedia project coordination wiki

Z2/Persistent object - State

I outlined an argument elsewhere that a Z2/Persistent object should be allowed to exist in the main namespace as a "draft". The proposal in the content page here suggests that "invalid" (Z2) objects should be allowed to remain, and I agree. In practice, the software may not need to distinguish between "drafts" and other defective objects. One possible exception is when the draft is not, in a formal sense, "defective". In such a case, we need the "state" of the object to indicate that it is still a draft. There are probably other useful states like "deprecated" or "unavailable" (blocked). Validations against the object's state could prevent the use of an object that is "draft" or "unavailable" whilst allowing the use of a "deprecated" object, for example. That said, it might be valid for a draft object to use another draft object, but not valid for it to use a deprecated object so, yes, this can get quite complicated. And that's before considering how the (true) state of an object is affected by the states of the other objects on which it ultimately depends.

Referring back to the content page:

3. Everything stored in the wiki is guaranteed to be well-formed and canonicalized.
4. Everything stored in the wiki is guaranteed to follow the inalienable truths; these are hardcoded in the wiki and cannot be changed on wiki. These include that every object must be a Z2/Persistent object, that every Z2 has labels, etc.
  • Does 3 relate only to Syntax in the Function model? If so, my only concern is around the need for a Z1K1/type (referring back to my argument elsewhere). Could a draft object's Z1K1 have a value of "Z2" or "Z8", for example?
  • What are the other "inalienable truths" referred to by 4's "etc"? And is the "labels" constraint satisfied by a single label?

--GrounderUK (talk) 13:45, 10 December 2020 (UTC)

Thanks! These are great points, and I think you are right, the idea of having objects be in a draft mode is a good idea, particularly when objects start getting more complicated. I fully agree with that.
An invalid object should not be evaluated (or rather, its evaluation will result in an error), but that's OK. But basically, what this suggests is that it should be OK to save an object in an invalid state.
I think that the conditions 3 and 4 should still remain, but validation is on top of that. More discussion about this is on phabricator:T269182. I will link from there to this discussion too, so that we keep this in mind.
Regarding your questions:
  • yes, 3 relates only to the syntax. This is a very low bar.
  • we should write down the inalienable truths! These are envisioned to be a small set of core rules that remove what would otherwise be a need for recursive validation. E.g. the structure of a Z6/String or a Z9/Reference, etc. The task for these is here: phabricator:T260314.
Thank you! I agree on the need for drafty or invalid objects to be stored. Let's see if the others agree too :) --DVrandecic (WMF) (talk) 20:42, 11 December 2020 (UTC)
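
For illustration only, the state rules floated in this thread could be sketched roughly as below. This is not an agreed design: the `State` enum, the `ACTIVE` state (a stand-in for an ordinary, usable object) and the `may_use` function are all hypothetical names. The rules encoded are the ones suggested above: draft and unavailable objects cannot be used by ordinary objects, deprecated objects can, and a draft may use another draft but not a deprecated object.

```python
from enum import Enum


class State(Enum):
    """Hypothetical object states; only the quoted names come from the thread."""
    DRAFT = "draft"
    ACTIVE = "active"            # assumed name for a normal, usable object
    DEPRECATED = "deprecated"
    UNAVAILABLE = "unavailable"  # e.g. blocked


def may_use(user: State, used: State) -> bool:
    """Return True if an object in state `user` may use an object in state `used`."""
    if used is State.UNAVAILABLE:
        return False                     # nothing may use an unavailable object
    if used is State.DRAFT:
        return user is State.DRAFT       # only another draft may use a draft
    if used is State.DEPRECATED:
        return user is not State.DRAFT   # drafts should not start out on deprecated objects
    return True                          # ACTIVE objects may be used by anything


print(may_use(State.DRAFT, State.DRAFT))        # True
print(may_use(State.ACTIVE, State.DRAFT))       # False
print(may_use(State.ACTIVE, State.DEPRECATED))  # True
```

The sketch deliberately ignores the harder question raised above, namely how an object's effective state depends on the states of everything it ultimately relies on.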

Validation in the store API

I feel like point 9 – But once the API gets invoked to store, we will do so even in face of validation errors — but not in face of well-formedness errors or errors against inalienable truths. – is too lax. I understand that endpoints to read/present objects will need to allow potential validation errors, but I don’t think it’s too restrictive to enforce valid input (according to the current rules) when saving new revisions. Maybe allowing validation errors will be necessary occasionally to break some kind of dependency cycle, but at least it shouldn’t be the default behavior of the API, in my opinion.

From a practical standpoint, I’m worried that if the save API doesn’t do validation by default, then most external tools using the API will have to call some separate validation API beforehand to ensure they don’t introduce accidental validation errors, and the majority of invalid saves will not be intentional, but accidents from tools that don’t do that check properly. --Lucas Werkmeister (talk) 22:22, 1 February 2021 (UTC)

@Lucas Werkmeister: that's a great point, and I agree. If the validation is not too costly, we should probably validate everything that comes in against the current state, just to avoid the unintentional cases that you describe.
I think the reason why I was thinking it would be good to store invalid objects would be for drafting an object, or if someone has trouble writing a specific object, and wants to store it and ask others for help. So far so good, that should be possible, but that's the exception. Maybe having a checkbox for manual overriding of the validation is a better solution for that situation.
Thanks for the point! This week I want to get to write down more about validation, so that is very timely. I think you are right. --DVrandecic (WMF) (talk) 23:14, 1 February 2021 (UTC)
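
To make the behaviour discussed so far concrete, here is a minimal sketch of a store call that validates by default and only accepts an invalid object when the caller explicitly overrides, while well-formedness failures are always rejected. Everything here is an assumption for illustration: `store`, `allow_invalid`, `StoreRejected`, `StoreResult`, the toy `needs_value` validator and the `"value"` key are hypothetical names, not the actual API, and nothing is actually persisted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A validator takes an object and returns a list of error messages (empty if fine).
Validator = Callable[[Dict], List[str]]


class StoreRejected(Exception):
    """Raised when the (hypothetical) store refuses to save an object."""


@dataclass
class StoreResult:
    stored: bool
    validation_errors: List[str] = field(default_factory=list)


def is_well_formed(zobject: object) -> bool:
    # Stand-in for the real syntax check: a mapping that carries a Z1K1 type.
    return isinstance(zobject, dict) and isinstance(zobject.get("Z1K1"), str)


def store(zobject: Dict, validators: List[Validator], *,
          allow_invalid: bool = False) -> StoreResult:
    # Well-formedness is never negotiable, override or not.
    if not is_well_formed(zobject):
        raise StoreRejected("not well-formed")
    # Validation runs on every save, so tools cannot skip it by accident.
    errors = [msg for check in validators for msg in check(zobject)]
    if errors and not allow_invalid:
        raise StoreRejected("validation failed: " + "; ".join(errors))
    # With the override, the object is accepted and its known errors reported.
    return StoreResult(stored=True, validation_errors=errors)


def needs_value(zobject: Dict) -> List[str]:
    # Toy validator for the example; "value" is a made-up key.
    return [] if zobject.get("value") else ["value is empty"]


draft = {"Z1K1": "Z2", "value": ""}
result = store(draft, [needs_value], allow_invalid=True)
print(result.validation_errors)  # ['value is empty']
```

With this shape, the "checkbox for manual overriding" corresponds to passing `allow_invalid=True`; a tool that forgets about validation gets a rejection instead of silently saving an invalid object.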
I think we need to distinguish between an "invalid" object, which has errors that have been identified, and an "unvalidated" object, which may or may not have errors that validation would uncover. And if validation sometimes starts but does not continue to its natural conclusion, we will have "partially unvalidated invalid" objects (other errors may be present) and "incompletely validated" objects (no errors have been found yet). We could also have "invalidated" objects, which were valid once but no longer are. --GrounderUK (talk) 01:25, 2 February 2021 (UTC)
Yes, unfortunately, all of this is possible. For now I hope that validation can be done quickly enough to be performed on the fly. But indeed - since the validators can be changed after instances have been created, a change can turn valid objects into invalid ones and the other way around, and many other things.
That seems to be mostly the same problem as when we are using *any* result from an evaluation anywhere, though. That result may change. We're talking with the Architecture team to figure out what a possible solution for that could be, but yes, you have very correctly identified a hard problem we are going to have.
Thanks to ZIDs, we solved one of the two hard problems of computer science - the other one is going to hit even harder, though :) --DVrandecic (WMF) (talk) 00:33, 4 February 2021 (UTC)
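
As a rough summary of the validation statuses distinguished in this thread, the sketch below maps the outcome of a validation run onto those names. It is only an illustration under assumptions: `ValidationStatus`, `classify` and its parameters are hypothetical, and `VALID` is added as the obvious remaining case.

```python
from enum import Enum
from typing import Optional


class ValidationStatus(Enum):
    UNVALIDATED = "unvalidated"                        # validation never ran
    VALID = "valid"                                    # ran to completion, no errors
    INVALID = "invalid"                                # ran to completion, errors found
    PARTIALLY_UNVALIDATED_INVALID = "partially unvalidated invalid"  # stopped early, errors found
    INCOMPLETELY_VALIDATED = "incompletely validated"  # stopped early, no errors yet
    INVALIDATED = "invalidated"                        # was valid once, no longer is


def classify(errors_found: Optional[int], completed: bool,
             previously_valid: bool = False) -> ValidationStatus:
    """Classify an object from the outcome of its latest validation run.

    errors_found is None if validation has never been run on the object.
    """
    if errors_found is None:
        return ValidationStatus.UNVALIDATED
    if errors_found > 0:
        if not completed:
            return ValidationStatus.PARTIALLY_UNVALIDATED_INVALID
        # A completed run with errors: distinguish objects that used to pass,
        # e.g. because a validator changed after the object was created.
        return (ValidationStatus.INVALIDATED if previously_valid
                else ValidationStatus.INVALID)
    return (ValidationStatus.VALID if completed
            else ValidationStatus.INCOMPLETELY_VALIDATED)


print(classify(None, completed=False).value)                      # unvalidated
print(classify(2, completed=False).value)                         # partially unvalidated invalid
print(classify(1, completed=True, previously_valid=True).value)   # invalidated
```

Because validators can change after an object is saved, any stored status of this kind would be a cached result that can go stale, which is the same caching problem noted above.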