User:IMSoP/ext-markup

From Meta, a Wikimedia project coordination wiki

Below is a table showing some alternative thoughts for the "inclusions"/"extensions" section of Lee Crocker's Wiki markup proposal. Basically, I've tried to look at what would happen if you stuck rigidly to either "one character declares inclusion class" or "inclusion class is named in full". The exact words and characters suggested may not be ideal, but thinking through all of them at once leads to some interesting thoughts.

The advantages of symbolic markup include:

  • doesn't require internationalisation
  • is quicker to type

However:

  • not as extensible
  • the more classes you represent this way, the more arbitrary the characters become
    • and, relatedly, the harder it is to learn all the different types

[I was going to mention how each breaks the flow of the surrounding text, but couldn't decide a) which was "more distracting", and b) whether this would be a positive or negative attribute, given that the contents of an inclusion tag generally aren't part of the main text flow.]

Of course, it's not really necessary to pick just one "style" of tag; in fact, it's probably best to select the few classes which are most commonly used and create symbolic markup for those, and use verbose versions for everything else. That way, the extensibility of the syntax is retained, and the few symbolic forms picked become easier to select (for the developer) and learn (for the user).
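To make the mixed approach concrete, here is a rough Python sketch of how a parser might resolve a tag: a small lookup table covers the handful of classes that get a one-character form, and anything else is read as a verbose class name. The function name parse_tag, the SIGILS table, and the choice of which four classes get a symbol are all assumptions made for illustration; nothing here is part of the proposal itself.

```python
# Hypothetical sigil table: only a few common classes get a one-character form.
SIGILS = {
    "-": "comment",
    "+": "include",
    "#": "image",
    "$": "param",
}


def parse_tag(tag):
    """Split a single-line '<<...>>' tag into (class_name, argument_string)."""
    if not (tag.startswith("<<") and tag.endswith(">>")):
        raise ValueError("not an inclusion tag: %r" % tag)
    body = tag[2:-2]
    # Symbolic form: the first character names the class.
    if body and body[0] in SIGILS:
        return SIGILS[body[0]], body[1:].strip()
    # Verbose form: the first word names the class, so new classes
    # can be added without inventing another symbol.
    name, _, args = body.partition(" ")
    return name, args.strip()
```

With this sketch, parse_tag("<<#foo>>") and parse_tag("<<image foo>>") both come out as ("image", "foo"), while an extension such as <<math foo>> needs no entry in the table at all.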

[Note that in each of the examples below, foo represents zero or more arguments.]

Inclusion type | Verbose markup | Symbolic markup | Comments
Invisible comment | <<comment foo>> | <<-foo>> or <<!foo>> | the "!" form may seem more natural to those familiar with HTML
Raw text | <<raw foo>> | <</foo>> |
Template | <<include foo>> | <<+foo>> |
Image | <<image foo>> | <<#foo>> |
Template parameter | <<param foo>> | <<$foo>> | "$" being recognisable to programmers as "variable"
Global variable ("currentmonth", "localserver", etc) | <<var foo>> | <<?foo>> | Some of these are more like functions, so would take additional parameters (e.g. {{localurl:bar}} would become <<var localurl bar>> or <<?localurl bar>>). Treating these as one class, however, saves polluting the inclusion/extension namespace with a single-use class for every variable or function of this sort (something I rather dislike about the current usage).
Math[s] | <<math foo>> | <<=foo>> | this is probably best thought of as one of an unbounded set of extensions, and therefore left verbose (or see below)
Other extensions | <<extension foo>> | <<@extension foo>> or <<!extension foo>> | if an all-symbolic system was used, perhaps all user extensions ("hiero", "easytimeline", etc, and probably also "math") could be put in their own namespace for consistency - i.e. so that <<foo>> is unambiguously incorrect. Thus <math>foo</math> would instead be <<@math foo>>; <<foo>> could then either disappear (i.e. be a comment) or be rendered as-is (i.e. be invalid). ["!" is familiar to *nix hackers as "execute this", but we might want it for comments]
Multi-line markers | <<begin class>>, then foo, then <<end>>, each on its own line | <<class...>>, then foo, then <<.>>, each on its own line | note that the symbolic form of this could be used in conjunction with other symbolic forms (e.g. <<#...>>) or with verbose forms (e.g. <<image...>>)
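
The multi-line markers above could be picked out with a simple line-by-line scan. The following Python sketch is only one possible reading: the name collect_blocks is made up, nesting is not handled, and either closing marker is accepted for either opening form, which the proposal does not actually specify.

```python
import re

# Opening markers, each assumed to sit on a line of its own.
OPEN_VERBOSE = re.compile(r"^<<begin (\S+)>>$")   # <<begin class>>
OPEN_SYMBOLIC = re.compile(r"^<<(.+?)\.\.\.>>$")  # <<class...>> or <<#...>>
CLOSERS = {"<<end>>", "<<.>>"}


def collect_blocks(text):
    """Return a list of (opener, body) pairs for multi-line inclusions."""
    blocks = []
    opener = None  # class name or symbol of the block currently open
    body = []
    for line in text.splitlines():
        stripped = line.strip()
        if opener is None:
            match = OPEN_VERBOSE.match(stripped) or OPEN_SYMBOLIC.match(stripped)
            if match:
                opener = match.group(1)
                body = []
            # Lines outside any block are ordinary text and are skipped here.
        elif stripped in CLOSERS:
            blocks.append((opener, "\n".join(body)))
            opener = None
        else:
            body.append(line)
    return blocks
```

Because the symbolic opener keeps whatever precedes the "...", a block opened with <<#...>> is reported with the opener "#", which could then be resolved to "image" by whatever lookup handles the single-line symbolic tags.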

An additional thought about arguments

Some people writing extensions have asked for the extension tags to be able to take arguments in an XML-ish way, since they are (or at least look like) XML elements (I think there's a bug open on it somewhere). Obviously, <foo bar="baz">quux</foo> could also be expressed as <foo>bar=baz||quux</foo> or some similar contrivance internal to the extension, but for something like <math format="mathml">...</math>, that does seem rather inelegant. And thinking about it, a similar difference applies to the caption part of image syntax as opposed to everything else: the other arguments might be plain strings (e.g. alt=some alternate text), but the caption is a whole paragraph of rendered wikitext. So, the question is:

  • should there be two types of argument? one for parameters, and one for "primary input" (analogous to command-line switches and standard input, I guess)
  • or should there be a standardised separator for multiple arguments, which "goes with" the syntax so that it's obvious that these are multiple separate parameters and not one line of input?
  • what would either of those look like, in practice? <<class param1=x||param2=y||foo>> might be good enough for the second I guess. The "||" rather than "|" makes the separation clearer, while leaving spaces free to be used inside the construct - perhaps a good idea for [[http://example.com||external links]] too?
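
As a rough illustration of the <<class param1=x||param2=y||foo>> idea, the Python sketch below splits the inside of a tag on "||" and treats the final segment as the "primary input", with every earlier segment read as a name=value parameter. That last-segment rule is an assumption made here to keep the example unambiguous (it lets a caption contain "=" safely); the function name split_arguments is likewise hypothetical.

```python
def split_arguments(body):
    """Split 'class a=x||b=y||foo' into (class_name, params, primary_input).

    Assumed convention: the last '||'-separated segment is always the
    primary input; all earlier segments are name=value parameters.
    """
    name, _, rest = body.partition(" ")
    *param_parts, primary = rest.split("||")
    params = {}
    for part in param_parts:
        key, _, value = part.partition("=")
        params[key.strip()] = value.strip()
    return name, params, primary.strip()
```

For example, split_arguments("image alt=some alternate text||A caption where x=y") keeps the whole caption as primary input and yields a single "alt" parameter. The cost of this convention is that a tag with parameters but no primary input would need a trailing "||" to mark the input as empty.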

(I'm putting this here by way of a "scratchpad" - it's 3AM, I've been up since 6AM, and I need to go to sleep now!)