Help:String functions

From Meta, a Wikimedia project coordination wiki

Extension

Various string functions are part of Extension:ParserFunctions.[1] However, this part is disabled on Wikimedia wikis, with $wgPFEnableStringFunctions = false. See also bugzilla:6455 and bugzilla:19298. The following description is based on these functions being DISABLED.

Concatenation

Concatenation is done by juxtaposition.

The expansion of the concatenation of two balanced wikitexts is equal to the concatenation of the two expanded wikitexts.

However, rendering is not preserved under concatenation: for example, the separately rendered wikitexts '''bo and ld''' do not look the same as the rendering of the concatenated wikitext '''bold'''.

Trimming

Trimming is the removal of newlines and spaces from the start and the end of a string.

Automatically trimmed are:

Thus, if trimming is desired, this is in many cases conveniently performed together with another operation one wants to perform. To just trim string S, one applies a template or parser function such that this does not affect the string, apart from the trimming, e.g.:

  • {{1x|1=S}} (using Template:1x containing {{{1}}})
  • {{#if:x|S}}
  • {{#switch:|S}}
  • {{padleft:S}}
  • {{padright:S}}

Equality

For equality there are #ifeq: and #if:. To force a string-based comparison with #ifeq:, add a non-numerical character to both compared arguments.

Note that this prevents the trimming at the side at which the character is added. If desired, trim the arguments first. If trimming is not desired, put a non-numerical character before and after both strings, e.g. put them in quotes.
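The effect of adding a non-numerical character can be illustrated with a rough Python model of #ifeq:. The function names `ifeq` and `_as_number` are hypothetical, and using Python's `float()` as the test for "numerical" is an assumption of this sketch; the numeric check used by MediaWiki differs in detail.

```python
def _as_number(text):
    """Return text as a float, or None if it does not look numeric (assumption)."""
    try:
        return float(text)
    except ValueError:
        return None


def ifeq(a, b):
    """Rough model of {{#ifeq:a|b|...}}: numeric comparison if possible, else string."""
    # Parser function arguments are trimmed before the comparison.
    a, b = a.strip(" \n"), b.strip(" \n")
    x, y = _as_number(a), _as_number(b)
    if x is not None and y is not None:
        return x == y  # both look numeric: compared as numbers
    return a == b      # otherwise compared as strings


print(ifeq("01", "1"))      # True: compared as numbers
print(ifeq('"01"', '"1"'))  # False: the quotes force a string comparison
print(ifeq(" a", "a"))      # True: leading space is trimmed
print(ifeq('" a"', '"a"'))  # False: the quotes protect the space from trimming
```

The last two calls show the side effect described above: a character added at one side prevents trimming at that side.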

Case

See mw:Help:Magic words#Formatting.

Padleft

The following applies to trimmed strings S and P, and n = 0, 1, 2, ... Note that the result is not trimmed: if S is empty the result can end with spaces and/or newlines.

{{padleft:S|n|P}} pads a given string S on the left with a padding string P (default: "0"), to increase the length (where a newline counts as one character) to min(n,500), or if length (S) ≥ min(n,500) it just returns S. The padding string P is zero or more times repeated, with finally possibly a truncated P. If P is empty, S is returned.

Properties:

  • length ( {{padleft:S|n|P}} ) = if P is non-empty then max ( min(n,500), length (S) ) else length (S)
  • {{padleft:S|n|P}} is equal to S if and only if length (S) ≥ min(n,500) or P is empty.
  • {{padleft:S|n|P}} is equal to P if and only if [length (P) = max ( min(n,500), length (S) ) or 0] and S is equal to a substring of P ending at the end of P (i.e., P is the concatenation RS of some string R with S).

An important special case is that where S is empty: {{padleft:|n|P}} produces a string of length min(n,500) consisting of zero or more repetitions of P, with finally possibly a truncated P. In particular, if P has a length greater than min(n,500), this is the truncation of P to min(n,500) characters, i.e., the substring of P of min(n,500) characters starting at the start of P.

The above-mentioned properties reduce to:

  • length ( {{padleft:|n|P}} ) = if P is non-empty then min(n,500) else 0
  • {{padleft:|n|P}} is equal to P if and only if length (P) = min(n,500)

Examples:

  • "{{padleft:|0|ab}}" gives ""
  • "{{padleft:|1|ab}}" gives "a" (truncation)
  • "{{padleft:|2|ab}}" gives "ab"
  • "{{padleft:|3|ab}}" gives "aba"
  • "{{padleft:|4|ab}}" gives "abab"
  • "{{padleft:|5|ab}}" gives "ababa"
  • "{{padleft:|3| a  b }}" gives "a " (the parameter is trimmed, the result is not)
  • "{{padleft:1|0|ab}}" gives "1"
  • "{{padleft:1|1|ab}}" gives "1"
  • "{{padleft:1|2|ab}}" gives "a1"
  • "{{padleft:1|3|ab}}" gives "ab1"
  • "{{padleft:1|4|ab}}" gives "aba1"
  • "{{padleft:1|5}}" gives "00001"
  • "{{padleft:1|5|}}" gives "1"
  • "{{padleft:1|5| }}" gives "1"

"{{padleft:a

b|5}}"
gives "0a

b"

  • "{{padleft:é|5}}" gives "0000é"
  • {{padleft:|2|14:38}} gives 14 (hour; not needed for the current time, because there is also the variable {{CURRENTHOUR}}; see also the #time function)
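The rules and examples above can be summarized in a small Python model. This is only a sketch of the behaviour as described on this page, not MediaWiki's actual code; the names `trim`, `pad_left` and `MAX_LEN` are hypothetical.

```python
MAX_LEN = 500  # cap on the target length, as stated in the description above


def trim(text):
    """Remove spaces and newlines from both ends (see Trimming)."""
    return text.strip(" \n")


def pad_left(s, n, p="0"):
    """Hypothetical model of {{padleft:S|n|P}}."""
    s, p = trim(s), trim(p)  # the parameters are trimmed, the result is not
    target = min(n, MAX_LEN)
    if not p or len(s) >= target:
        return s  # nothing to add: S is returned unchanged
    fill = target - len(s)
    # Repeat P often enough, then cut the repetition to exactly `fill` characters.
    padding = (p * (fill // len(p) + 1))[:fill]
    return padding + s


print(repr(pad_left("", 5, "ab")))    # 'ababa'
print(repr(pad_left("1", 4, "ab")))   # 'aba1'
print(repr(pad_left("1", 5)))         # '00001'
print(repr(pad_left("1", 5, " ")))    # '1'
print(repr(pad_left("", 2, "14:38"))) # '14'
```

Because a Python string counts one unit per character, including a newline or "é", the lengths in this model agree with the examples above.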

Maximum string length

If characters of P are in the result (P is not empty and the string is not already longer than the required length and not longer than 500) the maximum length of the resulting string is 500:

  • "{{padleft:abc|507|12345678 0}}" gives "12345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 012345678 01234567abc" [17]
  • "{{padleft:|507|123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123}}" gives "123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 " [18]

However, if no characters of P are in the result (P is empty or the string is already longer than the required length or longer than 500), the whole string S is returned, even if it is longer:

  • "{{padleft:123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123|507|}}" gives "123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123" [19]
  • "{{padleft:123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123|507|p}}" gives "123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123" [20]

Applications with respect to a page

Rendering the truncated expanded wikitext of a page:

{{padleft:|500|{{Help:Parser function}}}} gives "A parser function is one of the double-brace structures that can be in a page, see also Help:Expansion. It returns a value based on at least one unnamed parameter, separated from the function name by a colon ":". There may also be additional parameters, which, like those of ordinary templates, are each preceded by "|". Parser functions differ by whether they have a leading hash character (#):

  1. functionname: argument 1 | argument 2 | argume"

Note that when an arbitrary number of characters is specified, links, table structure, XML-style tags, etc. may be broken. However, if the braces are balanced in the source page, there are no unbalanced braces in the result either, because expansion is done before truncation.

Displaying the truncated expanded wikitext:

{{#tag:nowiki|{{padleft:|500|{{Help:Parser function}}}}}} gives "A '''parser function''' is one of the double-brace structures that can be in a page, see also [[Help:Expansion]]. It returns a value based on at least one unnamed parameter, separated from the function name by a colon "''':'''". There may also be additional parameters, which, like those of ordinary templates, are each preceded by "'''|'''". Parser functions differ by whether they have a leading [[w:Number sign|hash character]] (#): # ''functionname'': ''argument 1'' | ''argument 2'' | ''argume"

Displaying the truncated wikitext:

{{padleft:|500|{{msgnw:Help:Parser function}}}} gives "{{h:h|editor toc}} {{languages}} {{otheruses4|parser functions in general|the MediaWiki extension '''ParserFunctions'''|mw:Help:Extension:ParserFunctions}} <onlyinclude>A '''parser function''' is one of the double-brace structures that can be in a page, see also [[Help:Expansion]]. It returns a value based on at least one unnamed parameter, "

{{padleft:|500|{{Help:Template}}}} gives "

MediaWiki Handbook: Contents, Readers, Editors, Moderators, System admins, Researchers +/-

The page Help:Parser function has been prepared with <onlyinclude> tags, so that the small maximum of 500 characters is not wasted on a header box or table code. This also reduces the size of what is transcluded before applying padleft, reducing the post-expand include size, for which there is a limit of 2048000 bytes per page. A disadvantage could be that other pages cannot include the rest of the page either.

Padright

The following applies to trimmed strings S and P, and n = 0, 1, 2, ... Note that the result is not trimmed.

{{padright:S|n|P}} pads a given string S on the right with a padding string P (default: "0"), to increase the length to n, or if length (S) ≥ n it just returns S. The padding string P is zero or more times repeated, with finally possibly a truncated P. If P is empty, S is returned.

Properties:

  • length ( {{padright:S|n|P}} ) = if P is non-empty then max ( n, length (S) ) else length (S)
  • {{padright:S|n|P}} is equal to S if and only if length (S) ≥ n or P is empty.

The special case where S is empty is the same as for padleft.

Examples:

  • "{{padright:|0|ab}}" gives ""
  • "{{padright:|1|ab}}" gives "a" (truncation)
  • "{{padright:|2|ab}}" gives "ab"
  • "{{padright:|3|ab}}" gives "aba"
  • "{{padright:|4|ab}}" gives "abab"
  • "{{padright:|5|ab}}" gives "ababa"
  • "{{padright:1|0|ab}}" gives "1"
  • "{{padright:1|1|ab}}" gives "1"
  • "{{padright:1|2|ab}}" gives "1a"
  • "{{padright:1|3|ab}}" gives "1ab"
  • "{{padright:1|4|ab}}" gives "1aba"
  • "{{padright:1|5}}" gives "10000"
  • "{{padright:1|5|}}" gives "1"
  • "{{padright:1|5| }}" gives "1"
  • "{{padright:abc|13| d  e }}" gives the wikitext "abcd  ed  ed " rendered as "abcd ed ed "
  • "{{padright:abc|14| d  e }}" gives the wikitext "abcd  ed  ed  " rendered as "abcd ed ed "
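The padright rules can be modelled in the same way. The sketch below follows the description in this section literally, so it uses n as the target length and does not model any length cap. The name `pad_right` is hypothetical, and this is not MediaWiki's actual code.

```python
def pad_right(s, n, p="0"):
    """Hypothetical model of {{padright:S|n|P}} as described above."""
    s, p = s.strip(" \n"), p.strip(" \n")  # parameters are trimmed
    if not p or len(s) >= n:
        return s  # nothing to add: S is returned unchanged
    fill = n - len(s)
    # Repeat P often enough, then cut the repetition to exactly `fill` characters.
    return s + (p * (fill // len(p) + 1))[:fill]


print(repr(pad_right("1", 4, "ab")))         # '1aba'
print(repr(pad_right("1", 5)))               # '10000'
print(repr(pad_right("abc", 13, " d  e ")))  # 'abcd  ed  ed '
print(repr(pad_right("abc", 14, " d  e ")))  # 'abcd  ed  ed  '
```

The last two results are the wikitext values; when rendered, consecutive spaces collapse, which is why both appear as "abcd ed ed " in the examples above.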

Length of a string

There is no function for the length of a string, but from the above follows:

  • {{padleft:S|n}} is equal to S if and only if length (S) ≥ min(n,500).

(Note that {{padleft:S|n}} may end with spaces and/or newlines, so care should be taken that it is not trimmed before the comparison.)

Thus the length can be found with a binary search; if it is 500 or more, only that fact can be established, not the actual length. This is done in Template:Len with the help of Template:Len/digit. They use quotation marks around the string, so the maximum length found for the string itself is 498.
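The binary search can be sketched in Python as follows. This illustrates the idea only and is not the code of Template:Len; the names `pad_left`, `length_via_padleft` and `CAP` are hypothetical, and the input is assumed to be already trimmed.

```python
CAP = 500  # length cap from the padleft description


def pad_left(s, n, p="0"):
    """Minimal model of {{padleft:S|n|P}}; s and p are assumed already trimmed."""
    fill = min(n, CAP) - len(s)
    if fill <= 0 or not p:
        return s
    return (p * (fill // len(p) + 1))[:fill] + s


def length_via_padleft(s):
    """Return min(len(s), 500) using only equality tests on pad_left output."""
    lo, hi = 0, CAP  # invariant: pad_left(s, lo) == s
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if pad_left(s, mid) == s:  # true if and only if len(s) >= mid
            lo = mid
        else:
            hi = mid - 1
    return lo


print(length_via_padleft("Help:String functions"))  # 21
print(length_via_padleft("x" * 600))                # 500, meaning "500 or more"
```

With 501 candidate values, the loop needs at most nine comparisons. In wikitext there are no loops, so each step has to be written out as a nested conditional.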

Extracting a character

As follows from the above, the first character of a string P can be extracted with {{padleft:|1|P}}. For this case, this method is preferable to the method below, both for efficiency and because there are fewer limitations.

There is no function for extracting a character from an arbitrary given position of a string. However, for a given character we can compare the truncation of the string up to and including the given position with the truncation of the string until that position, concatenated with the character. Thus we determine whether the character at the given position in the string is equal to the character we tried. (Note that the truncation of the string until the given position may end with spaces and/or newlines, so care should be taken it is not trimmed before the concatenation. Similarly, when trying whether the character is a space or newline, care should be taken that the compared strings are not trimmed before the comparison, because then we would not distinguish between a space and a newline.) Also care should be taken that the result, which may be a space or a newline, is not unintentionally trimmed.

This is done in Template:Chr with the help of Template:Chr/list. The latter contains a switch with a case for each of the supported characters.
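The comparison described above can be sketched in Python. This is an illustration of the technique, not the code of Template:Chr; `truncate`, `char_at` and the `SUPPORTED` character list are hypothetical, and positions are counted from 1.

```python
import string

# Hypothetical list of supported characters, standing in for the switch cases.
SUPPORTED = string.ascii_letters + string.digits + " \n:"


def truncate(text, n):
    """Model of {{padleft:|n|P}} for n not greater than the length of P."""
    return text[:n]


def char_at(text, k):
    """Return the k-th character (k >= 1) of text, or None if it is not found."""
    prefix = truncate(text, k - 1)  # truncation until position k
    target = truncate(text, k)      # truncation up to and including position k
    for candidate in SUPPORTED:
        if prefix + candidate == target:  # compared without trimming
            return candidate
    return None


print(repr(char_at("Help:String", 5)))  # ':'
print(repr(char_at("a b", 2)))          # ' '
print(repr(char_at("abc", 4)))          # None
```

A character that is not in the supported list, or a position beyond the end of the string, yields no match.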

The automatic newline feature/bug for "{|" does not affect the result, except that it adds the newline if "{|" is at the start of the substring, since Template:Sub (see below) calls Template:Chr for every character position separately. However, just "*", "#", ":", and ";" without newline cannot be produced by any template or parser function. Possible remedies:

  • put the character in <nowiki> tags
  • Postpone adding a resulting character to the output until the next character is checked; output it together with that character if the second character is "*", "#", ":", or ";". This seems rather complex, since there is no feature for variables in the computer programming sense; it may require deep nesting, but then one may be restricted by the system limitations on the number of levels.
  • Post-process the resulting string, removing the added newlines. See Template:Rmanl for a start.

Extracting a substring

As mentioned above, if n is not greater than the length of a string P, {{padleft:|n|P}} produces the truncation of P to n characters, i.e., the substring of P of n characters starting at the start of P. For this case, this method is preferable to the method below, both for efficiency and because there are fewer limitations.

There is no function for extracting a substring that does not start at the start of the string, nor one for looping. However, for every potential position we can determine whether it is within the range of the required substring, and if so, extract the character. The results are simply concatenated.

This is done in Template:Sub. Note that for each of the required character positions it requires two calls of padleft and a call of a switch. System limits are therefore reached quickly.
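The cost of this approach can be made visible with a Python sketch that counts the operations. It is not the code of Template:Sub; the names `truncate`, `char_at`, `sub`, `SUPPORTED` and the bound `MAX_POSITIONS` are hypothetical.

```python
from collections import Counter

SUPPORTED = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789: "
MAX_POSITIONS = 50  # hypothetical bound: without loops, positions are enumerated


def truncate(text, n, calls):
    """Stand-in for {{padleft:|n|P}}; records one padleft call."""
    calls["padleft"] += 1
    return text[:n]


def char_at(text, k, calls):
    """Find the k-th character (1-based) with two truncations and one switch."""
    prefix = truncate(text, k - 1, calls)
    target = truncate(text, k, calls)
    calls["switch"] += 1
    for candidate in SUPPORTED:
        if prefix + candidate == target:
            return candidate
    return ""


def sub(text, start, length, calls):
    """Concatenate the characters at positions start .. start+length-1."""
    out = []
    for k in range(1, MAX_POSITIONS + 1):  # every potential position
        if start <= k < start + length:    # within the required range?
            out.append(char_at(text, k, calls))
    return "".join(out)


calls = Counter()
print(sub("Help:String functions", 6, 6, calls))  # String
print(dict(calls))  # {'padleft': 12, 'switch': 6}
```

Six characters already cost twelve padleft calls and six switches, which illustrates why long substrings run into system limits.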

Extracting a substring from the right side of a given string

Extracting a substring from some given position to the end of a string requires the specification or determination of the length of the string. This also applies to extracting a substring of a given length from the end of the string. Of course, if the length is known it is more efficient to specify it than to have the system determine it.

If the length is not known, Template:Str right and w:Template:Str rightc can be used, respectively, otherwise the previous section can be applied directly.

XML-style tags

XML-style tags, e.g. <nowiki> tags and <math> tags, together with their content, are temporarily replaced by a so-called strip marker, a unique code with a length of ca. 37 characters plus the length of the tag name (independent of the length of the content), e.g.

foo

[1]

This affects string functions. If a strip marker is truncated, the remainder is exposed. Also, padding a string that contains a strip marker is based on the length of the marker, not of what it represents.

Examples with <nowiki> tags:

  • {{padleft:|38|<nowiki>abc</nowiki>}} gives
  • {{padleft:|40|<nowiki>abc</nowiki>}} gives
  • {{padleft:|41|<nowiki>abc</nowiki>}} gives
  • {{padleft:|42|<nowiki>abc</nowiki>}} gives
  • {{padleft:|43|<nowiki>abc</nowiki>}} gives
  • {{padleft:|48|<nowiki>abc</nowiki>}} gives
  • {{padleft:<nowiki>abc</nowiki>|40|1234567890}} gives 123456abc
  • {{padleft:<nowiki>abc</nowiki>|43|1234567890}} gives 123456789abc
  • {{padleft:<nowiki>abc</nowiki>|44|1234567890}} gives 1234567890abc
  • {{padleft:<nowiki>abc</nowiki>|50|1234567890}} gives 1234567890123456abc
  • {{padleft:<nowiki>abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc</nowiki>|50|1234567890}} gives 1234567890123456abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc
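The padded examples above are consistent with a strip marker of 34 characters for this <nowiki> tag: 40 − 34 = 6, 43 − 34 = 9, 44 − 34 = 10 and 50 − 34 = 16 padding characters. The following Python sketch reproduces that arithmetic. The value 34 is inferred from these examples only and is not a documented constant; the `marker` string is a hypothetical stand-in for the real marker, and `pad_left` is a minimal model, not MediaWiki's code.

```python
MARKER_LEN = 34              # inferred from the examples above (assumption)
marker = "#" * MARKER_LEN    # hypothetical stand-in for the real strip marker


def pad_left(s, n, p="0"):
    """Minimal model of {{padleft:S|n|P}}; s and p are assumed already trimmed."""
    fill = min(n, 500) - len(s)
    if fill <= 0 or not p:
        return s
    return (p * (fill // len(p) + 1))[:fill] + s


for n in (40, 43, 44, 50):
    padded = pad_left(marker, n, "1234567890")
    # Show only the padding that was placed in front of the marker.
    print(n, padded[:-MARKER_LEN])
# 40 123456
# 43 123456789
# 44 1234567890
# 50 1234567890123456
```

The padding depends only on the length of the marker, which is why the last example above, with much longer tag content, receives the same 16 padding characters.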

Examples with <math> tags:

  • {{padleft:|38|<math>abc</math>}} gives for example
  • {{padleft:|40|<math>abc</math>}} gives for example
  • {{padleft:|43|<math>abc</math>}} gives
  • {{padleft:|48|<math>abc</math>}} gives
  • {{padleft:<math>abc</math>|43|1234567890}} gives 12345678901
  • {{padleft:<math>abc</math>|44|1234567890}} gives 123456789012
  • {{padleft:<math>abc</math>|50|1234567890}} gives 123456789012345678
  • {{padleft:<math>abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc</math>|50|1234567890}} gives 123456789012345678

#titleparts

See Help:Titleparts.

String storage

Suppose we have the string ABCD and also need the substring CD. Because finding the substring is inefficient, it is better to store (or pass on as parameter values) AB and CD as separate data items.

Date and time variables are available providing output in the form of numbers, allowing easy processing without expensive string operations.

However, for processing the result of {{PAGENAME}} (on the page itself or passed on to a template as parameter value) string functions can be useful, to extract the text after "Wikiproject ", the text between parentheses, the text before the parentheses, etc.

See also

  1. See the content of the file Parser.php.
