User talk:LA2/Extraktor

From Meta, a Wikimedia project coordination wiki

Have you tried this with nested templates? Are there any nested templates in normal Wikipedia articles? All the {{#if:...}} and {{{1|}}} stuff will make the script fail, won't it? How about template calls over multiple lines? -- Nichtich 07:48, 26 September 2006 (UTC)

Yes, nested templates do exist, of two kinds. One kind is where a template definition contains a template call; I don't handle that at all. The other kind is where a template call is a parameter of another template call, e.g. {{foo | x=14 | y={{bar | z = 3 }} }}. For this case, my little script contains a loop that parses the innermost template call first and replaces it with "@bar", so that the outer template call is extracted as having a parameter "y=@bar". I have no idea whether this is useful or what alternative ways there are to handle it. If you extract parameters from, say, a dewiki dump, you can grep for "@" in the output to find examples. The "$" and "@" syntax was inspired by Perl. --LA2 14:52, 26 September 2006 (UTC)
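The innermost-first loop described above can be sketched roughly as follows. This is an illustration in Python, not the actual Perl code of extraktor.pl, which is not shown on this page; the names `INNERMOST` and `extract_calls` are hypothetical. It assumes that a template call is anything between `{{` and `}}` with no braces inside, and it makes no attempt to handle `{{{1|}}}` parameters or `{{#if:...}}` parser functions.

```python
import re

# An "innermost" call: {{ ... }} with no braces inside it.
# [^{}] also matches newlines, so calls spanning several lines are covered.
INNERMOST = re.compile(r"\{\{([^{}]*)\}\}")

def extract_calls(text):
    """Return (name, params) tuples, innermost template calls first."""
    calls = []

    def replace(match):
        # Split the call body on "|" and trim whitespace around each part.
        parts = [p.strip() for p in match.group(1).split("|")]
        name, params = parts[0], parts[1:]
        calls.append((name, params))
        # Leave a placeholder such as "@bar" for the enclosing call to see.
        return "@" + name

    # Repeat until no template call is left in the text.
    while True:
        text, count = INNERMOST.subn(replace, text)
        if count == 0:
            break
    return calls

print(extract_calls("{{foo | x=14 | y={{bar | z = 3 }} }}"))
```

For the example call this yields the `bar` call first, then the `foo` call with the parameter `y=@bar`, which matches the behaviour described in the comment above.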

error with new version

$ perl extraktor.pl < enwiki-20070908-pages-articles.xml > parameters-enwp-20070908
Unmatched [ in regex; marked by <-- HERE in m/(ISBN) +([0-9][- 0-9Xx]+[ <-- HERE 0-9Xx)/ at extraktor.pl line 115.
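The message points at the final character class in the pattern: `[0-9Xx)` is opened with `[` but never closed with `]`. Line 115 of extraktor.pl is not shown here, so the intended pattern is an assumption; a plausible reading is that the class was meant to be `[0-9Xx]`, placed before the closing parenthesis of the group. The sketch below uses Python's `re` module rather than Perl, only to show that the pattern as quoted fails to compile while the assumed correction matches an ISBN. The names `BROKEN` and `FIXED` are hypothetical.

```python
import re

# Pattern as quoted in the error message, without Perl's "<-- HERE" marker.
BROKEN = r"(ISBN) +([0-9][- 0-9Xx]+[0-9Xx)"
# Assumed correction: close the last character class with "]".
FIXED = r"(ISBN) +([0-9][- 0-9Xx]+[0-9Xx])"

def compiles(pattern):
    """Return True if the pattern is a valid regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(compiles(BROKEN))
print(compiles(FIXED))

# With the assumed correction, group 2 holds the ISBN digits and hyphens.
match = re.search(FIXED, "see ISBN 0-306-40615-2 for details")
print(match.group(2))
```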