Talk:Pywikipediabot/replace.py

From Meta, a Wikimedia project coordination wiki

diacritics

How can I use diacritics (UTF-8) with this bot? It is very important for the Romanian Wikipedia. --Romihaitza 17:26, 6 February 2006 (UTC)

I've just converted the article names and it's working. JackPotte 01:32, 16 November 2009 (UTC)

Bug

When loading an entry list via the -file: parameter, if the wikilinks contain accents or any other non-ASCII characters, the bot breaks and dies before making any changes (even if thousands of plain-ASCII entries come before that one). For example, Talk:Benjamín Urrutia (on en:) makes the bot break because of the accented í. 132.248.81.29 23:48, 10 March 2006 (UTC)

Forgot to sign. drini 23:49, 10 March 2006 (UTC)

Problem

There is a problem: if I open replace.py, I get the prompt "Please enter the text that should be replaced". When I type the text and press Enter, I get a second prompt, "Please enter the new text". I type the new text, press Enter, then press Enter two more times, and the command window disappears immediately. --IL 18:18, 18 March 2006 (UTC)

Sounds like you haven't specified the range of pages to check for the misspelling. When you run "python replace.py", try appending something like "-start:!" or "-start:A". 129.21.121.171 06:11, 18 May 2006 (UTC)
In other words, without the computer jargon, type:
replace.py -start:A
Odessaukrain 22:33, 24 May 2008 (UTC)

Cat-based replace

Category-based replace invariably seems to give no results: "Getting 0 pages from ....". Is this a bug, or am I missing something? I've tried both the bare category name and the Category: prefix, with spaces as well as underscores in the name. Help? --88.113.114.226 16:37, 31 May 2006 (UTC)

Maybe someone listed the articles on the category page manually instead of adding the category to the pages themselves; in that case it might not work. Please give the page where you have the problem. Greetings, --birdy geimfyglið (:> )=| 19:36, 31 May 2006 (UTC)

I can't seem to get Category-based replace to work either. Here's the command I used that failed:

   python replace.py -cat:Front_groups "{{tobaccowiki}}" "{{#badge: tobaccowiki | front groups}}"

--Sheldon Rampton 04:40, 23 December 2007 (UTC)

In my experience, it only works on categories that directly contain pages. The bot won't run on pages in subcategories. --Plasmarelais 18:14, 30 August 2009 (UTC)

Finding pages

This script seems to have a hard time finding pages. It can get pages from Special:Allpages, even edit articles (with edit_article.py), but whenever I try to replace something, I get a face full of "Page X not found". Any help? The Mu 20:11, 18 August 2006 (UTC)

Someone reported a similar problem with replace.py for non-WMF sites on #pywikipediabot. Not sure what the problem is, as other tools seem to work. --Connel MacKenzie 19:00, 10 September 2006 (UTC)
Yeah, that's me, I have the same problem. The -ref: and -start:! options all work 'fine', but I always get "Getting xx pages from mywiki:en" followed by a number of "Page Foobar not found". Most other scripts work just fine. I work from Windows XP against MediaWiki 1.6.8. I've also just downloaded the September 5 version of the pywikipedia framework. --GrandiJoos 20:57, 10 September 2006 (UTC)
Update: weblinkchecker.py also reports "Article_X does not exist"... --GrandiJoos 08:06, 11 September 2006 (UTC)
Fix: in LocalSettings.php add
$wgGroupPermissions['bot']['export'] = true;

Thanks to Andre Engels. --GrandiJoos 11:07, 2 October 2006 (UTC)

Problem with utf-8

 [[한국]]
 [[월드컵]]
 [[미국]]
 [[나비]]
 python replace.py -file:articles_list.txt "errror" "error"
I made the file above and ran my bot...
RESULT: error!!
Korean Explorer and Wikipedia use UTF-8 (Unicode),
but cmd does not use UTF-8,
and Python does not use UTF-8 either.
Help me!
What settings should I use? -- WonYong(Talk / Contrib) 11:46, 11 September 2006 (UTC)
I've just converted the article names and it's working. JackPotte 01:30, 16 November 2009 (UTC)
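
One possible reading of "converted the article names" is that the list file was re-saved as UTF-8, which is the encoding the -file: reader appears to expect (the traceback in the next section fails in the 'utf8' codec). A minimal sketch of such a conversion follows, written for current Python 3 rather than the Python 2.4 of this thread. The helper name `convert_title_list` and the default source encoding `cp949` are assumptions for illustration and are not part of pywikipedia.

```python
def convert_title_list(src_path, dst_path, src_encoding="cp949"):
    """Re-save a list of [[page titles]] as UTF-8.

    Hypothetical helper, not part of pywikipedia. src_encoding is an
    assumption: use whatever code page the file was really saved in.
    """
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        content = src.read()
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(content)
    return content
```

Calling `convert_title_list("articles_list.txt", "articles_list_utf8.txt")` and then passing the new file to -file: would be the intended use; both file names here are only examples.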

Bot problem (Korean language)

  • I have a question.
  • I passed 스모그 to interwiki.py in Windows XP's cmd.exe console:
 C:\pywikipedia>interwiki.py -start:스모그 -autonomous
  • The following is the result:
Checked for running processes. 1 processes currently running, including the current process.
NOTE: Number of pages queued is 0, trying to add 60 more.
Retrieving Allpages special page for wikipedia:ko from %C2%BD%C2%BA%C2%B8%C3%B0%C2%B1%C3%97, namespace 0
Getting 60 pages from wikipedia:ko...
  • Then I entered 스모그 on the Wikipedia website (ko:),
  • which took me to the following link:
 http://ko.wikipedia.org/wiki/%EC%8A%A4%EB%AA%A8%EA%B7%B8
  • Why are the two different?
%EC%8A%A4%EB%AA%A8%EA%B7%B8
(스모그 entered on the Wikipedia website, ko:)
%C2%BD%C2%BA%C2%B8%C3%B0%C2%B1%C3%97
(스모그 entered in the cmd console)
  • Is it a bug?
  • I also tried the file method:
C:\pywikipedia>
C:\pywikipedia>copy con aa.txt
[[스모그]]
^Z
C:\pywikipedia>interwiki.py -file:aa.txt
Checked for running processes. 1 processes currently running, including the current process.
NOTE: Number of pages queued is 0, trying to add 60 more.
Dump ko (wikipedia) saved
Traceback (most recent call last):
  File "C:\pywikipedia\interwiki.py", line 1467, in ?
    bot.run()
  File "C:\pywikipedia\interwiki.py", line 1200, in run
    self.queryStep()
  File "C:\pywikipedia\interwiki.py", line 1174, in queryStep
    self.oneQuery()
  File "C:\pywikipedia\interwiki.py", line 1132, in oneQuery
    site = self.selectQuerySite()
  File "C:\pywikipedia\interwiki.py", line 1114, in selectQuerySite
    self.generateMore(globalvar.maxquerysize - mycount)
  File "C:\pywikipedia\interwiki.py", line 1050, in generateMore
    page = self.pageGenerator.next()
  File "C:\pywikipedia\pagegenerators.py", line 162, in __iter__
    for pageTitle in R.findall(f.read()):
  File "C:\Python24\lib\codecs.py", line 481, in read
    return self.reader.read(size)
  File "C:\Python24\lib\codecs.py", line 293, in read
    newchars, decodedbytes = self.decode(data, self.errors)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xbd in position 2: unexpected code byte
C:\pywikipedia>
  • The Korean language is broken... :(
  • Help!! :( -- WonYong(Talk / Contrib) 11:49, 11 September 2006 (UTC)
According to a Korean wiki admin, it should be used as follows:
1. Pass the title percent-encoded:
 C:\pywikipedia>interwiki.py -start:스모그 -autonomous  (X)
 C:\pywikipedia>interwiki.py -start:%EC%8A%A4%EB%AA%A8%EA%B7%B8 -autonomous  (O)
So that is what I do.
2. Edit the user-config.py file:
 console_encoding = 'cp949'
3. Start the console with:
 cmd.exe /U
4. Change the cmd font to a TrueType Korean font.
RESULT: the cmd output is good. Korean text in the output is no longer broken. WOW!!
But I also want input that is NOT BROKEN, like this:
 C:\pywikipedia>interwiki.py -start:스모그 -autonomous  (O)
Help!! -- WonYong(Talk / Contrib) 11:59, 11 September 2006 (UTC)
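
The two percent-encoded strings quoted above are consistent with the console delivering the title as cp949 bytes that were then treated as Latin-1 characters before being encoded as UTF-8. The sketch below reproduces both strings under that assumption, using current Python 3 (not the Python 2.4 from the traceback). It only illustrates the encoding mix-up and does not show how pywikipedia itself handles console input.

```python
from urllib.parse import quote

title = "스모그"

# What the web browser produces: the UTF-8 bytes, percent-encoded.
from_browser = quote(title.encode("utf-8"))

# Assumed console path: the cp949 bytes typed into cmd.exe are misread
# as Latin-1 characters, and only then encoded as UTF-8.
misread = title.encode("cp949").decode("latin-1")
from_console = quote(misread.encode("utf-8"))

# Undoing the misreading recovers the intended title.
recovered = misread.encode("latin-1").decode("cp949")

print(from_browser)
print(from_console)
print(recovered == title)
```

If this reading is right, it would also explain why setting the console encoding to cp949 in user-config.py matters: the bytes coming from the console need to be decoded as cp949 before anything else is done with them.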

regex BUG

I tried several versions of pywikipedia; any change using a regular expression leads to a crash. Can you fix this huge bug? Thanks.

Bug: "\n" in replacement text

I can't put a newline in the replacement text. For example:

 python replace.py "AbcDef" "Abc\nDef"

actually inserts the literal characters "\n" rather than a newline. The preceding unsigned comment was added by 128.165.16.45 (talk • contribs) 20:46, 9 July 2007.

If you are using a bash command line, you can add a literal newline by simply pressing Enter:
$ python replace.py "AbcDef" "Abc
> Def"

Cheers, John Vandenberg 23:16, 27 September 2007 (UTC)

Correct command for interwiki purposes

Suppose that we have a page, say w:Door, with 19 links to it. We decide to transwiki it to Wiktionary, and we want all links to w:Door to be directed towards wikt:door. What is the correct command to achieve that using replace.py? Huji 17:28, 27 August 2007 (UTC)

use xml dump

If I use an XML dump with the -xml option, the program should modify not only the online page but also the XML version, so that if I redo the same operation the script finds that the page has already been modified. --Wiso 15:36, 27 September 2007 (UTC)

Isredirect and putthrottle

Two issues:

I have tried to edit pages using -ref:, and when I use that, it says "Isredirect". When I don't use it, it works.

I have also tried setting -putthrottle: to 12, but the bot still edits at shorter intervals than that.

Why are these happening?

How can I replace text which contains two or more lines?

For example, I need to replace the text:

==Subtitle==

with the following:

==Subtitle==
{{template}}

How can I do it? The problem is that I cannot press Enter while typing the replacement text, and I can't use <br> because it would not produce the expected effect. --A1 20:04, 3 December 2007 (UTC)

Hi, I would do it like this: modify replace.py in the following way.
Remove everything between
    elif fix == None:
and
    else:
        # Perform one of the predefined actions.
Then put this between them:
        replacements.append((u'\=\=Subtitle\=\=',u'==Subtitle==\r\n{{test}}'))
        wikipedia.setAction(u'summaryhere')
Save the file under a name other than replace.py (e.g. myreplace.py).
Run myreplace.py with the -regex option.
Best regards, --birdy geimfyglið (:> )=| 20:20, 3 December 2007 (UTC)
Thank you but it doesn't work. Am I right:
elif fix == None:
        replacements.append((u'\=\=Subtitle\=\=',u'==Subtitle==\r\n{{test}}')) 
        wikipedia.setAction(u'summaryhere')
    else:

--A1 21:46, 11 December 2007 (UTC)

Worked perfectly for me [1]. Did you save the modified .py file as UTF-8? Did you run it with the -regex option? Best regards, --birdy geimfyglið (:> )=| 04:22, 8 February 2008 (UTC)
I think it's easier to use -regex with "==Subtitle==" "==Subtitle==\n{{template}}" --Plasmarelais 16:49, 29 September 2009 (UTC)
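
A note on why -regex can make a difference here. In Python's own `re` module, escapes in the replacement string are processed, so the two characters `\n` become a real newline, whereas a plain string replacement inserts them literally. The sketch below shows only that standard-library behaviour in current Python 3. Whether replace.py passes the replacement through in exactly this way is an assumption, so treat it as an illustration rather than a description of the bot's internals.

```python
import re

text = "==Subtitle==\nSome text"
# The replacement as typed on the command line: a backslash followed by "n".
typed_replacement = r"==Subtitle==\n{{template}}"

# Plain string replacement keeps the backslash and the "n" as they are.
plain_result = text.replace("==Subtitle==", typed_replacement)

# re.sub processes the escape in the replacement, giving a real newline.
regex_result = re.sub(re.escape("==Subtitle=="), typed_replacement, text)

print(repr(plain_result))
print(repr(regex_result))
```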

Nothing happens

I need some help with replace.py.

I installed Python 2.5 and Pywikipedia on my computer.

The purpose: to use it on a MediaWiki installation (in French), version 1.6.7.

I configured my user-config.py and my [name_of_the_wiki]_family.py, which I put in the families directory...

Then I opened cmd on Windows in order to use Python.

I had already created an account for the bot on the wiki (its name is Orthobot).

Then I ran python.exe login.py (without any problem).

Then I ran replace.py like this:

python.exe replace.py -page:Abheva "sanscrit" "sanskrit"

The bot finds the page, finds the word, and asks me if I want to replace it...

And it says the page has been replaced.

But the problem is that when I look at the page, the replacement hasn't been made.

So I don't understand: everything seems to be OK, but it doesn't work.

Any ideas?

Urobore 22:39, 7 February 2008 (UTC)

Try the following: run replace.py -page:Abheva (without anything else) and enter the text when prompted.
If you can't get it to work, you will find more people able to help you at #pywikipediabot.
Best regards, --birdy geimfyglið (:> )=| 22:52, 7 February 2008 (UTC)
Thank you, but unfortunately it didn't work. As for the IRC channel, I actually just came from there... If somebody has another idea... Thanks anyway! Urobore 22:58, 7 February 2008 (UTC)
Some thoughts: is the page locked, or semiblocked (and you just registered)? Do you have the newest version of pywikipedia (the easiest way is to update via tortoise)? Do you have a link to this page, please? --birdy geimfyglið (:> )=| 23:04, 7 February 2008 (UTC)
No, the page isn't locked or semiblocked. However, I just found a way to solve the problem, even if I don't understand why it works. It is a public wiki, but editing is blocked for non-registered users (only registered users can edit a page). Moreover, a non-registered user can't create a new account (I create the accounts for users who want to participate in the wiki). Concretely, my LocalSettings.php contains:


$wgGroupPermissions['*']['createaccount'] = false;
$wgGroupPermissions['*']['edit'] = false;


So I set the second line to true instead of false, and that solved the problem (the page "Abhava" has been edited).

This means I'll have to edit my LocalSettings.php every time I want to use Pywikipedia. But I don't understand exactly why, because the bot has an account and should be able to edit a page without non-registered users being authorized to edit. In the edit history of the page "Abhava", the bot appears only as a non-registered IP address (mine) and not as a user. So there must be a problem with the bot's login: it can't log in to the wiki and is therefore only able to edit a page as a non-registered user. So: do you have any idea about this login problem?

Thank you, anyway.

Urobore 07:10, 8 February 2008 (UTC)

More than one thing to replace

Is there a way to replace two or more strings at the same time? I mean replacing "Errror" with "Error" and "Miistake" with "Mistake" in one go, without having to run the bot twice on the same category. Thank you!--Xtv 11:47, 18 September 2008 (UTC)

Very interesting question. If there were such a feature, it would be very useful for me too. So is there a way? Thanks a lot! --Plasmarelais 23:37, 15 January 2009 (UTC)
Yes, it's possible. Just run replace.py and hit Enter. It will ask you what you want to replace, and you will be able to replace more than one thing in the same operation.--Unai Fdz. de Betoño 12:05, 30 August 2009 (UTC)
Or you just define several replacements like
replace.py -start:! "errror" "error" "miistake" "mistake" "wroong" "wrong"
as long as you give an even number of strings (old/new pairs). --Plasmarelais 18:07, 30 August 2009 (UTC)
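
The "even number" rule above can be pictured as pairing up the strings two by two and applying each pair in turn. The following is a minimal sketch of that idea in current Python 3. The function names `pair_replacements` and `apply_replacements` are hypothetical and are not taken from replace.py, whose actual argument handling may differ.

```python
def pair_replacements(strings):
    """Group a flat list [old1, new1, old2, new2, ...] into (old, new) pairs."""
    if len(strings) % 2 != 0:
        raise ValueError("expected an even number of strings (old/new pairs)")
    return list(zip(strings[0::2], strings[1::2]))


def apply_replacements(text, pairs):
    """Apply each (old, new) pair to the text, in the order given."""
    for old, new in pairs:
        text = text.replace(old, new)
    return text


pairs = pair_replacements(
    ["errror", "error", "miistake", "mistake", "wroong", "wrong"]
)
print(apply_replacements("an errror, a miistake and a wroong turn", pairs))
```

Because the pairs are applied in order, a later pair also sees the output of earlier ones, which matters if one replacement produces text that another one matches.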

Working in all namespaces

How can I make the bot work in all namespaces using "-start"? When I type "-start:!" it works only in the main namespace. 77.253.22.92 00:00, 20 September 2008 (UTC)

Hello, for example -start:Template:! should do the trick for a specific namespace; see also the -namespace: section in Replace.py. Best regards, --birdy geimfyglið (:> )=| 00:04, 20 September 2008 (UTC)
But this way the bot still works in one specific namespace, and I want to make it work in all of them. And, if I understand correctly, -namespace cannot be used with -start, and I have to use -start, as I have no way to make an XML dump. 87.205.65.183 23:05, 21 September 2008 (UTC)
As far as I know, Pywikipediabot doesn't support that. You have to run the bot for each namespace. Huji 17:17, 22 September 2008 (UTC)
Try using -namespace:nn several times in one run. --Plasmarelais 18:08, 30 August 2009 (UTC)

Replacing brings new text

Hi! I'm new to working with bots, and I'm confronted with a problem: every time I replace one word with another, some extra text always comes along with the new word:

   <div id="wikia-credits"><br /><br /><small>Von [http://memory-alpha.org Memory Alpha],
einer [http://www.wikia.com Wikia]-Wiki.</small></div>

How can I avoid that extra text?

Thank you very much for any help! --Plasmarelais 15:30, 13 January 2009 (UTC)

I found the answer on MA/en: http://memory-alpha.org/en/wiki/Memory_Alpha:Bots. Anyway thank you! --Plasmarelais 22:23, 13 January 2009 (UTC)

Additional parameters

Replace.py -help shows several options which are not described on the Replace.py page.

Are these working options or obsolete ones, e.g. -uncatfiles? --Dnikitin 10:28, 17 February 2009 (UTC)

You should assume that the file itself is correct. This page is probably outdated. In any case, -uncatfiles is indeed available. --Erwin(85) 19:45, 17 February 2009 (UTC)
The option is a general one, not specific to replace.py; it comes from pagegenerators.py. -- User:D2

Infobox field samples

It would be nice to have some samples for infobox field updates. -- User:D2

Question

I ran into the following problem while testing the script:

 python replace.py -page:Usuario:Ezarate73/pruebas -regex "[aeiou]staba más" '\1'
 Getting 1 pages from wikipedia:es...
 /home/esteban/pywikipedia1/pywikipedia/weblinkchecker.py:808: SyntaxWarning: name 'day' is assigned to before global declaration
   global day
 Traceback (most recent call last):
   File "replace.py", line 705, in <module>
     main()
   File "replace.py", line 701, in main
     bot.run()
   File "replace.py", line 376, in run
     new_text = self.doReplacements(new_text)
   File "replace.py", line 344, in doReplacements
     allowoverlap=self.allowoverlap)
   File "/home/esteban/pywikipedia1/pywikipedia/wikipedia.py", line 3491, in replaceExcept
     replacement = replacement[:groupMatch.start()] + match.group(groupID) + replacement[groupMatch.end():]
 IndexError: no such group

Where's the error? Thanks --Ezarate 03:35, 8 June 2009 (UTC)
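
A note on the traceback: the replacement '\1' refers to capture group 1, but the pattern "[aeiou]staba más" contains no parentheses and therefore defines no groups, which matches the "no such group" message. The sketch below shows the same situation with Python's standard `re` module in current Python 3. The parenthesised pattern is only a guess at what was intended (keeping the vowel), not a confirmed fix for the bot run above.

```python
import re

text = "estaba más"

# No parentheses in the pattern, so there is no group 1 to refer to.
try:
    re.sub(r"[aeiou]staba más", r"\1", text)
    backreference_failed = False
except (re.error, IndexError):
    backreference_failed = True

# Hypothetical corrected pattern: the vowel is captured as group 1.
with_group = re.sub(r"([aeiou])staba más", r"\1", text)

print(backreference_failed)
print(repr(with_group))
```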

Replace quotation marks (solved)

Is it possible to replace a string _including_ quotation marks?

I want to replace "abc" with abc (i.e. get rid of the quotation marks).

Thanks a lot! --91.33.247.68 05:52, 22 September 2009 (UTC) (Felix)

I think you can use -regex:
replace.py -regex "\"abc\"" "abc" -start:!
If you type \" instead of ", it is recognized as part of the string. --Plasmarelais 07:41, 22 September 2009 (UTC)
Thank you very much! --131.234.103.164 11:06, 23 September 2009 (UTC) (Felix)
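
To see what the script actually receives from such a command line, Python's standard `shlex` module can split it the way a POSIX shell would. This is a sketch under that assumption; cmd.exe on Windows has its own quoting rules, which `shlex` does not model, although the \" escape is commonly accepted there as well.

```python
import shlex

# The command from the answer above, exactly as typed.
command = r'replace.py -regex "\"abc\"" "abc" -start:!'

# Split it into arguments using POSIX shell quoting rules.
argv = shlex.split(command)

print(argv)
```

The third item of `argv` is the text to search for, with the quotation marks kept as part of the string, and the fourth is the replacement without them.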