Toolserver/New accounts



==[[:en:User:GChriss]]==
I am a young professional interested in improving the ease of distribution of free content. [[:commons:User:Gmaxwell|Gmaxwell]], using the toolserver, installed [http://www.jcraft.com/jorbis/ JOrbisPlayer] to play Ogg Vorbis files hosted on WMF servers in-browser. ([http://tools.wikimedia.de/~gmaxwell/jorbis/commonsJOrbisPlayer.php?path=Mozart+-+Concerto+in+D+for+Flute+K.314.ladybyron.ogg Mozart example.]) In a similar vein, I intend to install, facilitate multi-language translation of, and maintain a copy of [http://menguy.aymeric.free.fr/theora/ iTheora] (GNU) to play Ogg Theora in-browser. I still need to research how to create a local cached copy for iTheora playback in light of Wikimedia's distributed server framework. I have had a good experience with a test iTheora installation, and remain open to suggestions and feedback. Thanks, [[User:GChriss|GChriss]] 21:03, 26 February 2007 (UTC)
:I also would like to develop VoIP services (audio/video) to enhance WMF communication, specifically Wikimania 2007 [[:wm2007:Online_participation|online participation]] and day-to-day conference calling between Foundation volunteers. [http://www.freeswitch.org/ Freeswitch] is a promising FLOSS VoIP contender, and has substantial [http://wiki.freeswitch.org/wiki/FreeSwitch_FAQ#Q:_What_is_FreeSWITCH_anyway.3F__Is_it_just_another_fork_of_Asterisk_or_something.3F advantages] compared to the traditional [http://www.asterisk.org/ Asterisk] package. Freeswitch supports audio conferencing (currently video-agnostic), has a [[:en:Session_Initiation_Protocol|SIP]] "no-media" feature to coordinate videoconferencing peer-to-peer, and [http://www.voxgratia.org/docs/faq.html openH323/OPAL] modules are planned for video [[:en:multipoint control unit|multipoint control unit]] support. Audio-only channels are helpful by themselves, and will be the first priority.


:[[:commons:User:Gmaxwell|Gmaxwell]] is testing an (inactive?) Asterisk service. ([[:commons:User:Gmaxwell/voip|His Asterisk notes]] and [http://nonbovine-ruminations.blogspot.com/2007/03/is-wikimedia-really-committed-to-open.html a related discussion].) Thanks, [[User:GChriss|GChriss]] 18:48, 27 March 2007 (UTC)




Before requesting

  • The toolservers are under the legal responsibility of Wikimedia Deutschland, and their use is restricted according to EU and German laws.
  • On a day-to-day basis, decisions about which users will receive accounts are delegated to DaB.
  • At present, the toolservers do not have direct access to page text, due to Wikimedia's use of external storage clusters. Requests to run projects which depend upon fetching a lot of page text might be postponed.

When requesting

We need to know

  • Who you are. Please provide a link to your user page on your home wiki project; we can follow this and poke about a bit to find out who we're dealing with, e.g. check your involvement in projects, etc.
  • Why you want an account. A short description of what you want to do is needed, and any links to existing sample projects or code, or even working utilities, are very useful.


Requests

Hello. I'm a novice programmer from Germany and I like to make web apps. Because I'm interested in wikis and free software, I think it would be fun and useful to program a little tool for Wikipedia, Wikibooks, Wiktionary, and so on that outputs some text. It should also be able to analyze a question. I can't speak English as well as you, but I hope you will allow me to make a tool. --77.128.46.167 12:08, 7 June 2007 (UTC) Sorry - I had no Meta account. Now I'm logged in --Nummer9 12:10, 7 June 2007 (UTC)[reply]


en:User:Hank

I am a programmer from Alaska. My Blog

I would like to create a simple personal learning app: it would find the top 100 articles that the user hasn't read and allow them to tag the articles as read, interesting, uninteresting, etc. It would make suggestions based on past experience with articles, and would allow someone to read only the entries that probably interest them.

Developing a fine-grained REST statistics service for pages as a complement to Zachte's coarse-grained stats. Also supplying overlays of these statistics onto content pages, e.g. background color varying with the number of editors since the text was introduced. -- 03:00 1 November 2005 UTC (Note: copied over here from an old revision of the old page after being dropped from that page.)

Sorry that your request was lost. But to be honest, I do not understand what you plan to do. Can you please explain it again in simpler words? --DaB. 21:30, 20 September 2006 (UTC)[reply]

I'm interested in doing a tool to find where specific text was added (who added that? in which revision?). When it was added recently, there's not much trouble, but if you try to find who added, several months ago, this recently deleted image... you can go mad ;) I have little experience with the MW code (they rejected bugzilla:5763; I may do it when I feel free enough). However, I have played a bit with extracting information [1] [2] [3] and JS [4]. Ah, and I run a bot when I'm in the mood, and then revise its changes. Platonides 17:11, 12 June 2006 (UTC)[reply]

That sounds nice, but a lack of simple access to page text might hamper your ability to do this. How would you work around that? robchurch | talk 01:43, 19 June 2006 (UTC)[reply]

Oh, I hadn't realized that Daniel Kinzler == Duesentrieb. It's really a problem, as we're stuck with WikiProxy because of the external storage problem. However, the Toolserver is still the better place to do it. Would a time limit between queries need to be enforced? (It seems robust enough, but it's better to have things secured.) Platonides 22:11, 19 June 2006 (UTC)[reply]

You don't seem to have thought out the implementation with respect to the problems. robchurch | talk 16:32, 23 June 2006 (UTC)[reply]

The algorithm is pretty clear. That the get-revision-text step can't be done by querying the DB directly but needs to go through WikiProxy is not a big change; it's more the Toolserver's problem than the user's. Moreover, WikiProxy first looks in the text table and only fetches over HTTP if the text is not available locally. I don't understand what you mean. This could be achieved through other methods, like JavaScript, but in no way better. Platonides 16:51, 24 June 2006 (UTC)[reply]

My point is that WikiProxy is fine for requesting a few pages every so often. For applications which depend upon continuously fetching and comparing page text, it's usually better to use a database dump. Incidentally, that it is "a toolserver problem" is fairly germane; yes, it is a problem, and yes, we are trying to work out how to fix it, but we also expect some co-operation from our users. Excessive resource usage leads to processes and queries being killed; it's the same on any multi-user system. robchurch | talk 16:08, 2 July 2006 (UTC)[reply]

A database dump is fine if you want statistics, not if you want data about the live Wikipedia. So you would still need to fetch the last week or so; I'm not sure whether WikiProxy/DB page retrieval already handles that, or whether it would need to be added at the application layer. However, these dumps would only be available for large wikis (define large as you want: by dump size, number of revisions, number of tool queries...). Platonides

It seems the same feature was requested years ago at bugzilla:639. Can it be tried, instead of having theoretical arguments? Platonides 22:27, 15 July 2006 (UTC)[reply]
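
A minimal sketch, for reference, of the binary-search approach this thread circles around: find the first revision of a page that contains a given text fragment. It fetches revision text over the MediaWiki api.php (the original discussion used WikiProxy instead); the function names are illustrative, and it assumes the fragment stayed in the page once added.

    import requests  # third-party HTTP client

    API = "https://en.wikipedia.org/w/api.php"

    def revision_ids(title):
        """Return all revision IDs of a page, oldest first."""
        revs, cont = [], {}
        while True:
            data = requests.get(API, params={
                "action": "query", "format": "json", "prop": "revisions",
                "titles": title, "rvprop": "ids", "rvdir": "newer",
                "rvlimit": "max", **cont,
            }).json()
            page = next(iter(data["query"]["pages"].values()))
            revs.extend(r["revid"] for r in page.get("revisions", []))
            cont = data.get("continue", {})
            if not cont:
                return revs

    def revision_text(revid):
        """Fetch the wikitext of a single revision."""
        data = requests.get(API, params={
            "action": "query", "format": "json", "prop": "revisions",
            "revids": revid, "rvprop": "content",
        }).json()
        page = next(iter(data["query"]["pages"].values()))
        return page["revisions"][0]["*"]

    def first_revision_with(title, fragment):
        """Binary-search the history for the revision that introduced
        `fragment`, assuming the fragment stayed in once added."""
        revs = revision_ids(title)
        if not revs or fragment not in revision_text(revs[-1]):
            return None  # fragment is not in the current version
        lo, hi = 0, len(revs) - 1
        while lo < hi:
            mid = (lo + hi) // 2
            if fragment in revision_text(revs[mid]):
                hi = mid
            else:
                lo = mid + 1
        return revs[lo]

The point of the binary search is that only O(log n) full-text fetches are needed, so a rate limit between queries stays workable even for pages with long histories.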

I'd like to request a toolserver account for the purpose of creating pages to perform queries against the En database for some common tasks that people are interested in, like determining edit counts, finding pages created by a particular user, finding all articles containing a particular style error, detecting copyright violations, and so on, as well as performing ad hoc queries to answer specific questions. Such queries would be carefully optimized and spaced in time to minimize performance impact. Deco 19:23, 28 June 2006 (UTC)[reply]

How do you plan on
  1. "finding all articles containing a particular style error"
  2. "detecting copyright violations"
efficiently? robchurch | talk 01:55, 29 June 2006 (UTC)[reply]
Please note that at this time, article text is not available in the toolserver database. If you want to work on the fulltext of many articles, you will have to process a dump. If fulltext is only required for relatively few pages, WikiProxy can be used to fetch it live, but this is comparatively slow. -- Duesentrieb 13:51, 29 June 2006 (UTC)[reply]

Hi!

I hereby request a toolserver account for some things:

  • I want to use my commons mover on the toolserver, after User:Duesentrieb has looked over the code
  • I want to develop, together with Cool Cat, a script which daily fetches the images used on the main pages of the 10 major wikis and protects their image pages on Commons to prevent vandalism and reuploads (especially by the Penis vandal)
  • I also want to code a tool which checks all new pages made by IPs on dewiki against copyvios via CopyScape.
  • And finally, I want to code a tool which checks the legitimacy of votes on dewiki, so that no one needs to check whether a user's vote is valid or not.

Best regards, Klever 18:11, 7 July 2006 (UTC)[reply]

Why auto-protect them? To replace an image, the user must be autoconfirmed (or pass a similar account-age requirement). For the same cost, they could edit many Main Pages outright. I'd vote for image-watching that announces such replacements (on #wikimedia-commons?), which are supposed to be very rare, or the whole system wouldn't make sense.
Be careful with the vote legitimacy checks, to avoid false warnings on simple comments.
Platonides 20:53, 10 September 2006 (UTC)[reply]
The auto-protect was Cool Cat's idea. And on dewiki, comments are not allowed in the vote section. greets, HardDisk 19:26, 16 September 2006 (UTC)[reply]
The user has been informed that his request is postponed.
That's not a rejection, but I have to watch the user (and his code) for longer. --DaB. 21:20, 20 September 2006 (UTC)[reply]

Hello, I would absolutely love to have an account; I want something special I can do, and I would love to develop tools. I understand the rules and am looking forward to developing tools for the Wikimedia projects. If you have any questions, please contact me at en:User talk:Minun. Cheers, Minun 19:44, 30 July 2006 (UTC)[reply]

Please reread "When requesting". You're not saying why you want an account, so you have little chance of getting one. Platonides 21:59, 5 August 2006 (UTC)[reply]
I'd like to help out other users and take a special part in the project, and this is one way to do it. I've been thinking of a couple of ideas too. I have thought of perhaps a signature generator and a user page generator; some more advanced tools include a barnstar generator, and much more. Cheers Minun 18:53, 6 August 2006 (UTC)[reply]
Hello, please first write down what you would like to do, because I need something on which to base a decision :). --DaB. 14:05, 7 August 2006 (UTC)[reply]
I would mainly just be taking a special part in the projects; here are some thoughts I've had. Here are a few tools for users who work on their personal content (user page, talk page, signature, etc.):
  • A user page (and talk page) design generator
  • A signature generator

and I also have this tool in mind:

  • A barnstar generator (kinda like what users can use to make their own barnstars)

These are just what I'm thinking of; I will probably be able to make much more. Minun 18:48, 7 August 2006 (UTC)[reply]

Those don't require database access or processing time on another server. 86.134.49.147 15:40, 21 August 2006 (UTC)[reply]

Please note that en:User:Minun has been blocked for one year by ruling of the arbitration committee. TDS 01:58, 24 August 2006 (UTC)[reply]

It's been appealed, so please wait if you're here to reject it for that exact reason; nevertheless, I can still work on it before it gets lifted.

Now, regarding the anonymous user's question: I've scrapped the idea of the image generator, as that may not require such resources, but I will still keep the idea of the signature generator and the userpage one. I've also thought of a new one: a tool that can sniff out things users can do to improve. For example, one for normal users to find out whether they're ready to be an administrator, which can detect whether someone has reverted that user's contributions, whether that person needs to take part in more discussions (deletion discussions, etc.), and lots more. Minun 15:06, 30 August 2006 (UTC)[reply]

Of course, neither I nor the toolserver is under the control of the en: arbitration committee. But the things you did would have resulted in a block on every project I know. You have lost a huge part of the communities' trust, and so I can't imagine that anyone would use your tools. So I think the request is going to wait until trust in you is back and/or the block is finished. --DaB. 21:29, 20 September 2006 (UTC)[reply]

I want to delete the stuff that is pointless and should not be there.

A little bit more information would be nice :). --DaB. 20:07, 13 January 2007 (UTC)[reply]

I run Zorglbot on en-wiki; among other tasks, the bot parses en:Special:Shortpages, then retrieves and parses all pages mentioned there. The end result is stored at en:User:Zorglbot/Shortpages, and allows editors to see which pages still require processing (reverting vandalism, speedy-deleting empty pages, etc.) and which ones are legitimate (short disambiguation pages, etc.). This works relatively smoothly, except for two things: en:Special:Shortpages is quite expensive to generate, and so is cached and updated only about every 2-4 days; in addition, it returns only the 1000 shortest pages.

If this is possible, once the replication lag for en-wiki is below 4 days, I'd like to run this tool on the toolserver. I understand that since the text of pages is not directly available on the toolserver, the actual parsing of the pages will still require fetching them from en-wiki (possibly through the WikiProxy); however, being able to bypass the cache would be a huge improvement, in particular for vandalism fighting. Schutz 20:59, 18 January 2007 (UTC)[reply]

en: is not replicated at the moment because of overloading, but we will get a new server for en: in the near future. Should I give you an account now, or then? --DaB. 20:24, 18 February 2007 (UTC)[reply]
I can wait until then (... and I could not do anything for now anyway :-) Thanks, Schutz 13:44, 25 February 2007 (UTC)[reply]
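
A sketch of the kind of replica query that could replace the cached Special:Shortpages, assuming the standard MediaWiki page table and the usual toolserver MySQL setup (the host, database name, and credentials file below are placeholders):

    import os
    import MySQLdb  # MySQL client module commonly available on the toolserver

    # Placeholder connection settings; real toolserver values may differ.
    conn = MySQLdb.connect(
        host="sql",
        db="enwiki_p",
        read_default_file=os.path.expanduser("~/.my.cnf"),
    )
    cur = conn.cursor()

    # Shortest live articles, straight from the replica: no cache,
    # no 1000-row limit.
    cur.execute("""
        SELECT page_title, page_len
        FROM page
        WHERE page_namespace = 0
          AND page_is_redirect = 0
        ORDER BY page_len ASC
        LIMIT 5000
    """)
    for title, length in cur.fetchall():
        print(length, title.decode("utf-8"))

Because page_len lives in the replicated page table, neither the row cap nor the multi-day cache applies; page text itself would still have to be fetched separately, as noted above.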

I run the German Wikipedia MP3 podcast and an automatically generated Ogg RSS feed for spoken articles at de:Wikipedia:WikiProjekt_Gesprochene_Wikipedia/RSS-Feed. I host all of the services on my own server, which costs me 6 GB of monthly traffic and requires approx. 2 GB of disk space (for the German podcast alone). Altogether I get approx. 50,000 monthly hits and have about 150 listeners to the German MP3 podcast. I'd like to expand my services to the international Wikipedia. Besides hosting the necessary scripts, which would cause 1 GB of traffic and consume about 0.5 GB of disk space (for a very short-lived audio conversion cache, OGG->WAV->MP3), I would like to host MP3 versions of the original OGG spoken articles somewhere (not necessarily on the Toolserver). You will find more information in German on my discussion page at de:Benutzer_Diskussion:Jokannes#Toolserveraccount. de:User:Jokannes 19:19, 19 January 2007 (UTC)

What about the MP3 patents? Within the United States, royalties are required to create and publish an MP3 file. I personally advocate for greater adoption of Ogg/Vorbis/Theora and other FLOSS formats. Thanks, GChriss 20:03, 10 April 2007 (UTC)[reply]


Hello!

I run a very simple spell-checking bot using replace.py, with fixes.py holding the word pairs. My connection isn't reliable and breaks every so often while I run it. I also run interwiki.py... I don't know if that falls within the scope, but if it does... I would like an account.--The Joke النكتة‎ 05:07, 3 May 2007 (UTC)[reply]
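
For context, replace.py takes its word pairs from a fixes dictionary; a minimal entry looks roughly like the following. The exact key names are recalled from the old pywikipedia framework and should be treated as an assumption rather than a reference.

    # In fixes.py / user-fixes.py. The framework defines `fixes`;
    # it is initialised here only so the snippet stands alone.
    fixes = {}

    fixes['spelling-example'] = {
        'regex': True,                       # patterns are regular expressions
        'msg': {
            'en': u'Bot: fixing common misspellings',
        },
        'replacements': [
            (r'\brecieve\b',  r'receive'),
            (r'\boccured\b',  r'occurred'),
            (r'\bseperate\b', r'separate'),
        ],
    }

Something like "python replace.py -fix:spelling-example -start:!" would then run the fix over all pages; run from the toolserver, a flaky home connection stops being a factor.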

My name is Andrew Krizhanovsky. I am a researcher at St. Petersburg Institute of Information Technology and Automation, of the Russian Academy of Sciences. This is my home page.

I need an account in order to run Synarcher. This tool helps find semantically related terms (synonyms, hyponyms, etc.). The current version of Synarcher requires a lot of software (e.g. LAMP, MediaWiki), and it needs the parameters of a MySQL Wikipedia database: host IP, database name, user, password. I want to simplify the installation process and offer Synarcher as a Java Web Start application. --AKA MBG 13:59, 4 May 2007 (UTC)[reply]

I have a few tasks which I'd like to host on the toolserver, because my computer's internet connection doesn't have enough bandwidth to run them myself.

  • Firstly, two of the tasks are archival bots for the English Wikipedia; they are written in Perl and are currently under bot approval, waiting for this toolserver request to be accepted or denied. They will most likely be approved, as they will be a much-needed resource for the community.
  • Secondly, I would also like to test/run a few extra scripts written in Perl and PHP. One of them is a braille image generator created by a user from Wikimedia Commons, which automatically creates an SVG image that is the translation of the input value into braille. A link to the braille script is in my Wikipedia userspace. Another is an HTML-to-wiki-code converter for people who want to translate web page content into wiki markup.

For the reasons stated above, I am kindly asking for a toolserver account. Many thanks, Extranet talk 07:38, 13 May 2007 (UTC)[reply]

I would like to create a bot for calculating statistics of users' contributions inside categories. The bot would maintain rankings of users on the pages of larger categories. I once wrote a bot which maintained tables of sovereigns on calendar pages on the Polish wiki (see, for example, the table in pl:1852). I have also written WikiMiner, a static search tool for the DVD edition of the Polish Wikipedia. Olaf m 22:10, 7 May 2007 (UTC)[reply]
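
A sketch of the core replica query such a ranking bot could run, using the MediaWiki schema of that era (rev_user_text predates the actor table); the connection settings and category title are placeholders:

    import os
    import MySQLdb

    # Placeholder settings for the Polish Wikipedia replica.
    conn = MySQLdb.connect(
        host="sql",
        db="plwiki_p",
        read_default_file=os.path.expanduser("~/.my.cnf"),
    )
    cur = conn.cursor()

    # Count each user's edits to pages in a given category,
    # most active contributors first.
    cur.execute("""
        SELECT rev_user_text, COUNT(*) AS edits
        FROM revision
        JOIN categorylinks ON cl_from = rev_page
        WHERE cl_to = %s
        GROUP BY rev_user_text
        ORDER BY edits DESC
        LIMIT 50
    """, ("Matematyka",))  # category title, underscores instead of spaces
    for user, edits in cur.fetchall():
        print(edits, user.decode("utf-8"))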

I am currently maintaining de:Wikipedia:WikiProjekt Begriffsklärungsseiten/Top-BKS, a page similar to en:Wikipedia:Disambiguation pages with links.

I would like to query the toolserver database in order to provide an updated list more frequently, without depending on dumps. Not everything I currently do can be done without access to the text of the pages, but counting the links to a disambiguation page should be a simple SQL query. Secular mind 07:55, 8 May 2007 (UTC)[reply]

P.S.:

The script I'm using to create TOP-BKS is my own work as en:User:Bo_Lindbergh/dabalyze did not have all the features I wanted. It is a Ruby script and after the most recent changes I think it is now fit to be published, so I'm also looking for a place to host it under the GPL. de:Benutzer:secular mind/Problematische BKS is another page created using this framework, which is now also linked from de:Wikipedia:WikiProjekt Begriffsklärungsseiten. Secular mind 13:49, 9 May 2007 (UTC)[reply]
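
The link-counting query described above could look roughly like this, assuming disambiguation pages are identified by a category (the dewiki category name below is a guess) and reusing the connection pattern shown earlier:

    # Incoming article-namespace links per disambiguation page, using
    # only the page, categorylinks, and pagelinks tables.
    TOP_DAB_SQL = """
        SELECT dab.page_title, COUNT(*) AS incoming
        FROM page AS dab
        JOIN categorylinks ON cl_from = dab.page_id
                          AND cl_to = 'Begriffsklärung'
        JOIN pagelinks ON pl_namespace = dab.page_namespace
                      AND pl_title = dab.page_title
        JOIN page AS src ON src.page_id = pl_from
                        AND src.page_namespace = 0
        WHERE dab.page_namespace = 0
        GROUP BY dab.page_title
        ORDER BY incoming DESC
        LIMIT 200
    """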

I am a PHP programmer from Norway, and I want to create a tool to convert HTML to wikitext, so tables will be converted, <b> will become ''', and so on. The converted text should also be easy to copy, by putting it in a text field.
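
No toolserver specifics are needed to see the idea; here is a toy, regex-based sketch of such a converter (a real tool would want a proper HTML parser plus table handling, which regexes alone cannot do reliably):

    import re

    # Ordered (pattern, replacement) rules for a small subset of HTML.
    RULES = [
        (re.compile(r"<b>(.*?)</b>", re.I | re.S), r"'''\1'''"),
        (re.compile(r"<strong>(.*?)</strong>", re.I | re.S), r"'''\1'''"),
        (re.compile(r"<i>(.*?)</i>", re.I | re.S), r"''\1''"),
        (re.compile(r"<em>(.*?)</em>", re.I | re.S), r"''\1''"),
        (re.compile(r'<a\s+href="([^"]*)"[^>]*>(.*?)</a>', re.I | re.S), r"[\1 \2]"),
        (re.compile(r"<br\s*/?>", re.I), "\n"),
    ]

    def html_to_wikitext(html):
        """Convert simple inline HTML markup to wikitext."""
        for pattern, repl in RULES:
            html = pattern.sub(repl, html)
        return html

    print(html_to_wikitext('A <b>bold</b> word and a <a href="http://example.org">link</a>.'))
    # -> A '''bold''' word and a [http://example.org link].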

I am user Mashiah Davidson, mostly participating in the ru:Russian Wikipedia. My primary interest in Wikipedia is not the content of the articles itself but the links between articles, i.e. I am concentrating on issues such as page wikification and the orphaned-pages problem.

In the Russian Wikipedia, an orphaned page has a strict definition which is more correct than the algorithm used to create the lonelypages list. This definition does not take into account links from any namespace except the main (zero) namespace, links from disambiguation pages, or links from some other pages such as chronological articles.

I have created and optimized a sequence of SQL statements collecting the orphaned-pages list according to this correct definition. The set uses the tables page, categorylinks and pagelinks to create a list of titles of orphaned articles. The set of scripts is supposed to run on the toolserver weekly or daily (depending on the real situation) and then be used to put a template into each orphaned page, constructing an Orphaned Pages category and giving some way to help with the adoption of orphans.

I have tested all the scripts on my local machine with an SQL dump of the tables mentioned above and found it takes minutes to construct the resulting list; the exact time depends on HDD speed and table format. I have tested it with the MyISAM and InnoDB engines and found it works 1.5 times faster with the former. With a 5400 rpm HDD I need about 15 min (with MyISAM; 24 min with InnoDB) to get the result on a laptop with a 1.5 GHz Pentium M. For a machine with a 2.8 GHz Pentium IV and a 7200 rpm HDD, the list construction takes less than 10 min (15 with InnoDB).

I am OK with the timing of the script; the only reason I am requesting the account is that I do not want to download tables like pagelinks frequently, because of the traffic charges I pay.

In addition to the main part of my request, I should add that because of the large number of technical templates linking the meta part of Wikipedia (pages to be improved, discussed, deleted, etc.), the dead-end pages list has the same problems as the lonelypages list, and the technology used for the new list generation can easily be adapted to generate a correct dead-end pages list too.

After some real-life testing, and maybe additional optimization of the script, it may be used for other languages as well; so I hope my request will not be declined. Mashiah Davidson 12:03, 11 May 2007 (UTC)[reply]

I forgot to provide a link to the script, sorry. It is here. Mashiah Davidson 22:55, 11 May 2007 (UTC)[reply]
I've made some changes to the script, and now 15 minutes is enough to restore from the dump and process it on my Dell Inspiron 6000; querying fits into 7 minutes instead of 15. One more important thing: I've also made a minor bugfix to avoid the influence of links to identically named pages in other namespaces (like links to discussion pages from some templates). Mashiah Davidson 18:32, 13 May 2007 (UTC)[reply]
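
For reference, a sketch of the shape of the orphaned-pages query described in this request, written as a Python string constant for a toolserver script. The disambiguation category name is a placeholder, and excluding chronological articles would take one more NOT EXISTS clause of the same form:

    # Articles with no incoming links from ordinary articles in the main
    # namespace. Links from pages in the disambiguation category are
    # ignored, per the Russian Wikipedia definition quoted above.
    ORPHANS_SQL = """
        SELECT p.page_title
        FROM page AS p
        WHERE p.page_namespace = 0
          AND p.page_is_redirect = 0
          AND NOT EXISTS (
              SELECT 1
              FROM pagelinks
              JOIN page AS src ON src.page_id = pl_from
              WHERE pl_namespace = 0
                AND pl_title = p.page_title
                AND src.page_namespace = 0
                AND NOT EXISTS (
                    SELECT 1
                    FROM categorylinks
                    WHERE cl_from = src.page_id
                      AND cl_to = 'Disambiguation_pages'  -- placeholder name
                )
          )
    """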

I would like to use the toolserver to run some bots for a few of the smaller projects. At the moment, this is just q:User:WelcomeBot (information listed on userpage), but will probably be expanded at a later date to include more bots (like a VandalBot). Please see my meta userpage or the bot's userpage for more information. Cbrown1023 talk 23:15, 18 May 2007 (UTC)[reply]

I'd like to get an account on the Toolserver, mainly to run SQL queries on the live databases. At the Italian Wikipedia we have many "regular requests" (please see it:Wikipedia:Elenchi generati offline); an account would be very useful (no need to wait for the next dump to search for spam links and the like). I also run a bot on that wiki that currently uses query.php to get lists of pages to work on, but it could simply use a direct MySQL connection. I also have some programming experience (Java, a bit of PHP, Python). --.anaconda 00:37, 19 May 2007 (UTC)[reply]

Hi, I'm looking to create information pages to contextualise the geological timescale. I started with this, but Wikipedia doesn't have the functions enabled to allow dynamic page generation, which this server would allow. I may also look to integrate my citation formatter to a Wikipedia interface.

Many thanks, Verisimilus 09:51, 19 May 2007 (UTC)[reply]

Hi, I am an administrator on pl.wikinews, and I currently run K.J.Bot from my own computer; now I would like to run it off the toolserver. The bot adds interwikis on some Wikipedias and does a few other things on pl.wikinews (moving, editing, adding a category and a template). (See also these contributions: [5], [6], [7], [8])

Thank you, Krzysiu Jarzyna 10:44, 19 May 2007 (UTC)[reply]

Hi there. I'm requesting an account on the toolserver to use for developing a couple of tools / bots for enwikiversity (though they may well be expanded to a larger scope). Now that Wikiversity has grown to more than 31,000 pages and more than 2,000 files, performing housekeeping tasks, such as categorizing pages and dealing with improperly tagged images or images that violate our (yet to be completed) fair use policies, is becoming increasingly difficult, and, as such, we're in quite desperate need of some tools to assist with these tasks. Naturally, such tools could be implemented, as they have been in the past, using crawlers; however, I find that solution far from ideal due to its slowness and the unnecessary load it puts on the servers. The tasks can be performed far more efficiently through SQL queries, which is what I plan to do.

As for my credentials, I have been an active sysop on the English-language Wikipedia since June of 2006 (almost a year ago). I have developed a plethora of client-side applications for MediaWiki including VandalProof, VP2, and WikiMonitor, and I am currently working with Yuri Astrakhan to develop, test, and expand the API bot framework. I am well versed in a variety of programming languages, with my background primarily in Java and BASIC, and am a current undergraduate compsci student.

Please don't hesitate to contact me if you have any questions. I can be reached on my enwiki talk page, my enwikiversity talk page, and (likely the fastest way) by e-mail. Thanks in advance. AmiDaniel 21:44, 23 May 2007 (UTC)[reply]
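
As one concrete example of replacing a crawler with a query: listing main-namespace pages that carry no category at all, via an anti-join on categorylinks (the replica database name and connection settings are assumptions):

    import os
    import MySQLdb

    # Placeholder settings; the wikiversity replica name is an assumption.
    conn = MySQLdb.connect(
        host="sql",
        db="enwikiversity_p",
        read_default_file=os.path.expanduser("~/.my.cnf"),
    )
    cur = conn.cursor()

    # Uncategorized articles: pages with no categorylinks row at all.
    cur.execute("""
        SELECT page_title
        FROM page
        LEFT JOIN categorylinks ON cl_from = page_id
        WHERE page_namespace = 0
          AND page_is_redirect = 0
          AND cl_from IS NULL
        LIMIT 500
    """)
    for (title,) in cur.fetchall():
        print(title.decode("utf-8"))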

I'm an admin on the English Wikipedia and a Perl programmer. I am in the process of writing a bot in Perl (en:User:polbot) that uses perlwikipedia to create and update articles on politicians. I would like space to run this bot. What do you say?

Please finish the bot first. The toolserver is not a good place to test a bot, because an out-of-control bot gets blocked quickly, and many other bots with it. --DaB. 18:45, 12 June 2007 (UTC)[reply]

Hello, I am a Linguistics student and language instructor at Portland State University. I am working on a grammar aid/checker for Wikipedia. Editors will be able to sort by article popularity or number of errors, and use a friendly interface that highlights suspect errors in an article and features inline editing to produce a corrected, ready-to-paste version. The tool is also designed to be an easy-to-use learning tool for ESL students to observe grammar features within any article of interest. Currently I am playing offline with an older dump of the Wikipedia database, but I would like access to the live data whenever possible. As a basis I am using the Link Grammar Parser that is included in AbiWord. I prefer Ruby, but often write PHP as well. My only related online example is a graphical interface for a part-of-speech tagger to make grammar investigations easier: http://orderofr.net/compling/postagger. Thank you for your time! --Notbot 10:15, 28 May 2007 (UTC)[reply]


Currently working on pulling statistics data for the en:Wikipedia:WikiProject Vandalism studies/Study2. Right now I generate a lot of requests against api.php. I'm also hoping to create a few web tools for compiling data, and expect to continually lurk around the bot requests page looking for ways to help out. It would also be a good location to store copies of the Perl libraries I've worked on which function against api.php.

Extremely strong SQL and unix backgrounds. Programming in Perl, Java, and PHP. Autocracy 15:21, 5 June 2007 (UTC)[reply]

No problem with that, but en: is not up to date on the toolserver at the moment. Would you like to get the account now, or when en: is replicated again? --DaB. 18:40, 12 June 2007 (UTC)[reply]