Single login/Hard way
Here is a possible strategy for transitioning to single login without causing too much chaos. I'll bet the algorithm isn't the hard part, so this may not help anyone much, but hey, it was worth a shot!
We assume that each wiki points to some file(s) containing its user database, and that setting up single login is a matter of pointing all wikis to the same file(s). This page proposes an algorithm for merging a wiki's user database with a global user database. It probably makes sense to try it on some smaller wikis first (Simple, Malay?).
Initialisation
first_wiki := the largest wiki by number of users { ... likely en }
global_users := users of first_wiki
merged_wikis := empty set UNION first_wiki
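As a rough illustration, the initialisation could look like this in Python. The data layout is an assumption made for the sketch and is not part of the proposal: `all_wikis` is a hypothetical dict from wiki code to a list of local user names, and a global user id is simply a position in the `global_users` list.

```python
def initialise(all_wikis):
    """Seed the global user database from the wiki with the most users.

    all_wikis is a hypothetical mapping: wiki code -> list of user names.
    """
    # Pick the largest wiki by number of users.
    first_wiki = max(all_wikis, key=lambda wiki: len(all_wikis[wiki]))
    # Copy its users; in this sketch a global id is the index in this list.
    global_users = list(all_wikis[first_wiki])
    # The first wiki counts as merged from the start.
    merged_wikis = {first_wiki}
    return first_wiki, global_users, merged_wikis
```

Copying the list keeps the first wiki's own data untouched while the global database grows during later merges.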
Performing a Merge
For each wiki we want to merge (after the first):
- Translate text in /Messages
- Wait for users to interwiki/correct their stuff
- Apply Merge_Logins on the wiki (the algorithm)
- Replace login pages with new helpful pages (see /Messages)
unmerged_wikis := all_wikis - first_wiki
Merge_Logins(unmerged_wikis, merged_wikis, global_users)
Before Merging
To avoid creating chaos, we first run a version of the algorithm which does not make any changes to the databases. This would be useful for warning users ahead of time if their user name would change, or if we think they are the same user on some other wiki. See #Merge_Logins for more details.
- current_wiki - the wiki we want to merge
- merged_wikis - wikis that have already been merged
- global_users - global user database
Before_Merging(current_wiki, merged_wikis, global_users)
  current_wiki_users := users in current_wiki
  for id = 0 to (size of current_wiki_users) - 1
    id_2 := -1 { the new user id }
    current_user := current_wiki_users[id]
    current_name := name of current_user
    new_name := current_name
    { See if we know the user or not }
    (id_2, confidence) := Identify_User(current_user, current_wiki, merged_wikis, global_users)
    if id_2 > -1 then
      { Fr:LeFakeUser is the same user as En:TheFakeUser }
      new_name := (name of global_users[id_2])
    else
      { If there already is a global user with the same name (e.g. "LeFakeUser"),
        we have a conflict (since this is a new user and not somebody we know),
        so we would have to rename the user to something else like "LeFakeUser_2" }
      if current_name = conflict_name such that conflict_name is the name of a user in global_users then
        old_name := current_name
        new_name := Rename_User(old_name, global_users, unmerged_wikis)
        { dry run: the name of current_user is not changed here }
      endif { current_name }
    end if { id_2 }
    { no assert on id_2 here: it stays -1 for new users, because the dry run allocates no global ids }
    write some helpful message (mentioning new_name) on the user_page of current_user
  end for { id }
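The dry run above can be sketched in Python as a function that only builds a report. Everything about the data layout is an assumption for illustration: `known_as` is a hypothetical dict holding the result of Identify_User (local name to global name), and the "_2" suffix stands in for Rename_User. Unlike the pseudocode, this sketch also reserves each proposed name so that two local users are never offered the same one.

```python
def preview_merge(local_names, known_as, global_names):
    """Return {local_name: proposed_global_name} without modifying any input.

    local_names  -- names of the users on the wiki to be merged
    known_as     -- hypothetical {local_name: global_name} for identified users
    global_names -- names already present in the global database
    """
    taken = set(global_names)      # a copy, so the caller's data stays untouched
    report = {}
    for name in local_names:
        if name in known_as:       # already identified as a global user
            report[name] = known_as[name]
            continue
        proposed = name
        while proposed in taken:   # the name belongs to somebody else
            proposed += "_2"
        taken.add(proposed)        # reserve it for the rest of the preview
        report[name] = proposed
    return report
```

The report can then drive the warning messages: any entry whose proposed name differs from the local name is a user who should be told in advance.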
Merge_Logins
This is the front end. It merges the user databases of the unmerged_wikis with the global database.
The three basic cases we handle:
- We know the user, so we don't do very much.
- We don't know the user...
  - ...and he has a name we haven't seen before, so we create a new global user with the same name.
  - ...and he has a name already in use, so we create a new global user with some variant on the name.
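A minimal Python sketch of these three cases, combined with the id-offset step that the Merge_Logins pseudocode in this section uses to avoid id collisions, might look as follows. The layout is assumed for illustration only: a user id is an index into a list of names, `history` is a flat list of local user ids, `known_as` is a hypothetical dict holding the result of Identify_User, and the "_2" suffix stands in for Rename_User.

```python
def collision_free_offset(n_local, n_global):
    """Smallest power of 10 that is >= 1 + n_local + n_global."""
    needed = 1 + n_local + n_global
    offset = 1
    while offset < needed:
        offset *= 10
    return offset


def merge_wiki(local_names, history, known_as, global_names):
    """Merge one wiki into the global database (both lists are mutated).

    local_names  -- list of names; local id = index
    history      -- list of local user ids attributed in page histories
    known_as     -- hypothetical {local_id: global_id} for identified users
    global_names -- list of names; global id = index
    """
    offset = collision_free_offset(len(local_names), len(global_names))
    # Park every local id above any id that can occur after the merge.
    for i, uid in enumerate(history):
        history[i] = uid + offset
    for local_id, name in enumerate(local_names):
        if local_id in known_as:
            # Case 1: we know the user, so reuse the existing global id.
            global_id = known_as[local_id]
        else:
            # Case 3: the name is taken by somebody else, so pick a variant.
            while name in global_names:
                name += "_2"
            # Cases 2 and 3: create a new global user.
            global_names.append(name)
            global_id = len(global_names) - 1
        # Re-attribute the parked ids of this user to the global id.
        for i, uid in enumerate(history):
            if uid == local_id + offset:
                history[i] = global_id
    return history
```

Without the parking step, rewriting local id 0 to global id 2 and later rewriting local id 2 would wrongly touch the entries that had just been converted.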
{ Here we use Fr as the example of a wiki we want to merge, and En as a wiki that is already merged. }
Merge_Logins(unmerged_wikis, merged_wikis, global_users) { recursive }
  { base case }
  if size of unmerged_wikis = 0 then
    return
  endif
  current_wiki := unmerged_wikis[0]
  current_wiki_users := users in current_wiki
  { Calculate an offset so that we can avoid user id collisions later on }
  offset := 1 + size of current_wiki_users + size of global_users
  offset := ceiling offset to nearest power of 10 { for debugger-friendliness }
  { Temporarily move all id numbers in the article history to an offset location
    so that they don't collide when we perform the merge }
  for each record in current_wiki
    for each user_id in record { history }
      replace user_id with (user_id + offset)
    end for
  end for { record }
  for id = 0 to (size of current_wiki_users) - 1
    id_2 := -1 { the new user id }
    current_user := current_wiki_users[id]
    current_name := name of current_user
    { See if we know the user or not }
    (id_2, confidence) := Identify_User(current_user, current_wiki, merged_wikis, global_users)
    if id_2 > -1 then
      { Fr:LeFakeUser is the same user as En:TheFakeUser }
      old_name := current_name
      new_name := (name of global_users[id_2])
      if old_name != new_name then
        renamed_users[current_wiki][old_name] := new_name
      endif { old_name }
    else
      { We don't know Fr:LeFakeUser, so he is a new global user }
      id_2 := last_global_user_id + 1
      last_global_user_id := id_2
      { If there already is a global user with the same name (e.g. "LeFakeUser"),
        we have a conflict (since this is a new user and not somebody we know),
        so we have to rename the user to something else like "LeFakeUser_2" }
      if current_name = conflict_name such that conflict_name is the name of a user in global_users then
        old_name := current_name
        new_name := Rename_User(old_name, global_users, unmerged_wikis)
        (name of current_user) := new_name
        renamed_users[current_wiki][old_name] := new_name
      endif { current_name }
      global_users[id_2] := current_user
    end if { id_2 }
    assert id_2 > -1 { sanity check }
    { Re-attribute things by/about Fr:LeFakeUser to Global:TheFakeUser }
    for each record in current_wiki
      replace (id + offset) with id_2
    end for { record }
  end for { id }
  { Perform merge on the next wiki }
  unmerged_wikis := unmerged_wikis - current_wiki
  merged_wikis := merged_wikis + current_wiki
  Merge_Logins(unmerged_wikis, merged_wikis, global_users)
Identify_User
Determines if we already know a user from the global_user database, and who that user is. Returns a tuple (global_id, confidence). If we don't know the user, we return (-1,0).
We use several criteria to determine whether two users (for example, one on fr and another on en) are actually the same person:
- If there is a two-way interwiki between the user pages, they are definitely the same person.
- If there is a one-way interwiki, they might be the same person
- If they have the same name, they might be the same person
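- If there is a one-way interwiki and they also have the same name, we are more confident that they are the same person
- If their passwords also match, we are more confident still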
Identify_User(current_user, current_wiki, merged_wikis, global_users)
  MAX_CONFIDENCE := 5
  candidates := EMPTY_SET { set of (id, confidence) tuples }
  user_page := the user page of current_user in current_wiki
  for each interwiki in the interwiki links of user_page
    confidence := 0
    if interwiki links to some user_page_2 in one of merged_wikis then
      if user_page_2 interwikis back to user_page then
        confidence := MAX_CONFIDENCE { max }
      else
        confidence := 1
      end if { link back }
      id_2 := id of user for user_page_2
      candidates := insert (id_2, confidence) in candidates
    endif { interwiki links }
  end for { interwiki }
  { If we already know someone for sure, return immediately; no point searching further. }
  if there is exactly one (id_2, confidence) in candidates such that confidence = MAX_CONFIDENCE then
    return (id_2, confidence)
  else if there are more than one
    write to error file
    return any one of them
  endif { exactly one MAX_CONFIDENCE }
  { Now try searching for users with the same name }
  for each global_user in global_users
    if name of global_user = name of current_user then
      id_2 := id of global_user
      if there is an (id_2, confidence) in candidates then
        replace (id_2, confidence) with (id_2, confidence+1) in candidates
      else
        insert (id_2, 1) in candidates
      endif { there is an id_2 }
    endif
  end for { global users }
  { Finally try matching the passwords }
  current_password := password for user in current_wiki
  for each (id_2, confidence) in candidates
    if current_password = password for id_2 in global_users then
      replace (id_2, confidence) with (id_2, confidence+1) in candidates
    endif
  end for { candidates }
  if candidates = EMPTY_SET then
    return (-1, 0)
  else
    return (id_2, confidence) that maximises confidence in candidates
  endif { candidates empty }
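The scoring logic can be tried out in isolation with a small Python sketch. The inputs are assumptions made for the example: the evidence is passed in precomputed (which global ids are linked from the user page and whether they link back, which share the name, which share the password), so the sketch does not touch user pages or stored passwords. Where several candidates have a two-way interwiki, it simply returns the first one, where the pseudocode would also write to the error file.

```python
MAX_CONFIDENCE = 5


def identify_user(interwiki_links, same_name_ids, same_password_ids):
    """Return (global_id, confidence), or (-1, 0) if nobody matches.

    interwiki_links   -- list of (global_id, links_back) pairs
    same_name_ids     -- global ids whose name equals the local name
    same_password_ids -- global ids whose password matches (precomputed)
    """
    candidates = {}                    # global_id -> confidence
    for global_id, links_back in interwiki_links:
        score = MAX_CONFIDENCE if links_back else 1
        candidates[global_id] = max(candidates.get(global_id, 0), score)

    # A two-way interwiki settles the question immediately.
    certain = [gid for gid, c in candidates.items() if c == MAX_CONFIDENCE]
    if certain:
        return certain[0], MAX_CONFIDENCE

    # A matching name adds a candidate or strengthens an existing one.
    for gid in same_name_ids:
        candidates[gid] = candidates.get(gid, 0) + 1

    # A matching password only strengthens existing candidates.
    for gid in same_password_ids:
        if gid in candidates:
            candidates[gid] += 1

    if not candidates:
        return -1, 0
    best = max(candidates, key=candidates.get)
    return best, candidates[best]
```

For example, a one-way interwiki to global user 7 plus a matching name gives `(7, 2)`, while a two-way interwiki gives `(7, 5)` regardless of the other evidence.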
Rename_User
Finds a variant of name which is not already in global_users. The important thing is that it not be in global_users, but to be helpful we can also try to pick a name the user already knows.
Rename_User(name, global_users, unmerged_wikis) { recursive }
  { base case }
  if (name is not in global_users) then
    return name
  endif { base case }
  { search for a familiar name; user_page is the user page of the user being renamed }
  new_name := ""
  if user_page interwikis to user_page_2
      such that user_page_2 is a user page in one of all_wikis
      and such that user_page_2 interwikis back to user_page
      and such that the name of the user for user_page_2 is not in global_users then
    new_name := name of the user for user_page_2
  endif
  { If a familiar name was not found, we do something ugly and simple.
    This could conceivably result in names like LeFakeUser_2_2_2_2_2_2 ... Oh well. }
  if new_name = "" then
    new_name := Rename_User(name + "_2", global_users, unmerged_wikis)
  endif { familiar_name }
  assert (there is no global_user whose name = new_name)
  return new_name
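The same idea can be written without recursion. The Python sketch below makes some assumptions for illustration: `global_names` is a set of names already taken, and `familiar_names` is a hypothetical list of names the user already has on other wikis (found through two-way interwikis), passed in precomputed.

```python
def rename_user(name, global_names, familiar_names=()):
    """Return a variant of name that is not in global_names."""
    if name not in global_names:
        return name                    # base case: nothing to do
    # Prefer a name the user already knows from another wiki.
    for candidate in familiar_names:
        if candidate not in global_names:
            return candidate
    # Otherwise do the ugly and simple thing: keep appending "_2".
    new_name = name
    while new_name in global_names:
        new_name += "_2"
    return new_name
```

If both "LeFakeUser" and "LeFakeUser_2" are taken and no familiar name is available, this returns "LeFakeUser_2_2", the same kind of result the recursive version produces.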