IP Editing: Privacy Enhancement and Abuse Mitigation/Privacy enhancement

From Meta, a Wikimedia project coordination wiki

IP Address

What can an IP address tell you?

IP addresses can reveal a wealth of information to anyone who is even slightly technically inclined. With the advent of modern search engines, much of this information is readily accessible simply by searching for an IP address. Here's a WHOIS query performed on a WMF Office IP address to show what data an IP address exposes:

$ whois 198.73.209.241 

NetRange:       198.73.209.0 - 198.73.209.255
CIDR:           198.73.209.0/24
NetName:        WMFOIT
NetHandle:      NET-198-73-209-0-1
Parent:         NET198 (NET-198-0-0-0-0)
NetType:        Direct Assignment
OriginAS:       AS11820
Organization:   Wikimedia Foundation, Inc. (WF-44)
RegDate:        2013-11-21
Updated:        2013-11-21
Ref:            https://rdap.arin.net/registry/ip/198.73.209.0

OrgName:        Wikimedia Foundation, Inc.
OrgId:          WF-44
Address:        1 Montgomery Street
Address:        16th Floor
City:           San Francisco
StateProv:      CA
PostalCode:     94105
Country:        US
RegDate:        2013-09-10
Updated:        2017-09-25
Ref:            https://rdap.arin.net/registry/entity/WF-44

OrgAbuseHandle: WOI-ARIN
OrgAbuseName:   WMF Office IT
OrgAbusePhone:  +1-415-839-6885 
OrgAbuseEmail:  officeit-bgp@wikimedia.org
OrgAbuseRef:    https://rdap.arin.net/registry/entity/WOI-ARIN

OrgNOCHandle: WOI-ARIN
OrgNOCName:   WMF Office IT
OrgNOCPhone:  +1-415-839-6885 
OrgNOCEmail:  officeit-bgp@wikimedia.org
OrgNOCRef:    https://rdap.arin.net/registry/entity/WOI-ARIN

OrgTechHandle: WOI-ARIN
OrgTechName:   WMF Office IT
OrgTechPhone:  +1-415-839-6885 
OrgTechEmail:  officeit-bgp@wikimedia.org
OrgTechRef:    https://rdap.arin.net/registry/entity/WOI-ARIN

The WHOIS record above tells you:

  • The company this IP is registered to. In many cases, especially for smaller companies, this will be the ISP of the company currently using the IP address and not the company itself.
  • The date it was registered
  • The exact address, phone number, email address and other information for that company
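WHOIS output is plain "Field: value" text, so it is easy to process by script. The following is a minimal Python sketch, not an official parser: it reads a few lines copied from the record above and collects them into a dictionary. The function name parse_whois is made up for this illustration, and real WHOIS output differs between registries, so treat the field names as specific to this ARIN example.

```python
# Sample lines copied from the WHOIS output above (abridged).
SAMPLE = """\
NetRange:       198.73.209.0 - 198.73.209.255
CIDR:           198.73.209.0/24
NetName:        WMFOIT
OriginAS:       AS11820
Organization:   Wikimedia Foundation, Inc. (WF-44)
City:           San Francisco
Country:        US
OrgAbuseEmail:  officeit-bgp@wikimedia.org
"""


def parse_whois(text):
    """Return a dict mapping each WHOIS field name to a list of its values."""
    fields = {}
    for line in text.splitlines():
        # Skip blank lines and comment lines ('#' or '%' prefixes).
        if not line.strip() or line.startswith(("#", "%")):
            continue
        # Split at the first colon only, so values such as URLs stay whole.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        # A field such as Address or RegDate can appear more than once.
        fields.setdefault(key.strip(), []).append(value.strip())
    return fields


record = parse_whois(SAMPLE)
print(record["Organization"][0])
print(record["CIDR"][0])
```

Values are kept in lists because the same field name can repeat within one record, as Address and RegDate do above.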

Below are some other terms associated with IP addresses.

rDNS (reverse domain name system)

On the internet, the DNS (Domain Name System) is like a phone book. It converts names like en.wikipedia.org into IP addresses like 198.35.26.96. rDNS does the opposite; it converts IP addresses into names. For websites, these names are sometimes the familiar name of the site, and sometimes they are not. As an example, consider 198.35.26.96, Wikipedia's IP for people in western North America (we are using a tool called host, which comes installed by default on Linux and macOS):

$ host 198.35.26.96
96.26.35.198.in-addr.arpa domain name pointer text-lb.ulsfo.wikimedia.org.

As you can see, the reverse DNS doesn't match en.wikipedia.org. However, it still shows administrative information about the IP, namely that it is a Wikimedia IP and that it belongs to ulsfo, Wikimedia's San Francisco data center. Although text-lb.ulsfo.wikimedia.org is not the same as en.wikipedia.org, it is still a domain. You can even go to http://text-lb.ulsfo.wikimedia.org, although your web browser will get confused and think someone is eavesdropping on the connection. You can also convert it back to an IP address:

$ host text-lb.ulsfo.wikimedia.org
text-lb.ulsfo.wikimedia.org has address 198.35.26.96
text-lb.ulsfo.wikimedia.org has IPv6 address 2620:0:863:ed1a::1

There is no requirement that a reverse DNS entry exists, or that the reverse DNS entry is convertible back to an IP address, but it is fairly common.
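The in-addr.arpa name shown in the host output is just the IP address with its four parts reversed. As a rough Python sketch using the standard ipaddress and socket modules: reverse_name builds that name without touching the network, while forward_confirms (a helper name invented here, and IPv4-oriented) does the live round trip described above and therefore needs a working resolver.

```python
import ipaddress
import socket


def reverse_name(ip):
    """Return the DNS name that is queried for a reverse lookup of `ip`."""
    return ipaddress.ip_address(ip).reverse_pointer


def forward_confirms(ip):
    """Look up the reverse DNS name of `ip`, then check that the name
    resolves back to the same address. Performs real DNS queries."""
    try:
        name = socket.gethostbyaddr(ip)[0]
        addresses = {info[4][0] for info in socket.getaddrinfo(name, None)}
    except OSError:
        # No reverse entry, or the name does not resolve.
        return False
    return ip in addresses


print(reverse_name("198.35.26.96"))  # 96.26.35.198.in-addr.arpa
```

Only reverse_name is called here, so the snippet runs offline; forward_confirms returns False when either lookup fails, which is consistent with the point that neither direction is guaranteed to exist.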

For consumer IP addresses, there is a lot of variety in what the reverse DNS contains. Most commonly, though, it will include the name of the ISP and some sort of encoding of the IP. For example, consider 24.123.4.5 (an IP belonging to Charter Communications, formerly known as Road Runner High Speed Online):

$ host 24.123.4.5
5.4.123.24.in-addr.arpa domain name pointer rrcs-24-123-4-5.central.biz.rr.com.

This does not tell you much beyond the ISP and the IP. Reverse DNS is probably more useful for institutions, where it will sometimes contain information on the role of the IP. For example, a library IP might have the word library in the reverse DNS.
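To illustrate the "encoding of the IP" in a consumer reverse DNS name, here is a small Python sketch that pulls a dash- or dot-separated IPv4 address out of a hostname such as the rr.com example. The pattern is an assumption based on that single example; ISPs use many other schemes (hexadecimal, reversed octets, no encoding at all), which this sketch does not attempt to cover.

```python
import ipaddress
import re

# Four decimal groups separated by '-' or '.', e.g. "24-123-4-5".
EMBEDDED_IPV4 = re.compile(
    r"(?<!\d)(\d{1,3})[-.](\d{1,3})[-.](\d{1,3})[-.](\d{1,3})(?!\d)"
)


def embedded_ipv4(hostname):
    """Return the IPv4 address encoded in `hostname`, or None if none is found."""
    match = EMBEDDED_IPV4.search(hostname)
    if match is None:
        return None
    try:
        return ipaddress.IPv4Address(".".join(match.groups()))
    except ValueError:
        # The digits looked like an address but were out of range.
        return None


print(embedded_ipv4("rrcs-24-123-4-5.central.biz.rr.com"))  # 24.123.4.5
print(embedded_ipv4("text-lb.ulsfo.wikimedia.org"))         # None
```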

GeoIP

Many companies compile lists that map IP addresses to their approximate location in the world. IP addresses can change hands between users quite often, so this is a fraught task. These lists are usually fairly accurate at the country level. In the United States and Canada they are often accurate down to the state or province level. Beyond that they can be quite inaccurate. The best-known such list is the one provided by MaxMind. Wikimedia uses the MaxMind list when deciding to show certain CentralNotices only to people in certain areas (for example, to advertise a local meetup).

There are several (kind of sketchy) sites on the internet that look up someone's location given an IP address. https://ipstack.com/ is one example.

Blacklists

Various groups maintain lists of IP addresses, often of IPs that send spam or are otherwise malicious. Knowing whether an IP is on such a list can help determine whether it is likely to be an open proxy or otherwise malicious.
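Many such lists are published over DNS (often called DNSBLs): to check an IPv4 address, you reverse its four parts, append the list's zone, and look the resulting name up; if the name resolves, the address is listed. The Python sketch below shows that convention under stated assumptions. The zone dnsbl.example.org is a placeholder, not a real list, and both function names are invented for this illustration. Each real list has its own zone, usage terms and return codes.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to query when checking IPv4 address `ip` against `zone`."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """Return True if the query name resolves. Performs a real DNS query.
    A lookup failure of any kind is reported as "not listed"."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except OSError:
        return False
    return True


# "dnsbl.example.org" is a hypothetical zone used only for illustration.
print(dnsbl_query_name("24.123.4.5", "dnsbl.example.org"))
# 5.4.123.24.dnsbl.example.org
```

Only the name construction is exercised here; is_listed is included to show the shape of a live check and is not called.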

AS (Autonomous System) number

For administration and routing purposes, organizations that participate in the internet are assigned an AS number. Usually this info is included in the output of whois, but there are also dedicated tools to look up the AS number for a given IP. This info is not all that important from a privacy perspective, but it does give interesting information on the organization that owns the IP address. For example, Wikimedia has the AS number AS14907.

Traceroute

Traceroute is a tool to see the path a message takes when you send it somewhere on the internet. While it is useful for debugging, it mostly isn't all that useful from a privacy perspective. It can help find rough geographic locations, as the hops along the path usually get closer to the destination, and often these hops have location info in their reverse DNS. However, GeoIP databases are generally much better at this task.

For example, here is a traceroute from the west coast of North America to tools.wmflabs.org (which is located in Ashburn, Virginia):

$ traceroute tools.wmflabs.org
traceroute to tools.wmflabs.org (185.15.56.11), 30 hops max, 60 byte packets
 1  _gateway (192.168.0.1)  344.557 ms  344.503 ms  344.487 ms
[... early hops omitted for privacy]
 5  sea-b2-link.telia.net (213.248.66.92)  308.709 ms  309.344 ms  308.181 ms
 6  chi-b21-link.telia.net (62.115.117.49)  653.177 ms  138.452 ms  151.218 ms
 7  * * nyk-bb2-link.telia.net (62.115.137.58)  158.432 ms
 8  ash-bb3-link.telia.net (62.115.141.244)  159.748 ms  159.100 ms *
 9  ash-b1-link.telia.net (62.115.143.121)  161.286 ms ash-b1-link.telia.net (62.115.143.79)  161.425 ms ash-b1-link.telia.net (62.115.143.121)  148.632 ms
10  wikimedia-ic-308845-ash-b1.c.telia.net (80.239.132.226)  148.194 ms  172.348 ms  171.331 ms
11  toolforge.org (185.15.56.11)  170.768 ms  204.919 ms  204.886 ms

As you can see, some of the reverse DNS entries contain the string "ash", hinting that messages are passing through Ashburn. In this particular case, this is more useful than GeoIP, which seems to think toolforge is in the Netherlands for some reason.
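Spotting such hints can be automated in a rough way. The Python sketch below takes a few hop lines copied from the traceroute above, extracts each hostname, and keeps those that contain a given label such as "ash". Both helper names are invented for this example, and treating a short label as a location code is a guess about one carrier's naming habits, not a reliable rule.

```python
import re

# Hop lines copied from the traceroute output above.
HOPS = """\
 5  sea-b2-link.telia.net (213.248.66.92)  308.709 ms  309.344 ms  308.181 ms
 6  chi-b21-link.telia.net (62.115.117.49)  653.177 ms  138.452 ms  151.218 ms
 7  * * nyk-bb2-link.telia.net (62.115.137.58)  158.432 ms
 8  ash-bb3-link.telia.net (62.115.141.244)  159.748 ms  159.100 ms *
10  wikimedia-ic-308845-ash-b1.c.telia.net (80.239.132.226)  148.194 ms  172.348 ms  171.331 ms
"""

# A hostname followed by its IPv4 address in parentheses.
HOP = re.compile(r"([A-Za-z0-9_.-]+) \((\d{1,3}(?:\.\d{1,3}){3})\)")


def hop_hosts(output):
    """Return (hostname, ip) pairs in path order, dropping exact repeats."""
    seen = []
    for pair in HOP.findall(output):
        if pair not in seen:
            seen.append(pair)
    return seen


def hosts_with_token(output, token):
    """Return hostnames in which `token` appears as a whole '.'/'-' separated label."""
    return [host for host, _ in hop_hosts(output)
            if token in re.split(r"[.-]", host)]


print(hosts_with_token(HOPS, "ash"))
# ['ash-bb3-link.telia.net', 'wikimedia-ic-308845-ash-b1.c.telia.net']
```

Matching whole labels rather than substrings avoids false hits on names that merely contain the letters "ash" inside a longer word.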

Knowing when someone is connected to the internet

In rare circumstances, knowing someone's IP allows you to tell whether they are currently connected to the internet (e.g. whether their computer is on) by pinging them. In an age of NAT and ICMP filtering, this is usually not possible.

Similarly, you can sometimes use fingerprinting to determine what operating system someone is running (e.g. with nmap -O). Again, this is not really possible for most consumers, who have some sort of NAT sitting between them and the internet.

Various ways in which IP addresses are used

Some of these are pulled from discussion:

  • Check previous edit history of an IP. Some IPs are very static; I know people who edited anonymously from the same IP for years and created dozens of articles from it (but also possibly violated rules). Being able to interact with such people, who are unwilling to register, is clearly useful. However, we need to keep their identifier static and not change it daily, since they might well edit the same article from the same IP for several years in a row.
  • Check range contributions. If an IP is dynamic, it is very useful to know if there is any activity from a neighbouring range. For instance, if a vandal with a particularly annoying pattern (e.g. changing dates in articles) is active in a dynamic range, getting all edits from this range to check them is clearly necessary.
  • Set an abuse filter on a range. If there is a particular pattern of vandalism from a range (e.g. use of certain words that might be appropriate in some but not all articles), we might have to disallow editing with this specific pattern to this range. This is an alternative to a block of the entire, potentially large range, and to disallowing potentially useful edits to all users.
  • Check global contributions. It is extremely important to keep identifiers consistent between wikis for fighting cross-wiki vandals. This is particularly the case of cross-wiki spammers who may insert spamming links from the same IP to multiple wikis.
  • Check if an IP is a proxy, VPN or Tor node. This usually takes more than automatic tools allow, particularly in cases where people use proxies or VPNs to hide links with their main accounts in an abusive way. Sometimes I literally google an IP to see whether it appears in some proxy or VPN list.
  • Check if users/IPs belong to same network/geography. Some providers use multiple ranges with very different IP patterns (like a 128.*.0.0/16 and a 192.*.0.0/16), and a user (both registered and anon) might move from one to another without notice. Some users (both registered and anon) use two different providers (like home and mobile) but in a very specific location, and we can link accounts by this location. For example, if two IPs from the same town but different networks in Malaysia participate in the same discussion in Ukrainian Wikipedia, they very likely belong to the same person.
  • Check location of an IP. Unlike the previous case, location can be used in a positive context. For instance, an IP adding information about some obscure politician in China is possibly vandalism. However, a Chinese IP adding information about a Chinese politician is less likely to be reverted.
  • Check organisation of an IP. This is needed for paid editing / COI matters. For example, an edit to an article about an MP made from the Parliament's IP (be it by a registered or an anon user) is very likely undisclosed paid editing or a COI edit, and requires relevant action.
  • Run the IP against various abuse databases, such as StopForumSpam and the CBL.
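Several of the checks above, such as range contributions and range-based filters, come down to asking whether an address falls inside a range. A minimal Python sketch with the standard ipaddress module follows. The function names are made up for this illustration, the /24 grouping is an arbitrary choice that only makes sense for IPv4, and 198.73.209.7 is a made-up neighbour of the office IP used earlier. It is not a description of how MediaWiki implements these features.

```python
import ipaddress
from collections import defaultdict


def in_range(ip, cidr):
    """Return True if `ip` lies inside the network written in CIDR form."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


def group_by_prefix(ips, prefix_len=24):
    """Group IPv4 addresses by the network of length `prefix_len` containing them."""
    groups = defaultdict(list)
    for ip in ips:
        # strict=False lets us pass a host address and get its enclosing network.
        network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        groups[str(network)].append(ip)
    return dict(groups)


print(in_range("198.73.209.241", "198.73.209.0/24"))  # True
print(in_range("198.35.26.96", "198.73.209.0/24"))    # False
print(group_by_prefix(["198.73.209.241", "198.73.209.7", "198.35.26.96"]))
```

Grouping by a fixed prefix is only a rough stand-in for "neighbouring range": providers allocate ranges of many different sizes, so the right prefix length depends on the network in question.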