The Challenge of IP Reputation
2023-9-12 05:5:9 Author: textslashplain.com(查看原文) 阅读量:20 收藏

When protecting clients and servers against network-based threats, it’s tempting to consider the peer’s network address when deciding whether that peer is trustworthy.

Unfortunately, while IP addresses can be a valuable signal, attempts to treat traffic as trustworthy or untrustworthy based on IP address alone can be very prone to mistakes.

Background

Most clients and servers today have one of fewer than 4.3 billion IPv4 addresses, often expressed as a set of four numbers delimited by dots. For example, 20.231.239.246 is an IP address used to serve the Microsoft.com homepage. IPv6, which offers 340 trillion trillion trillion unique addresses, is used relatively less commonly because not every internet service provider supports IPv6. For example, my provider, AT&T Fiber, presently does not.

DNS

Rather than directly supplying an IP address, most apps usually work by looking up a given hostname (say, microsoft.com) in the Domain Name System (DNS). In response to a query for a hostname, the DNS server returns one or more IPv4 or IPv6 addresses, and the client selects one to connect to. (If the connection fails, a client may try more addresses in the list.)

Different DNS servers may return different addresses for a given hostname. For example, a DNS server in France asked to look up microsoft.com is likely to return the address of Microsoft’s server in Europe, rather than the IP of its server in the United States. That helps ensure that French users don’t have to send their traffic across the Atlantic when a closer server is available. Similarly, DNS might return different addresses as a part of load-balancing or as servers are taken offline to perform maintenance tasks.

IP Geolocation

Various services purport to offer geolocation information for a given IP address based on where they believe this IP is hosted, but such services are imperfect. (Aside: I once remember a case at a Microsoft conference in Vegas where my IP geolocation was marked as New York. I was confused but eventually I realized that the traveling conference’s last stop was in New York, and whatever database was mapping IPs to locations simply had cached outdated information.)

Making matters even more confusing, a single IP address might be served by entirely different servers.

Still, that’s not to say that geolocation isn’t without value as a signal. For example, last month, my parents encountered a scam site purporting to sell shoes. The scam site was a visually-perfect copy of a real site, but hosted by the scammers who collected credit card numbers and ran up hundreds of dollars in bogus charges. While the site’s domain name ended in us.com, suggesting an American company, the site’s IP address reveals that it was hosted out of Beijing. When I went back a few days later, the IP had changed to one somewhere in the Middle East.

Problems With IP Reputation

While IP addresses seem like a useful signal for whether a peer is trustworthy, there are unfortunately some major problems.

IPs Are Not Constants

A key problem with using IP addresses for reputation is that they can change. That’s true for both servers (as described above) and clients. For example, my ISP may assign my home router a particular IP this morning, but a different one tonight.

Similarly, thanks to DNS, a server needn’t have just one IP address — a given DNS query might return several different IPs, and a different DNS server might return an unrelated set. As long as all of the IPs are handled by the same server, they go to the same “place” even if the IPs themselves have no components in common.

IPs May Be Unavailable

For web content in particular, the client may not know the IP address of the target server. While, as noted above, browsers usually perform DNS resolutions themselves, proxy servers are a common feature in enterprise environments. When communicating to a proxy server, the browser (unless using the SOCKS protocol) doesn’t look up the web server’s address at all. Instead, the proxy server does the DNS resolution on the client’s behalf, and the client never even sees the server’s IP address.

In extreme cases, the client PC might not even have working Internet DNS at all, relying on a proxy for all of its Internet-access needs.

Shared Server Addresses: Virtual Hosting

Another key problem with using IP addresses for server reputation is that a single IP address might be used by many unrelated entities, particularly when the IP address resolves to a web server.

When a modern browser navigates to a HTTPS server, its initial message to establish the TLS connection (the ClientHello) contains a field called the Server Name Indication, that specifies the certificate that the client expects to receive. The server (often a frontend owned by a Content Delivery Network that has hundreds or thousands of unrelated customers) then knows which of the many HTTPS sites the client expects and it returns the content from that site. Because a single IP may serve content from many different parties, blocking that IP because it serves some malicious content could result in blocking lots of legitimate content.

Back in 2021, the blogging giants over at Cloudflare posted an amazing article on their experiments with serving Canadian visitors over 20 million sites from one IP address. Their shocking conclusion was that it worked, and doing so even had a number of benefits.

Cloudflare’s learning

Shared Client Addresses: NAT

A parallel problem exists for evaluating the reputation of clients. There are many scenarios where a server will see traffic from a large number of different entities originating from a single IP address. This can occur with, for instance, enterprise proxy servers, consumer VPN egress points, or even a bunch of users behind a single wifi router at a school or business (NAT/”Network Address Translation”).

A service that applies rate limits or anti-abuse mechanisms against a client’s IP address has to be careful to avoid breaking hundreds or thousands of clients using that same IP address. Once again, Cloudflare has a great blog post here, which explains the multi-user IP address problem and a service they offer to try to mitigate it.

Advantages of URL or Hostname Reputation

In contrast to IP-based reputation, URL or URL-Hostname based reputation has some advantages.

Blocking URLs and URL-hostnames is not subject to the vagaries of DNS– you can block evil.com without worrying about what specific IP address it’s being served from, and you don’t have to worry that you might’ve missed an address, or that it may resolve to a different IP if you use a different DNS server.

In the case of a server configured for virtual hosting, having reputation for a particular hostname allows you to block only traffic bound for that specific hostname and not all of the other hosts on that server.

In the case of a server that allows many unrelated parties to publish content (e.g. drive.google.com) URL reputation allows blocking only specific resources (e.g. drive.google.com/user/evil/* is blocked while drive.google.com/user/good/* is allowed).

Some security benefits are less obvious.

Securing a site with a HTTPS certificate now requires that the certificate’s existence be publicly logged via Certificate Transparency, allowing for the possibility of detection of abusive sites (e.g. www.paypal.com.gibberish.example.com) by a log monitor, even before the scam site is put online.

So We Should Always Use URL Reputation, Right?

While URL and URL-Hostname reputation are preferable to IP reputation when it comes to evaluating the trustworthiness of an Internet server, unfortunately, we cannot always rely upon them.

One scenario involves malware that talks to a “Command and Control” (C2) server. Such software may connect to an IP directly rather than using an address supplied by DNS, meaning that there’s no hostname to block. However, C2 IP addresses are kinda sitting ducks for security software, so attackers have started using other approaches to get their attacks to blend in with legitimate scenarios.

In other cases, blocking may be provided by a component that does not have access to the URL. For example, prior to the ubiquitous deployment of HTTPS, security software on the local PC or at the network gateway could easily see all of the URLs streaming by. TLS encryption prevents software on the network from seeing the HTTPS URLs being loaded, however, meaning that an observer only has access to the target IP address and maybe the hostname. Today, the hostname can be detected for most TLS connections by sniffing the TLS ClientHello message sent by the client to pluck out the Server Name Indicator (SNI) field that contains the hostname the client is trying to talk to. In the future, Encrypted Client Hello aims to combat this shortcoming of HTTPS by hiding the target hostname. An enterprise who wishes to perform hostname blocking will need to disable the ECH feature in their browser (e.g. Chrome, Edge).

For example, Windows Defender’s Network Protection (NP) feature operates on the Windows Filtering Platform layer (below the Windows Firewall). As a consequence, it only has access to the IP address, and to the hostname if present in the SNI. This allows NP to block access to sites if the entire site is deemed malicious (evil.com) but unlike Edge’s SmartScreen, sites with only some malicious content (drive.google.com/user/evil) must be allowed through to avoid false positives.

Unfortunately, TLS causes a bunch of user-experience problems for the blocking unwanted content.

When Network Protection determines that it must block a HTTPS site, it cannot return an error page to the browser because TLS prevents the injection of content on the secure channel. So, NP injects a “Handshake Failure” TLS Fatal Alert into the bytestream to the client and drops the connection. The client browser, unaware that the fatal alert was injected by Network Protection, concludes that the server does not properly support TLS, so it shows an error page complaining of the same.

t=2817 [st=305] SOCKET_BYTES_RECEIVED
                --> byte_count = 14
                --> bytes =
                15 03 03 00 02 02 28 15  03 03 00 02 01 00
t=2817 [st=305] SSL_ALERT_RECEIVED
                --> bytes = 02 28
t=2817 [st=305] SSL_HANDSHAKE_ERROR
                --> error_lib = 16
                --> error_reason = 1040
                --> file = "..\\..\\third_party\\boringssl\\src\\ssl\\tls_record.cc"
                --> line = 594
                --> net_error = -113 (ERR_SSL_VERSION_OR_CIPHER_MISMATCH)
                --> ssl_error = 1

If the system’s Notifications feature is enabled, a notification toast appears with the hope that you’ll see it and understand what’s happened.

<lament> I could write about this for days…

-Eric


文章来源: https://textslashplain.com/2023/09/11/the-challenge-of-ip-reputation/
如有侵权请联系:admin#unsafe.sh