Name-based Virtual Hosting in TCP

Hubert Chao, Brian Kowolowski
Jed Liu, Jeffrey M. Vinocur

Introduction

Our project adds name-based virtual hosting support to the Transmission Control Protocol [TCP]. This involves sending hostname information at the beginning of each connection, similar in spirit to virtual hosting on the web [HTTP/1.1] but at a lower level.

Motivation and goals

There are a few problems which can be solved with the modification we describe.

  1. Very few application-level protocols have support for name-based virtual hosting; there simply was no need for it when most of these protocols were designed. In the absence of protocol support, the only information available (to the server) about the ``desired'' resource is what IP address and port number the client requested. A server application certainly can make use of the IP address in handling a request; this is IP-based virtual hosting. But IP-based virtual hosting is often a poor solution for reasons we will discuss below.

  2. Although some application-level protocols, notably HTTP, support name-based virtual hosting, there are some protocols for which this is quite simply impossible. One of these is [HTTPS]; the protocol requires presentation of a ``server certificate'' before any in-band data is transmitted, including the virtual hosting information. A solution to this has been presented in [TLS-HTTP], but it requires significant changes to the deployed software base and is therefore unlikely to be adopted easily.

  3. The success of ``Outbound'' Network Address Translation [NAT] (for using a single globally-unique IP address for an entire collection of machines on an internal network) relies on the fact that, in general, clients are content with only outgoing TCP connections. However, there are times when this is inadequate. For example, the ``Sidecar'' authentication service here at Cornell [TECH-ARCH] requires that the server make an out-of-band query to the client (similar to the traditional Unix Identification Protocol [IDENT]). This is virtually impossible in the presence of traditional NAT.

We have designed the modifications to TCP necessary to ``cure'' all three of these problems, and implemented them in the Linux 2.4 kernel. We also present as proof-of-concept the minimal modifications to several applications necessary to make use of these changes. We did not investigate the changes to the kernel's routing functionality required to implement point (3) above.

Problems with IP-based virtual hosting

In principle, IP-based virtual hosting might be sufficient for point (1) above; simply have one IP address for each virtual host. But in practice, this is inadequate:

Name-based virtual hosting is certainly cleaner and more useful, in principle, than IP-based virtual hosting. But the domain name used in the request is not available to the server from TCP/IP, so protocol support is required in order to do name-based virtual hosting. It is too late to add this sort of support to all of the application protocols. The solution is a method which requires software support only (that is, no modifications to the application protocols). In addition, putting support in TCP reduces the overall duplication of design and code.


Design and Implementation

Design decisions

Protocol modifications

Endpoint naming semantics
The goal is to pass two DNS hostnames in the SYN segment which initiates a TCP connection. We consider the sender name and receiver name, which in usual TCP fashion correspond to the local endpoint and remote endpoint, respectively. The data is always stored in the packet using the sender's point of view; this means that the receiving TCP stack must swap the fields to get the receiver's point of view.

TCP header changes
The TCP specification includes an extension mechanism, using header options. Currently options through 26 decimal have been assigned [IANA]; we chose 42 decimal unofficially for the HOSTS option. Our option has a length of 6 bytes: the 8-bit TCP kind field, the 8-bit TCP length field, a 16-bit field specifying the location of the relevant section (see below) in the data section, and two 8-bit fields specifying the lengths of the receiver name and sender name, respectively.
    +--------+--------+---------+--------+--------+--------+
    |00101010|00000110|      offset      | rcvlen | sndlen |
    +--------+--------+---------+--------+--------+--------+
     Kind=42  Length=6

As there are currently no other options which involve putting data in a SYN segment, the offset will likely always be zero. However, implementations should handle the offset in the event it becomes useful.
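The six-byte layout above can be sketched as a pair of encode and decode helpers. This is only an illustration, not the kernel code: the names hosts_opt, hosts_opt_encode, and hosts_opt_decode are hypothetical, and the sketch assumes the 16-bit offset is stored in network byte order, which the text states explicitly only for the checksum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCPOPT_HOSTS  42  /* unofficial option kind chosen in the text */
#define TCPOLEN_HOSTS 6   /* kind + length + offset (2 bytes) + rcvlen + sndlen */

/* Parsed form of the option; the struct and field names are illustrative. */
struct hosts_opt {
    uint16_t offset;  /* where the hostname block starts in the SYN data */
    uint8_t  rcvlen;  /* length of the receiver name */
    uint8_t  sndlen;  /* length of the sender name */
};

/* Write the 6-byte option into buf. */
void hosts_opt_encode(const struct hosts_opt *opt, uint8_t buf[TCPOLEN_HOSTS])
{
    assert(opt != NULL && buf != NULL);
    buf[0] = TCPOPT_HOSTS;
    buf[1] = TCPOLEN_HOSTS;
    buf[2] = (uint8_t)(opt->offset >> 8);    /* assumed: high byte first */
    buf[3] = (uint8_t)(opt->offset & 0xff);
    buf[4] = opt->rcvlen;
    buf[5] = opt->sndlen;
}

/* Parse the option; returns 0 on success, -1 if buf does not hold a
 * well-formed HOSTS option. */
int hosts_opt_decode(const uint8_t *buf, size_t len, struct hosts_opt *out)
{
    assert(buf != NULL && out != NULL);
    if (len < TCPOLEN_HOSTS || buf[0] != TCPOPT_HOSTS || buf[1] != TCPOLEN_HOSTS)
        return -1;
    out->offset = (uint16_t)((buf[2] << 8) | buf[3]);
    out->rcvlen = buf[4];
    out->sndlen = buf[5];
    return 0;
}
```

The decoder rejects any option whose length byte is not 6, so a future variant of the option with a different size would be ignored rather than misparsed.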

TCP data changes (SYN segment only)
If the HOSTS option is present in the TCP headers of a SYN segment, there should be at least 2 + offset + rcvlen + sndlen bytes in the data section of the segment. If there are fewer bytes than that, the packet is malformed. An implementation may discard the segment, but is permitted (and encouraged) to accept it and treat it as if the HOSTS option had not been present. Beginning offset bytes into the data section, there should be a 16-bit checksum (in usual TCP ones-complement fashion), followed by the two variable-length fields for receiver and sender hostname, respectively (the lengths, of course, are found in the TCP header as described above).
    +--------+--------+---          ---+---          ---+
    | option checksum | ...rcv host... | ...snd host... |
    +--------+--------+---          ---+---          ---+

The checksum is, of course, stored in network byte order.
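As a rough sketch of the data-section layout, the following helpers build and validate the hostname block. The function names are hypothetical, and the sketch assumes that the checksum covers exactly the two hostname fields; the text does not say which bytes are summed. The builder always writes at offset zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Ones-complement of the ones-complement sum of 16-bit big-endian words;
 * an odd trailing byte is padded with a zero byte, as in the TCP checksum. */
uint16_t hosts_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;
    assert(data != NULL);
    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    if (i < len)
        sum += (uint32_t)(data[i] << 8);
    while (sum >> 16)                       /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Build checksum + rcv host + snd host at the start of out.
 * Returns the number of bytes written, or 0 if out is too small. */
size_t hosts_data_build(uint8_t *out, size_t cap,
                        const uint8_t *rcv, uint8_t rcvlen,
                        const uint8_t *snd, uint8_t sndlen)
{
    size_t need = 2u + rcvlen + sndlen;
    uint16_t ck;
    assert(out != NULL && rcv != NULL && snd != NULL);
    if (cap < need)
        return 0;
    memcpy(out + 2, rcv, rcvlen);
    memcpy(out + 2 + rcvlen, snd, sndlen);
    ck = hosts_checksum(out + 2, (size_t)rcvlen + sndlen);  /* assumed coverage */
    out[0] = (uint8_t)(ck >> 8);            /* network byte order */
    out[1] = (uint8_t)(ck & 0xff);
    return need;
}

/* Validate the block using the lengths from the HOSTS option. Returns 0 and
 * sets *rcv and *snd on success; returns -1 if the segment is malformed
 * (too short or checksum mismatch). */
int hosts_data_parse(const uint8_t *data, size_t datalen, uint16_t offset,
                     uint8_t rcvlen, uint8_t sndlen,
                     const uint8_t **rcv, const uint8_t **snd)
{
    size_t need = 2u + offset + rcvlen + sndlen;
    const uint8_t *p;
    uint16_t stored;
    assert(data != NULL && rcv != NULL && snd != NULL);
    if (datalen < need)
        return -1;
    p = data + offset;
    stored = (uint16_t)((p[0] << 8) | p[1]);
    if (stored != hosts_checksum(p + 2, (size_t)rcvlen + sndlen))
        return -1;
    *rcv = p + 2;
    *snd = p + 2 + rcvlen;
    return 0;
}
```

A caller that gets -1 back could follow the recommendation above and continue as though the HOSTS option had not been present, rather than dropping the segment.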

Networking API modifications

The host_info struct
Central to our API modification is the host_info struct:
    struct host_info {
       __u8 rcv_host[TCP_MAX_HOST_LEN + 1];
       __u8 rcv_host_len;
       __u8 snd_host[TCP_MAX_HOST_LEN + 1];
       __u8 snd_host_len;
    };

This defines the basic data structure that an API programmer would use to interface with our TCP option.
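Filling in the struct by hand is error-prone because the length fields must agree with the name buffers. A small helper along the following lines may be convenient. It is only a sketch: host_info_set is a hypothetical name, uint8_t stands in for the kernel's __u8, and TCP_MAX_HOST_LEN is assumed to be 255 (the largest value an 8-bit length can express), since its real value is not given here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TCP_MAX_HOST_LEN 255  /* assumption: real value comes from host_info.h */

/* Local copy of the struct shown above, with uint8_t in place of __u8. */
struct host_info {
    uint8_t rcv_host[TCP_MAX_HOST_LEN + 1];
    uint8_t rcv_host_len;
    uint8_t snd_host[TCP_MAX_HOST_LEN + 1];
    uint8_t snd_host_len;
};

/* Fill hi from two C strings; either may be NULL to leave that name empty.
 * Returns 0 on success, -1 if a name is longer than TCP_MAX_HOST_LEN. */
int host_info_set(struct host_info *hi, const char *rcv, const char *snd)
{
    size_t rlen, slen;
    assert(hi != NULL);
    rlen = rcv ? strlen(rcv) : 0;
    slen = snd ? strlen(snd) : 0;
    if (rlen > TCP_MAX_HOST_LEN || slen > TCP_MAX_HOST_LEN)
        return -1;
    memset(hi, 0, sizeof *hi);      /* unused bytes are zeroed */
    if (rlen)
        memcpy(hi->rcv_host, rcv, rlen);
    if (slen)
        memcpy(hi->snd_host, snd, slen);
    hi->rcv_host_len = (uint8_t)rlen;
    hi->snd_host_len = (uint8_t)slen;
    return 0;
}
```

A client would typically call something like host_info_set(&hosti, "www.example.org", NULL) with the name it resolved, then pass the struct to the kernel.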

setsockopt
The setsockopt(2) API function is used to set the sender and receiver hostnames for a socket. At present, this is how the client indicates, before the call to connect(2), which hostname it used to obtain the server's IP address. It might also be useful in the future on the server, before the call to bind(2), as described in Future Work below. Usage looks like:
    struct host_info hosti;
    socklen_t optlen = sizeof(struct host_info);
    /* initialize fields of "hosti" here, including length fields */
    if (setsockopt(sockfd, SOL_TCP, TCP_HOSTS, &hosti, optlen) < 0) {
        /* an error occurred */
    }

The kernel may also set the sender and receiver hostnames for a socket without the application calling setsockopt(2), for example if an incoming SYN segment includes the HOSTS option.

Calls to setsockopt(2) for the TCP_HOSTS option are not useful after the connection has been initiated. The only possible error (other than the normal setsockopt(2) errors) is EINVAL, indicating that the optlen passed in was not acceptable.

getsockopt
The getsockopt(2) API function is used to recover the current sender and receiver hostnames for a socket. This is how the server determines what hostnames, if any, the client specified when initiating the connection. Usage looks like:
    struct host_info hosti;
    socklen_t optlen = sizeof(struct host_info);
    if (getsockopt(sockfd, SOL_TCP, TCP_HOSTS, &hosti, &optlen) < 0) {
        /* an error occurred */
    }
    /* use fields of "hosti" here */

The caller should be warned that because of the potential for binary data (see the discussion in Design decisions), the hostname strings are not guaranteed to be terminated by a '\0' character.
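Because the names are length-delimited rather than terminated, an application that wants an ordinary C string has to copy the bytes out itself. One possible approach is sketched below; hostname_to_cstr is a hypothetical helper, and rejecting names with embedded zero bytes is a policy choice of this sketch, not something the option requires.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy a length-delimited hostname into dst as a NUL-terminated string.
 * Returns 0 on success, or -1 if dst is too small or the name contains an
 * embedded '\0' (which would silently truncate the resulting C string). */
int hostname_to_cstr(const unsigned char *name, size_t len,
                     char *dst, size_t dstcap)
{
    assert(name != NULL && dst != NULL);
    if (len + 1 > dstcap)
        return -1;
    if (memchr(name, '\0', len) != NULL)
        return -1;
    memcpy(dst, name, len);
    dst[len] = '\0';
    return 0;
}
```

For output only, printf("%.*s", (int)len, name) avoids the copy, although it still stops at the first zero byte.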

The length returned in the final argument to getsockopt(2) is the length of the first string in the host_info struct (that is, the value of the rcv_host_len field), provided that the input length was sufficient to store at least that string. This means that if the server is only interested in the receiver hostname field, the following idiom is possible:

    socklen_t optlen = TCP_MAX_HOST_LEN;
    char hostname[optlen];
    if (getsockopt(sockfd, SOL_TCP, TCP_HOSTS, hostname, &optlen) < 0) {
        /* an error occurred */
    }
    /* use "optlen" and "hostname" fields here */

The only possible error (other than the normal getsockopt(2) errors) is EOPNOTSUPP, which indicates that no hostname information is currently available for this socket (for example, the client did not send the HOSTS option). Note that getsockopt(2) will return EOPNOTSUPP before examining the optval parameter; thus an input of NULL will allow the caller to determine if hostname information is available without allocating any storage.

host_info.h
To allow applications to be compiled on systems which do not have our changes to the system header files, we provide a stub header file which includes the constants and struct definition necessary to compile on a non-compliant system. Code should, in general, run on systems regardless of compliance; calls to setsockopt(2) and getsockopt(2) will return ENOPROTOOPT if the TCP_HOSTS option is not supported.
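An application that must run on both kinds of system can separate the two error codes described above from ordinary failures. The following is a minimal sketch under that reading; the enum and the function hosts_classify are hypothetical names, and the function only interprets the return value and errno of a getsockopt(2) call made elsewhere.

```c
#include <assert.h>
#include <errno.h>

/* Possible outcomes of querying TCP_HOSTS; names are illustrative. */
enum hosts_status {
    HOSTS_AVAILABLE,    /* call succeeded, hostname data was returned */
    HOSTS_NOT_SENT,     /* kernel supports the option, peer sent no names */
    HOSTS_UNSUPPORTED,  /* kernel does not know the TCP_HOSTS option */
    HOSTS_ERROR         /* any other getsockopt(2) failure */
};

/* rc is the return value of getsockopt(2); err is errno saved right after. */
enum hosts_status hosts_classify(int rc, int err)
{
    assert(rc == 0 || rc == -1);  /* getsockopt(2) returns only 0 or -1 */
    if (rc == 0)
        return HOSTS_AVAILABLE;
    switch (err) {
    case EOPNOTSUPP:
        return HOSTS_NOT_SENT;
    case ENOPROTOOPT:
        return HOSTS_UNSUPPORTED;
    default:
        return HOSTS_ERROR;
    }
}
```

In both the HOSTS_NOT_SENT and HOSTS_UNSUPPORTED cases a server would presumably fall back to its existing behavior, so the distinction matters mainly for logging and diagnostics.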

Kernel modifications

The modifications necessary to the Linux 2.4 kernel fall into several categories:

Storing hostname data for each socket
Since we only associate hostname data with TCP sockets, the appropriate place to store it is in the tcp_opt struct associated with each socket. This data structure is used to keep track of a variety of TCP features which can be enabled or disabled in certain circumstances.

Generating outgoing SYN segments
There are several modifications necessary to the generation of outgoing SYN segments. The TCP header generation must include the HOSTS option (if data has been supplied by the client application), and a body must be included in the SYN segment (normally no data is passed in such cases) with the appropriate checksum calculated.

Handling incoming SYN segments
When a SYN segment arrives, the TCP stack parses the options. Modification must be made to recognize the HOSTS option, and if it is present, extract the hostnames from the data section.

Handling RST responses to SYN segments
When a normal SYN is sent to a host which is not listening on the specified port, the RST that comes back to the originating machine will have an ack sequence number one higher than the sequence number in the SYN. However, if the SYN is one of our enhanced SYNs, the additional data changes this figure. Instead of being off by one, as expected, the RST has an ack sequence number off by one plus the number of bytes in the data portion of the SYN. By default, the kernel ignores inbound packets that don't appear to be part of a stream, and a RST with an ack sequence number that is 20 or 30 higher than expected doesn't appear to be useful. Thus we change the check to also allow an incoming RST whose ack sequence number is higher by exactly the amount of data sent in the SYN. This allows a compliant implementation to detect a reset connection rather than timing out.
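The relaxed check can be illustrated with a small standalone function. This is a sketch of the rule as described, not the actual kernel patch: rst_ack_acceptable and its parameter names are hypothetical, and the real check lives inside the kernel's TCP input path with additional state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* iss:          sequence number carried in our SYN
 * syn_data_len: bytes of hostname data placed in the SYN (0 for a plain SYN)
 * rst_ack:      ack sequence number found in the incoming RST
 * Arithmetic is modulo 2^32, matching TCP sequence-number wraparound. */
bool rst_ack_acceptable(uint32_t iss, uint32_t syn_data_len, uint32_t rst_ack)
{
    uint32_t plain, with_data;
    assert(syn_data_len <= 0xffffu);  /* SYN data fits in a single segment */
    plain = iss + 1u;                 /* the SYN itself uses one sequence number */
    with_data = plain + syn_data_len; /* peer also acknowledged the SYN data */
    return rst_ack == plain || rst_ack == with_data;
}
```

Only these two exact values are accepted, so a RST with an arbitrary ack sequence number is still ignored as before.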

Extending the sockets API
Since we are only adding a TCP option, the only modifications required to the sockets API are to the setsockopt(2) and getsockopt(2) functions, which were extended to handle our new option. See Networking API modifications for details.

Extending the sysctl API
The sysctl interface was extended with a setting that turns on the TCP Hosts functionality by default. The current setting can easily be examined, enabled, and disabled with the following commands, respectively:
    % sysctl net.ipv4.tcp_hosts
    # sysctl -w net.ipv4.tcp_hosts=1
    # sysctl -w net.ipv4.tcp_hosts=0

Application examples

Trivial telnet-like client and server: tclient and tserver
The first thing we did to test our TCP option was to implement a trivial client and server, tclient and tserver. tclient is a simple, telnet-like client. It simply listens on stdin and sends the input to the server. tserver is a telnet-like server: it waits for a connection and writes any data received from the socket to stdout.

An HTTP server: thttpd
To demonstrate the utility of our option, we decided to modify an HTTP server to use the information that can be gathered from the option. We chose to modify the thttpd web server [THTTPD], which has support for virtual hosting and is relatively simple. We chose it over Apache because Apache has almost 10 times as many lines of code as thttpd.

Modifying the code to support our option consisted of about 10 lines of changes. This involved getting the receive host information out of the option, adding a field to pass the hostname along to where it is needed, and then using the receive host information. We first check the option's receive host for a virtual host. If the option is not present, then we fall back on the default thttpd behavior.

An HTTP and FTP client: wget
To test the modified HTTP server, we needed to modify a web browser. Modifying Mozilla, or even Internet Explorer, was certainly out of the question. Hence, we opted to patch wget.

The modification involved less than 10 lines of changes, consisting of initializing the receiver host information in the host_info struct with the hostname of the machine being contacted and setting the socket option before connecting to the server.


Conclusions

Results

Both endpoints compliant
If both endpoints (application and TCP stack) are compliant with this extension, it works exactly as intended.

To test this, we brought up a thttpd server within a user-mode Linux kernel that supported our extension. This server had virtual hosts bound to the names foobaar, 192.168.20.20, 127.0.0.1, and localhost. Each virtual host had an /index.html file which announced the virtual host on which the file was located.

On a separate machine running a copy of our kernel, we used wget to contact the server at 192.168.20.20. As expected, the page returned indicated that it was being served by the appropriate virtual host. To test the other virtual hosts, we used tclient, which allowed us to specify a value for the rcv_host field in the outgoing connection.

Noncompliant applications and/or TCP stacks
There is no problem, in principle, if an old application is run on a system with an updated TCP stack. Similarly, there is no problem if an updated application is run on a system with a traditional TCP stack. In either case, no outgoing SYN segments contain the HOSTS option, and that option is ignored on incoming SYN segments.

We have tested a variety of interactions. We can connect with ssh from a non-compliant system to a compliant one, and we can ssh from a compliant system to a non-compliant one. We can connect to a compliant system running our modified thttpd server from a non-compliant system, from a compliant system with a non-compliant application, and from a compliant system with a compliant application, such as our modified wget.

Potential problems and suggestions for future work


References

[DNS]
Mockapetris, P. ``Domain Names - Implementation and Specification'', RFC 1035, November 1987.

[HTTP/1.1]
Fielding, R., et al. ``Hypertext Transfer Protocol -- HTTP/1.1'', RFC 2616, June 1999.

[IANA]
Internet Assigned Numbers Authority. ``TCP Option Numbers'', May 2001.
    http://www.iana.org/assignments/tcp-parameters

[IDENT]
St. Johns, M. ``Identification Protocol'', RFC 1413, February 1993.

[NAT]
Srisuresh, P. and M. Holdrege. ``IP Network Address Translator (NAT) Terminology and Considerations'', RFC 2663, August 1999.

[TCP]
Postel, J. ``Transmission Control Protocol'', RFC 793, September 1981.

[TECH-ARCH]
Cornell Information Technologies. ``Cornell University Technical Architecture'', draft, December 1989.
    http://solutions.cit.cornell.edu/doc/TechnicalArchitecture.pdf

[THTTPD]
Poskanzer, Jef. ``Tiny/Turbo/Throttling HTTP server''
    http://www.acme.com/software/thttpd/

[TLS-HTTP]
Khare, R., and S. Lawrence. ``Upgrading to TLS Within HTTP/1.1'', RFC 2817, May 2000.

[ZEUS]
Zeus Technology. ``Hosting Multiple Web Sites on a Single Server Machine'', 2002.
    http://www.zeus.com/library/articles/hosting.html