Port-based Apache virtual hosts

Published: 10 February 2009

Last modified: 18 January 2010

Introduction

When developing web sites, it can be useful to provide clients with access to their site during its development. One option is to publish the site to a suitable hosting provider. To prevent conflicts with any existing site, additional hosting space, domain names and DNS records may be required. For new sites, it is often not desirable to publish incomplete sites to their final location. Even if neither of these problems is deemed particularly serious, there’s still the extra overhead of having to keep the external site up-to-date with the latest changes.

The Host request-header field specifies the Internet host and port number of the resource being requested.

However, providing external access to internal name-based Apache virtual hosts presents some problems. Apache uses the HTTP Host request-header field to determine the correct name-based virtual host to serve. For an internal web server, it is simple enough to create DNS or hosts file entries which translate a number of host names to the single IP address of the Apache web server. Once the request has reached the Apache server, each host name can then be mapped to the correct virtual host.

Let’s assume that you have the following three Apache virtual host declarations:

# Site 1
<VirtualHost *:80>
  ServerName internal.site1.com
  DocumentRoot /var/www/site1
</VirtualHost>

# Site 2
<VirtualHost *:80>
  ServerName internal.site2.com
  DocumentRoot /var/www/site2
</VirtualHost>

# Site 3
<VirtualHost *:80>
  ServerName internal.site3.com
  DocumentRoot /var/www/site3
</VirtualHost>

If internal name resolution is correctly configured, a request for http://internal.site2.com will be sent to Apache. Apache will then attempt to match internal.site2.com with one of the ServerName directives from the above virtual hosts. In this case, it will match it with the second virtual host, and thus be able to provide the correct site to the browser making the request.
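The matching step can be pictured with a small model. The Python sketch below is not Apache's implementation; it only assumes that the three virtual hosts sharing *:80 are tried in declaration order, and that the first one acts as the default when no ServerName matches. The function name select_vhost and the address 203.0.113.10 are placeholders made up for illustration.

```python
# Toy model of name-based virtual host selection (not Apache's real code).
# Each entry is (ServerName, DocumentRoot), in declaration order.
VHOSTS = [
    ("internal.site1.com", "/var/www/site1"),
    ("internal.site2.com", "/var/www/site2"),
    ("internal.site3.com", "/var/www/site3"),
]


def select_vhost(host_header):
    """Return the DocumentRoot chosen for a given Host header value."""
    # The Host header may carry a port ("name:80"); compare the name only.
    name = host_header.split(":")[0].lower()
    for server_name, document_root in VHOSTS:
        if server_name == name:
            return document_root
    # No ServerName matched: the first declared virtual host is the default.
    return VHOSTS[0][1]


print(select_vhost("internal.site2.com"))  # /var/www/site2
print(select_vhost("203.0.113.10"))        # /var/www/site1 (default)
```

The second call shows why a request that arrives with an unknown host name never reaches site 2 or site 3 in this model.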

When we go out on to the Internet, the internal host names will no longer be available. Instead, you will most likely access your server via an IP address or DNS name provided by your ISP. Even if your firewall allows it, an attempt to access one of these internal sites through your public IP address or host name will not work: Apache cannot match that name against any ServerName directive, so it falls back to the first virtual host declared for the address and port, regardless of which site was wanted.

In this article, I discuss a method which will allow you to expose your internal Apache web sites to the world, without the need for public DNS records for each virtual host. It is assumed that you are familiar with Apache web server virtual hosts, and that you know how to edit and reload Apache configurations.

Environment

Shorewall (also known as the Shoreline Firewall) is a tool for configuring Netfilter: it’s iptables made easy.

The examples given below have been tested on Debian 5.0.3 (“lenny”) with the Apache 2.2 web server and the Shorewall firewall configuration tool. However, the principles outlined below should work with any similar setup.

Security issues

This article does not delve into the security issues involved. Suffice to say that punching holes in your firewall, to allow the outside world in, comes with risks. You should understand these risks before implementing any of the examples provided here.

It is highly recommended that your publicly accessible web server is separated from your private network. Placing your web server on your private LAN, and opening up a hole to it through your firewall, is just asking for trouble. One option is to place your web server in a DMZ. The following Shorewall article is a good place to start if you would like more information on this:

Shorewall Three-Interface Firewall

Notes

Apache configuration

In the examples given below, the Apache web server is located in a DMZ, and has the IP address 192.168.2.254, although these details are incidental.

It is assumed that you already have Apache up and running with name-based virtual hosts. Your internal host names may be resolved via DNS, or via hosts files.

Firewalls

There are many different firewalls, and providing information for each would be an exercise in futility. Instead, I have provided some sample Shorewall rules. If you use a different firewall, it should be easy enough to translate these rules to suit.

Uniform Resource Locators

The use of absolute URLs to refer to pages within your web site may cause problems with the methods outlined on this page. Wherever possible, use relative URLs. For example:

This absolute URL could be a problem:

http://internal.site1.com/product/id/1973

Use a relative URL instead:

/product/id/1973
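To see why this matters, consider how a browser resolves each kind of link against the address the visitor actually used. The sketch below uses Python's standard urljoin for the resolution; the external address 203.0.113.10:6001 is a placeholder, not part of any real setup.

```python
from urllib.parse import urljoin

# Hypothetical address a client might use to view the page from outside.
external_page = "http://203.0.113.10:6001/index.html"

# A relative link keeps the host and port the visitor arrived on.
relative_link = urljoin(external_page, "/product/id/1973")

# An absolute link ignores the base and points at the internal name.
absolute_link = urljoin(
    external_page, "http://internal.site1.com/product/id/1973"
)

print(relative_link)  # http://203.0.113.10:6001/product/id/1973
print(absolute_link)  # http://internal.site1.com/product/id/1973
```

The absolute link sends an external visitor to a host name that only resolves inside your network.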

Port-based virtual hosts

The Apache web server can be configured with name-based virtual hosts and IP-based virtual hosts. Either the IP address or the host name is used by Apache to determine the correct virtual host to serve. However, Apache can also use a port number to determine the correct virtual host to serve. This is often used to provide different virtual host directives for HTTPS on port 443.

Name-based virtual host directives can easily be extended to provide port-based virtual host detection. The following three virtual host declarations have been extended, such that each includes an additional unique port:

# Site 1
# Instruct Apache to listen on port 6001.
Listen 6001

# Associate internal.site1.com with port 6001.
<VirtualHost *:80 *:6001>
  ServerName internal.site1.com
  DocumentRoot /var/www/site1
</VirtualHost>

# Site 2
# Instruct Apache to listen on port 6002.
Listen 6002

# Associate internal.site2.com with port 6002.
<VirtualHost *:80 *:6002>
  ServerName internal.site2.com
  DocumentRoot /var/www/site2
</VirtualHost>

# Site 3
# Instruct Apache to listen on port 6003.
Listen 6003

# Associate internal.site3.com with port 6003.
<VirtualHost *:80 *:6003>
  ServerName internal.site3.com
  DocumentRoot /var/www/site3
</VirtualHost>

The Listen directive tells Apache that it should listen for connections on the given ports (6001, 6002 and 6003). These are in addition to port 80, which is usually part of the default configuration. Simply specifying the port in the VirtualHost directive is not enough: Apache must be instructed to listen on these ports.

Each VirtualHost directive is then assigned one of these ports, in addition to the standard port 80. Be sure to pick ports that do not conflict with anything else you might be running. For example, port 9000 is the default port for PHP xdebug.
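With more than a handful of sites, writing these blocks by hand invites copy-and-paste slips. As a rough sketch, the declarations could be generated from a single table of sites. The Python below is only an illustration: the SITES mapping and the function names are hypothetical, and the output would still need to be saved into your Apache configuration by hand.

```python
# Hypothetical mapping of ServerName -> (DocumentRoot, extra port).
SITES = {
    "internal.site1.com": ("/var/www/site1", 6001),
    "internal.site2.com": ("/var/www/site2", 6002),
    "internal.site3.com": ("/var/www/site3", 6003),
}


def render_vhost(server_name, document_root, port):
    """Return a Listen directive and VirtualHost block for one site."""
    return (
        f"Listen {port}\n"
        f"<VirtualHost *:80 *:{port}>\n"
        f"  ServerName {server_name}\n"
        f"  DocumentRoot {document_root}\n"
        f"</VirtualHost>\n"
    )


def render_all(sites):
    """Render every site, refusing to reuse a port."""
    ports = [port for _, port in sites.values()]
    if len(ports) != len(set(ports)):
        raise ValueError("each site needs its own port")
    return "\n".join(
        render_vhost(name, root, port)
        for name, (root, port) in sites.items()
    )


print(render_all(SITES))
```

The duplicate check only guards against two sites sharing a port within the table; it cannot tell whether a port clashes with some other service on the server.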

The sites are still accessible internally by their host name, as before:

http://internal.site2.com

It is also possible to access the site by its port number. Because only one virtual host is bound to each of the new ports, Apache serves that virtual host for every request arriving on the port, so the host name does not have to match the ServerName directive. This means that the site can be accessed via an IP address and port number. Each of the following is now valid:

http://192.168.2.254:6001
http://internal.site1.com:6001
http://internal.site1.com
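The way the port and the host name combine can be modelled in a few lines. This Python sketch is a simplification rather than Apache's actual algorithm: it assumes Apache first narrows the choice to the virtual hosts bound to the port that received the request, then looks for a ServerName match, and otherwise uses the first remaining candidate. The name select_site is made up for illustration.

```python
# Toy model of how port and host name combine (not Apache's real code).
# Each entry is (ServerName, set of ports), in declaration order.
VHOSTS = [
    ("internal.site1.com", {80, 6001}),
    ("internal.site2.com", {80, 6002}),
    ("internal.site3.com", {80, 6003}),
]


def select_site(port, host_name):
    """Return the ServerName of the virtual host that would answer."""
    # Step 1: keep only the virtual hosts bound to the port that was hit.
    candidates = [name for name, ports in VHOSTS if port in ports]
    if not candidates:
        return None  # nothing is configured for this port
    # Step 2: prefer a ServerName match, else the first candidate.
    if host_name.lower() in candidates:
        return host_name.lower()
    return candidates[0]


print(select_site(6001, "192.168.2.254"))       # internal.site1.com
print(select_site(6001, "internal.site1.com"))  # internal.site1.com
print(select_site(80, "internal.site1.com"))    # internal.site1.com
print(select_site(6002, "192.168.2.254"))       # internal.site2.com
```

On ports 6001 to 6003 there is only ever one candidate, which is why the host name stops mattering there.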

The port method can be tested internally, before opening up your web server to the world. If your web server is running in a DMZ, you may need to configure port forwarding so that your private LAN (loc) is able to access your web server via the new ports. An example Shorewall DNAT rule is given below:

#ACTION    SOURCE    DEST                 PROTO  DEST
#                                                PORT
DNAT       loc       dmz:192.168.2.254    tcp    6001

Now you can access the site using the server’s private IP address:

http://192.168.2.254:6001
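If the page does not load, it helps to know whether the port is reachable at all before looking at the Apache configuration. The following Python sketch simply attempts a TCP connection; the function names port_is_open and check_ports are my own, and the three-second timeout is an arbitrary choice.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_ports(host, ports, timeout=3.0):
    """Return a dict mapping each port to True (open) or False."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

Calling check_ports("192.168.2.254", [6001]) from a machine on the private LAN should report the port as open once the DNAT rule is active and Apache has been reloaded. A successful connection only shows that something is listening; it does not confirm that the correct site is being served.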

Once you are happy that this works, you can open up your firewall further, thus providing access from the Internet. The following example Shorewall DNAT rules grant access from both the Internet (net) and the private LAN (loc), to a web server located in the DMZ:

#ACTION    SOURCE    DEST                 PROTO  DEST
#                                                PORT
DNAT       net       dmz:192.168.2.254    tcp    6001
DNAT       loc       dmz:192.168.2.254    tcp    6001
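The rules above cover port 6001 only; the other sites need equivalent rules for their own ports. As a sketch, the full set could be generated rather than typed. The Python below is illustrative only: the function name dnat_rules is hypothetical, the column widths simply mimic the layout used in this article, and the output should be reviewed before it goes anywhere near your Shorewall rules file.

```python
def dnat_rules(server, ports, sources=("net", "loc"), dest_zone="dmz"):
    """Return DNAT rule lines, one per port and source zone."""
    dest = f"{dest_zone}:{server}"
    lines = []
    for port in ports:
        for source in sources:
            # Columns: ACTION, SOURCE, DEST, PROTO, DEST PORT.
            lines.append(
                "{:<11}{:<10}{:<21}{:<7}{}".format(
                    "DNAT", source, dest, "tcp", port
                )
            )
    return lines


for rule in dnat_rules("192.168.2.254", [6001, 6002, 6003]):
    print(rule)
```

Passing sources=("loc",) produces only the internal rules, which suits the testing stage described earlier.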