Introduction
Given the necessarily long time horizons, the costs of routine administration and emergency response are major contributors to the overall cost of digital preservation. To minimize these costs, we supply the LOCKSS system as a "network appliance", a complete operating system and application environment for a generic PC. The environment has been very carefully designed and configured to be as secure as possible, to enable any security vulnerabilities that are discovered to be corrected quickly and easily, and to make responding to a break-in as simple as possible.
The LOCKSS box is based on the OpenBSD operating system. It boots and runs directly from a CD; there is no software installed on the PC's hard disks. All software is either run directly from the CD, or it is digitally signed and the signatures are verified before use (by software run directly from the CD). All system configuration information is stored on write-locked media (floppy disk, USB flash, or the CD itself). LOCKSS boxes regularly check for system updates ("patches"), and apply all updates with valid signatures every time the system reboots. These precautions mean that at any time the system can be restored to a known state simply by rebooting.
Institutions considering using the LOCKSS system to preserve web-published content should ask a number of security-related questions:
- How secure is the material a LOCKSS box is preserving?
- How much risk does running a LOCKSS box pose to its host network?
- What happens if a vulnerability is detected in the LOCKSS box software?
- What must be done to recover if a LOCKSS box is compromised?
- Can security be increased by running the LOCKSS software in another environment?
- Can the security of a LOCKSS box be increased by packet filtering?
Risk to Stored Content
The LOCKSS box is hard for a remote attacker to penetrate or exploit. The only network-accessible services a LOCKSS box exports are:
- Ping
- SSH
- HTTP administration web server & proxy
- LCAP
The SSH daemon is used for remote administration. The root user can log in with a password chosen during system configuration. If the relevant configuration option is selected, the LOCKSS team can log in as an unprivileged user. Packet filters allow SSH connections only from the subnet(s) specified during configuration. The OpenBSD SSH daemon runs with privilege separation, so even a remote attacker who compromises the SSH server on the LOCKSS box gains only the minimal privileges of the special privilege-separation user, not root.
The LOCKSS daemon includes the Jetty web server which exports HTTP service on two ports:
- The administrative user interface on port 8081. This accepts requests only from a specified set of IP addresses. External packet filters can be used to enforce this.
- The HTTP proxy service on port 8080. This accepts requests only from a specified set of IP addresses.
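The address restriction on both ports amounts to an allow-list check on the client address. The following is a minimal sketch in Python, not the LOCKSS daemon's own code. The function name `is_allowed`, the list name `ADMIN_ALLOWED`, and the sample subnets (reserved documentation ranges) are all hypothetical.

```python
import ipaddress

def is_allowed(client_ip, allowed_subnets):
    """Return True if client_ip falls inside any of the allowed subnets."""
    addr = ipaddress.ip_address(client_ip)
    for subnet in allowed_subnets:
        # strict=False tolerates entries written with host bits set,
        # such as "192.0.2.17/24".
        net = ipaddress.ip_network(subnet, strict=False)
        if addr in net:
            return True
    return False

# Hypothetical allow-list for the administrative UI on port 8081.
ADMIN_ALLOWED = ["192.0.2.0/24", "198.51.100.7/32"]
```

A request from 192.0.2.55 would be accepted under this list, while one from 203.0.113.1 would be refused. An external packet filter can apply the same rule before traffic ever reaches the box.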
The Jetty server is very simple, and protected against buffer overflow and other attacks by being implemented in Java. It does not support CGI or other scripting interfaces. It will only run a limited set of pre-specified servlets.
The LOCKSS daemon communicates with other LOCKSS daemons using a TCP protocol called LCAP. This is designed to be hard to attack. It is not a request-response protocol; sending an LCAP message to a LOCKSS daemon causes no rapid, predictable response. LOCKSS daemons send each other messages at a limited rate.
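This page does not say how the daemon limits its sending rate. One common technique for capping outbound message rates is a token bucket, sketched below in Python purely as an illustration. The class name `RateLimiter` and its parameters are assumptions, not part of LCAP.

```python
import time

class RateLimiter:
    """Token bucket: allows bursts of up to `burst` messages, refilled at `rate` per second."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.burst = burst          # maximum tokens that can accumulate
        self.clock = clock          # injectable so the limiter can be tested
        self.tokens = float(burst)
        self.last = clock()

    def try_send(self):
        """Return True and consume one token if a message may be sent now."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Because the sender paces itself regardless of what it receives, a flood of inbound messages cannot be turned into a flood of outbound ones.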
The LOCKSS daemon runs as an unprivileged user. We intend eventually to run it in a chroot-ed "jail". It necessarily has write access to the stored content, so a hypothetical remote attacker who compromised the LOCKSS daemon would be able to corrupt or destroy the content. But once the attacker had been excluded the content would repair itself from other caches. Only the simultaneous compromise of the vast majority of LOCKSS boxes storing particular content would place the content at risk in the long term.
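The reason a majority of boxes must be compromised can be illustrated with a simple majority vote over content hashes. The sketch below is not the LOCKSS polling protocol. It is a deliberately simplified model, and the function names `digest` and `repair` are hypothetical.

```python
import hashlib
from collections import Counter

def digest(content):
    """SHA-256 hex digest of a copy of the content."""
    return hashlib.sha256(content).hexdigest()

def repair(local, peer_copies):
    """Return the copy held by a strict majority of all copies, else keep the local one."""
    copies = [local] + list(peer_copies)
    counts = Counter(digest(c) for c in copies)
    winner, votes = counts.most_common(1)[0]
    if votes * 2 <= len(copies):
        return local  # no strict majority: leave the local copy alone
    for c in copies:
        if digest(c) == winner:
            return c
```

In this model a single corrupted copy is outvoted and replaced, for example `repair(b"corrupt", [good, good, good])` returns `good`. An attacker wins the vote only by altering more than half of the copies in the same way.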
Risk to Host Network
The LOCKSS box has been carefully configured to reduce the risk that it could be compromised and used as a base for further attacks against the network on which it is installed. Only the essential services run on the LOCKSS box, and those that do are hard to subvert.
As described above, neither of the two daemons that are remotely accessible (SSH and LOCKSS) can be compromised to gain root privileges. A hypothetical remote attacker who succeeded in compromising the SSH daemon's authentication phase would gain the privileges of the special SSH privilege separation user, which are minimal. If he compromised the LOCKSS daemon he would gain access to the unprivileged LOCKSS user "lcap". Neither of these users can, for example, snoop on network traffic or create privileged sockets.
Even if a remote attacker were able to penetrate a LOCKSS box and obtain root privilege, the damage he could do would be limited in time. LOCKSS boxes boot and run from a CD, not from software installed on the PC's hard disk. Because the software boots from read-only media, and uses programs run directly from the read-only medium to verify the digital signature of all software run subsequently, the state of the system after a reboot is known good. The partition on the hard disk containing the stored content is mounted with the nodev and noexec bits set so it cannot be used to harbor a Trojan. There is no place for an attacker who did manage to penetrate the system to install a Trojan or anything executable that would survive a reboot.
Response to Vulnerability
In more than 10 years, there has been only one remote vulnerability in the OpenBSD configuration we use: a problem in the SSH daemon. We learned from the experience of responding to this threat, and to a more recent SSH vulnerability from which the LOCKSS boxes were immune, how important it is to have a plan in place to respond to newly discovered vulnerabilities, and to exercise it regularly.
Each LOCKSS box regularly checks a set of download servers to see if there are new versions of any of the binary packages from which the boot sequence creates the running system. If there are, it downloads them and caches them on the hard disk, where the boot sequence will find them at the next reboot. If their digital signatures can be verified, the boot sequence uses them in place of the earlier versions on the CD.
Thus the process of responding to a newly discovered vulnerability in OpenBSD or one of the applications used by the LOCKSS box is as follows:
- The vulnerability is discovered.
- A patch for the vulnerability is promulgated.
- The designated responder incorporates the patch into the system source and builds an entire new release (this process takes about 90 minutes and is run automatically every night, so there is high confidence that it will work in a crisis).
- The designated responder extracts the packages affected by the patch from the newly built release, signs them, and puts them on the download servers.
- The designated responder sends e-mail or telephone messages to the cache administrators, asking them to log in to their caches and execute the single command "immediate_system_update" as root. This command performs the regular download check immediately, then reboots the LOCKSS box, causing it to use the newly downloaded package.
We tested this response process and managed to update 95% of the LOCKSS boxes within 48 hours.
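The boot-time choice between a package on the CD and a downloaded replacement, described earlier in this section, can be sketched as follows. This is an illustrative Python model only. The function `select_packages`, its data layout, and the `verify` callback are assumptions. The real check is performed by software run directly from the CD.

```python
def select_packages(cd_packages, cached_packages, verify):
    """Choose, per package name, which image the boot sequence should use.

    cd_packages:     dict of name -> package bytes from the read-only CD
    cached_packages: dict of name -> (package bytes, signature) from the hard disk
    verify:          callable(package_bytes, signature) -> bool, standing in
                     for the signature check run from the CD
    """
    selected = dict(cd_packages)  # default: the known-good CD version
    for name, (blob, signature) in cached_packages.items():
        # Assumption: a download may only replace a package that ships on the CD,
        # and only if its signature verifies.
        if name in selected and verify(blob, signature):
            selected[name] = blob
    return selected
```

A cached package with an invalid signature is simply ignored, so tampering with the hard-disk cache falls back to the CD version rather than installing attacker-controlled software.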
Response to Compromise
An administrator of a LOCKSS box who suspects that it has been compromised should immediately reboot it. This will restore the system (though not its stored content) to the state it was in after its last clean boot. Although this state is now suspected to be vulnerable, it will take time for the attacker, if there is one, to re-use his exploit to regain access. If at all possible, all packet traffic to and from the box should be logged. The LOCKSS team should be alerted as quickly as possible.
Using Other Platforms for LOCKSS
Long-term preservation of digital material requires the highest possible standards of system security. In the absence of a self-healing mechanism such as the cooperation among LOCKSS daemons provides, or an expensive off-line backup system, a single security breach can result in an unrecoverable loss.
Sites that wish to can run the LOCKSS daemon on any system with a suitable Java Virtual Machine implementation. Diversity among the JVM and operating system implementations considerably enhances the overall security of the system. The LOCKSS team supports Red Hat Linux and Solaris as alternate platforms. Although at present the LOCKSS team only has resources to support these platforms, we are eager to encourage other members of the community to diversify the platforms the LOCKSS system can use.
It is possible to run the LOCKSS daemon on a general-purpose server along with other services, but the LOCKSS team cannot recommend doing so. The only services running on a LOCKSS box are those which are essential to its function. They, and the underlying operating system, have been carefully configured to reduce the risk of compromise. Even if the underlying operating system is as secure as OpenBSD, running additional services on it can only increase the risk of compromise.
Using Packet Filtering with LOCKSS
The LOCKSS protocol is designed to be easy to filter. The LOCKSS box uses its own packet filters to block all non-essential traffic. Because the box is configured to run in high security mode, even an attacker who obtained root privilege would not be able to change these filters.
If a LOCKSS box is behind a firewall the only inbound connections that must be allowed are those for the LCAP protocol. The TCP port used is configurable. It will not, however, be practical to restrict access to this port to individually specified IP addresses of other LOCKSS boxes. The LOCKSS system is a peer-to-peer system:
- We expect new peers to join in at any time; there is no central control of the system from which they must ask permission or with which they must register. A central control point of this kind would be a single point of failure, and a major vulnerability.
- Further, peers may change their IP addresses at any time without notice. Their network administration could reassign the address to another purpose, making the assumption behind the filter entry obsolete.
- In addition the LOCKSS system, like other peer-to-peer systems, must take measures to prevent free-loading. A LOCKSS box that expects other boxes to provide it with service but will not provide service in return, because its communication is blocked by packet filters, will encounter these measures and, over time, will be "frozen out" of the system.
- Finally, we hope that there will in time be thousands of LOCKSS boxes, making individual IP address filtering too burdensome to be feasible.
LOCKSS boxes do not trust each other; they treat all other LOCKSS boxes as potentially hostile. For more details of firewall rules, please contact the LOCKSS team.