Privacy in stored data  

The problem

There are many aspects to privacy in electronic information, and many legitimate reasons to keep data private, both on personal and business grounds. Cryptography solves some of the problems as far as transmitted information is concerned. Stored information involves problems of a different nature. Some of these are discussed below, but this discussion is by no means complete.

How safe is information stored on your computer?
What can authorities, private investigators or hostile individuals do to gain access to information stored on your computer?

The following discussion deals mainly with Windows NT/2000/XP/Vista, the only system with which I have extensive experience. Much of this information is applicable to other systems, but I cannot tell you to what extent. I cannot be very precise on some of the points, because the technologies continue to improve. Nonetheless, the principal areas of concern are:

You as a user must be informed and aware of what you are doing. The strongest cryptographic algorithms cannot protect you if you write down your password on a corner of your desk, or use your fiancée's, wife's or daughter's name, your address, social security number or other familiar words and numbers as a password. As a rule, a phrase is better than a single word. A good password should satisfy all of the following requirements:

  • be at least 8 characters long (for a low-security password) or at least 50 characters long (for a truly secure password)
  • contain one or more lowercase letters, one or more uppercase letters, one or more digits, one or more punctuation signs
  • not be found in a dictionary or a common book. Large dictionaries, lists of names and common books and texts (e.g., the Bible, the US Constitution) are routinely used to automatically break passwords.
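
To see why the two length thresholds above differ so much, the size of the brute-force search space can be estimated. The sketch below assumes, purely for illustration, a password whose characters are drawn uniformly at random from the 95 printable ASCII characters. The function name is hypothetical, and real phrases made of words are far less random than this, so the figures are an upper bound.

```python
import math

def brute_force_bits(length, alphabet_size=95):
    """Bits of search space for a random password: log2(alphabet_size ** length)."""
    # Assumption: every character is chosen independently and uniformly.
    return length * math.log2(alphabet_size)

for length in (8, 50):
    print(f"{length} characters: about {brute_force_bits(length):.0f} bits")
```

Each additional bit doubles the number of guesses an attacker needs for an exhaustive search.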

For example, "now is the time for all good men" is not a good passphrase, but "now is tHe time for a!! g00d m3n" is considerably better, and "now isn't 12:00 a good time for eating lunch!?" is better yet, because it is not based on a sentence coming from a known source.
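
The requirements listed above can be checked mechanically. The following is a minimal sketch, not a complete password auditor: the function name, the wording of the messages and the optional wordlist argument are assumptions made for illustration.

```python
import string

def password_problems(password, wordlist=(), min_length=8):
    """Return the list of rules that the password fails (empty list = passes)."""
    problems = []
    if len(password) < min_length:
        problems.append(f"shorter than {min_length} characters")
    if not any(c.islower() for c in password):
        problems.append("no lowercase letter")
    if not any(c.isupper() for c in password):
        problems.append("no uppercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("no digit")
    if not any(c in string.punctuation for c in password):
        problems.append("no punctuation sign")
    # Assumption: the wordlist holds lowercase entries taken from
    # dictionaries, lists of names and well-known texts.
    if password.lower() in wordlist:
        problems.append("found in the wordlist")
    return problems

print(password_problems("now is the time for all good men"))
print(password_problems("now is tHe time for a!! g00d m3n"))
```

A check like this covers only the listed rules. It cannot judge whether a phrase comes from a known source unless that source is in the wordlist.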

Don't use the same password for different purposes. In particular, if you have a password which protects your sensitive data, don't use the same password for your e-mail, FTP or telnet accounts, because these services transmit your password in cleartext, and anyone located between you and your server can use a packet analyzer to collect passwords.

Don't trust the cryptographic libraries built into your operating system, unless they are available as source code and have been subjected to peer review. At least some of them may have been weakened intentionally in order to make them easy to break, while others are likely to contain backdoors and/or embedded keys which the manufacturer has passed on to the authorities. For instance, the independent security advisor Cryptonym discovered indirect evidence suggesting that Microsoft provided the NSA with built-in keys for the cryptographic engines of Windows NT and Windows 2000, thus enabling the authorities to more easily decrypt network transactions "secured" with these operating systems. This was subsequently denied. However, accidental security flaws are being discovered practically every day in major operating systems and software. More than a few copy-protection schemes have already been discovered that were introduced by the music and movie industries without telling users, and that sometimes negatively affect the users' computers. Organized crime has a keen interest in technologies that allow the computers of unsuspecting users to be remotely controlled for sending spam, automatically clicking on pay-per-click ads and carrying out illegal activities ranging from petty offences to identity theft and international terrorism. Police and secret services in several countries are known to intentionally infect the personal computers of people under covert investigation with software designed to log and transmit information on the user's activities. In the face of this, how far-fetched is the idea that, deep within the software, there may be one or more well-hidden and intentional backdoors already built in by the software manufacturers?

If you want to be sure that sensitive data does not leave your computer unless you choose to let it, don't connect this computer to a network (and especially not to the Internet). If you really value your data, use a non-networked computer to hold it, and a separate computer for other tasks that require a network connection. Be careful when you move data to and from the secured computer (e.g., with USB memory sticks, which you should wipe regularly, and from which the computer should not be allowed to run software directly). If your data is really valuable, it is probably worth more than the price of a second computer. Then, of course, you must protect the data on this computer with cryptography, as discussed below.

In addition, you must have a working understanding of the basic principles of cryptography. These pages are only a starting point.

Really deleting data

Forensic recovery techniques normally consist of making a mirror image of a hard disk and searching it for compromising words. These techniques are available to anyone with physical access to your computer (i.e., just about anyone who really wants to). Deleting a file is not safe in this respect, because all the data contained in the file remains on the hard disk (see following point).
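
The keyword search described above can be sketched in a few lines. This only illustrates the idea and does not describe any actual forensic tool: the function name and the chunk size are assumptions. The image is read in chunks, and a short tail of each chunk is carried over so that a keyword straddling two chunks is still found.

```python
def find_keywords(image_path, keywords, chunk_size=1 << 20):
    """Return sorted (byte offset, keyword) pairs for every match in a raw image."""
    needles = [k.encode("utf-8") for k in keywords if k]
    if not needles:
        return []
    overlap = max(len(n) for n in needles) - 1  # bytes carried between chunks
    hits = set()  # a set, so a match seen twice in the carried tail counts once
    with open(image_path, "rb") as image:
        base = 0       # file offset of the first byte in `window`
        window = b""
        while True:
            chunk = image.read(chunk_size)
            if not chunk:
                break
            window += chunk
            for needle in needles:
                pos = window.find(needle)
                while pos != -1:
                    hits.add((base + pos, needle.decode("utf-8")))
                    pos = window.find(needle, pos + 1)
            keep = min(overlap, len(window))
            base += len(window) - keep
            window = window[len(window) - keep:]
    return sorted(hits)
```

Calling `find_keywords("disk.img", ["invoice", "password"])` on a hypothetical image file named `disk.img` would list every byte offset where either word occurs, whether or not the file that once contained it still exists in the directory.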

Data that has been deleted and overwritten several times on your hard disk can still be recovered, although this involves the use of advanced facilities (clean-rooms similar to those used for semiconductor manufacturing, as well as special electronic equipment). In other words, here we are talking about the NSA, FBI, KGB, the security services of most developed countries, major technological industries, and a few of the most advanced universities (updated information: a modified atomic-force microscope can be used for this purpose, at a material cost probably not exceeding US$ 50,000, which puts this operation within the reach of small laboratories and determined individuals). I shall not discuss the technology involved here. Just how many passes (i.e., successive overwritings) can be regarded as safe is impossible to tell. A single pass effectively prevents the application of techniques normally used for forensic recovery. Four passes are known to allow an easy recovery through special techniques. Eight passes are recommended by the Pentagon for low-security erasing. Data recovered after 22 overwriting passes has been used as court evidence in computer-related criminal cases in the United States. Thirty passes or more should be safe, unless the opponent is exceptionally motivated to recover the data. Physical destruction of a hard disk, through a complex procedure (involving high temperature and grinding the whole hard disk to dust with abrasives), is required by the Pentagon for hard disks used to store sensitive data.
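
To make the notion of a "pass" concrete, here is a minimal sketch of overwriting a file in place before deleting it. It is an illustration under stated assumptions, not a replacement for a dedicated wiping program: the function name and parameters are hypothetical, it does not touch temporary copies, the swapfile or file slack, and the operating system, file system or drive may write the new data somewhere other than where the old data was stored.

```python
import os

def overwrite_and_delete(path, passes=1, chunk_size=1 << 16):
    """Overwrite a file in place with random bytes `passes` times, then delete it."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                block = os.urandom(min(chunk_size, remaining))
                f.write(block)
                remaining -= len(block)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push this pass to the device
    os.remove(path)
```

The number of passes is left to the caller, since the discussion above gives no single value that can be regarded as safe.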

There are several file and disk wiping programs available. I recommend two: BcWipe (available as freeware from Jetico Corp.) and PGP file and disk wipe, which is part of the PGP package (now owned by Symantec). I use both, because neither of them does everything I need. You must use these programs intelligently, and be aware that they do not automatically provide privacy. In particular, sensitive information may be stored in several locations on a hard disk in addition to the file itself. Remember these locations:

  • temporary files created automatically by word processors and other programs
  • free disk space resulting from the automatic deletion of the above files
  • the swapfile
  • the file slack at the end of each file
  • the cache of web browsers
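
The file slack item in the list above can be made concrete with a little arithmetic. File systems allocate disk space in whole clusters, so the unused remainder of a file's last cluster may still hold data left over from an earlier file. The sketch below assumes a 4096-byte cluster purely for illustration (the actual size depends on how the volume was formatted), and the function name is hypothetical.

```python
def slack_bytes(file_size, cluster_size=4096):
    """Unused bytes in the last cluster allocated to a file of `file_size` bytes."""
    if file_size < 0 or cluster_size <= 0:
        raise ValueError("file_size must be >= 0 and cluster_size must be > 0")
    # Distance from file_size up to the next multiple of cluster_size.
    return (-file_size) % cluster_size

print(slack_bytes(10000))  # 3 clusters = 12288 bytes, so 2288 bytes of slack
```

Wiping a file's contents alone leaves this remainder untouched, which is why wiping programs treat slack as a separate target.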

Disk encryption

Disk encryption programs are especially useful, because they create encrypted partitions or encrypted virtual drives on a physical drive. Once a disk or partition is encrypted, its data cannot be recovered without the key. You can even boot and operate from an encrypted disk, so everything is protected, and stealing the computer or hard disk gives the thief no access to the data. The programs I have tested and found to work are BestCrypt and PGPdisk. Currently I use PGPdisk.

An interesting characteristic of disk encryption is that you don't have to worry about overwriting the data occupied by deleted files, because all the contents of an encrypted disk - including unused areas and unused directory entries - are encrypted. The pagefile is still a security concern, unless it is also stored on an encrypted disk.

Remember that the data on an encrypted disk is no longer protected when the disk is unlocked with its key. At this point, spyware on the computer can access the data and do anything it wants with it. Therefore, be extra careful of what you run on a computer that holds secured data. Not using it on a network goes a long way towards protecting it, but you still need to be careful when connecting storage devices to it.

Further concerns

There are a growing number of reports of users carrying laptops being forced to boot and log onto their computers when passing through border checks (especially, but not only, in the United States). Border agents are known to have taken computers, unlocked in this way, out of sight of the owner, presumably to inspect and/or copy the hard disk's contents (there are also reports of phone memory cards being copied by border police). Whole-disk encryption is not sufficient protection in this case. If you want to cover this eventuality as well, clean up your computer thoroughly (preferably by wiping the hard disk and reinstalling the operating system and a minimum of software) before your trip. If you absolutely must take sensitive data with you, put it in one or more encrypted files containing a virtual disk, or another type of archive. You can also disguise these files, and/or carry them separately from the computer (for instance, on a DVD, or on a memory card, which is so small that it can be concealed practically anywhere). It is also feasible to prepare your encrypted materials before your trip so that they are available on-line, and to download them to your computer via the Internet once you have reached your destination. If you use reliable encryption software, you can transfer the encrypted files through public Internet channels and let them pass right under the nose of would-be peepers.

Typing on a keyboard generates electromagnetic pulses which can be intercepted from a distance of several tens of metres and through building walls. The necessary equipment can fit in a suitcase (in other words, the police have no particular need to park a large black van right under your window). I would expect a broadband generator of radio noise to be an effective countermeasure, but I have no concrete information. My first thought would be to try high-voltage, high-frequency generators like Tesla coils (but be careful, because a Tesla coil of moderate size can fry a computer several metres away). Having a few computers close to each other and running simultaneously, as is frequent in office environments, might also provide some protection (as long as they all use the same screen resolution and refresh frequency, see below).

Computer monitors emit electromagnetic radiation that can be intercepted and used to reconstruct the picture displayed on the screen. I don't know whether this applies to LCD screens as well, but would expect so. The equipment and possible countermeasures are similar to those described under the preceding point. The military and security agencies use so-called Tempest terminals, which are shielded against electromagnetic emissions. In case you wonder, a home-made Tempest shield for your computer is unlikely to be effective, and drowning the leaking electromagnetic signals with similar "noise" would seem to be more practical.

Special fonts, called Tempest fonts, have been developed to reduce the above problem. Instead of having sharp edges, they are "fuzzy" (in particular, their higher harmonics have been eliminated through a two-dimensional FFT), so that their potential for generating radio emissions is substantially reduced. From normal reading distance, these fonts are perfectly readable. Later versions of PGP provide the option of using a "secure reader" that employs this technology when decrypting text messages.
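
The idea of removing higher harmonics from a glyph can be illustrated on a tiny bitmap. The sketch below is not the procedure used to produce actual Tempest fonts: it swaps in a naive discrete Fourier transform for the FFT so that it needs only the standard library, and the function names, the 8x8 test glyph and the cutoff value are all assumptions. It transforms the bitmap, zeroes every coefficient above the cutoff frequency, and transforms back.

```python
import cmath

def dft2(img, inverse=False):
    """Naive two-dimensional DFT of a rectangular list of rows (fine for tiny images)."""
    n_rows, n_cols = len(img), len(img[0])
    sign = 1 if inverse else -1
    out = [[0j] * n_cols for _ in range(n_rows)]
    for u in range(n_rows):
        for v in range(n_cols):
            total = 0j
            for y in range(n_rows):
                for x in range(n_cols):
                    angle = sign * 2 * cmath.pi * (u * y / n_rows + v * x / n_cols)
                    total += img[y][x] * cmath.exp(1j * angle)
            out[u][v] = total / (n_rows * n_cols) if inverse else total
    return out

def low_pass(img, cutoff):
    """Zero all spatial frequencies above `cutoff` and return the real-valued image."""
    n_rows, n_cols = len(img), len(img[0])
    spectrum = dft2(img)
    for u in range(n_rows):
        for v in range(n_cols):
            # Indices above N/2 represent negative frequencies; fold them back.
            freq_u = min(u, n_rows - u)
            freq_v = min(v, n_cols - v)
            if freq_u > cutoff or freq_v > cutoff:
                spectrum[u][v] = 0j
    return [[value.real for value in row] for row in dft2(spectrum, inverse=True)]

# Hypothetical glyph: a sharp vertical bar, two pixels wide.
glyph = [[1.0 if x in (3, 4) else 0.0 for x in range(8)] for y in range(8)]
for row in low_pass(glyph, cutoff=2):
    print(" ".join(f"{p:5.2f}" for p in row))
```

The filtered bar keeps its overall brightness but its edges take intermediate grey values instead of jumping from 0 to 1, which is the "fuzzy" appearance described above.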

A related technology adds high-frequency harmonics to video signals in order to generate a display that looks normal but generates large amounts of radio signals. These signals can be used to transmit information (e.g., file contents or keystrokes) to a remote listener without the knowledge of the computer user.

Threats of physical harm or legal prosecution may be used to force you to "voluntarily" provide information of a private nature. Aside from deniable encryption, I know of no practical workaround (except getting a good lawyer and the support of civil-rights groups).