Amazing Graphics Papers: Dual Photography

This is a summary of the technique in the paper Dual Photography, which was presented at SIGGRAPH 2005. What the authors managed to do is very clever, and you can even understand the technique without knowing much high-brow math. Color me impressed.

Imagine you are photographing a scene which contains objects lit by a lamp. Suppose the light source is replaced with a structured light source, basically a projector. We could turn all the pixels of the projector on and get what is essentially a lamp. But in general we could light just some of the projector pixels, and illuminate the scene with whatever weirdly shaped light we want.

Fact: light transport is linear. So the relationship between the input vector p (which pixels on the projector are lit) and the output vector c (the brightness at each camera pixel) can be described by a matrix multiplication: c = T p. The ijth element of the matrix T is the brightness of the ith camera pixel when the jth projector pixel (and nothing else) is lit with intensity 1. (If the camera and the light source both have a resolution of, say, 10³×10³, then p and c both have length 10⁶, and T is a 10⁶×10⁶ matrix.)

We can actually construct the matrix T by lighting the first projector pixel, taking a picture, lighting the second pixel, taking a picture, and so on (each picture yields one column of T). Once we have this matrix, which completely characterizes the scene's response to light, we can plug in any vector p we like and see what the scene would look like under arbitrary illumination from the projector: the lighting can be changed entirely in post-processing.
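
Here's a minimal numpy sketch of that brute-force capture loop. Everything about it is a toy: the resolutions are tiny, and photograph() is a made-up stand-in for the physical projector-and-camera rig (the authors describe tricks for acquiring T far more efficiently than one pixel at a time, but the idea is the same).

import numpy as np

# Toy stand-in for the physical scene: a fixed transport matrix that the
# capture loop below never peeks at directly.
n_proj = 16 * 16          # projector resolution (toy numbers)
n_cam = 12 * 12           # camera resolution (toy numbers)
rng = np.random.default_rng(0)
T_true = 0.01 * rng.random((n_cam, n_proj))

def photograph(p):
    # Light transport is linear: camera image = T_true @ projector pattern.
    return T_true @ p

# Build T one column at a time: light projector pixel j alone, take a
# picture, and store that picture as column j.
T = np.zeros((n_cam, n_proj))
for j in range(n_proj):
    p = np.zeros(n_proj)
    p[j] = 1.0
    T[:, j] = photograph(p)

# With T in hand, any illumination can be synthesized in post-processing:
p_weird = rng.random(n_proj)      # some weirdly shaped light pattern
c = T @ p_weird                   # the picture we would have taken under it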

That's already kind of neat, but the most impressive trick here is based on the fact that light transport obeys the principle of reciprocity. If light of unit intensity entering the scene at projector pixel a lights up camera pixel b with intensity α, then light entering at b and exiting at a is transmitted with the same coefficient α. This is easy to see if the scene contains mirrors, but it's actually true in general, even when light is partially absorbed, scattered off surfaces in odd ways, and so on.

Now, what happens if we swap the locations of the projector and the camera in our scene? What's the matrix T' associated with this new scene? Because of reciprocity, it's merely the transpose of the matrix T: all the coefficients are the same, but they move around because we've swapped the 'input' and the 'output' of our system.
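
In symbols: the dual photograph is c' = T' p', where T' is the transpose of T, p' is a lighting pattern for the virtual projector (the physical camera), and c' is the image recorded by the virtual camera (the physical projector). Continuing the toy sketch from above:

# The dual photograph is one transpose away.
p_dual = np.ones(n_cam)     # floodlight the scene from the camera's position
c_dual = T.T @ p_dual       # the image seen from the projector's position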

You'll notice something funny about this, which is that using T', we can reconstruct the scene as if it were viewed by a camera where the projector was, and lit by a projector where the camera was. The authors show that this actually works, and describe some tricks they played to get it working well. See the pictures below: the second, third, and fourth pictures can all be synthesized from the matrix T, despite the fact that the scene was never photographed from that angle.

The authors then use this technique to (I am not making this up) read the back of a playing card. Watch the video (linked from the web page), then read the paper.

The approach works no matter what the resolutions of the camera and projector are. You can even take a picture using a projector and a photodiode (i.e. a single-element light sensor), because the effective resolution of the virtual camera is the resolution of the projector.
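
To see why, note that with a one-pixel sensor T collapses to a single row, yet its transpose still has one entry per projector pixel. Continuing the toy sketch, and modeling the photodiode (purely for illustration) as a sensor that integrates all the light the toy camera would receive:

# A photodiode is a one-pixel camera, so its transport matrix is a single row...
def photodiode(p):
    return photograph(p).sum()    # toy model: total light reaching the sensor

T_pd = np.zeros((1, n_proj))
for j in range(n_proj):
    p = np.zeros(n_proj)
    p[j] = 1.0
    T_pd[0, j] = photodiode(p)

# ...but its transpose still yields a full projector-resolution dual image.
dual_image = (T_pd.T @ np.ones(1)).reshape(16, 16)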

A web hosting recommendation: NFSN

This year I started hosting my personal web site with nearlyfreespeech.net. I have nothing but good things to say about them.

NFSN is a virtual hosting service, which means that small sites don't pay for capacity they aren't using, and large sites are automatically load-balanced. NFSN-hosted sites regularly get links from Slashdot, Digg, etc. and they stay up.

Pricing: NFSN is pay-as-you-go: bandwidth starts at $1/GB and goes down with usage, and storage is $0.01/MB/month. I'm paying less than $7/month for my site, and most of that is only because I'm running a web app that's a derivative of Wikipedia. No stupid pricing tiers here.

Environment setup: you can set up your shell for public-key access over SSH. Their environment comes with Emacs 22 and git installed, which means that, for once, my web host shell is actually useful and not just a pale shadow of a real working environment. (Yes, vim 7.1, bzr, hg, SVN, and CVS are installed too.)

Abuse policy: here is a clip from their abuse FAQ:

A NearlyFreeSpeech.NET member site is defaming me or otherwise injuring me civilly.

Please forward a copy of your legal finding from a court of competent jurisdiction to our contact address. If you have not yet obtained such a finding, a preliminary injunction or court order is also sufficient.

The rest of it reads similarly. It is nice that there are web hosts out there that respect their customers, even when those customers don't have absurdly priced SLAs.

Making backups (instructions for GNU/Linux)

The cardinal rule of backing up is: assume that any one of your hard drives could go up in smoke at any moment. Zap. Could be right now. Magnetic disks will fail, and it's not a matter of if, but of when. (Backups will also save you from some user error, although that is not their primary purpose.)

Once you are thoroughly convinced of that, you will be nearly paranoid enough to implement a good backup strategy. This may seem awfully depressing, but the great news is that storage is cheap these days. As of this writing, you can get a 500GB hard disk for under $100. Having a full backup of your files when your primary disk breaks into tiny pieces is worth a lot more than $100. I've lost three disks in the last three years. In no case did I lose any data.

I am not willing to mess with the hardware or software needed to configure a RAID, so my backup solution (based on jwz's PSA) involves a second hard disk and rsync on a cron job.

Whenever I get a new hard disk for a computer, I generally make it the primary disk (containing /, /home, and swap) and graduate the previous disk to the role of a backup and swap disk. The backup partition is formatted with ext3, just like my root and home partitions. Suppose that the backup partition on the second disk is mounted at /media/sdb1 and I want to back up my homedir to /media/sdb1/phil. (My system is pretty much a stock Ubuntu install, so there is little value in backing up stuff outside of /home.)

The following script, archive-homedir, rsyncs my homedir to the backup disk:

#!/bin/sh
# Mirror the home directory to the backup partition, then record when this backup ran.
rsync -vaxE --delete --ignore-errors /home/phil/ /media/sdb1/phil/
touch /media/sdb1/phil/last-backup

The options basically mean: print the files being copied; preserve timestamps and permissions; don't descend into other filesystems mounted under your homedir; delete files in the backup when they get deleted on the main partition; and ignore errors. The script also touches a file so you can see at a glance when the last backup was made.

This crontab file backup.crontab causes a backup to happen every day at 6AM:

0 6 * * * /path/to/archive-homedir

Install and activate it with crontab /path/to/backup.crontab. If your cron is configured to email you the job output, you will get the list of files that were backed up every morning. Watch out for messages telling you that your disks could not be read or written; these generally mean that you need an fsck or a new disk.

I use the same kind of setup to back up my laptop disk over the network. On my laptop, archive-homedir looks like this:

#!/bin/sh
# Laptop variant: push the home directory to the desktop over SSH, skipping
# anything matched by the exclude patterns.
rsync -vaxE --delete --ignore-errors --delete-excluded \
  --filter="merge /path/to/archive-exclude-patterns" \
  /home/phil/ desktop:/media/sdb1/laptop/

where archive-exclude-patterns is a file with a list of filter rules that instruct rsync to include or exclude certain files. I use this file to tell rsync not to back up some files that are not worth transferring over the network, like my web browser cache. My archive-exclude-patterns looks like this:

- /.local/share/tracker/
- /.local/share/Trash/
- /.mozilla/firefox/*/Cache/
- /.thumbnails/

On my laptop, I don't run archive-homedir on a cron job, but I do run it whenever I'm on the same LAN as my desktop.

Emacs in Ubuntu Hardy now has anti-aliased fonts

Update, 7 August 2009: the most recent major release of Emacs (v. 23.1) now has the anti-aliased font support. See the Ubuntu elisp PPA, which contains packages for any recent Ubuntu release, or see installation instructions for various other platforms.

The latest emacs-snapshot-gtk packages in Ubuntu Hardy (1:20080228-1) have the Unicode/font changes merged, and now support anti-aliased fonts for the first time.

While I didn't really mind the old bitmap fonts, I have to say that anti-aliasing is gorgeous.

To activate the new fonts, I added the following line to my ~/.Xdefaults:

Emacs.font: Monospace-8

Then, I ensured that the settings in .Xdefaults are being loaded by adding the following to my ~/.xsession:

if [ -f "$HOME/.Xdefaults" ]; then
  xrdb -merge "$HOME/.Xdefaults"
fi

Content-addressable storage

In Git, every file is stored as a blob named by a hash of its content. One consequence is that a particular file's contents are only ever stored once in the repository, no matter how many revisions they appear in or under how many names; every occurrence is just a pointer to the same blob of data.
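
Here is a minimal Python sketch of the principle. It is not Git's actual object format, just the core idea: the storage key is a hash of the content, so identical content is stored exactly once no matter how many names point at it.

import hashlib

class ContentStore:
    """Toy content-addressable store: blobs keyed by the SHA-1 of their bytes."""
    def __init__(self):
        self.blobs = {}   # hash -> file contents
        self.names = {}   # path -> hash

    def add(self, path, content):
        key = hashlib.sha1(content).hexdigest()
        self.blobs.setdefault(key, content)   # stored once, however many names
        self.names[path] = key
        return key

store = ContentStore()
k1 = store.add("scans/page-001.png", b"...image bytes...")
k2 = store.add("covers/front.png", b"...image bytes...")
assert k1 == k2 and len(store.blobs) == 1    # the duplicate costs no extra space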

I'm using Git to manage a large collection of binary files (scanned images). I'm using version control so I can add new data, replace bad scans, and rename or reorder files reversibly. I've noticed two great things about this:

First, since I frequently add new files and rename existing ones, but only occasionally delete or modify them, the repository (containing the entire revision history) is only slightly larger than the current contents of the working tree. This usage pattern holds for many kinds of binary data (e.g. photos, music), which makes Git very attractive: why keep a single backup copy when, for about the same amount of space, you could have a complete version history?

Second, pushing changes is very fast. Even after I rename hundreds of files, Git only needs to push a few kilobytes of metadata, because every blob is already present on the remote, just possibly under a different name. This is far more economical than rsync, which treats a renamed file as a brand-new one and re-transfers it in full.