Subtleties of search-and-replace in Emacs

The search/replace mechanism in Emacs is a beautiful thing, full of subtleties.

It is pretty clear that its design has been honed to an edge by about 20 years of heavy use. You might not think there could be much to it, but a lot of apps that have come along since then could learn something from some of its features:

Incremental search (isearch, or "find as you type"). This has gained traction elsewhere in recent years. To search, press C-s and start typing; C-s again takes you to successive matches. Emacs highlights all the matches that are currently visible on the screen.

You can save quite a bit of typing this way because you can stop as soon as you've typed a prefix that uniquely identifies the text you're looking for.

In addition, the search UI is in the minibuffer, instead of in a separate dialog that covers part of your document.

Firefox, Chrome, and gedit all have fairly similar features; however, they only expose bare-bones search functionality through them. OpenOffice and most word processors bring up a big and jarring dialog.

Yanking from the document. This is like auto-completion for your searches. Suppose you're looking for all instances of "tag.artist" in your document. Press C-s and start typing; after you've entered "tag.", Emacs highlights the first match.

Now you notice that the word following the highlighted match is exactly what you were looking for. Just press C-w to pull the next word into your query, making it "tag.artist".

Then pressing C-s searches for subsequent occurrences of tag.artist. If you think this is useful, just think about how awesome it is when you need to search for superLongIdentifierNames in Java code.

You can also yank just the next character from the document, or everything up to the end of the line. Type C-h k C-s in Emacs to learn more.

Case replacement. If you have the text

Eat foo for breakfast!
Foo is the best medicine.
VOTE FOO FOR PRESIDENT

and replace (M-x query-replace or M-%) "foo" with "bar", what do you get? Exactly what you expect:

Eat bar for breakfast!
Bar is the best medicine.
VOTE BAR FOR PRESIDENT

This is clearly what you want most of the time, at least when you're editing prose. Gedit and OpenOffice don't do this (and Chrome and Firefox don't have a "replace" feature).

(What happens, you ask, if you're editing code, or anything else where case sensitivity is critical? Emacs's heuristic is to only turn on these smart replacements when your query and replacement text are both all lowercase. You can also manually toggle case-sensitivity in searching with M-c. For what it's worth, I've never noticed a false positive when editing code.)

Cycling behavior. Many programs do a poor job of letting you know when your search has wrapped around to the beginning of the document. In Firefox, I often find that I can inadvertently cycle through looking at all the search results two or three times before I realize what I've done.

After the last match in a document, Gedit and Chrome wrap around to the beginning silently. Firefox wraps around and concurrently displays a notification in the search UI. The problem is that if your attention is focused on the document content—as it probably is if you're scanning the individual matches—it's easy to miss the cues that tell you the search has wrapped around (e.g. Firefox's notification, or the scroll bar jumping back). You can end up back at the beginning of the document without realizing it.

At the other end of the spectrum, OpenOffice brings up a big and jarring dialog asking you "Yes/No" whether you want to continue searching at the beginning. Not so great, either.

How does Emacs deal with this? When you've reached the last match for a query in the buffer, pressing C-s does nothing except raise a quiet notification ("Failing I-search") that there are no remaining matches.

If you press C-s again, however, the search wraps back around to the beginning of the buffer.

That is, at the last match, C-s does not immediately jump to a new match (as it usually does); Emacs gives just enough "pushback" to ensure that the user is aware when the search is about to wrap, even if they're not paying any attention to the application chrome (minibuffer and scroll bars)! Very cool.

Displaying failed searches. When you type something that doesn't have a match, Emacs brings you to the maximal matching prefix of whatever you typed, and highlights the part of your query that wasn't found.

This kind of feedback gives you a good idea of whether or not you made a typo/error in your search: if you did, it often appears near the point where the text stopped matching.

If you only get all-or-nothing feedback telling you "no matches were found," then you are sort of groping around in the dark as you try to figure out what went wrong.

Search as a method for navigation. Perhaps the killer feature of isearch in Emacs is that you can use it as the primary method for moving around in a document, essentially replacing most of the functions of the PgUp/PgDn keys and the scroll bar in other applications.

Part of this has to do with the fact that the search UI is so lightweight (no dialog boxes!). However, isearch in Emacs is also cool because it facilitates many use cases:

  • During a search, you can press C-g to cancel and return to where you started. This is handy if you just wanted to search for something in order to get a quick glance at it.
  • …However, if you do indeed want to stop at the current match, press RET. Perhaps you want to make some quick edits or do some more extensive reading at that spot. But when you're done with that, you can press C-x C-x to return to where you originally were, because Emacs has set the mark to the place where you started.
  • …Or, you can just continue editing there with no intention of returning to your original position.

Gedit and OpenOffice really only support that very last use case. You can hold your existing spot in a document with the cursor, but then you're limited to looking around with the scrollbars, which is cumbersome. Or you can use the Find feature or otherwise move the cursor, but then you've lost your original position.

Emacs is like your very own personal assistant: it remembers things so you don't have to.

Conclusion

I haven't mentioned many of the miscellaneous keyboard-accessible commands and options available from within isearch and query-replace. These are well worth your time to learn, although they are quite self-explanatory if you read the integrated help:

  • For options available in isearch, type C-h k C-s.
  • For options available in query-replace, type C-h at a query-replace prompt.

I'd sum up some of the more generalizable design principles behind search-and-replace in Emacs as follows:

  • Reduce extraneous typing.
  • Stay out of the user's way whenever possible.
  • However, alert the user with immediate and useful feedback whenever necessary.

Manipulating Picasa Web Albums programmatically

I was pleased to find out that the Picasa Web Albums API makes it quite easy to programmatically upload and download photos from Picasaweb. So any desktop app that manipulates photos or graphics can easily be extended so that it can interact with web albums. The possibilities are manifold… below is some code that demonstrates a couple of the basic capabilities.

As with everything else on Ubuntu/Debian, it takes only about 15 seconds to get everything you need to get started. Just install the python-gdata package:

$ sudo aptitude install python-gdata

Uploading albums to Picasaweb

The following function creates a new Picasaweb album from a sequence of photos and associated captions.

#!/usr/bin/python
import gdata.photos.service
import os.path

def create_new_album(album_title, email, password, photos):
    """
    Creates a new Picasa Web Album with the specified images.

    album_title: title of album, e.g. "Spring Break 2009"
    email: Google account name, e.g. yourname@gmail.com
    password: Google account password
    photos: sequence of tuples where each entry is (filename, caption)
            e.g. [("/tmp/photo0001.jpg", "A huge turtle"), ...]
    """
    # Authenticate to Picasa Web Albums.
    gd_client = gdata.photos.service.PhotosService()
    gd_client.email = email
    gd_client.password = password
    gd_client.source = 'MyPhotoUploader' # Fill in the name of your program here.
    gd_client.ProgrammaticLogin()

    # Create the album.
    album = gd_client.InsertAlbum(title = album_title, summary = '')
    album_url = '/data/feed/api/user/%s/albumid/%s' % \
        (album.user.text, album.gphoto_id.text)

    # Insert the photos.
    for filename, caption in photos:
        gd_client.InsertPhotoSimple(
            album_or_uri=album_url,
            title=os.path.basename(filename), # This shows up as 'Filename' on Picasaweb
            summary=caption,
            filename_or_handle=filename,
            content_type="image/jpeg")
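
For example, a hypothetical invocation (substitute your own account details, filenames, and captions):

create_new_album("Spring Break 2009",
                 "yourname@gmail.com", "your-password",
                 [("/tmp/photo0001.jpg", "A huge turtle"),
                  ("/tmp/photo0002.jpg", "Sunset over the bay")])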

Downloading albums from Picasaweb

Normally you can't download entire albums directly from the web (unless you have the Picasa client software installed). You can however use the API for this purpose (print_album_names and download_album below form the core, non-interactive part of the program):

#!/usr/bin/python
#
# This interactive script prompts the user for an album to download.

import gdata.photos.service
import urllib

def main():
    "Downloads a Picasa Web Album of the user's choice to the current directory."
    gd_client = gdata.photos.service.PhotosService()
    username = raw_input("Username? ")            # Prompt for a Google account username.
    print_album_names(gd_client, username)        # Enumerate the albums owned by that account.
    album_id = raw_input("Album ID? ")            # Prompt for an album ID.
    download_album(gd_client, username, album_id) # Download the corresponding album!

def print_album_names(gd_client, username):
    "Enumerates the albums owned by USERNAME."
    albums = gd_client.GetUserFeed(user = username)
    for album in albums.entry:
        print '%-30s (%3d photos) id = %s' % \
            (album.title.text, int(album.numphotos.text), album.gphoto_id.text)

def download_album(gd_client, username, album_id):
    "Downloads all the photos in the album ALBUM_ID owned by USERNAME."
    photos = gd_client.GetFeed('/data/feed/api/user/%s/albumid/%s?kind=photo'
                               % (username, album_id))
    for photo in photos.entry:
        download_file(photo.content.src)

def download_file(url):
    "Download the data at URL to the current directory."
    basename = url[url.rindex('/') + 1:]  # Figure out a good name for the downloaded file.
    print "Downloading %s" % (basename,)
    urllib.urlretrieve(url, basename)

if __name__ == '__main__':
    main()

More info

Picasaweb API developer's guide (Python client). From that page you can also find more about bindings for .NET, Java, and PHP.

I hereby release the code in this post into the public domain.

With a partner like Apple, who needs competitors?

Apple's contempt for iPhone users and developers continues to strain credulity. It is exemplified by its response to the FCC inquiry into the Google Voice app:

FCC: Why did Apple reject the Google Voice application for iPhone and remove related third-party applications from its App Store?

Apple: Contrary to published reports, Apple has not rejected the Google Voice application, and continues to study it. The application has not been approved because, as submitted for review, it appears to alter the iPhone's distinctive user experience by replacing the iPhone's core mobile telephone functionality and Apple user interface with its own user interface for telephone calls, text messaging and voicemail. Apple spent a lot of time and effort developing this distinct and innovative way to seamlessly deliver core functionality of the iPhone.

At first glance what Apple has written doesn't even appear to be an answer to the question. Apple has, surprisingly, not claimed that it's protecting users—from harm, or from confusing apps. It only alludes to the fact that it has previously cited "duplicating existing functionality" (read: competing with Apple) as grounds for rejection.

Now, it is Apple's platform to mold as they wish; if consumers get a gratis dialer app either way, what's the harm? Why would Apple object to third-party developers making the iPhone better? Apple recognizes that it can't prevent them from subsequently bringing equivalent functionality to Android or WebOS or whatnot. What if there comes a day when the most commonly used apps (dialer, browser, etc.) have feature parity across platforms? Who would buy an iPhone then? Many people, to be sure, just not quite as many as before.

But instead of staying competitive by making great software, Apple is doing pretty much the opposite. It is trying to forestall competition the only way it knows how: by crippling the iPhone's software—barring entire classes of applications—to obfuscate comparison between its products and competitors'. The result? There's a whole world of apps now that the iPhone just won't run, to say nothing of the apps that never get written because development is such a crapshoot. Users have to pay the cost for Apple's decisions every day now.

To add insult to injury, Apple is eager to remind everyone just how much "time and effort" its engineers spent on the iPhone's dialer. Does Apple think so little of its users that it considers its own good intentions justification enough to override users' requests? This isn't elementary school. You do not get points for "effort."

This madness is even codified in the iPhone SDK developer agreement. Native apps are forbidden from carrying out instructions on behalf of users ("executable code" or "interpreted code"), [1] lest users actually get to choose what they want to do. From an engineering standpoint, code is everywhere—web browsers, office programs, anything with macros, even calculators all have "interpreted code" at the core of their functionality. It is striking that Apple actually mandates that apps subvert their users instead of empowering them.

This situation is rather surreal.

Apple does not even try to hide its contempt for its customers. The iPhone is a device that only allows choosing from a circumscribed set of carefully enumerated functions. Apple only thinly veils the fact that it despises third-party developers, who are ostensibly its "partners". If you develop for the iPhone, you have to deal with a company that believes it has more to gain from hindering you than from helping you.

I am not suggesting that Apple's behavior is illegal, or even incomprehensible (the shareholders must love it)—merely abhorrent. We choose the world we want to live in, and this is not something I want to be a part of, not when there are so many worthy alternatives.

Yet, Apple's candor is nothing if not refreshing. Most companies don't openly talk about the mechanics of their anti-competitive practices. They speak of protecting consumers from confusion, or keeping prices low. Not Apple. There is no dissembling here. Apple freely admits that it will do whatever it takes to keep competitors from getting a foothold, no matter the cost to its own customers. When we consider that fact, it seems that the App Store is not the opaque and inscrutable system that many have claimed it is. What Apple has done has been no more, and no less, than what it has said it would do.

[1] 3.3.2 An Application may not itself install or launch other executable code by any means [...]. No interpreted code may be downloaded and used in an Application except for code that is interpreted and run by Apple's Published APIs and built-in interpreter(s).

You will have to see Wikileaks(!) for the full agreement.

Modifying SSH port forwardings mid-session

I frequently use SSH port forwarding to access services on computers I'm connected to (e.g. VNC, web servers, Zeya). For example,

ssh -L 8001:localhost:8080 foobar

connects port 8001 on my local machine to whatever service is running on foobar port 8080.

Sometimes I'll discover mid-session that I wish to connect to a new service I've just started up remotely, or that I forgot to add the -L flag for some service I wanted. I could always just disconnect, add the appropriate port forwardings, and reconnect.

However, I just learned that SSH also supports some escape sequences, one of which lets you break out to a command line, where you can change port forwardings mid-session without disconnecting.

With the default settings, type ~C at the beginning of your session or after a newline. You'll see a command prompt:

ssh>

At this prompt, you can add additional forwardings using the same syntax that ssh accepts:

  • Local forwarding to remote service: -L local_port:hostname:hostport
  • Remote forwarding to local service: -R remote_port:hostname:hostport
  • Dynamic forwarding, e.g. for SOCKS: -D port
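
For example, to add the forwarding from the earlier ssh invocation to an already-running session, press Enter, then type ~C, and enter:

ssh> -L 8001:localhost:8080

The forwarding takes effect immediately; there is no need to reconnect.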


Assorted notes

Many random notes (some for my own reference), each of which is too short to warrant a full blog post:

Emacs

It had always bothered me, just a little, that C-v and M-v, in addition to scrolling, move the cursor to the bottom or top of the screen. This has the odd effect that C-v M-v is not a no-op. Turns out that setting the variable scroll-preserve-screen-position appropriately can fix this. (via emacs-devel)
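
For reference, the one-liner (the value t tells scrolling commands to keep point at the same screen position):

(setq scroll-preserve-screen-position t)

With this set, C-v followed by M-v brings you right back to where you started.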

Chrome

  • Have you ever wondered why running Firefox instances get broken when new versions are installed? On GNU/Linux systems the package manager, which runs as root, pays no mind to what non-root users are doing. Some apps, like Firefox, run into major trouble because they'll load additional files after startup that can't be mixed and matched with the previous versions (this isn't really a problem for programs that are essentially just one binary). Chrome acquires immunity to this with its Zygote mode, which basically means do not load anything from disk after startup. A simple idea, but it takes some work to follow through with it.
  • Among visitors to this blog, Chrome is the second most popular browser, with a 9.5% share! Chrome leads IE (6.6%), Safari (5.9%), and Opera (4.3%), and trails only Firefox (72.6%). You can sort of see what a non-representative sample of web users happens across this blog.

Unix/Ubuntu

  • Soundconverter (aptitude install soundconverter) is a handy GTK/GNOME program for converting (transcoding) audio between formats. It is easy to use, transfers all the music metadata, and seems to take full advantage of multicore processors. (I use it to downsample my music, mostly in FLAC, to Ogg or MP3 for use on portable devices.)

  • When using find with xargs, xargs will get tripped up by filenames with spaces (it will treat each space-delimited component as a different argument). You can get around this by changing the delimiter in both find and xargs to \0 instead of space, as follows:

    find . -iname '*.tmp' -print0 | xargs -0 rm

    You can usually achieve the same effect with find ... -exec but I can never remember how to use it.
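
    For the record, the -exec form is below (the trailing + batches many filenames into each rm invocation, like xargs; terminating with \; instead runs rm once per file):

    find . -iname '*.tmp' -exec rm {} +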

HTML, HTML5

Some things I picked up while working on Zeya:

Git for researchers

In my previous job—as a grad student, doing computational/biomedical research—I used Git to manage my projects.

For small projects, people usually treat CVS/SVN as checkpointing tools—tools to get you back to a known good state when you've screwed up. Git, however, provides a whole new vocabulary you can use to talk about creating, altering, composing, combining, splitting, undoing, and otherwise manipulating changes to code (commits). It helps you get stuff done faster every day, not just when you mess up.

Here are a couple of reflections and "lessons learned" on really using VCS to your advantage in a research environment, where some of the rules of thumb are a bit different from those in industry.

(They seem so stunningly obvious now that I've committed them to writing, but they seemed much less so when I first articulated them to myself.)

Retaining history, all of it. I have found git merge -s ours to be very handy. It produces a merge commit and merge topology, tying in the history of the other branch, but without applying any of the changes produced in that branch.

Typically, if a feature doesn't pan out, you delete the corresponding branch and destroy all evidence that you tried. But in exploratory or research contexts, the details of your failed experiments can be quite important. You might need to revisit some past state in order to perform further investigation. Or maybe you want to obtain some numbers for a paper or presentation.

Graphically: imagine you have a "successful" branch feature1 and a "failed" branch feature2 (the first diagram below). You don't want to git branch -D feature2, since that could cause its history to be lost. If you instead git merge -s ours feature2, you get a topology where the states from both branches appear in your git log (the second diagram), but the state at the tip is the same as that at feature1.

Before:

* ddddddd (refs/heads/feature1)
* ccccccc
* bbbbbbb
| * 2222222 (refs/heads/feature2)
| * 1111111
|/
* aaaaaaa

After:

* eeeeeee "Merge branch 'feature2'."
|\
* | ddddddd (refs/heads/feature1)
* | ccccccc
* | bbbbbbb
| * 2222222 (refs/heads/feature2)
| * 1111111
|/
* aaaaaaa
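
In commands, the workflow is roughly:

git checkout feature1
git merge -s ours feature2   # tie in feature2's history without applying its changes
git branch -d feature2       # safe: feature2's commits stay reachable via the merge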

This kind of setup makes tracking your progress super easy. My git log basically becomes the scaffolding for my research notebook. I have bare-bones notes like the following:

Commit 2222222: this change did not improve quality at all. Furthermore it runs much slower, probably because blah blah blah blah. See full output in /home/phil/logs/2222222.

The great thing is that now every result (whether a success or a failure) has, attached to it, a commit name: a pointer to the exact code that generated that result. If I hadn't had complete change history so easily available, I would have spent half of my time second-guessing results I'd already obtained.

This application also demonstrates the strengths of DVCS versus CVCS. Research and software development do not happen in a clean linear way. There is lots of backtracking, and sometimes you cannot expect to work effectively with a VCS whose basic model is "one damn commit after another."

Summary: 90% of everything ends in failure. Keeping your failure history (as well as your success history) around is something that is underemphasized.

Long-lived branches vs. refactoring. If you know what you're going to do in advance, then it's not called research. In my work, what I ended up writing on a day-to-day basis depended more on experimentation and testing than on planning and specs. Here's some sample code for illustrative purposes:

# (1)
def my_function(a, b):
   foo = random_sample() # Random heuristic
   something(foo)
   ...

I want to find out how the following code stacks up against (1). Does it perform better? Is it faster?

# (2)
def my_function(a, b):
   foo = shortest_path(a, b) # A better(?) heuristic
   something(foo)
   ...

In reality we might be evaluating alternative heuristics (as here), different numeric parameters, alternative algorithms, or an alternative data source (e.g., training vs. testing data).

Sometimes, when there are a number of alternatives, the right thing to do is to refactor to parameterize the code, for example,

# (3)
def my_function(a, b, heuristic = 'shortest_path'):
   if heuristic == 'random':
       foo = random_sample()
   elif heuristic == 'shortest_path':
       foo = shortest_path(a, b)
   else:
       foo = ... # Additional logic...
   something(foo)
   ...

But every parameterization increases complexity. The new argument is something you have to think about every time you or someone else tries to read your code. Your function is longer, leaks more implementation details, and provides less abstraction. So you don't want to go down this route unless it's necessary. If one choice is a clear winner, and every invocation is going to pass the same argument, then the extra generality you introduced is a liability, not an asset. To do that refactoring can be a lot of work without much reward.

So you want to run and evaluate the alternatives before refactoring. People who find themselves in this situation often write code like this:

# (4)
def my_function(a, b):
   foo = random_sample()
   ## Uncomment the next line if blah blah blah
   # foo = shortest_path(a, b)
   something(foo)
   ...

which is convenient to write, but setting all the switches by hand whenever you want to run it is rather error-prone, especially if the difference is more complicated than one line.

Branching saves the day by letting your tools manage what you were doing by hand in (4). You can compare alternatives like (1) and (2) above against each other if you keep them in parallel branches (granted, you can't select between the alternatives at runtime, but that may be OK). Maintenance is a breeze: with git merge it's easy to maintain multiple parallel source trees, differing by just that one line, for as long as you please. And because every state you test is itself a commit, your results are 100% reproducible (if you were toggling those lines by hand, reproducing a code state would require specifying not only a commit name but also which lines you had commented and uncommented).
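
A sketch of that workflow, with a hypothetical branch name:

git checkout -b shortest-path    # fork a branch for alternative (2)
# ...swap random_sample for shortest_path; commit...
git checkout master              # alternative (1) continues to live here
# ...later, to carry ongoing work into the experiment:
git checkout shortest-path
git merge master                 # the branches again differ by just that one line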

After branching, you can mull it over and obtain data on all the alternatives. When you've made your decision, you either drop one implementation and end up with (1) or (2), or, if you need the generality, then you refactor so you can choose between them at runtime (3).

Summary: lightweight branches allow you to defer the work of refactoring rather than having to pay for it up front. They greatly improve the hackability of code, by letting you try out many different alternatives reliably and without much hassle.

Zeya 0.3

I'm pleased to announce the 0.3 release of Zeya. Zeya is a music player that runs in your (HTML5 draft standard-compliant) web browser, giving you streaming access to your music collection from pretty much anywhere.

Romain Francoise has generously packaged Zeya for Debian. Debian "testing" (squeeze) users can now install Zeya as follows:

# apt-get install zeya

(Zeya 0.2 is in Debian "testing" at present; Zeya 0.3 will be in "unstable" shortly and in "testing" after the requisite testing period.)

Many significant changes have been made since Zeya 0.2:

  • You can filter your music collection by title, artist, or album with Zeya's new search functionality. (Thanks to Romain Francoise)
  • Zeya implements traffic shaping for Firefox clients, to keep from hosing low-bandwidth network links.
  • Zeya supports password protecting your music collection with Basic HTTP authentication, configurable with --basic_auth_file. (Thanks to Samson Yeung)
  • The default backend is now dir (read all music within a directory) instead of rhythmbox.
  • Initial load time is significantly improved, as Zeya now compresses output data when appropriate. This yields a 3-7x decrease in transfer size. (Romain Francoise)
  • The target output bitrate can be set with --bitrate. (Romain Francoise)
  • Zeya does a better job of guessing reasonable metadata when the ID3 tags are missing. (Samson Yeung)
  • Zeya listens on IPv6 interfaces by default. (Romain Francoise)
  • Zeya is multithreaded for improved parallelism.
  • zeyaclient.py supports skipping to the next song (with C-c) and jumping back to the query prompt (with C-c C-c).
  • Zeya decodes MP3s using mpg123 instead of mpg321, for some performance improvements.
  • An online guide to the keyboard shortcuts has been added (click "Help" at the bottom or press ?).

Many bug fixes and UI improvements have also been made.

Known issues:

  • zeyaclient has not yet been updated to support basic HTTP authentication.

You can obtain Zeya via git:

git clone http://web.psung.name/git/zeya.git

As I mentioned above, Zeya is also packaged for Debian.

See http://web.psung.name/zeya/ for more information, including a quick start guide. We'd appreciate hearing any problem reports on our new bug tracker or via Debian's bug tracker.

Zeya 0.2

Zeya is a music server that streams your music collection to any computer, phone, television, picture frame, or refrigerator that has a current-generation web browser. The client part uses the <audio> tag in the HTML5 draft spec, so it runs right in the browser— without Flash or Silverlight and without the need to install any extra software at the client.

I'm pleased to announce the 0.2 release of Zeya.

New since Zeya 0.1:

  • Support for Internet Explorer (via Google's Chrome Frame plugin). IE joins Firefox and Chrome in the list of supported browsers.
  • A new console frontend (more below)
  • Numerous UI improvements, both substantive and cosmetic
  • A unit test suite

There are many bug fixes, the most notable being:

  • Filenames with non-ASCII characters can be read and served correctly.
  • Files that are not in a decodable format are hidden entirely from the user.
  • Zeya should actually work in Python 2.5 as advertised.

Known issues:

  • 64-bit GNU/Linux builds of the Google Chrome dev channel are undergoing some codec turbulence. Use either the 64-bit Chromium builds or 32-bit builds of Chrome or Chromium instead.

The new console frontend, zeyaclient.py, is a simple (read: primitive) app that connects to a Zeya server and prompts for songs to play. This is handy if you are using a computer that doesn't have a supported web browser (but on which you can run Python scripts). The Sugar OS on the XO-1 is one such setup, so I'm now using my XO, which is connected to a hi-fi set, as a jukebox for my living room.

I've also been using Zeya more frequently to listen to music (from my home computer) at work. It's much more satisfying than internet radio.

Visit http://web.psung.name/zeya/ for more information, or read previous blog posts on Zeya.

It's time for web developers to break out the champagne

It is somewhat amusing to realize that Google is not playing the same game as the other browser vendors. Its incentives are very different, and that makes all the difference. Google is not directly interested in increasing adoption of Chrome; it actually benefits from increased use of the web on any browser. Sundar Pichai, the Google VP in charge of Chrome, has said as much:

"We were all very clear that if the outcome was that somehow Mozilla [Firefox] lost share to Google [Chrome], and everything else remained the same, internally, we would have been seen as having failed," Mr. Pichai says.

So, journalists who are fixated on market share, like JR Raphael of PC World (not to pick on him), are missing the point:

Can Chrome Shake Up the Browser Market? ... As it stands now, Chrome holds about 3 percent of the global browsing market ... Google's hope ... is to double that share by next September—then triple it by 2011.

Chrome has already shaken up the browser market. In the year that Chrome has been out, speed has become one of the primary selling points for every major browser. The performance of every browser (and hence, every web app) has improved dramatically, and more improvements are in the pipeline. That fact—not Chrome's who-even-cares percentage market share—is the story of this year.

And the folks at Google are probably pretty happy with that. If people use a fast browser and load more web pages and use more Google products and see more advertisements, who cares if it says "Firefox" or "Chrome" in the title bar? Google doesn't.

Hell, who cares if it says "Internet Explorer"? Google doesn't. Which brings me to Chrome Frame, a plugin that runs a Chrome renderer in IE.

I wept tears of joy when I first saw Chrome Frame's rendering of Acid3. For comparison, see IE8's native rendering of Acid3.

The Chrome renderer is activated on an opt-in basis from web developers, using a special META tag.
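
For reference, the opt-in tag looks like this; when it's present and the plugin is installed, Chrome Frame takes over rendering the page:

<meta http-equiv="X-UA-Compatible" content="chrome=1">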

It remains to be seen how all this will pan out, but this is a very clever move on Google's part. It changes the economics both for IT departments and for developers:

  • Many if not most IE users are unwilling or unable to change their browser for legacy/lockdown reasons; I suspect that is the most important reason why Firefox, after all these years, is only at around 23% market share. It's not that IT departments are inherently against installing new software. They'd love to be able to deploy spiffy HTML5 apps too. But their hands are tied because they must make sure existing apps keep working. Now that they can use HTML5 apps without breaking any legacy apps, they may be a lot more open to deploying Chrome Frame than to changing browsers entirely.
  • Installing a plugin is a lot easier for end users than installing a new browser. In fact, it's a routine enough operation that I suspect many web developers will soon choose to take the Chrome Frame route rather than continue to expend time, money, blood, sweat, and tears working on hacks for native IE support. When users are presented with a dialog saying "In order to use This App You Really Want you must install the mumble mumble plugin," most of them will do it. (Flash has upwards of 90% market penetration!)

Because it was willing to sneak in the side door, ship as a plugin, and strip the chrome from Chrome, Google may be the first to have a real shot at bringing fast and standards-compliant browsing to a majority of web users.

New external Thinkpad keyboard

I ordered one of the new external Thinkpad keyboards and have been using it for a few days now.

Specs in brief: USB, no touchpad, no numpad, feels exactly like a Thinkpad T400s keyboard, and the price is right ($60).

For my own purposes, the design decisions that Lenovo made here are pretty much spot-on. This thing is fantastic; in particular:

  • TrackPoint scrolling on my desktop machine! No exaggeration, this alone is worth the price. Scrolling any other way is just… uncivilized. I look forward to using the mouse a lot less.
  • I was at first skeptical of Lenovo's decision to make a small keyboard, but I think it really turned out well:
    1. It significantly reduces the distance I have to reach to press keys like PgUp, PgDn, and the arrow keys. Alas, there are still a bunch of apps that have the poor sense to bind commonly used commands to keys like Ctrl+PgDn. I'm looking at you, web browsers. (If you wish, read more ranting about poorly chosen keyboard shortcuts.)
    2. I don't use the numpad often enough to justify having one.
    3. I like having the free desk space. And not having to reach as far for the mouse when I want to use it.
  • I like how thin the keyboard is (it's sort of wedge-shaped, and only about half an inch thick at the home row). At first I didn't think this would matter, but I definitely end up contorting my wrists much less than with my old keyboard.

And, to boot, it looks very classy— pictures don't really do it justice.

Thinkpad keyboards do have some quirks (e.g. Fn key placement). I don't mind these anymore, but I know some people do. So perhaps the most useful thing that I can say in this review is that there are no surprises, as far as I can tell. If you've used a Thinkpad, you probably already know whether you want one of these Thinkpad keyboards.

The only reservation I have is that the keys seem to be spaced a little further apart than I'm used to (having become acclimated to the X series keyboards). I'm presuming that this is something I'll get used to over time, so I'm not worrying yet.

Kudos to David Hill and his team for a job well done.

ThinkPad Keyboard at the Lenovo Store

Additional notes on GNU/Linux support:

  • All the special keys (that I tried, anyway— volume, mute, and media player control) work out of the box on Ubuntu GNU/Linux 9.04.
  • TrackPoint scrolling works, too. Follow the instructions here, except in mouse-wheel.fdi, use "Lite-On Technology Corp. ThinkPad USB Keyboard with TrackPoint" instead of "TPPS/2 IBM TrackPoint". (In general, it looks like you can figure out the right info.product string to use by running xinput list and finding the right device from the list.)

Getting more "flow" from your window manager

Working on a Thinkpad keyboard is pure bliss. The Trackpoint lets me point (and scroll!) without moving my wrists; within Emacs/screen, even that is superfluous. Being able to immediately act on one's intentions is an important factor in attaining a state of flow, and there is a huge psychological difference between being able to do something "now" and "wait, hang on for 500ms so I can reach the touchpad." When I'm really in the zone, I can actually notice my concentration dissipating if I need to reach for the arrow keys, or worse, the mouse or touchpad (this is hard to explain to people who don't use Emacs or Vim!).

But window managers are still awful

Yet, even if I'm using Emacs and/or a web browser 90% of the time, and even supposing they left nothing to be desired, there's another program I have to interact with nearly 100% of the time: the window manager.

It's a shame, but few of the usability lessons we've learned over the years have made it into any of the common window managers— Windows, Mac, or Metacity. The window manager acts as an intermediary between the user and every single app, but WM functions are often marginalized and too hard to activate. This is especially unfortunate, because now that huge monitors are commonplace, we actually need better WMs to help us use that screen space effectively— maximize and minimize alone don't cut it anymore! (Jeff Atwood pointed this out all the way back in 2007.)

How can we do better? Well, what does "better" even mean? Here are the assumptions I'm operating under, about what kinds of operations are easy or hard to perform:

  • "Reaching is hard": reaching touch-typeable keys is easier than reaching the function keys (F1 … F12), the arrow keys, or the numpad.
  • "Pointing is hard": typing a small number of keystrokes is easier than using the mouse (or any other pointing device).
  • "Fitts's law": pointing at a large target with the mouse (e.g. a window) is easier than pointing at a small target (e.g. an icon, menu, button, border, or title bar).

You may be able to see where the problems are. Windows and Metacity map a bunch of window management functions to far-away keys like Alt+F4 and Alt+F10. Mac OS has better bindings for some functions, but doesn't map Maximize at all by default, instead providing access via itty-bitty buttons on each window. On both Windows and Mac OS, the fastest way to move (or resize) a window is by dragging its title bar (or resize handle). It doesn't get any worse than that.

Dragging resize handles… seriously? Fitts's law, anyone? This is nonsense.

This is the main reason that I can only use Windows or Mac OS for about 30 seconds before my blood pressure starts to go up. They are not usable. Having to fish around with the pointer every time you want to do something is also a great way to develop RSI.

Building a better WM

I've played around with a few scattered ideas with the goal of making a better WM for myself. I've implemented these in my Openbox config, although any reasonably configurable WM will do.

But, I would like to stress here, the details are unimportant. If you can find the parts of your workflow that are unnecessarily slow, and eliminate them, more power to you.

1. Manage windows and workspaces with shortcuts that don't require function keys or arrow keys. For example, Alt+F10 (maximize a window) and Ctrl+Alt+LeftArrow (switch workspaces) are out. In my setup they are replaced by Win+1 and Win+j, respectively.

2. Move and resize windows with alt-dragging (actually, Win-key dragging). When you want to move or resize the window, the entire window is your drag target. No need to aim for the title bar, or the resize handle, or the window border. Fitts's law, suckers.

To move windows: point at the window, hold down Win, and drag with the left mouse button. To resize windows: point at the window anywhere near the corner you want to move, hold down Win, and drag with the right mouse button.

Kudos to Metacity (and most other X WMs) for shipping with something similar by default.

3. Arrange windows using the keyboard.

It took me the longest time to realize that this was something that would actually be useful.

Most people only have a couple of common ways they like to arrange windows on the screen— for example, one window filling the entire screen, or two half-width windows side by side. What this means is that in the common case, you are really not asking for windows to be placed at specific pixel positions on your screen. Hence, the mouse isn't actually needed.

I use a couple of different window management "idioms," which are probably best illustrated by example:

  • Win+s i j k RET, for example, expands the current window until it meets the nearest edge to the north, west, and south.

    Win+s activates window-expanding and i, j, k, and l indicate north, west, south, and east, respectively.

  • Win+s Win+s means "expand the current window as much as possible without covering anything else," which seems like a common thing to want.

    Openbox's smart window placement policy attempts to place a second window within the space left by the first one. So a quick Win+s Win+s makes a second window neatly fill up all the space left unused by the first one!

    For more complex arrangements, it's very fast to drag a window to an approximate location (see alt-dragging, above), then use Win+s Win+s to make it neatly fill the available space.

  • Win+w is analogous to Win+s except that it moves a window to the nearest edge to the north/west/south/east instead of resizing it.

An alternative strategy is to bind keys to move windows to a small number of fixed "slots," such as the two halves of a screen. Jeff Atwood mentions this strategy and its implementation in WinSplit Revolution.

4. Configure single-key shortcuts for launching common apps. For example, Win+b for a web browser. Not always a window-manager responsibility, but it can really streamline one's work.

Conclusion

Although I am still experimenting with some aspects of this setup, I am fairly happy with it. For better or for worse it does feel a lot like a "normal" window manager, just one that is highly streamlined (and doesn't give me hypertension). I'd be curious to see what additional improvements can be gained by giving up some of the associated flexibility and moving to a more restrictive model (e.g. something like a tiling WM).

For the curious, I've posted Openbox configuration recipes for tips 2 and 3 here.

AUCTeX and preview-latex

AUCTeX is a TeX mode for Emacs that adds various conveniences for editing LaTeX (and other macro packages). One of its killer features is inline previews of math and graphics, using preview-latex.

With AUCTeX and preview-latex, you can properly edit your document and equations in their high-level (TeX source) representation, while getting near-instant confirmation that your equations will turn out the way you want. It's pretty much the best of both worlds.

AUCTeX is easy to set up on Ubuntu, although it takes slightly more work than the usual apt-get invocation. Here is a HOWTO for my future reference (and yours). This has been tested on Ubuntu 9.04 with a post-v23-release emacs-snapshot:

Install AUCTeX.

$ sudo aptitude install auctex

Add the following init code to your .emacs file.

(load "auctex.el" nil t t)
(load "preview-latex.el" nil t t)

Open a LaTeX file in Emacs and press C-c C-p C-b (M-x preview-buffer) to generate math/graphics previews for the entire buffer. After you update an equation, press C-c C-p C-p (M-x preview-at-point) to refresh the preview.

Further reading: AUCTeX, AUCTeX manual

Zeya 0.1

Zeya is a streaming music server that supports an HTML 5 (draft standard) based player.


Motto: bring your music anywhere

I'm pleased to announce the first numbered release of Zeya, version 0.1.

Notable new features:

  • Support for the directory backend, which scans a directory recursively and serves up all the music in it. Invoke with
      zeya.py --backend=dir --path=/path/to/music
  • Experimental support for Google Chrome clients. Zeya plays music in Chrome. Latency is still poor and advance-to-next-track is broken for the time being. Read the README for the gory details.
  • Keyboard shortcuts for play controls: j, k, and SPC.

Visit http://web.psung.name/zeya/ for more information, and read the previous blog post on Zeya for a bit more context.

Google Web Toolkit and Java/JS translation

I've been using Google Web Toolkit for some projects at work, and I've been quite impressed with it. I think that for many classes of web apps, GWT (or something with the same general model) is really the "right" way to build your app.

(Disclosure: these were the first nontrivial web apps I've written since before the term "AJAX" came into common parlance, so I've not been totally in the loop for some time with respect to web programming or any of the major JS libraries.)

For the uninitiated: GWT is a free software development stack that lets web developers write apps entirely in Java. GWT provides various UI and utility libraries, as well as a compiler that compiles your Java code to a combination of Java servlets and client-side Javascript— including the XMLHttpRequest serialization/deserialization code needed at the client/server boundary.

I don't have anything against Javascript or dynamic languages. But I think that for certain web apps it can make sense to trade away some of the flexibility of Javascript. In many web apps today, the app is just the frontend to some database, and the client's responsibility is mainly just to manipulate the DOM and shuttle data around. Not a whole lot of business logic— although, to be fair, JS apps will likely become more and more sophisticated in the future. I suspect many of these apps might be better served by a strongly typed language. The GWT Java compiler prevents entire classes of errors right off the bat, including the most common type in JS apps in my experience: typos. (Yeah, unit tests and tools will also mitigate these problems a bit.)

However, I think the most important idea at the core of GWT is that of a language-to-language translator. The problem with pure Javascript is that the code you want to be writing and maintaining does not look at all like the code you want the browser to be executing. The idea of an intermediate translation step is not new: people have been using Javascript minifiers/obfuscators for years to speed up loading and execution. But GWT takes this further: in addition to shortening identifiers, it aggressively inlines code and eliminates dead code (unused variables, fields, methods, classes, ...). Now you're free to introduce indirection in your code to make it more readable and testable (for example, refactoring to the model-view-presenter pattern) without feeling guilty that you are making your app slower. (Testability! Hooray!) This eliminates some of the usual tension between speed and maintainability.

Actually, GWT takes its translation process even further: it compiles multiple browser-specific versions of your app— something that few, if any, humans would bother to do— so that no client ever has to load or execute switch-on-browser logic and each client gets a variant optimized for its platform.

When all these optimizations are considered, Google claims that GWT-generated code is "often ... faster than equivalent handwritten Javascript." That shouldn't be surprising: code that is to be maintained has to meet some minimum standard of readability, but code that is to be executed has no such requirement. This really drives home the importance, and the power, of having a source-to-source translation process in the pipeline.

(There are a number of other compelling reasons to use GWT that I haven't mentioned here. The GWT page gives a pretty good overview of them.)

Emacs 23

Emacs 23 has been released. You can download it from GNU's FTP server.

The most notable new features or changes that I noticed or have been using:

  • A single Emacs process can generate frames on any number of TTYs and on X simultaneously. Run M-x server-start and then emacsclient -t or emacsclient -c to create a new TTY or X frame, respectively.
  • C-p and C-n now move by screen/visual lines instead of by logical lines. While this makes Emacs more consistent with most modern apps, it has the potential to break keyboard macros. If you use macros frequently, consider setting (setq line-move-visual nil) to disable this behavior.
  • Transient Mark mode is enabled by default. I find t-m-m is occasionally really handy, especially after the mark-* commands, and for restricting the region for query-replace. However, I think t-m-m highlights a little too aggressively. To disable t-m-m except after the mark-* commands, I use the following:
    (transient-mark-mode 0)

    ;; Activate t-m-m only after mark-* commands (you can also enable it manually
    ;; at any time with C-u C-x C-x).
    (require 'cl)
    (macrolet
        ((advise (&rest commands)
                 `(progn
                    ,@(mapcar (lambda (command)
                                `(defadvice ,command (after transient-mark activate)
                                   "Activate Transient Mark mode temporarily."
                                   (setq transient-mark-mode '(only))))
                              commands))))
      (advise mark-sexp
              mark-word
              mark-paragraph
              mark-defun
              mark-end-of-sentence
              mark-page
              mark-whole-buffer
              LaTeX-mark-environment
              LaTeX-mark-section))
  • C-l is bound to recenter-top-bottom, which moves the viewable area so that the current line is at the center, top, or bottom on successive presses. Useful for surveying the area around point.
  • DocView mode, a new addition, is a PDF/PostScript/DVI viewer.
  • Emacs is smarter about splitting windows (e.g. after C-x 4 C-f), splitting vertically if your frame is sufficiently wide.
  • Beautiful anti-aliased fonts.

There are many, many, more changes. See etc/NEWS for the gory details.

On emacs-devel, I recently got a pointer to Richard Stallman's 1981 paper on the design of Emacs, presented at the ACM Conference on Text Processing:

Extensibility means that the user can add new editing commands or change old ones to fit his editing needs, while he is editing. EMACS is written in a modular fashion, composed of many separate and independent functions. The user extends EMACS by adding or replacing functions, writing their definitions in the same language that was used to write the original EMACS system. We will explain below why this is the only method of extension which is practical in use: others are theoretically equally good but discourage use, or discourage nontrivial use.

Extensibility makes EMACS more flexible than any other editor. Users are not limited by the decisions made by the EMACS implementors. What we decide is not worth while to add, the user can provide for himself. He can just as easily provide his own alternative to a feature if he does not like the way it works in the standard system.

It is striking how so little software today, even "professional" software that users are apt to use 8 hours a day, is actually designed to facilitate extensibility and automation by users.

Zeya: bring your music anywhere

I love SSH. One reason is that I love being able to get to my files from anywhere on the planet without any advance planning.

After using SSH for a while, carrying my bits around with me on magnetized platters or EEPROMs inside a laptop/phone/PMP just seems so quaint and irritating. It leads to a host of problems: you have to worry about synchronizing, deciding what to synchronize, merging changes, and misplacing your device. Usually, some bizarre cable is involved in transferring data. And invariably, there's that one spreadsheet, paper, song, or ebook that you tragically can't view because you left it on your computer at home.

(Incidentally, I think using cloud hosted storage/apps is one approach, but not a complete solution, at least yet.)

Introducing Zeya

While SSH is great for the vast majority of application classes, I think effectively accessing audio/video remotely requires more specialized tools.

To this end, I spent a few days writing Zeya, a music server that takes your Rhythmbox music collection and streams songs from it to you. However, unlike gnump3d or ampache, Zeya presents a full music player right in your web browser, using the goodness of HTML 5. No Flash, no Silverlight, no Java applets, no plugins, no popups, no invoking external players, no client-side software installation.

Zeya brings your music to any computer with a web browser (OK, as long as the browser is Firefox 3.5, for the moment). Play songs from your collection on your desktop at work. Or on your netbook at Starbucks. And now or soon, when you get a current-gen web browser on your phone/MID/fridge, you can bring all your music there too.

Picture iTunes' library sharing. Now imagine that its functionality wasn't crippled to lock users in to iTunes. Oh, and you could actually access it from outside your LAN. Oh, and that you could listen from any computer, anywhere, without installing any software. I think Zeya in its current state is just a pale shadow of what is really possible when you actually try to make information really easy, rather than just marginally less difficult, to get to.

Samson Yeung and I have been working on, and using, Zeya for a few days now. It's pretty useful and easy to get running, but it is feature-bare, experimental, and subject to change in all sorts of fun and interesting ways. If that doesn't scare you, visit

http://web.psung.name/zeya/,

try it out, and let me know what you think.

Why I can't afford proprietary software

The following happened at the lab a few months ago:

Date: Fri, 12 Dec 2008 12:16:52 -0500
To: matlab-users

Hi All,

Seems we've exceeded our pool of Matlab licenses.  We will be getting
more but unfortunately it takes 5-10 days to do so.

In the mean time please don't take up any more licenses than you
absolutely need and close instances you are not using.  We have some
people with tight paper deadlines that are unable to get licenses.

Thanks for your cooperation in this.

Just like that, for a handful of people, research work slowed to a crawl for almost a week at the end of the semester.

This kind of stuff makes my blood boil. There is no logical reason we have to ration software as if it were gasoline. But that's another rant. Enforcing unfortunate license restrictions is completely within The Mathworks' rights under the law.

I bring this up because I think it hints at the elephant in the room, namely:

When you use proprietary software on your computer, it's not your computer anymore.

That's the reason I'll talk my head off about free software and the reason I'm not going to stand for Matlab, Windows, Mac OS, Microsoft Office, the iPhone, the Kindle, Flash, Skype, etc.— none of it. (Mathematica is a jaw-dropping technical achievement. And I still won't touch it with a ten-foot pole.) The moral of the story is not that you should double-check your Matlab license paperwork. No, this is par for the course, and just a symptom of the real problem: that proprietary software companies can enforce totally capricious and arbitrary restrictions on what we can and can't do with our computers, and that they have a well-documented history of doing so. So the less we have to rely on them, the better.

I am always a bit incredulous when people advocate proprietary software on the basis of things like ease of use or good design or shininess or intuitiveness or speed or usability or "it just works" or architectural superiority or elegance. It's not that I don't value those things. I'm an engineer myself, so I appreciate the effort that goes into making things work and work well. But what good is speed if you can't go anywhere? What good is usability if you can't fucking use your software? If you don't have two shreds of basic personal autonomy, worrying about other things seems kind of superfluous, no?

(It's just galling how most proprietary software has awful usability, but that's also another rant.)

You are probably skeptical, and rightfully so, about the true magnitude of the problem I'm complaining about here. This is not just an issue with a few bad apples (as it were...). It's a fundamental problem of incentives. When you pay for a license for bits, the vendor has an incentive to protect future sales by curtailing the possible uses of those bits. They certainly have the power to enforce whatever restrictions they wish and to change them up at any time. And they'll put time and effort into it, because you can't remove, inspect, or enumerate those restrictions. Maybe you think things are not so bad now, and indeed, maybe they aren't. But the rules can change arbitrarily and at any time, and I mean arbitrarily and at any time. (Three words: iPhone app store.) In the world of proprietary software, there are no sure things.

To put it succinctly, if you don't have the four freedoms, the people who make your software have the means, the motive, and the opportunity to cripple their products to extract as much money from you as they can. There is something deeply perverse and broken about this kind of business model. Remember the lesson from game theory: trust people when, and only when, it's in their best interest to help you.

It's not that I can't afford to pay for nice things. (What's $29, or even $1399, if it helps you get your job done? A bargain!) What I can't afford is paying for the privilege of letting someone have me by the throat (or elsewhere).

Making your own page-a-day calendar

Anomalously, today's post is about a DIY physical artifact.

A while back, I made a custom page-a-day calendar as a gift for my girlfriend. Each page tears off and has a picture on it. (Unfortunately, I don't have any photos of the finished product.)

With just a little effort, you can make one of these things and have it look quite professional. You can fill the pages with whatever photos, comics, etc. you want. And, I can virtually guarantee you, your recipient has never had a page-a-day calendar typeset in Computer Modern.

Here are skeletal instructions for making your own. You'll need a printer, ink or toner, most of a ream of paper, some padding compound, a paper cutting device/facility, cardboard, LaTeX, and time.

1. Get source images

Acquire 365 images from Flickr, your photo collection, your favorite CC-licensed webcomic, or whatever strikes your fancy. This post on curl may come in handy. Some notes: (1) Layout is much easier if the images are the same aspect ratio. (2) Consider upsampling the images if needed, e.g. with imagemagick, so you can print at a respectable DPI. Henceforth I'll assume you've named the images imgs/001.jpg, imgs/002.jpg, etc. If this is not the case, simply adjust the code in Steps 2 and 3 accordingly.
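If you need to massage a pile of images into that naming scheme, a throwaway script along these lines will do it (a sketch, assuming your originals are already JPEGs in a directory called originals/ and that alphabetical order is the order you want):

#!/usr/bin/env python
# Copy arbitrarily named images into the imgs/001.jpg, imgs/002.jpg, ...
# scheme that the LaTeX template in Step 2 expects.
import os
import shutil

os.mkdir('imgs')  # assumes imgs/ doesn't exist yet
for i, name in enumerate(sorted(os.listdir('originals')), 1):
    shutil.copy(os.path.join('originals', name), 'imgs/%03d.jpg' % i)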

2. Use this LaTeX skeleton

Make a new TeX file and fill it with this:

\documentclass[17pt,oneside,final,showtrims]{memoir}
\usepackage{marvosym}

\setstocksize{11in}{8.5in}

\settrims{0in}{0in}

\settrimmedsize{4in}{6in}{*}
\settypeblocksize{3.5in}{1.75in}{*}
\setlrmargins{0.25in}{*}{*}
\setulmargins{0.05in}{*}{*}
\setheadfoot{0.01in}{0.1in}
\setheaderspaces{*}{*}{*}
\setmarginnotes{0.25in}{3.5in}{0in}

\checkandfixthelayout

\usepackage[final]{graphicx}

\pagestyle{empty}

\newcommand{\daypage}[6] {
  \marginpar{\includegraphics[height=3.4in]{imgs/#1.jpg}}
  \begin{center}
    \Large{#2} \\
    \HUGE{\textbf{#3}} \\
    \large{#4}

    \vspace{0.4in}
    \small{#5}

    \vspace{0.2in}
    \scriptsize{\textit{#6}}
  \end{center}
  \newpage
}

\begin{document}
  % Cover page
  \marginpar{\includegraphics[height=3.4in]{imgs/cover.png}}
  \newpage

  \include{tex-days}
\end{document}

Salient points:

  • The \daypage command generates a new page. You supply arguments specifying the parameters for each page: the filename of the image to include, the day and date, a line indicating whatever holiday it might be, etc. Play around with the layout, especially if your images have a different aspect ratio than mine or if you have a calendar stand of a particular size.
  • If you want a cover page, supply a cover.png; otherwise, remove the corresponding lines from the template.

3. Generate the pages

The template above includes tex-days.tex, which might look something like this:

[...]
  \daypage{182}{Sunday}{03}{Jul 2005}{~}{~}
  \daypage{183}{Monday}{04}{Jul 2005}{Independence Day}{~}
  \daypage{184}{Tuesday}{05}{Jul 2005}{~}{~}
  \daypage{185}{Wednesday}{06}{Jul 2005}{~}{~}
  \daypage{186}{Thursday}{07}{Jul 2005}{~}{~}
[...]

You can generate a skeletal version of this, sans holidays, with a quick Python program. I've provided a sample tex-days.tex file for the year 2010.
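In case you'd rather roll your own, such a generator might look roughly like this (a sketch; note that it zero-pads the image numbers to match the imgs/001.jpg naming from Step 1, unlike my sample file):

#!/usr/bin/env python
# Emit one \daypage line per day of 2010, leaving the two
# supplementary-text arguments as empty (~) placeholders.
import datetime

day = datetime.date(2010, 1, 1)
with open('tex-days.tex', 'w') as f:
    n = 1
    while day.year == 2010:
        f.write('  \\daypage{%03d}{%s}{%02d}{%s}{~}{~}\n'
                % (n, day.strftime('%A'), day.day, day.strftime('%b %Y')))
        day += datetime.timedelta(days=1)
        n += 1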

The first argument on each line indicates the filename, e.g. 182 indicates that 182.jpg should be included. Make sure these match the filenames you are using. The sample file assumes your images are named 1.jpg, 2.jpg, etc. If this is not the case, either create your own version or rename your files.

If you're interested in embellishing the output, the 5th and 6th arguments on each line provide supplementary text to go on each page (#6 is printed in smaller type than #5). You can fill in, by hand or programmatically, whatever notations you want here, e.g., holidays, birthdays, anniversaries, or a countdown to whatever.

Arguments 2, 3, and 4 give the day of week, date, and month/year respectively that are displayed, in case that wasn't clear.

4. Produce and print

Run the file through pdflatex and print it! Make sure the alignment is consistent across pages.

The showtrims argument in the template file makes LaTeX print trim marks on each page. However, you really only need trim marks on the first page. If you're obsessive-compulsive, you could print the first page with trim marks and the rest without to guarantee the marks won't show on the finished product.

5. Trim it

I took the stack of paper, with pieces of thin cardboard above and below it, to my local Kinko's (now FedEx Kinko's, I guess). I asked them to cut the stack along the trim marks (2 cuts, since 2 of the edges already run up against the page edges). They did this for a fee of just $1/cut.

6. Bind it

Get some padding compound, e.g. Sparco padding compound. (I bought a quart, so I can probably make gift calendars/notepads for years.) Align the cut pages, leaving one piece of cardboard on the bottom, and put the stack in a vise. (In a jam, "under a pile of hardcover books" will do.) Using a paintbrush, paint the top edge of the stack with padding compound. Wait for it to dry. Paint another coat.

If you have random loose paper, padding compound is also handy for recycling it into notepads.

7. Mount it

This is not really needed, but is a nice touch. Find an old stand for a page-a-day calendar. Glue the cardboard backing in.

I hereby place the LaTeX template and LaTeX snippets in this post into the public domain.

A different approach to fighting phishing

We are usually advised to avoid being phished by looking carefully at the address bar for discrepancies. Unfortunately,

The web page; the URL; the SSL certificate (if any); indeed, all information displayed to the user; is information chosen by the attacker. The user is then asked to discover discrepancies in information that has been carefully designed for deception. This type of game is better suited to a book of puzzles than a secure user interface. —Tyler Close

The problem is that humans are notoriously bad at detecting rare and non-obvious events.

The Petname tool is a nifty alternative way to detect and expose phishing. It's a Firefox extension that lets you associate a short label (e.g. "stock trades") with a site's CA public key and distinguished name (DN). On later visits, that label is displayed to you only if the key and DN match.

This is somewhat similar to the approach used in SSH. That is, on the second and subsequent transactions between you and another party, it's not quite as valuable for you to know simply whether some trusted CA will vouch for that party. What you really want to know is that you are talking to the same party you were talking to last time. (As with SSH, man-in-the-middle attacks on the second and subsequent logins can be detected... even if the user didn't properly verify the authenticity of the remote party on the first login!)
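To make that concrete, here is a toy sketch of the bookkeeping in Python (the extension itself doesn't work this way, and the storage location and function names here are made up):

import json
import os

# Hypothetical location for the pinning database.
STORE = os.path.expanduser('~/.petnames.json')

def petname_for(ca_key, dn, assign=None):
    """Return the label pinned to this (CA key, DN) identity,
    optionally pinning a user-supplied label on first sight."""
    store = {}
    if os.path.exists(STORE):
        with open(STORE) as f:
            store = json.load(f)
    ident = ca_key + '|' + dn
    if ident in store:
        return store[ident]    # same identity as last time: show its label
    if assign is not None:
        store[ident] = assign  # first visit: pin the user's chosen label
        with open(STORE, 'w') as f:
            json.dump(store, f)
        return assign
    return None                # unrecognized identity: no label, be wary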

Of course, all this is of little use if the human doesn't check the petname before entering his password. For that, the author suggests that web browsers be made to automatically manage credentials on our behalf...

Further reading: W3C workshop paper

Chrome is blazing

I've been using the Chromium builds off and on for some time now and finally changed my default browser to Chromium. The two major reasons that sold me:

One. It is blazing. In comparison, Firefox 3.5 is now intolerably slow. I don't even want to try Firefox 3.0. This goes for both rendering time and startup time. (Startup from a cold cache: 1 sec for Chromium, 10 sec for Firefox. That's right, Chromium loads in one second on my 8-year-old computer.)

Two. The "omnibox" (the unified area for selecting URLs, searches, and bookmarks). It works fantastically. Now, Firefox's implementation (the "awesomebar") is pretty admirable. They too support searching bookmarks and history and have a fairly sophisticated filtering language for narrowing the suggestions. But, as far as I can tell, in order to actually use any of the suggestions you have to move your hands away from the home row, down to the arrow keys to press "Down".

From a UI perspective, this is snatching defeat from the jaws of victory. (In Chromium, pressing Enter in the omnibox selects the first completion.)

Chromium is still a no-frills browser, and I still miss some Firefox extensions (okay, okay, only one). But it's already good enough that using Firefox regularly is just unbearable now. Hats off to the Chromium engineers!

Making gorgeous LaTeX plots with Octave

I previously wrote about using the epslatex terminal in Gnuplot to generate beautiful plots for inclusion in a LaTeX document. The secret is that the epslatex terminal produces a combination of (1) EPS vector graphics and (2) TeX instructions to overlay all the text (axis labels, legends, etc.) in whatever font you are using in the rest of your document. So typically you get that super slick looking Computer Modern Roman (cmr) font.

Now, there are some things that are beyond the ken of Gnuplot. So it was a relief when I learned that GNU Octave can produce similarly formatted EPS + TeX graphics. What's nice about using Octave instead of Gnuplot is that, not only can you take advantage of Octave's more advanced (as I understand it) graphics facilities, but you can also bring to bear all the power of a full mathematical/simulation language for preprocessing your data or whatnot. I still usually use Gnuplot, but I break out Octave for making plots when necessary.

All you have to do is produce your plot in Octave as normal (e.g. plot(...)), and use a command like the following to output in EPS + TeX:

print('my_plot.tex', '-dtex');

As an example, here's some minimal code to produce a heatmap with contours and a legend:

x_values = [0.10 : 0.005 : 0.60];
y_values = [0.10 : 0.005 : 0.60];
contourf(x_values, y_values, data); % data: a length(y_values)-by-length(x_values) matrix; supply your own
axis square;
colorbar;
xlim([0.1 0.6]);
ylim([0.1 0.6]);
print('-dtex', 'my_plot.tex'); % write the EPS + TeX output, as above

Octave really saves the day here. To the best of my knowledge it is difficult or impossible to do this using just plain Gnuplot, especially if you are not plotting over a square area.

Using itsalltext with emacs/emacsclient

I finally started using It's All Text!— this is something I should have done long ago. (It's All Text! is a Firefox extension that lets you invoke an external editor to edit the contents of any textarea element, like this blog post I'm writing right now in Blogger.)

There's such a stark contrast between using Emacs, where I feel at one with the document, and typing in the browser textarea, which always makes me feel kind of claustrophobic now (the problem is not that the textbox is too small, but that it doesn't provide enough degrees of freedom for editing).

If you are using an Emacs with multi-TTY support (a v23 snapshot), you can leave one long-lived Emacs server instance running and quickly pop up a new frame from it for each editing buffer:

  1. From your main Emacs frame, run M-x server-start.
  2. Save the following wrapper script (I called mine ecw) and configure it to be your editor in It's All Text!:
    #!/bin/sh
    /usr/bin/emacsclient -c "$@"
  3. In the It's All Text! options you can configure your favorite hotkey to launch a new Emacs frame for editing.
  4. When you're done with a buffer, save and press C-x # to return to Firefox.

Update: another tip. To automatically fire up html-mode when editing text from, say, blogger.com, you can add something like this to your .emacs:

(add-to-list 'auto-mode-alist '("/www\\.blogger\\.com\\.[^/]+\\.txt\\'" . html-mode))

This works because the temp file that It's All Text! creates has a name like www.blogger.com.2a2q1e2r32.txt.

Clever software names

Many free software projects have, among other characteristics, punny project names. Some notable examples:

  • GNU CSSC (Compatibly Stupid Source Control), a clone of the Unix source control system SCCS (Source Code Control System).
  • GPG (GNU Privacy Guard), a clone of Phil Zimmermann's encryption program PGP (Pretty Good Privacy).

I suspect that people would think twice about choosing those names if those products were invented today. Naming a similar product (a clone, even) with a name that is intentionally confusingly similar seems, indeed, like the perfect way to get yourself slapped with a trademark lawsuit. (After all, it happened to Lindows.)

At least GNU is safe. No reasonable person would think that GNU's Not Unix was the same as Unix.

Microsoft Excel is a creation of staggering boneheadedness

What I learned about Microsoft Excel today makes me not really want to use it again for anything that is even moderately important.

Did you know that there is no easy way to import data into Excel with any fidelity?

Just write to a delimited file, you say. CSV data, TSV data, whatever... they are all very easy to produce programmatically. And they will all get screwed up when you open them.

You see, when you open such a file, one of two things will happen, both of them bad.

First, Excel may open the file without complaint. I think this happens when you double-click on the file in Windows. Excel will then apply heuristics to set the format for each cell appropriately. These heuristics are not 100% reliable, for which I can hardly fault Excel. As one example, a cell containing a list of numbers 50001,50002,50003,50014,50018 is interpreted as a single large integer, which Excel converts to the floating-point number 5.00e24.
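To see where that number comes from: treat the commas as thousands separators, strip them, and parse the remaining 25 digits as one number. A quick Python check of the arithmetic (this mimics the effect, not Excel's actual code path):

cell = '50001,50002,50003,50014,50018'
# Strip the comma "separators" and parse one big integer, then store
# it the way Excel does, as a double -- losing most of the digits.
as_number = float(int(cell.replace(',', '')))
print(as_number)  # about 5.0001500025e+24, which Excel shows as 5.00E+24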

However, and here is the problem, the conversions are silent (no warning is given upon opening a CSV file) and lossy (above, some of the precision needed to construct the original sequence is lost), and they cannot be reverted by any magic incantations within Excel after you've opened the spreadsheet.

That's right. When opening CSV/TSV files, Microsoft Excel's default behavior is to silently corrupt your data.

I'm running into this problem at work, and I'm just glad that someone noticed what was wrong, because this is really pernicious. You might spot-check your spreadsheet, conclude that everything is fine, and never notice that your data is corrupted starting in row 8000. This is actually happening in biomedical research. Gene names, identifiers, you name it— all silently and irreparably converted to numbers and dates in Excel. (And those researchers don't believe there's any good way to deal with this, either.)

For the record, OpenOffice gets this right, by preserving data verbatim by default when importing. That is positively brilliant in comparison.

Now, the second thing that might happen when you open your file in Excel is that it invokes its "Import Wizard." I think this happens when you choose "Open" from the "File" menu and select a delimited file. Excel will dutifully ask you how the columns are delimited and what format to use for each column.

And, although you can select "Text" format here (meaning, preserve the input verbatim), the default format is the "do what Excel thinks is best" option. Once again, Microsoft Excel's default behavior is to corrupt your data. Sure, in this case at least you can click on each column you think might have a problem and select "Text" format for it. But if you are a human, and you're opening these kinds of files all the time, and you trust yourself to do this consistently and correctly, you probably deserve what's coming to you. Humans are not good at performing repetitive tasks. That's supposed to be the computer's job.

What are we expected to do, write out Excel's XML-based file format directly, now that it's nominally "open" and "documented"? That seems really heavyweight. It is kind of a remarkable oversight that it is so difficult to massage arbitrary data into any format that can be reliably read by Excel.

Extracting select pages from a PDF

Ever needed to make a new PDF containing a subset of the pages of an existing PDF?

The pdfnup tool is nominally for printing PDFs n-up (n logical pages to a sheet), but it can be told to cut and paste pages into a new PDF, like so:

pdfnup original.pdf --nup 1x1 --pages 1,3,5,7,21-25 --outfile subset.pdf

On Ubuntu, you can install pdfnup with a simple sudo aptitude install pdfjam.

Further reading: pdfjam

Update: pdftk also seems like an intriguing option, especially for more complex operations; something like pdftk original.pdf cat 1 3 5 7 21-25 output subset.pdf should produce the same subset.

Toto, I've a feeling we're not in 32-bit land anymore

I got a new computer and have had some time to put it through its paces. It's a Dell Studio XPS, with a Core i7 920, 6GB of RAM, and a 24" display. All things considered it was a pretty good deal (ordered in January, $1060 shipped). I later upgraded to 2TB of disk.

This thing is blazing. And building with make -j8 makes me happy inside.

As far as I can tell, everything in it works out of the box with all free software on Ubuntu GNU/Linux Jaunty Jackalope (9.04) x86_64. (Even the ATI Mobility Radeon HD 3450.)

The display, a Dell S2409W, is nothing to sneeze at, either.

There is space inside the case to mount an extra hard drive. It goes in vertically, which is nice because you don't have to wrestle with all the cables and wedge it past the RAM to get it in. However, there is only space for one additional hard drive.

Power consumption is much better than my old desktop. The machine idles at 97W. This goes up to maybe 170W at high load.

My only complaint is the noise. The fans go on like a leafblower when you're pegging the CPU, but I don't really care about that part. The problem for me is more the baseline noise. The computer isn't loud, per se... but it isn't quiet either. It's in my room and I put on earplugs when I go to sleep. Take that with a grain of salt, though, because I'm a person who has trouble falling asleep in the presence of a wall clock.

Chrome

I haven't used Google Chrome much, but it's been interesting following Chrome/Chromium development on the blogs. The things that make it to the blogs are neither so vague as to be uninteresting nor so detailed as to seem unimportant or trivial.

The Chrome folks have a lot of cool ideas (especially about UI), and it's neat seeing fresh ideas in the web browser arena. One of my favorite little gems so far:

In most tab strips, when you close a tab the other tabs expand to fit the space that has just been made available. The upshot of this is that the close boxes of the remaining tabs all move around slightly, which makes it harder to quickly close tabs by clicking in the same spot. [...]

For Chrome, we came up with something a little different. Realizing that maintaining a fixed width for tabs when closing them would keep close buttons aligned under the mouse pointer, we designed a system whereby the tab strip will re-layout when you close a tab to fill the gap left, but not resize the remaining tabs, until you move your mouse away from the tab strip [...].

(Ben Goodger, "Tabbed Browsing in Google Chrome")

That is clever. An engineer did the right thing there.

In addition, it's interesting to be able to read about the software development methodology employed by Google, which one usually doesn't hear too much about. For example, the way they use usage data to identify newly coined words that should be added to the spellcheck dictionaries or the way they systematically regression-test the renderer on the million most popular web sites. Most free software projects don't have these resources, and most proprietary shops don't publicize these kinds of details.

Update, 30 Mar 2009: I tried some Chromium builds on GNU/Linux and they are blazing. Wow.

Configuring Radeon R600/R700 devices on Ubuntu Jaunty

Update, 18 Mar 2009: Ubuntu Jaunty Jackalope 9.04 and later now support R600/R700 hardware out of the box. Make sure you update to at least xserver-xorg-video-radeon 1:6.12.1-0ubuntu1, libdrm2 2.4.5-0ubuntu2, and linux-image-generic 2.6.28.11.11. If you have done so, you can stop reading this page now.

Update, 5 Mar 2009: fixed the instructions for the DRM modules. You need to build from the origin/r6xx-r7xx-support branch.

I got a new computer with an ATI Radeon 3450 graphics card (R600 series). Here are the steps I took to configure it on Ubuntu Jaunty Jackalope 9.04 using free software (open source) drivers. The instructions below might also be useful for other Radeon hardware, in particular, R700-series cards.

Happily, Ubuntu 9.04 does drive the 3450 out-of-the-box (it boots into X and is usable, using the free radeon driver). Its performance was, however, kind of poor on my machine. I noticed "wiping" and flickering when scrolling (e.g. in firefox or gnome-terminal) or moving windows around.

Building the development versions of the radeonhd and drm drivers fixed these issues and significantly improved desktop performance. Everything is quite slick-looking now, even while driving a 1920x1080 display. Kudos to the driver developers, and to AMD/ATI for cooperating with driver development. Ubuntu does package radeonhd, so sooner or later, after this code gets released, an out-of-the-box installation will provide this level of performance as well.

For historical reasons there are two free drivers for ATI Radeon devices: radeon and radeonhd. They have approximate feature-parity these days. I used radeonhd because I found instructions for it first. :) Instructions are adapted from these sources: radeonhd r600-r700 instructions, Ubuntu community documentation for RadeonHD. (For your reference, the Radeon feature matrix describes what features are supported on what models.)

Download the build prerequisites:

$ sudo aptitude install git-core build-essential
$ sudo aptitude build-dep xserver-xorg-video-radeonhd

Clone the repos for radeonhd and drm:

$ git clone git://anongit.freedesktop.org/git/xorg/driver/xf86-video-radeonhd
$ git clone git://anongit.freedesktop.org/mesa/drm

Build and install radeonhd:

$ cd xf86-video-radeonhd
$ ./autogen.sh --prefix=/usr
$ make
$ sudo make install
$ cd ..

Build and install the drm modules:

$ cd drm/linux-core
$ git checkout origin/r6xx-r7xx-support
$ make radeon.o drm.o
$ sudo cp radeon.ko /lib/modules/`uname -r`/kernel/drivers/gpu/drm/radeon/
$ sudo cp drm.ko /lib/modules/`uname -r`/kernel/drivers/gpu/drm/

Now we'll configure X to use the new drivers. If you don't already have a populated xorg.conf, auto-generate one (as root):

# X -configure
# mv /root/xorg.conf.new /etc/X11/xorg.conf

Change the Device section. I modified the Driver line and added the following options:

Section "Device"
    Driver "radeonhd"
    Option "AccelMethod" "exa"
    Option "DRI" "on"
    Option "VideoOverlay" "off"
    Option "OpenGLOverlay" "on"
    Option "TexturedVideo" "off"
    [...]
EndSection

Verify that the DRI module is loaded and that non-root applications can access it:

Section "Module"
    Load  "dri"
    [...]
EndSection

Section "DRI"
    Mode 0666
EndSection

Then reboot your computer. That should do it.