Pandemonium reigned in the Geek household for about an hour this morning. We've been rather busy with some wedding-related issues lately, including printing and assembling our wedding invitations. We haven't been too diligent about cleaning the apartment as a result. So, when Fiancee S.'s keyring with her car and apartment keys went AWOL this morning, a lot of running around (and some cursing) ensued.

They were eventually located after about 45 minutes, in a most unusual place: one of my shoes. Fiancee S. generally puts her purse and keyring on a little table right next to the door of our apartment. I leave my shoes on a doormat right next to that table. The keys must have fallen off the table and into one of my shoes. Locating the keys took nearly an hour because I had picked up my shoes and placed them in a gym bag along with a change of clothes, since I work out at the gym before going to my desk. Fortunately, I made sure to check everything I was taking out of the apartment for her keys before I left.

Finding lost keys is not what I really want to write about, however.

I was watching a PBS show the other night about the effort to sequence the human genome, and it reminded me of my own insignificant, peripheral part in that scientific endeavor. Back in late 1999, I was approached by Professor Xerxes, a renowned researcher in machine learning at the institution where I was studying, about a database system he needed to build. I was the database administrator for a large archive of meteorological and oceanographic data for a research project in the department that had been going for a number of years. Professor Xerxes had a large project he had to get off the ground fast that involved pushing large (multi-gigabyte) chunks of data around a computer cluster and in and out of a database. One of his students was suggesting the use of the open source MySQL database on Linux, and Professor Xerxes wanted to know whether I thought it was a good idea.

I ultimately said no. I wrote him a long e-mail (that I probably still have burned onto a CD somewhere) saying that while MySQL was probably good for running a mom-and-pop web site that needed to keep track of phone numbers, addresses, and a small number of business transactions, it wasn't (at that time) going to be a star performer for large chunks of data. I therefore suggested he look at procuring a commercial UNIX system like the one I was administering because it would be the "best" solution.

It turns out that great science and great discoveries are not made using the "best" solutions, however. The kind of system I proposed was rejected for two reasons: first, it would take too long to procure, and second, Professor Xerxes's student was a code-writing ANIMAL who was very familiar with the Linux/MySQL environment. When time is a constraint, you've got to use the tools you have at hand and know best.

Only later did I understand the full significance of that database Professor Xerxes asked me about. It turned out that Professor Xerxes was applying machine learning research on protein recognition, which he'd been working on for years, to sequencing the human genome. His student wrote, in about 3-4 months, the piece of software that was ultimately used to sequence the first full draft of the "public" version of the human genome (there are two versions) -- an achievement that most likely qualified him for a Ph.D. on the spot. The MySQL database I was asked about was used to hold representations of DNA sequences while they were identified and grouped into genes.

Oh, my brush with greatness.

said drgeek on 2004-04-09 at 3:13 p.m.


