Doctor of Hacking

Amir Michail has asked me to comment on his proposal to create a new field: one that’s “like computer science, but more creative.” My first reaction was to wonder, how much more creative does he want? He might as well ask for a field that’s like dentistry, but with more teeth. (I was reminded of Hilbert’s famous remark, when told that a student had abandoned math to become a poet: “Good. He didn’t have enough imagination to be a mathematician.”)

But on second thought, it’s true that computer science encourages a particular kind of creativity: one that’s directed toward answering questions, rather than building things that are useful or cool. I learned about this distinction as an undergraduate, when the professor in my natural language processing class refused to let me write a parody-generating program (like this one) for my term project, on the grounds that such a program would not elucidate any scientific question. Of course, she was right.

Paul Graham explained the issue memorably in his essay Hackers and Painters:

I’ve never liked the term “computer science.” The main reason I don’t like it is that there’s no such thing. Computer science is a grab bag of tenuously related areas thrown together by an accident of history, like Yugoslavia. At one end you have people who are really mathematicians, but call what they’re doing computer science so they can get DARPA grants. In the middle you have people working on something like the natural history of computers — studying the behavior of algorithms for routing data through networks, for example. And then at the other extreme you have the hackers, who are trying to write interesting software, and for whom computers are just a medium of expression, as concrete is for architects or paint for painters …

The mathematicians don’t seem bothered by this. They happily set to work proving theorems like the other mathematicians over in the math department, and probably soon stop noticing that the building they work in says “computer science” on the outside. But for the hackers this label is a problem. If what they’re doing is called science, it makes them feel they ought to be acting scientific. So instead of doing what they really want to do, which is to design beautiful software, hackers in universities and research labs feel they ought to be writing research papers.

(Incidentally, Graham is mistaken about one point: most theoretical computer scientists could not blend in among mathematicians. Avi Wigderson, one of the few who can and does, once explained the difference to me as follows. Mathematicians start from dizzyingly general theorems, then generalize them even further. Theoretical computer scientists start from incredibly concrete problems that no one can solve, then find special cases that still no one can solve.)

One puzzle that Graham’s analysis helps to resolve is why computer systems papers are so excruciatingly boring, almost without exception. It can’t be because the field itself is boring: after all, it’s transformed civilization in 30 years. Rather, computer systems papers are boring because asking hackers to write papers about what they hacked is like asking Bach to write papers about his sonatas:

Abstract. We describe several challenges encountered during the composition of SONATA2 (“Sonata No. 2 in A minor”). These results might provide general insights applicable to the composition of other such sonatas…

So what should be done? Should universities create “Departments of Hacking” to complement their CS departments? I actually think they should (especially if the split led to more tenure-tracks for everyone). All I ask is that, if you do find yourself in a future Hacking Department, you come over to CS for a course on algorithms and complexity. It’ll be good for your soul.

32 Responses to “Doctor of Hacking”

  1. Anonymous Says:

    “…rather than building things that are useful” – by CS here you do mean theoretical CS and not Systems?

  2. Anonymous Says:

    It is silly to suggest that theoretical computer science is the same as mathematics. This is like saying theoretical physics and math are one and the same. They aren’t. Sure, they have lots in common, and some people inhabit the boundary between the two, like Avi or Santosh Vempala, but the ultimate motivating factor for each is different.

    Studying computer science problems without regard to their actual relevance is not a new field. That *is* mathematics.

  3. Anonymous Says:

    That is why internet practitioners publish few papers. They limit themselves mostly to RFCs and the like. Perhaps there is a lesson there for the more applied side of computer science. Forget about publishing and let your code do the talking.

  4. Greg Kuperberg Says:

    I used to know Paul Graham. He’s a nice enough guy in person, but I really object to some of the opinions on his web site. Liking Paul Graham is not the same as agreeing with him.

    The idea of separating hacking from CS theory is like the idea of separating popular music from classical music. In fact, there is some distance between them in the real world. Undoubtedly there are a lot of hard rock guitar players in college who think that music departments are really tedious. Some of them even think that learning to read music is tedious. Why bother with all of that abstract stuff; what’s really important is hacking with that guitar. (And that great amplifier, which goes up to 11.)

    But what happens to self-trained rock musicians? They are in an overcrowded genre, to say the least. Very few of them get the creative career that they want. Instead they graduate to bad gigs, very bad gigs, and corporate muzak.

    There are a lot of very bad gigs and a lot of corporate muzak in the software world. Much of it is written by middling former CS majors who never respected CS theory. It’s fine to hack as a teenager; I did my share. But after that, every self-respecting programmer should get a deeper clue. Which is to say, math and algorithms and all that hard stuff. This is not “eat your peas” advice. It’s more like “give Mozart a try” advice: his music is beautiful and there is a lot to learn from it.

    I also don’t like Graham’s essay about nerds. Again, it’s been there, done that; nonetheless, I strongly disagree. But that’s a more personal matter and I’m not sure that I want to expand on it in public.

  5. Scott Says:

    Greg: I certainly don’t agree with everything in Graham’s essays, but I find many of them to be brilliant. The one about nerds is my favorite. I’ll post about it sometime; then you can choose whether or not to explain your disagreement. 🙂

  6. Niel Says:

    One of the problems is that Computer Science isn’t; that is, it isn’t science. Computer science has in common with mathematics that no-one really has a good place to put it. Like math, CS in many universities is lumped in the science faculty, although it is clearly not about empirical testing of the real world (except somewhat artificially in the case of the natural-history segment of CS).

    The only thing that math and CS have in common with science is that they are disciplines in which well-formed statements are likely to be falsifiable if they are not true.

    The difference between math and science is that math also allows for the possibility of proving statements to be true, which is never possible in a domain where empirical results are the ultimate authority. Theoretical CS also has this property, and properly speaking is a branch of mathematics; even if its plumage and mating call differ, it is obviously of the same genus. If theoretical computer scientists are different from many mathematicians, perhaps it is for the same reason that someone working in number theory is different from someone working on solving partial differential equations.

    The natural history and programming aspects of CS seem to actually be not science, but a new variety of engineering — the design and testing of information constructs to be implemented in an information-processing environment.

    I don’t know why these things are all lumped together, and furthermore are labelled as science. It may possibly be out of a vague feeling that the computer-engineers should be speaking to the informatic-mathematicians. But mathematicians got along well just fine with physicists (or were simultaneously both) without physics being labelled a branch of mathematics or vice-versa.

    There is a saying that anything which has “science” in its title probably isn’t. The general idea is that if a subject is actually science, you’ll be able to recognise the lion by its claw marks. I think that CS is an illustration that a strict interpretation of that saying is very possibly true.

  7. Miss HT Psych Says:

    This sounds very similar to something in my discipline. Psychology is so varied in its purposes that at some point it essentially split into 2 or 3 separate disciplines, all still called by the same name: neuroscience, experimental psychology and applied psychology. Maybe CS will go through the same thing… a period where you have “crisis” and “unification” literature popping up in journals. It will continue until the separate factions find a common ground or a language they can all speak in. If they can’t, they split forever. This is why Thomas Kuhn argued that Psychology was still a pre-paradigmatic science… because of the unification problems it couldn’t truly meet the requirements of a paradigmatic science. Maybe CS, as a young science itself, is on the same path…

  8. David Molnar Says:

    Niel: I find it amusing to juxtapose your emphasis on proving statements true with Scott’s recent posting about how we should write papers. Of course, I think Scott is saying that the way he suggests of writing should count as proving a statement in your sense.

    Scott: Who says all systems papers are boring? or is it just that this isn’t your area of interest? Yes, a paper cannot replace the code, but it is not meant to do so — and there is more to systems research in computer science than just hacking code!

    Rather than go into a long exposition of my favorite systems papers here, I’ll just point to Val Henson’s weblog, in which she picks some of her favorite work in OS and does an excellent job of explaining why it’s a good idea:

    http://osjunkie.blogspot.com/

    For what it’s worth, I do know people who believe that the code itself is sufficient as documentation. I don’t agree. Sure, 95+% of all the papers out there are terrible, but 95+% of most writing is terrible. It’s the remaining ones that advance the field and give others a concise way to avoid repeating the mistakes of the past.

  9. Scott Says:

    “Scott: Who says all systems papers are boring?”

    I do. 🙂 (Well, almost all.)

    “or is it just that this isn’t your area of interest?”

    Look, I’m trying to be unbiased. But for CS152, I read many dozens of papers of the form “we implemented such-and-such,” and not once was I (so to speak) brought to intellectual orgasm, in the way one expects from a good theory paper. In my experience, the sole exception is computer security papers — especially papers that describe a break of some widely-deployed system.

  10. Scott Says:

    ” ‘…rather than building things that are useful’ – by CS here you do mean theoretical CS and not Systems?”

    No, I was talking about systems as well. See Graham’s essay.

  11. Anonymous Says:

    There is a lack of historical perspective in the discussion. Poincaré, a famous mathematician, independently discovered the equations of relativity before Einstein but couldn’t bring himself to believe they were true. Einstein used to be called a mathematician, before mathematicians banished anything with a hint of an application from their “pure” math departments in the early parts of the twentieth century.

    Similarly physicists used to work on all manner of problems, from the highly theoretical to the extremely applied (see e.g. Fourier) before physics decided too to expel its best applications into engineering.

    These decisions have nothing to do with what is or isn’t truly part of math or physics or what constitutes a science. They were human constructs responding to external demands and funding pressures. Math and physics were made all the poorer by them and physics has recently been moving rapidly to recover some of those orphaned areas.

    Ironically, now we see people claiming that CS won’t be a science until it makes the same mistakes as math and physics and jettisons its best applications and source of inspiration!

  12. Scott Says:

    Anonymous: I wasn’t suggesting to “spin off” hacking in order to preserve the purity of theoreticians (we’re already doing that just fine, thank you), but rather to give hackers more freedom to write beautiful software that doesn’t necessarily answer any scientific question.

  13. Eldar Says:

    Actually Einstein admits to being wary of mathematics during the time of his development of Special Relativity. Of course, that all changed by the time he got to General Relativity…

    A little more generally, from what I learned through some theoretical physics courses that I took as an undergrad, some physicists view themselves as mathematicians of reality while others view mathematics as little more than a spiritual wrench.

    And if I’m babbling so much about mathematics and physics already, one could describe some of the constructions in advanced quantum field theory as “black magic with mathematical symbols”, as no sane theoretic mathematician would ever approve the derivation of the formulas there (and in some cases not even the existence of any meaning to the formulas of the physical axioms themselves).

  14. Anonymous Says:

    To me the problem is that Computer Science departments spend way too much time with human constructs. The nuances of the Java class libraries, or the hundreds of machine instructions on a VAX (not to mention its horrid floating point representation) have little to do with computer science. I think it would be great if systems programming and algorithms were taught by separate departments.

    The systems side could have a more “get er done” attitude, while the algorithms side would have less baggage.

  15. Greg Kuperberg Says:

    Right, beautiful hack software unfettered by the ugly millstone of good algorithms.

    Doesn’t the world already have plenty of that?

  16. Anonymous Says:

    –The nuances of the Java class libraries, or the hundreds of machine instructions on a VAX (not to mention its horrid floating point representation) have little to do with computer science.

    There is plenty of computer science in them. Say, why are they so wretchedly bad and what can theory say/do about it?

    As a second example: not long ago researchers at Microsoft devised a simplified logic that can capture the functionality of most device drivers and automatically formally verify them. This enormously reduced the Windows bug count. An automated theorem prover that actually works is first rate theory work.

    Another excellent example is Boaz’ result on code obfuscation. A short minded view of computer science would have argued that code obfuscation is a human construct and hence not computer science. Luckily for us, Boaz wasn’t reading this blog back then, otherwise one of the better theory papers of 2001 would have never seen the light of day.

  17. Greg Kuperberg Says:

    On the other hand, I concede that I have seen one very bad trend in computer programming that has something to do with computer science departments: insanely bureaucratic “object-oriented” code. I do not know who exactly to blame for this, whether the problem is hackers, or starched-shirt CS majors, or a CS education fad, or computer language zealots. (I doubt that the problem is complexity theorists, at least.) Regardless, it’s a serious problem.

    For example, here is how an “object-oriented” zealot might write 2+3 in Java:

    arithmetic = new IntegerArithmetic();
    x = new Integer(2,arithmetic);
    y = new Integer(3,arithmetic);
    plus = new AdditionOperator(arithmetic);
    plus.leftOperand(x);
    plus.rightOperand(y);
    answer = plus.evaluate();

    Some widely used APIs all but force you to write code like this.

  18. Scott Says:

    “Luckily for us, Boaz wasn’t reading this blog back then, otherwise one of the better theory papers of 2001 would have never seen the light of day.”

    I don’t think that’s fair to Boaz. I mean, all he’d have to do is scroll down to the previous post where I praised his paper, then click on the link. 🙂

  19. Anonymous Says:

    — insanely bureaucratic “object-oriented” code. (I doubt that the problem is complexity theorists, at least.)

    Actually, I think it has a lot to do with the obsession of programming language researchers with elegant theories, without attention to praxis.

    Sure object oriented software architectures are cleaner, but as you correctly show, rigid “the world is an object” architectures are silly.

    For example, Java took a step that way, completely removing global variables and functions as first-class citizens (except for a few native types) and if anything they are criticized within the PL community for not going all the way!

    At least your example creates an independent addition operator (you betray your mathematical origins here). The standard way to do this in Java is to ascribe the plus operation to the first operand which is counterintuitive and wrong. The operation addition is not a property or function of a single integer, it “belongs”, if anything to the unordered pair. But even so, it is more the other way around (again as reflected in your code) the operation has independent existence and it owns the ordered pair, not the other way around.
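
    As a hedged illustration of the distinction drawn above, the following Java sketch puts the two styles side by side: addition as a method of the left operand, and addition as a free-standing operation that takes the pair. The names AdditionStyles, receiverStyle, operatorStyle and PLUS are invented for this example; only BigInteger.add and IntBinaryOperator come from the standard library, and the lambda syntax assumes Java 8 or later.

```java
import java.math.BigInteger;
import java.util.function.IntBinaryOperator;

public class AdditionStyles {
    // Style 1: addition is a method of the left operand, so x "owns" the
    // operation and y is merely its argument.
    static BigInteger receiverStyle(BigInteger x, BigInteger y) {
        return x.add(y);
    }

    // Style 2: addition exists on its own and is handed both operands.
    // PLUS is a hypothetical name chosen for this sketch.
    static final IntBinaryOperator PLUS = (x, y) -> x + y;

    static int operatorStyle(int x, int y) {
        return PLUS.applyAsInt(x, y);
    }

    public static void main(String[] args) {
        System.out.println(receiverStyle(BigInteger.valueOf(2), BigInteger.valueOf(3))); // prints 5
        System.out.println(operatorStyle(2, 3)); // prints 5
    }
}
```

    Both calls compute the same sum; the difference is only in which party the language treats as owning the operation.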

  20. Anonymous Says:

    You might find this thread on my proposal to be of interest:

    http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi?board=riddles_general;action=display;num=1114293169

  22. Greg Kuperberg Says:

    Indeed, in some ways my seven-line obfuscation of 2+3 is too elegant to satisfy purists of object-oriented programming. The conventions encourage you to implement binary operations, like addition, asymmetrically as methods applied to one of the operands.

    But the bureaucracy of team programming can always dismiss any solution as too specific. There are always more complicated ways to express 2+3; seven lines would not be nearly enough to satisfy the bureaucratic programmer who is long on enthusiasm and short on ideas.

    For example, last year I looked for a good program to generate Captchas. (These are visual Turing tests; I wanted one to spam-protect e-mail addresses.) Unfortunately, open source efforts were then dominated by an extremely bureaucratic project called JCaptcha. Why use a framework with hundreds of classes when some implementations are so simple, the lead developer asks. His thinking is that it will be that much easier to extend the code in the future. I hope not!

  23. Bram Says:

    The big problem is that a CS background is mostly useless for real world programming. Sure, it’s a good idea to know a few basic algorithms and runtimes, and developing an intuition about which problems are NP-complete is worthwhile, but such issues only turn up occasionally in practice. Real world coding issues, like using version control, or modularizing properly, are frequently not even mentioned once in the course of an entire undergrad program.

    Exacerbating this problem is the attitude among academics (both real academics and the faux academic design patterns crowd) that they know how to code better than the people who actually do it. The commentary on operator overloading above is a good case in point – such suggestions involve a lot of increase in complexity of the underlying language, for a case which is utterly irrelevant in the real world.

    And, sorry Greg, but you’re totally wrong about musicians – people who went to Juilliard may have the attitude that they could write the hits if they wanted to, but the fact of the matter is if they could they would, and a lot of very notable hit-writers have no academic training.

    I don’t see much of a way of fixing these problems in CS programs – they have a mandate to teach people to be theory researchers, but in the absence of an available program designed to train programmers, people destined to become programmers get CS degrees because it’s the best thing available.

    Occasionally academia puts out a real software project, but for some reason the work of industry or hobbyists consistently proves more utilitarian. Work on languages, layer 5 networking and (a recent obsession of mine) version control have almost entirely been hobbyist driven. Probably because the hobbyists are better at setting clear utilitarian goals and sticking with them.

  24. Anonymous Says:

    — Real world coding issues, like using version control, or modularizing properly, are frequently not even mentioned once in the course of an entire undergrad program.

    Very true, but this isn’t inherent to what CS is, rather it is simply the result of current research trends, and hopefully twenty years from now theoreticians will be regularly talking about them.

    If debugging device drivers has been brought into the realm of theory, there is no reason why we can’t do the same with all the other things you mention.

  25. Greg Kuperberg Says:

    In fact, I do listen to pop music, more often than I listen to classical music, and I am well aware that not all hit writers went to Juilliard. The work of untrained musicians can easily be popular, even when it isn’t very good. Yes, I sometimes enjoy even the trashiest pop music (even “My Sharona”, even “Material Girl”), but then, I also sometimes enjoy McDonald’s. Like McDonald’s, pop music can have a greasy aftertaste. At those moments I wish that I had learned more at an early age so that I would appreciate Mozart more and Led Zeppelin less.

    Nonetheless all pop music, whether it is trashy or not so trashy, benefits from the hidden hand of the great Western tradition of classical music. Otherwise “Material Girl” would have had no backup music, except maybe drums; and no one would have remastered Madonna’s singing to put it in tune.

    The situation is very similar with hacking, except that I did eventually get the advanced training in computer science that I shrugged off with music. Before I went to college, I wrote computer games. I did already know enough to give my games clock-based timing and Galilean gravity. It’s not just that I could do it; it’s that I didn’t want anything less. I was disgusted by other games with CPU-based timing and illogical, medieval gravity.

    Which brings me to the point of learning advanced algorithms even if you won’t use them. (In the case of computer games, some college physics too.) If you only learn the simplest good algorithms, you won’t be viscerally offended when you or someone else does it wrong. You need to learn the next stage after what you will actually use to develop solid intuition. It’s the same principle whether you’re studying music, math, CS, or anything else.
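
    For readers unfamiliar with the distinction between clock-based and CPU-based timing mentioned above, here is a minimal Java sketch, not the code of the games described. It assumes a single falling body under constant acceleration and a simple Euler-style update; the names FallingBody, step and simulate, the starting height of 100 and the value of G are all made up for illustration. The point is that the update is scaled by elapsed time rather than by how many frames the processor manages to draw.

```java
public class FallingBody {
    // Assumed constant downward acceleration in metres per second squared.
    static final double G = -9.81;

    double height;   // metres
    double velocity; // metres per second

    FallingBody(double height, double velocity) {
        this.height = height;
        this.velocity = velocity;
    }

    // Clock-based update: the caller passes the elapsed time, so a fast
    // machine taking many small steps and a slow machine taking few large
    // steps cover the same amount of simulated time.
    void step(double dtSeconds) {
        velocity += G * dtSeconds;
        height += velocity * dtSeconds;
    }

    // Helper for the sketch: simulate totalSeconds split into equal frames.
    static FallingBody simulate(double totalSeconds, int frames) {
        FallingBody body = new FallingBody(100.0, 0.0);
        double dt = totalSeconds / frames;
        for (int i = 0; i < frames; i++) {
            body.step(dt);
        }
        return body;
    }

    public static void main(String[] args) {
        // In a real loop, dt would come from the clock, for example by
        // differencing successive System.nanoTime() readings.
        FallingBody body = new FallingBody(100.0, 0.0);
        long previous = System.nanoTime();
        for (int frame = 0; frame < 5; frame++) {
            long now = System.nanoTime();
            body.step((now - previous) / 1e9);
            previous = now;
        }
        System.out.println("velocity after a few real frames: " + body.velocity);

        // Same simulated second at two different frame rates.
        System.out.println(simulate(1.0, 50).velocity);
        System.out.println(simulate(1.0, 100).velocity);
    }
}
```

    With this update rule the velocity after one simulated second agrees across frame rates up to rounding, while the height agrees only approximately and converges as the frame time shrinks. A CPU-based version would instead add a fixed amount per frame, so the body would fall faster on a faster machine.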

  26. Anonymous Says:

    I disagree with the claim that TCS is not science since it uses math.

    In fact, I find the insistence to separate math from science (and the whole insistence of a completely precise definition of the term “science”) to be pointless philosophizing.

    When your goal is to study something about the world surrounding us by the use of reason and skepticism, you are doing science.

    It does not matter if you use physical experiments, pen and paper mathematical proofs, mathematical conjectures, computer generated proofs, or computer simulation.

    While it perhaps can be debated if studying elliptic curves can be considered studying the world that surrounds us, I don’t think anyone can say that computation is not part of the world.

  27. Scott Says:

    “I disagree with the claim that TCS is not science since it uses math.”

    Dammit! You just scooped one of my future posts. 🙂

  28. Greg Kuperberg Says:

    The more that I think about it, the more contradictory it seems to have a department of hacking. If an activity is tutored, then it isn’t hacking. (At MIT, “hacking” refers to any kind of clever anarchy, not necessarily computer programming.)

    What hackers really need on campus is a club. A hackers’ club is a great idea, provided that its members find time to learn CS theory.

    I am open to the idea that much of the CS major is misdirected, although I don’t think that CS theory is particularly at fault. Rather, somewhere along the way, students got the wrong message about modular programming, and its successor du jour, object-oriented programming. People are misusing modules and objects to cause the problem that they were intended to solve, namely the problem of unlikable, unmaintainable code. Validated data (as in XML) is the new fad and people are stirring the pot with it even more.

    I actually like modules, and modular objects, and I even like XML. But can’t people show some restraint?

  29. Bram Says:

    Greg, you described the benefits of having a solid algorithms background very well. Unfortunately algorithmic techniques are only a small part of everyday programming, and getting smaller, as libraries to do the hard stuff become more mature and ubiquitous.

    You’re missing my point about music though. There’s a lot which goes into making a catchy, accessible tune, and frequently a musical training makes people worse at it rather than better, because it gives them an ear for subtle musical effects, complicated rhythms, and the ability to play lots of notes quickly, all of which are good things to be able to do, but after learning to do them you must also learn the restraint to only do them when it’s warranted.

  30. Greg Kuperberg Says:

    It may be true that the algorithmic perspective is less necessary than it once was. In effect, the software community relies on the guru system: A minority of gurus understand the algorithms, and their expertise radiates ever better to everyone else with libraries, web pages, e-mail, and so on.

    But less necessary is not the same thing as less useful. If there were more gurus, software would be better. Google’s principals, who are gurus themselves, apparently think so. They are hiring as many PhDs as they can. Indeed the genesis of Google was new algorithms.

    If studying algorithms were really a crushing burden for CS majors, then I might agree that it is overemphasized and that more time should be spent on other kinds of expertise. Maybe it is true in Hungary. I am skeptical of it as a model of the American CS major. I think that what really happens is that most of the students don’t like theory because they don’t know how to like theory; it goes back to poor mathematical education in grade school. Instead, the students are in a wilderness of hacking and object-oriented bureaucracy.

    I concede an unfortunate gap between serious music and accessible music. I think that there is simply too little incentive to bridge that gap in the United States, because, just as with mathematics, education in music isn’t very good. In situations where the audience is better prepared, even if it is not an audience of music professors, the gap is bridged. (After all, you can’t expect everyone to be a music professor.) For example, Stephen Sondheim and Tom Lehrer both stand in the gap.

    I am skeptical that training ever makes anyone worse at anything, except that it may sap your desire to do it. If you are a trained chef, you may have a lot of trouble wanting to plan the menu at McDonald’s. Likewise if you went to Juilliard, you might not want to arrange songs for Madonna. But if you want to anyway, you could be George Martin, “the fifth Beatle”. I don’t see how training hurts.

  31. Bram Says:

    Greg, in my experience PhDs aren’t any more likely to turn out to be good practical programmers than anyone else with work experience, and are much more likely to be pompous, twitty, and demand a much higher salary. Faced with a choice between two otherwise equivalent potential hires one of whom had a PhD and the other of whom was a college dropout (but both with considerable work experience) I’d select the dropout.

  32. Shtetl-Optimized » Blog Archive » It’s science if it bites back Says:

    […] Is math a science? What about computer science? (A commenter on an earlier post repeated the well-known line that “no subject calling itself a science is one.”) […]