Thursday, December 11, 2008

Tips on building a CS paper or thesis

I'll describe some techniques that I found useful when writing my own M.Eng. thesis and papers. My focus is on improving the process of building the paper, not writing it, so I'll cover things you can do to make that process easier. This is not yet another compilation of writing tips.

Get a complete distribution, and install all of it. Don't try to be smart about packages. 1 GB of your disk space is worth less than the time you'll spend debugging missing dependencies.

I use texlive, because it's available on all platforms.

I'm a developer. When I write something that's not an e-mail, I'm in Eclipse. So I got myself an Eclipse plug-in for Latex, texlipse. You can plug its update site link directly into the Update Manager to get the plug-in.

I recommend configuring your texlipse project to use pdflatex. This gets you the PDF that you need, and lets you use .pdf files as figures. Most software used for figures has pdf output.
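Here's a minimal sketch of a document that pulls in a PDF figure when built with pdflatex. It assumes the mwe package's placeholder image `example-image` is installed (a full texlive install should have it); in a real paper you'd put the path to your own exported .pdf there. The label name is just an illustration.

```latex
% main.tex: build with `pdflatex main.tex`
\documentclass{article}
\usepackage{graphicx} % provides \includegraphics

\begin{document}

\begin{figure}
  \centering
  % example-image is a placeholder PDF shipped with the mwe package (assumed
  % to be installed); replace it with your own figure, e.g. a .pdf export
  \includegraphics[width=0.6\linewidth]{example-image}
  \caption{A figure included straight from a PDF file.}
  \label{fig:example}
\end{figure}

Figure~\ref{fig:example} was imported as a PDF.

\end{document}
```

Run pdflatex twice so the figure reference resolves.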

Version Control
I feel uneasy if my work is not under version control... it's a disaster waiting to happen. Latex is all text, so you can even work on multiple machines, and merge the changes intelligently.

I recommend subversion or git. They both have Eclipse plugins.
At the time of this writing, the git plug-in isn't self-sufficient... you'll need to know git's command-line to get stuff done. On the other hand, git's cheap branches might be worth the hassle.

There's no way in hell I'm typing in the data for my 100-200 bibliography references. Here's how I get my .bib entries:
  1. Go to Google Scholar Preferences
  2. Under Bibliography manager, select Show link to import citations into
  3. Make sure you have Bibtex selected as the format, then click OK.
  4. Find the cited paper / book on Google Scholar:
  5. Click on the Import into Bibtex link. The entry is revealed for your copy-pasting pleasure.
I know this might seem obvious, but it took me a while to try it out. Scholar's advanced search is useful when you're looking for a certain author's work, and that work is referenced a lot by newer papers. If you don't believe me, try finding the paper on Dijkstra's algorithm without advanced search.
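For illustration, here's roughly what a pasted entry and its citation look like, using the Dijkstra paper mentioned above. This is a sketch: the key and field layout approximate what Scholar hands you and may not match its output exactly, and the file name refs.bib is my assumption. The filecontents environment is only there to keep the example in one file; normally the entry lives in your own .bib file.

```latex
% Writes refs.bib next to this file; in a real project, paste the entry
% into your own .bib file instead.
\begin{filecontents}{refs.bib}
@article{dijkstra1959note,
  title   = {A note on two problems in connexion with graphs},
  author  = {Dijkstra, Edsger W.},
  journal = {Numerische Mathematik},
  volume  = {1},
  number  = {1},
  pages   = {269--271},
  year    = {1959}
}
\end{filecontents}

\documentclass{article}

\begin{document}

Shortest paths can be computed with Dijkstra's algorithm~\cite{dijkstra1959note}.

\bibliographystyle{plain}
\bibliography{refs} % refers to refs.bib, without the extension

\end{document}
```

To resolve the citation, run pdflatex, then bibtex, then pdflatex twice more.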

Act Like You're Building Software
Writing in latex is more like building software than writing a humanities paper. Once I embraced that, I became less miserable.

Unlike MS Word, latex lets you distribute your work across many files. A good directory / file organization can make re-organizing sections really easy.
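As a sketch of what I mean, a top-level file can be little more than a list of includes. The directory and chapter names below are hypothetical, and each one has to exist as a .tex file (for example chapters/introduction.tex) for this to build.

```latex
% thesis.tex: the top-level file; all chapter names are hypothetical
\documentclass{report}

% Uncomment to build only the chapter you're working on:
% \includeonly{chapters/design}

\begin{document}

\include{chapters/introduction} % reads chapters/introduction.tex
\include{chapters/design}
\include{chapters/evaluation}

\end{document}
```

Re-ordering chapters is then a matter of swapping two lines. Inside a chapter, \input does the same job for sections without forcing a page break.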

Writing in latex is also like building software in that you can use libraries. They call them packages. Here's what I used in my thesis:
  • listings - code listings
  • url - lets you add URLs to bibliography entries (why is that not in base, again?)
  • graphicx - figures; it provides \includegraphics, which is what you use to import PDF files
  • amssymb, latexsym, amsmath - useful stuff for mathy formulas (I include them everywhere, it makes my life easier)
  • boxedminipage - nice borders around code listings
  • times - switches the default text font to Times
  • clrscode - format your pseudocode CLRS style (I like it)
When I need something that seems general, I try to find a package that does it for me. People who write packages tend to write nice documentation on them, and I'd rather read that than fight with latex to figure out how to do something on my own.
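Put together, a preamble that loads the packages above might look like the sketch below. The document body is only an illustration I made up to show a couple of them in use; it assumes all the packages are installed, which a full texlive install should cover.

```latex
\documentclass{article}

\usepackage{times}                      % Times as the default text font
\usepackage{amsmath, amssymb, latexsym} % math environments and symbols
\usepackage{graphicx}                   % \includegraphics for figures
\usepackage{url}                        % \url{...} for typesetting URLs
\usepackage{listings}                   % lstlisting for code listings
\usepackage{boxedminipage}              % framed minipages
\usepackage{clrscode}                   % CLRS-style pseudocode

\begin{document}

% amsmath: a numbered equation
\begin{equation}
  \sum_{i=1}^{n} i = \frac{n(n+1)}{2}
\end{equation}

% listings: a short code listing
\begin{lstlisting}[language=Python]
def triangle(n):
    return n * (n + 1) // 2
\end{lstlisting}

\end{document}
```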

I'd like to push the building software similarity even further, but I haven't figured out how to set up a continuous build yet :)

Writing in latex isn't as intuitive as using MS Word. On the other hand, there are some advantages to describing the contents of your paper in code. Knowing and taking advantage of them has made my life easier, and saved me from potential disasters.

I hope you found this useful. If you have more tips, please leave a comment!

1 comment:

  1. Wow, didn't know about the Google Scholar tip. Seems like a real time-saver, for sure.

    times :)

    I've found latexmk extremely useful, especially when changing BibTeX references. Instead of manually typesetting and running BibTeX multiple times to get all the references just right, latexmk does it automatically, as many times as is required.