Friday, October 15, 2010

My workflow for writing papers (or, why I switched to LaTeX)

In the last few years I have changed my workflow for writing papers pretty radically. Previously, I used Microsoft Word along with EndNote as my primary platform (on the Mac, of course). My decision to change was driven by several factors:

  • I had grown tired of the clunkiness of EndNote and the lags in its integration with new versions of Microsoft Word.

  • I had grown even more tired of Word's tendency to crash, or to do crazy things that could only be fixed by starting with a completely new file.

  • I was just starting to work on a book, and I knew that for a large project like that, using Word would be a nightmare. In addition, my coauthors and I wanted to use a source code management system to coordinate changes to the document, and this was not really practical with Word files.

In the end, I decided to move to LaTeX as my primary platform for writing papers and books.  For those not familiar with LaTeX, you can think of it as a markup language like HTML, only for writing papers rather than web pages.  Editing a paper in LaTeX is not WYSIWYG - that is, you don't see the actual layout of the paper as you type.  Rather, you have to typeset the paper in a separate step.  For example, a very short paper might look like this in LaTeX:

\documentclass{article}
\title{My Article}
\author{Russ Poldrack}
\begin{document}
\maketitle This is the content of the paper.
\end{document}

Why on earth, you might ask, would I want to give up WYSIWYG editing to write my papers using some obscure markup language? The main reason is that it's very flexible, both in how you use it and what it can do. Because the files are plain text, you can edit them using any editor you wish. I use a package called TeXShop, which has a built-in editor and makes it easy to write, build, and view documents, but I know many others prefer Emacs. There are also many different packages and style files available, which allow a ton of flexibility in layouts and formatting. Finally, the fact that they are plain text files with a known format means that you can do tricky things like automatically generating LaTeX files from the information in a spreadsheet or database. I did this a couple of years ago when we had application packets from about 150 people for a summer course; I was able to take the application data from a web database and turn each person's data into a nicely formatted package, all done using a few pages of Python code.
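To give a flavor of what that kind of generation can look like, here is a minimal Python sketch. It is not the script used for the course applications: the CSV columns (name, institution, statement), the template, and the function names are all made up for illustration, and a real version would read from the database and write one .tex file per applicant.

```python
import csv
import io
from string import Template

# Characters with special meaning in LaTeX, mapped to their escaped forms.
LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#", "_": r"\_",
    "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


def latex_escape(text):
    """Escape LaTeX special characters one at a time (avoids double escaping)."""
    return "".join(LATEX_ESCAPES.get(ch, ch) for ch in text)


# Hypothetical one-page layout; $name etc. are string.Template placeholders.
APPLICANT_TEMPLATE = Template(r"""\documentclass{article}
\begin{document}
\section*{$name}
\textbf{Institution:} $institution

$statement
\end{document}
""")


def render_applicant(row):
    """Turn one record (a dict of strings) into a complete LaTeX document."""
    safe = {key: latex_escape(value) for key, value in row.items()}
    return APPLICANT_TEMPLATE.substitute(safe)


def render_all(csv_text):
    """Return a dict mapping each applicant's name to their LaTeX source."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["name"]: render_applicant(row) for row in reader}


# Made-up sample data standing in for an export from a spreadsheet or database.
SAMPLE_CSV = """name,institution,statement
Ada Example,Example University,I study memory & attention.
Bo Sample,Sample Institute,100% of my work uses fMRI.
"""

if __name__ == "__main__":
    for name, tex in render_all(SAMPLE_CSV).items():
        print(f"% ---- {name} ----")
        print(tex)
```

The one step that is easy to forget is the escaping: free-text fields routinely contain characters such as & and %, which would otherwise break the build or silently comment out the rest of a line.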

Another major reason for moving was BibTeX, which is the reference management system used with LaTeX. After all of my annoyances with EndNote + Word, BibTeX was like a dream. I use the BibDesk application to organize my libraries; it includes integrated searching of PubMed and other repositories and has met my needs almost perfectly. It's also possible to export BibTeX libraries from Papers, but BibDesk is nice because it operates directly on the BibTeX library, so there is no need to export.

There is only one thing that I seriously miss from my days of using Word, and that is the "Track Changes" feature for collaborative writing. One can use Unix tools like diff to find where two files differ, but that only tells you which lines were changed, not which words within those lines were changed. There is at least one open source tool that provides something similar to Word's track changes for LaTeX (latexdiff), but I've not yet been able to get it to work on my Mac.
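As a stopgap for the line-versus-word problem, a word-level comparison can be put together with Python's standard difflib module. The sketch below is only an illustration: the word_diff function and its [-removed-] / {+added+} markers are my own choices, it knows nothing about LaTeX syntax, and it flattens line breaks, so it is not a replacement for a real track-changes tool.

```python
import difflib


def word_diff(old, new):
    """Compare two texts word by word and mark up what changed.

    Removed words are wrapped in [-...-] and added words in {+...+}.
    Whitespace (including line breaks) is normalized to single spaces.
    """
    old_words = old.split()
    new_words = new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pieces.extend(old_words[i1:i2])
            continue
        if tag in ("delete", "replace"):
            pieces.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            pieces.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(pieces)


if __name__ == "__main__":
    before = "The results were significant."
    after = "The results were highly significant."
    print(word_diff(before, after))
    # prints: The results were {+highly+} significant.
```

Run over two versions of a paragraph, this shows exactly which words a coauthor touched, which is the piece that a plain line-based diff leaves out.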

Another problem is that many of my collaborators are not LaTeX users, so I can't exactly send them a file of raw LaTeX code and expect them to edit it. There are a couple of alternatives. The first is to save it as a PDF and let colleagues make comments on the file, but this doesn't let them actually edit the text. What I generally do is export the file to RTF (using latex2rtf) and then send that to my colleagues. Then I have to put their edits back into the LaTeX file by hand. Not exactly optimal, but it gets the job done.

Writing papers using LaTeX is not for everyone. There is definitely a learning curve, and occasionally things happen that require some pretty serious debugging. It also helps if your collaborators are LaTeX users too. But in general, it's been a welcome change from Word + EndNote.

Here are some resources that have been useful for me:
