Wednesday, May 14, 2008

When copy editors make things worse

"Besides getting more data, faster, we also now use much more sophisticated learning algorithms. For instance, algorithms based on logistic regression and that support vector machines can reduce by half the amount of spam that evades filtering, compared to Naive Bayes." (Emphasis added.)

Joshua Goodman, Gordon V. Cormack, and David Heckerman. 2007. Spam and the ongoing battle for the inbox. Communications of the Association for Computing Machinery, volume 50, number 2, page 27.

Tuesday, May 16, 2006

Running on parentheticals

A common source of run-on sentences is the inclusion of a parenthetical full sentence at the end of another sentence, for instance,
This is an example (there may be others).
This construction is always wrong. Separate the two sentences, as
This is an example. (There may be others.)
or coordinate or subordinate the two, as
This is an example (though there may be others).
or
This is an example (and there may be others).
The following is not correct:
This is an example (however, there may be others).
“However” is an adverb, not a subordinating conjunction.

MS Word Defects

Writers using MS Word tend to make certain standard errors in their typesetting. For instance, they use hyphens instead of em-dashes (ctrl-alt-hyphen or option-shift-hyphen). Mathematical typesetting is especially bad. There is essentially no way to typeset mathematics well in MS Word. The best solution: LaTeX.
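
As a minimal sketch of what the LaTeX alternative looks like (the sentence and the formula are illustrative only, not drawn from any particular paper): three hyphens in the source produce an em-dash, two produce an en-dash, and display mathematics is set between \[ and \].

```latex
\documentclass{article}
\begin{document}
% Three hyphens give an em-dash; two give an en-dash (as in page ranges).
The result---surprising as it seems---holds on pages 27--31.

% Display mathematics; this formula is only a placeholder example.
\[
  \sum_{i=1}^{n} i = \frac{n(n+1)}{2}
\]
\end{document}
```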

Wednesday, February 08, 2006

That/which

For a while, I've been meaning to comment on the "that"/"which" controversy: the claim that "which" should not be used for restrictive relative clauses, nor "that" for nonrestrictive ones. From a linguistic point of view, it seems clear that this claim is descriptively barren. Geoff Pullum provides a convincing and entertaining argument on Language Log, based on the sentence "The key point, that all the popular reports missed, is that FOXP2 is a transcription factor...". The rarity of sentences like this one, in which "that" is used for a nonrestrictive relative clause, leads Pullum to call it "ivory-billed".

I suppose, and am happy to stipulate for the purposes of discussion, that the use of "which" for restrictive relative clauses and "that" for nonrestrictive (or supplemental, as Pullum prefers) is grammatical. Nonetheless, the overwhelming preponderance of occurrences of "which" for nonrestrictive clauses means that the use of "that" in that context is much more likely to give pause to the reader, a kind of cognitive setback. For that reason, a charitable writer (and shouldn't we all strive to be one of those?) ought to use "which" for nonrestrictive relative clauses -- not because it is "wrong" to use "that", or ungrammatical, but because the use of "that" is likely to be jarring to a significant fraction of one's readers. (And I don't only mean the Fowler-type prescriptivist readers, though I suppose there's no reason to be jarring them needlessly either.) An excellent point of evidence is the fact that Pullum had to ask the author directly which meaning he had intended in the ivory-billed sentence; had he used a "which", no clarification would have been needed.

In the particular case of the sentence quoted above, there is no concomitant advantage to using "that" over "which" that would compensate for the negative effect of jarring or confusing the reader. Thus, its use should be prescriptively deprecated. (This issue of compensation allows me to avoid proscriptions against splitting infinitives or dangling prepositions, the slavish following of which leads to circumlocutions and semantic errors. Avoiding these negative effects clearly compensates for the oh so very slight jarring effect on some small fraction of true-believing Fowlerians.) By a similar argument, the use of "which" for restrictive relatives should be deprecated as well in formal writing.

What I am arguing is that even though the language does not enforce the distinction between nonrestrictive and restrictive in terms of "which" versus "that" (and commas versus none), respectively, there is still a good reason to write as if it did. There was nothing wrong in the quoted sentence even under the intended interpretation, just something infelicitous.

Am I trying to have my cake and eat it too? To be able to rail prescriptively while keeping my linguistic descriptivist moral stance? Yes.

Friday, October 08, 2004

Three Styles for Writing a Paper

Different people have different styles for overall organization of a technical paper. There is the "continental" style, in which one states the solution with as little introduction or motivation as possible, sometimes not even saying what the problem was. Papers in this style tend to start like this: "Consider a seven-dimensional manifold Q, and define its hyper-diagonal as the ...." This style is designed to convince the reader that the author is very smart; how else could he or she have come up with the answer out of the blue? Readers will have no clue as to whether you are right or not without incredible efforts in close reading of the paper, but at least they'll think you're a genius.

Of course, the author didn't come up with the solution out of the blue. There was a whole history of false starts, wrong attempts, near misses, redefinitions of the problem. The "historical" style involves recapitulating all of this history in chronological order. "First I tried this. That didn't work because of this, so I tried this other way. That turned out to be stupid. Then I tried this other way...." This is much better, because a careful reader can probably follow the line of reasoning that the author went through, and use this as motivation. But the reader will probably think you are a bit addle-headed. Why would you even think of trying half the stuff you talked about?

The ideal style is the "rational reconstruction" style. In this style, you don't present the actual history that you went through, but rather an idealized history that perfectly motivates each step in the solution. "We consider the problem of XXX. The obvious thing to try is X. But such-and-such a pithy example shows that that fails miserably. Nonetheless, the example points the way naturally to solution Y. This works better, except for such-and-such an obscure case. We patch solution Y to handle this case, forming solution Z. Voila." Of course, the author doesn't tell you that he came up with solution Y before solution X, which only occurred to him after he came up with solution Z, and he skips solutions A, B, and C because, in retrospect, they are nowhere on the natural path to Z, even though at the time he was completely convinced they were on the right track. The goal in pursuing the rational reconstruction style is not to convince the reader that you are brilliant (or addle-headed for that matter) but that your solution is trivial. It takes a certain strength of character to take that as one's goal. But the advantage of the reader thinking your solution is trivial or obvious is that it necessarily comes along with the notion that you are correct.

Wednesday, June 09, 2004

James Pryor's Guidelines

I've just discovered James Pryor's "Guidelines on Writing a Philosophy Paper". Despite their ostensibly limited goal, the guidelines apply much more broadly than just to philosophy papers. I especially like the characterization of readers as "lazy, stupid, and mean".

Tuesday, May 25, 2004

Running on howevers

Adverbials like "however" and "rather" seem to seduce people into writing run-on sentences.
This type of approach has been used in previous models, however, the presented algorithm adopts a different foundation.
But these words are not conjunctions, subordinating or otherwise. They are adverbs, like "on the other hand" or "unfortunately". The following is, presumably, clearly infelicitous.
This type of approach has been used in previous models, unfortunately, the presented algorithm adopts a different foundation.
By the same token, so is the sentence with "however". It is easily corrected:
This type of approach has been used in previous models; however, the presented algorithm adopts a different foundation.
or
This type of approach has been used in previous models. The presented algorithm, however, adopts a different foundation.
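
The pattern above is regular enough that a crude check can surface candidates for review. The following is a rough sketch only: the function name `find_candidates` and the short adverb list are assumptions made for illustration, and the check cannot tell a comma splice from the legitimate mid-clause use ("The presented algorithm, however, adopts ..."), so each hit is something to look at rather than an error.

```python
import re

# Illustrative (hypothetical) list of adverbials that often trigger comma splices.
ADVERBS = ("however", "rather", "unfortunately")

# Matches a comma, one of the adverbs, and another comma: ", however, ..."
# A semicolon before the adverb ("; however,") does not match.
CANDIDATE = re.compile(
    r",\s+(" + "|".join(ADVERBS) + r"),\s",
    re.IGNORECASE,
)

def find_candidates(sentence):
    """Return the adverbs found between two commas, as candidates for review."""
    return [m.group(1).lower() for m in CANDIDATE.finditer(sentence)]

if __name__ == "__main__":
    spliced = ("This type of approach has been used in previous models, "
               "however, the presented algorithm adopts a different foundation.")
    print(find_candidates(spliced))
```

The corrected version with a semicolon yields no candidates, since the adverb is no longer preceded by a comma.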