July 20 2009

Kindle, 1984 and schadenfreude



Everyone’s all atizzy about Amazon’s recent decision to remove unauthorized copies of 1984 from the Kindle. All of a sudden, we’re reminiscing nostalgically about the freedoms inherent in paper and how the new digital era represents a grave threat. Seriously?

When we were all napstering and torrenting away, digital information represented us sticking it to the man and a sign that the RIAA was desperately out of touch with the changing media landscape. But as soon as the same phenomenon hits us, we end up responding exactly the same way that the RIAA did, desperately trying to preserve the institutions of print media despite arguments from first principles about how such a thing is impossible.

I hope there’s some dude at RIAA right now who is fully appreciating the irony of all this.

June 17 2009

Statistical vindication



A few days ago, I wrote about a case of a seemingly fascinating graph which I felt was used inappropriately. I was rightfully castigated in the comments for being too harsh but, to me, it gave the impression of a pattern when there really was none. In reply to some of the comments, I made the observation that

The only reason I wrote about it was because I was surprised that even I, as a reasonably trained statistics guy, was momentarily caught off guard by it. Clearly, you meant nothing malicious by it but it’s a technique that could be used for malicious purposes so I wrote about it.

Now, in the wake of the Iranian elections, it seems like my speculation has been somewhat vindicated. Andrew Sullivan posted what he claimed was the red flag that proved the Iranian elections were a fraud. And it seems eminently convincing. Luckily, Nate Silver produced a null hypothesis graph based on the US elections and demonstrated that the “red flag” was just a case of the exact same statistical fallacy I wrote about a week earlier.

May 11 2009

“academic freedom”



The enthusiasm is not universal. In January, a school board in Missoula County, Mont., decided that screening the video treaded on academic freedom after a parent complained that its message was anticapitalist.

New York Times

Even though I should know better, it continually astounds me just how much the term “academic freedom” has been abused.

War is Peace, people.

March 27 2009

Slate commits the Facebook redesign fallacy



Ugh, yet another media establishment is running the fallacious Facebook redesign argument and acting all clever about it. Sadly, this time it’s one I actually respect.

March 27 2009

Not statistically significant and other statistical tricks.



Not statistically significant…

Most people have no idea what “Not statistically significant” means and I don’t see the media being too eager to fix this.

Say you read the following piece in a newspaper:

A study done at the University of Washington showed that, after controlling for race and socioeconomic class, there was no statistically significant difference in athletic performance between those who stretched for 5 minutes before running and those who did no stretching at all.

What do you conclude from that? Stretching is useless? WRONG.

Here’s what the hypothetical study actually was: I picked four random guys on campus and asked two of them to stretch and two of them not to. The ones who stretched ran 10% faster.

Why is this then not statistically significant? Because the sample size was too small to infer anything useful and the study was designed poorly.

All “not statistically significant” tells you is that you can’t infer anything from the study. But word the study carefully enough, and you can have people believe the opposite is true.
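To put a number on the four-guys study, here is a rough sketch using an exact permutation test. The running times are made up for illustration (the stretchers are roughly 10% faster), and `permutation_p_value` is a hypothetical helper written for this post, not something from a library.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in group means.

    Enumerates every way to split the pooled values into two groups of the
    original sizes and counts how often the gap is at least as large as the
    one actually observed.
    """
    pooled = group_a + group_b
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))

    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in idx]
        diff = abs(sum(a) / len(a) - sum(b) / len(b))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        total += 1
    return extreme / total


# Made-up running times in seconds (assumption, not real data).
stretched = [90.0, 92.0]
not_stretched = [100.0, 102.0]

p = permutation_p_value(stretched, not_stretched)
print(f"p-value: {p:.3f}")
```

With two people per group there are only six possible splits, so the two-sided p-value can never drop below 2/6, no matter how huge the difference is. A four-person study is guaranteed to come out “not statistically significant” at the usual 0.05 threshold before anyone even starts running.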

Have you ever heard the claim “There’s no statistically significant difference between going to an elite Ivy League school and an equally good state school?” Perhaps from here, here or even here?

Well, from this paper (via a comment in an Overcoming Bias post):

For instance, Dale and Krueger (1999) attempted to estimate the return to attending specific colleges in the College and Beyond data. They assigned individual students to a “cell” based on the colleges to which they are admitted. Within a cell, they compared those who attend a more selective college (the treatment group) to those who attended a less selective college (the control group). If this procedure had gone as planned, all students within a cell would have had the same menu of colleges and would have been arguably equal in aptitude. The procedure did not work in practice because the number of students who reported more than one college in their menu was very small. Moreover, among the students who reported more than one college, there was a very strong tendency to report the college they attended plus one less selective college. Thus, there was almost no variation within cells if the cells were based on actual colleges. Dale and Krueger were forced to merge colleges into crude “group colleges” to form the cells. However, the crude cells made it implausible that all students within a cell were equal in aptitude, and this implausibility eliminated the usefulness of their procedure. Because the procedure works best when students have large menus and most student do not have such menus, the procedure essentially throws away much of the data. A procedure is not good if it throws away much of the data and still does not deliver “treatment” and “control” groups that are plausibly equal in aptitude. Put another way, it is not useful to discard good variation in data without a more than commensurate reduction in the problematic variation in the data. In the end, Dale and Krueger predictably generate statistically insignificant results, which have been unfortunately misinterpreted by commentators who do not sufficient econometric knowledge to understand the study’s methods.

In other words, the study says no such thing. It simply says the study itself was not sufficient to prove that Ivy League educations made you more money, because the data wasn’t good enough. And yet the media has twisted this into a positive assertion that state schools do indeed make you as much money as Ivy Leagues.

I’m generously inclined to believe that most cases that I see of this error are caused by incompetence but it’s pretty trivial to see how this could be used for malice. Want the public to believe that Internet usage doesn’t cause social maladjustment? Just design a shitty study and claim “We found no statistically significant difference in social competence between heavy internet users, light internet users and non-users”. Bam, half the PR work has already been done for you.

Controlling for…

Here’s another statistical gem I see all the time:

An analysis done at the University of Washington showed that there was zero correlation between race and financial attainment after controlling for IQ, education levels, socioeconomic status and gender.

Heartwarming, right? It means if we put blacks and whites in the same situation, they should earn the same amount of money. WRONG.

The key here is to see that we’re looking for financial attainment and controlling for socioeconomic status. Those two things mean the same damn thing. Basically, all this study told us was that being rich causes you to be rich.
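Here is a toy sketch of that trick with completely invented numbers. Two hypothetical groups, A and B, differ in how many members end up in the “high” socioeconomic bracket, and income in this toy world is determined entirely by the bracket. Every name and figure below is an assumption for illustration, not data from any study.

```python
from statistics import mean


def make_group(name, n_low, n_high):
    """Build (group, ses, income) rows; income depends only on the SES bracket."""
    return [(name, "low", 30_000)] * n_low + [(name, "high", 80_000)] * n_high


# Invented population: group A is mostly in the low bracket, group B mostly high.
people = make_group("A", 70, 30) + make_group("B", 30, 70)


def mean_income(group, ses=None):
    """Average income for a group, optionally restricted to one SES bracket."""
    return mean(
        income
        for g, s, income in people
        if g == group and (ses is None or s == ses)
    )


# The raw gap between the groups, with nothing controlled for.
raw_gap = mean_income("B") - mean_income("A")

# The gap "after controlling for socioeconomic status": compare within brackets.
controlled_gaps = {
    ses: mean_income("B", ses) - mean_income("A", ses) for ses in ("low", "high")
}

print("raw gap:", raw_gap)
print("gap within each SES bracket:", controlled_gaps)
```

The raw gap is 20,000, yet the gap inside each bracket is exactly zero. The “controlled” result is factually accurate and says nothing about why the two groups landed in different brackets in the first place, which is the question anyone actually cares about.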

Most people view the “controlling for” section of statistical reporting as a sort of benign safeguard. Controlling for things is like… due diligence, right? The more the better… It’s easy to numb people into a hypnotic lull with a list of all the things you control for.

But controlling for factors means you get to hide the true cause of things under benign labels. That’s why I’m always so wary of studies that control for socioeconomic status or education levels, especially when they don’t have to. Sure, socioeconomic status might cause obesity, but what causes socioeconomic status?


When people do bother to talk about statistical manipulation, they usually focus on issues of statistical fact: Aggressive pruning of outliers, shotgun hypothesis testing and overly loose regressions. But why bother with having to sneak poorly designed studies past peer review when you can just publish a factually accurate study which implies a conclusion completely at odds with the data? That way, you sneak past the defenses of anyone who actually does know something about statistics.

Sometimes, I swear, the more statistically savvy a person thinks they are, the easier they are to manipulate. Give me a person who mindlessly parrots “Correlation does not imply causation” and I can make him believe any damn thing I want.

February 2 2009

Shovel ready content



One small bright spot for media companies in this recession is the abundance of shovel ready content to fill the pages. Consider this article from the New York Times on the social effects of a recession.

I’ve been seeing more and more of these types of articles over the last few weeks and the great thing about them is that they require absolutely no journalistic content and can thus be produced by the bushel.

January 5 2009

The natural grain of cynicism



Every media type has a natural “grain”, a gentle structure that tugs and shapes the work within it. You can write a novel about a heroine who has piercing blue eyes and the audience will believe you. In a movie, convincing the audience that your heroine has piercing blue eyes requires that you go out and find an actress with piercing blue eyes, and this is significantly more difficult.

The natural “grain” of movies is hucksterism. It’s easier to portray an artist who is a talentless, pretentious hack who exploits a gullible audience than it is to portray an artist with genuine talent who receives the accolades they deserve. It’s easier to show a motivational speaker who only speaks in pseudowisdom than one who says genuinely wise things.

Literature is the medium of heroes, of people who are admired by others because they do great things. Movies are the medium of dupes and cons, of the people who are admired by others despite having no discernible talent. Is it then a coincidence that the rise of the long form novel coincided with the spread of modernism, with its exaltation of progress, and that cinema coincided with the move towards postmodernism and the admiration of aloof cynicism?

November 15 2008

On the auto industry bailout



I’ve been reading various things about the auto industry bailout and various opinions of people for and against. Through all this confusion, I thought I would point out a couple of invariants:

  • We’ll still be buying the same number of cars whether GM goes under or not. Those cars will still need parts and workers and those workers will still be working in North American plants. They might not be plants in Michigan but they’ll be somewhere in North America. So when you hear that GM and its subsidiaries employ 300,000 people, it does not mean that employment in the US will drop by 300,000 just because GM is out of business.
  • The dislocation will be painful. On the flip side, the market idealists who like to think of creative destruction as an abstract force are wrong. There’s going to be plenty of economic cost before the economy rights itself again. Factories are going to have to be torn down in Michigan and built up in Kansas; assembly lines will have to be retooled from making GM widgets to Honda sprockets; engineers who are used to working with Mac down the hall will now have to work with Joe from Ontario; and most of the R&D on the GM Volt isn’t going to be much use for the Toyota Prius.
  • Whether you support the bailout or not depends crucially on whether you feel GM can turn itself around. It’s curious that this case rarely seems to be made explicit. Those supporting the bailout make the fundamental assumption that GM can be restored to a smaller yet functional corporation that will eventually return to profitability, and those opposing it assume fundamental structural flaws in the company. I’ve not yet found many articles which make such claims explicit and try to justify them, and yet this is the determining factor in whether the bailout makes economic sense.

Personally, I’m very much against the bailout. It seems to me that the problems that GM faces are caused by an endemic failure of corporate culture spanning everything from an uncreative, insular management to an obdurate union which is unable to make the concessions needed. Such a thing cannot be fixed through any sort of superficial restructuring or easy infusion of cash. Rather, GM needs to go through the sort of wrenching transition that IBM went through in the ’90s and I don’t see any evidence of such a thing occurring.

Yes, it sucks that GM is going out of business and it’s going to cause enormous economic pain for those involved. It would be great if we could wave a magic wand that would cause that pain to go away and I would wholeheartedly support a bailout plan if that looked realistic. But as it stands, it looks like GM going out of business is inevitable and all a bailout will do is become an expensive way of forestalling the inevitable.

July 26 2008

Infuriating quote in the New York Times



The New York Times has a story about the-horrors-lurking-in-your-home, in this case, granite countertops.

As is pro forma, around the middle of the article, they bring in the expert to pontificate and, in this case, it was especially, infuriatingly stupid.

David J. Brenner, director of the Center for Radiological Research at Columbia University in New York, said the cancer risk from granite countertops, even those emitting radiation above background levels, is “on the order of one in a million.” Being struck by lightning is more likely. Nonetheless, Dr. Brenner said, “It makes sense. If you can choose another counter that doesn’t elevate your risk, however slightly, why wouldn’t you?”

The chances are tiny but why take the risk? Well, precisely because the chances are tiny.

“Walking in an open field increases your chances of being struck by lightning. If you can walk anywhere else, why would anyone walk through fields?”

Because we don’t care about the risk of being struck by lightning. It’s such an insignificant factor in our lives, and so are these “deadly” radioactive countertops…

