It is a well-known phenomenon by now that whenever Randall Munroe mentions anything obscure in xkcd, searches for it spike tremendously. To this point, as far as I'm aware, he hasn't wielded this fact for evil, but still... the power that Randall Munroe holds over the internet is terrifying. I was reminded of this when one of my friends referenced one of these comics on Twitter. A bit of Googling failed to turn up a good list of examples of the xkcd effect, so I decided to create one.
Of course, I decided to write a script to do this - there are 676 comics as of this writing, and fortunately, transcriptions are available via OhNoRobot, so I don't even have to deal with the images. After a little poking around, I found a post on the xkcd forums (thanks, philip!) that gave the URL http://www.ohnorobot.com/transcribe.pl?comicid=apKHvCCc66NMg&url=http:%2F%2Fxkcd.com%2F[comic number]%2F to get the transcription for a given comic. Perfect! I of course lean towards Perl for these kinds of things, but I was tempted to go with Python because of the images. Then I remembered that a simple gnome-open [image file] would do more than I needed, so Perl it was.
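To make the URL pattern above concrete, here is a minimal sketch of filling in the comic number. The actual script was Perl and isn't reproduced here; this is Python, and the function name transcript_url and the constant names are made up for illustration. Only the URL template itself comes from the forum post.

```python
from urllib.parse import quote

# Both values are taken from the URL quoted in the post.
OHNOROBOT_BASE = "http://www.ohnorobot.com/transcribe.pl"
COMIC_ID = "apKHvCCc66NMg"


def transcript_url(comic_number):
    """Build the OhNoRobot transcription URL for one xkcd comic.

    Hypothetical helper: it only mirrors the template from the post,
    where the slashes of the xkcd URL are percent-encoded as %2F and
    the colon is left alone.
    """
    comic_url = f"http://xkcd.com/{comic_number}/"
    encoded = quote(comic_url, safe=":")
    return f"{OHNOROBOT_BASE}?comicid={COMIC_ID}&url={encoded}"


if __name__ == "__main__":
    print(transcript_url(676))
```

Fetching and parsing the page is left out on purpose, since the post doesn't say what the response looks like.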
Additionally, I remembered that I had already written an xkcd download script that I had stashed away in my Gmail, so I had a leg up already. So, after a mere four hours of hacking, I present the xkcd effectalyzer. It's pretty self-explanatory: the only parameter is "-r", if you want to go through the comics in reverse. The script goes through your specified comics, presents you with the OhNoRobot text, and gives you the option to view the image of the comic. Once that's all done, it asks you for a phrase to search on Google Trends. It then (with the credentials you provided at the beginning) grabs the necessary CSVs from Google Trends to get the trend data for the 5 days around and including the publication date of the comic (which it gets from xkcd's archives page). It then tells you what the indexes are, and lets you decide whether or not to save that result to the output CSV file. It continues to ask you for phrases until you don't give it one, and will write the first three to the CSV file.
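The "5 days around and including the publication date" step can be sketched roughly like this. It's Python rather than the script's Perl, and it assumes the window is centered on the publication date (two days before, two days after), which the description above doesn't actually spell out. The names trend_window and months_needed are hypothetical.

```python
from datetime import date, timedelta


def trend_window(published, days=5):
    """Return the dates of a window centered on the publication date.

    Assumption: the window is symmetric, so days=5 means two days
    before the comic and two days after it.
    """
    half = days // 2
    return [published + timedelta(days=offset)
            for offset in range(-half, half + 1)]


def months_needed(window):
    """List the (year, month) pairs the window touches, in order.

    A window that crosses a month boundary needs data for both months.
    """
    return sorted({(d.year, d.month) for d in window})


if __name__ == "__main__":
    window = trend_window(date(2009, 10, 30))
    print([d.isoformat() for d in window])
    print(months_needed(window))
```

For a comic published on the 30th of a 31-day month, the fifth day of this window lands in the following month, so two months of data are needed.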
It might do more than that, but like I said - it should be pretty self-explanatory. It's got all kinds of nifty features, like saving your Google session and grabbing neighboring months if necessary, which are mostly what took so long.
But anyway, I went through backwards from the current comic (676) back to 600, and besides revisiting some fantastic comics (including my favorite of the more recent ones), I found over 60 noticeable spikes in Google searches because of an episode of xkcd - that's in just 75 comics. Some comics spiked for multiple phrases, of course, and some for none at all. But this also includes 20 terms that hit Google's Hot Trends page.
I put them together in a graph that shows the spikes collected around the release of each comic, and there are a couple of interesting anomalies. Obviously, a couple of points off to the right need explaining. The two that are shifted one to the right are both from "Locke and Demosthenes", which was released on Friday, September 11 of this year. So why the discrepancy? Well, my script gathers the dates from the recommended source - the alt-text on the xkcd archive page. But for "Locke and Demosthenes", the alt-text is off by one day, and says "9-10-09". The previous comics were published 9-7-09 and 9-9-09, and Randall only publishes on Thursdays in the event of a five-day series. Given that, not to mention the Google results, I'm willing to bet that the archives page is in error, and it was actually published on Friday.
The other anomaly on the right-hand side is just because the fifth day is in the next month, which screws up the relative numbers. It disappears in this chart based on the fixed data, which also, handily enough, highlights an anomaly on the right side: "github" spiked on the day that Munroe put out Branding, but was climbing in popularity before that. Why that is, I haven't the foggiest.
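The weekday argument for "Locke and Demosthenes" is easy to check mechanically. Here's a small Python sketch (again, not the original Perl) that parses dates in the archive's M-D-YY alt-text style and reports the day of the week. The helper names are made up, and the format string is an assumption based on the "9-10-09" example.

```python
from datetime import datetime

# Indexed by date.weekday(), where Monday is 0.
WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")


def parse_alt_date(text):
    """Parse an archive alt-text date such as '9-10-09' (month-day-year)."""
    return datetime.strptime(text, "%m-%d-%y").date()


def weekday_name(d):
    """Return the English weekday name for a date."""
    return WEEKDAYS[d.weekday()]


if __name__ == "__main__":
    for label in ("9-7-09", "9-9-09", "9-10-09", "9-11-09"):
        print(label, weekday_name(parse_alt_date(label)))
```

The dates 9-7-09 and 9-9-09 come out as Monday and Wednesday, while 9-10-09 is a Thursday and 9-11-09 is the Friday that fits the usual schedule.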
Now, pretty charts and things aside, it's also interesting, of course, to look at which terms spiked the most. So, here's a list of all the terms, sorted by the severity of their spikes:
- 7-13: Tab Explosion, "tvtropes" up 126 points (On Fire on Google Hot Trends)
- 8-17: Branding, "github" up 55 points (On Fire on Google Hot Trends)
- 9-18: The Search, "kepler mission" up 38.6 points (Spicy on Google Hot Trends)
- 7-6: Cutting Edge, "this was a triumph" up 31 points (Mild on Google Hot Trends)
- 7-10: Form, "hofstadterially" up 31 points (Medium on Google Hot Trends)
- 7-10: Form, "hofstad" up 31 points
- 10-5: RPS, "reverse polish notation" up 31 points (Mild on Google Hot Trends)
- 10-14: Static, "anti-static strap" up 31 points
- 9-9: Date, "punnett squares" up 30 points (Mild on Google Hot Trends)
- 9-23: Tornado Hunter, "fujita scale" up 30 points
- 6-24: Game Theory, "the only winning move is not to play" up 30 points
- 11-11: Two-Party System, "approval voting" up 30 points
- 10-5: RPS, "reverse polish sausage" up 22.8 points (Mild on Google Hot Trends)
- 9-21: Lincoln-Douglas, "Hark! A Vagrant" up 22.4 points
- 11-23: Silent Hammer, "American Skeptics Society" up 21.6 points
- 11-23: Silent Hammer, "Skeptics Society" up 21.6 points
- 9-14: Brontosaurus, "brontosaurus" up 20.2 points (Medium on Google Hot Trends)
- 11-23: Silent Hammer, "American Skeptics" up 20.2 points
- 7-8: 2038, "2038" up 19.9 points
- 7-6: Cutting Edge, "the cake is a lie" up 19.8 points (Spicy on Google Hot Trends)
- 10-23: So Bad It's Worse, "Star Wars Holiday Special" up 19.8 points
- 11-18: Academia vs. Business, "0x5f375a86" up 16.7 points
- 11-25: SkiFree, "skifree" up 15.3 points (On Fire on Google Hot Trends)
- 12-11: Natural Parenting, "sampling bias" up 15 points
- 7-22: Threesome, "three-body problem" up 14.8 points
- 7-8: 2038, "unix 2038" up 13.8 points (Mild on Google Hot Trends)
- 9-21: Lincoln-Douglas, "Hark a vagrant" up 11.8 points
- 9-16: Scribblenauts, "scribblenauts" up 11.4 points (Mild on Google Hot Trends)
- 9-4: Suspicion, "vk couples" up 11.2 points
- 9-4: Suspicion, "vk couples testing" up 11.2 points (Mild on Google Hot Trends)
- 7-10: Form, "hofstadter" up 9.7 points (Medium on Google Hot Trends)
- 10-30: October 30th, "doc brown" up 8.15 points (Medium on Google Hot Trends)
- 10-14: Static, "anti-static" up 6.3 points
- 7-1: Qwertial Aphasia, "smbc" up 5.57 points (Mild on Google Hot Trends)
- 11-4: Orbitals, "pauli exclusion" up 5.25 points
- 8-12: Haiku Proof, "Q.E.D." up 4.54 points
- 9-18: The Search, "kepler" up 3.28 points
- 11-16: Sagan-Man, "Carl Sagan" up 3.25 points
- 11-2: Movie Narrative Charts, "12 Angry Men" up 3.02 points
- 9-21: Lincoln-Douglas, "stephen douglas" up 2.68 points (Mild on Google Hot Trends)
- 11-11: Two-Party System, "Bull Moose" up 2.56 points
- 8-3: Asteroid, "little prince" up 2 points (Mild on Google Hot Trends)
- 10-5: RPS, "RPS" up 1.82 points
- 8-28: Skins, "furries" up 1.66 points
- 6-22: Android Boyfriend, "fleshlight" up 1.56 points
- 11-16: Sagan-Man, "Sagan" up 1.26 points
- 11-11: Two-Party System, "IRV" up 1.25 points
- 10-7: Conversations, "dysentery" up 1.04 points
- 9-23: Tornado Hunter, "fujita" up 1.03 points
- 10-26: Nachos, "geocities" up 1.02 points (Medium on Google Hot Trends)
- 11-2: Movie Narrative Charts, "Primer" up 0.76 points
- 7-13: Tab Explosion, "rickrolling" up 0.69 points
- 6-29: Idiocracy, "idiocracy" up 0.65 points
- 10-23: So Bad It's Worse, "Plan 9" up 0.65 points
- 8-21: Newton and Leibniz, "leibniz" up 0.55 points
- 8-12: Haiku Proof, "qed" up 0.525 points
- 11-20: Prudence, "White Witch" up 0.515 points
- 8-19: Collections, "existential" up 0.345 points
- 8-19: Collections, "terry pratchett" up 0.225 points
- 8-19: Collections, "discworld" up 0.126 points
- 11-27: Pandora, "Enchanted" up 0.1 points
- 8-14: Oregon, "oregon" up 0.015 points
- 9-10: Locke and Demosthenes, "demosthenes" up 0 points
- 9-10: Locke and Demosthenes, "peter wiggin" up 0 points
- 10-26: Nachos, "nachos" up -0.6 points
As I looked through these, what surprised me most was some of the things that people Googled, presumably because they didn't know what they were. I mean, some, like SMBC, Hofstad, Peter Wiggin, Demosthenes - I understand those. But classic stuff like "The only winning move is not to play" and "the cake is a lie", and things like sampling bias, Q.E.D., Carl Sagan, 2038, or the debacle with the brontosaurus, demonstrate that xkcd's readership does indeed include many who are not part of the normal geek crowd - such as liberal-arts majors. Also, Stephen Douglas? The Bull Moose party? What have history classes been teaching that people had to look those up?
But the takeaway from this, I think, is that Randall Munroe, as of late, anyway, has a better than 50% chance (41 times out of 76) of noticeably affecting the Google searches for whatever he happens to mention in his comic. It takes a whole lot of readers (which of course we know xkcd has) to do that with a single webcomic, and this illustrates quite clearly that Randall has them.
The other thing this demonstrates is that I have too much time on my hands, but I just finished finals, and it's Christmas break, so I don't want to hear about it.