ChipLog

Various stuff from some guy.

Pipe into your netlife: Comic feeds

Penny Arcade and Control-Alt-Del are great comics. I love to read them, and I do when I’m not feeling really lazy. See, I pretty much live inside my Google Calendar, GMail, Remember the Milk, Google Reader, and Netvibes tabs. I’m pretty lazy when it comes to some forms of content, and if a comic doesn’t show up in Google Reader, I’ll typically ignore it until someone points out a particularly good strip to me.

While playing around with Yahoo! Pipes, I realized I could finally do something about this. I began to play around with a couple of pipes to read in the RSS feeds, look for all comic entries, and change the content. The trick was to copy the location the feed item was pointing to (which would contain the actual comic image within the page) to the description, and then apply a regular expression to the description to turn it into an <img> tag pointing to the image itself.

I got lucky. To my knowledge, there is currently no way in Pipes to fetch content from an arbitrary HTML page and do something with a piece of that page. I suspect their Fetch Data module might let me, but I haven’t managed to get it to work just yet. I was only able to pull this off because the comic image was stored at a predictable path based on the date of the comic, and the URL of the page being linked to also contained the date. A regular expression was all that was needed to parse out the date and rebuild the path.
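
For anyone who would rather see the idea outside of Pipes, here is a rough Python sketch of the same trick: take the link from a feed item, pull the date out of it with a regular expression, rebuild the image path, and drop an <img> tag into the description. The URL layout (comics.example.com, /comic/YYYY/MM/DD, /images/YYYY/YYYYMMDD.jpg), the dict-shaped feed item, and the function name inline_comic are all made up for illustration; they are not the real paths either comic uses, and this is not how the pipes themselves are built.

```python
import re

# Hypothetical URL layout, for illustration only; the real sites may differ.
PAGE_PATTERN = re.compile(
    r"^https?://comics\.example\.com/comic/(\d{4})/(\d{2})/(\d{2})/?$"
)
IMAGE_TEMPLATE = "http://comics.example.com/images/{year}/{year}{month}{day}.jpg"


def inline_comic(item):
    """Return a copy of a feed item whose description is an <img> tag.

    `item` is a plain dict with "link" and "description" keys (an assumed
    shape, not the Pipes data model). Items whose link does not look like
    a comic page, such as news posts, are returned unchanged.
    """
    match = PAGE_PATTERN.match(item["link"])
    if match is None:
        return dict(item)

    # Parse the date out of the page URL and rebuild the image path from it.
    year, month, day = match.groups()
    image_url = IMAGE_TEMPLATE.format(year=year, month=month, day=day)

    rewritten = dict(item)
    rewritten["description"] = '<img src="{}" />'.format(image_url)
    return rewritten
```

Calling inline_comic on every item in a parsed feed and writing the results back out would give roughly the same effect as the pipe, as long as the site keeps its image paths predictable.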

Anyway, the end result is that I now have inline comics in my Penny Arcade and Control-Alt-Del RSS feeds! You can add them to your RSS reader below, or take a look at how they were made.

  • [Pipe] [RSS] Penny Arcade News and Inline Comics
  • [Pipe] [RSS] Penny Arcade Inline Comics
  • [Pipe] [RSS] Control-Alt-Del Inline Comics

11 Responses to Pipe into your netlife: Comic feeds

  1. andy grover April 1, 2007 at 6:09 PM

    Thanks a lot! BTW I put together a feed of Dr. Fun (no longer published, alas) via the URL listed above. I just used a python script called from cron — I’m definitely going to have to check out Pipes, it seems pretty handy.

  2. Scott Perry April 1, 2007 at 8:47 PM

    Wow, that is terrifically neat. Going to have to play around with this later.

  3. Simon April 1, 2007 at 8:59 PM

    Some artists specifically discourage RSS feeds from linking directly to their comics, on the basis that they’re losing the advertising revenue that helps pay bandwidth costs. Do either of these comics have such a policy?

  4. ChipX86 April 1, 2007 at 9:42 PM

    Simon: Good question, and I’m not sure. If they complain, I’ll take it down. I think it would be beneficial to them, however, to have inline comics *and* inline ads in a feed provided by them. I certainly wouldn’t mind seeing both if it meant getting to see the comic. As it is, I almost never look at the comics because it means an extra click, and sometimes that’s all it takes.

  5. Ben April 3, 2007 at 3:26 AM

    Ever tried feed43.com? You give a couple simple regex-like things and it scrapes HTML. Surprisingly easy (no, I am not being paid to say this).

    E.g. http://feed43.com/4227712333774324.xml

  6. bkudria April 3, 2007 at 8:22 AM

    Nice work. I’m trying to duplicate your work, but with User Friendly (http://userfriendly.org) – I can’t get it to inline, because the author stores the images in a directory that includes the name of the month in the URL. (for example, today’s cartoon: http://www.userfriendly.org/cartoons/archives/07apr/xuf010203.gif). Any ideas how to make this work?

  7. tyler April 5, 2007 at 2:11 PM

    I’ve been doing the same thing with Greasemonkey for a few months now. It works great.

  8. Vent June 2, 2007 at 12:07 AM

    Just a question… how does the buy and sell domain over at Penny Arcade work? Who does the domain appraisal?
    Thanks

  9. Bertrand Fan February 5, 2008 at 10:30 AM

    I guess they changed it a little bit, so here’s a new pipe I made for Penny Arcade Inline Comics
