Amateur Topologist

Everything but topology.

Tag: the internet

Preserving Links Across Reorganization

Link rot is the scourge of the modern Internet. It’s incredibly frustrating to find a link to something that looks really interesting, or that looks like it might solve all of your problems, only to have the link be dead. Sometimes it’s because the content is just gone; there’s really no getting around that. But sometimes it’s because the site it was hosted on reorganized their structure in such a way that the data’s still there, but it can’t be reached at that old URL. If the site has a built-in search function, then you might be able to find it that way; you also might be able to use Google’s site:example.com domain restriction (or find it on another site that so graciously copied the entire text). But that still involves extra effort.

So what can you, as a site owner, do to minimize link rot? The answer is to make sure that all your old URLs continue to function. Here’s an example: I recently moved all the files for this blog from /var/www to /var/www/blog, to keep my directory structure neat. I then mucked around in WordPress to tell it the new file location, set up /etc/apache2/sites-enabled/000-default properly, and everything was working fine. But the old incoming links were now broken! Fortunately, mod_rewrite exists, and the breakage is fixable with a few simple rules in /var/www:

RewriteEngine On
RewriteBase /
RewriteRule ^((\d\d\d\d|tag|feed|category).+)$ /blog/$1 [R]
RewriteRule ^$ /blog [R]

The first two lines here are standard mod_rewrite boilerplate. The third and fourth do the actual interesting rewriting. The regex in the third line matches any URL whose path (the part after amateurtopologist.com/) starts with four digits or with the words ‘tag’, ‘feed’, or ‘category’, and prepends /blog/ to it. So, for example, http://www.amateurtopologist.com/2010/12/01/some-useful-approximations/ will get rewritten into http://www.amateurtopologist.com/blog/2010/12/01/some-useful-approximations/; both of those URLs do work and will send you to the same post. The [R] indicates that Apache should serve a redirect instead of transparently rewriting the request; I did this so that people would use the new URLs in case I ever have to actually break the old ones. Finally, the last line rewrites the empty request into /blog (again using a redirect).
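To see which old paths the two rules catch, here is a minimal Python sketch that mimics them. It is only an approximation of what Apache does, not a replacement for testing on the server: it assumes the rules live in a per-directory file such as .htaccess (so the leading slash is stripped before matching), and the function name redirect_target and the /tag/the-internet/ and /about/ sample paths are made up for illustration.

```python
import re

# The pattern from the third RewriteRule, copied verbatim.
OLD_PATH = re.compile(r"^((\d\d\d\d|tag|feed|category).+)$")


def redirect_target(path):
    """Return the redirect location for a request path, or None if no rule fires.

    Hypothetical helper: it imitates the two RewriteRule lines, nothing more.
    """
    # Assumption: per-directory context, where mod_rewrite matches against
    # the path without its leading slash.
    relative = path[1:] if path.startswith("/") else path

    # Third line: RewriteRule ^((\d\d\d\d|tag|feed|category).+)$ /blog/$1 [R]
    match = OLD_PATH.match(relative)
    if match:
        return "/blog/" + match.group(1)

    # Fourth line: RewriteRule ^$ /blog [R]
    if relative == "":
        return "/blog"

    return None


# Sample paths; the tag and about paths are illustrative, not real URLs.
for old in ["/2010/12/01/some-useful-approximations/", "/tag/the-internet/", "/", "/about/"]:
    print(old, "->", redirect_target(old))
```

A path that already starts with blog/ matches neither pattern, so the redirected request is not redirected a second time.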

The net result here is that pretty much every single URL that used to work will still work, but I can now keep my blog and my other stuff in separate directories. A few URLs broke because I’m too lazy to fix them (search URLs and the about page), but nothing that people were likely to actually link to.

Facebook’s privacy settings: another look

Much has been made in the news recently about Facebook’s relative lack of privacy controls, and the degree to which they’re hidden and made unintuitive to use. Naturally, people have been speculating about why they do this. The intuitive answer, and the one that I’ve heard a lot of people claim, is that it allows them to sell data to advertisers; the reasoning goes that they can’t sell your data to third parties and make money off of you if your profile settings are all set to private. But as far as I know this isn’t the case. I’ve got all my Facebook data pretty locked down: the only information you can see about me if you’re not my friend is my name and a picture of me, which I figure is enough to allow people I know to friend me while being certain that they’re getting the right me. Yet I still get ads that I know are targeted to me because they mention my college, my location, etc.

Instead, I suspect that the real money that Facebook makes off of people who don’t keep their information private comes from advertising impressions. A lot of people, when they want to find out more about someone, will immediately check Facebook to see if they have a profile, and if they do, they’ll spend a while ‘stalking’ them on it. If the ‘stalkee’ has their information private and the stalker isn’t friends with them, then the stalker has to send a friend request, which means that most of the time they’ll back off. But if the stalkee’s information is public, then the stalker can spend large amounts of time looking at it. That means lots of pageviews and lots of advertisements being displayed to them, which means more money for Facebook. And I think that’s one point that a lot of people miss when they write about Facebook: not only does Facebook want you to put all of your information on there so they can sell it, they want you to put your information on there so that other people will spend time on their site looking at it.

Managing Online Identity

Like anybody who’s even looked at a computer in the past few years, I have an ‘online identity’. I have accounts on Twitter, Facebook, Github, Wikipedia, Formspring, Steam, AIM, MSN, and more forums and IRC servers than I can honestly remember at this point. But obviously I don’t want all of those to be connected; I don’t want the people who know me on a forum I post on to be able to figure out my real name and where I live. Similarly, I don’t want my Twitter account to be associated with my real name, because I post things to it that I wouldn’t want an employer to see. But I’ve messed up; I’ve used this site in the past to host random images, and I can’t remove that reference because other people have quoted those posts. I’ve linked to this site on my Twitter account in the past, before I used my real name.

Even if I didn’t use my name here, and I changed my username on the site from ‘phurst’, you could still figure it out; I’ve linked to my Github account, and my commits have my real name in them. I could remove those posts, but then I couldn’t write about sizeable code projects that I’ve made, since I tend to code under my real name. Plus, having a site full of respectable content come up as the first Google result for your name that’s actually you is good.

I’m especially concerned given the interconnected nature of the Internet and the growing ability of search engines to discern semantic content from websites, as opposed to simply statically indexing text. I can almost guarantee that a human-like intelligence, processing in parallel everything I’ve written under one name or another, will connect most of my online identities together.

On the other hand, I’m not the only person with online secrets to hide, especially with growing Facebook use. I don’t have photos of me drinking while underage, or status updates about cheating on tests or stealing or anything. Maybe the growing all-seeing eye of the Internet won’t hurt me as badly as it hurts everybody else. For now, I’ll continue not linking anything to anything else unless I’d be fine with everybody ever knowing that the two are one and the same.

MS Paint Adventures: one of the first true webcomics

This blog post contains minor spoilers for Homestuck.

The phenomenon of the webcomic is not exactly new by any means; Sluggy Freelance, one of the oldest still-running webcomics, is 12 years old, only slightly younger than widespread availability of the Internet. So what do I mean when I say that MS Paint Adventures is one of the first true webcomics? It’s one of the few (that I’m aware of, of course) that actually uses the full potential of a comic that takes place online.

The central thing that makes MS Paint Adventures unique among comics, to the best of my knowledge, is that the story is in large part driven by the fans. Andrew Hussie, the author, has stated in an interview that he does not plan out the direction the plot will take in advance; although he has some overall ideas of the direction he wants it to go, he lets the reader suggestions dictate it to a larger extent than essentially any other webcomic, or indeed any other form of serial storytelling. What makes this practical is the MSPA forum; while something like it could happen in a world without the Internet, an online forum allows other users to voice their approval for options that they might otherwise not have thought of, giving Hussie the ability to judge what the readers want. But even elements of discussion that are not necessarily suggestions can generate plot points; the apocalyptic nature of the story was originally unplanned and came about at least partly as a result of forum discussion about the posters for apocalypse-themed movies on the walls of one of the main characters.

The various Flash animations and other non-static content that Hussie occasionally uses to enhance the story are another element of MSPA that fundamentally would not work in a traditional print comic. Although they do not necessarily advance the story any better than a series of corresponding still images would, they make the story more enjoyable and immersive. The end-act flashes are probably the quintessential example of this, reminding the reader of the various active story threads while providing a bit of progression in each of them; it’s far more effective than a series of still panels could ever hope to be. And the Flash animations and games not only could not be delivered without the Internet, they positively require the high-bandwidth connections of today’s Internet infrastructure. The soundtracks to the Flash animations are also a key part of the overall ‘experience’, even though Andrew Hussie himself does not compose them; the collaboration between artist and composer is only possible through the Internet.

Although MSPA is certainly not the first comic to use the Internet, or even the first one to use it as more than a medium for publishing images (other comics have certainly had associated discussion forums), I believe that it’s the first one to truly use the full potential of the Internet. And while that doesn’t necessarily make it better than other ones, it definitely moves it from the realm of good to that of great.

Note that I have said that it is one of the few true webcomics, not the only one. The only other one that comes to mind is Kid Radd, which makes extensive use of animation, especially in the later strips, where almost every ‘panel’ is a three-second animation. But it didn’t use Flash, or user suggestions, so as good as it is, I don’t consider it to use the Internet in the same way that MS Paint Adventures does.