Amateur Topologist

Everything but topology.

Some useful approximations

As much as I hate to admit it, mathematicians tend to deal with approximations. A lot of the time, a formula is just too complicated to work with in full, and you have to simplify it. So here are some handy approximations, along with roughly where they’re valid.

  • \ln (1+x) \approx x, e^x \approx 1+x for |x| \ll 1
  • (1+x)^{n} \approx 1+nx for |nx| \ll 1, and in particular \sqrt{1+x} \approx 1+x/2 and (1+x)^{-1} \approx 1-x
  • \sin x \approx x, \cos x \approx 1 for |x| \ll 1 (with x in radians)
  • (1+\frac{a}{n})^{bn} \approx e^{ab} for n \gg a,b, especially for a,b < 1
  • n! \approx \sqrt{2 \pi n} \left(\frac{n}{e}\right)^n
  • H_n \approx \ln n + \gamma, where \gamma \approx 0.577 is the Euler-Mascheroni constant
  • \pi(n) \approx n / \ln n, where \pi(n) is the number of primes less than n
  • p(n) \approx n \ln n, where p(n) is the nth prime number.
  • 2^{10} \approx 10^3
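
If you want to see how good these are, a quick numerical check helps. The following is a rough Python sketch, not part of the original list; the sample values (x = 0.01, n = 20, n = 1000, a = b = 0.5) are arbitrary choices of mine, and the two prime-counting approximations are left out because they converge much more slowly.

```python
import math

def rel_err(approx, exact):
    """Relative error of approx with respect to exact."""
    return abs(approx - exact) / abs(exact)

x = 0.01  # arbitrary "small" value for the |x| << 1 approximations

checks = {
    "ln(1+x) ~ x": rel_err(x, math.log1p(x)),
    "e^x ~ 1+x": rel_err(1 + x, math.exp(x)),
    "sqrt(1+x) ~ 1+x/2": rel_err(1 + x / 2, math.sqrt(1 + x)),
    "sin x ~ x": rel_err(x, math.sin(x)),
    "(1+a/n)^(bn) ~ e^(ab)": rel_err(math.exp(0.5 * 0.5),
                                     (1 + 0.5 / 1000) ** (0.5 * 1000)),
    "Stirling, n=20": rel_err(math.sqrt(2 * math.pi * 20) * (20 / math.e) ** 20,
                              math.factorial(20)),
    "H_n ~ ln n + gamma, n=1000": rel_err(math.log(1000) + 0.5772156649,
                                          sum(1 / k for k in range(1, 1001))),
    "2^10 ~ 10^3": rel_err(10 ** 3, 2 ** 10),
}

for name, err in checks.items():
    print(f"{name:30s} relative error {err:.2e}")
```

With these sample values every relative error comes out under 3%, and most are far smaller; the loosest one is 2^{10} \approx 10^3.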

What do you think every mathematician should know?

Facebook’s privacy settings: another look

Much has been made in the news recently about Facebook’s relative lack of privacy controls, and the degree to which they’re hidden and made unintuitive to use. Naturally, people have been speculating about why they do this. The intuitive answer, and the one that I’ve heard a lot of people claim, is that it allows them to sell data to advertisers; the reasoning goes that they can’t sell your data to third parties and make money off of you if your profile settings are all set to private. But as far as I know this isn’t the case; I’ve got all my Facebook data pretty locked down; the only information about me if you’re not my friend is my name and a picture of me, which I figure is enough to allow people I know to friend me while being certain that they’re getting the right me. Yet I still get ads that I know are targeted to me because they mention my college, my location, etc.

Instead, I suspect that the real money that Facebook makes off of people who don’t have their information private is off advertising impressions. A lot of people, when they want to find more about someone, will immediately check Facebook to see if they have a profile, and if they do, they’ll spend a while ‘stalking’ them on it. If the ‘stalkee’ has their information private and the stalker isn’t friends with them, then they have to send a friend request, which means that most of the time they’ll back off. But if the stalkee’s information is public, then the stalker can spend large amounts of time looking at their information. Which means large amounts of pageviews, and large amounts of advertisements being displayed to them, which means more money for Facebook. And I think that that’s one point that a lot of people miss when they write about Facebook; not only do they want you to put all of your information on there so they can sell it, they want you to put your information on there so that other people will spend time on their site looking at your information.

Managing Online Identity

Like anybody who’s even looked at a computer in the past few years, I have an ‘online identity’. I have accounts on Twitter, Facebook, Github, Wikipedia, Formspring, Steam, AIM, MSN, and more forums and IRC servers than I can honestly remember at this point. But obviously I don’t want all of those to be connected; I don’t want the people who know me on a forum I post on to be able to figure out my real name and where I live. Similarly, I don’t want my Twitter account to be associated with my real name, because I post things to it that I wouldn’t want an employer to see. But I’ve messed up; I’ve used this site in the past to host random images, and I can’t remove that reference because other people have quoted those posts. I’ve linked to this site on my Twitter account in the past, before I used my real name.

Even if I didn’t use my name here, and I changed my username on the site from ‘phurst’, you could still figure it out; I’ve linked to my Github account, and my commits have my real name in them. I could remove those posts, but then I couldn’t write about sizeable code projects that I’ve made, since I tend to code under my real name. Plus, having a site full of respectable content come up as the first Google result for your name that’s actually you is good.

I’m especially concerned given the interconnected nature of the Internet and the growing ability of search engines to discern semantic content from websites, as opposed to simply indexing static text. I can almost guarantee that something with human-like intelligence, processing in parallel everything I’ve written under one name or another, will connect most of my online identities together.

On the other hand, I’m not the only person with online secrets to hide, especially with growing Facebook use. I don’t have photos of me drinking while underage, or status updates about cheating on tests or stealing or anything. Maybe the growing all-seeing eye of the Internet won’t hurt me as badly as it hurts everybody else. For now, I’ll continue not linking anything to anything else unless I’d be fine with everybody ever knowing that the two are one and the same.

A Haskell Newbie’s Guide to Text.JSON

JSON parsing is practically required for any modern language to be able to interface with web-based applications; most of them offer JSON as a reply format, and the alternative (usually XML) can be cumbersome to work with. But Haskell and its strong type system seem like they’d be extremely ill-suited to parsing JSON; an object’s values can be arbitrary JSON objects, including different types in the same object, and arrays can contain different types of objects. How do you deal with this? Well, if you’re going to be parsing JSON, then you have to have some kind of format that you’re expecting; you know that you’ll get, say, an array of objects each of which has a specific key that has an integer as a value, and that integer is all you care about. Fortunately, Text.JSON exists, and once you get your head around how to use it, it’s simple. (Then again, so are many things in Haskell, but that doesn’t help me understand Arrows any better!)

So, let’s look at an example: getting a user’s public timeline from Twitter, and turning it into an array of Status values. We can ignore the process of actually sending Twitter the request and pulling the JSON out of the response, and for the sake of brevity we can ignore a bunch of the irrelevant data that Twitter returns, such as whether the tweet is a reply, the source, etc. Here’s the result of asking for the last 2 tweets from the official Twitter account, with irrelevant detail stripped:

[
    {
        "user": { "screen_name": "twitter" },
        "text": "Read this guest post from @FiresideInt on his experience in Haiti http:\/\/t.co\/mbMU56R"
    },
    {
        "user": { "screen_name": "twitter" },
        "text": "Do you use Twitter for a business, school, community group or another local organization? Follow @TwitterBusiness for tips and useful info!"
    }
]

So, what we have here is a list of objects; each object has a user attribute whose screen_name is the actual username; the object’s text attribute then contains the actual text of the tweet. Let’s get to work.

Making the JSON value

It’s helpful to play around with whatever we’re trying to manipulate in ghci; so let’s load up the JSON into a String, json, and try to parse it:

Prelude Text.JSON> json <- readFile "/Users/phurst/json"
Prelude Text.JSON> decode json

<interactive>:1:0:
    Ambiguous type variable `a' in the constraint:
      `JSON a' arising from a use of `decode' at <interactive>:1:0-10
    Probable fix: add a type signature that fixes these type variable(s)

Why doesn’t this work? Well, decode has type (JSON a) => String -> Result a. So you can decode a string of JSON into anything in the JSON typeclass. Well, JSValue is in the JSON typeclass, so let’s try that:

Prelude Text.JSON> decode json :: Result JSValue
Ok (JSArray [JSObject (JSONObject {fromJSObject = [("user",JSObject (JSONObject {fromJSObject = [("screen_name",JSString (JSONString {fromJSString = "twitter"}))]})), (rest of line omitted)

Wordy. But we did get a successful parse (as indicated by the Ok); we then have a JSArray which contains the two JSONObjects. But in order to get at the things inside the JSArray, we’d have to manually remove the constructor via (\(JSArray x) -> x) or something. And then if we didn’t get an array (because we got an error!), we would get an unfriendly “Non-exhaustive patterns in lambda” exception. So, what do we do? Well, we’re trying to get a list of values, so let’s ask decode to give us one:

Prelude Text.JSON> decode json :: Result [JSValue]
Ok [JSObject (JSONObject {fromJSObject = [("user",JSObject (JSONObject {fromJSObject = [("screen_name",JSString (JSONString {fromJSString = "twitter"}))]})), (rest omitted)

Awesome! We have an array of JSObjects. But again, we’re still stuck inside that JSObject ‘wrapper’. This annoyed me for a while, until I realized that JSObject is both a data constructor in the JSValue type, and its own type! So we can ‘ask’ decode to give us a list of JSObjects:

Prelude Text.JSON> let decoded = decode json :: Result [JSObject JSValue]
Prelude Text.JSON> decoded
Ok [JSONObject {fromJSObject = [("user",JSObject (JSONObject {fromJSObject = [("screen_name",JSString (JSONString {fromJSString = "twitter"}))]})), (rest omitted)

Now we’re in business. We have a list of JSObjects, which is as good as we’re going to get. Now we need to actually deal with getting data out of them. It’s a good idea to split the ‘parse an individual item’ logic off into its own function, which I’ll call makeStatus.

Writing makeStatus, dealing with nested objects

Now, we could call fromJSObject :: JSObject e -> [(String, e)], search through the pairs, and then deal with them manually. But that’d be messy, and we wouldn’t get error handling if for some reason Twitter mysteriously didn’t give us a “user” object. Instead, we should take advantage of the function valFromObj :: JSON a => String -> JSObject JSValue -> Result a, and the fact that Result is a monad. Together, these mean that we can write simple code:

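-- deriving Show lets ghci print the parsed result
data Status = Status { user :: String, text :: String } deriving Show

makeStatus :: JSObject JSValue -> Result Status
-- (!) is defined with explicit arguments so it stays polymorphic; written as
-- (!) = flip valFromObj, the monomorphism restriction pins it to one result type
makeStatus tweet = let obj ! key = valFromObj key obj in do
    userObject <- tweet ! "user"         -- the nested "user" object
    user <- userObject ! "screen_name"
    text <- tweet ! "text"
    return Status {user = user, text = text}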

Here, I’ve defined (!) for brevity’s sake, then used the Monad instance of Result to ‘chain together’ my calls to (!). At the end, I wrap the user and text into a Status value, then ‘return’ it back into the Result monad. Let’s see it:

Prelude Text.JSON> let tweet = (\(Ok x) -> x) decoded !! 0 -- just to get a non-monadic JSObject for now
Prelude Text.JSON> makeStatus tweet
Ok (Status {user = "twitter", text = "Read this guest post from @FiresideInt on his experience in Haiti http://t.co/mbMU56R"})

We’ve successfully parsed a status update into its components! If we cared, we could pull more information out of the original object, such as real name, avatar URLs, time of posting, etc. But first, we have a more interesting problem: how do we join these two together?

Combining decoding and parsing

Look at the type for our decoding function and for our parse function:

Prelude Text.JSON> :t \json -> decode json :: Result [JSObject JSValue]
\json -> decode json :: Result [JSObject JSValue]
  :: String -> Result [JSObject JSValue]
Prelude Text.JSON> :t makeStatus
makeStatus :: JSObject JSValue -> Result Status

Clearly, we’d like it if we could put those in one function with little effort. So, let’s abstract out the specifics: we have functions f :: a -> m [b] and g :: b -> m c. We want h :: a -> m [c]. It’s obvious we want to map g, but a normal map isn’t right, since we’d get a function of type [b] -> [m c]; we want the monad to be outside the list! So we use the monadic map, mapM g :: [b] -> m [c].

Now, we have our two functions, and we could use do-block syntax to combine them:

parseTimeline json = do
    decoded <- decode json :: Result [JSObject JSValue]
    mapM makeStatus decoded

But do-block syntax isn’t terribly ‘Haskell-y’, and there’s only two lines of it. Surely there’s a way to combine them! And if we use Hoogle and search for (a -> m b') -> (b' -> m c') -> (a -> m c') (using primes to represent lists), we find that (>=>) in Control.Monad does exactly what we want! So we can rewrite it as a one-liner:

import Control.Monad ((>=>))  -- Kleisli composition

parseTimeline :: String -> Result [Status]
parseTimeline = decode >=> mapM makeStatus

Two things to notice: first, the explicit type declaration for decode isn’t necessary. In fact, it wasn’t necessary in the above version either. This generally happens once you’ve actually written the processing part of your JSON handler; the type signature of the processor forces the types in the parser! Second, the function is now pointfree; there’s no point in including the argument, so we might as well omit it.

Errors

Now, there’s one problem here that I haven’t addressed. While we do get error handling for ‘free’, since monadic handling of Result values will pass through errors, the errors are typically unhelpful; they only show what failed to parse. And if we failed to parse the initial array, for example, that probably means that Twitter gave us an error message instead, and we’d like to know what it says! Finally, our output is stuck in the Result monad until we get it out.

The solution here is to write an error handler combinator; it takes a processor and a JSON string, and tries to process it; if the result pattern-matches against Ok x, then we parsed successfully; if it doesn’t, then we parse it again looking for the error message and handle that according to however your program deals with errors.

For my part, I kind of cheat; I don’t use the JSON in the error combinator, I use the raw HTTP response, and if the parse fails, I throw an exception according to the status code returned. But I eventually do plan to actually grab the error message, which will simplify the control flow of the library and give me more specifics on what went wrong.

This entire post was inspired by my experiences writing askitter, a Haskell Twitter library using OAuth for authentication. Most of the ‘final’ code in here is copied straight from there, aside from various wrapping/unwrapping functions. It’s been very useful for learning lots of things; it’s been suggested that I ‘hide’ the fact that Twitter only gives you chunks of 20 tweets at a time by using enumerators; when/if I get around to that, I’ll write an explanation.

Of Spriggans, Stabbing, and Skeletons

So, recently I’ve been playing a lot of Dungeon Crawl Stone Soup (or just Crawl). I’m not terribly good at it; my record depth is the bottom of the Snake Pit (Snake:5, since it’s the 5th level of Snake), which is about… a third of the way through a minimal run. And usually I die a lot earlier than that, on the second or third floor. In fact, as I wrote this, I played a quick game that wound up with me dying on D:2 (second floor of the dungeon).

A bit of background is necessary: Crawl is a member of a family of games known as roguelikes (after the original, rogue), which share these characteristics:

  • Permanent death. You die, you lose; if you save your game, you quit right afterwards, so it’s only useful to take a break. You can copy the save folder, but that’s cheating.
  • Randomly-generated dungeons. There are patterns, such as special levels that always have one of a few predefined patterns, but the majority of levels are randomly-created; when you go up or down stairs, you have no clue what you’re going to get.
  • ASCII. The entire interface is ASCII, though most modern ones use Unicode for drawing elements of the dungeon. Different colors mean different things; for example, in Crawl, an o in one color is an orc, which is fairly easy to kill, while an o in another color is an orc priest, which has a decent chance at killing low-level characters. And if you see an orc warlord (yet another color of o), run the hell away.
  • Unidentified magic items. One game, a bubbly potion might be a potion of heal wounds. Another game, it might be a potion of paralysis. There’s no way to know until you cast identify on it or you use it. And scrolls of identify tend to be in semi-limited supply, so you can’t use them all on your potions/scrolls/amulets.

So what’s the appeal? At first, it doesn’t seem like there would be one. Permanent death basically ensures that you’ll be replaying the first few minutes of the game over and over, or maybe the first hour or so once you get good at it. And the high level of randomness means that you could easily die due to sheer unluckiness. And you can’t really use the knowledge you got during that run for your next attempt, since all the potions and scrolls and such will be different the next time around.

[Image caption: They'll never see me coming.]

Except not. Crawl has over 50 different good race/class combinations which play differently, from the ‘sleep and stab’ Spriggan Enchanter to the ‘berserk and kill everything’ Kobold Berserker or the ‘summon and spam Pain’ Deep Dwarf Necromancer. And that’s not even counting the different gods that can be worshipped; most class/races will play well with one of at least two gods. If I play a bunch of games as one type, I can just switch to another, or worship a different god. The relatively quick power progression helps too; it feels good killing huge groups of enemies that 20 minutes ago would’ve absolutely slaughtered you; conversely, it can be exciting running into an enemy that you know is going to absolutely wreck you unless you get out of there now. And over the many hundreds of games I’ve played, I’ve learned lots of things; how to successfully run away from enemies until I can get the MP back to blast them into oblivion, which enemies I can kill as soon as I see them (giant newts) and which I should stay the hell away from (most unique enemies, ogre mages, trolls).

Another large part of it is training myself to not fight everything I see ever. Crawl has two beautiful examples of this: the oklob plant and Sigmund. The oklob plant is a rather nasty monster that can spit armor-corroding acid for a good portion of your health a turn. But they’re plants; they can’t move. And if you’re using the auto-explore feature (which you really, really should!) then you stop moving as soon as you see one, and you won’t auto-move into range of one. So they’re basically a bravado trap for people who try to kill everything as soon as they see it. Sigmund is a powerful unique monster typically encountered early on; his purpose is basically to teach you that there are some monsters that you should just run away from, straight to the nearest stairs.

There’s also a social aspect to the game; you can play Crawl on one of three servers (CAO, CDO, and RHF), and let other people watch your game; deaths, ascensions, and other important milestones are also announced in the ##crawl channel on Freenode. You can also interact with (read: kill or be killed by) the ghosts of other players, who have spells and stats based on their status when they were killed. Plus you can ask people for advice, or ask the various stats bots questions like ‘what gods do deep elf wizards usually worship?’ or access the user-created database of information (<Henzell> warg[2/2]: Also, “warg” is the sound that you emit whilst being mangled by a warg).

Finally, one of the key properties of roguelikes is that every single death should be fundamentally your fault. Granted, a death is less pleasant if you know that it’s due to your stupid mistake, but that also means that you’re not going to be doing really well and then just die. Crawl is good at this (can you tell I like it yet?); I’d say about 90% of my deaths are my fault. The other 10% are early-game stuff where I happen to run into a pack of gnolls on the first level, or where a snake’s poison lasts for longer than it normally does and I don’t happen to have a potion of healing or heal wounds on me.

So go play some Crawl, have some fun, lose 20 SpEns to snakes on the first floor. Trust me, it’s worth it when you get farther than you’ve ever been and are one-shotting enemies that gave you so much trouble just a half hour before. Until something else that you haven’t seen before shows up and kills you because you’ve let your seeming invulnerability go to your head.

URL bar marquees!

Remember the heady days of the late 1990s, when every Geocities webpage would have a script that displayed a marquee in the status bar? Well, thanks to the magic of HTML5, you can revisit them with a twist: instead of marqueeing in the status bar, you can do it in the URL bar without the annoyance of having your history be clogged with a bunch of shit. How does it work? If you’ve ever browsed someone’s YouTube channel, you’ve probably noticed that the URL that you get in the address bar changes each time you click a video. But the page doesn’t refresh; this is because the part that’s changing is actually the anchor element, which comes after the #. This has the downside of potentially interacting poorly with your browser history or other things. HTML5 instead provides pushState and replaceState, which basically allow you to rewrite the displayed URL to any page on the same domain without actually forcing a page reload or whatever. They also allow you to carry around titles for the various states (different from the <title> part of an HTML document) and actual state data. Flickr uses this to great effect; if you go to, say, this photo and click the ‘This photo also appears in’ tabs, you’ll see the URL being rewritten without a page load. Note that as of now this only works in Firefox 4, Safari, and Chrome.

But that’s not what this post is about. It’s about marquees! If you use JavaScript, you can create a marquee in the URL bar. It has the disadvantage of everything being URL-encoded, so no fancy Unicode characters, or even spaces. It also means that unless your webserver is set up right, hitting refresh will take the user to a nonexistent page. It’ll also annoy the ever-loving fuck out of your users since they won’t be able to type in the address bar before your code rewrites it. But who needs that anyway when you have glorious scrolling text?
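
The scrolling itself is just a sliding window over the message, with each window URL-encoded before it goes into the address bar. Here is a minimal Python sketch of that frame-generation step only; marquee_frames and the 8-character window width are names and values I made up for illustration, and the browser side (a timer handing each frame to replaceState) is not shown.

```python
from itertools import islice
from urllib.parse import quote

def marquee_frames(message, width=20):
    """Yield successive URL-encoded windows of `message`, scrolling forever."""
    padded = message + " " * width   # blank gap before the text repeats
    doubled = padded + padded        # lets a window wrap around the end
    offset = 0
    while True:
        # quote() percent-encodes spaces and punctuation, as a URL requires
        yield quote(doubled[offset:offset + width])
        offset = (offset + 1) % len(padded)

# First few frames for a short message and an 8-character window
frames = list(islice(marquee_frames("Hello, world!", width=8), 3))
for frame in frames:
    print(frame)
```

The printed frames show the encoding problem directly: the comma and the space each turn into a three-character percent escape, so the text in the URL bar jumps around in width as it scrolls.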

Logging in to crawl.akrasiac.org quickly

I’ve been playing a lot of Dungeon Crawl Stone Soup lately, which is a roguelike like Nethack, only it’s under active development and you tend to die to your own mistakes more often than to the random number generator. (If you don’t know what a roguelike is, it’s a turn-based game with mostly randomly-generated dungeons and items, and with permanent death; you automatically quit after you save, so it’s only useful for taking breaks.) I play almost exclusively on crawl.akrasiac.org (or CAO), which is like the nethack.alt.org of Crawl; other people can watch you play, there are bots that announce deaths and other important events in ##crawl on the freenode IRC network, and it has a dgamelaunch frontend for logging in to your character, setting options, etc.

Unfortunately, due to the way that CAO is set up, there’s no built-in method for automatically logging in as your character. This isn’t exactly the most horrible inconvenience in the world, but it can be kind of annoying. So I set up a script:

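#!/usr/bin/expect
# Connect to CAO over ssh as the shared joshua account.
spawn ssh -C joshua@crawl.akrasiac.org
# Answer the ssh password prompt ("ssword" matches either capitalization).
expect "ssword" {
send "joshua\n"
}
# At the dgamelaunch menu, choose login and enter your own credentials.
expect "Login" {
send "l"
send "Username\n"
send "Password\n"
send "1p"
}
# Hand the session over to the keyboard.
interact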

Obviously, replace Username and Password with your own username and password, and if you already log in automatically when you ssh manually because you set up an RSA key, remove the expect "ssword" block. To use this, you need to install expect; if you’re playing on Linux, it’s probably in your distribution’s repositories. If you’re playing on OS X, then you need to install MacPorts and run sudo port install expect. If you’re playing on Windows via PuTTY, you’re out of luck (but you can still set up the RSA key to at least auto-login as joshua).

Once you’ve installed expect, open up a text file in your home directory or wherever, paste the script in, and run chmod 0755 /path/to/the/script. Then you can execute the script from the terminal like any other program, and it’ll automatically drop you into your last savegame (or the race/class choice screen).

Of Relativity, Balloons, and Automobiles

Suppose you’re driving a car on a perfectly level road at a perfectly constant velocity (because in physics problems, everything can be done perfectly unless the error is part of the problem). And suppose someone left a helium balloon inside the car, which is free to float around. Similarly, suppose someone left a ball resting on the floor of the car; since there are no jolts or other sudden accelerations, the ball is resting perfectly still. Now suppose you start accelerating forward. What happens to the ball and the balloon? Our physical intuition tells us that the ball will roll to the back of the car, because that’s what happens when we’re in an accelerating vehicle; things get pressed back. But what does the balloon do?

A lot of people, when presented with this, intuitively think that the balloon will move towards the back, just like the ball. If they’re physics-minded, they might say that the inertia of the balloon keeps it from wanting to change its velocity unless it’s forced to, and the only thing that can force it to do so is the back of the car. But if this were true, then general relativity would be wrong! Because according to general relativity (specifically, the equivalence principle), if you don’t look outside the car window or cheat in some other way, there’s no way to tell the difference between a uniform acceleration of 9.8 meters per second per second in the forward direction and an additional gravitational force of one g pulling you back. And when we think about it in terms of gravity, the answer becomes clear: balloons in air move against gravitational fields, so the balloon will move to the front of the car.

Hopefully, this has left you at least somewhat unsatisfied; the question of where the balloon moves has been answered, but not why it moves. For that question, consider why balloons move against gravitational fields. They move because of buoyancy, which is a result of the fact that pressure is greater when you go ‘deeper’ into a fluid (liquid or gas) in a gravitational field. So the upwards force as a result of pressure is greater than the downwards force, and the balloon rises until something stops it or the atmosphere gets so rarefied that it reaches zero buoyancy (or, more likely, the decreased atmospheric pressure causes it to pop!) But when you have an accelerating car, the air molecules, much like the rolling ball, will tend to ‘pile up’ in the back, causing greater pressure in the rear than in the front. And in general, this pressure gradient will be enough to cause a forward force on the balloon all the way to the front of the car.
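
The same argument can be written out in a few lines. This is a sketch under idealized assumptions that I’m adding here: the air in the car has settled into equilibrium, the balloon has volume V and average density \rho_{balloon} (lower than \rho_{air}), and \vec{a} is the car’s acceleration.

```latex
% Effective gravity felt inside the accelerating car
\vec{g}_{\mathrm{eff}} = \vec{g} - \vec{a}

% Air at rest in the car: pressure increases along effective gravity
\nabla p = \rho_{\mathrm{air}} \, \vec{g}_{\mathrm{eff}}

% Buoyancy (-V \nabla p) plus the balloon's effective weight
\vec{F}_{\mathrm{net}}
  = -V \nabla p + \rho_{\mathrm{balloon}} V \, \vec{g}_{\mathrm{eff}}
  = (\rho_{\mathrm{balloon}} - \rho_{\mathrm{air}}) \, V \, \vec{g}_{\mathrm{eff}}

% A tethered balloon therefore leans forward by an angle \theta from vertical
\tan \theta = \frac{a}{g}
```

Since \rho_{balloon} < \rho_{air}, the net force points opposite to \vec{g}_{eff}, which is up and forward. The ball, being denser than air, gets the opposite sign and rolls back.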

What to do when you lose your computer

I was recently thinking about security as a result of finding the hard copy of a PGP revocation certificate I had printed, when I realized: I had no clue what I should do in case the physical security of my computer was compromised (i.e., if it was stolen or went missing for an extended period of time). So I decided to take stock of how many secrets I have here and what the best way to render them useless or remote-erase them would be in case I lost it, as well as to make it hard for anybody who steals it to get any use out of the secrets before I can make them useless.

I store my passwords in a KeePass password database, encrypted with AES under a reasonably long passphrase; I have it set to require the passphrase if the window loses focus for more than 30 seconds. I then synchronize it using Dropbox, both to keep it available on my various computers and so I can download it if I’m on a new computer that I trust enough to log into stuff on. I also have my PGP private key on my laptop. Notably, there are several passwords I have that are not written down anywhere: my MIT Kerberos password, my Gmail password (most of my password reset e-mails would be sent there, so if it was compromised everything else would be too), the KeePass database password, and my Dropbox password. And of course there’s a password on my laptop, but I wouldn’t rely on that for anything beyond keeping someone from looking at my stuff while I’m temporarily out.

So what does that mean in case the laptop’s stolen? Step one is to go everywhere I know of that I can sign into using public key cryptography on this machine and delete the keys; fortunately, the only such machines at the moment are the Github remotes and my VM; I’d also force-disconnect any ssh sessions that I left open by killing the processes. After that’s done, I change my Gmail password in case I left myself logged in or cookied or something, and forcibly sign out all my other sessions using the link on the bottom. If the password’s been changed or it’s been more than a day or so since I last saw the laptop, I assume all my accounts are compromised. The third step is to back up the password database and then delete it from Dropbox; if the person who stole the laptop isn’t smart, the next time it connects to the internet Dropbox will delete the local copy. Interestingly, even if I change the Dropbox password, the computer will still have access to my files; if I want to disable syncing, I have to unlink the computer on the website (which does tell me when the last sync occurred). Then I change the password for my AOL and MSN accounts; MSN only lets you sign on from one place at a time, but AOL doesn’t, and I don’t know if it’s possible to force a logout. Finally, I’d probably revoke my public key; I know enough people in the strong set who could sign a new key that it wouldn’t be terribly difficult to get back in there. I have a printout of my revocation certificate for just this occasion.

Perl will never go away, ever

Perl was one of the first languages that I ever learned and actually truly did things with; it was the first language I ever wrote a nontrivial program in (a DES implementation that I have unfortunately lost the source code to, or else I would post it). The first program I ever wrote was something in BASIC that I don’t even remember; I seem to have blocked all memory of it, probably for the better. So I have a bit of a soft spot for Perl, and I still have some of the bad habits I picked up from it; since I didn’t use strict or -w, my code would likely be full of uninitialized variables and barewords. It’s a bad habit, and to this day I still have to be reminded occasionally that other languages, such as Python, do require variables to be defined before they’re used.

But Perl is old now, and I’ve mostly moved on to other languages, like Python. I like the object-orientation, the support for functional paradigms and other nice things like list comprehensions and lambda functions. I like not having to sigil all of my variables with $ or @ or %, I like being able to supply keyword arguments to my functions so that I don’t have to remember which weird order I decided to use, I like the sheer amount of fun things that you can do with object orientation combined with reflection, metaprogramming, and everything being a first-class object. And yet, I still think Perl will stick around for a while.

Why do I say that? Simple. I was talking with someone who had left in the middle of an online IRC-based role-playing game, and they had asked for chatlogs of what had happened after they left. I had them, since I run weechat in tmux (like irssi in screen, but better!) and so am in every IRC channel I’m in 24/7. But the question was: how could I pull out just the lines that were said while they were gone? And the answer was Perl. It turns out that the .. operator, which in a for loop or other situations where a list is expected produces a range (so (1..9) as a list produces the list (1,2,3,4,5,6,7,8,9)), does something completely different in a scalar context, like in the conditional of an if statement. Take the statement print if (/Person.*has quit/ .. /Person.*has joined/). Each time this statement is run, the conditional will evaluate to false, until the left-hand side evaluates to true. Then it’ll start evaluating to true, until the right-hand side evaluates to true, and then it’ll stop being true (but it’ll still be true for that evaluation, and only turn false the next time it’s evaluated!), etc., etc. So if this is in an implicit while loop running through the lines of a file, it’ll start printing when it sees a line saying Person has quit, including that line, then stop when they rejoin, but still print that line, and then it’ll stay quiet until it sees another quit line, etc. And the best part is, if you call perl with -n, you automatically get a while loop that assigns the current line of the file it’s reading from to $_, the implicit variable in the matching and print.

If I wanted to do that in something like Python, I’d have to manually set up the read loop, write a function to trawl through, build up regexp objects to match on, etc. And that’s fine for a piece of code I intend to maintain. But for a quick one-line script like this? Too much effort. All I need is perl -ne.
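
For comparison, here is roughly what the Python version might look like. It’s a sketch rather than a drop-in replacement: flip_flop is a helper name I made up, and the sample log lines are invented for illustration.

```python
import re

def flip_flop(lines, start, stop):
    """Yield lines from a match of `start` through the next match of `stop`, inclusive."""
    active = False
    for line in lines:
        if not active:
            if start.search(line):
                active = True
                yield line
                # Like Perl's two-dot form, test `stop` on the same line too
                if stop.search(line):
                    active = False
        else:
            yield line
            if stop.search(line):
                active = False

# Invented sample log for illustration
log = [
    "<Alice> hello",
    "-- Person has quit (timeout)",
    "<Alice> he missed this",
    "<Bob> and this",
    "-- Person has joined #game",
    "<Alice> welcome back",
]

quit_re = re.compile(r"Person.*has quit")
join_re = re.compile(r"Person.*has joined")
missed = list(flip_flop(log, quit_re, join_re))
for line in missed:
    print(line)
```

That’s about twenty lines to do what the Perl one-liner does in one expression, which is pretty much the point.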