Saturday, September 1. 2007Leaving NZ, again.
So, first a quick update on everything I've been leaving out on this blog: Waaaay back in February, I was offered a (phone) job interview with Google. Things proceeded well, and a while ago, I was made an offer of a job as a Site Reliability Engineer in Dublin, which I accepted. I resigned from my position in NZ, sorted out stuff for moving, and hopped on a plane.
I'm writing this entry from a Dublin internet cafe. I arrived here in Ireland mere hours ago, and I'm still exhausted and jetlagged, but I have to do something with the time until I can let myself try and sleep. Hence the sudden update. First, for those wondering, Hayley, the love if my life, is, alas, still back in NZ for the moment. At the moment it looks like she'll be joining me up here in about 6 weeks. When I arrived in Dublin, I was met by someone hired by Google, who drove me to my temporary accommodations provided for me for the next 30 days. Light commentary on the city was provided. Upon reaching my accommodation, I was handed off to someone else, who gave me the Grand Tour of my apartment for the next month. This place is... luxurious. Two double bedrooms, full bathroom (w/ shower and bath), ensuite in master bedroom, leather lounge suite and wide-screen plasma TV, fully equipped, new kitchen, etc. I have no idea how I'm supposed to use all that. Roughly, my plans for the next few weeks go like this: Today, Sunday: Try to recover from jetlag, explore a little of dublin. Spend lots of time in Internet cafes, and pine after Hayley. Curse the lack of connectivity that makes all of this a lot harder than it should be. Eat out a lot. Monday: First day at Google. Get inducted, get my laptop. Begin training. Hopefully, better connectivity ensues now I have a computer again. Attempt to open a bank account, get a cellphone account, and buy a bike, not neccessarially in that order. Tuesday: More training. Wednesday, Thursday, Friday, +3 more weeks: More training. Try and locate more permanent accommodations. Then, I get shipped down to Mountain View for 4 weeks' training there. Cue pictures of eyes held open with clamps, mass brainwashing, etc. Then, finally, we both fly back to Dublin, hopefully to our new apartment, and settle in to our new lives here. Right now, all that seems pretty damn intimidating. I'm sure it's the jetlag and sheer overwhelmingness of it talking at least a little. PS: This internet cafe rocks. It has Firefox! Tuesday, August 28. 2007I am teh famous, LOL!
LOLCode.NET was featured at TechED07, in a presentation by Nick Hodge, a "Professional Geek" for Microsoft. He's posted a video of the presentation on his blog.
Now I just need to convince them to integrate it into the next edition of VS... Sunday, June 3. 2007LOLCode.net - Now your LOLCats can use the CLR!
The past few days, I've been working on a .NET compiler for the LOLCode language.
LOLCode is an emerging esoteric (and hilarious) language based on the dialect used in LOLCats images. It's been siezed upon by a group of people (myself included, now), and is being expanded into a real, workable, turing complete esoteric language (though nobody has proven its turing completeness yet!). The LOLCode.NET compiler is now working, and as a nearly-free bonus for using the .NET platform, you can even debug it in Visual Studio: ![]() ![]() I still can't believe I'm looking at LOLCode and its x86 disassembly in Visual Studio. Tuesday, May 29. 2007Dynamic code generation in .NET
I've written up a rather lengthy document on different methods of dynamically generating code in .NET - two current methods, and one hypothetical one based on a C extension called "`c" (tickc). I think it does a fairly good job of illustrating just how awkward dynamic code generation is in current mainstream languages, and how much simpler and more understandable (and in many cases, more efficient) it could be.
Monday, April 2. 2007Damn Cool Algorithms, Part 1: BK-Trees
This is the first post in (hopefully) a series of posts on Damn Cool Algorithms - essentially, any algorithm I think is really Damn Cool, particularly if it's simple but non-obvious.
BK-Trees, or Burkhard-Keller Trees are a tree-based data structure engineered for quickly finding near-matches to a string, for example, as used by a spelling checker, or when doing a 'fuzzy' search for a term. The aim is to return, for example, "seek" and "peek" if I search for "aeek". What makes BK-Trees so cool is that they take a problem which has no obvious solution besides brute-force search, and present a simple and elegant method for speeding up searches substantially. BK-Trees were first proposed by Burkhard and Keller in 1973, in their paper "Some approaches to best match file searching". The only copy of this online seems to be in the ACM archive, which is subscription only. Further details, however, are provided in the excellent paper "Fast Approximate String Matching in a Dictionary". Before we can define BK-Trees, we need to define a couple of preliminaries. In order to index and search our dictionary, we need a way to compare strings. The canonical method for this is the Levenshtein Distance, which takes two strings, and returns a number representing the minimum number of insertions, deletions and replacements required to translate one string into the other. Other string functions are also acceptable (for example, one incorportating the concept of transpositions as an atomic operation could be used), as long as they meet the criteria defined below. Now we can make a particularly useful observation about the Levenshtein Distance: It forms a Metric Space. Put simply, a metric space is any relationship that adheres to three basic criteria:
The last of these criteria is called the Triangle Inequality. The Triangle Inequality states that the path from x to z must be no longer than any path that goes through another intermediate point (the path from x to y to z). Look at a triangle, for example: it's not possible to draw a triangle such that it's quicker to get from one point to another by going along two sides than it is by going along the other side. These three criteria, basic as they are, are all that's required for something such as the Levenshtein Distance to qualify as a Metric Space. Note that this is far more general than, for example, a Euclidian Space - a Euclidian Space is metric, but many Metric Spaces (such as the Levenshtein Distance) are not Euclidian. Now that we know that the Levenshtein Distance (and other similar string distance functions) embodies a Metric Space, we come to the key observation of Burkhard and Keller. Assume for a moment we have two parameters, query, the string we are using in our search, and n the maximum distance a string can be from query and still be returned. Say we take an arbitary string, test and compare it to query. Call the resultant distance d. Because we know the triangle inequality holds, all our results must have at most distance d+n and at least distance d-n from test. From here, the construction of a BK-Tree is simple: Each node has a arbitrary number of children, and each edge has a number corresponding to a Levenshtein distance. All the subnodes on the edge numbered n have a Levenshtein distance of exactly n to the parent node. So, for example, if we have a tree with parent node "book" and two child nodes "rook" and "nooks", the edge from "book" to "rook" is numbered 1, and the edge from "book" to "nooks" is numbered 2. To build the tree from a dictionary, take an arbitrary word and make it the root of your tree. Whenever you want to insert a word, take the Levenshtein distance between your word and the root of the tree, and find the edge with number d(newword,root). Recurse, comparing your query with the child node on that edge, and so on, until there is no child node, at which point you create a new child node and store your new word there. For example, to insert "boon" into the example tree above, we would examine the root, find that d("book", "boon") = 1, and so examine the child on the edge numbered 1, which is the word "rook". We would then calculate the distance d("rook", "boon"), which is 2, and so insert the new word under "rook", with an edge numbered 2. To query the tree, take the Levenshtein distance from your term to the root, and recursively query every child node numbered between d-n and d+n (inclusive). If the node you are examining is within d of your search term, return it and continue your query. The tree is N-ary and irregular (but generally well-balanced). Tests show that searching with a distance of 1 queries no more than 5-8% of the tree, and searching with two errors queries no more than 17-25% of the tree - a substantial improvement over checking every node! Note that exact searching can also be performed fairly efficiently by simply setting n to 0. Looking back on this, the post is rather longer and seems more involved than I had anticipated. Hopefully, you will agree after reading it that the insight behind BK-Trees is indeed elegant and remarkably simple. Thursday, September 21. 2006Serializing JavaScript objects with circular references
One problem we've been dealing with at work for a while is serializing JavaScript objects to a string representation, when objects can contain circular references. The simplest example of this is an object a, which contains one property (call it 'foo') which is a reference to a.
Mozilla provides a toSource() function on the Object class (and hence every object) which returns a serialized representation of that object, allowing it to be reconstituted with eval(). To handle circular references, any object or array can be tagged with a numeric identifier by prefixing it with "#x=" (where x is an integer), and a tagged object can be referenced with the syntax "&x;". Using this, any object graph can be serialized, and we can even make use of this system to compress the serialized text down further than Mozilla does on its own by using this syntax for all backreferences, instead of just circular ones. The example earlier, of an object referencing itself would serialize as "#1={foo:&1;}". All fine and dandy, but unfortunately, IE doesn't implement the toSource() function. I don't know about other browsers either - toSource() is part of Netscape's original Javascript documentation, but not part of the ECMAScript standard. So where does that leave us? We could use this syntax or something similar and simply do both serialization and deserialization manually in IE, but this leads to some fairly serious performance issues. We need something that's far less manual, but works in any JS-supporting browser. Our first approach was simply to replace the #x syntax with standard variables, in the form of a special context array. If we name this array "_", and initialize it before deserializing, our serialized object now looks like this: "_[1]={foo:_[1]}". Unfortunately, JavaScript doesn't assign to _[1] until it's finished deserializing the value that's being assigned to it (naturally), and so any uses of _[1] inside that value return 'undefined'. Backreferences that aren't circular are just fine. Eventually, we concluded that an entirely automatic deserialization that works in all JS-supporting browsers simply isn't possible. One that requires a minimum of manual intervention is, however: We continue using the array syntax above for assignments, but we replace all backrefs with placeholder objects. After the eval() step, we recurse through the graph finding all the placeholders and replacing them with the appropriate backrefs from the array. The best part of this is that our array of backrefs has been conveniently populated for us by the JavaScript that we used to identify the destinations of the backrefs in the first place. Thus, our example from above becomes: "_[1]={foo:{_ref:1}}". After initializing "_" and calling eval() on this, we recurse through the tree, find the object "{_ref:1}" and replace it with the value of _[1]. Done! We don't even have to keep track of where we've been to avoid loops, because we just stop recursing whenever we replace a backref. Monday, September 18. 2006Street Games
Recently, when we changed over to the new, smaller coins, the council went around replacing all the old (digital) meters with new, pay-and-display ones. The old meters had one meter for every 8 or so parks, and you selected the park you were paying for before paying. Inset into the footpath by each parking space, therefore, there was a cobblestone bearing a number from 1 to 8, to identify the parking space.
When they switched over to the new meters, they went along with a sandblaster and blasted off all the numbers, leaving several long series of blank cobblestones inset into the pavement at regular intervals. Walking home one night, Hayley and I got thinking: What could one do with these cobblestones? Our first thought was to chalk or spray-paint (with a stencil) random squares from board games: "Forward 3 squares", "Roll again", "Park Terrace", "Double Word Score", "Go to Jail", "Triple letter score" and the like, to create some amusement and confusion. But why not go one step further? Why not invent our own game people could play on the cobblestones? The question now is: what? There are a few criteria any game would have to meet:
So: suggestion time! What crazy things should we do with this? Incidentally, the new meters are solar powered, and accept payment via credit card and cellphone (presumably via a GPRS modem for recieving instructions and clearing payments). How cool is that? New Hair Color for Hayley!Wednesday, August 30. 2006Indexing directory structures
After much messing around with SQL DBs and table structures, I eventually resorted to writing my own indexing scheme and storing the index in memory for my FS-indexing needs.
Here's how I went about it: 1) Construct a tree out of the FS you're indexing. Each node in the tree needs a reference to its parent, but if you're just using it for this index, it doesn't actually need references to its children. 2) Construct a dictionary of lists of nodes to act as the index. Dictionary 3) Iterate through each item, and extract terms for that component. Terms are only extracted for the current component, not its parents - "c:\foo\bar" would only have 'bar' as a term. 4) Find the relevant item in the index for each term (or add it, if need be), and add the current node to the list of nodes against that item. Once you've constructed the index, searching is as follows: 1) Extract the terms for your search query in the same manner as used for path components above 2) Obtain the list from the index for each of these terms. If any of the lists are empty, return immediately with an empty set - no results are possible. 3) For each item in the returned lists, follow the chain of parents, checking which terms from the original search are present in the item and all its ancestors. If you match all terms, add it to the result set. If you reach the root of the tree without doing so, discard the node. I can't help thinking it could be more efficient - there's got to be a way to intersect the sets returned for each term, or to make use of the property that the end results for a search can consist only of items in each term's list, or their descendents, but I can't think how to do it right now. Performance tests: Indexing over 600,000 items, searches for 4 terms return in less than 1/100 of a second. Searches with more terms show only linear increase in time taken, of course. Wednesday, August 23. 2006Obsession
For some time now, I've been working, on and off, on a P2P filesharing system, which I've dubbed 'Attercop' (named after the term used for 'spider' used in Old English and LOTR). Simply put, it's a replacement for DC++, designed mostly for LAN environments. It uses multicast, so no central hub is required, and it improves on a number of the most glaring weaknesses in DC++. I'm writing it in C#.NET.
Like most projects I embark on, I started off hugely enthusiastic, spending as much time as I could spare on it. Normally, I'd continue like that until either I get the project completed, or I burn-out on it. Quite frequently, the latter has happened - I've burned out and lost interest, with the project unfinished. However, since I've been spending all the time I've not been at work with Hayley, I've had very little time for it lately - a lack that's hard to mourn, since I consider time with Hayley much better spent - and I'd begun to lose enthusiasm for the project. I concluded that I was (unfortunately) suffering from the same sort of mid-project disinterest I often get. However, we spent the last weekend at a Lan, during which I coded nearly the whole time, and I find my enthusiasm suddenly renewed. I'm really enthused about it again, and hugely enjoy working on it. My conclusion, obvious as it seems: Limiting how much time you spend on a project - no matter how enthused you are by it - can help prevent burning-out on it. It just takes something really good to pull me away from something I'm this engrossed in. Moving In
On Sunday, Hayley's flat had a flat meeting. Hayley wanted to move out of her flat (and into our apartment full-time), and the upshot is: no problem! As soon as she/they can find a replacement flatmate, she's moving in with me full-time!
Sunday, July 30. 2006Found a flat!
I've found a flat, and I moved in this weekend. It's a single apartment in a '60s apartment block right in town, which is certainly convenient, and the rent is really low. It's fairly large for a single apartment, consisting of a sizeable entryway, a bathroom, bedroom, small kitchen, large lounge, and study.
The catch? The decor is hideous. As far as we can tell, some time in the 80s, the owners decided they wanted to redecorate, and couldn't decide between three different, conflicting, styles. So they used all 3, usually in different parts of the apartment, but sometimes together. The result is something you're likely to remember for a long time. There's a big emphasis on border prints, and quite a lot of patterned wallpaper. The owners are intending to re-rent or sell it in the new year, and Hayley and I are hoping we can convince them to let us redecorate. Initial attempts ("We'll repaint if you pay for the paint!") have met with "don't bother" responses, but I'm actually prepared to pay for the supplies - it shouldn't take much - in order to be able to redecorate. Strangely, I'm actually looking forward to doing it, too. In the meantime, we're doing what we can: I've started hanging up what little art I have (I have some nice prints I got in Vancouver, and a nice pencil drawing I got at the seattle comic con that I need to get framed), and we're taking down what we can that can be replaced (including a couple of hideous lightshades). More significantly, Hayley and I went out and got some canvases today, and she's currently penciling in a Girl Genius design on one of the larger ones. I'm really looking forward to seeing them hanging up. The largest one we're reserving for an anime collage, with parts from a number of different anime series, which we'll paint together - assuming I don't make a mess of it, at least. Since this is my first (unfurnished) place of my own, I've had a whole heap of stuff to buy. I've got a fair bit done, but there's still a whole lot left to get. We've got plates, cups, bowls, cutlery, kitchen supplies, most of those sort of basics. I picked up a Vacuum cleaner - cheap, but not the cheapest, so hopefully it won't fall apart - and got a Fridge for a bargain $600NZ. It's new, but ever so slightly shop-damaged. Also, on Saturday, we picked up a two-seater beanbag sofa from Beanbags and Beyond, in town, then ordered a taxi. You really don't realise just how big it is until you try and cram it into the back seat of a Taxi, and still fit Hayley without threatening suffocation. We did get it home without incident, and it's extremely comfortable. To finish off Saturday evening, we sat on the new beanbag sofa in the study, ate takeaway indian food, drank sparkling grape juice, and watched the Dark Crystal. A brief update about work: Going great. I'm enjoying myself much of the time, and the work is generally interesting. I love working with GIS and mapping. I have the unsettling feeling that my boss disapproves of just about everything I do, though. I'm probably wrong - I felt similarly when I first started at Niche and had no performance reviews or anything. Hopefully my 3 month review will prove me wrong, if nothing does sooner. Monday, June 26. 2006Back to work
I'm now officially employed again. I've found a kickass job at a local software company specialising in GIS (mapping etc). Their main product is a fleet tracking package. I just finished my first day there today.
They're using Postgres extensively, with their own extensions such as some form of multi-master replication. The boxes are based on Gentoo, with (oddly), VMWare instances running windows for the application servers. The environment's quite well set up, and so far quite nice to work with (with my whole 8 hours' worth of experience with it). I'm also starting back part-time at Uni this semester, doing two papers - COSC229 - Algorithms, and COSC362 - Microprocessor Systems 2. My new employer is quite flexible about allowing me to fit my work hours around uni, and the two are very close together to boot. Now I just have to find a flat in the same general vicinity... Saturday, June 17. 2006To market, to market, to buy...
Lex has posted rather interesting entry about our trip to the supermarket.
Wednesday, May 31. 2006Homeward Bound
So here I am, sitting in Vancouver airport, wasting time while I wait for my flight back home, via Sydney, via Honolulu (two new airports to get lost in! Oh joy!). And what better way to waste time than by blogging, right?
Checking in was, this time, fairly uneventful. I had to pay extra for my bike and because my suitcase was heavy, but less than I expected - Air Canada charge $50 for a bike, instead of the usual $175 for an extra piece of luggage. Because their system can't handle having both a bike and an overweight luggage item, I had to line up a second time at ticketing to pay for it. Oh well. After that, US immigration, checking my baggage in, and the inevitable x-ray scans and inspections were relatively simple. I'm just ophing I don't have to check my baggage in again at Sydney, since hauling that much baggage around on my own is no fun, but I have a sinking feeling I'll have to. I'm really not looking forward to the long flight, but at least I have a light at the end of the tunnel. I'll be home soon, Hayley!
« previous page
(Page 2 of 4, totaling 48 entries)
» next page
|
Calendar
QuicksearchArchivesCategoriesSyndicate This BlogBlog Administration |
|||||||||||||||||||||||||||||||||||||||||||||||||
