Serializing JavaScript objects with circular references

One problem we've been dealing with at work for a while is serializing JavaScript objects to a string representation, when objects can contain circular references. The simplest example of this is an object a, which contains one property (call it 'foo') which is a reference to a.

Mozilla provides a toSource() function on the Object class (and hence every object) which returns a serialized representation of that object, allowing it to be reconstituted with eval(). To handle circular references, any object or array can be tagged with a numeric identifier by prefixing it with "#x=" (where x is an integer), and a tagged object can be referenced with the syntax "#x#". Using this, any object graph can be serialized, and we can even make use of this system to compress the serialized text down further than Mozilla does on its own by using this syntax for all backreferences, instead of just circular ones. The example earlier, of an object referencing itself would serialize as "#1={foo:#1#}".

All fine and dandy, but unfortunately, IE doesn't implement the toSource() function. I don't know about other browsers either - toSource() is part of Netscape's original Javascript documentation, but not part of the ECMAScript standard. So where does that leave us? We could use this syntax or something similar and simply do both serialization and deserialization manually in IE, but this leads to some fairly serious performance issues. We need something that's far less manual, but works in any JS-supporting browser.

Our first approach was simply to replace the #x syntax with standard variables, in the form of a special context array. If we name this array "\_", and initialize it before deserializing, our serialized object now looks like this: "\_[1]={foo:\_[1]}". Unfortunately, JavaScript doesn't assign to \_[1] until it's finished deserializing the value that's being assigned to it (naturally), and so any uses of \_[1] inside that value return 'undefined'. Backreferences that aren't circular are just fine.

Eventually, we concluded that an entirely automatic deserialization that works in all JS-supporting browsers simply isn't possible. One that requires a minimum of manual intervention is, however: We continue using the array syntax above for assignments, but we replace all backrefs with placeholder objects. After the eval() step, we recurse through the graph finding all the placeholders and replacing them with the appropriate backrefs from the array. The best part of this is that our array of backrefs has been conveniently populated for us by the JavaScript that we used to identify the destinations of the backrefs in the first place. Thus, our example from above becomes: "\_[1]={foo:{\_ref:1}}". After initializing "\_" and calling eval() on this, we recurse through the tree, find the object "{\_ref:1}" and replace it with the value of \_[1]. Done! We don't even have to keep track of where we've been to avoid loops, because we just stop recursing whenever we replace a backref.

Comments

blog comments powered by Disqus