sitr.us

posts by Jesse Hallett

Functional data structures in JavaScript with Mori

I have a long-standing desire for a JavaScript library that provides good implementations of functional data structures. Recently I found Mori, and I think that it may be just the library that I have been looking for. Mori packages data structures from the Clojure standard library for use in JavaScript code.

Functional data structures

A functional data structure (also called a persistent data structure) has two important qualities: it is immutable and it can be updated by creating a copy with modifications (copy-on-write). Creating copies should be nearly as cheap as modifying a comparable mutable data structure in place. This is achieved with structural sharing: pointers to unchanged portions of a structure are shared between copies so that memory need only be allocated for changed portions of the data structure.

A simple example is a linked list. A linked list is an object, specifically a list node, with a value and a pointer to the next list node, which points to the next list node. (Eventually you get to the end of the list where there is a node that points to the empty list.) Prepending an element to the front of such a list is a constant-time operation: you just create a new list element with a pointer to the start of the existing list. When lists are immutable there is no need to actually copy the original list. Removing an element from the front of a list is also a constant-time operation: you just return a pointer to the second element of the list. Here is a slightly more-detailed explanation.

Lists are just one example. There are functional implementations of maps, sets, and other types of structures.

Rich Hickey, the creator Clojure, describes functional data structures as decoupling state and time. (Also available in video form.) The idea is that code that uses functional data structures is easier to reason about and to verify than code that uses mutable data structures.

Functional Reactive Programming in JavaScript

I had a great time at NodePDX last week. There were many talks packed into a short span of time and I saw many exciting ideas presented. One topic that seemed particularly useful to me was Chris Meiklejohn’s talk on Functional Reactive Programming (FRP).

I have talked and written about how useful promises are. See Promise Pipelines in JavaScript. Promises are useful when you want to represent the outcome of an action or a value that will be available at some future time. FRP is similar except that it deals with streams of reoccurring events and dynamic values.

Here is an example of using FRP to subscribe to changes to a text input. This creates an event stream that could be used for a typeahead search feature:

var inputs = $('#search')
    .asEventStream('keyup change')
    .map(function(event) { return event.target.value; })
    .filter(function(value) { return value.length > 2; });

var throttled = inputs.throttle(500 /* ms */);

var distinct = throttled.skipDuplicates();

This creates an event stream from all keyup and change events on the given input. The stream is transformed into a stream of strings matching the value of the input when each event occurs. Then that stream is filtered so that subscribers to inputs will only receive events if the value of the input has a length greater than two.

Monkey patching document.write()

This is one of the crazier workarounds that I have implemented. I was working on a web page that embeds third-party widgets. The widgets are drawn in the page document - they do not get their own frames. And sometimes the widgets are redrawn after page load.

We had a problem with one widget invoking document.write(). In case you are not familiar with it, if that method is called while the page is rendering it inserts content into the DOM immediately after the script tag in which the call is made. But if document.write() is called after page rendering is complete it erases the entire DOM. When this widget was redrawn after page load it would kill the whole page.

The workaround we went with was to disable document.write() after page load by replacing it with a wrapper that checks whether the jQuery ready event has fired.

(function() {
    var originalWrite = document.write;
    document.write = function() {
        if (typeof jQuery !== 'undefined' && jQuery.isReady) {
            if (typeof console !== 'undefined' && console.warn) {
                console.warn("document.write called after page load");
            }
        }
        else {
            // In IE before version 8 `document.write()` does not
            // implement Function methods, like `apply()`.
            return Function.prototype.apply.call(
                originalWrite, document, arguments
            );
        }
    }
})();

The new implementation checks the value of jQuery.isReady and delegates to the original document.write() implementation if the page is not finished rendering yet. Otherwise it does nothing other than to output a warning message.

Promise Pipelines in JavaScript

Promises, also know as deferreds or futures, are a wonderful abstraction for manipulating asynchronous actions. Dojo has had Deferreds for some time. jQuery introduced its own Deferreds in version 1.5 based on the CommonJS Promises/A specification. I’m going to show you some recipes for working with jQuery Deferreds. Use these techniques to turn callback-based spaghetti code into elegant declarative code.

The basics of jQuery Deferreds

A Deferred is an object that represents some future outcome. Eventually it will either resolve with one or more values if that outcome was successful; or it will fail with one or more values if the outcome was not successful. You can get at those resolved or failed values by adding callbacks to the Deferred.

In jQuery’s terms a promise is a read-only view of a deferred.

Here is a simple example of creating and then resolving a promise:

function fooPromise() {
    var deferred = $.Deferred();

    setTimeout(function() {
        deferred.resolve("foo");
    }, 1000);

    return deferred.promise();
}

Callbacks can be added to a deferred or a promise using the .then() method. The first callback is called on success, the second on failure:

Installing a custom ROM on the Transformer Prime: A start-to-finish guide

This guide provides step-by-step instructions for installing the Virtuous Prime community ROM on your Asus Transformer Prime TF201 tablet. This guide will be useful to you if you do not have root access to your tablet.

Be aware that following the instructions here will void your warranty and will wipe all of the data on your tablet. There is also a danger that you might brick your tablet. Proceed at your own risk.

So, why would you want to install a custom ROM on your tablet? In my case I wanted to gain root access, which allows one to do all sorts of nifty things. Community-made ROMS are also often customized to make the Android experience more pleasant for power users. And choosing your own ROM means that you are no longer dependent on the company that sold you your device to distribute firmware updates in a timely fashion. But if you are reading this then you probably already know why you want to install a custom ROM - so let’s get on to the next step.

Cookies are bad for you: Improving web application security

Most web applications today use browser cookies to keep a user logged in while she is using the application. Cookies are a decades-old device and they do not stand up well to security threats that have emerged on the modern web. In particular, cookies are vulnerable to cross-site request forgery. Web applications can by made more secure by using OAuth for session authentication.

This post is based on a talk that I gave at Open Source Bridge this year. The slides for that talk are available here.

cookie authentication

When a user logs into a web application the application server sets a cookie value that is picked up by the user’s browser. The browser includes the same cookie value in every request sent to the same host until the cookie expires. When the application server receives a request it can check whether the cookies attached to it contain a value that identifies a specific user. If such a cookie value exists then the server can consider the request to be authenticated.

attacks that target browser authentication

There are many types of attacks that can be performed against a web application. Three that specifically target authentication between the browser and the server are man-in-the-middle (MITM), cross-site request forgery (CSRF), and cross-site scripting (XSS). Plain cookie authentication is vulnerable to all three.

How Mobile Safari emulates mouse events

When you are adapting web apps to touchscreen devices particular challenges come up around events like mouseover and mouseout. Touchscreen devices like the iPad do not have a cursor, so the user cannot exactly move the mouse over an HTML element. However, Mobile Safari, the web browser that comes with the iPhone and iPad, has a fallback for websites that require hovering or cursor movement.

Usually when you tap on an element on a link or other clickable element Mobile Safari translates that into a regular click event. The browser also produces some touch events that do not exist in a lot of browsers. But from the perspective of a web page that was not designed with a touchscreen in mind, what you get is a plain click. More specifically, the browser fires mousedown, mouseup, and click in that order. But if a clickable element also does something on mouseover then tapping on that element will trigger a mouseover event instead of a click. Tapping on the same element again will produce a click event. A random example of a page that exhibits this behavior is the schedule page from the Open Source Bridge website. Try tapping on session titles and see what happens.

Mobile Safari will only produce mouse events when the user taps on a clickable element, like a link. You can make an element clickable by adding an onClick event handler to it, even if that handler does nothing. On tap Mobile Safari fires the events mouseover, mousemove, mousedown, mouseup, and click in that order - with some caveats which are explained below. Those events all fire together after the user lifts her finger. You might expect the mousedown event to fire as soon as the user presses her finger to the screen - but it does not. When the user taps on another clickable element the browser fires a mouseout event on the first element in addition to firing the aforementioned events on the new element.

CouchDB Notes

Recently I gave a talk at Portland Ruby Brigade meeting on CouchDB, a document-oriented database. I thought I would share my notes from that talk. In some respects this was a followup to an earlier talk that Igal Koshevoy gave comparing various post-relational databases. Igal also wrote some additional notes on my talk.

In summary, some of the distinguishing features of CouchDB are:

  • Schema-less data store stores documents containing arbitrary JSON data.
  • Incrementally updated map-reduce views provide fast access to data, support powerful data processing, and eliminate lookup penalties for data in large or deeply nested documents.
  • Map-reduce views - which are again, incrementally updated - provide fast access to aggregate data, such as sums or averages of document attributes.
  • Schema-less design means no schema migrations are ever required. And new map-reduce views can be installed with no downtime.
  • “Crash-only” design protects data integrity in almost every crash scenario. No recovery process is required when rebooting a crashed database server.
  • Lock-free design means that read requests never have to wait for other read or write requests to finish. Writes are only serialized at the point where data is actually written to the disk.
  • Integrated, robust master-master replication with automatic conflict handling
  • MVCC, or “optimistic locking”, prevents data loss from multiple writes to the same document from different sources.
  • RESTful interface makes it easy to integrate CouchDB with any environment that speaks HTTP.
  • Documents can contain binary attachments. Attachment support combined with the HTTP interface means that CouchDB can serve HTML, JavaScript, images, and anything else required to host a web application directly from the database.

More detailed information on all of the above points can be found in CouchDB’s technical overview.

Some of the downsides:

  • Writes and single-document lookups are slower than other databases due to HTTP overhead and frequent disk access.
  • CouchDB optimizes CPU and RAM use by using lots of disk space. The same data set will take up a lot more space in CouchDB than in other database systems.
  • You must create map-reduce views in advance for any queries you want to run. SQL users are used to processing data at query time; but this is not allowed by the CouchDB design (assuming you are not using temporary views in production, which you should not do.)
  • There is a serious learning curve when learning to think in terms of map-reduce views.
  • Map-reduce views, though very powerful, are not as flexible as SQL queries. There may be cases where it is necessary to push data processing to an asynchronous job or to the client.
  • CouchDB is a young project and its API is undergoing rapid changes.
  • Documentation can be sparse - especially when very new features are involved.

How to install Haskell “Batteries Included” Platform on Ubuntu Jaunty

Just for kicks I thought I would take another shot at some Haskell programming. To get all of the common libraries and the automated package installer, cabal, I set up the Haskell Platform. Here is how I did it.

Ubuntu Jaunty includes a package for the Haskell compiler, ghc, at version 6.8. The Haskell Platform installer will roll its eyes at you if you try to proceed with this version of ghc. So the first step is to install ghc 6.10.

Paste these lines into /etc/apt/sources.list.d/haskell.list:

deb http://ppa.launchpad.net/someone561/ppa/ubuntu jaunty main
deb-src http://ppa.launchpad.net/someone561/ppa/ubuntu jaunty main

To get the key to verify packages from that PPA, run this optional command:

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E51D9310

Then update your package list and install Haskell:

sudo apt-get update
sudo apt-get install ghc6 ghc6-prof ghc6-doc haddock

The Haskell Platform website does not list a package for Ubuntu yet. So download the source installer.

Database Queries the CouchDB Way

CouchDB is a document-oriented database. It has no rows or tables. Instead CouchDB is a collection of JSON documents. It uses a map-reduce pattern to index data. Queries in CouchDB pull data from what are essentially stored procedures called views. A view is made up of a map function and optionally a reduce function. Ninety percent of the time all you need is the map function, so I will focus on map-only views here.

A map function is a JavaScript function that takes a single document as an argument and emits any number of key/value pairs. Both the key and the value can be any JSON value you choose. The map function is run on every document in the database individually and the emitted key/value pairs are used to construct an index of your data.

A Simple Example

Imagine you have a database with user records and you want a view of those records using the last name of each user as keys.

function(doc) {
    if (doc.last_name) {
        emit(doc.last_name, doc);
    }
}

The above map function will produce and index something like the one below. Because doc is used as a value for each entry the entire content of each JSON document will be accessible as the indexed values.

[
  ...
  { key: "Clarke", value: { last_name: "Clarke", ... } },
  { key: "Kelly",  value: { last_name: "Kelly",  ... } },
  { key: "Smith",  value: { last_name: "Smith",  ... } },
  ...
]

Notice that the map function checks each document for a last_name attribute before emitting a key/value pair. There may be documents in the database that are not user records. By performing that check the view excludes any non-user-record documents from the resulting index.