Anders Brownworth

Technology and Disruption

600+M Rows in PostgreSQL

I've been working on a PostgreSQL project which has a table with 644,742,830 (or 644.7M) rows in it. There is an index on an array column, in this case a column of type character varying(35)[] with a UNIQUE constraint. Each row has exactly two strings in it which happen to be bitcoin addresses. When I SELECT some address from that column:

SELECT * FROM keys WHERE addresses @> '1Axrv5WPGA65wpxCpV87sYdwxhkmy8mvF2';

the database is actually SELECTing from amongst 1,289,485,660 (or 1.2 Billion) keys. Because the database is on an SSD and there is enough RAM in the system to fit the index, I can regularly return results within the 1ms to 2ms range!

On disk, the entire database (including a few other columns and several other tables) is in the 421Gig range. I had been worrying about having to use some other strategy (LevelDB possibly) but have been amazed by the results. I'll keep adding rows until I ether run out of disk space or run into a performance problem. Gotta love good indexes!


PostgreSQL Index Type for UUID[] Arrays

PostgreSQL doesn't include an index type for uuid array columns.

Here's a table with a uuid array column:

things uuid[]

But when we try to create an index on it:

CREATE INDEX items_things_idx ON items USING GIN(things);
ERROR: data type uuid[] has no default operator class for access method "gin"

So we must create an index type that understands how to compare elements in an array of UUIDs:

OPERATOR 1 &&(anyarray, anyarray),
OPERATOR 2 @>(anyarray, anyarray),
OPERATOR 3 <@(anyarray, anyarray),
OPERATOR 4 =(anyarray, anyarray),
FUNCTION 1 uuid_cmp(uuid, uuid),
FUNCTION 2 ginarrayextract(anyarray, internal, internal),
FUNCTION 3 ginqueryarrayextract(anyarray, internal, smallint, internal, internal, internal, internal),
FUNCTION 4 ginarrayconsistent(internal, smallint, anyarray, integer, internal, internal, internal, internal),

Now our index will create and SELECTs to items in the UUID array will be fast. For example, find all the rows that contain a particular UUID anywhere in the array:

SELECT * FROM items WHERE things @> ARRAY['a6e34e5d-b1fb-4240-8ad9-21ddf23134bb']::uuid[];

There's a better-formatted version of this.


Map Distance Calculator

This past weekend's hack project was Click to add points to the map and double click a point to delete. The distance (in miles, kilometers, feet, etc.) is shown at the bottom of the page.

I made this because I like to see how far I would be running when I plan to run some place new. It is critical to know how far things are so you don't run out of steam somewhere far away! Similar tools already exist but none that I'm aware of give you a full screen map with easy to edit points.

I plan to add a save/export feature but I'm open to requests. Let me know what you would like to see!


Voice Recognition Demo at TED 2013

Last weekend's hack project showed up in a video on the main TED 2013 stage.

The task was to cause a phone call to hang up via a voice command so people who may be paralyzed and not have use of their hands would be able to hang up a call. While many solutions exist to dial, few seem to help with hanging a call up which can cause someone to get stuck leaving an endless voicemail message. (termed "voicemail hell") This can turn dangerous if there is an emergency and they are therefore not able to use the phone.

With the help of Twilio and SRI (the company behind Siri on the iPhone) I was able to make an application which responded to a voice command to hang up calls. While it was only a proof of concept, the TED guys arranged for a video crew and we were able to allow Gary Whitman, a quadriplegic, to hang up a phone call. The video made it's way to the 2013 TED stage as part of a presentation.


USRP Instant SDR: "Getting Started" Kit for Software-defined Radio

Ettus Research just released an interesting new "get started with software-defined radio (SDR) kit" called the USRP Instant SDR. It consists of a USRP, an RF board and a USB stick that boots up linux with everything you need to begin experimenting with software-defined radio. This looks like a great way to get started without the cost and complexity of some of the more sophisticated setups.

Ettus features a video of mine on their page where I demonstrate a GSM cellular network using OpenBTS. They also use a clip from it in their video:

I have two videos featuring Ettus products, this one on OpenBTS:

and this video about tunneling IP networks over the air between two Ettus radios:

You can do the same with the Ettus USRP Instant SDR - give it a try!


50th Episode of our 350 Third Podcast

Scott Barstow and I just released the 50th Episode of 350 Third, our humble podcast covering the impact of the Internet on business. However, in celebration of the 50th episode milestone, we instead discussed the impact of the Godfather movies on our collective psyche. Of course we decided to record the show on the Godfather and The Godfather, Part II without actually internalizing how long they are. After over 6 hours of viewing, we reminisced for a while on the movies which, as it turns out, neither of us had seen in 15 years! Good movies both, the decision is split on which film is better but we'll leave it to you to decide. Listen in online or subscribe and follow.


Zero Dark Thirty

I had imagined if Hollywood were to do a movie about killing Bin Laden, it would have been 2 hours of hard-core "seal team gets ready" and then the "Go! Go! Go!" scene. Close to the 2 hour mark, I thought for a second, "we haven't really seen the seals yet!". The fact that all of that garbage is muted and you are not even quite sure when Bin Laden is killed makes this film very unusual. The female lead never gets into a relationship either - imagine that! Instead, I found it a very compelling "on the edge of your seat" CIA drama. This is a fantastically well done film because it doesn't lean on what you might expect to be great yet it still is.

Note: I don't watch trailers.


Microsoft's Tablet Problem

Typical PCs retail around $400 and Microsoft gets roughly $35 for the OS. Above that, they probably average $80 for Office. The OS in the tablet market is not a billed item. In the case of Apple, it is subsidized mostly by the hardware and app purchase and in Google's case, by apps and advertising. There isn't room for Microsoft's OS purchase model in the tablet market. Therefore, the only cash-cows remaining for Microsoft are apps - particularly gaming and productivity. Microsoft as of yet hasn't transitioned their Xbox and Office user bases to the tablet. Where a user used to be $50 or so per year in value to Microsoft, they are now essentially worth nothing. So how do they sustain their core business? If they make inroads, at best they can hope for is in the range of $5 per user per year. That means they need to have an order of magnitude more customers than they do now!


Just Released: pub-sub-map, Publish and Subscribe to Real-time Locations on a Map

I have open-sourced pub-sub-map, a node.js driven web application that presents a map and lets you subscribe to real-time location markers. As marker positions get updated, the new location is pushed to all web browsers subscribed to them in real-time.

How could this be useful? Coupled with an iOS application I'm writing, you could follow a number of runners in a marathon, or see where all the taco trucks or taxis are. Scraping data from various mass transit systems, you could have a realtime display of where all the trains and busses are. Or you could track a model rocket or stratospheric balloon project to see where it lands. Basically, pub-sub-map can be useful for anything that moves where any number of people want to follow in realtime. It is an expansion of what you can already do right now with

I use express as a web framework running in node.js with jade templates to render the web frontend. I don't use polling to push location information. Instead, all location updates are pushed to subscribed browsers via also running in the same node.js application.

Here's a video of pub-sub-map in action.

pub-sub-map is the next-generation engine I am developing for Give it a try and let me know what you think.