Anders Brownworth

Technology and Disruption

PostgreSQL Index Type for UUID[] Arrays

PostgreSQL doesn't include an index type for uuid array columns.

Here's a table with a uuid array column:

---------------------------------------------------
CREATE TABLE items (
things uuid[]
);
---------------------------------------------------

But when we try to create an index on it:

---------------------------------------------------
CREATE INDEX items_things_idx ON items USING GIN(things);
ERROR: data type uuid[] has no default operator class for access method "gin"
---------------------------------------------------

So we must create an index type that understands how to compare elements in an array of UUIDs:

---------------------------------------------------
CREATE OPERATOR CLASS _uuid_ops DEFAULT FOR TYPE _uuid USING gin AS
OPERATOR 1 &&(anyarray, anyarray),
OPERATOR 2 @>(anyarray, anyarray),
OPERATOR 3 <@(anyarray, anyarray),
OPERATOR 4 =(anyarray, anyarray),
FUNCTION 1 uuid_cmp(uuid, uuid),
FUNCTION 2 ginarrayextract(anyarray, internal, internal),
FUNCTION 3 ginqueryarrayextract(anyarray, internal, smallint, internal, internal, internal, internal),
FUNCTION 4 ginarrayconsistent(internal, smallint, anyarray, integer, internal, internal, internal, internal),
STORAGE uuid;
---------------------------------------------------

Now our index will create and SELECTs to items in the UUID array will be fast. For example, find all the rows that contain a particular UUID anywhere in the array:

---------------------------------------------------
SELECT * FROM items WHERE things @> ARRAY['a6e34e5d-b1fb-4240-8ad9-21ddf23134bb']::uuid[];
---------------------------------------------------

There's a better-formatted version of this.

When selecting all the rows which contain a particular UUID, it works at millisecond times on 200M row tables on a 2014 MacBook Pro with SSD:

---------------------------------------------------
trader=# select count(*) from items;
count
-----------
205543131
(1 row)

trader=# \timing
Timing is on.
trader=# SELECT * FROM items WHERE things @> ARRAY['f1438d02-c4ef-4657-b726-1542bdacd030']::uuid[];
things
-----------------------------------------------------------------------------
{1612afad-12c6-4317-8de1-ae1d0c18a1c7,f1438d02-c4ef-4657-b726-1542bdacd030}
(1 row)

Time: 0.248 ms
trader=#
---------------------------------------------------

When creating this operator class in migrations, it is useful to know if this operator already exists. I'm not aware of an "IF EXIST" for operator types, so you can test for duplicate object instead:

---------------------------------------------------
DO $$
BEGIN

-- -----------------------------------------------------
-- New Index Type for uuid[] columns
--
-- Create a GIN inverted index type for UUID array
-- columns to enable quick comparisons
-- -----------------------------------------------------

CREATE OPERATOR CLASS _uuid_ops DEFAULT FOR TYPE _uuid USING gin AS
OPERATOR 1 &&(anyarray, anyarray),
OPERATOR 2 @>(anyarray, anyarray),
OPERATOR 3 <@(anyarray, anyarray),
OPERATOR 4 =(anyarray, anyarray),
FUNCTION 1 uuid_cmp(uuid, uuid),
FUNCTION 2 ginarrayextract(anyarray, internal, internal),
FUNCTION 3 ginqueryarrayextract(anyarray, internal, smallint, internal, internal, internal, internal),
FUNCTION 4 ginarrayconsistent(internal, smallint, anyarray, integer, internal, internal, internal, internal),
STORAGE uuid;

EXCEPTION
WHEN duplicate_object THEN
RAISE NOTICE 'error: %', SQLERRM;
END;
$$;
---------------------------------------------------

Comments (1)

John Wilger from Forest Grove, OR

It's worth noting that you cannot do this if you are hosting your database on Amazon RDS, as it requires superuser privileges to create an operator, and you cannot get superuser privileges on RDS (because it is a managed db service as opposed to a completely private db instance).

Leave a Comment

Name:
Location: (city / state / country)
Email: (not published / no spam)
Comment:

No HTML is allowed. Cookies must be enabled to post. Your comment will appear on this page after a moderator OKs it. Offensive content will not be published.

Click the banana to submit your comment.

To create links in comments:
[link:http://anders.com/] becomes http://anders.com/
[link:http://anders.com/|Anders.com] becomes Anders.com
Notice there is no rel="nofollow" in these hrefs. Links in comments will carry page rank from this site so only link to things worthy of people's attention.