Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Would you accept a PR to generate a ULID in UUID format? #3

Open
dharmaturtle opened this issue Aug 29, 2020 · 5 comments
Open

Would you accept a PR to generate a ULID in UUID format? #3

dharmaturtle opened this issue Aug 29, 2020 · 5 comments

Comments

@dharmaturtle
Copy link

dharmaturtle commented Aug 29, 2020

I wrote the following, heavily inspired by what you already wrote:

CREATE FUNCTION public.generate_ulid() RETURNS uuid
    LANGUAGE plpgsql
    AS $$
DECLARE
  timestamp  BYTEA = E'\\000\\000\\000\\000\\000\\000';
  unix_time  BIGINT;
BEGIN
    unix_time = (EXTRACT(EPOCH FROM NOW()) * 1000)::BIGINT;
    timestamp = SET_BYTE(timestamp, 3, (unix_time >> 40)::BIT(8)::INTEGER);
    timestamp = SET_BYTE(timestamp, 2, (unix_time >> 32)::BIT(8)::INTEGER);
    timestamp = SET_BYTE(timestamp, 1, (unix_time >> 24)::BIT(8)::INTEGER);
    timestamp = SET_BYTE(timestamp, 0, (unix_time >> 16)::BIT(8)::INTEGER);
    timestamp = SET_BYTE(timestamp, 5, (unix_time >> 8)::BIT(8)::INTEGER);
    timestamp = SET_BYTE(timestamp, 4, unix_time::BIT(8)::INTEGER);
    RETURN CAST(substring(CAST((timestamp || gen_random_bytes(10)) AS text) from 3) AS uuid);
END
$$;

It generates a ULID, but in the UUID format. The order of the bytes is a bit unintuitive due to the UUID format/endianness.

This function makes it easier to use the uuid PostGres type to hold ULIDs. This makes for smaller, faster indexes (c.f. indexing on text).

@KnowZero
Copy link

Why not just:

RETURN encode( timestamp || gen_random_bytes(10) ,'hex')::uuid;

Doing a benchmark shows its a bit faster

@dharmaturtle
Copy link
Author

dharmaturtle commented Oct 4, 2020

Because I'm absolute trash at SQL @KnowZero :) Thanks for the tweak.


On a different note, here's a function to extract the epoch milliseconds out of a UUID formatted ULID:

CREATE FUNCTION public.ulid_to_epoch_ms(input uuid) RETURNS bigint
    LANGUAGE plpgsql
    AS $$
BEGIN
    return
        (  'x'
        || '0000'
        || SUBSTRING(input::text,  7, 2)
        || SUBSTRING(input::text,  5, 2)
        || SUBSTRING(input::text,  3, 2)
        || SUBSTRING(input::text,  1, 2)
        || SUBSTRING(input::text, 12, 2)
        || SUBSTRING(input::text, 10, 2)
        )::bit(64)::bigint AS int8_val;
END
$$;

(This uses undocumented behavior as described here.)

In most cases it isn't appropriate to extract the datetime out of a ULID, but I found a use case in my own domain.

@stfullstack
Copy link

What is the performance like on this compared with uuid generation?

@mulcek
Copy link

mulcek commented Feb 13, 2022

It would be handy to have a possibility of passing custom timestamp. Then I wouldn't need composite unique index or a primary key like (id, timestamp). Timestamp could then be extracted from ULID, used for query, conditions or even for partitions...
I'm still learning all this. But looks promising.

@asura-asp
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants