Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make monotonicFactory the default? Replacing existing ulid() #5

Closed
grempe opened this issue Jan 6, 2022 · 14 comments
Closed

Make monotonicFactory the default? Replacing existing ulid() #5

grempe opened this issue Jan 6, 2022 · 14 comments

Comments

@grempe
Copy link
Collaborator

grempe commented Jan 6, 2022

Based on this deep dive into the probability of collision of ULID actually being lower with a monotonic approach, I would like to suggest the possibility of refactoring to remove the monotonicFactory as a separate option altogether and to make it the default when importing ulid(). This especially makes sense in a library that is targeted at running inside the context of a Cloudflare Worker isolate where time is "frozen" during the duration of a single request to the ms precision timestamp of the start time of the request.

The largest probability of collisions actually occurs when we use completely random ULID values! Incrementing the random number, even several times, leads to groups of numbers which are actually less likely to collide with each other. In conclusion, the use of the ULID monotonic option does not make ULID collisions more likely, and is safe to use in our use case.

https://medium.com/zendesk-engineering/how-probable-are-collisions-with-ulids-monotonic-option-d604d3ed2de

This would:

  • further simplify the code
  • reduce the size of the worker
  • provide a safer default in the context of the Cloudflare Worker
  • ensure that lexical ordering remains a property in a Cloudflare Worker, where the current default is that every ULID generated in a single request uses the same ms time value and only the random portions changes. In this situation (in every CF worker) the lexical ordering then also becomes random. Monotonic by default would ensure that all ULID's generated within a single worker request are still lexically ordered by time of generation even though the timestamp portion remains the same.

This can be demonstrated as the current situation by comparing a group of ULID's created in a worker process. They all have the exact same first ten characters representing the timestamp in ms.

https://ulid.truestamp.com/?q=10

The ULID Spec seems agnostic on this topic.

Cheers

@grempe
Copy link
Collaborator Author

grempe commented Jan 6, 2022

Some additional discussion on this topic. Reading it though I still think that monotonic is likely the way to go in a workers environment.

ulid/spec#40

ulid/spec#11

@peterbourgon
Copy link

oklog/ulid provides montonicity through the use of a Monotonic entropy source, which is paramaterizable.

Monotonic returns an entropy source that is guaranteed to yield strictly increasing entropy bytes for the same ULID timestamp. On conflicts, the previous ULID entropy is incremented with a random number between 1 and inc (inclusive) . . . The lower the value of inc, the easier the next ULID within the same millisecond is to guess.

@grempe
Copy link
Collaborator Author

grempe commented Jan 8, 2022

@peterbourgon Please see the linked analysis in my original change request.

Incrementing a single bit out of 80 bits of randomness will not in any way make the next value guessable.

Monotonicity is also part of the ULID spec. I am just suggesting that in the case of Cloudflare workers it should be the default behavior due to their security sensitive treatment of time (frozen to ms per request).

@peterbourgon
Copy link

peterbourgon commented Jan 8, 2022

I was merely offering the link as additional information re: collision probability. It's true that RAND+1 will minimize that probability from a single generator, but usually ULIDs are used in systems with many concurrent generators. In that context, at scale, monotonic entropy will generally not affect collisions, and has the downside of being predictable. On that point...

Incrementing a single bit out of 80 bits of randomness will not in any way make the next value guessable.

How do you figure? Given this strategy, if I observe a ULID A, then I can guess the next one will be A plus the lower 80 bits incremented by 1, modulo UINT80_MAX — right?

@grempe
Copy link
Collaborator Author

grempe commented Jan 9, 2022

After looking more closely at incrementBase32 as implemented here I stand corrected. 👍🏻

If you had access to a ULID, and you needed to guess the next you could easily do so. But, you'd have to have access AND make your guess before the millisecond ticks over. Right? So in order to exploit that knowledge you would have to have real time access to the internal state of the generator. Which in this case would be an isolate. I'm not sure this kind of attack is even possible given current architecture:

https://developers.cloudflare.com/workers/learning/security-model#isolation

🤷‍♂️

But monotonicity in the ULID case is not about preventing our ability to guess the next ULID anyway. I don't think that claim is being made. Its about:

When generating a ULID within the same millisecond, we can provide some guarantees regarding sort order. Namely, if the same millisecond is detected, the random component is incremented by 1 bit in the least significant bit position (with carrying).

https://github.com/ulid/spec#monotonicity

In the Cloudflare worker case, we would always be generating more than one ULID within the same frozen millisecond within a single request. Hence my suggestion that monotonicity by default makes sense for this lib, which was forked to focus on the CF worker context only. This is so we can retain lexical sort at all times while still having a unique value.

Cheers

@peterbourgon
Copy link

So in order to exploit that knowledge you would have to have real time access to the internal state of the generator.

You just need your worker to generate more than 1 ULID per execution. Presumably they're being emitted somewhere; whoever observes one of them can add 1 and predict another.

@grempe
Copy link
Collaborator Author

grempe commented Jan 10, 2022

You just need your worker to generate more than 1 ULID per execution. Presumably they're being emitted somewhere; whoever observes one of them can add 1 and predict another.

Again, only within ULID's generated within a single millisecond within the same request.

To demonstrate, reload this and try to guess the next output. It is monotonic by default and has a new time and a new 80 bit random component on each invocation. As one would expect.

https://ulid.truestamp.com/?q=10

And again, even if you could guess the next ULID generated within a single request I'm not sure what the concern is. You'd just as likely be wrong if the next ULID came from a different instance of the generator (different CF POP, or scaled within a POP, or different request).

Are you making the argument that monotonicity as implemented should not be used because it breaks an unpredictability "guarantee" that is not actually considered part of the spec? The spec only concerns itself with uniqueness and lexical sorting. If so, perhaps that should be taken up as a feature request on the main spec repo where the general approach to monotonicity is defined?

https://github.com/ulid/spec

Is there an argument you are making that monotonic generation should not be the default on Cloudflare (which is the only subject of this issue)?

I'll leave this in the hands of the author at this point. I will plan to use the monotonic factory for my use on Cloudflare. I just think it is also a sensible default for others who are not as involved in the nuances.

The alternative is the generation of ULIDs which sort out of generation order.

@peterbourgon
Copy link

If your monotonicity strategy is "first ULID in a given millisecond gets a random value, subsequent ULIDs add 1" — then if I observe ULID U I can predict that ULID U+1 also exists in your system with reasonable accuracy. This is demonstrated by your link: the ULIDs in each response are sequential. This may or may not be a problem for you. But predictable IDs can be risky, so it's probably not a great default strategy.

My link demonstrated a way to get monotonic IDs with a more sophisticated strategy than "add 1". Basically: instead of adding 1, you add a random value between 1 and N. Bigger N means less predictable, but higher chance of inter-millisecond collision.

@grempe
Copy link
Collaborator Author

grempe commented Jan 10, 2022

It is not my strategy. It is what is implemented, and recommended by the spec.

A ULID is, first and foremost, unique and sortable by generation time. It you are more concerned with unpredictability in all circumstances, then non-monotonic ULID or random UUID are a better choice.

I do not disagree with a single part of your statement. I do suggest though that this is the wrong place to make your case to suggest an alternative monotonicity algorithm.

The ULID spec repo would seem like the right place to make that argument so the final approach agreed to would be expected to be observed by all implementations.

@ryan-mars
Copy link
Owner

I sincerely apologize that I have not responded to this issue yet. It never showed up in my Github notifications. I'll triage and respond as soon as I can. Thank you.

@ryan-mars
Copy link
Owner

per @grempe

I would like to suggest the possibility of refactoring to remove the monotonicFactory as a separate option altogether and to make it the default when importing ulid().

I agree completely and would love to accept a PR.

@ryan-mars
Copy link
Owner

per @peterbourgon

oklog/ulid provides montonicity through the use of a Monotonic entropy source, which is paramaterizable.

Monotonic returns an entropy source that is guaranteed to yield strictly increasing entropy bytes for the same ULID timestamp. On conflicts, the previous ULID entropy is incremented with a random number between 1 and inc (inclusive) . . . The lower the value of inc, the easier the next ULID within the same millisecond is to guess.

What would you suggest as a suitable source of ULID spec compliant entropy within a Cloudflare Worker?

@peterbourgon
Copy link

peterbourgon commented Jan 12, 2022

@ryan-mars

If Cloudflare Workers freeze time at a specific instant, then I would suggest a default of non-monotonic entropy using whatever source of randomness is provided to the guest, and offer opt-in monotonic entropy with parameterizable precision in a similar way as oklog/ulid.

The ULID spec defines how ULIDs are represented in memory. It can suggest methods to produce them but it's ultimately agnostic to that behavior.

All IMO, of course.

@grempe grempe mentioned this issue Feb 9, 2022
@grempe
Copy link
Collaborator Author

grempe commented Feb 9, 2022

Pull request addressing topics in this discussion here:

#6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants