-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make monotonicFactory
the default? Replacing existing ulid()
#5
Comments
Some additional discussion on this topic. Reading it though I still think that monotonic is likely the way to go in a workers environment. |
oklog/ulid provides montonicity through the use of a Monotonic entropy source, which is paramaterizable.
|
@peterbourgon Please see the linked analysis in my original change request. Incrementing a single bit out of 80 bits of randomness will not in any way make the next value guessable. Monotonicity is also part of the ULID spec. I am just suggesting that in the case of Cloudflare workers it should be the default behavior due to their security sensitive treatment of time (frozen to ms per request). |
I was merely offering the link as additional information re: collision probability. It's true that RAND+1 will minimize that probability from a single generator, but usually ULIDs are used in systems with many concurrent generators. In that context, at scale, monotonic entropy will generally not affect collisions, and has the downside of being predictable. On that point...
How do you figure? Given this strategy, if I observe a ULID |
After looking more closely at If you had access to a ULID, and you needed to guess the next you could easily do so. But, you'd have to have access AND make your guess before the millisecond ticks over. Right? So in order to exploit that knowledge you would have to have real time access to the internal state of the generator. Which in this case would be an isolate. I'm not sure this kind of attack is even possible given current architecture: https://developers.cloudflare.com/workers/learning/security-model#isolation 🤷♂️ But monotonicity in the ULID case is not about preventing our ability to guess the next ULID anyway. I don't think that claim is being made. Its about:
https://github.com/ulid/spec#monotonicity In the Cloudflare worker case, we would always be generating more than one ULID within the same frozen millisecond within a single request. Hence my suggestion that monotonicity by default makes sense for this lib, which was forked to focus on the CF worker context only. This is so we can retain lexical sort at all times while still having a unique value. Cheers |
You just need your worker to generate more than 1 ULID per execution. Presumably they're being emitted somewhere; whoever observes one of them can add 1 and predict another. |
Again, only within ULID's generated within a single millisecond within the same request. To demonstrate, reload this and try to guess the next output. It is monotonic by default and has a new time and a new 80 bit random component on each invocation. As one would expect. https://ulid.truestamp.com/?q=10 And again, even if you could guess the next ULID generated within a single request I'm not sure what the concern is. You'd just as likely be wrong if the next ULID came from a different instance of the generator (different CF POP, or scaled within a POP, or different request). Are you making the argument that monotonicity as implemented should not be used because it breaks an unpredictability "guarantee" that is not actually considered part of the spec? The spec only concerns itself with uniqueness and lexical sorting. If so, perhaps that should be taken up as a feature request on the main spec repo where the general approach to monotonicity is defined? https://github.com/ulid/spec Is there an argument you are making that monotonic generation should not be the default on Cloudflare (which is the only subject of this issue)? I'll leave this in the hands of the author at this point. I will plan to use the monotonic factory for my use on Cloudflare. I just think it is also a sensible default for others who are not as involved in the nuances. The alternative is the generation of ULIDs which sort out of generation order. |
If your monotonicity strategy is "first ULID in a given millisecond gets a random value, subsequent ULIDs add 1" — then if I observe ULID My link demonstrated a way to get monotonic IDs with a more sophisticated strategy than "add 1". Basically: instead of adding 1, you add a random value between 1 and N. Bigger N means less predictable, but higher chance of inter-millisecond collision. |
It is not my strategy. It is what is implemented, and recommended by the spec. A ULID is, first and foremost, unique and sortable by generation time. It you are more concerned with unpredictability in all circumstances, then non-monotonic ULID or random UUID are a better choice. I do not disagree with a single part of your statement. I do suggest though that this is the wrong place to make your case to suggest an alternative monotonicity algorithm. The ULID spec repo would seem like the right place to make that argument so the final approach agreed to would be expected to be observed by all implementations. |
I sincerely apologize that I have not responded to this issue yet. It never showed up in my Github notifications. I'll triage and respond as soon as I can. Thank you. |
per @grempe
I agree completely and would love to accept a PR. |
per @peterbourgon
What would you suggest as a suitable source of ULID spec compliant entropy within a Cloudflare Worker? |
If Cloudflare Workers freeze time at a specific instant, then I would suggest a default of non-monotonic entropy using whatever source of randomness is provided to the guest, and offer opt-in monotonic entropy with parameterizable precision in a similar way as oklog/ulid. The ULID spec defines how ULIDs are represented in memory. It can suggest methods to produce them but it's ultimately agnostic to that behavior. All IMO, of course. |
Pull request addressing topics in this discussion here: |
Based on this deep dive into the probability of collision of ULID actually being lower with a monotonic approach, I would like to suggest the possibility of refactoring to remove the
monotonicFactory
as a separate option altogether and to make it the default when importingulid()
. This especially makes sense in a library that is targeted at running inside the context of a Cloudflare Worker isolate where time is "frozen" during the duration of a single request to thems
precision timestamp of the start time of the request.https://medium.com/zendesk-engineering/how-probable-are-collisions-with-ulids-monotonic-option-d604d3ed2de
This would:
ms
time value and only the random portions changes. In this situation (in every CF worker) the lexical ordering then also becomes random. Monotonic by default would ensure that all ULID's generated within a single worker request are still lexically ordered by time of generation even though the timestamp portion remains the same.This can be demonstrated as the current situation by comparing a group of ULID's created in a worker process. They all have the exact same first ten characters representing the timestamp in
ms
.https://ulid.truestamp.com/?q=10
The ULID Spec seems agnostic on this topic.
Cheers
The text was updated successfully, but these errors were encountered: