Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

APIs should have typesafe self-documenting time units, eg proc sleep(t: Duration), allowing sleep(1.millisecond) etc #383

Open
timotheecour opened this issue Jun 7, 2021 · 8 comments

Comments

@timotheecour
Copy link
Member

timotheecour commented Jun 7, 2021

as noted in nim-lang/Nim#17149 (comment), sleep is confusing:

  • posix.sleep takes seconds (as cint)
  • os.sleep takes milliseconds (as int)

this is confusing, not self documenting, error prone and not particularly typesafe

more generally, plenty of APIs have the same problem, where you "just have to know" and it's unclear what the unit is especially at callsite, eg: proc suspend(sleepTime: float = 0), proc wait(c: CoroutineRef; interval = 0.01), proc withTimeout[T](fut: Future[T]; timeout: int): owned(Future[bool])

=> sometimes it's seconds, sometimes milleseconds (can you guess?)

proposal

  • add proc sleep(t: Duration)
    example:
sleep(1'second)
sleep(1'millisecond)
sleep(100'millisecond + 1'second)
sleep(1.5'hour)
  • more generally, for old API's that take a duration param expressed as int (or float) with some implied unit (eg seconds, milliseconds etc), add an overload that takes a Duration and simplify its documentation which then doesn't need to specify the unit
  • for new APIs, just use Duration params, eg in Fix for #18161 AsyncHttpServer Issue: Too many open files, against devel branch. Nim#18205, instead of keepaliveTimeout = 3600, it should be: keepaliveTimeout = 3600.second (or equivalently, 1.hour)

note: std/durations should be split off from std/times

std/times is a heavy dependency, ideally we'd be able to do this without depending on std/times, refs #55, via a new std/durations module (imported and re-exported by std/times); this can be done independently of this RFC

note: Duration, not TimeInterval

TimeInterval is not good, this has overhead because it's calendar based and it's the wrong abstraction for expressing durations anyways.
we can define millisecond, second (etc) without s to avoid confusion with the existing milliseconds + friends which are (rather sadly) returning TimeInterval (eg https://nim-lang.github.io/Nim/times.html#milliseconds%2Cint); there is no confusion though, we're not using s.

eg:

proc second*(a: int | float): Duration
proc millisecond*(a: int | float): Duration
# etc

# maybe also, with new literal syntax, but not needed for this RFC:
proc `'second`*(a: string): Duration

note: thin wrappers (eg posix.sleep) are excluded from this RFC

links

@timotheecour timotheecour changed the title APIs should have typesafe self-documenting time units: proc sleep(t: Duration), allowing sleep(1.millisecond) etc APIs should have typesafe self-documenting time units, eg proc sleep(t: Duration), allowing sleep(1.millisecond) etc Jun 7, 2021
@timotheecour timotheecour changed the title APIs should have typesafe self-documenting time units, eg proc sleep(t: Duration), allowing sleep(1.millisecond) etc APIs should have typesafe self-documenting time units, eg proc sleep(t: Duration), allowing sleep(1.millisecond) etc Jun 7, 2021
@Araq
Copy link
Member

Araq commented Jun 7, 2021

Not that I disagree with the RFC, but it's getting tiresome to read things like:

this is confusing, not self documenting, error prone and not particularly typesafe

You know perfectly well that APIs are inconsistent because they were written at different times by different people. And that wrappers are more restricted than idiomatic Nim code. I don't write os.sleep thinking about how it might bite with posix.sleep, I don't care about the posix module at all for you shouldn't use it and if you do, you need to understand it's platform specific, unidiomatic Nim code. More importantly, when I wrote os.sleep the Duration type didn't exist. And when the Duration type was designed we lacked user definable numeric suffixes.

more generally, plenty of APIs have the same problem

Exactly. Because code was not written after every eventuality has been considered and the solution has been compared to the 100 existing, similar solutions. Code was written, "hey it works for my project and it's generally useful" and then submitted to the stdlib. IIRC you sometimes do the same thing... ;-)

New code is added all the time that is designed more carefully and it's a good thing. So yeah, I completely agree with this RFC, I just wish it would avoid the constant punching at we already have. Last time I checked, people called os.sleep successfully regardless of its suboptimal design.

@timotheecour
Copy link
Member Author

timotheecour commented Jun 7, 2021

I think you're reading too much between the lines, this RFC isn't meant as a criticism of some APIs (even less so of their authors), its only goal is: today, pattern B is better than pattern A (because of the reasons given above eg self documenting, which i have to give in order to justify the RFC), so let's do B from now on. Whether A was the best choice at the time is entirely possible but not in contradiction.

The other aspect (more controversial) is that best practices should be preferred over consistency with surrounding code, otherwise silos persist.

written at different times by different people

that's the reason for such RFCs, discuss/establish/update existing best practices to make them easier to follow

understand it's platform specific

yes; this RFC doesn't cover thin wrappers like posix.sleep, I've updated RFC to reflect that

are duration suffixes really needed?

And when the Duration type was designed we lacked user definable numeric suffixes.

Actually I'm not sure we need numeric suffixes here unless I'm missing something. 12.second is probably good enough instead of requiring 12'second; numeric suffixes are useful primarily in cases where user may write a literal that doesn't fit (eg bigints as in std/jsbigints); if Duration gets re-implemented as int128 (on platforms supporting it), then we could have: let duration = 9999999999999999999'i128.nanosecond, all without introducing duration-specific suffixes.

The problem with suffixes here is that if you add:

# in std/durations:
proc `'millisecond`*(a: string): Duration =
  # in pseudo-code
  try: # handle 1234'second, avoiding loosing FP conversion accuracy issues
    result = toDuration(a.parseInt, kSecond)
  except ValueError: # handle 1.5'second
    result = toDuration(a.parseFloat, kSecond)

you'd still need to allow conversions from int|float, eg:

let t = 1000
let t2 = t.second

so it doesnt' really resolve anything

@juancarlospaco
Copy link
Contributor

I wonder if a std/units should exist too,
for basic unit ops without converting twice to/from int/float,
kinda 1.Hours + 55.Minutes or $(42.Microseconds) etc

@timotheecour
Copy link
Member Author

timotheecour commented Jun 19, 2021

I wonder if a std/units should exist too,

I've implemented std/units in a private branch, see timotheecour/Nim#760 which allows user defined units as distinct float and automatically computes arbitrary powers from those in expressions, eg:

runnableExamples:
  type
    Length = "m".makeUnit # can be also defined from other powers of other units
    Time = "s".makeUnit
    Mass = "kg".makeUnit
  assert $(Time(1.2)/Length(3.2)*Mass(5.4)*Mass(5.4)) == "10.935's*1/m*kg²"

but that'd require a separate RFC and isn't needed for this RFC (in particular, durations should be integral, not FP, whereas units are more useful as FP in most applications)

for basic unit ops without converting twice to/from int/float,

what do you mean twice? the conversion should be 1-way and propagated throughout stdlib until the point where it's consumed by final OS call into the unit it needs; the goal would be to only consume Duration through and through.

The only problem I'm facing to implement this is that I'm not sure the implementation of Duration is the right one, as it uses this:

type Duration* = object
  seconds: int64
  nanosecond: range[0..999_999_999]

which doesn't lend itself for efficient operations.

So I'm instead considering introducing a different type to represent durations that doesn't incur performance penalty, eg:

# in system:
type int128* {.magic: Int128.}
  # in C backend, uses `__int128` if `#ifdef __SIZEOF_INT128__` is defined,
  #   else falls back to software emulation as implemented in compiler/int128.nim
  # in VM, uses compiler/int128.nim
  # in js, uses jsbigints or a wrapper around it if we care about preserving bounds checking

# in std/durations:
type TimeUnit* = distinct int128
proc second*(a: int | float): TimeUnit = ... # now the math is simple and efficient

The name TimeUnit can be bikeshedded, eg: Nanoseconds, Timespan, Dur etc...

The alternative would be to use int64 but it can "only" represent ranges +- 292 years in nanoseconds, which can be a problem if doing math involving calendar year (eg 2021) + offset operations.

note

std/bigints would not be a viable replacement for int128, for 2 reasons:

  • performance
  • dependencies

ie, both should be added to stdlib
it's not an infinite recursion argument either, int128 "solves" time range precision for all practical matters (and other applications too), whereas int256 would be more niche

int would still be int64 even if we added int128 (as for BiggestInt, I'm not sure, TBD)

=> #399

@juancarlospaco
Copy link
Contributor

that'd require a separate RFC

Do it.

@Vindaar
Copy link

Vindaar commented Jun 19, 2021

I wonder if a std/units should exist too,
for basic unit ops without converting twice to/from int/float,
kinda 1.Hours + 55.Minutes or $(42.Microseconds) etc

https://github.com/SciNim/Unchained

I'm obviously a fan of this, but not sure if it should be a standard library feature to be honest.

@c-blake
Copy link

c-blake commented Jul 26, 2021

I think the distinction between Duration and Time is dubious at best..more about ceremony than utility. Every time has an origin/epoch/offset. It's just "standard" while Duration is more "user/context defined". Even something as simple as unit conversion via division by a billion might apply equally to Time and Duration to get from nanoseconds from the epoch to seconds, for example. Maybe when two types have identical representations that is a clue..

I believe OSes going to timeval with seconds, microseconds, long before timespec with nanoseconds, was done because so many extant apps only used seconds and ignored microseconds, like anything using the ANSI C time_t type, like the ANSI C struct tm stuff in particular. Heck, they even kept the tv_ prefix in timespec instead of changing it to ts_ (though they changes "usec" to "nsec"!).

Nim is not as constrained as the C world and this already talks of specifying units in API calls. I think that strongly motivates a dense valued int64 nanoseconds since the Unix/C Epoch. 128-bits is not necessary - I disagree that exact arithmetic for > +-292 years is a realistic problem and should not hinder this clean-up/be tied together.

I also think Duration = Time works fine. Maybe just don't define * and / for either. Instead make call sites select what kind of non-additive, non-unit-converting arithmetic they might want. That will usually be float64 arithmetic, but adding a .float or using lenientops is good documentation/expression of intent. This is a sort of "alternate simplification" than splitting into a std/times & std/duration modules that was raised. At the very least, if they are both int64 the degree of code condensation will be so extreme two separate modules will probably seem unnecessary.

I'm just trying to help here. I think there is a useful simplification/clean-up of times to be had, though date & time modules are pretty cursed.

@c-blake
Copy link

c-blake commented Jul 26, 2021

I don't think being burned by 2038/etc should generate an irrational fear to provision calculation "beyond the heat death of the universe to the nanosecond". :-) Why, Gregorian calendar adoption itself is only barely outside -292 years and we've been adding more and more "leap nanoseconds" as well (as tides slow the Earth's rotation).

To the extent crazy long-term calendar math does happen, those programmers have (or should have) their own 20 line of code library with full Julian-Gregorian calendar conversion. This is what astronomers do. They calculate in units of days not dates, use the Julian Calendar, and then convert, possibly splitting day & time of day calculations if they think they can get some calculation down to minute-accuracy. I don't see why this isn't a fine solution for people with "astronomer like time ranges". In fact, I would expect them to expect to need it, not rely on any prog.lang's stdlib time module. :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants