Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dramatically faster to import std.datetime (and anything under it) #5948

Merged
merged 1 commit into from
Dec 26, 2017

Conversation

andralex
Copy link
Member

This has been an iterative process of pushing imports from the top level into the places where imports are being used. That helped not only clarifying dependencies, but also accelerates import time by about 3x. Further improvements are possible.

@dlang-bot
Copy link
Contributor

Thanks for your pull request, @andralex!

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

@quickfur
Copy link
Member

Very nice! Hooray for local imports. They are an overall win.

You could also reference https://issues.dlang.org/show_bug.cgi?id=13253 in one of your commits; it's a useful way to keep track of improvements that we've made in turning more imports local.

Copy link
Member

@wilzbach wilzbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Windows doesn't like this yet:

std\datetime\systime.d(9304): Error: undefined identifier `msecs`
std\datetime\timezone.d(3038): Error: undefined identifier `DateTimeException`
std\datetime\timezone.d(3081): Error: undefined identifier `DateTime`
std\datetime\timezone.d(3086): Error: undefined identifier `Month`
std\datetime\timezone.d(3092): Error: undefined identifier `Month`
std\datetime\timezone.d(3101): Error: undefined identifier `Month`
std\datetime\timezone.d(3106): Error: undefined identifier `Month`
std\datetime\timezone.d(3125): Error: undefined identifier `DateTime`
std\datetime\timezone.d(3158): Error: template instance convert!("minutes", "hnsecs") template 'convert' is not defined
std\datetime\timezone.d(3160): Error: template instance convert!("minutes", "hnsecs") template 'convert' is not defined
std\datetime\timezone.d(3170): Error: undefined identifier `DateTime`
std\datetime\timezone.d(3234): Error: undefined identifier `DateTime`
std\datetime\timezone.d(3256): Error: template instance convert!("minutes", "hnsecs") template 'convert' is not defined
std\datetime\timezone.d(3262): Error: template instance convert!("minutes", "hnsecs") template 'convert' is not defined

@@ -1385,6 +1400,9 @@ public:
this(Duration utcOffset, string stdName = "") @safe immutable pure
{
// FIXME This probably needs to be changed to something like (-12 - 13).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the imports should be above this comment?

@wilzbach
Copy link
Member

BTW during my work on this, I noticed a couple of things:

  • We shouldn't be doing this by hand. At least the conversion from full imports to selective imports could be automated
  • If we would have lazy selective imports, then all this work isn't needed and from a naive point of view, this shouldn't be too hard? The compiler knows where he can find a symbol and only needs to so if he really needs that symbol resolved...
  • There's currently no way to make imports in aggregated types lazy (except for using Voldemort types, but that would be a breaking change for public symbols). Example:
struct Foo
{
    import std.internal.unicode_tables; // not lazy, but ok
    import std.internal.unicode_tables : SimpleCaseEntry; // not lazy either :/
}

@wilzbach
Copy link
Member

continuous-integration/jenkins/pr-merge — This commit cannot be built

Due to a vibe.d test that hangs constantly (not related to this PR) -> dlang/ci#96

@jmdavis
Copy link
Member

jmdavis commented Dec 20, 2017

I don't see anything obviously wrong, though it's not surprising if the Windows build broke in the process given that std.datetime.timezone contains Windows-specific code. That's either going to have to be dealt with on a Windows system or risk a fair bit of back and forth with the auto-tester.

I confess thought that this PR helps show why I hate local and selective imports. They help with performance at least some of the time (apparently quite a bit here), so they can be necessary (especially for Phobos, which is imported by everything), and being more explicit about what uses which symbols might help some of the time, but man, the code just gets littered with import statements, and whenever you make changes, you frequently have to muck with the various selective imports instead of just writing code. It would be so much nicer if we could just have the imports at the top without bothering with selective or local imports, but at least right now, the compiler isn't able to do that as efficiently.

@andralex
Copy link
Member Author

I confess thought that this PR helps show why I hate local and selective imports. They help with performance at least some of the time (apparently quite a bit here), so they can be necessary (especially for Phobos, which is imported by everything), and being more explicit about what uses which symbols might help some of the time, but man, the code just gets littered with import statements, and whenever you make changes, you frequently have to muck with the various selective imports instead of just writing code. It would be so much nicer if we could just have the imports at the top without bothering with selective or local imports, but at least right now, the compiler isn't able to do that as efficiently.

I'm actually in two minds about this. I am old enough to have seen all of these arguments made earnestly (roughly in chronological order):

  • "These local variables are so tedious. You mean I need to define i with a type again and again whenever I need to loop over something? I wish I wouldn't need to bother with locals, globals are already there when you need them."
  • "You mean now some random statement can introduce a new variable? Wish I wouldn't need to bother with that, I prefer it when all variables in a function are nicely declared at the top of the function."
  • "With virtuals there's no way to tell what code gets executed. Such code is impossible to test!"
  • "You mean every function call can throw an exception so no more single exit? That's the end of civilization!"

... you get my drift. Local imports are a new technology. I know of no language that does them similarly to the way D does them. It is expected that new technology comes with new advantages, new liabilities, and generally changes in style that are not universally liked.

This PR has been cathartic. Yeah, imports at the top are there for the whole module, but... they're there for the whole module. And I mean all 11,000 lines of it. Initially I thought I'd never figure it out - cutting these mutually recursive imports is surprisingly difficult because compile-time errors don't always give the right information, e.g. when CTFE and constraints are involved. (I will look into filing some bugs.) With this PR it's way more clear what depends on what, where the large dependency knots are, etc.

It may be the case that global imports will become seen as okay for small projects, an antipattern for large programs.

@jmdavis
Copy link
Member

jmdavis commented Dec 20, 2017

Yeah, there are pros and cons, and in larger projects, they matter more. I even use them quite a bit even though I hate them. It's just that they get so tedious and repetitive, and when templates aren't involved, there really isn't going to be a performance boost from local imports, since the imports will happen anyway. It's just that it then helps show the dependencies in a more fine-grained manner, which can help avoid importing stuff when you don't need to after refactoring. But if there are a lot of duplicate imports, I'm quickly inclined to put the imports at the top.

In some ways though, the selective imports are far worse than the local imports, because they require more maintenance - and it's particularly bad if you go so far as to use selective imports for the range primitives given how many you end up using for even simple code that uses arrays as ranges. std.range.primitives is one that I pretty much never use selective imports for. In the end, these seem like a prime target for an automated tool, which isn't exactly a good sign. In theory, we're trying to avoid needing extra tools to make code manageable and avoid boilerplate.

If the compiler were smarter and did more imports in a lazy fashion, we could at least mitigate the need to do this for performance, but that still leaves all of the arguments about how much it improves things in terms of how it clarifies dependencies and whether that's worth all of the code duplication.

In any case, as with function attributes, this is one area where good, idiomatic D code seems to be really verbose with the stuff that isn't actual code. And I just can't look at a module with tons of import statements with many of them followed by whole lists of symbols being selectively imported and not get annoyed at how verbose and repetitive it is - especially when the verbosity isn't actually required by the language.

@quickfur
Copy link
Member

quickfur commented Dec 20, 2017

I've been slowly shifting away from global imports in my own code and moving towards local imports. Just like the old C++ innovation of declaring variables close to where you actually need them vs. at the top of the function like older forms of C, I've been finding that local imports are much more well-encapsulated, esp. if put close to or even next to the line where it's actually needed, and blocks of code written this way can be moved around easily without having to constantly fiddle with updating import statements at the top of the scope / module.

Like manually typing long type names in (older forms of?) C++, though, there is definitely a noticeable maintenance cost to it. Which makes me wonder if some import analogue of type inference might be the next step forward. Like some kind of lazy import or auto import. But presently I can't think of any concrete realization of this idea that doesn't either just send us straight back to square one (i.e., declaring dependent modules somehow at the top of the file / start of scope, etc.), or has significant implementational difficulties.

But maybe somebody can take this idea and run with it, and get somewhere new. :-)

@andralex
Copy link
Member Author

when templates aren't involved, there really isn't going to be a performance boost from local imports, since the imports will happen anyway.

Nonono. Careful here, I also was confused by that. When you import module xyz, only top-level imports and imports inside aggregates (struct/class/union) are opened. All others are NOT opened. That includes imports inside non-template functions.

@MartinNowak
Copy link
Member

MartinNowak commented Dec 21, 2017

... you get my drift. Local imports are a new technology. I know of no language that does them similarly to the way D does them. It is expected that new technology comes with new advantages, new liabilities, and generally changes in style that are not universally liked.

Yes, that's how things go. Also new things are often introduced with a more verbose syntax and get simplified later on (e.g delegate (arg) { return expr; } => arg => expr).

In that sentiment, now that things get more adopted, maybe it'd be worth to think about some shorter syntax for local imports. dlang/dmd#6589
I'm personally not a fan of one line per imported module.

Other candidates:
() @trusted { stmt; }(); => @trusted { stmt; }
immutable => imm or let

Not that urgent and a great bike-shedding topic though.

@@ -8560,6 +8630,7 @@ public:
// bug# 17801
@safe unittest
{
import core.time;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about a top-level version (unittest) import core.time; instead? Maybe even just a selective one?
It's indeed annoying that privately imported modules are resolved eagerly, albeit none of their symbols might ever be needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered it but am weary of the bugs that could happen. Consider:

version(unittest) import std.stdio;
void fun(T)(T value)
{
    writeln(value);
}
unittest
{
    fun(5);
}

This module compiles on its own, passes unittests, is importable when not used, but issues an error whenever the user attempts to use fun. I think we should make it a guideline that version(unittest) top-level imports should be used with caution.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: At least if it's a publicly documented test, this can't happen as we execute them separately exactly for the reason to protect against these kinds of bugs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should make it a guideline that version(unittest) top-level imports should be used with caution.

It's more or less been a guideline for a while now, precisely because it caused problems in the past. I don't know that we've actually written it down anywhere though, just told people not to do it when it comes up in PRs.

@andralex
Copy link
Member Author

@MartinNowak good points. I think we should in the short term move forward with dlang/dmd#6589. No need for a DIP as long as we're removing a limitation.

@andralex
Copy link
Member Author

ping?

Copy link
Member

@quickfur quickfur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a relatively straightforward change with good benefits, and the autotester is green. I say we should just merge.

import std.range.primitives;
import core.time : abs, convert, dur, Duration, hours, minutes;
import std.datetime.systime : Clock, stdTimeToUnixTime, SysTime;
import std.range.primitives : back, front, empty, popFront;
Copy link
Contributor

@joakim-noah joakim-noah Dec 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As Seb pointed out, this could break code because of the selective import symbol leak laid out in Issue 17630 and demonstrated in #5584, which is unfixed according to bugzilla. We cannot push all D code to selective imports until all known leaks are fixed, without breaking code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've backed off this part of the PR and left a note for the future.

@andralex
Copy link
Member Author

squashed, will put up for auto-merge

@andralex
Copy link
Member Author

fixed a couple more imports

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants