Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generics "sandwiched" between two modules don't mixin their scope symbols properly #11225

Closed
zah opened this issue May 10, 2019 · 22 comments
Closed

Comments

@zah
Copy link
Member

zah commented May 10, 2019

This problem is a bit tricky to understand, so please read carefully.

Let's imagine we have a library featuring generic functions that mixes in some symbols from the caller scope:

generic_library.nim

proc libraryFunc*[T](x: T) =
  mixin mixedIn, indirectlyMixedIn
  echo mixedIn()
  echo indirectlyMixedIn()

This library is used from another module that provides the needed symbols. mixedIn is defined locally, while indirectlyMixedIn is imported from yet another module:

module_using_generic_library.nim

import
  generic_library, helper_module

proc mixedIn: int = 100

proc makeUseOfLibrary*[T](x: T) =
  libraryFunc(x)

when isMainModule:
  makeUseOfLibrary "test"

helper_module.nim

proc indirectlyMixedIn*: int =
  200

So far, so good. If we compile module_using_generic_library, it will produce the expected output.

But let's introduce another module that imports and uses module_using_generic_library:

main.nim

import
  module_using_generic_library

makeUseOfLibrary "test"

If we try to compile it, we'll get the following error:

module_using_generic_library.nim(4, 6) Hint: 'mixedIn' is declared but not used [XDeclaredButNotUsed]
main.nim(4, 18) template/generic instantiation of `makeUseOfLibrary` from here
module_using_generic_library.nim(7, 14) template/generic instantiation of `libraryFunc` from here
generic_library.nim(3, 8) Error: undeclared identifier: 'mixedIn'

Why did this happen?

Please notice that the makeUseOfLibrary proc in the module_using_generic_library module is also generic. It is instantiated in main, but ultimately it consumes another generic defined in generic_library. Its scope is "sandwiched" between the main scope and the inner-most generic_library scope. At the moment mixin fails to recognise this and attempts to look for symbols in the outer-most scope (the one of main). This is highly surprising behavior for the user, because if you examine the lexical scopes, libraryFunc seems instantiated properly in the sandwiched module.

We've hit this problem multiple times during the development of nimcrypto, chronicles and nim-serialization. I've been asked to explain it on multiple occasions by almost everyone on our team, so it's an ongoing source of confusion and wasted time even for Nim experts. For this reason, I'm assigning high priority to the issue.

@Araq
Copy link
Member

Araq commented May 10, 2019

I'm not sure I'm buying that there is a bug here. When you write

import
  generic_library, helper_module

proc mixedIn: int = 100

proc makeUseOfLibrary*[T](x: T) =
  libraryFunc(x)

you're clearly aware that mixedIn is used for libraryFunc, so the "obvious" solution is to either export it here or to bind/mixin it for the libraryFunc inside makeUseOfLibrary. This doesn't "break the abstraction" or anything like that because there isn't any, makeUseOfLibrary does know about libraryFunc's requirements.

Notice how the casual reader can easily believe that proc mixedIn: int = 100 is dead code since it's not used anywhere and it lacks the export marker.

Or to put it differently: The mixin statement within a generic is in reality a requirement that should be captured by the parameter list and so it needs to be "repeated" / passed around in makeUseOfLibrary.

@Araq
Copy link
Member

Araq commented May 11, 2019

Thinking more about this issue ... your solution would imply that in a module that has a generic proc somewhere everything that is declared before the generic proc could be subject to some inferred mixin declaration and so the compiler would never even have a chance of telling you "deadProc is never used"! That's a worse design than the existing solution, all things considered.

Never mind, there is a way to do this and keeping the "deadProc is never used" analysis.

@zah
Copy link
Member Author

zah commented May 11, 2019

The real-world manifestations of this issue are much more WTF-ivoking than this simplified test case. I can point to more examples, but the two linked issues here are a good start.

To clarify the expected behavior, I think the compiler should determine the lexical scope at the position where libraryFunc is instantiated and symbols should be mixed in from that scope. The author of the sandwiched module is still free to extend the symbol search to another module by repeating the mixin statement. As I mentioned, the current behavior is surprising enough in practice that I've been summoned to help with related compiler errors by almost everyone on our team.

@Araq
Copy link
Member

Araq commented May 13, 2019

As I mentioned, the current behavior is surprising enough in practice that I've been summoned to help with related compiler errors by almost everyone on our team.

Yes, I read it and I do believe you. But it's a tradeoff here, just imagine the compiler could never reliably warn about unused symbols in e.g. asyncdispatch just because it contains a generic proc. And even if the compiler itself can figure it out (as I've said, I think I can make it work), it's totally suprising behaviour for every casual reader of the code.

@Araq Araq added the RFC label May 13, 2019
@Araq
Copy link
Member

Araq commented May 13, 2019

How about this solution:

# declare module-wise that these are mixins and also export them
mixin mixedIn*, indirectlyMixedIn*


proc libraryFunc*[T](x: T) =
  echo indirectlyMixedIn()

It would be a language change, but #7385 also indicates something like that is required.

Of course, in today's Nim this can be written as:

proc mixedIn*() = discard "overload one"
proc mixedIn*(x: typeof(nil)) = discard "overload two to make it a mixin"

@zah
Copy link
Member Author

zah commented May 14, 2019

I think top level mixins are necessary in general - both because it's typical to use the same mixins in multiple procs and also because at the moment there is no way to refer to mixed in symbols in the right-hand side of a type section:

type
  Foo[T] = object
    hashValue: type(hash(default(T))

I've encountered the need to use something like this on few occasions.

But the other part of the proposal makes me feel uneasy. My suggestion tries to honour the standard behavior of lexical scopes. The forwarding approach seems less sound due to the following reasons:

  1. The mixed in symbols I'm exporting are unnecessary pollution.

  2. The feature is still tricky to use - I must remember to export the right symbols, even when the additional mixin definitions are not necessary when compiling usages of the sandwiched generic inside the sandwiched module. I think focusing on this issue reveals the problem. How come it's reasonable for the same code to compile in one occasion, but to fail in another? In practice, I think even with this extra feature, I'll see the developers being tripped by the problem frequently.

@Araq
Copy link
Member

Araq commented May 14, 2019

The mixed in symbols I'm exporting are unnecessary pollution.

Well mixin itself is a bandaid, ideally you would write a concept for the generic code that covers the required procs and then symbol lookup within generic code can be precise.

The feature is still tricky to use.

Well generics with their two different interacting scopes are simply harder to write that non-generic code. However, your point is valid.

proc mixedIn: int = 100

proc makeUseOfLibrary*[T](x: T) =
  libraryFunc(x)

This is what it comes down to and maybe it shouldn't compile at all, regardless of the context? What is really done here is a form of implicit parameter passing libraryFunc(x, mixedIn) and we need to be able to write it down more explicitly IMO.

@zah
Copy link
Member Author

zah commented May 14, 2019

As far as I understand, your concern about the proposed behavior is that it will interfere with the ability of the user or the compiler to reason about unused symbols.

I think the analysis for the compiler is certainly possible. Every time a symbol gets mixed in at a call-site, this counts as an usage. Symbols with zero usages are still detected normally.

The problem for the user is perhaps a bit overstated. Why would anyone bother to manually hunt for unused symbols if the compiler provides accurate hints for them? If I am confused and I want to know where a particular symbol is used, I can still invoke the "find references" operation in my IDE and nimsuggest can output the call-sites where the symbol was mixed in as well.

@Araq
Copy link
Member

Araq commented May 15, 2019

This isn't a question about tooling, I do know how nimsuggest can figure it out.

Here is another solution: Enforce that mixin symbols are exported then the code becomes

proc mixedIn*: int = 100

proc makeUseOfLibrary*[T](x: T) =
  libraryFunc(x)

and everything is fine. IMO.

@zah
Copy link
Member Author

zah commented May 15, 2019

This gets more tricky in the presence of other modules and re-exports. In the current example, I must also re-export indirectlyMixedIn. If the chain of generic instantiations is longer, the re-exports must follow the entire chain of modules. If I happen to export an individual generic proc somewhere, I must also export the mixins needed by its transitive tree of dependencies.

Are you arguing that the current look-up rules are better or is this just some kind of "worse is better" argument, trying to avoid sinking time into implementing a fix?

@Araq
Copy link
Member

Araq commented May 15, 2019

Are you arguing that the current look-up rules are better or is this just some kind of "worse is better" argument, trying to avoid sinking time into implementing a fix?

I'm arguing that the current lookup rules have their merits, otherwise too much magic would be going on for my taste.

trying to avoid sinking time into implementing a fix?

Not really, but kind of yes, because I consider mixin a bandaid and concepts to be the real solution. I would rather see the effort spent on implementing concepts that actually can be type-checked.

Not attaching procs to types has numerous downsides, most of them were unknown to me when I designed Nim and that's what we should focus our attention on. Currently not even the generic caching that we do is sound, remember?

@zah
Copy link
Member Author

zah commented May 16, 2019

I'm arguing that the current lookup rules have their merits, otherwise too much magic would be going on for my taste.

I really don't understand what part of the proposed solution can be described as "too much magic". All I am saying is that the compiler should follow the standard lexical scope rules - the files and their textual structure determine what is visible at any given line. You can reason about these rules by examining only the local file and its imports and the influence of modules that import you is limited to explicitly visible mixin statements. By all objective ways to look at it, I would qualify this as "less magical" than the current rules.

Concepts don't really change the rule of the game that much when it comes to mixed in symbols. One way to look at them is that they introduce a group of implicitly mixed in symbols. We need to fix the caching mechanics as a separate effort regardless of how the current issue is fixed.

@zah
Copy link
Member Author

zah commented May 16, 2019

Today, after studying the compiler for a while, @mratsim was tripped by the same issue, but with a slight twist - this time the problem manifested as a run-time failure because a wrong overload was selected (see the referenced issue and fix above).

@Araq
Copy link
Member

Araq commented May 16, 2019

Ok, so what to do:

  • Inject mixin statements into "sandwiched" generics. (Silly, IMO.)

  • or -

  • Enforce that resolved mixin symbols are public. (Makes sense but no idea if that prevents your bugs.)

@krux02
Copy link
Contributor

krux02 commented May 17, 2019

@zah I discussed this problem with @Araq this morning. I do now understand the problem. But there was something in your example that bothered me, it is in this part:

proc libraryFunc*[T](x: T) =
  mixin mixedIn, indirectlyMixedIn
  echo mixedIn()
  echo indirectlyMixedIn()

What is bothering me here is the part that you have a generic parameter, but then you don't use it at all. It has nothing to do with the generic instantiation of the procedure. So I changed the problem a bit to this:

proc libraryFunc*[T](x: T) =
  mixin mixedIn, indirectlyMixedIn
  echo mixedIn(x)
  echo indirectlyMixedIn(x)

The parameter x is now used and the mixins are actually dependent of T. This is what I would call a legal mixin, but I am not sure if it still suits your problem or not.

My second question, is T really a string type in your real word example, or did your problem simplification reduce it to string but in your real program it is actually a custom type. I am asking, because for the solution that I have in my head, it actually matters if the type is declared in your module, or if the type is declared in the standard library.

I am currently exploring if argument dependent namespace lookup, a feature from c++ would be a good solution to this problem or not. Here is the slightly changed problem mapped to c++.

// ----------------------------------------
// genericlibrary.hpp

#include <cstdio>

// Note that in c++ mixedIn cannot be resolved for types T that are not in namespace.
// Also note that mixedIn isn't forward declared here at all. The compile will later resolve it to mylib::mixedIn, because MyType is also in the namespace mylib

template <typename T>
auto libraryFunc(T x) -> void {
  std::printf("mixed in: %d\n", mixedIn(x));
  std::printf("mixed in: %d\n", indirectlyMixedIn(x));
}

// ----------------------------------------
// helpermodule.hpp

// ----------------------------------------
// moduleusinggenericlibrary.hpp

// #include "genericlibrary.hpp""
// #include "helpermodule.hpp"

template <typename T>
auto makeUseOfLibrary(T x) -> void {
  libraryFunc(x);
}

// ----------------------------------------
// main.cpp

// #include "moduleusinggenericlibrary.hpp"

namespace mylib {

struct MyType {
  int a,b;
};

auto mixedIn(MyType arg) -> int {
  return 100;
}

auto indirectlyMixedIn(MyType arg) -> int {
  return 200;
}

}

auto main() -> int {
  mylib::MyType mt;
  makeUseOfLibrary(mt);
  return 0;
}

Mapped to Nim, this would mean that x.mixedIn would not just try to resolve mixedIn in the current scope, but Nim would also try to resolve mixedIn in the module of the type of x.

@zah
Copy link
Member Author

zah commented May 17, 2019

@krux02, my test case was artificially simplified in order to produce the most minimal reproduction. If you want to look at some real-world examples, the linked issues above could help. Here is another one.

You are right that there are some ADL-like properties in the real-world examples, but the picture is more complicated for the serialization library case, because there you must also support non-intrusive specialization for certain types. You probably remember the recent discussion about this in another RFC. Also, the ADL is not as clear-cut as you would hope - the "argument" may appear in any position in the call.

@krux02
Copy link
Contributor

krux02 commented May 18, 2019

@zah, I know that this issue is related to the $ overload discussion. But for the $ case I am pretty sure the ADL would solve a lot of use cases, because custom overloads of $ are usually made in the same module as the type of the argument.

But I am also not too happy with ADL. It does improve the situation for now, but I can also see that it is just a matter of time until we have extension modules and other complications in the language to address the shortcomings of ADL.

@Araq
Copy link
Member

Araq commented Jun 6, 2020

As a first step to mitigate the problem, bind should be made to work inside generics.

onqtam added a commit to status-im/nimbus-eth2 that referenced this issue Jul 30, 2020
mratsim added a commit to mratsim/constantine that referenced this issue Oct 10, 2020
- includes type system workaround: generic sandwich nim-lang/Nim#11225
- converting NimNode to typedesc: nim-lang/Nim#6785
mratsim added a commit to mratsim/constantine that referenced this issue Oct 10, 2020
* Implement a Sage codegenerator for frobenius constants

* Sage codegen for pairings

* Autogen of endomorphism acceleration constants

* The autogen fixed a copy-paste bug in lattice decomposition. We can use conditional negation now and save an add+dbl in scalar mul

* small fixes

* sage code for square root bls12-377 is not old

* readme updates

* Provide test suggestions for derive_frobenius

* indentation + add equation form to sage

* Sage test vector generator

* Use the json vectors
- includes type system workaround: generic sandwich nim-lang/Nim#11225
- converting NimNode to typedesc: nim-lang/Nim#6785

* Delete old sage code

* Install nim-serialization and nim-json-serialization in CI

* CI nimble install force yes
mratsim added a commit to status-im/nimbus-eth2 that referenced this issue Jan 6, 2021
mratsim referenced this issue in status-im/nimbus-eth2 Jan 26, 2021
mratsim added a commit to status-im/nimbus-eth2 that referenced this issue Jan 26, 2021
mratsim added a commit to status-im/nimbus-eth2 that referenced this issue Feb 5, 2021
@Araq Araq closed this as completed in 2f213db Mar 9, 2021
narimiran pushed a commit that referenced this issue Mar 18, 2021
* fixes #11225; generic sandwich problems; [backport:1.2]
* progress
* delegating these symbols must be done via 'bind'

(cherry picked from commit 2f213db)
narimiran pushed a commit that referenced this issue Mar 18, 2021
* fixes #11225; generic sandwich problems; [backport:1.2]
* progress
* delegating these symbols must be done via 'bind'

(cherry picked from commit 2f213db)
narimiran pushed a commit that referenced this issue Mar 18, 2021
* fixes #11225; generic sandwich problems; [backport:1.2]
* progress
* delegating these symbols must be done via 'bind'

(cherry picked from commit 2f213db)
narimiran pushed a commit that referenced this issue Mar 20, 2021
* fixes #11225; generic sandwich problems; [backport:1.2]
* progress
* delegating these symbols must be done via 'bind'

(cherry picked from commit 2f213db)
ringabout pushed a commit to ringabout/Nim that referenced this issue Mar 22, 2021
…lang#17255)

* fixes nim-lang#11225; generic sandwich problems; [backport:1.2]
* progress
* delegating these symbols must be done via 'bind'
ardek66 pushed a commit to ardek66/Nim that referenced this issue Mar 26, 2021
…lang#17255)

* fixes nim-lang#11225; generic sandwich problems; [backport:1.2]
* progress
* delegating these symbols must be done via 'bind'
narimiran added a commit that referenced this issue Apr 13, 2021
narimiran pushed a commit that referenced this issue Dec 7, 2021
* fixes #11225; generic sandwich problems; [backport:1.2]
* progress
* delegating these symbols must be done via 'bind'

(cherry picked from commit 2f213db)
mratsim added a commit to mratsim/constantine that referenced this issue Jul 12, 2024
* feat(bench): PoC of integration with zkalc

* feat(bench): zkalc prepare for adding pairing benches - generic type resulution issue

* feat(bench): add pairing and G2 bench for zkalc and nimble build script

* fix nimble dependencies

* feat(bench-zkalc): polish and output json bench file

* fix(nimble): version gate task-level dependencies

* fix(deps): std/os instead of commandline for 1.6.x and gmp@#head

* fix(deps): .nimble and nimble have different versioning format

* fix(zkalc): workaround generic sandwich bug in Nim 1.6.x nim-lang/Nim#8677 nim-lang/Nim#11225

* fix(zkalc CI): skip zkalc build on 32-bit CI as it needs clang multilib

* fix(zkalc CI): skip Windows as Clang throws fatal error LNK1107
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants