Datasource eval breakdown #12615
Conversation
It's not clear why this value was defined and returned: it was set but never used, so it is not useful to keep in the code. Let's simplify this now.
Since datasources are evaluated recursively according to their dependencies, we don't need to pre-execute those that depend on nothing; the recursive traversal of the datasources will take care of that for us.
When a datasource fails to be evaluated because a cycle has been detected, we now point out one of the links of the chain so that users have a better idea of where to look.
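To illustrate the idea, here is a minimal sketch of how such a cycle could be detected and reported during recursive evaluation. The types and the `executeDatasource` method are hypothetical and simplified, not the exact code of this PR:

```go
package sketch

import (
	"fmt"

	"github.com/hashicorp/hcl/v2"
)

// Hypothetical, simplified types for the sake of the example.
type DatasourceRef struct{ Type, Name string }

type DatasourceBlock struct {
	Dependencies []DatasourceRef
	executed     bool
}

type PackerConfig struct {
	Datasources map[DatasourceRef]*DatasourceBlock
}

// executeDatasource executes the dependencies of a datasource before the
// datasource itself. A reference already marked as "in progress" that is
// reached again closes a cycle; reporting that reference gives users one
// link of the chain to look at.
func (cfg *PackerConfig) executeDatasource(ref DatasourceRef, inProgress map[DatasourceRef]bool) hcl.Diagnostics {
	if inProgress[ref] {
		return hcl.Diagnostics{&hcl.Diagnostic{
			Severity: hcl.DiagError,
			Summary:  "Cyclic dependency between datasources",
			Detail:   fmt.Sprintf("datasource %s.%s is part of a dependency cycle", ref.Type, ref.Name),
		}}
	}

	ds := cfg.Datasources[ref]
	if ds.executed {
		return nil
	}

	inProgress[ref] = true
	defer delete(inProgress, ref)

	for _, dep := range ds.Dependencies {
		if diags := cfg.executeDatasource(dep, inProgress); diags.HasErrors() {
			return diags
		}
	}

	// The real execution of the datasource plugin would happen here.
	ds.executed = true
	return nil
}
```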
Datasources use their attributes' expressions to determine whether they depend on another datasource, so we can build the list of dependencies and execute them before executing a datasource. This code may be useful later for figuring out the dependencies of any block, so we move it to the utils.go file and use it for datasources.
The previous implementation of the GetVarsByType function worked only on top-level attributes, ignoring the nested blocks in the structure. This means that if a datasource depends on another through an expression within a nested block, we may not detect the dependency, and could end up executing the datasource before the one it depends on, resulting in an error. This commit is an attempt at making this more reliable for HCL configs, but it only works on configs lifted from HCL files for now; we need to make this more reliable in later iterations.
In previous versions of Packer, the parser would load an HCL file (in either HCL or JSON format) and process it in phases. This meant that if a file was not valid in the first place, we would get some duplicate errors. Another issue is that source and build blocks were not parsed at the beginning of the parsing phase at all, but were discovered later in the process, once the data sources had been executed and the variables had been evaluated.

This commit changes the parsing to happen all at once: we parse the files with the parser, which will fail should any file be malformed, then we visit each body once using the top-level schema for a Packer build template, handling misplaced blocks in the process. Then, since the build and source blocks may contain dynamic blocks that need to be expanded once their context is ready (i.e. the variables/datasources they depend on), we expand the build block for them and populate their respective objects. This last step should also be adapted for datasources later, as they may need to support dynamic blocks and be expanded lazily.
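As an illustration of the single-pass approach, here is a rough sketch using HCL's parser and a top-level body schema. The schema contents and the function are hypothetical and simplified; the block names may not match Packer's exact schema:

```go
package sketch

import (
	"github.com/hashicorp/hcl/v2"
	"github.com/hashicorp/hcl/v2/hclparse"
)

// Hypothetical top-level schema for a Packer template.
var packerTopLevelSchema = &hcl.BodySchema{
	Blocks: []hcl.BlockHeaderSchema{
		{Type: "packer"},
		{Type: "variable", LabelNames: []string{"name"}},
		{Type: "locals"},
		{Type: "data", LabelNames: []string{"type", "name"}},
		{Type: "source", LabelNames: []string{"type", "name"}},
		{Type: "build"},
	},
}

// parseAllAtOnce parses every file up-front, then visits each body once
// with the top-level schema, reporting malformed files and misplaced
// blocks in a single pass instead of rediscovering them in later phases.
func parseAllAtOnce(filenames []string) ([]*hcl.Block, hcl.Diagnostics) {
	parser := hclparse.NewParser()

	var diags hcl.Diagnostics
	var blocks []*hcl.Block
	for _, filename := range filenames {
		file, parseDiags := parser.ParseHCLFile(filename)
		diags = diags.Extend(parseDiags)
		if parseDiags.HasErrors() {
			continue
		}
		content, contentDiags := file.Body.Content(packerTopLevelSchema)
		diags = diags.Extend(contentDiags)
		blocks = append(blocks, content.Blocks...)
	}
	return blocks, diags
}
```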
The `Datasources' type holds a map of datasource references to their respective block. In this map, the blocks were structures rather than pointers to structures, which makes updating their contents harder than it should be. To allow in-place changes to the datasource blocks, we change the type of the blocks to be a pointer instead.
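Reusing the hypothetical types from the sketch above, the difference boils down to the map's value type; a pointer lets us update a block without writing it back into the map:

```go
// With value entries, mutating a block means copying it out, changing it,
// and storing it back into the map.
type DatasourcesByValue map[DatasourceRef]DatasourceBlock

// With pointer entries, the block can be updated in place.
type Datasources map[DatasourceRef]*DatasourceBlock

func markExecuted(dss Datasources, ref DatasourceRef) {
	if ds, ok := dss[ref]; ok {
		ds.executed = true // updates the shared block directly
	}
}
```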
Since we want to allow for independently executing datasources when needed, we break down the logic to do so. Each datasource will now hold a list of dependencies. For now, these dependencies are all datasources themselves, but this may change in the future. Then, the actual execution of the datasource is self-contained within a method of the structure, so that we can execute them individually. Finally, we replace the current sequential execution with a method on the PackerConfig that adopts an approach akin to local variable evaluation: we try to execute all the datasources one-by-one, and if one cannot be executed yet, we move along and retry later. We repeat this loop until either all the datasources have been executed, or we reach a terminal state where the remaining datasources cannot be evaluated because of unresolved dependencies and no progress was made during an iteration.
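Continuing the same hypothetical sketch, the retry loop could look roughly like this; `dependenciesSatisfied` and `execute` are stand-ins for the dependency check and the per-datasource execution method described above:

```go
// executeDatasources tries, on every pass, to execute each datasource whose
// dependencies are already satisfied. The loop stops once everything has
// run, or once a full pass makes no progress, meaning the remaining
// datasources have dependencies that cannot be resolved.
func (cfg *PackerConfig) executeDatasources() hcl.Diagnostics {
	var diags hcl.Diagnostics
	for {
		progressed, remaining := 0, 0
		for _, ds := range cfg.Datasources {
			if ds.executed {
				continue
			}
			if !dependenciesSatisfied(cfg, ds) {
				remaining++
				continue
			}
			diags = diags.Extend(execute(ds))
			ds.executed = true
			progressed++
		}
		if remaining == 0 {
			return diags
		}
		if progressed == 0 {
			return diags.Append(&hcl.Diagnostic{
				Severity: hcl.DiagError,
				Summary:  "Cannot execute all datasources",
				Detail:   "some datasources still have unresolved dependencies",
			})
		}
	}
}

// dependenciesSatisfied reports whether every dependency has already run.
func dependenciesSatisfied(cfg *PackerConfig, ds *DatasourceBlock) bool {
	for _, dep := range ds.Dependencies {
		if d, ok := cfg.Datasources[dep]; !ok || !d.executed {
			return false
		}
	}
	return true
}

// execute is a stub standing in for the real per-datasource execution.
func execute(ds *DatasourceBlock) hcl.Diagnostics { return nil }
```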
In some cases, rather than walking through a full block of HCL to lift the references to other components of a build, we only have an expression to walk. From this expression, we can lift the variables as a collection of hcl.Traversal, from which we then keep only those that interest us. Since this is a use case, we split the function that lifts the expressions from an HCL block and filters them based on the root type of the traversal into two separate functions.
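A possible shape for the two resulting helpers, assuming the `hashicorp/hcl/v2` package; the names here are illustrative, not the PR's actual function names:

```go
package sketch

import "github.com/hashicorp/hcl/v2"

// traversalsFromExpression lifts every variable reference an expression
// contains, as a collection of hcl.Traversal.
func traversalsFromExpression(expr hcl.Expression) []hcl.Traversal {
	return expr.Variables()
}

// filterTraversalsByRoot keeps only the traversals rooted on the given
// name, e.g. "data" for datasources or "local" for local variables.
func filterTraversalsByRoot(traversals []hcl.Traversal, root string) []hcl.Traversal {
	var filtered []hcl.Traversal
	for _, t := range traversals {
		if t.RootName() == root {
			filtered = append(filtered, t)
		}
	}
	return filtered
}
```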
As with datasources, we break down the local variable evaluation into two functions: one evaluating a single local variable, and the other performing the sequential/recursive evaluation of all the local variables lifted from the Packer template. This, plus dependency management, will let us reuse these components later for another evaluation strategy, reusing as much as possible.
Local variables can't have a validation block in their definition, so this step is not useful and should be removed. Besides, since the validation was done on the local variables before evaluation, it did nothing at all, as the PackerConfig.LocalVariables collection only gets populated during evaluation; this is essentially a no-op and can be safely removed.
Moving to draft; this will be where I push the scheduler code as I work on introducing it to the main branch. The other PRs that it depends on can be reviewed and merged since they fix some issues with the current versions of Packer, but this one is essentially setting the stage for the DAG. As such, it may be moved to another branch than main, but should be kept up to date as much as possible.
Since hcl.Diagnostics supports extending it through the `Extend' method, we might as well use it instead of manually appending a series of diagnostics to it.
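For the record, the two equivalent forms, assuming both arguments are `hcl.Diagnostics`:

```go
package sketch

import "github.com/hashicorp/hcl/v2"

// appendManually is the pattern being replaced: each diagnostic is
// appended individually.
func appendManually(diags, moreDiags hcl.Diagnostics) hcl.Diagnostics {
	for _, diag := range moreDiags {
		diags = append(diags, diag)
	}
	return diags
}

// extend does the same thing in a single call.
func extend(diags, moreDiags hcl.Diagnostics) hcl.Diagnostics {
	return diags.Extend(moreDiags)
}
```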
Force-pushed from c91b2c7 to 57cf6b8
Since these constants and functions will later be useful for extracting the sequential logic into a scheduler, we change their visibility now.
When processing builds, we convert build blocks/sources into a series of CoreBuild, which contains the sequence of provisioners and post-processors to run, and the builder to use to boot the instance and create the image. This conversion used to be completely sequential and orchestrated by the `GetBuilds' function. It gets broken down into a `ToCoreBuild' function on the build block, whose role is to lift and filter out (based on --only/--except) the core builds from a valid build block, and this gets invoked from `GetBuilds'. This breakdown is useful preparation for the DAG's introduction to the code later on.
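Very roughly, the split could look like the following; every type, field, and function here is hypothetical and heavily simplified compared to Packer's real CoreBuild and option handling:

```go
package sketch

// Hypothetical, simplified stand-ins for the real structures.
type CoreBuild struct {
	BuilderType string
}

type SourceRef struct {
	Type, Name string
}

type BuildBlock struct {
	Sources []SourceRef
}

// ToCoreBuild lifts the core builds out of a single build block, filtering
// them with a predicate that stands in for the --only/--except options.
func (b *BuildBlock) ToCoreBuild(keep func(SourceRef) bool) []*CoreBuild {
	var builds []*CoreBuild
	for _, src := range b.Sources {
		if !keep(src) {
			continue
		}
		builds = append(builds, &CoreBuild{BuilderType: src.Type})
	}
	return builds
}

// GetBuilds now only orchestrates: it iterates over the build blocks and
// aggregates what each of them lifted.
func GetBuilds(blocks []*BuildBlock, keep func(SourceRef) bool) []*CoreBuild {
	var builds []*CoreBuild
	for _, block := range blocks {
		builds = append(builds, block.ToCoreBuild(keep)...)
	}
	return builds
}
```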
When a user specifies a --only/--except option on their command line and it turns out to be unused, Packer signals it with a Warning diagnostic. In previous versions of Packer, this diagnostic would only state that one option did not match anything, without stating which option did not match a build or post-processor. This commit tracks the usage of each option so we can later report in this warning diagnostic which of those options was not used.
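One way to track this, sketched with hypothetical names and glob matching via filepath.Match (which may not be exactly how Packer matches these options):

```go
package sketch

import (
	"fmt"
	"path/filepath"

	"github.com/hashicorp/hcl/v2"
)

// warnUnusedFilters records which --only/--except patterns matched at
// least one build name, and emits one warning per pattern that matched
// nothing, so users know exactly which option was ignored.
func warnUnusedFilters(patterns, buildNames []string) hcl.Diagnostics {
	matched := make(map[string]bool, len(patterns))
	for _, pattern := range patterns {
		for _, name := range buildNames {
			if ok, _ := filepath.Match(pattern, name); ok {
				matched[pattern] = true
				break
			}
		}
	}

	var diags hcl.Diagnostics
	for _, pattern := range patterns {
		if !matched[pattern] {
			diags = diags.Append(&hcl.Diagnostic{
				Severity: hcl.DiagWarning,
				Summary:  "Unused --only/--except option",
				Detail:   fmt.Sprintf("the option %q matched no build or post-processor", pattern),
			})
		}
	}
	return diags
}
```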
Since those arguments will need to be updated from outside the hcl2template package, we change their visibility from private to public.
Since the initialization code was embedded in the sequential logic that we'll be moving away from soon, we move that to the build block itself, so we're able to invoke it in any order later on.
Since these will be used by the schedulers' code, we need to make them available from outside the `packer' package, so we make them public.
Since this will need to be called from elsewhere later, we make this function public, and since it relies on data from the config being passed in at the call site, we may as well make it a PackerConfig method instead.
Since printing diagnostics to users requires a map of filename to HCL files, we change the config definition for HCL templates to hold such a map.
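This is the shape HCL's own diagnostic printer expects; a minimal sketch of how such a map can be used to print diagnostics with source context:

```go
package sketch

import (
	"os"

	"github.com/hashicorp/hcl/v2"
)

// printDiagnostics writes diagnostics to stderr, using the filename-to-file
// map so HCL can show the offending source snippets.
func printDiagnostics(files map[string]*hcl.File, diags hcl.Diagnostics) error {
	writer := hcl.NewDiagnosticTextWriter(os.Stderr, files, 80, true)
	return writer.WriteDiagnostics(diags)
}
```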
Force-pushed from 1819bb2 to 72bd2f5
Since we'll move the initialization code to the schedulers once they're in the codebase, we make the validation function public.
In order to be able to change how templates are processed later on, we move the logic that processes templates to a new abstraction: Scheduler. Right now, one implementation of Scheduler exists: sequential. This is essentially the same approach as what we have now embedded in the configs, but is now moved to this structure, so we can change it later.
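A rough sketch of what such an abstraction could look like, continuing the hypothetical types from the earlier sketches; the interface and method names are illustrative, not the PR's actual definitions:

```go
// Scheduler decides in which order the components of a template are
// evaluated and built.
type Scheduler interface {
	EvaluateTemplate() hcl.Diagnostics
	Build() hcl.Diagnostics
}

// SequentialScheduler mirrors the order Packer follows today; other
// implementations (e.g. a DAG-based one) can be added later.
type SequentialScheduler struct {
	config *PackerConfig
}

func (s *SequentialScheduler) EvaluateTemplate() hcl.Diagnostics {
	var diags hcl.Diagnostics
	// Same fixed order as the logic previously embedded in the config.
	diags = diags.Extend(s.config.evaluateVariables())
	diags = diags.Extend(s.config.executeDatasources())
	diags = diags.Extend(s.config.evaluateLocalVariables())
	return diags
}

func (s *SequentialScheduler) Build() hcl.Diagnostics {
	// Sketch only: run each core build one after the other.
	return nil
}

// Stubs standing in for the evaluation steps discussed in earlier commits.
func (cfg *PackerConfig) evaluateVariables() hcl.Diagnostics      { return nil }
func (cfg *PackerConfig) evaluateLocalVariables() hcl.Diagnostics { return nil }
```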
Force-pushed from 72bd2f5 to e8eba9e
hcl2_upgrade is the last command that relies on the deprecated interfaces for interacting with a template, so we move it to the schedulers, at least for the part that prepares the template for translation.
Since the Initialise/GetBuilds functions are not used anymore by Packer's commands directly, as they are superseded by the scheduler's implementations, we can remove them from the Handler interface.
Force-pushed from e8eba9e to 31c4925
This won't go into main any time soon, and we will probably be building on this in a separate branch for the scheduler testing and the next evolutions, so I'll close this, and we can keep this branch up (and probably rename it so its purpose is clearer).
I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
This PR aims to rework the logic around datasource evaluation, so we can execute each datasource separately, and not necessarily through the `evaluateDatasources` function. This is a building block for the later introduction of the DAG.
Essentially, this PR adds a dependency list to the datasources. This list is for now limited to other datasources, but will likely change when we move to a more DAG-y workflow, where locals, for example, can become dependencies of a datasource.
With the change in execution behaviour, we also rework `evaluateDatasources` (renamed to `executeDatasources` for consistency), and adopt a flow similar to the local variables evaluation.