Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rake -n does not mark dependents for Execution #92

Closed
silvioricardoc opened this issue Dec 16, 2015 · 8 comments · Fixed by #183
Closed

Rake -n does not mark dependents for Execution #92

silvioricardoc opened this issue Dec 16, 2015 · 8 comments · Fixed by #183

Comments

@silvioricardoc
Copy link

I have a Rakefile with these file tasks:

file "fileA" => %W[] do |t| sh %Q[echo contentA >#{t.name}] end
file "fileB" => %W[fileA] do |t| sh %Q[(cat fileA; echo transformationB) >#{t.name}] end
file "fileC" => %W[fileB] do |t| sh %Q[(cat fileB; echo transformationC) >#{t.name}] end

I perform this setup, where everything is built and then the first file in the chain (fileA) is updated:

$ rake fileC
echo contentA >fileA
(cat fileA; echo transformationB) >fileB
(cat fileB; echo transformationC) >fileC
$ touch fileA

And now rake -n tells me that only fileB needs to be updated...

$ rake -n fileC
** Invoke fileC (first_time, not_needed)
** Invoke fileB (first_time)
** Invoke fileA (first_time, not_needed)
** Execute (dry run) fileB

Even though it knows it must update both fileB and fileC:

$ rake fileC
(cat fileA; echo transformationB) >fileB
(cat fileB; echo transformationC) >fileC

I expected rake -n to tell me this:

$ rake -n fileC
[...]
** Execute (dry run) fileB
** Execute (dry run) fileC

The problem seems to stem from the fact that the internal function out_of_date? (in file_task.rb) does not take into account when a file should have updated its timestamp (but didn't, because this is a dry-run).

@grzuy
Copy link
Contributor

grzuy commented Feb 7, 2018

Even though it knows it must update both fileB and fileC

fileC task should only be executed if fileB changed. So how can rake know that fileB is going to change without actually executing fileB task (given it's a dry-run)?

@grzuy
Copy link
Contributor

grzuy commented Feb 7, 2018

I think rake cannot know if fileB is going to have it's file timestamp updated or not, that depends on what the actual fileB file task does in the block implementation, which is never executed in a dry-run.

So at most, rake can say something like

** Execute (dry run) fileB
** May execute (dry run) fileC

Or maybe i am missing something.

@silvioricardoc
Copy link
Author

I think I've always assumed that if the code for building fileB does not change its timestamp (so as to be more recent than fileC), then rake would assume that there was an error and fail loudly.

If this check was in place, then it wouldn't be a problem for the dry-run to print this:

** Execute (dry run) fileB
** Execute (dry run) fileC

because the semantics would be the following: "fileC will be run as long as there is no problem when building fileB". This is already true if the fileB task raises an exception (in which case the fileC task will not be executed).

In fact, whenever there is more than one line such as ** Execute (dry run) <something> in the dry-run output, we can never be sure if all of them will be executed. If any task fails with an exception, rake correctly stops all following executions, even the ones that are completely unrelated to it. There is always the implicit assumption of May execute for all but the first occurrence of Execute.

But then, all of this would require rake to actually fail on error (instead of failing silently) when its file tasks fail to update the timestamp from the dependent to a value that is more recent than the one of the target. Would this be feasible to implement? Maybe there's a downside I'm not seeing?

@grzuy
Copy link
Contributor

grzuy commented Feb 8, 2018

Hi Silvio, thanks for responding.


I think I've always assumed that if the code for building fileB does not change its timestamp (so as to be more recent than fileC), then rake would assume that there was an error and fail loudly.

Whether rake should or should not raise an error if the file task doesn't update the file as part of the file task execution seems like an interesting but separate discussion to me.

Back to the original issue about dry-runs, it seems like, as of today, rake doesn't assume anything about what fileB task does when it's executed. User is free to implement it in any way it likes. rake is agnostic about that. So that means, as per the actual situation, the dry-run cannot assume fileC is going to be executed, just based on the fact that fileB file task is going to be executed, right?

@silvioricardoc
Copy link
Author

Well, whenever there are two tasks marked for execution, we can never know if the second one will ever execute. Even if we run $ rake fileB fileC we are not guaranteed that the fileC task will be executed (it won't if fileB fails with an exception). But still, rake -n will show both as marked for execution, because under normal circumstances, both should be executed. So I agree that we cannot assume that fileC will be executed, but I will say this assumption is always true whenever we have more than 1 task marked for execution.

The reason why mentioned my assumption of rake raising an error is that any other solution would require silent failure (I think). For example, if fileB does not update the timestamp then either:
a) Rake will try to build fileC anyway (silently ignore the fact that fileB failed). I think nobody wants this option.
b) Rake will silently skip the target fileC, and users will think that the build worked even though it didn't.
c) Rake will fail with an error saying that fileB was built with a timestamp that is older than the one in fileC.

If we go for option (c), then I don't see any problem in saying both ** Execute fileB and ** Execute fileC, because under normal circumstances both will be executed. And under exceptional circumstances, rake will stop on the first error, regardless of whether the error occurred inside fileB or it was an error in the timestamp of fileB.

Maybe there's an option (d) that I'm missing? Or maybe options (a) or (b) are better than I think?

Side note:

[...] as of today, rake doesn't assume anything about what fileB task does when it's executed. User is free to implement it in any way it likes. rake is agnostic about that.

Actually, I think rake does assume that file tasks update the timestamps. That's why it will not run fileB if it thinks the timestamp is already fine. Basically, the main assumption rake makes about file tasks would be that they update the timestamp for the given file. Timestamps are important for rake, which is also why it will not run fileC if the timestamp of fileB is not updated, in options (b) and (c) above.

@grzuy
Copy link
Contributor

grzuy commented Feb 20, 2018

(it won't if fileB fails with an exception)

Keep in mind fileC won't run also if fileB file task has a conditional implementation, and the condition returns false. E.g.

file "fileB" => %W[fileA] do |t|
  if something_happens?
    sh %Q[touch #{t.name}]
  end
end

Implementation of the task, including the update of the file timestamp it's up to the user, plus rake doesn't assume anything about the implementation of the task as of now.

So I agree that we cannot assume that fileC will be executed

Cool, we agree on that 👍

So if we cannot assume that fileC will be executed, wouldn't be misleading to include

** Execute (dry run) fileC

in the output of the dry-run?

@silvioricardoc
Copy link
Author

rake doesn't assume anything about the implementation of the task as of now

Rake does assume that file tasks update the timestamps. That's why it will not run fileB if it thinks the timestamp is already fine. This is the main assumption behind Makefiles, and is the expected behavior of the file method. If the file method were not supposed to be used for timestamp-related tasks, then I would say that the current documentation on rake is very misleading. But I think we can agree that the point of file is to represent a file-building task that relies on timestamps. That's an assumption that's made by both make and rake's file method.

So if we cannot assume that fileC will be executed, wouldn't be misleading to include
** Execute (dry run) fileC
in the output of the dry-run?

Whenever there are two tasks marked for execution, we can never know if the second one will ever execute. Even if we run $ rake fileB fileC we are not guaranteed that the fileC task will be executed (it won't if fileB fails with an exception). But still, rake -n will show both as marked for execution, because under normal circumstances, both should be executed.

Should rake -n be modified so as to never ever show more than one line containing Execute, so as to avoid a literal interpretation that this Execute necessarily means that this taks will be executed no matter what? I wouldn't say so. I'd say it's implicit that any failure in the chain will stop the execution of further tasks, so it's a bug if rake currently does not show everything that will be executed. This bug makes rake -n extremely misleading (we cannot rely on it to know what tasks will actually be called), and prevents any scripting of its output to predict what tasks will actually be executed.

@mblumtritt
Copy link

"rake -n shows everything that will be executed when required." defines the behavior for me. Maybe the documentation needs to be more specific for file tasks to help beginners which never used make or similar before…

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

3 participants