HDFS targets show in confirmation even if not needed #15

Closed
alexrobbins opened this issue Jan 25, 2013 · 16 comments
@alexrobbins

Terminal output here: http://pastebin.com/J08GAk1Y

Drake has already run once, to completion, and no files have been modified since. Drake correctly notices this and skips all the steps. Why does it still say it is going to run them?

All the steps involve at least one hdfs location. A very similar workflow that was all local didn't exhibit this same behavior.

@alexrobbins
Author

Pastebin content:

alexr@dev101:~/src/resolve/src/resolve/ml$ drake
The following steps will be run, in order:
1: hdfs://user/alexr/resolve-ml/gold-annotations <- data/gold-annotations [missing output]
2: hdfs://user/alexr/resolve-ml/good-pairs <- hdfs://user/alexr/resolve-ml/gold-annotations [projected timestamped]
3: hdfs://user/alexr/resolve-ml/uuid-and-attrs <- hdfs://user/alexr/resolve-ml/gold-annotations [projected timestamped]
4: hdfs://user/alexr/resolve-ml/all-pairs <- hdfs://user/alexr/resolve-ml/gold-annotations [projected timestamped]
5: data/good-pairs <- hdfs://user/alexr/resolve-ml/good-pairs [projected timestamped]
6: data/all-pairs <- hdfs://user/alexr/resolve-ml/all-pairs [projected timestamped]
7: hdfs://user/alexr/resolve-ml/bad-pairs <- data/all-pairs, data/good-pairs [projected timestamped]
8: hdfs://user/alexr/resolve-ml/good-pairs-with-features <- hdfs://user/alexr/resolve-ml/good-pairs, hdfs://user/alexr/resolve-ml/uuid-and-attrs [projected timestamped]
9: hdfs://user/alexr/resolve-ml/bad-pairs-with-features <- hdfs://user/alexr/resolve-ml/bad-pairs, hdfs://user/alexr/resolve-ml/uuid-and-attrs [projected timestamped]
Confirm? [y/n] y
Running 9 steps...

--- 0. Skipped (up-to-date): hdfs://user/alexr/resolve-ml/gold-annotations <- data/gold-annotations

--- 1. Skipped (up-to-date): hdfs://user/alexr/resolve-ml/good-pairs <- hdfs://user/alexr/resolve-ml/gold-annotations

--- 2. Skipped (up-to-date): hdfs://user/alexr/resolve-ml/uuid-and-attrs <- hdfs://user/alexr/resolve-ml/gold-annotations

--- 3. Skipped (up-to-date): hdfs://user/alexr/resolve-ml/all-pairs <- hdfs://user/alexr/resolve-ml/gold-annotations

--- 4. Skipped (up-to-date): data/good-pairs <- hdfs://user/alexr/resolve-ml/good-pairs

--- 5. Skipped (up-to-date): data/all-pairs <- hdfs://user/alexr/resolve-ml/all-pairs

--- 6. Skipped (up-to-date): hdfs://user/alexr/resolve-ml/bad-pairs <- data/all-pairs, data/good-pairs

--- 7. Skipped (up-to-date): hdfs://user/alexr/resolve-ml/good-pairs-with-features <- hdfs://user/alexr/resolve-ml/good-pairs, hdfs://user/alexr/resolve-ml/uuid-and-attrs

--- 8. Skipped (up-to-date): hdfs://user/alexr/resolve-ml/bad-pairs-with-features <- hdfs://user/alexr/resolve-ml/bad-pairs, hdfs://user/alexr/resolve-ml/uuid-and-attrs

Done (0 steps run).

@aboytsov
Contributor

This is extremely strange. How is it possible that "gold-annotations" is listed with the "missing output" reason, but then not built, with the "up-to-date" reason? Does the file really exist or not? One of those is definitely wrong, but which one? Can you dig into it a little further, i.e. ls -l the inputs and outputs of the first target? Thanks!

@ghost assigned aboytsov Jan 25, 2013
@alexrobbins
Author

So, after further investigation: this issue is intermittent, and seems to occur when I'm moving from local to hdfs, or back. I wonder if it is a synchronization issue? Sometimes the hdfs last-modified time ends up slightly different from the local one. If I rerun the workflow, sometimes it is a problem again and sometimes not.

@aboytsov
Contributor

Yes, it is most likely a synchronization issue. I ran into this before and reported it to Philip. After he fixed it, it was gone; it has probably happened again. Even one second of desynchronization could create this issue. I won't close the bug just yet, but let me know once you talk to Philip.

@ghost self-assigned this Jan 28, 2013
@dirtyvagabond
Contributor

Either one of you guys up for adding details on this to the wiki? Sounds like an annoying gotcha that we should warn folks about.

@aboytsov
Contributor

When you start the FAQ, I'll add it there :)

@alexrobbins
Author

So, is there a way we could make drake more tolerant of this? Right now, if the servers get out of sync by even a millisecond, we'll have a problem. The problem only surfaces when the dependencies cross the divide in the wrong direction.

Could we add some sort of configurable delay between steps? If we waited one second after each step, then the servers would have to be off by more than one second to see the problem. That seems much less likely than being one millisecond off. In most data workflows an extra second or two per step is not going to be a big deal.

Alternately, could we build a configurable "fuzz" factor into the out-of-date calculation, so that a target counts as up to date if its timestamp plus the fuzz factor is later than its dependency's? A sketch of that check is below.
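
A minimal sketch of that comparison, in Python with made-up names (Drake itself is written in Clojure, so this is purely illustrative, not its actual code):

def is_up_to_date(target_mtime, dep_mtime, fuzz=1.0):
    # Hypothetical fuzzy check; times are seconds since the epoch. A
    # target whose timestamp trails its dependency's by no more than
    # `fuzz` seconds still counts as up to date, absorbing small clock
    # skew between filesystems.
    return target_mtime + fuzz >= dep_mtime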

I think the problem shows up mostly when using dummy commands that don't take very long. If the commands took longer, that would overcome the server time difference. That said, I'm seeing the issue around hdfs -copyToLocal and -put, which I am going to use in the future. Adding a 30-second sleep to each command does fix the problem.

@dirtyvagabond
Contributor

Maybe we could control the scope of the forced delay? The rule would be something like: if the step used HDFS and ran fast, then artificially pause (see the sketch below).
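
Something like this hypothetical guard (all names invented for illustration):

import time

def maybe_pause(step_used_hdfs, elapsed_seconds, delay=0.4):
    # Only pause when a fast step touched HDFS; a long-running step
    # already outlasts any small clock skew on its own.
    if step_used_hdfs and elapsed_seconds < delay:
        time.sleep(delay - elapsed_seconds)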

Or is there some elegant way to detect the problem beforehand?

@alexrobbins
Author

AFAICT, the issue only comes up when moving between filesystems, so we could add the (configurable?) delay before any step that uses both local and hdfs filesystems.

Also, we could try to detect the out-of-sync condition between the two systems and correct for it. Maybe create a tmp file on both systems at the same time, compare the resulting modification times, and then adjust later comparisons by the difference (roughly as sketched below)? This might work, except that network lag may not be consistent, so the adjustment we derive may only be accurate for that single point in time.
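
A rough sketch of such a probe using the standard hadoop fs commands (the probe path and function name are made up, and as noted above the estimate absorbs whatever network lag happens at probe time):

import os
import subprocess
import tempfile

def estimate_skew(hdfs_dir):
    # Create a file on each filesystem back to back and compare
    # modification times. Returns (hdfs time - local time) in seconds;
    # positive means the HDFS clock runs ahead of the local one.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        local_path = f.name
    probe = hdfs_dir + "/.skew-probe"
    subprocess.check_call(["hadoop", "fs", "-touchz", probe])
    local_ts = os.stat(local_path).st_mtime
    # 'hadoop fs -stat %Y' prints modification time in ms since epoch.
    remote_ms = subprocess.check_output(["hadoop", "fs", "-stat", "%Y", probe])
    os.unlink(local_path)
    subprocess.check_call(["hadoop", "fs", "-rm", probe])
    return int(remote_ms) / 1000.0 - local_ts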

I think there isn't an easy fix because distributed systems have to deal with network lag that can make time comparisons tough. The delay method will work as long as the delay is bigger than the time difference between the two systems, I think.

@aboytsov
Contributor

Alex, thank you very much for your thoughts.

There isn't an easy fix. Ideally, every system would have its time synchronized over NTP.

I like both of your ideas. We can add a configurable delay (say, 300-400 ms) before every step that uses more than one filesystem. It would not solve all possible problems, but it would probably solve a big chunk of them.

I'm a little more worried about "fuzzy" timestamp evaluation. If it relaxed the requirements (i.e. more targets would be rebuilt than otherwise), it would be OK. But it tightens them, so fewer targets get rebuilt, and that can be problematic. Consider, for example, a user who runs a script that touches several targets to invalidate part of the workflow. Under a fuzz factor the invalidation might silently do nothing, which would be dangerous and hard to debug.
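
To make the danger concrete with the hypothetical is_up_to_date sketch from earlier: suppose the fuzz factor is two seconds and the touch lands half a second after the output was built.

# The user expects the touched input to force a rebuild, but the
# output still counts as up to date, so nothing is invalidated.
is_up_to_date(target_mtime=1000.0, dep_mtime=1000.5, fuzz=2.0)  # => True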

I also like the idea of testing the filesystem delay. We could have a special flag (--fs_test or something) that reports the timestamp skew on all filesystems. We could run the temporary file creation in a thread pool for best results, roughly as sketched below.
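
A sketch of how that flag could drive the probes (illustrative only; the filesystem names and probe callables are assumptions, reusing the estimate_skew sketch above):

from concurrent.futures import ThreadPoolExecutor

def fs_test(probes):
    # 'probes' maps a filesystem name to a zero-argument callable, e.g.
    # lambda: estimate_skew("hdfs://user/alexr"); submitting them all to
    # a thread pool keeps the file-creation times close together.
    with ThreadPoolExecutor(max_workers=len(probes)) as pool:
        futures = [(name, pool.submit(fn)) for name, fn in probes.items()]
    for name, fut in futures:
        print("%s: %+.3f s relative to local" % (name, fut.result()))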

But if the desynchronization is seconds, nothing will really help.

I'm not sure all of the above is the highest priority. Would you like to help us with the code? I'd be more than happy to review it or point you to the right place.

@aboytsov
Contributor

Added --step-delay flag in feature/vvv: ee833c5

@alexrobbins
Author

The --step-delay flag delays every step. While that would work, we really only need to delay before steps that cross the local-to-hdfs divide. (I imagine your code was just a first pass at a solution, and it does fix the problem.)

I wonder if the fs_test you mention should run as a precondition whenever there are multiple filesystems. If the systems are out of sync, there are going to be weird issues. I'm in favor of failing fast with explicit error messages, as opposed to failing weirdly later, for no apparent reason.

"But if the desynchronization is seconds, nothing will really help." Yeah,
at that point I think the best we can do is complain loudly to the user.

@aboytsov
Contributor

Yes, I remember your suggestion to implement it only for steps that cross two or more filesystems, and it's a fine one. But I had to implement it this way because the problem seems to be fundamental on certain filesystems: see #36.

We could have another flag that would control the behavior of --step-delay and enable it only for steps spanning multiple filesystems, i.e. --step-delay-cross-fs or something like that.

I agree we should fail fast, and I think we can be even smarter with fs_test. We can put a marker file under the .drake/ directory (where Drake keeps all temporary files, including logs and script files) to indicate whether the filesystem testing has happened for this workflow. We can repeat it every week (day?) if needed, and, of course, one needs to be able to disable it completely. I agree we can easily detect whether the workflow uses multiple filesystems. A sketch of the marker check is below.
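
A sketch of that marker-file check (the path and interval are placeholders, not anything Drake actually does):

import os
import time

STAMP = ".drake/fs-test-stamp"   # hypothetical marker file
MAX_AGE = 7 * 24 * 3600          # re-test weekly, per the suggestion above

def fs_test_due():
    # The marker's own mtime records when the filesystems were last
    # tested; a missing marker means they never were.
    try:
        return time.time() - os.stat(STAMP).st_mtime > MAX_AGE
    except OSError:
        return True

def record_fs_test():
    open(STAMP, "w").close()  # touching the marker resets the clock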

I have to say this issue is quite low on my priority list for now. But I'd be more than happy to review anyone's code contributions and provide direction and guidance.

@alexrobbins
Author

Oh, I didn't realize HDFS was limited to 1s resolution. Your change makes sense in light of that.

@aboytsov
Contributor

aboytsov commented Feb 2, 2013

I'm actually not sure what timestamp resolution HDFS has. If you could run any workflow that uses HDFS with the --debug flag and see what timestamps it reports, that would be helpful. @larsyencken was talking about HFS+, which is the filesystem OS X uses.

@aboytsov
Contributor

aboytsov commented Feb 3, 2013

Alex, we should probably close this bug since it is related to Factual's HDFS/NFS desynchronization, and if Philip fixed it, this problem should go away.

I liked all your other ideas, however, and I was wondering if you could file a feature request for what you think we could do to make it even better (i.e. detection of multiple filesystems, automated tests, etc.)?
