SSH TM_MAD returns "No such file or directory" when file exists #19

Open
fernanqv opened this issue Aug 17, 2021 · 2 comments

@fernanqv
Contributor

Example:

The job finishes on the remote resource and the output file is created at 13:53:01:

(base) [valva@ui ~]$ stat /oceano/gmeteo/users/valva//.gw_valva_8/stdout.wrapper
  File: `/oceano/gmeteo/users/valva//.gw_valva_8/stdout.wrapper'
  Size: 610       	Blocks: 8          IO Block: 1048576 regular file
Device: 1ch/28d	Inode: 5933518     Links: 1
Access: (0600/-rw-------)  Uid: (15104/   valva)   Gid: (15030/  gmeteo)
Access: 2021-08-17 13:53:30.870373268 +0200
Modify: 2021-08-17 13:53:01.000000000 +0200
Change: 2021-08-17 13:53:01.882373272 +0200

The TM_MAD tries to copy the file 16 seconds later and gets the following error:

drm4g_tm.log

2021-08-17 13:53:17,324 DEBUG     drm4g.core.tm_mad CP 8 0 - remote://meteoc/~/.gw_valva_8/stdout.wrapper file:///home/valva/.drm4g/var/000-099/8/stdout.wrapper.0
2021-08-17 13:53:17,842 ERROR     drm4g.core.tm_mad scp: /oceano/gmeteo/users/valva//.gw_valva_8/stdout.wrapper: No such file or directory
scp.SCPException: scp: /oceano/gmeteo/users/valva//.gw_valva_8/stdout.wrapper: No such file or directory
2021-08-17 13:53:17,844 DEBUG     drm4g.core.tm_mad CP 8 0 FAILURE scp: /oceano/gmeteo/users/valva//.gw_valva_8/stdout.wrapper: No such file or directory
@cofinoa
Member

cofinoa commented Aug 18, 2021

@fernanqv we need to check whether this is a race condition on the NFS filesystem: the file is created on the remote worker node, but it is not yet visible on the frontend.
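
One rough way to test that hypothesis is to run a small poller on the frontend right after the job finishes and measure how long the file takes to show up. This is only a sketch, not DRM4G code: `wait_until_visible` and its `timeout` and `interval` parameters are hypothetical names made up for illustration.

```python
import os
import time


def wait_until_visible(path, timeout=60.0, interval=1.0):
    """Poll until `path` exists.

    Returns the number of seconds waited, or None if the file
    did not appear within `timeout` seconds.
    """
    start = time.monotonic()
    while True:
        if os.path.exists(path):
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            return None
        time.sleep(interval)


if __name__ == "__main__":
    # Path taken from the log above; adjust it to the job being checked.
    target = "/oceano/gmeteo/users/valva//.gw_valva_8/stdout.wrapper"
    waited = wait_until_visible(target)
    if waited is None:
        print("file never became visible")
    else:
        print(f"file visible after {waited:.1f} s")
```

If the reported delay is regularly longer than the gap between job completion and the CP request, that would point to NFS caching on the frontend rather than to a bug in the copy itself.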

@fernanqv
Contributor Author

Temporary workaround: Set

NUMBER_OF_RETRIES = 6

in .drm4g/etc/job_template.default
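
For reference, the idea behind the workaround is simply to try the copy again after a pause, so the file has time to become visible. A minimal sketch of that pattern follows. It is not how DRM4G implements `NUMBER_OF_RETRIES` internally (the issue does not show that); `copy_with_retries`, `attempts`, `delay` and `sleep` are hypothetical names, and `copy` stands for whatever callable performs the transfer.

```python
import time


def copy_with_retries(copy, attempts=6, delay=5.0, sleep=time.sleep):
    """Call `copy()` until it succeeds, at most `attempts` times.

    Waits `delay` seconds between failed attempts and re-raises
    the last error if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return copy()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: report the last failure
            sleep(delay)
```

With a fixed delay, the total time covered is roughly `(attempts - 1) * delay`, so the values only help if that window is longer than the time the file needs to appear on the frontend.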
