Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

devtools::check deletes all files and folders from home directory #120

Closed
mr-francois opened this issue Jan 21, 2020 · 29 comments · Fixed by #156
Closed

devtools::check deletes all files and folders from home directory #120

mr-francois opened this issue Jan 21, 2020 · 29 comments · Fixed by #156
Labels
feature a feature request or enhancement

Comments

@mr-francois
Copy link

I encountered this problem twice. When i run the check command inside RStudio all my files and folders get deleted.
Screenshot from 2020-01-21 10-35-13

@mr-francois
Copy link
Author

mr-francois commented Feb 6, 2020

This seems to happen when you have a folder called ~ in your package directory. For example
~/pkg/~/. I guess check() sets working directory to ~/pkg and creates a list of folders to remove, but doesn't handle it as delete ~/pkg/~/ but delete ~?

@jimhester jimhester transferred this issue from r-lib/devtools Feb 6, 2020
@gaborcsardi
Copy link
Member

Maybe this is a bug in base R, because if you unlink() a file called ~ then it will just remove your home directory. See below for an example.

But really, naming a file or directory ~ is just asking for trouble.

E.g. this is on Linux, as user 'root', so the whole /root directory will be deleted.

❗ DO NOT RUN THIS CODE, BECAUSE YOUR HOME DIRECTORY WILL BE DELETED. ❗ 

> setwd("~")
> cat("foo\n", file = "~/foo")
> dir.create("a")
> dir.create("a/~")
> dir()
[1] "a"   "foo"
> dir("a")
[1] "~"
> setwd("a")

Now this will remove the /root home directory instead of the directory called ~.

❗ DO NOT RUN THIS CODE, BECAUSE YOUR HOME DIRECTORY WILL BE DELETED. ❗ 

> unlink("~", recursive = TRUE)
> dir()
character(0)
> getwd()
NULL
> setwd("~")
Error in setwd("~") : cannot change working directory
> q("no")
bash-5.0# pwd
/root
bash-5.0# ls
bash-5.0# cd /
bash-5.0# ls
bin       etc       lib       opt       run       sys       var
build.sh  home      media     output    sbin      tmp
dev       input     mnt       proc      srv       usr
bash-5.0# ls root
ls: root: No such file or directory

@gaborcsardi
Copy link
Member

gaborcsardi commented Feb 6, 2020

Maybe we could do a quick check for files/directories named ~ and then refuse to run the check, but I don't think it is worth spending time on working around this issue.

With a file or directory called ~ you live a very dangerous life.

@mr-francois
Copy link
Author

I didn't call a folder ~ on purpose. I think this happened when i wrote output to something like path = ~/myFile.csv. Not sure how this ended up as ~/pkg/~/myFile.csv.

@gaborcsardi
Copy link
Member

Right. Considering how many times we use unlink() (see https://github.com/search?q=org%3Ar-lib+unlink+path%3A%2FR%2F&type=Code), and that it is not hard to accidentally create a file or dir like that, we should probably use a wrapper to unlink(), that works around this.

@lbusett
Copy link

lbusett commented Feb 26, 2020

@gaborcsardi

We had the same exact issue two days ago. We were lucky to be able to recover all the files, but I think that current behaviour is a disaster waiting to happen. We do not know exactly how/why the "spurious" ~ folder was created. Could have been a mistake on our side, a glitch of some other package or whatever (We also do not know if the folder was created recently, or if it had been there for some time and some recent change in the devtools "stack" triggered the delete).
However, this is really dangerorus and IMHO should be reported as bug/unexpected behaviour and a fix/workaround implemented ASAP. We understand that the "glitch" is in unlink but as you say, given how easy it is to end up with a dangeoruos folder, setting some check to prevent accidental deletion is very important.

BTW: @jimhester It took us a while to understand what was happening. I can understand the motivation, but I think that if this issue was not moved from devtools to here we could have found it earlier, saving us hours of trial and errors.

@gaborcsardi
Copy link
Member

However, this is really dangerorus and IMHO should be reported as bug/unexpected behaviour and a fix/workaround implemented ASAP.

I am sure that you are welcome to report it at https://bugs.r-project.org/bugzilla/

@lbusett
Copy link

lbusett commented Feb 26, 2020

@gaborcsardi I hope I was not harsh in my previous. It was not my intention.

I am sure that you are welcome to report it at https://bugs.r-project.org/bugzilla/

Thanks for the suggestion, will do that. A fix in base R is for sure the best solution. In the meantime, however, I think that your idea/proposal to:

use a wrapper to unlink(), that works around this

would have its clear merits to avoid accidents. I understand that it would involve changes in several packages, so it is not something you/I could implement on our own, but it could be worth discussing it. What do you think?

HTH!

Lorenzo

@gaborcsardi
Copy link
Member

I can imagine that this is a major pain.

It would be also great to know which R function creates these files. It is probably a function that does not call path.expand() or normalizePath(), and it should. If you can restore your home directory from backups, can you check what is inside the ~ directory? Maybe that gives us a clue.

As for the workaround, we can start with a safe_unlink() function that runs something like

dir(recursive = TRUE, include.dirs = TRUE, pattern = "~")

and then refuses to continue if the result in non-empty.

@ranghetti
Copy link

ranghetti commented Feb 26, 2020

Hi @gaborcsardi ,
I launched the recent dangerous command @lbusett was speaking about, by building a package in which the harmful "~" subdirectory was present.

As for the workaround, we can start with a safe_unlink() function that runs something like

dir(recursive = TRUE, include.dirs = TRUE, pattern = "^~")

and then refuses to continue if the result in non-empty.

This seems to be a good workaround to avoid this specific dangerous problem.
I did not deepen the callr code, but I noticed that setwd() is used in several cases.
Do you think it would be possible to working with absolute paths instead than changing the working directory, or this would involve drastically rewriting the code?

@gaborcsardi
Copy link
Member

I did not deepen the callr code, but I noticed that setwd() is used in several cases.
Do you think it would be possible to working with absolute paths instead than changing the working directory, or this would involve drastically rewriting the code?

I am not sure what you mean, callr has nothing to do with this, and those setwd() calls are fine.

@lbusett
Copy link

lbusett commented Feb 26, 2020

May I ask a clarification here, if possible, without trying to wipe out my home folder?

DO NOT CODE BELOW, BECAUSE YOUR HOME DIRECTORY MAY BE DELETED

Let's assume that I created a folder named let's say /home/tmp/~. Would the "bad behaviour" be triggered also if I try unlink("home/tmp", recursive = TRUE), or only if I previously set the working directory to "home/tmp" and then try deleting its contents?

@ranghetti
Copy link

I am not sure what you mean, callr has nothing to do with this, and those setwd() calls are fine.

I mean that, using setwd(), you can encounter the situation unlink("~"); while, using absolute path, this (and maybe other dangerous cases) could not occur (launching command unlink("/tmp/RtmpEF5aqf/file4d65487c5c34/packagename/~") would correctly delete the "~" subdirectory instead than the home).

Maybe I should not ask this in a rcmdcheck issue, but I did it since the topic is the same and the two packages are linked. However, I did not deepen the code and so maybe this is not practcable; this was only an idea to avoid commands like unlink("something_ambiguous").

@gaborcsardi
Copy link
Member

rcmdcheck does not create a list of files, it calls unlink() on the root of the directory. So either some other package or RStudio calls unlink() on ~ or base R expands a foo/bar/~ path to the home directory.

The expansion in the unlink() code is different on Windows and Unix. What's your platform?

@lbusett
Copy link

lbusett commented Feb 26, 2020

I did not deepen the callr code, but I noticed that setwd() is used in several cases.

@ranghetti maybe you meant "devtools"? I suppose that it is devtools::check(), that sets the WD to the packages root ~/pkg (as suggested by @mr-francois).

@gaborcsardi
Copy link
Member

Can you pls tell us your platform and R version?
I suppose the package versions are gone already....

@lbusett
Copy link

lbusett commented Feb 26, 2020

I may be out of my depth here, so I'll let @ranghetti comment on more "technical" issues. Sorry if I introduced some confusion.

A word, regarding this:

either some other package or RStudio calls unlink() on ~ or base R expands a foo/bar/~ path to the home directory

Do not know if this helps, but what I can contribute is that we noticed deletion was triggered by calls to rhub::check_for_cran(), but also by devtools::build_vignettes(), which I guess both call devtools::check().

@gaborcsardi
Copy link
Member

OK, this is a bug in R CMD build, apparently:

root@e6c2e5ef1dac:~# git clone https://github.com/r-lib/pkgconfig
Cloning into 'pkgconfig'...
remote: Enumerating objects: 377, done.
remote: Total 377 (delta 0), reused 0 (delta 0), pack-reused 377
Receiving objects: 100% (377/377), 47.78 KiB | 1.54 MiB/s, done.
Resolving deltas: 100% (178/178), done.
root@e6c2e5ef1dac:~# mkdir pkgconfig/~
root@e6c2e5ef1dac:~# touch pkgconfig/~/foobar
root@e6c2e5ef1dac:~# R CMD build pkgconfig/
* checking for file ‘pkgconfig/DESCRIPTION’ ... OK
* preparing ‘pkgconfig’:
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* building ‘pkgconfig_2.0.3.tar.gz’
Warning in gzfile(tarfile, "wb", compression = compression_level) :
  cannot open compressed file '/root/pkgconfig_2.0.3.tar.gz', probable reason 'No such file or directory'
Error in gzfile(tarfile, "wb", compression = compression_level) :
  cannot open the connection
Execution halted
root@e6c2e5ef1dac:~# ls /root
ls: cannot access '/root': No such file or directory

@gaborcsardi
Copy link
Member

gaborcsardi commented Feb 26, 2020

It is fixed in R-devel: wch/r-source@1d4f7aa
It will not be fixed in the upcoming 3.6.3 R release, only in 4.0.0.

This said, I think unlink() could just refuse to delete ~, that is very rarely a useful operation.

@gaborcsardi
Copy link
Member

It seems ~ only causes problems if it is in the package root, so we can probably check for that in pkgbuild, remotes, etc. A bunch of packages call R CMD build...

@ranghetti
Copy link

It is fixed in R-devel: wch/r-source@1d4f7aa
It will not be fixed in the upcoming 3.6.3 R release, only in 4.0.0.

Ok, this is comforting; thank you for signaling it.

@lbusett
Copy link

lbusett commented Feb 26, 2020

@gaborcsardi Many thanks for looking into this. Glad you found the culprit.

It seems ~ only causes problems if it is in the package root, so we can probably check for that in pkgbuild, remotes, etc. A bunch of packages call R CMD build...

That would be nice. I recon we did a mistake in creating that folder, but it could occur to others between now and R4.0 release.

@gaborcsardi
Copy link
Member

@mr-francois @ranghetti Are the packages you were checking public? They would help me find the package/function/tool that created the ~ directory, potentially.

@gaborcsardi
Copy link
Member

Emailed R-devel about this: https://www.mail-archive.com/[email protected]/msg41749.html

@ranghetti
Copy link

@mr-francois @ranghetti Are the packages you were checking public? They would help me find the package/function/tool that created the ~ directory, potentially.

I checked my package "sen2r", and it does not seem to contain functions which can create "~" directory. The presence of that directory was probably due to a manual accident (occurred having the working directory set on the root package path).

@ranghetti
Copy link

ranghetti commented Feb 27, 2020

Just one clarification about this sentence:

A real example is R CMD build which deletes the home directory of the current user if the root of the package contains a non-empty "~" directory.

In my case, the home directory was deleted even in presence of an empty "~" folder.

@mr-francois
Copy link
Author

@mr-francois @ranghetti Are the packages you were checking public? They would help me find the package/function/tool that created the ~ directory, potentially.

I think one time devtools::test() created a ~/pkg/~ directory when i had one R script in my package directory containing a call to keras::fit_generator and inside that function i had a path to write tensorboard callbacks called ~/tensorboard and ended up with ~/pkg/~/tensorboard.

@pyltime
Copy link

pyltime commented Jan 19, 2021

Would it be possible to add a check to devtools to look for these specific folders that might trigger this, and give a warning to the end user if so? This is potentially catastrophic if the end user hasn't backed up and accidentally triggers this when running devtools::check or devtools::install.

@gaborcsardi gaborcsardi added the feature a feature request or enhancement label Apr 27, 2021
@gaborcsardi
Copy link
Member

The root cause of this issue was fixed in R 4.0.0:

unlink() gains a new argument expand to disable wildcard and tilde expansion. Elements of x of value "~" are now ignored.

(https://cran.r-project.org/doc/manuals/r-release/NEWS.html)

I can still add a check for the earlier R versions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature a feature request or enhancement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants