
zip path is too long #719

Closed
brianmsm opened this issue Feb 5, 2023 · 6 comments

brianmsm commented Feb 5, 2023

I am on Windows and my project sits in a deeply nested folder structure. I am trying to import a dataset with readxl::read_excel(), but I get the following error:

Error in unz(zip_path, file_path, open = "rb") : 
  cannot open the connection
In addition: Warning message:
In unz(zip_path, file_path, open = "rb") : zip path is too long

I have saved the same data in the same location in .sav and .dta formats, and the haven package reads both without problems. I have also enabled long paths as suggested here (https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation?tabs=powershell), but it still does not work.

haven::read_sav("1. Data/Valence Depresion Domaradzka.sav")
#> # A tibble: 1,632 × 39
#>       Id sex         age VD02    VD03    VD04    VD05    VD06    VD07    VD08   
#>    <dbl> <dbl+lbl> <dbl> <dbl+l> <dbl+l> <dbl+l> <dbl+l> <dbl+l> <dbl+l> <dbl+l>
#>  1     2 1 [Femal…    32 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d…
#>  2     4 1 [Femal…    34 2 [I d… 1 [I a… 1 [I a… 2 [I d… 2 [I d… 1 [I a… 2 [I d…
#>  3    10 1 [Femal…    30 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 1 [I a… 2 [I d…
#>  4    11 1 [Femal…    23 1 [I a… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 1 [I a… 2 [I d…
#>  5    15 1 [Femal…    53 2 [I d… 1 [I a… 2 [I d… 2 [I d… 2 [I d… 1 [I a… 2 [I d…
#>  6    16 1 [Femal…    46 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d…
#>  7    17 1 [Femal…    51 2 [I d… 2 [I d… 2 [I d… 1 [I a… 2 [I d… 2 [I d… 2 [I d…
#>  8    19 1 [Femal…    62 1 [I a… 1 [I a… 2 [I d… 1 [I a… 1 [I a… 2 [I d… 1 [I a…
#>  9    22 1 [Femal…    34 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d…
#> 10    24 1 [Femal…    43 2 [I d… 1 [I a… 2 [I d… 1 [I a… 1 [I a… 1 [I a… 1 [I a…
#> # … with 1,622 more rows, and 29 more variables: VD09 <dbl+lbl>,
#> #   VD10 <dbl+lbl>, VD11 <dbl+lbl>, VD12 <dbl+lbl>, VD14 <dbl+lbl>,
#> #   VD15 <dbl+lbl>, VD16 <dbl+lbl>, VD17 <dbl+lbl>, VD18 <dbl+lbl>,
#> #   VD19 <dbl+lbl>, VD20 <dbl+lbl>, VD21 <dbl+lbl>, VD22 <dbl+lbl>,
#> #   VD23 <dbl+lbl>, VD24 <dbl+lbl>, VD25 <dbl+lbl>, VD26 <dbl+lbl>,
#> #   VD27 <dbl+lbl>, VD28 <dbl+lbl>, VD29 <dbl+lbl>, VD30 <dbl+lbl>,
#> #   VD31 <dbl+lbl>, VD33 <dbl+lbl>, VD34 <dbl+lbl>, VD35 <dbl+lbl>, …
haven::read_dta("1. Data/Valence depresion Domaradzka.dta")
#> # A tibble: 1,632 × 39
#>       Id sex         age VD02    VD03    VD04    VD05    VD06    VD07    VD08   
#>    <dbl> <dbl+lbl> <dbl> <dbl+l> <dbl+l> <dbl+l> <dbl+l> <dbl+l> <dbl+l> <dbl+l>
#>  1     1 2 [Male]     31 1 [I a… 2 [I d… 2 [I d… 1 [I a… 1 [I a… 1 [I a… 1 [I a…
#>  2     2 1 [Femal…    32 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d…
#>  3     3 2 [Male]     40 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d…
#>  4     4 1 [Femal…    34 2 [I d… 1 [I a… 1 [I a… 2 [I d… 2 [I d… 1 [I a… 2 [I d…
#>  5     5 2 [Male]     40 2 [I d… 2 [I d… 1 [I a… 2 [I d… 1 [I a… 2 [I d… 2 [I d…
#>  6     6 2 [Male]     24 2 [I d… 1 [I a… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d…
#>  7     7 2 [Male]     29 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d…
#>  8     8 2 [Male]     25 1 [I a… 1 [I a… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 1 [I a…
#>  9     9 2 [Male]     25 1 [I a… 2 [I d… 2 [I d… 2 [I d… 1 [I a… 2 [I d… 1 [I a…
#> 10    10 1 [Femal…    30 2 [I d… 2 [I d… 2 [I d… 2 [I d… 2 [I d… 1 [I a… 2 [I d…
#> # … with 1,622 more rows, and 29 more variables: VD09 <dbl+lbl>,
#> #   VD10 <dbl+lbl>, VD11 <dbl+lbl>, VD12 <dbl+lbl>, VD14 <dbl+lbl>,
#> #   VD15 <dbl+lbl>, VD16 <dbl+lbl>, VD17 <dbl+lbl>, VD18 <dbl+lbl>,
#> #   VD19 <dbl+lbl>, VD20 <dbl+lbl>, VD21 <dbl+lbl>, VD22 <dbl+lbl>,
#> #   VD23 <dbl+lbl>, VD24 <dbl+lbl>, VD25 <dbl+lbl>, VD26 <dbl+lbl>,
#> #   VD27 <dbl+lbl>, VD28 <dbl+lbl>, VD29 <dbl+lbl>, VD30 <dbl+lbl>,
#> #   VD31 <dbl+lbl>, VD33 <dbl+lbl>, VD34 <dbl+lbl>, VD35 <dbl+lbl>, …
readxl::read_excel("1. Data/Valence depresion Domaradzka.xlsx")
#> Warning in unz(zip_path, file_path, open = "rb"): zip path is too long
#> Error in unz(zip_path, file_path, open = "rb"): cannot open the connection

fs::path_real("1. Data/Valence depresion Domaradzka.xlsx")
#> D:/Insync/[email protected]/Google Drive/Cursos de Brian Peña - Compartido/Mios/Cursos en la SPP/1. Curso Virtual. Análisis de datos con R para Psicólogos/Materiales/Cuarta Edición/Sesión 01/1. Data/Valence depresion Domaradzka.xlsx

Created on 2023-02-05 with reprex v2.0.2

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.2 (2022-10-31 ucrt)
#>  os       Windows 10 x64 (build 22621)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  Spanish_Peru.utf8
#>  ctype    Spanish_Peru.utf8
#>  tz       America/Bogota
#>  date     2023-02-05
#>  pandoc   3.0.1 @ C:/Users/brian/AppData/Local/Pandoc/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cellranger    1.1.0   2016-07-27 [1] CRAN (R 4.2.2)
#>  cli           3.6.0   2023-01-09 [1] CRAN (R 4.2.2)
#>  crayon        1.5.2   2022-09-29 [1] CRAN (R 4.2.2)
#>  digest        0.6.31  2022-12-11 [1] CRAN (R 4.2.2)
#>  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.2.2)
#>  evaluate      0.20    2023-01-17 [1] CRAN (R 4.2.2)
#>  fansi         1.0.3   2022-03-24 [1] CRAN (R 4.2.2)
#>  fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.2.2)
#>  forcats       1.0.0   2023-01-29 [1] CRAN (R 4.2.2)
#>  fs            1.5.2   2021-12-08 [1] CRAN (R 4.2.2)
#>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.2.2)
#>  haven         2.5.1   2022-08-22 [1] CRAN (R 4.2.2)
#>  hms           1.1.2   2022-08-19 [1] CRAN (R 4.2.2)
#>  htmltools     0.5.4   2022-12-07 [1] CRAN (R 4.2.2)
#>  knitr         1.42    2023-01-25 [1] CRAN (R 4.2.2)
#>  lifecycle     1.0.3   2022-10-07 [1] CRAN (R 4.2.2)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.2.2)
#>  pillar        1.8.1   2022-08-19 [1] CRAN (R 4.2.2)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.2.2)
#>  purrr         1.0.1   2023-01-10 [1] CRAN (R 4.2.2)
#>  R.cache       0.16.0  2022-07-21 [1] CRAN (R 4.2.2)
#>  R.methodsS3   1.8.2   2022-06-13 [1] CRAN (R 4.2.0)
#>  R.oo          1.25.0  2022-06-12 [1] CRAN (R 4.2.0)
#>  R.utils       2.12.2  2022-11-11 [1] CRAN (R 4.2.2)
#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.2.2)
#>  readr         2.1.3   2022-10-01 [1] CRAN (R 4.2.2)
#>  readxl        1.4.1   2022-08-17 [1] CRAN (R 4.2.2)
#>  reprex        2.0.2   2022-08-17 [1] CRAN (R 4.2.2)
#>  rlang         1.0.6   2022-09-24 [1] CRAN (R 4.2.2)
#>  rmarkdown     2.20    2023-01-19 [1] CRAN (R 4.2.2)
#>  rstudioapi    0.14    2022-08-22 [1] CRAN (R 4.2.2)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.2.2)
#>  styler        1.9.0   2023-01-15 [1] CRAN (R 4.2.2)
#>  tibble        3.1.8   2022-07-22 [1] CRAN (R 4.2.2)
#>  tzdb          0.3.0   2022-03-28 [1] CRAN (R 4.2.2)
#>  utf8          1.2.2   2021-07-24 [1] CRAN (R 4.2.2)
#>  vctrs         0.5.1   2022-11-16 [1] CRAN (R 4.2.2)
#>  withr         2.5.0   2022-03-03 [1] CRAN (R 4.2.2)
#>  xfun          0.36    2022-12-21 [1] CRAN (R 4.2.2)
#>  yaml          2.3.6   2022-10-18 [1] CRAN (R 4.2.2)
#> 
#>  [1] C:/Users/brian/AppData/Local/R/win-library/4.2
#>  [2] C:/Program Files/R/R-4.2.2/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────
Member

jennybc commented Feb 9, 2023

readxl only uses base R facilities in the internal helper where this is coming from:

https://github.com/tidyverse/readxl/blob/main/R/xlsx-zip.R

So the answer for now is that this path truly is problematic for readxl, because there's not some quick fix we can make in our code.
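For anyone who wants to check how long the resolved path actually is, here is a minimal sketch in base R. The helper name `path_length_report` is hypothetical, and the 260 threshold (the classic Windows MAX_PATH) is only an assumption about where the trouble starts; this thread does not confirm the exact limit that `unz()` applies. The length is reported in both characters and bytes, because accented characters in a path take more than one byte in UTF-8.

```r
# Hypothetical helper: report the length of a fully resolved path.
# Assumption: 260 (the classic Windows MAX_PATH) is the relevant threshold;
# the exact limit applied by unz() is not confirmed here.
path_length_report <- function(path, limit = 260L) {
  # Resolve relative paths; mustWork = FALSE avoids an error if the file is missing
  full <- normalizePath(path, winslash = "/", mustWork = FALSE)
  n_chars <- nchar(full, type = "chars")
  n_bytes <- nchar(full, type = "bytes")
  list(
    path = full,
    n_chars = n_chars,
    n_bytes = n_bytes,
    over_limit = n_bytes >= limit
  )
}

# Usage with the file from the report:
# path_length_report("1. Data/Valence depresion Domaradzka.xlsx")
```

Keep in mind that a file inside the .xlsx archive is addressed in addition to the archive's own path, so a path somewhat below the threshold may still cause problems.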

I know you say you have activated long paths, but here's someone reporting success with that method, pointing to exactly the same article:
https://stackoverflow.com/a/71621579
Have you definitely restarted your computer since making the change?
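One way to confirm from within R that the setting took effect is to read the registry value named in the Microsoft article. This is a minimal sketch: `long_paths_enabled` is a hypothetical helper, and it only checks the system-wide `LongPathsEnabled` value. According to that article, an application also has to opt in through its manifest, which this check cannot tell you.

```r
# Hypothetical helper: is the system-wide long path setting switched on?
# Returns NA on non-Windows systems, where the registry does not exist.
long_paths_enabled <- function() {
  if (.Platform$OS.type != "windows") {
    return(NA)
  }
  key <- "SYSTEM\\CurrentControlSet\\Control\\FileSystem"
  # utils::readRegistry() is only available on Windows
  values <- utils::readRegistry(key, hive = "HLM")
  # A missing value yields NULL, which isTRUE() turns into FALSE
  isTRUE(values[["LongPathsEnabled"]] == 1)
}

long_paths_enabled()
```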

It looks like openxlsx uses a 3rd party library to access the files inside the .zip archive (which is what .xlsx files actually are), so you may want to try using that package instead.
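Another workaround that might help in the meantime is to copy the file to a short temporary path and read it from there. The sketch below is an assumption-laden illustration, not something readxl does itself: `read_via_short_path` is a hypothetical helper, and it only works if `file.copy()` can handle the original path.

```r
# Hypothetical helper: copy a file to a short temporary path, then read it there.
# Assumption: file.copy() can cope with the original (long) path.
read_via_short_path <- function(path, reader, ...) {
  # Keep the original extension so format detection still works
  tmp <- tempfile(fileext = paste0(".", tools::file_ext(path)))
  # Remove the temporary copy once the reader has finished
  on.exit(unlink(tmp), add = TRUE)
  if (!file.copy(path, tmp)) {
    stop("could not copy '", path, "' to a temporary location")
  }
  reader(tmp, ...)
}

# Usage (assumes readxl is installed):
# read_via_short_path("1. Data/Valence depresion Domaradzka.xlsx", readxl::read_excel)

# Or try the package suggested above (assumes openxlsx is installed):
# openxlsx::read.xlsx("1. Data/Valence depresion Domaradzka.xlsx")
```

Passing the reader as an argument keeps the helper independent of any particular package, so the same copy-then-read approach can be tried with other readers.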

Member

jennybc commented Feb 9, 2023

And another lead re: something to check on your system:
https://community.rstudio.com/t/does-rstudio-use-windows-longpathsenabled-registry-setting/130033

Member

jennybc commented Mar 10, 2023

I have by no means digested all of the content in this post, but it gives me hope that perhaps the problem is going to be fixed at the source, i.e. in R itself, in the not-too-distant future:

https://blog.r-project.org/2023/03/07/path-length-limit-on-windows/

@brianmsm
Author

I'm sorry, I had not seen the responses in this thread. I made the change in gpedit.msc and also restarted, but the problem persists.

Member

jennybc commented Mar 10, 2023

It is possible that the next version of R will handle long paths better and solve this for us.

@brianmsm
Author

Hello!

This is now working without problems!
