revdep vardpoor example failures after melt change #6071

tdhock · 2024-04-11T00:09:14Z

after this commit 8fa8c8f
we get the following, which I will investigate.

* checking examples ... ERROR
Running examples in 'vardpoor-Ex.R' failed
The error most likely occurred in:

> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: vardannual
> ### Title: Variance estimation for measures of annual net change or annual
> ###   for single and multistage stage cluster sampling designs
> ### Aliases: vardannual
> ### Keywords: vardannual
> 
> ### ** Examples
> 
>  
> ### Example
> library("data.table")
> 
> set.seed(1)
> 
> data("eusilc", package = "laeken")
> eusilc1 <- eusilc[1:20, ]
> rm(eusilc)
> 
> dataset1 <- data.table(rbind(eusilc1, eusilc1),
+                        year = c(rep(2010, nrow(eusilc1)),
+                                 rep(2011, nrow(eusilc1))))
> rm(eusilc1)
> 
> dataset1[, country := "AT"]
> dataset1[, half := .I - 2 * trunc((.I - 1) / 2)]
> dataset1[, quarter := .I - 4 * trunc((.I - 1) / 4)]
> dataset1[age < 0, age := 0]
> 
> PSU <- dataset1[, .N, keyby = "db030"][, N := NULL][]
> PSU[, PSU := trunc(runif(.N, 0, 5))]
> 
> dataset1 <- merge(dataset1, PSU, all = TRUE, by = "db030")
> rm(PSU)
> 
> dataset1[, strata := "XXXX"]
> dataset1[, employed := trunc(runif(.N, 0, 2))]
> dataset1[, unemployed := trunc(runif(.N, 0, 2))]
> dataset1[, labour_force := employed + unemployed]
> dataset1[, id_lv2 := paste0("V", .I)]
> 
> vardannual(Y = "employed", H = "strata",
+            PSU = "PSU", w_final = "rb050",
+            ID_level1 = "db030", ID_level2 = "id_lv2",
+            Dom = NULL, Z = NULL, years = "year",
+            subperiods = "half", dataset = dataset1,
+            percentratio = 100, confidence = 0.95,
+            method = "cros")
Error in t(X) %*% A_matrix : non-conformable arguments
Calls: vardannual -> lapply -> FUN -> data.table
Execution halted

The text was updated successfully, but these errors were encountered:

tdhock · 2024-04-11T04:31:00Z

using data.table from CRAN, and adding print statements to vardpoor, I get the output below

vrdnnl> vardannual(Y = "employed", H = "strata",
vrdnnl+            PSU = "PSU", w_final = "rb050",
vrdnnl+            ID_level1 = "db030", ID_level2 = "id_lv2",
vrdnnl+            Dom = NULL, Z = NULL, years = "year",
vrdnnl+            subperiods = "half", dataset = dataset1,
vrdnnl+            percentratio = 100, confidence = 0.95,
vrdnnl+            method = "cros")
[1] "X"
 num [1:2] 242 1263
[1] "A_matrix"
 num [1:2, 1:2] 1 0.375 0.375 1
[1] "X"
 num [1:2] 1272 1542
[1] "A_matrix"
 num [1:2, 1:2] 1 -1 -1 1

Using data.table from github master I get output below

vrdnnl> vardannual(Y = "employed", H = "strata",
vrdnnl+            PSU = "PSU", w_final = "rb050",
vrdnnl+            ID_level1 = "db030", ID_level2 = "id_lv2",
vrdnnl+            Dom = NULL, Z = NULL, years = "year",
vrdnnl+            subperiods = "half", dataset = dataset1,
vrdnnl+            percentratio = 100, confidence = 0.95,
vrdnnl+            method = "cros")
[1] "X"
 num(0) 
[1] "A_matrix"
 num [1:2, 1:2] 1 NA NA 1
Erreur dans t(X) %*% A_matrix : arguments inadéquats
Appels : example ... eval -> eval -> vardannual -> lapply -> FUN -> data.table
Exécution arrêtée

tdhock · 2024-04-11T05:01:50Z

vardannual calls vardcros which calls melt with measure via the code below

  if (!is.null(namesZ)) varsYZ <- list(namesY, namesZ)
  str(list(varsYZ=varsYZ, DTagg=DTagg))
  DTagg <- melt(DTagg, id = c(namesperc, gnamesDom),
                measure = varsYZ,
                variable.factor = FALSE)

output below using data.table master:

List of 2
 $ varsYZ:List of 1
  ..$ : chr "employed"
 $ DTagg :Classes ‘data.table’ and 'data.frame':	4 obs. of  3 variables:
  ..$ pers    : chr [1:4] "2010__1" "2010__2" "2011__1" "2011__2"
  ..$ percoun : chr [1:4] "1" "1" "1" "1"
  ..$ employed: num [1:4] 2742 3479 2259 2622
  ..- attr(*, "sorted")= chr [1:2] "pers" "percoun"
  ..- attr(*, ".internal.selfref")=<externalptr>

and same for data.table release, which suggests varsYZ (list of length 1) should be changed to character vector.

tdhock · 2024-04-11T05:07:23Z

the fix is to add the following line in the revdep code, I will file a PR.

  if (length(varsYZ)==1) varsYZ <- unlist(varsYZ)

jangorecki · 2024-04-11T07:38:57Z

From report you filled downstream it doesn't sound it was a bug in revdep code. Therefore why not to add, at least, one extra release cycle for transition, not only CRAN packages but all the others? Currently used approach was never used in data.table, Matt was pulling out breaking changes, and sometimes even deciding to not proceed with them at all. Giving extra release cycle of time, and a notice in NEWS file sounds as softer way to approach this change.

tdhock · 2024-04-11T18:46:37Z

good idea

MichaelChirico · 2024-04-12T01:08:06Z

I think in this case, because downstream is relying on undocumented behavior, we can be a bit more aggressive than usual: I would move to a warning in the next release, then error, then remove code supporting the old way.

jangorecki · 2024-04-12T06:54:24Z

warning means that a package check will start to fail already from next release, and that sounds discouraging for new pkgs to import DT

MichaelChirico · 2024-04-12T15:01:21Z

warning means that a package check will start to fail already from next release, and that sounds discouraging for new pkgs to import DT

warning in example is not registered by R CMD check (just confirmed)

MichaelChirico · 2024-07-30T05:06:14Z

@tdhock I filed a suggested improvement to your downstream PR. However, IMO it makes their code superficially worse by introducing inconsistency across the length==1/length>1 cases -- is there a strong reason not to support that style (e.g. doing so would require some contradictory/collided behavior & we prefer the new approach)?

If we plan to go ahead with breaking downstream, please make sure to add a relevant NEWS item explaining the change of behavior, and inform downstream of the imminent release; otherwise, please help to file a PR for the continued support of measure=list(<single vector>).

tdhock · 2024-07-30T10:46:28Z

Right, the breaking change was made for increased consistency in data.table, see #5209
This is another example of bringing the implementation up to date with the docs, so not sure if a deprecation cycle is warranted. It definitely would be easier for us not to do one.
Since this is a weird corner case, and there is only one known revdep, I tend to err on the side of just making & explaining the breaking change. What do you think?
More generally, we should probably write a policy for when breaking changes are warranted (and when deprecation cycles are necessary). I believe the case of bringing the implementation up to date with the docs should be one example when a breaking change is warranted.

MichaelChirico · 2024-07-30T16:47:09Z

I do agree it's not ideal to bend over backwards for one revdep, however

not all revdeps that are broken will be revealed by our testing -- many packages simply have poor test coverage!
user space (scripts, not packages) is even bigger and almost completely unknown/untested

So we tend to err on the side of caution wherever possible -- that's the "strong preference" for back-compatibility in the GOVERNANCE doc.

Adding consistency is good, but unless there's a new code path that prevents us from maintaining more back-compatibility in the short term, we should endeavor not to break downstreams without any deprecation cycle.

In this case, too, I worry from the example we did find that we are reducing consistency -- downstream goes from doing if (is.null(B)) list(A) else list(A, B) (parallel use of list() across branches) to if (is.null(B)) A else list(A, B); the former is more consistent to my reading. Is there an even nicer approach they should be using instead?

More generally, we should probably write a policy for when breaking changes are warranted (and when deprecation cycles are necessary).

I agree it would be useful to have this written out.

tdhock · 2024-08-03T13:55:08Z

revdep checks confirm this is no longer an issue for vardpoor

tdhock · 2024-08-03T14:13:32Z

I agree it would be useful to have this written out.

Here is a first attempt:
https://github.com/Rdatatable/data.table/wiki/Release-management-and-revdep-checks#when-is-a-breaking-change-warranted

tdhock self-assigned this Apr 11, 2024

tdhock added the revdep Reverse dependencies label Apr 11, 2024

tdhock mentioned this issue Apr 11, 2024

unlist varsYZ for compatibility with upcoming data.table CSBLatvia/vardpoor#26

Open

tdhock added this to the 1.16.0 milestone Jun 5, 2024

tdhock mentioned this issue Aug 1, 2024

melt warns for measure.vars=list of length=1 #6333

Merged

tdhock closed this as completed in #6333 Aug 1, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

revdep vardpoor example failures after melt change #6071

revdep vardpoor example failures after melt change #6071

tdhock commented Apr 11, 2024

tdhock commented Apr 11, 2024

tdhock commented Apr 11, 2024

tdhock commented Apr 11, 2024

jangorecki commented Apr 11, 2024

tdhock commented Apr 11, 2024

MichaelChirico commented Apr 12, 2024

jangorecki commented Apr 12, 2024

MichaelChirico commented Apr 12, 2024

MichaelChirico commented Jul 30, 2024

tdhock commented Jul 30, 2024

MichaelChirico commented Jul 30, 2024

tdhock commented Aug 3, 2024

tdhock commented Aug 3, 2024

revdep vardpoor example failures after melt change #6071

revdep vardpoor example failures after melt change #6071

Comments

tdhock commented Apr 11, 2024

tdhock commented Apr 11, 2024

tdhock commented Apr 11, 2024

tdhock commented Apr 11, 2024

jangorecki commented Apr 11, 2024

tdhock commented Apr 11, 2024

MichaelChirico commented Apr 12, 2024

jangorecki commented Apr 12, 2024

MichaelChirico commented Apr 12, 2024

MichaelChirico commented Jul 30, 2024

tdhock commented Jul 30, 2024

MichaelChirico commented Jul 30, 2024

tdhock commented Aug 3, 2024

tdhock commented Aug 3, 2024