add simple check for variables with different capitalization #84

Merged
merged 14 commits into from
Mar 5, 2024
2 changes: 1 addition & 1 deletion .buildlibrary
@@ -1,4 +1,4 @@
ValidationKey: '6013424'
ValidationKey: '6035035'
AcceptedWarnings:
- 'Warning: package ''.*'' was built under R version'
- 'Warning: namespace ''.*'' is not available and has been replaced'
4 changes: 2 additions & 2 deletions CITATION.cff
@@ -2,8 +2,8 @@ cff-version: 1.2.0
message: If you use this software, please cite it using the metadata from this file.
type: software
title: 'gms: ''GAMS'' Modularization Support Package'
version: 0.30.4
date-released: '2024-02-28'
version: 0.30.5
date-released: '2024-03-05'
abstract: A collection of tools to create, use and maintain modularized model code
written in the modeling language 'GAMS' (<https://www.gams.com/>). Out-of-the-box
'GAMS' does not come with support for modularized model code. This package provides
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,8 +1,8 @@
Package: gms
Type: Package
Title: 'GAMS' Modularization Support Package
Version: 0.30.4
Date: 2024-02-28
Version: 0.30.5
Date: 2024-03-05
Authors@R: c(person("Jan Philipp", "Dietrich", email = "[email protected]", role = c("aut","cre")),
person("David", "Klein", role = "aut"),
person("Anastasis", "Giannousakis", role = "aut"),
152 changes: 109 additions & 43 deletions R/checkAppearance.R
@@ -1,72 +1,138 @@
#' checkAppearance
#'
#'
#' Checks for all declared objects in which parts of the model they appear and
#' calculates the type of each object (core object, interface object, module
#' object of module xy,...)
#'
#'
#'
#'
#' @param x A code list as returned by \code{\link{codeExtract}}
#' @param capitalExclusionList A vector of names that should be ignored when
#' checking for unified capitalization of variables
#' @return A list with four elements: appearance, setappearance, type and
#' warnings. Appearance is a matrix containing values which indicate whether an
#' object appears in a part of the code or not (e.g. indicates whether "vm_example"
#' warnings. Appearance is a matrix containing values which indicate whether an
#' object appears in a part of the code or not (e.g. indicates whether "vm_example"
#' appears in realization "on" of module "test" or not.). 0 means that it does not appear,
#' 1 means that it appears in the code and 2 means that it appears in the
#' not_used.txt. setappearance contains the same information but for sets instead of other
#' objects. Type is a vector containing the type of each object (excluding sets). And warnings
#' contains a list of warnings created during that process.
#' @author Jan Philipp Dietrich

#' @export
#' @seealso \code{\link{codeCheck}},\code{\link{readDeclarations}}
checkAppearance <- function(x) {
checkAppearance <- function(x, capitalExclusionList = NULL) {
w <- NULL
ptm <- proc.time()["elapsed"]
message(" Running checkAppearance...")
colnames <- unique(names(x$code))
rownames <- unique(x$declarations[,"names"])
if(!is.null(x$not_used)) rownames <- unique(c(rownames,x$not_used[,"name"]))

# remove right-hand sides in execute_load statetements as there are non-module-related
rownames <- unique(x$declarations[, "names"])

if (!is.null(x$not_used)) rownames <- unique(c(rownames, x$not_used[, "name"]))

# check for variables with different capitalization in declarations
if (length(rownames[duplicated(tolower(rownames))]) > 0) {
w <- .warning(paste0(
"Found variables with more than one capitalization in declarations and not_used.txt files: ",
paste0(rownames[duplicated(tolower(rownames))], collapse = ", ")
), w = w)

}

# remove right-hand sides in execute_load statements as there are non-module-related
# object names allowed (here one refers to the names in the gdx file, but these
# could come from other modules)
tmp <- grep("execute_load",x$code,ignore.case=TRUE)
x$code[tmp] <- gsub("=[^,]*","",x$code[tmp])

tmp_func <- function(name,x) { return(paste(x[names(x)==name],collapse=" "))}
tmp <- sapply(colnames,tmp_func,x$code)

#add empty entry in tmp for module realization which do not contain any code but have a not_used.txt
not_used_names <- unique(dimnames(x$not_used)[[1]])
missing <- not_used_names[!(not_used_names %in% colnames)]
if(length(missing)>0) {
mtmp <- rep("",length(missing))
tmp <- grep("execute_load", x$code, ignore.case = TRUE)
x$code[tmp] <- gsub("=[^,]*", "", x$code[tmp])

tmp <- sapply(colnames, function(name, x) {
return(paste(x[names(x) == name], collapse = " "))
}, x$code)

# add empty entries in tmp for module realizations that contain no code but have a not_used.txt
notUsedNames <- unique(dimnames(x$not_used)[[1]])
missing <- notUsedNames[!(notUsedNames %in% colnames)]
if (length(missing) > 0) {
mtmp <- rep("", length(missing))

names(mtmp) <- missing
tmp <- c(tmp,mtmp)
colnames <- c(colnames,missing)
tmp <- c(tmp, mtmp)
colnames <- c(colnames, missing)

}
message(" Start variable matching... (time elapsed: ",format(proc.time()["elapsed"]-ptm,width=6,nsmall=2,digits=2),")")
# This part is the most time consuming (90% of the time in codeCheck) Here, the variable names are searched for in
# all module realizations. This process primarily seems to scale with the number of variables and not with the number
# of module realizations.It is hard to optimize since the number of variables that the code has to look for can

declarationsRegex <- paste("(^|[^[:alnum:]_])", escapeRegex(rownames), "($|[^[:alnum:]_])", sep = "")

message(" Start variable matching... (time elapsed: ",
format(proc.time()["elapsed"] - ptm, width = 6, nsmall = 2, digits = 2), ")")

# This part is the most time consuming (90% of the time in codeCheck). Here, the variable names are searched for in
# all module realizations. This process primarily seems to scale with the number of variables and not with the number
# of module realizations. It is hard to optimize since the number of variables that the code has to look for can
# hardly be reduced
a <- t(sapply(paste("(^|[^[:alnum:]_])",escapeRegex(rownames),"($|[^[:alnum:]_])",sep=""),grepl,tmp, perl = TRUE))

message(" Finished variable matching... (time elapsed: ",format(proc.time()["elapsed"]-ptm,width=6,nsmall=2,digits=2),")")

a <- t(sapply(declarationsRegex, grepl, tmp, perl = TRUE))

message(" Finished variable matching... (time elapsed: ",
format(proc.time()["elapsed"] - ptm, width = 6, nsmall = 2, digits = 2), ")")

dimnames(a)[[1]] <- rownames
dimnames(a)[[2]] <- colnames
w <- NULL
if(!is.null(x$not_used)){
for(i in 1:dim(x$not_used)[1]) {
if(a[x$not_used[i,"name"],dimnames(x$not_used)[[1]][i]]) {
w <- .warning(x$not_used[i,"name"]," appears in not_used.txt of module ",dimnames(x$not_used)[[1]][i]," but is used in the GAMS code of it!",w=w)

# Find variables with different capitalization

code <- x$code
# exclude \" comments \"
code <- gsub("\\\".*\\\"", "", code)
# exclude any text surrounded by single quotes
code <- gsub("'.*'", "", code)
# exclude display statements
code <- gsub("display.*", "", code)

message(" Start var capitalization check... (time elapsed: ",
format(proc.time()["elapsed"] - ptm, width = 6, nsmall = 2, digits = 2), ")")

# check for each variable if it appears with different capitalization in the code
duplicates <- sapply(declarationsRegex, function(x) {
# get all lines of code that match the variable (case insensitive)
chunks <- code[grepl(x, code, ignore.case = TRUE)]
# if the case-sensitive search yields fewer results, there must be occurrences with different capitalization
return(length(chunks) != length(chunks[grepl(x, chunks, ignore.case = FALSE)]))
})

if (length(setdiff(rownames[duplicates], capitalExclusionList)) > 0) {

duplicateNames <- unname(setdiff(rownames[duplicates], capitalExclusionList))

msg <- paste0(
"Found variables with more than one capitalization in the codebase: ",
paste0(duplicateNames, collapse = ", "), "\n"

)

for (dup in duplicateNames) {
msg <- paste0(msg, "- Lines found for item '", dup, "':\n")
dup <- paste("(^|[^[:alnum:]_])", escapeRegex(dup), "($|[^[:alnum:]_])", sep = "")
chunks <- code[grepl(dup, code, ignore.case = TRUE)]
msg <- paste0(msg, paste0(setdiff(chunks, chunks[grepl(dup, chunks, ignore.case = FALSE)]), collapse = "\n"))
msg <- paste0(msg, "\n")

}

w <- .warning(msg, w = w)

}

message(" Finished var capitalization check... (time elapsed: ",
format(proc.time()["elapsed"] - ptm, width = 6, nsmall = 2, digits = 2), ")")

if (!is.null(x$not_used)) {
for (i in 1:dim(x$not_used)[1]) {
if (a[x$not_used[i, "name"], dimnames(x$not_used)[[1]][i]]) {
w <- .warning(x$not_used[i, "name"], " appears in not_used.txt of module ", dimnames(x$not_used)[[1]][i],
" but is used in the GAMS code of it!", w = w)

}
a[x$not_used[i,"name"],dimnames(x$not_used)[[1]][i]] <- 2
a[x$not_used[i, "name"], dimnames(x$not_used)[[1]][i]] <- 2

}
}
sets <- x$declarations[x$declarations[,"type"]=="set","names"]
a_sets <- a[sets,,drop=FALSE]
a <- a[!(rownames(a)%in%sets),,drop=FALSE]
type <- sub("^(o|)[^_]*?(m|[0-9]{2}|)_.*$","\\1\\2",dimnames(a)[[1]])

sets <- x$declarations[x$declarations[, "type"] == "set", "names"]
aSets <- a[sets, , drop = FALSE]
a <- a[!(rownames(a) %in% sets), , drop = FALSE]
type <- sub("^(o|)[^_]*?(m|[0-9]{2}|)_.*$", "\\1\\2", dimnames(a)[[1]])
names(type) <- dimnames(a)[[1]]
return(list(appearance=a,setappearance=a_sets,type=type,warnings=w))
return(list(appearance = a, setappearance = aSets, type = type, warnings = w))
}
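
The two capitalization checks added to `checkAppearance` can be tried in isolation. The following is a minimal sketch, not the package's implementation: the declared names and code lines are made up, and `wordRegex` is a hypothetical stand-in that skips `escapeRegex` because the sample names contain no regex metacharacters.

```r
# Hypothetical declared names and GAMS code lines, for illustration only.
declared <- c("vm_example", "pm_price", "VM_example")
code <- c("vm_example.l(i) = 1;", "q_cost.. VM_Example(i) =e= pm_price(i);")

# 1) Declarations that differ only in capitalization.
declaredDuplicates <- declared[duplicated(tolower(declared))]

# Hypothetical helper: match a name only as a whole identifier.
wordRegex <- function(name) {
  paste0("(^|[^[:alnum:]_])", name, "($|[^[:alnum:]_])")
}

# 2) Names that appear in the code with a capitalization other than the declared one:
#    compare the number of matching lines with and without case sensitivity.
hasOtherCapitalization <- vapply(declared, function(name) {
  pattern <- wordRegex(name)
  anyCase <- grepl(pattern, code, ignore.case = TRUE)
  exactCase <- grepl(pattern, code, ignore.case = FALSE)
  sum(anyCase) != sum(exactCase)
}, logical(1))

print(declaredDuplicates)
print(hasOtherCapitalization)
```

Because the comparison counts matching lines, a line that contains both the declared spelling and a differently capitalized one would not be flagged by this sketch.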
19 changes: 15 additions & 4 deletions R/codeCheck.R
@@ -4,6 +4,10 @@
#' in the code and returns a list containing the interfaces of each module of
#' the code.
#'
#' Additional settings can be provided via a yaml file ".codeCheck" in the main
#' folder of the model. Currently supported settings are:
#' - capitalExclusionList: a list of names that should be ignored when checking
#' for unified capitalization of variables
#'
#' @param path path of the main folder of the model
#' @param modulepath path to the module folder relative to "path"
@@ -12,7 +16,7 @@
#' debugging the codeCheck function
#' @param interactive activates an interactive developer mode in which some of
#' the warnings can be fixed interactively.
#' @param test_switches (boolean) Should realization switches in model core be tested for completness?
#' @param test_switches (boolean) Should realization switches in model core be tested for completeness?
#' Usually set to TRUE but should be set to FALSE for standalone models only using a subset of
#' existing modules
#' @param strict (boolean) test strictness. If set to TRUE warnings from codeCheck will stop calculations
@@ -144,7 +148,6 @@
return(list(gams = gams, w = w))
}


.getInterfaceInfo <- function(ap, gams, w) {
# setting up a list of used interfaces for each module
interfaceInfo <- list()
@@ -193,8 +196,8 @@

ptm <- proc.time()["elapsed"]


modulesInfo <- getModules(paste0(path, "/", modulepath))

gams <- .collectData(path = path, modulepath = modulepath, coreFiles = core_files, modulesInfo = modulesInfo)

if (returnDebug) {
@@ -213,7 +216,15 @@
.emitTimingMessage(" Naming conventions check done...", ptm)

# Check appearance of objects
ap <- checkAppearance(gams)

capitalExclusionList <- NULL

if (file.exists(file.path(path, ".codeCheck"))) {
# read in exclusions for capitalization check from .codeCheck
capitalExclusionList <- read_yaml(file.path(path, ".codeCheck"))[["capitalExclusionList"]]

}

ap <- checkAppearance(gams, capitalExclusionList = capitalExclusionList)
w <- c(w, ap$warnings)

.emitTimingMessage(" Investigated variable appearances...", ptm)
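
For reference, a `.codeCheck` settings file and the way it is read might look as follows. This is a sketch under assumptions: the two variable names are hypothetical, the file is written to a temporary folder rather than a real model folder, and the `yaml` package is assumed to be installed (the read is skipped otherwise).

```r
# Create a throwaway "model folder" containing a .codeCheck file.
modelDir <- tempfile("model")
dir.create(modelDir)
writeLines(c(
  "capitalExclusionList:",
  "  - vm_example", # hypothetical name to exclude from the capitalization check
  "  - pm_Example"  # hypothetical name to exclude from the capitalization check
), file.path(modelDir, ".codeCheck"))

# Read the exclusion list the same way the diff above does, if the file exists.
capitalExclusionList <- NULL
configFile <- file.path(modelDir, ".codeCheck")
if (file.exists(configFile) && requireNamespace("yaml", quietly = TRUE)) {
  capitalExclusionList <- yaml::read_yaml(configFile)[["capitalExclusionList"]]
}
print(capitalExclusionList)
```

If the file is missing, or has no `capitalExclusionList` key, the value stays `NULL` and no names are excluded from the check.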
8 changes: 4 additions & 4 deletions README.md
@@ -1,6 +1,6 @@
# 'GAMS' Modularization Support Package

R package **gms**, version **0.30.4**
R package **gms**, version **0.30.5**

[![CRAN status](https://www.r-pkg.org/badges/version/gms)](https://cran.r-project.org/package=gms) [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.4390032.svg)](https://doi.org/10.5281/zenodo.4390032) [![R build status](https://github.com/pik-piam/gms/workflows/check/badge.svg)](https://github.com/pik-piam/gms/actions) [![codecov](https://codecov.io/gh/pik-piam/gms/branch/master/graph/badge.svg)](https://app.codecov.io/gh/pik-piam/gms) [![r-universe](https://pik-piam.r-universe.dev/badges/gms)](https://pik-piam.r-universe.dev/builds)

@@ -43,7 +43,7 @@ In case of questions / problems please contact Jan Philipp Dietrich <dietrich@pi

To cite package **gms** in publications use:

Dietrich J, Klein D, Giannousakis A, Beier F, Koch J, Baumstark L, Pflüger M, Richters O (2024). _gms: 'GAMS' Modularization Support Package_. doi:10.5281/zenodo.4390032 <https://doi.org/10.5281/zenodo.4390032>, R package version 0.30.4, <https://github.com/pik-piam/gms>.
Dietrich J, Klein D, Giannousakis A, Beier F, Koch J, Baumstark L, Pflüger M, Richters O (2024). _gms: 'GAMS' Modularization Support Package_. doi:10.5281/zenodo.4390032 <https://doi.org/10.5281/zenodo.4390032>, R package version 0.30.5, <https://github.com/pik-piam/gms>.

A BibTeX entry for LaTeX users is

@@ -52,8 +52,8 @@ A BibTeX entry for LaTeX users is
title = {gms: 'GAMS' Modularization Support Package},
author = {Jan Philipp Dietrich and David Klein and Anastasis Giannousakis and Felicitas Beier and Johannes Koch and Lavinia Baumstark and Mika Pflüger and Oliver Richters},
year = {2024},
note = {R package version 0.30.4},
url = {https://github.com/pik-piam/gms},
note = {R package version 0.30.5},
doi = {10.5281/zenodo.4390032},
url = {https://github.com/pik-piam/gms},
}
```
9 changes: 6 additions & 3 deletions man/checkAppearance.Rd

8 changes: 7 additions & 1 deletion man/codeCheck.Rd
