diagnostics (and more) broken by string-valued array, cdscan broken by 0-d array #2145
Comments
@painter1 can you please attach a zip of the failing files here? @mcenerney1 gave us some that work on my system without patching. It is VERY disturbing if cdscan behaves differently on the same files depending on the system.
Unzip this little file and run
I'll attach a demo file for the 'N/A' problem in my next message.
@painter1 on my Ubuntu:
without your full patch
This zip file contains a NetCDF file cam_dw.nc. It has only one variable, date_written. This will give you an error:
That's not the error I had complained about, though. Or you can run cdscan first:
Now the error message is about 'N/A', roughly as I had described.
@doutriaux1 What exactly does your cleanupAttrs() function, the one that worked for you, look like?
I just looked at the cdscan.py on GitHub, UV-CDAT/cdms, master branch. It has a better cleanupAttrs() than the one I got with my UV-CDAT 2.8, one which should work. I see that @dnadeau4 fixed it two weeks ago. So I suppose that you, @doutriaux1, are running the very latest UV-CDAT while I was running the 2.8 release version.
@painter1 this is actually using cdscan before any patch...
@painter1 ok I get the same error as you on crunchy with an older version of cdscan. I'm going to try to apply your patch and see if that makes any difference.
There are two problems with the new UV-CDAT 2.8 which I have had to fix in order to get the diagnostics (uvcmetrics) to work.
First, suppose you have several related NetCDF files. Run cdscan to produce an xml file describing them all. Usually you can't! Generally the variables in the files will have a missing_value attribute. This will be a 0-d array (thus representing a scalar), so the cdscan function cleanupAttrs() raises an exception when it tries to compute len(attval), because len() is not defined for a 0-d array. The solution is to catch the exception; I'll paste a working version of this function below.
Second, if a Numpy array is string-valued, Numpy assigns the array a fill_value attribute of 'N/A'. (It's a stupid choice, but we have to live with it.) In UV-CDAT, if you read a NetCDF file containing a variable which is a string-valued array, the original fill_value attribute gets propagated into the missing_value attribute of the variable. So far, so good. That is harmless to the diagnostics, although it is troublesome in general. For example, you can't use UV-CDAT to read such a variable.
The problem for the diagnostics arises when there are several NetCDF files, each containing a string-valued array. Again, run cdscan to produce an xml file. This works if cdscan has been patched as described here. Now open that xml file. It breaks. When you call cdms2.open(), all variables are made available. That includes the string-valued variables even if you don't want them. Look in avariable.py, at AbstractVariable.__init__(). This has a test "numpy.isnan(self.missing_value)". But the function isnan() fails on strings, and our self.missing_value is 'N/A', a string! The local solution is to test for a string first. I'll include sample code at the end.
But note that I called this the "local" solution. I think that a better solution is to replace that 'N/A' with None right away, when it comes in as a fill_value attribute from Numpy. The patch below is just an easy way to get things going for uvcmetrics.
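As a rough illustration of that "better solution", the string-valued fill value could be dropped at the point where it enters. This is only a sketch: normalize_fill_value is a hypothetical helper name, not part of cdms2 or Numpy, and where exactly such a call would belong in UV-CDAT is left open.

```python
import numpy


def normalize_fill_value(fill_value):
    """Hypothetical helper: map a string-valued fill value to None.

    Numeric fill values pass through unchanged.
    """
    if isinstance(fill_value, (str, bytes)):
        # A string such as 'N/A' cannot be used as a numeric missing
        # value, so treat the variable as having no missing value at all.
        return None
    return fill_value


# Numpy's default fill value for a string-valued array is a string.
string_fill = numpy.ma.default_fill_value(numpy.array(["a", "b"]))
cleaned = normalize_fill_value(string_fill)
```

With this in place, later code would see None instead of 'N/A' and would not need its own string check.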
Here is the replacement function for cleanupAttrs() in cdscan.py:
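The idea can be sketched as follows. This is a minimal illustration, not the actual cdscan.py code: the name cleanup_attrs_sketch is hypothetical, and the handling of arrays with more than one element (converted to a string) is an assumption. The only point it is meant to show is catching the exception that len() raises on a 0-d array.

```python
import numpy


def cleanup_attrs_sketch(attrs):
    """Hypothetical sketch: replace array-valued attributes in a dict
    with plain values, tolerating 0-d arrays."""
    for name in list(attrs.keys()):
        value = attrs[name]
        if not isinstance(value, numpy.ndarray):
            continue
        try:
            n = len(value)
        except TypeError:
            # len() raises TypeError on a 0-d array ("unsized object");
            # such an array represents a scalar, so extract it.
            attrs[name] = value.item()
            continue
        if n == 1:
            attrs[name] = value[0]
        else:
            # Assumption: longer arrays are stored as their string form.
            attrs[name] = str(value)
    return attrs


attrs = {
    "missing_value": numpy.array(1.0e20),  # 0-d array
    "valid_min": numpy.array([5]),         # 1-element array
    "units": "K",                          # already a plain value
}
cleanup_attrs_sketch(attrs)
```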
Here is an improved if...elif...else clause from the end of AbstractVariable.__init__():
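The "test for a string first" idea can be sketched on its own, outside cdms2. The helper name missing_value_is_nan is hypothetical and this is not the actual clause from avariable.py; it only shows how to guard numpy.isnan() against a string such as 'N/A'.

```python
import numpy


def missing_value_is_nan(missing_value):
    """Hypothetical helper: True only if missing_value is a numeric NaN."""
    # Test for a string (or None) first: numpy.isnan() raises TypeError
    # on string input such as 'N/A'.
    if missing_value is None or isinstance(missing_value, (str, bytes)):
        return False
    try:
        return bool(numpy.any(numpy.isnan(missing_value)))
    except TypeError:
        # Any other type that isnan() cannot handle is not a NaN either.
        return False


for candidate in ("N/A", float("nan"), 1.0e20, None):
    print(repr(candidate), missing_value_is_nan(candidate))
```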