
CST_PeriodStandardization (2 issues: dates with ref_period and NA's issue)


Hi @tkariyat and @abatalla (not sure if the issue should be addressed to you, please tag anyone else if needed),

R and packages version

Which R version are you using? R 4.1.2
Which R packages versions are you using? CSIndicators_1.1.1
Which machine are you using? WS

Summary

Bug: there are 2 different bugs:

  • First, an issue with the parameter ref_period: when I use it, I get a warning that it is not used because the parameter dates is not provided. However, I do provide the dates in the s2dv_cube, in attrs$Dates (see the example below).
  • Second, regardless of ref_period, when using handle_infinity = TRUE I would not expect NA's anywhere the original data didn't have NA's, yet I do get them. The sample data that I'm providing (see "Other Relevant Information" below) has NA's in time 1 and 2, but nowhere else; the result has NA's in other places. It seems that the leadtime information (dimension "time") is somehow being mixed in the calculation, and I don't think this should happen: leadtimes should always be independent.

Example

load('./sample_data_spei.RData')

# Issue1 about Dates:
test <- CST_PeriodStandardization(data = sample_data_spei, handle_infinity = TRUE, ref_period = list(1994,2016))

#Warning message:
#In PeriodStandardization(data = data$data, data_cor = data_cor$data,  :
#  Parameter 'dates' is not provided so 'ref_period' can't be used.

class(sample_data_spei$attrs$Dates)
#[1] "POSIXct" "POSIXt"

# Issue2 mixing leadtimes and including NA from first leadtimes in the final result (other leadtimes):

summary(sample_data_spei$data)
#   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
#-469.79   -8.67   89.62   84.83  170.46 1695.77   69414

dim(sample_data_spei$data)
#latitude    syear     time ensemble
#    1509       23        8        1

summary(sample_data_spei$data[,,3:8,])
#    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
#-469.790   -8.666   89.616   84.831  170.464 1695.765

# NAs come from leadtimes 1 and 2

summary(test$data)
#  Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
#  -8.56   -0.78    0.01    0.00    0.76    7.13   69629

dim(test$data)
#latitude    syear     time ensemble
#    1509       23        8        1

summary(test$data[,,3:8,])
#    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.     NA's
#-8.56118 -0.77889  0.01000  0.00384  0.76391  7.12988      215
# I wouldn't expect NA's here, as handle_infinity = TRUE and the first 2 leadtimes (where the original data had NA's) are excluded from this subset
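
To narrow down issue 2, a check along these lines could show in which leadtimes the new NA's appear. It is only a minimal sketch on synthetic arrays: `input` and `output` are hypothetical stand-ins for `sample_data_spei$data` and `test$data`, and the NA positions in `output` are made up for illustration.

```r
# Synthetic stand-ins (hypothetical values) with the same dimension order as the report
dims <- c(latitude = 4, syear = 5, time = 3, ensemble = 1)
set.seed(1)
input <- array(rnorm(prod(dims)), dim = dims)
input[, , 1, ] <- NA          # NA's only in the first leadtime, as in the original data

output <- input               # pretend this is the standardized result
output[2, 3, 2, 1] <- NA      # made-up NA's that were not present in the input
output[4, 1, 3, 1] <- NA

# TRUE where the result has an NA that the input did not have
new_na <- is.na(output) & !is.na(input)

# Number of newly introduced NA's per leadtime (3rd dimension, "time")
new_na_per_time <- apply(new_na, 3, sum)

# Array indices (one row per cell) of the newly introduced NA's
new_na_idx <- which(new_na, arr.ind = TRUE)

print(new_na_per_time)
print(new_na_idx)
```

With the real objects, replacing `input` and `output` by `sample_data_spei$data` and `test$data` should list which latitude/syear/time cells gained an NA.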

Other Relevant Information

sample_data_spei.RData

The only coordinate of the data is latitude because the data has been aggregated by regions, and the latitude is that of the centroid of each region (which I need to calculate SPEI).

FYI @eball