Title: | Hydrologic Model Evaluation and Time-Series Tools |
---|---|
Description: | Facilitates the analysis and evaluation of hydrologic model output and time-series data with functions focused on comparison of modeled (simulated) and observed data, period-of-record statistics, and trends. |
Authors: | Colin Penn [aut, cre], Caelan Simeone [aut], Sara Levin [aut], Samuel Saxe [aut], Sydney Foks [aut], Robert Dudley [dtc], Glenn Hodgkins [dtc], Timothy Hodson [aut], Thomas Over [dtc], Amy Russell [dtc] |
Maintainer: | Colin Penn <[email protected]> |
License: | CC0 |
Version: | 1.1.3 |
Built: | 2024-11-27 02:53:44 UTC |
Source: | https://github.com/cran/HyMETT |
Please see doi:10.5066/P9FNXEWI for more details.
Calculate benchmark Kling–Gupta efficiency (KGE) values from daily observed time-series data
benchmark_KGE_DOY(obs_preproc)
obs_preproc | 'data.frame' of daily observational data, preprocessed as output from |
This function calculates a "benchmark" KGE value (see Knoben and others, 2020) from a daily
observed time series. First, the interannual mean and median are calculated for each day of
the calendar year. Next, those interannual values are joined to each corresponding day in the
observation time series. Finally, a KGE value (GOF_kling_gupta_efficiency) is calculated
comparing the repeated mean or median time series to the daily observational time series.
These benchmark KGE values can be used as comparisons for modeled (simulated) calibration results.
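The three steps above can be sketched in base R. This is a minimal illustration, not the package's implementation: the synthetic series, the helper name kge, and the use of ave() for the join are all assumptions made for the example.

```r
# Hypothetical daily series: three non-leap years, seasonal cycle plus noise
set.seed(1)
dates <- seq(as.Date("2001-01-01"), as.Date("2003-12-31"), by = "day")
doy   <- as.integer(format(dates, "%j"))
obs   <- 10 + 5 * sin(2 * pi * doy / 365) + rnorm(length(dates), sd = 0.5)

# Kling-Gupta efficiency (Gupta and others, 2009), written out for illustration
kge <- function(mod, obs) {
  r     <- cor(mod, obs)
  alpha <- sd(mod) / sd(obs)
  beta  <- mean(mod) / mean(obs)
  1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)
}

# Steps 1 and 2: interannual mean and median per day of year,
# repeated for every day of the record
doy_mean   <- ave(obs, doy, FUN = mean)
doy_median <- ave(obs, doy, FUN = median)

# Step 3: KGE of each repeated series against the observations
benchmark <- data.frame(
  KGE_DOY_mean   = kge(doy_mean, obs),
  KGE_DOY_median = kge(doy_median, obs)
)
benchmark
```

A model whose KGE falls below these values does no better than a day-of-year climatology of the observations.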
A data.frame with columns "KGE_DOY_mean" and "KGE_DOY_median".
Knoben, W.J.M, Freer, J.E., Peel, M.C., Fowler, K.J.A, Woods, R.A., 2020. A Brief Analysis of
Conceptual Model Structure Uncertainty Using 36 Models and 559 Catchments: Water Resources
Research, v. 56.
[Also available at https://doi.org/10.1029/2019WR025975.]
benchmark_KGE_DOY(obs_preproc = example_preproc)
Calculate annual flow statistics from daily data
calc_annual_flow_stats( data = NULL, Date, year_group, Q, Q3 = NA_real_, Q7 = NA_real_, Q30 = NA_real_, jd = NA_integer_, calc_high = FALSE, calc_low = FALSE, calc_percentiles = FALSE, calc_monthly = FALSE, calc_WSCVD = FALSE, longitude = NA, calc_ICVD = FALSE, zero_threshold = 33, quantile_type = 8, na.action = c("na.omit", "na.pass") )
data | 'data.frame'. Optional data.frame input, with columns containing |
Date | 'Date' or 'character' vector when |
year_group | 'numeric' vector when |
Q | 'numeric' vector when |
Q3 | 'numeric' vector when |
Q7 | 'numeric' vector when |
Q30 | 'numeric' vector when |
jd | 'numeric' vector when |
calc_high | 'boolean' value. Calculate high flow statistics for years in |
calc_low | 'boolean' value. Calculate low flow statistics for years in |
calc_percentiles | 'boolean' value. Calculate percentiles for years in |
calc_monthly | 'boolean' value. Calculate monthly statistics for years in |
calc_WSCVD | 'boolean' value. Calculate winter-spring center volume date for years in |
longitude | 'numeric' value. Site longitude in North American Datum of 1983 (NAD83), required in WSCVD calculation. Default is |
calc_ICVD | 'boolean' value. Calculate inverse center volume date for years in |
zero_threshold | 'numeric' value as percentage. The percentage of years of a statistic that need to be zero in order for it to be deemed a zero flow site for that statistic. For use in trend calculation. See Details on attributes. Default is |
quantile_type | 'numeric' value. The distribution type used in the |
na.action | 'character' string indicating na.action passed to |
year_group is commonly water year, climate year, or calendar year.
Default annual statistics returned:
annual_mean
annual mean in year_group
annual_sd
annual standard deviation in year_group
annual_sum
annual sum in year_group
If calc_high/low are selected, annual statistics returned:
1-, 3-, 7-, and 30-day high/low and Julian date (jd) of n-day high/low.
high_qn
where n = 1, 3, 7, and 30
high_qn_jd
where n = 1, 3, 7, and 30
low_qn
where n = 1, 3, 7, and 30
low_qn_jd
where n = 1, 3, 7, and 30
If calc_percentiles is selected, annual statistics returned:
1, 5, 10, 25, 50, 75, 90, 95, 99 percentile based on daily streamflow.
annual_n_percentile
where n = 1, 5, 10, 25, 50, 75, 90, 95, and 99
If calc_monthly is selected, annual statistics returned:
Monthly mean, standard deviation, max, min, percent of annual for each month in year_group.
month_mean
monthly mean, where month = month.abb
month_sd
monthly standard deviation, where month = month.abb
month_max
monthly maximum, where month = month.abb
month_min
monthly minimum, where month = month.abb
month_percent_annual
monthly percent of annual, where month = month.abb
If calc_WSCVD is selected, Julian date of annual winter-spring center volume date is returned.
Longitude (in NAD83 datum) is used to determine the ending month of spring: July for longitudes
west of 95 degrees, May for longitudes east of 95 degrees. See Dudley and others (2017) in
References. Commonly calculated when year_group is water year.
WSCVD
Julian date of winter-spring center volume
If calc_ICVD is selected, Julian date of annual inverse center volume date is returned.
Commonly calculated when year_group is climate year.
ICVD
Julian date of inverse center volume date
Attribute: zero_flow_years
A data.frame with each annual statistic calculated, the percentage of years where the
statistic = 0, a flag indicating if the percentage is over the zero_threshold parameter,
and the number of years with a zero value. Columns in zero_flow_years:
annual_stat
annual statistic
percent_zeros
percentage of years with 0 statistic value
over_threshold
boolean if percentage is over threshold
number_years
number of years with 0 value statistic
The zero_flow_years attribute can be useful in trend calculation, where a trend may not be
appropriate to calculate with many zero flow years.
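As a rough illustration of the default statistics and the zero_flow_years bookkeeping, the base R sketch below groups a small hypothetical daily table by water year. The data are invented, and treating "over" the threshold as a strict greater-than comparison is an assumption rather than something the documentation states.

```r
# Hypothetical daily flows for three water years; all 2003 values are zero
daily <- data.frame(
  WY = rep(c(2001, 2002, 2003), each = 4),
  Q  = c(1, 2, 3, 4,  2, 2, 2, 2,  0, 0, 0, 0)
)

# Default annual statistics, one row per year_group value
annual <- data.frame(
  WY          = sort(unique(daily$WY)),
  annual_mean = as.numeric(tapply(daily$Q, daily$WY, mean)),
  annual_sd   = as.numeric(tapply(daily$Q, daily$WY, sd)),
  annual_sum  = as.numeric(tapply(daily$Q, daily$WY, sum))
)

# Zero-flow summary for a single statistic, mirroring the zero_flow_years columns
zero_threshold <- 33
is_zero        <- annual$annual_mean == 0
zero_flow_years <- data.frame(
  annual_stat    = "annual_mean",
  percent_zeros  = 100 * mean(is_zero),
  over_threshold = 100 * mean(is_zero) > zero_threshold,  # strict ">" is assumed
  number_years   = sum(is_zero)
)
zero_flow_years
```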
A tibble (see tibble::tibble) with annual statistics depending on options selected.
See Details.
Dudley, R.W., Hodgkins, G.A, McHale, M.R., Kolian, M.J., Renard, B., 2017, Trends in snowmelt-related streamflow timing in the conterminous United States: Journal of Hydrology, v. 547, p. 208-221. [Also available at https://doi.org/10.1016/j.jhydrol.2017.01.051.]
calc_annual_flow_stats(data = example_preproc, Date = "Date", year_group = "WY", Q = "value")
Calculate trend in annual statistics
calc_annual_stat_trend(data = NULL, year, value, ...)
data | 'data.frame'. Optional data.frame input, with columns containing |
year | 'numeric' vector when |
value | 'numeric' vector when |
... | further arguments to be passed to or from |
This function is a wrapper for EnvStats::kendallTrendTest with the passed equation
value ~ year. The returned values include Mann-Kendall test statistic and p-value,
Theil-Sen slope and intercept values, and trend details (Millard, 2013; Helsel and others, 2020).
z_stat
Mann-Kendall test statistic, returned directly from EnvStats::kendallTrendTest
p_value
z_stat p-value, returned directly from EnvStats::kendallTrendTest
sen_slope
Sen slope in units value per year, returned directly from EnvStats::kendallTrendTest
intercept
Sen slope intercept, returned directly from EnvStats::kendallTrendTest
trend_mag
Trend magnitude over entire period, in units of value, calculated as
sen_slope * (max(year) - min(year))
val_beg/end
Calculated value at beginning or end of period, calculated as
sen_slope * year + intercept
val_perc_change
Percentage change over period, calculated as
(val_end - val_beg) / val_beg * 100
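The derived quantities can be worked through by hand on a small series. The sketch below computes a Theil-Sen slope in base R instead of calling EnvStats::kendallTrendTest; the data are hypothetical, and the intercept convention (a line through the medians of year and value) is an assumption about what the wrapped function returns.

```r
# Hypothetical annual series with an exact trend of 2 units per year
year  <- 2001:2010
value <- 2 * (year - 2000) + 5

# Theil-Sen slope: median of all pairwise slopes
idx       <- combn(length(year), 2)
slopes    <- (value[idx[2, ]] - value[idx[1, ]]) / (year[idx[2, ]] - year[idx[1, ]])
sen_slope <- median(slopes)

# Intercept through the medians (assumed convention)
intercept <- median(value) - sen_slope * median(year)

# Derived trend details, following the formulas in the list above
trend_mag       <- sen_slope * (max(year) - min(year))
val_beg         <- sen_slope * min(year) + intercept
val_end         <- sen_slope * max(year) + intercept
val_perc_change <- (val_end - val_beg) / val_beg * 100
```

Because val_perc_change divides by val_beg, it becomes unstable when the fitted starting value is near zero.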
A tibble (see tibble::tibble) with test statistic, p-value, trend coefficients, and
trend calculations. See Details.
Millard, S.P., 2013, EnvStats: An R Package for Environmental Statistics: New York, New York, Springer, 291 p. [Also available at https://doi.org/10.1007/978-1-4614-8456-1.]
Helsel, D.R., Hirsch, R.M., Ryberg, K.R., Archfield, S.A., and Gilroy, E.J., 2020, Statistical methods in water resources: U.S. Geological Survey Techniques and Methods, book 4, chap. A3, 458 p. [Also available at https://doi.org/10.3133/tm4a3.]
calc_annual_stat_trend(data = example_annual, year = "WY", value = "annual_mean")
Calculate logistic regression (Everitt and Hothorn, 2009) in annual statistics with zero values. A model fit to compute the probability of a zero flow annual statistic.
calc_logistic_regression(data = NULL, year, value, ...)
data | 'data.frame'. Optional data.frame input, with columns containing |
year | 'numeric' vector when |
value | 'numeric' vector when |
... | further arguments to be passed to or from |
This function is a wrapper for stats::glm(y ~ year, family = stats::binomial(link = "logit"))
with y = 1 when value = 0 (for example a zero flow annual statistic) and y = 0 otherwise.
The returned values include
p_value
Probability value of the explanatory (year) variable in the logistic model
stdErr_slope
Standard error of the regression slope (log odds per year)
odds_ratio
Exponential of the explanatory coefficient (year coefficient)
prob_beg/end
Logistic regression predicted (fitted) values at the beginning and ending year.
prob_change
Change in probability from beginning to end.
For example, an odds ratio of 1.05 means that the odds of a zero-flow year (versus a non-zero year) increase by a factor of 1.05 (5 percent) with each additional year.
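The returned quantities map directly onto a fitted stats::glm object. The following sketch uses an invented 30-year record in which zero values become more frequent over time; how the package itself extracts each value is not shown in this documentation, so the extraction steps here are an assumption.

```r
# Hypothetical record: 1 marks a year whose annual statistic is zero
year  <- 1991:2020
zero  <- c(0,0,0,0,1,0,0,0,1,0, 0,1,0,0,1,0,1,0,1,1, 0,1,1,0,1,1,1,0,1,1)
value <- ifelse(zero == 1, 0, 1.5)

# y = 1 when value = 0, y = 0 otherwise
y   <- as.integer(value == 0)
fit <- stats::glm(y ~ year, family = stats::binomial(link = "logit"))
cf  <- summary(fit)$coefficients

p_value      <- cf["year", "Pr(>|z|)"]
stdErr_slope <- cf["year", "Std. Error"]
odds_ratio   <- exp(cf["year", "Estimate"])

# Predicted probability of a zero-flow year at the first and last year
prob <- stats::predict(fit, newdata = data.frame(year = range(year)),
                       type = "response")
prob_beg    <- prob[[1]]
prob_end    <- prob[[2]]
prob_change <- prob_end - prob_beg
```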
A tibble (see tibble::tibble) with logistic regression p-value, standard error of
slope, odds ratio, beginning and ending probability, and probability change. See Details.
Everitt, B.S., and Hothorn, T., 2009, A Handbook of Statistical Analyses Using R, 2nd ed.: Boca Raton, Florida, Chapman and Hall/CRC, 376 p.
calc_logistic_regression(data = example_annual, year = "WY", value = "annual_mean")
Quantile of Pearson Type III distribution for log-transformed data
calc_qlpearsonIII(p, meanlog = 0, sdlog = 1, skew = 0)
p | Vector of non-exceedance probabilities, between 0 and 1, to calculate quantiles. |
meanlog | Vector of mean of the distribution of the log-transformed data. |
sdlog | Vector of standard deviation of the distribution of the log-transformed data. |
skew | Vector of skewness of the distribution of the log-transformed data. |
calc_qpearsonIII and calc_qlpearsonIII are functions to fit a log-Pearson type III
distribution from a given mean, standard deviation, and skew. This source code is replicated,
unchanged, from the smwrBase package in order to reduce the dependency on that package.
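For readers unfamiliar with the distribution, a Pearson Type III quantile can be obtained from a shifted and scaled gamma distribution. The sketch below is an independent base R illustration for scalar inputs, not the replicated smwrBase code; the helper names qp3 and qlp3 are hypothetical, and the use of natural logarithms in the log-space version is an assumption.

```r
# Pearson Type III quantile via the gamma distribution (scalar arguments only)
qp3 <- function(p, mean = 0, sd = 1, skew = 0) {
  if (abs(skew) < 1e-8) {
    return(stats::qnorm(p, mean, sd))   # zero skew reduces to the normal
  }
  shape <- 4 / skew^2                   # gamma shape whose skewness is |skew|
  if (skew > 0) {
    z <- (stats::qgamma(p, shape) - shape) / sqrt(shape)
  } else {
    # negative skew: mirror the positively skewed distribution
    z <- -(stats::qgamma(1 - p, shape) - shape) / sqrt(shape)
  }
  mean + sd * z
}

# Log-space version: quantile on the log scale, then back-transformed
qlp3 <- function(p, meanlog = 0, sdlog = 1, skew = 0) {
  exp(qp3(p, meanlog, sdlog, skew))
}

qp3(0.1)               # equals qnorm(0.1) because skew = 0
qlp3(0.9, skew = 0.5)
```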
Quantiles for the described distribution
Asquith, W.H., Kiang, J.E., and Cohn, T.A., 2017, Application of at-site peak-streamflow frequency analyses for very low annual exceedance probabilities: U.S. Geological Survey Scientific Investigations Report 2017–5038, 93 p. [Also available at https://doi.org/10.3133/sir20175038.]
Lorenz, D.L., 2015, smwrBase—An R package for managing hydrologic data, version 1.1.1: U.S.
Geological Survey Open-File Report 2015–1202, 7 p.
[Also available at https://doi.org/10.3133/ofr20151202.]
calc_qlpearsonIII(0.1)
Quantile of Pearson Type III distribution
calc_qpearsonIII(p, mean = 0, sd = 1, skew = 0)
p | Vector of non-exceedance probabilities, between 0 and 1, to calculate quantiles. |
mean | Vector of means of the distribution of the data. |
sd | Vector of standard deviation of the distribution of the data. |
skew | Vector of skewness of the distribution of the data. |
calc_qpearsonIII and calc_qlpearsonIII are functions to fit a log-Pearson type III
distribution from a given mean, standard deviation, and skew. This source code is replicated,
unchanged, from the smwrBase package in order to reduce the dependency on that package.
Quantiles for the described distribution
Asquith, W.H., Kiang, J.E., and Cohn, T.A., 2017, Application of at-site peak-streamflow frequency analyses for very low annual exceedance probabilities: U.S. Geological Survey Scientific Investigations Report 2017–5038, 93 p. [Also available at https://doi.org/10.3133/sir20175038.]
Lorenz, D.L., 2015, smwrBase—An R package for managing hydrologic data, version 1.1.1: U.S.
Geological Survey Open-File Report 2015–1202, 7 p.
[Also available at https://doi.org/10.3133/ofr20151202.]
calc_qpearsonIII(0.1)
Replaces values in a vector with NA when above or below a censor level.
Values for which the comparison value censor_symbol censor_threshold holds are censored;
for example, with the defaults (values lte 0 set to NA), all values <= 0 are replaced with NA.
censor_values( value, censor_threshold = 0, censor_symbol = c("lte", "lt", "gt", "gte") )
value | 'numeric' vector. Values to censor. |
censor_threshold | 'numeric' value. Threshold to censor values on. Default is 0. |
censor_symbol | 'character' string. |
'numeric' vector with censored values replaced with NA
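The censoring rule can be pictured with a small base R stand-in. The function name censor_sketch is hypothetical, and the mapping of "lte", "lt", "gt", and "gte" to the usual inequality operators is inferred from the description above rather than taken from the package source.

```r
censor_sketch <- function(value, censor_threshold = 0,
                          censor_symbol = c("lte", "lt", "gt", "gte")) {
  censor_symbol <- match.arg(censor_symbol)   # default is the first option, "lte"
  censored <- switch(censor_symbol,
    lte = value <= censor_threshold,
    lt  = value <  censor_threshold,
    gt  = value >  censor_threshold,
    gte = value >= censor_threshold
  )
  value[which(censored)] <- NA                # which() skips values already NA
  value
}

# With the default "lte", values 1 through 5 become NA
censor_sketch(1:10, censor_threshold = 5)
```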
censor_values(value = seq.int(1, 10, 1), censor_threshold = 5)
An example dataset with daily observed streamflow processed to annual water year values.
example_annual
A data.frame with the following variables:
WY
water year
annual_mean
annual mean
annual_sd
annual standard deviation
annual_sum
annual sum
high_q1
annual maximum of daily mean
high_q3
annual maximum of 3-day mean
high_q7
annual maximum of 7-day mean
high_q30
annual maximum of 30-day mean
high_q1_jd
Julian day of annual maximum of daily mean
high_q3_jd
Julian day of annual maximum of 3-day mean
high_q7_jd
Julian day of annual maximum of 7-day mean
high_q30_jd
Julian day of annual maximum of 30-day mean
low_q7
annual minimum of 7-day mean
low_q30
annual minimum of 30-day mean
low_q3
annual minimum of 3-day mean
low_q1
annual minimum of daily mean
low_q7_jd
Julian day of annual minimum of 7-day mean
low_q30_jd
Julian day of annual minimum of 30-day mean
low_q3_jd
Julian day of annual minimum of 3-day mean
low_q1_jd
Julian day of annual minimum of daily mean
annual_1_percentile
annual first percentile
annual_5_percentile
annual 5th percentile
annual_10_percentile
annual 10th percentile
annual_25_percentile
annual 25th percentile
annual_50_percentile
annual 50th percentile
annual_75_percentile
annual 75th percentile
annual_90_percentile
annual 90th percentile
annual_95_percentile
annual 95th percentile
annual_99_percentile
annual 99th percentile
Jan_mean
annual January mean
Jan_sd
annual January standard deviation
Jan_max
annual January maximum
Jan_min
annual January minimum
Jan_percent_annual
annual January percentage of annual sum
Feb_mean
annual February mean
Feb_sd
annual February standard deviation
Feb_max
annual February maximum
Feb_min
annual February minimum
Feb_percent_annual
annual February percentage of annual sum
Mar_mean
annual March mean
Mar_sd
annual March standard deviation
Mar_max
annual March maximum
Mar_min
annual March minimum
Mar_percent_annual
annual March percentage of annual sum
Apr_mean
annual April mean
Apr_sd
annual April standard deviation
Apr_max
annual April maximum
Apr_min
annual April minimum
Apr_percent_annual
annual April percentage of annual sum
May_mean
annual May mean
May_sd
annual May standard deviation
May_max
annual May maximum
May_min
annual May minimum
May_percent_annual
annual May percentage of annual sum
Jun_mean
annual June mean
Jun_sd
annual June standard deviation
Jun_max
annual June maximum
Jun_min
annual June minimum
Jun_percent_annual
annual June percentage of annual sum
Jul_mean
annual July mean
Jul_sd
annual July standard deviation
Jul_max
annual July maximum
Jul_min
annual July minimum
Jul_percent_annual
annual July percentage of annual sum
Aug_mean
annual August mean
Aug_sd
annual August standard deviation
Aug_max
annual August maximum
Aug_min
annual August minimum
Aug_percent_annual
annual August percentage of annual sum
Sep_mean
annual September mean
Sep_sd
annual September standard deviation
Sep_max
annual September maximum
Sep_min
annual September minimum
Sep_percent_annual
annual September percentage of annual sum
Oct_mean
annual October mean
Oct_sd
annual October standard deviation
Oct_max
annual October maximum
Oct_min
annual October minimum
Oct_percent_annual
annual October percentage of annual sum
Nov_mean
annual November mean
Nov_sd
annual November standard deviation
Nov_max
annual November maximum
Nov_min
annual November minimum
Nov_percent_annual
annual November percentage of annual sum
Dec_mean
annual December mean
Dec_sd
annual December standard deviation
Dec_max
annual December maximum
Dec_min
annual December minimum
Dec_percent_annual
annual December percentage of annual sum
WSV
winter-spring volume
wscvd
Julian date of winter-spring center volume
Generated with example_obs from
HyMETT::preproc_main(data = example_obs, Date = "Date", value = "streamflow_cfs", longitude = -68)$annual
str(example_annual)
An example dataset with daily modeled (simulated) streamflow.
example_mod
A data.frame with the following variables:
date
date as 'character' column class.
streamflow_cfs
modeled streamflow in units of feet^3/second.
Date
date as 'Date' column class.
Generated from example data available at system.file("extdata", "01013500_MOD.csv", package = "HyMETT")
Johnson, M., D. Blodgett, 2020, NOAA National Water Model Reanalysis Data at RENCI, HydroShare,
accessed September 17, 2020 at
https://doi.org/10.4211/hs.89b0952512dd4b378dc5be8d2093310f
Johnson, M., 2021, nwmHistoric: National Water Model Historic Data. R package version 0.0.0.9000, accessed September 17, 2020 at https://github.com/mikejohnson51/nwmHistoric
str(example_mod)
An example dataset with daily modeled (simulated) streamflow that includes zero flows.
example_mod_zf
A data.frame with the following variables:
date
date as 'character' column class.
streamflow_cfs
modeled streamflow in units of feet^3/second.
Date
date as 'Date' column class.
Generated from example data available at system.file("extdata", "08202700_MOD.csv", package = "HyMETT")
Johnson, M., D. Blodgett, 2020, NOAA National Water Model Reanalysis Data at RENCI, HydroShare,
accessed September 17, 2020 at
https://doi.org/10.4211/hs.89b0952512dd4b378dc5be8d2093310f
Johnson, M., 2021, nwmHistoric: National Water Model Historic Data. R package version 0.0.0.9000, accessed September 17, 2020 at https://github.com/mikejohnson51/nwmHistoric
str(example_mod_zf)
An example dataset with daily observed streamflow.
example_obs
A data.frame with the following variables:
date
date as 'character' column class.
streamflow_cfs
observed streamflow in units of feet^3/second.
quality_cd
qualifier for value in streamflow_cfs (U.S. Geological Survey, 2020b)
Date
date as 'Date' column class.
Generated from example data available at system.file("extdata", "01013500_OBS.csv", package = "HyMETT")
De Cicco, L.A., Hirsch, R.M., Lorenz, D., and Watkins, W.D., 2021, dataRetrieval: R packages for discovering and retrieving water data available from Federal hydrologic web services, accessed September 16, 2020 at https://doi.org/10.5066/P9X4L3GE.
U.S. Geological Survey, 2020a, USGS water data for the Nation: U.S. Geological Survey National
Water Information System database, accessed September 16, 2020, at
https://doi.org/10.5066/F7P55KJN.
U.S. Geological Survey, 2020b, Instantaneous and Daily Data-Value Qualification Codes, in USGS water data for the Nation: U.S. Geological Survey National Water Information System database, accessed September 16, 2020, at https://doi.org/10.5066/F7P55KJN. [information directly accessible at https://help.waterdata.usgs.gov/codes-and-parameters/instantaneous-value-qualification-code-uv_rmk_cd.]
str(example_obs)
An example dataset with daily observed streamflow that includes zero flows.
example_obs_zf
A data.frame with the following variables:
date
date as 'character' column class.
streamflow_cfs
observed streamflow in units of feet^3/second.
quality_cd
qualifier for value in streamflow_cfs (U.S. Geological Survey, 2020b)
Date
date as 'Date' column class.
Generated from example data available at system.file("extdata", "08202700_OBS.csv", package = "HyMETT")
De Cicco, L.A., Hirsch, R.M., Lorenz, D., and Watkins, W.D., 2021, dataRetrieval: R packages for discovering and retrieving water data available from Federal hydrologic web services, accessed September 16, 2020 at https://doi.org/10.5066/P9X4L3GE.
U.S. Geological Survey, 2020a, USGS water data for the Nation: U.S. Geological Survey National
Water Information System database, accessed September 16, 2020, at
https://doi.org/10.5066/F7P55KJN.
U.S. Geological Survey, 2020b, Instantaneous and Daily Data-Value Qualification Codes, in USGS water data for the Nation: U.S. Geological Survey National Water Information System database, accessed September 16, 2020, at https://doi.org/10.5066/F7P55KJN. [information directly accessible at https://help.waterdata.usgs.gov/codes-and-parameters/instantaneous-value-qualification-code-uv_rmk_cd.]
str(example_obs_zf)
An example dataset with daily observed streamflow preprocessed to include additional timing and n-day moving averages.
example_preproc
A data.frame with the following variables:
Date
value
year
month
day
decimal_date
WY
Water Year: October 1 - September 30
CY
Climate Year: April 1 - March 31
Q3
3-Day Moving Average: computed at end of moving interval
Q7
7-Day Moving Average: computed at end of moving interval
Q30
30-Day Moving Average: computed at end of moving interval
jd
Julian date
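Several of these columns can be derived from the date and value alone. The base R sketch below shows one plausible derivation on six hypothetical days that straddle October 1; labeling each water year and climate year by the calendar year in which it ends is an assumed convention, and the package may compute these columns differently.

```r
Date  <- seq(as.Date("2019-09-28"), as.Date("2019-10-03"), by = "day")
value <- c(3, 6, 9, 12, 15, 18)

year  <- as.integer(format(Date, "%Y"))
month <- as.integer(format(Date, "%m"))

# Water year starts October 1; climate year starts April 1.
# Each is labeled by the calendar year in which it ends (assumed convention).
WY <- year + as.integer(month >= 10)
CY <- year + as.integer(month >= 4)

# 3-day moving average, assigned to the last day of each window;
# the first two days have no complete window and are NA
Q3 <- as.numeric(stats::filter(value, rep(1 / 3, 3), sides = 1))

data.frame(Date, value, year, month, WY, CY, Q3)
```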
Generated with example_obs from
HyMETT::preproc_main(data = example_obs, Date = "Date", value = "streamflow_cfs", longitude = -68)$daily
str(example_preproc)
Calculates Kendall's Tau, Spearman's Rho, Pearson Correlation, and p-values
as a wrapper to the stats::cor.test function. Output is a tidy-style data.frame.
GOF_correlation_tests(mod, obs, na.rm = TRUE, ...)
mod | 'numeric' vector. Modeled or simulated values. Must be same length as |
obs | 'numeric' vector. Observed or comparison values. Must be same length as |
na.rm | 'boolean' |
... | Further arguments to be passed to or from |
See stats::cor.test for more details and further arguments to be passed to or from methods.
Defaults are used.
A tibble (tibble::tibble) with test statistic values and p-values.
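The wrapping pattern can be sketched by looping stats::cor.test over the three methods and collecting the estimates and p-values into one table. The input vectors are invented, and the column names used here (metric, value, p_value) are hypothetical rather than the package's actual output names.

```r
# Hypothetical modeled and observed values without ties
mod <- c(1.2, 2.9, 2.5, 4.8, 5.1, 7.4)
obs <- c(1.0, 2.0, 3.0, 4.0, 5.5, 7.0)

methods <- c("pearson", "kendall", "spearman")
tests   <- lapply(methods, function(m) stats::cor.test(mod, obs, method = m))

# One row per test: the correlation estimate and its p-value
cor_table <- data.frame(
  metric  = methods,
  value   = vapply(tests, function(res) unname(res$estimate), numeric(1)),
  p_value = vapply(tests, function(res) res$p.value, numeric(1))
)
cor_table
```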
GOF_correlation_tests(mod = example_mod$streamflow_cfs, obs = example_obs$streamflow_cfs)
Calculate Kling–Gupta Efficiency (KGE) (or modified KGE ('KGE)) between modeled (simulated) and observed values.
GOF_kling_gupta_efficiency(mod, obs, modified = FALSE, na.rm = TRUE)
mod | 'numeric' vector. Modeled or simulated values. Must be same length as |
obs | 'numeric' vector. Observed or comparison values. Must be same length as |
modified | 'boolean' |
na.rm | 'boolean' |
Value of computed KGE or 'KGE.
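For reference, the two formulations from the cited papers can be written out in a few lines of base R. This is a sketch based on Gupta and others (2009) and Kling and others (2012), not the package source; the function name kge_sketch is hypothetical.

```r
kge_sketch <- function(mod, obs, modified = FALSE, na.rm = TRUE) {
  if (na.rm) {
    keep <- stats::complete.cases(mod, obs)   # drop pairs with a missing value
    mod  <- mod[keep]
    obs  <- obs[keep]
  }
  r    <- stats::cor(mod, obs)                # linear correlation
  beta <- mean(mod) / mean(obs)               # bias ratio
  variability <- if (modified) {
    # modified form: ratio of coefficients of variation
    (stats::sd(mod) / mean(mod)) / (stats::sd(obs) / mean(obs))
  } else {
    # original form: ratio of standard deviations
    stats::sd(mod) / stats::sd(obs)
  }
  1 - sqrt((r - 1)^2 + (variability - 1)^2 + (beta - 1)^2)
}

obs <- c(1, 2, 3, 4, 5)
kge_sketch(obs, obs)                       # perfect agreement gives 1
kge_sketch(2 * obs, obs)                   # penalized for both bias and spread
kge_sketch(2 * obs, obs, modified = TRUE)  # penalized for bias only
```

Doubling the observations illustrates the difference: the original form penalizes the doubled spread as well as the doubled mean, while the modified form sees an unchanged coefficient of variation.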
Kling, H., Fuchs, M. and Paulin, M., 2012. Runoff conditions in the upper Danube basin under an
ensemble of climate change scenarios: Journal of Hydrology, v. 424-425, p. 264-277.
[Also available at https://doi.org/10.1016/j.jhydrol.2012.01.011.]
Gupta, H.V., Kling, H., Yilmaz, K.K., and Martinez, G.G., 2009. Decomposition of the mean
squared error and NSE performance criteria: Implications for improving hydrological modelling:
Journal of Hydrology, v. 377, no.1-2, p. 80-91.
[Also available at https://doi.org/10.1016/j.jhydrol.2009.08.003.]
GOF_kling_gupta_efficiency( mod = example_mod$streamflow_cfs, obs = example_obs$streamflow_cfs )
Calculates mean absolute error (MAE) between modeled (simulated) and observed values. Error is defined as modeled minus observed.
GOF_mean_absolute_error(mod, obs, na.rm = TRUE)
mod | 'numeric' vector. Modeled or simulated values. Must be same length as |
obs | 'numeric' vector. Observed or comparison values. Must be same length as |
na.rm | 'boolean' |
The absolute value of each modeled-observed pair error is calculated, then the mean of those values taken. Values returned are in units of input data.
Value of calculated mean absolute error (MAE).
GOF_mean_absolute_error(mod = example_mod$streamflow_cfs, obs = example_obs$streamflow_cfs)
Calculates mean error between modeled (simulated) and observed values. Error is defined as modeled minus observed.
GOF_mean_error(mod, obs, na.rm = TRUE)
mod | 'numeric' vector. Modeled or simulated values. Must be same length as |
obs | 'numeric' vector. Observed or comparison values. Must be same length as |
na.rm | 'boolean' |
Values returned are in units of input data.
Value of calculated mean error.
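A small hand-built example shows how mean error differs from mean absolute error when positive and negative errors offset one another. The values are hypothetical.

```r
mod <- c(2, 4, 6, 8)
obs <- c(3, 3, 7, 7)

err <- mod - obs        # error defined as modeled minus observed: -1, 1, -1, 1
me  <- mean(err)        # signed errors cancel, giving 0
mae <- mean(abs(err))   # magnitudes do not cancel, giving 1
```

A mean error near zero therefore indicates little overall bias, not necessarily a close fit.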
GOF_mean_error(mod = example_mod$streamflow_cfs, obs = example_obs$streamflow_cfs)
Calculate Nash–Sutcliffe Efficiency (NSE) (with options for modified NSE) between modeled (simulated) and observed values.
GOF_nash_sutcliffe_efficiency(mod, obs, j = 2, na.rm = TRUE)
mod | 'numeric' vector. Modeled or simulated values. Must be same length as |
obs | 'numeric' vector. Observed or comparison values. Must be same length as |
j | 'numeric' value. Exponent value for modified NSE (mNSE) equation. Default value is |
na.rm | 'boolean' |
Value of computed NSE or mNSE.
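The role of the exponent j can be seen in a one-line version of the efficiency, following the general form discussed by Krause and others (2005). The function name nse_sketch is hypothetical, missing-value handling is omitted, and this is a sketch rather than the package's code.

```r
# j = 2 gives the classic NSE; j = 1 gives a modified form based on
# absolute rather than squared differences
nse_sketch <- function(mod, obs, j = 2) {
  1 - sum(abs(obs - mod)^j) / sum(abs(obs - mean(obs))^j)
}

obs <- c(1, 2, 3, 4, 5)
nse_sketch(obs, obs)                   # perfect fit gives 1
nse_sketch(rep(mean(obs), 5), obs)     # predicting the observed mean gives 0
nse_sketch(obs + 1, obs, j = 1)        # constant offset of 1, absolute form
```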
Krause, P., Boyle, D.P., and Base, F., 2005. Comparison of different efficiency criteria for
hydrological model assessment: Advances in Geosciences, v. 5, p. 89-97.
[Also available at https://doi.org/10.5194/adgeo-5-89-2005.]
Legates, D.R., and McCabe, G.J., 1999, Evaluating the use of "goodness-of-fit" measures in hydrologic and hydroclimatic model validation: Water Resources Research, v. 35, no. 1, p. 233-241. [Also available at https://doi.org/10.1029/1998WR900018.]
Nash, J.E. and Sutcliffe, J.V., 1970, River flow forecasting through conceptual models part I: A discussion of principles: Journal of Hydrology, v. 10, no. 3, p. 282-290. [Also available at https://doi.org/10.1016/0022-1694(70)90255-6.]
GOF_nash_sutcliffe_efficiency( mod = example_mod$streamflow_cfs, obs = example_obs$streamflow_cfs )
Calculates percent bias between modeled (simulated) and observed values.
GOF_percent_bias(mod, obs, na.rm = TRUE)
mod | 'numeric' vector. Modeled or simulated values. Must be same length as |
obs | 'numeric' vector. Observed or comparison values. Must be same length as |
na.rm | 'boolean' |
Values returned are in percent.
Value of calculated percent bias as percent.
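The documentation does not give the formula, so the sketch below assumes the common definition of percent bias: the total error (modeled minus observed) as a percentage of the total observed volume. The values are hypothetical.

```r
mod <- c(11, 22, 33)
obs <- c(10, 20, 30)

# Assumed definition: total error relative to total observed, in percent
pbias <- 100 * sum(mod - obs) / sum(obs)
pbias   # the model overestimates total volume by 10 percent
```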
GOF_percent_bias(mod = example_mod$streamflow_cfs, obs = example_obs$streamflow_cfs)
Calculate root-mean-square error (RMSE) between modeled (simulated) and observed values. Error is defined as modeled minus observed.
GOF_rmse( mod, obs, normalize = c("none", "mean", "range", "stdev", "iqr", "iqr-1", "iqr-2", "iqr-3", "iqr-4", "iqr-5", "iqr-6", "iqr-7", "iqr-8", "iqr-9", NULL), na.rm = TRUE )
mod | 'numeric' vector. Modeled or simulated values. Must be same length as |
obs | 'numeric' vector. Observed or comparison values. Must be same length as |
normalize | 'character' value. Option to normalize the root-mean-square error (NRMSE) by several normalizing options. Default is |
na.rm | 'boolean' |
'numeric' value of computed root-mean-square error (RMSE) or normalized root-mean-square error (NRMSE)
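The relationship between RMSE and its normalized variants can be sketched as follows. Normalizing by statistics of the observed series is an assumption, the function name rmse_sketch is hypothetical, and the "iqr-1" through "iqr-9" options (which presumably select a quantile type for the interquartile range) are left out.

```r
rmse_sketch <- function(mod, obs,
                        normalize = c("none", "mean", "range", "stdev", "iqr")) {
  normalize <- match.arg(normalize)
  rmse  <- sqrt(mean((mod - obs)^2))
  # Normalizing constant, taken from the observed values (assumed)
  denom <- switch(normalize,
    none  = 1,
    mean  = mean(obs),
    range = diff(range(obs)),
    stdev = stats::sd(obs),
    iqr   = stats::IQR(obs)
  )
  rmse / denom
}

mod <- c(2, 4, 6, 8)
obs <- c(3, 3, 7, 7)
rmse_sketch(mod, obs)                       # RMSE in the units of the data
rmse_sketch(mod, obs, normalize = "mean")   # unitless NRMSE
```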
# RMSE
GOF_rmse(mod = example_mod$streamflow_cfs, obs = example_obs$streamflow_cfs)
# NRMSE
GOF_rmse(
  mod = example_mod$streamflow_cfs,
  obs = example_obs$streamflow_cfs,
  normalize = 'stdev'
)
Calculate Goodness-of-fit (GOF) metrics for correlation, Kling–Gupta efficiency, mean absolute error, mean error, Nash–Sutcliffe efficiency, percent bias, root-mean-square error, normalized root-mean-square error, and volumetric efficiency, and output into a table.
GOF_summary( mod, obs, metrics = c("cor", "kge", "mae", "me", "nse", "pb", "rmse", "nrmse", "ve"), censor_threshold = NULL, censor_symbol = NULL, na.rm = TRUE, kge_modified = FALSE, nse_j = 2, rmse_normalize = c("mean", "range", "stdev", "iqr", "iqr-1", "iqr-2", "iqr-3", "iqr-4", "iqr-5", "iqr-6", "iqr-7", "iqr-8", "iqr-9", NULL), ... )
mod | 'numeric' vector. Modeled or simulated values. Must be same length as |
obs | 'numeric' vector. Observed or comparison values. Must be same length as |
metrics | 'character' vector. Which GOF metrics should be computed and output. Default is |
censor_threshold | 'numeric' value. Threshold to censor values on utilizing |
censor_symbol | 'character' string. Inequality symbol to censor values based on |
na.rm | 'boolean' |
kge_modified | 'boolean' |
nse_j | 'numeric' value. Exponent value for modified NSE (mNSE) equation, utilized if |
rmse_normalize | 'character' value. Normalize option for NRMSE, utilized if "nrmse" option is in parameter |
... | Further arguments to be passed to or from |
See GOF_correlation_tests, GOF_kling_gupta_efficiency, GOF_mean_absolute_error,
GOF_mean_error, GOF_nash_sutcliffe_efficiency, GOF_percent_bias, GOF_rmse,
and GOF_volumetric_efficiency.
A tibble (see tibble::tibble) with GOF metrics
censor_values, GOF_correlation_tests, GOF_kling_gupta_efficiency,
GOF_mean_absolute_error, GOF_mean_error, GOF_nash_sutcliffe_efficiency,
GOF_percent_bias, GOF_rmse, GOF_volumetric_efficiency
GOF_summary(mod = example_mod$streamflow_cfs, obs = example_obs$streamflow_cfs)
Calculate Volumetric efficiency (VE) between modeled (simulated) and observed values. VE is defined as the fraction of water delivered at the proper time (Criss and Winston, 2008).
GOF_volumetric_efficiency(mod, obs, na.rm = TRUE)
mod |
'numeric' vector. Modeled or simulated values. Must be same length as |
obs |
'numeric' vector. Observed or comparison values. Must be same length as |
na.rm |
'boolean' |
Volumetric efficiency was proposed in order to circumvent some problems associated with the
Nash–Sutcliffe efficiency. It ranges from 0 to 1 and represents the fraction of water
delivered at the proper time; its complement represents the fractional volumetric mismatch
(Criss and Winston, 2008).
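The definition in Criss and Winston (2008) can be written as VE = 1 - sum(|mod - obs|) / sum(obs). The base-R sketch below applies that formula directly; the helper name `ve_sketch` and the toy vectors are assumptions for illustration, and the package function may handle inputs differently.

```r
# Base-R sketch of volumetric efficiency, following Criss and Winston (2008).
# `ve_sketch` is a hypothetical helper, not a HyMETT function.
ve_sketch <- function(mod, obs, na.rm = TRUE) {
  stopifnot(length(mod) == length(obs))
  if (na.rm) {
    keep <- !(is.na(mod) | is.na(obs))  # drop pairs with a missing value
    mod <- mod[keep]
    obs <- obs[keep]
  }
  # One minus the fractional volumetric mismatch
  1 - sum(abs(mod - obs)) / sum(obs)
}

ve_perfect <- ve_sketch(mod = c(1, 2, 3), obs = c(1, 2, 3))
ve_partial <- ve_sketch(mod = c(2, 2, 2), obs = c(1, 2, 3))
```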
Value of computed Volumetric efficiency.
Criss, R.E. and Winston, W.E., 2008, Do Nash values have value? Discussion and alternate
proposals: Hydrological Processes, v. 22, p. 2723-2725.
[Also available at https://doi.org/10.1002/hyp.7072.]
Zambrano-Bigiarini, M., 2020, hydroGOF: Goodness-of-fit functions for comparison of simulated and observed hydrological time series: R package version 0.4-0, accessed September 16, 2020, at https://github.com/hzambran/hydroGOF. [Also available at https://doi.org/10.5281/zenodo.839854.]
GOF_volumetric_efficiency( mod = example_mod$streamflow_cfs, obs = example_obs$streamflow_cfs )
This function computes the 50th and 90th percentiles of a streamflow time series from annual n-day high flow values and returns a data.frame in the format of other period-of-record (POR) metrics.
POR_apply_annual_hiflow_stats(annual_max, quantile_type = 8)
annual_max |
'numeric' vector or data.frame. Vector or data.frame with columns of annual n-day maximum streamflows. |
quantile_type |
'numeric' value. The distribution type used in the |
The annual maximum of n-day moving averages can be computed during the pre-processing step using preproc_precondition_data and calc_annual_flow_stats, or preproc_main, for both
observed and modeled data.
A data.frame of streamflows at the 0.5 and 0.9 non-exceedance probabilities (50th and 90th percentiles),
with metric names if annual_max is a data.frame with columns named by metric.
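The core computation can be approximated with stats::quantile. The sketch below is an illustration only: the data values, the object name `hiflow_sketch`, and the row labels are made up, and the shape of the result differs from the data.frame the package returns.

```r
# Sketch: 50th and 90th percentiles of annual n-day maxima using a type 8 quantile.
# Column names mirror the package example; the flow values are invented.
annual_max <- data.frame(
  high_q1  = c(120, 95, 210, 150, 175),
  high_q30 = c(60, 48, 90, 72, 81)
)

# Apply quantile() to each column; extra arguments are passed through sapply()
hiflow_sketch <- sapply(
  annual_max, stats::quantile,
  probs = c(0.5, 0.9), type = 8, names = FALSE
)
rownames(hiflow_sketch) <- c("p50", "p90")
hiflow_sketch
```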
quantile
, preproc_precondition_data
,
calc_annual_flow_stats
, preproc_main
POR_apply_annual_hiflow_stats(annual_max = example_annual[ , c("high_q1", "high_q30")])
Calculates 10-year and 2-year return periods of a streamflow time series from annual n-day low streamflow values and returns a data.frame in the format of other period-of-record (POR) metrics.
POR_apply_annual_lowflow_stats(annual_min)
annual_min |
'numeric' vector or data.frame. Vector or data.frame with columns of annual n-day minimum streamflows. |
POR_apply_annual_lowflow_stats is a helper function that applies the POR_calc_lp3_quantile
function to the data.frame of n-day moving averages, which can be computed during the pre-processing
step using preproc_precondition_data and calc_annual_flow_stats, or preproc_main, for
both observed and modeled data. This function returns a data.frame with the 10-year and 2-year
return period streamflows for each n-day low streamflow in the input data.frame.
data.frame with 10-year and 2-year return period of n-day streamflows.
POR_calc_lp3_quantile
, preproc_precondition_data
,
calc_annual_flow_stats
, preproc_main
POR_apply_annual_lowflow_stats(annual_min = example_annual[ , c("low_q1", "low_q30")])
Calculates the seasonal amplitude and phase of a daily time series.
POR_calc_amp_and_phase( data = NULL, Date, value, time_step = c("daily", "monthly") )
data |
'data.frame'. Optional data.frame input, with columns containing |
Date |
'numeric' vector of Dates corresponding to each |
value |
'numeric' vector of values (often streamflow) when |
time_step |
'character' value. Either |
A data.frame with calculated seasonal amplitude and phase
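One common way to estimate a seasonal amplitude and phase is to regress the series on an annual sine and cosine term. The sketch below shows that idea on a synthetic signal; whether HyMETT uses exactly this regression, and which phase convention it reports, are assumptions here. The phase convention used is value = mean + amplitude * cos(2*pi*t - phase).

```r
# Sketch: seasonal amplitude and phase from a sine/cosine regression.
# Synthetic signal with known amplitude 3 and phase 1 radian (assumed convention).
t_years <- seq(0, 3, by = 1 / 365)               # time in decimal years
value <- 10 + 3 * cos(2 * pi * t_years - 1)

fit <- lm(value ~ sin(2 * pi * t_years) + cos(2 * pi * t_years))
a_sin <- unname(coef(fit)[2])                    # coefficient on the sine term
b_cos <- unname(coef(fit)[3])                    # coefficient on the cosine term

amplitude <- sqrt(a_sin^2 + b_cos^2)
phase <- atan2(a_sin, b_cos)                     # radians, under the convention above
```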
Farmer, W.H., Archfield, S.A., Over, T.M., Hay, L.E., LaFontaine, J.H., and Kiang, J.E., 2014, A comparison of methods to predict historical daily streamflow time series in the southeastern United States: U.S. Geological Survey Scientific Investigations Report 2014–5231, 34 p. [Also available at https://doi.org/10.3133/sir20145231.]
POR_calc_amp_and_phase(data = example_obs, Date = "Date", value = "streamflow_cfs")
Calculates the lag-one autocorrelation (AR1) coefficient for a time series.
POR_calc_AR1(data = NULL, Date, value, time_step = c("daily", "monthly"))
data |
'data.frame'. Optional data.frame input, with columns containing |
Date |
'numeric' vector of Dates corresponding to each |
value |
'numeric' vector of values (often streamflow) when |
time_step |
'character' value. Either |
The function calculates the lag-one autocorrelation (AR1) coefficient for a time series using the stats::ar
function. When applied to an observed or modeled time series of streamflow, the POR_deseasonalize
function can be applied to the raw data prior to running the POR_calc_AR1 function.
The calculated lag-one autocorrelation (AR1) coefficient.
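For orientation, a lag-one coefficient can be estimated with stats::ar by fixing the model order at 1. The sketch below does this on a simulated series; the simulated data and the specific arguments (order.max = 1, aic = FALSE) are assumptions for illustration and may not match the call made inside the package.

```r
# Sketch: estimate a lag-one autocorrelation coefficient with stats::ar.
set.seed(42)
# Simulated AR(1) series with a true coefficient of 0.7
x <- as.numeric(arima.sim(model = list(ar = 0.7), n = 5000))

# Fit an order-1 autoregressive model (Yule-Walker by default, no AIC order selection)
fit <- stats::ar(x, order.max = 1, aic = FALSE)
ar1 <- as.numeric(fit$ar)[1]
```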
Farmer, W.H., Archfield, S.A., Over, T.M., Hay, L.E., LaFontaine, J.H., and Kiang, J.E., 2014, A comparison of methods to predict historical daily streamflow time series in the southeastern United States: U.S. Geological Survey Scientific Investigations Report 2014–5231, 34 p. [Also available at https://doi.org/10.3133/sir20145231.]
POR_calc_AR1(data = example_obs, Date = "Date", value = "streamflow_cfs")
Calculate the specified flow quantile from a fitted log-Pearson type III distribution from a time series of n-day low flows.
POR_calc_lp3_quantile(annual_min, p)
annual_min |
'numeric' vector. Vector of minimum annual n-day mean flows. |
p |
'numeric' value of exceedance probabilities. Quantile of fitted distribution that is
returned ( |
POR_calc_lp3_quantile fits a log-Pearson type III distribution to a series of annual n-day
flows and returns the quantile of a user-specified probability using calc_qlpearsonIII. This
represents a theoretical return period for that n-day flow.
Specified quantile from the fitted log-Pearson type 3 distribution.
Asquith, W.H., Kiang, J.E., and Cohn, T.A., 2017, Application of at-site peak-streamflow frequency analyses for very low annual exceedance probabilities: U.S. Geological Survey Scientific Investigations Report 2017–5038, 93 p. [Also available at https://doi.org/10.3133/sir20175038.]
POR_calc_lp3_quantile(annual_min = example_annual$low_q1, p = 0.1)
Removes seasonal trends from a daily or monthly time series. Daily data are deseasonalized by subtracting monthly mean values. Monthly data are deseasonalized by subtracting mean monthly values.
POR_deseasonalize(data = NULL, Date, value, time_step = c("daily", "monthly"))
data |
'data.frame'. Optional data.frame input, with columns containing |
Date |
'numeric' vector of Dates corresponding to each |
value |
'numeric' vector of values (often streamflow) when |
time_step |
'character' value. Either |
The deseasonalize function removes seasonal trends from a daily or monthly time series
and returns a deseasonalized time series, which can be used in the POR_calc_AR1
function.
Deseasonalized values.
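The subtraction of mean monthly values can be sketched in base R with ave(), which returns the group mean aligned to each observation. The synthetic series and variable names below are assumptions for illustration, not the package's internals.

```r
# Sketch: deseasonalize a daily series by subtracting the mean for each calendar month.
dates <- seq(as.Date("2019-01-01"), as.Date("2020-12-31"), by = "day")
doy <- as.numeric(format(dates, "%j"))
value <- 50 + 20 * sin(2 * pi * doy / 365)       # synthetic seasonal signal

month <- format(dates, "%m")                     # calendar month as the grouping key
monthly_mean <- ave(value, month, FUN = mean)    # mean of each calendar month, per row
deseason <- value - monthly_mean
```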
POR_deseasonalize(data = example_obs, Date = "Date", value = "streamflow_cfs")
Calculates various metrics that describe the distribution of a time series of streamflow, which can be of any time step.
POR_distribution_metrics(value, quantile_type = 8, na.rm = TRUE)
value |
'numeric' vector of values (assumed to be streamflow) at any time step. |
quantile_type |
'numeric' value. The distribution type used in the |
na.rm |
'boolean' |
Metrics computed include:
p_
n
Flow-duration curve (FDC) percentile where n = 1, 5, 10, 25, 50, 75, 90, 95, and 99
POR_mean
Period of record mean
POR_sd
Period of record standard deviation
POR_cv
Period of record coefficient of variation
POR_min
Period of record minimum
POR_max
Period of record maximum
LCV
L-moment coefficient of variation
Lskew
L-moment skewness
Lkurtosis
L-moment kurtosis
A data.frame with FDC quantiles and distribution metrics. See Details.
Farmer, W.H., Archfield, S.A., Over, T.M., Hay, L.E., LaFontaine, J.H., and Kiang, J.E., 2014, A comparison of methods to predict historical daily streamflow time series in the southeastern United States: U.S. Geological Survey Scientific Investigations Report 2014–5231, 34 p. [Also available at https://doi.org/10.3133/sir20145231.]
Asquith, W.H., Kiang, J.E., and Cohn, T.A., 2017, Application of at-site peak-streamflow frequency analyses for very low annual exceedance probabilities: U.S. Geological Survey Scientific Investigations Report 2017–5038, 93 p. [Also available at https://doi.org/10.3133/sir20175038.]
Asquith, W.H., 2021, lmomco—L-moments, censored L-moments, trimmed L-moments,
L-comoments, and many distributions. R package version 2.3.7, Texas Tech University,
Lubbock, Texas.
POR_distribution_metrics(value = example_obs$streamflow_cfs)
Audit daily data for total days in year. An audit is performed to inventory and flag missing days in daily data and help determine if further analyses are appropriate.
preproc_audit_data( data = NULL, Date, value, year_group, use_specific_years = FALSE, begin_year = NULL, end_year = NULL, days_cutoff = 360, date_format = "%Y-%m-%d" )
data |
'data.frame'. Optional data.frame input, with columns containing |
Date |
'Date' or 'character' vector when |
value |
'numeric' vector when |
year_group |
'numeric' vector when |
use_specific_years |
'boolean' value. Flag to clip data to a certain set of years in
|
begin_year |
'numeric' value. If |
end_year |
'numeric' value. If |
days_cutoff |
'numeric' value. Designating the number of days required for a year to be
counted as full. Default is |
date_format |
'character' string. Format of Date. Default is |
Year grouping is commonly water year, climate year, or calendar year.
A data.frame with year_group
, count (n, excluding NA
values)
of days in each year_group
, and a complete years 'boolean' flag.
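The audit amounts to counting non-missing days per year group and comparing the count with the cutoff. The base-R sketch below illustrates this for water years; the synthetic data, the column names of the result, and the use of ">=" against the cutoff are assumptions, not the package's code.

```r
# Sketch: count non-NA days per water year and flag years that meet a cutoff.
dates <- seq(as.Date("2018-10-01"), as.Date("2020-09-30"), by = "day")
value <- rep(1, length(dates))
value[1:10] <- NA                                 # ten missing days in the first water year

# Water year: October to December dates belong to the following year's water year
WY <- as.integer(format(dates, "%Y")) + (as.integer(format(dates, "%m")) >= 10)

days_cutoff <- 360
n <- tapply(!is.na(value), WY, sum)               # non-missing days per water year
audit <- data.frame(
  WY = as.integer(names(n)),
  n = as.integer(n),
  complete = as.integer(n) >= days_cutoff         # assumed comparison rule
)
audit
```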
preproc_fill_daily
, preproc_precondition_data
preproc_audit_data( data = example_preproc, Date = "Date", value = "value", year_group = "WY" )
Fills daily data with missing dates as NA values. Days that are
absent from the daily time series are inserted with a corresponding value of NA.
preproc_fill_daily( data = NULL, Date, value, POR_start = NA, POR_end = NA, date_format = "%Y-%m-%d" )
data |
'data.frame'. Optional data.frame input, with columns containing |
Date |
'Date' or 'character' vector when |
value |
'numeric' vector when |
POR_start |
'character' value. Optional period of record start. If not specified, defaults
to |
POR_end |
'character' value. Optional period of record end. If not specified, defaults to
|
date_format |
'character' string. Format of Date. Default is |
Can be used prior to preproc_precondition_data
to fill daily data before computation
of n-day moving averages, or prior to preproc_audit_data
.
A data.frame with Date
and value
, sequenced from POR_start
to POR_end
by 1 day.
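The same effect can be sketched in base R by left-joining the observed data onto a complete daily sequence. This is an illustration of the idea using the dates from the package example, not the package's implementation; the object names `full` and `filled` are assumptions.

```r
# Sketch: insert missing days as NA by merging onto a full daily sequence.
Dates <- c(
  seq.Date(as.Date("2020-01-01"), as.Date("2020-01-10"), by = "1 day"),
  seq.Date(as.Date("2020-01-20"), as.Date("2020-01-31"), by = "1 day")
)
values <- seq_len(22)

# Every day from the first to the last observed date
full <- data.frame(Date = seq.Date(min(Dates), max(Dates), by = "1 day"))

# all.x = TRUE keeps days with no observation and gives them value = NA
filled <- merge(full, data.frame(Date = Dates, value = values),
                by = "Date", all.x = TRUE)
```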
preproc_audit_data
, preproc_precondition_data
Dates = c(seq.Date(as.Date("2020-01-01"), as.Date("2020-01-10"), by = "1 day"),
          seq.Date(as.Date("2020-01-20"), as.Date("2020-01-31"), by = "1 day"))
values = c(seq.int(1, 22, 1))
preproc_fill_daily(Date = Dates, value = values)
A wrapper function for preproc_precondition_data, preproc_audit_data, and calc_annual_flow_stats.
preproc_main( data = NULL, Date, value, date_format = "%Y-%m-%d", year_group = c("WY", "CY", "year"), use_specific_years = FALSE, begin_year = NULL, end_year = NULL, days_cutoff = 360, calc_high = TRUE, calc_low = TRUE, calc_percentiles = TRUE, calc_monthly = TRUE, calc_WSCVD = TRUE, longitude = NA, calc_ICVD = FALSE, zero_threshold = 33, quantile_type = 8, na.action = c("na.omit", "na.pass") )
data |
'data.frame'. Optional data.frame input, with columns containing |
Date |
'Date' or 'character' vector when |
value |
'numeric' vector when |
date_format |
'character' string. Format of Date. Default is |
year_group |
'character' value. Specify either |
use_specific_years |
'boolean' value. Flag to clip data to a certain set of years in
|
begin_year |
'numeric' value. If |
end_year |
'numeric' value. If |
days_cutoff |
'numeric' value. Designating the number of days required for a year to be
counted as full. Default is |
calc_high |
'boolean' value. Calculate high streamflow statistics for years in |
calc_low |
'boolean' value. Calculate low streamflow statistics for years in |
calc_percentiles |
'boolean' value. Calculate percentiles for years in |
calc_monthly |
'boolean' value. Calculate monthly statistics for years in |
calc_WSCVD |
'boolean' value. Calculate winter-spring center volume date for years in
|
longitude |
'numeric' value. Site longitude in NAD83, required in WSCVD calculation.
Default is |
calc_ICVD |
'boolean' value. Calculate inverse center volume date for years in
|
zero_threshold |
'numeric' value as percentage. The percentage of years of a statistic that
need to be zero in order for it to be deemed a zero streamflow site for that statistic. For
use in trend calculation. See Details on attributes. Default is |
quantile_type |
'numeric' value. The distribution type used in the |
na.action |
'character' string indicating na.action passed to |
This is a wrapper function for preproc_precondition_data, preproc_audit_data, and calc_annual_flow_stats. Data are first passed to the precondition function, then audited,
and then annual statistics are computed.
It also checks that the data are at a daily time step.
Other time steps are currently not supported and will return the data.frame without moving
averages computed.
A list of three data.frames: 1 of preconditioned data, 1 data audit, and 1 annual statistics.
preproc_audit_data
, preproc_precondition_data
,
calc_annual_flow_stats
preproc_main(data = example_obs, Date = "Date", value = "streamflow_cfs", longitude = -68)
Pre-conditions data with time information and n-day moving averages, with options
to fill missing days with NA
values.
preproc_precondition_data( data = NULL, Date, value, date_format = "%Y-%m-%d", fill_daily = TRUE )
data |
'data.frame'. Optional data.frame input, with columns containing |
Date |
'Date' or 'character' vector when |
value |
'numeric' vector when |
date_format |
'character' string. Format of |
fill_daily |
'logical' value. Should gaps in |
These columns are added to the data:
year
month
day
decimal_date
WY
Water Year: October 1 to September 30
CY
Climate Year: April 1 to March 31
Q3
3-Day Moving Average: computed at end of moving interval
Q7
7-Day Moving Average: computed at end of moving interval
Q30
30-Day Moving Average: computed at end of moving interval
jd
Julian date
This function also checks that the data are at a daily time step. Gaps in daily
values should be filled with NA to ensure proper calculation of n-day moving
averages. Use fill_daily = TRUE or preproc_fill_daily. Other time steps are currently not
supported and will return the data.frame without moving averages computed.
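Two of the added columns can be sketched in base R: the water year and a trailing moving average assigned to the last day of its window. The toy data and variable names are assumptions for illustration; stats::filter with sides = 1 is one way to compute a trailing mean and may not be what the package uses.

```r
# Sketch: water year and a 3-day moving average computed at the end of the interval.
dates <- seq(as.Date("2020-09-28"), as.Date("2020-10-04"), by = "day")
value <- c(1, 2, 3, 4, 5, 6, 7)

year <- as.integer(format(dates, "%Y"))
month <- as.integer(format(dates, "%m"))
WY <- year + (month >= 10)        # October 1 starts the next water year

# sides = 1 uses only the current and previous values, so the average
# lands on the last day of each 3-day window; the first two days are NA
Q3 <- as.numeric(stats::filter(value, rep(1 / 3, 3), sides = 1))

data.frame(Date = dates, value = value, WY = WY, Q3 = Q3)
```

If a day were missing from `dates` without an NA placeholder, the window would silently span more than three calendar days, which is why gaps need to be filled first.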
A data.frame with Date, value, and additional columns with time and n-day moving average information.
preproc_precondition_data(data = example_obs, Date = "Date", value = "streamflow_cfs")
Validates that daily data do not contain gaps
preproc_validate_daily( data = NULL, Date = "Date", value = "value", date_format = "%Y-%m-%d" )
data |
'data.frame'. Optional data.frame input, with columns containing |
Date |
'Date' or 'character' vector when |
value |
'numeric' vector when |
date_format |
'character' string. Format of |
Used to validate there are no gaps in the daily record before computing n-day moving averages in
preproc_precondition_data
or lag-1 autocorrelation in POR_calc_AR1
. If gaps are present,
preproc_fill_daily
can be used to fill them with NA
values.
An error message with missing dates, otherwise nothing.
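A gap check of this kind can be sketched in base R by comparing the observed dates with a complete daily sequence. The helper name `find_gaps` is hypothetical, and unlike the package function it returns the missing dates rather than raising an error.

```r
# Sketch: list the days missing from a daily record.
# `find_gaps` is a hypothetical helper, not a HyMETT function.
find_gaps <- function(dates) {
  dates <- sort(as.Date(dates))
  full <- seq(min(dates), max(dates), by = "day")  # every day in the record span
  full[!(full %in% dates)]                         # days with no observation
}

Dates <- c(
  seq.Date(as.Date("2020-01-01"), as.Date("2020-01-10"), by = "1 day"),
  seq.Date(as.Date("2020-01-20"), as.Date("2020-01-31"), by = "1 day")
)
missing_days <- find_gaps(Dates)
```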
preproc_validate_daily(data = example_obs, Date = "Date", value = "streamflow_cfs")