Journal Paper Digests

Journal Paper Digests 2021 #20

One size does not fit all: Toward regional conservation practice guidance to reduce phosphorus loss risk in the Lake Erie watershed
Regional ensemble modeling reduces uncertainty for digital soil mapping
The Concept, Practice, Application and Results of Locally Based Monitoring of the Environment
Connecting Top-Down and Bottom-Up Approaches in Environmental Observing
The Use of Digital Platforms for Community-Based Monitoring
Amalgamation and harmonization of soil survey reports into a multi-purpose database
Multisensor fusion of remotely sensed vegetation indices using space-time dynamic linear models
Do soil health tests match farmer experience? Assessing biological, physical, and chemical indicators in the Upper Midwest United States
Evaluating three calibration transfer methods for predictions of soil properties using mid-infrared spectroscopy
Continuous in situ soil nitrate sensors: The importance of high-resolution measurements across time and a comparison with salt extraction-based methods
Bayesian-based time-varying multivariate drought risk and its dynamics in a changing environment
Transferring Hydrologic Data Across Continents - Leveraging Data-Rich Regions to Improve Hydrologic Prediction in Data-Sparse Regions

One size does not fit all: Toward regional conservation practice guidance to reduce phosphorus loss risk in the Lake Erie watershed

Agricultural phosphorus (P) losses to surface water bodies remain a global eutrophication concern, despite the application of conservation practices on farm fields. Although it is generally agreed upon that the use of multiple conservation practices (“stacking”) will lead to greater improvements to water quality, this may not be cost effective to farmers, reducing the likelihood of adoption. At present, wholesale recommendations of conservation practices are given; however, the application of specific conservation practices in certain environments (e.g., no-till with surface application, cover crops) may not be effective and can even lead to unintended consequences. In this paper, we present the Lake Erie watershed as a case study. The Lake Erie watershed contains regions with unique physical geographies that include differences in climate, soil, topography, and land use, which have implications for both P transport from agricultural fields and the efficacy of conservation practices in mitigating P losses. We define major regions within the Lake Erie watershed where common strategies for conservation practice implementation are appropriate, and we propose a five-step plan for bringing regionally tailored, adaptive, and cost-conscious conservation practice into watershed planning. Although this paper is specific to the Lake Erie watershed, our framework can be transferred across broader geographic regions to provide guidance for watershed planning.

Regional ensemble modeling reduces uncertainty for digital soil mapping

Recent country and continental-scale digital soil mapping efforts have used a single model to predict soil properties across large regions. However, different ecophysiographic regions within large-extent areas are likely to have different soil-landscape relationships, so models built specifically for these regions may more accurately capture these relationships relative to a ‘global’ model. We ask the question: Is a single ‘global’ model sufficient or are regionally-specific models useful for accurate digital soil mapping? We test this question by modeling soil depth classes across the 432,000 km² upper Colorado River Basin in the Western USA using a single global model, multiple ecophysiographic models, and ensembles of the ecophysiographic models. Effective soil depth class observations (n = 12,194) were derived from multiple soil databases. Fifty-seven environmental covariates were derived from a 30 m digital elevation model, climate data, satellite imagery, and aeroradiometric data. Three independent land classifications were used to stratify the area. Two expert-derived land classifications, USDA Major Land Resource Areas (MLRA) and US-EPA Level III ecoregions, divided the study area into multiple ecophysiographic regions based on vegetation and broad-scale physiographic differences. The third land classification divided the study area into broad landforms. Soil depth observations were split into separate training (n = 10,470) and validation (n = 1,724) datasets. First, a ‘global’ random forest model was used to model soil depth classes using all training observations and covariates. ‘Global’ denotes a model built with all training data across the extent of the area, not a model at world extent. Second, the land classifications were used to subset the observations into ecophysiographic sub-datasets and random forest models were refit for each region. Models fit by ecophysiographic region are referred to as regional models. Third, predictions from each regional model were fused into regional-ensemble models. Accuracy, Brier scores, and Shannon’s entropy were used to compare model accuracy and uncertainty. Regional ecophysiographic models were also compared to models built for geographic areas that were defined solely to be approximately equal in area. Training dataset density and the imbalance ratio were investigated to determine if data characteristics influenced regional accuracy/uncertainty metrics. Accuracy for the global model using the validation set was 62.8%. Regional model accuracies ranged between 56.1% and 75.0%. We found: 1) useful inter-regional differences in global model accuracy were revealed when the global model was validated by region, 2) no consistent relationship between training observation density and accuracy/uncertainty metrics, 3) no meaningful differences in accuracy and uncertainty metrics between physiographic and geographic regions, 4) ensembles of regionally-specific models were approximately as accurate as global models, and 5) both region-specific models and ensembles of regional models were less uncertain than the global model. Overall, we recommend the use of soil depth class predictions made from MLRA regional ensemble models because this prediction had higher accuracy than the ecoregion ensemble model prediction, but lower uncertainty than both the global model and the landform ensemble model predictions. We answer our question: Ensembles of regionally-specific models are approximately as accurate as global models, but result in less uncertainty.
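
The three comparison metrics named in this abstract (accuracy, Brier score, Shannon's entropy) can be illustrated with a minimal sketch. The abstract does not say which variant of each metric the authors used, so the multiclass Brier score and base-2 entropy below are assumptions, and the probability values are hypothetical numbers invented for illustration, not results from the paper.

```python
import math

def brier_score(probs, true_class):
    """Multiclass Brier score for one observation: sum of squared differences
    between the predicted probabilities and the one-hot truth (0 is best)."""
    return sum((p - (1.0 if k == true_class else 0.0)) ** 2
               for k, p in enumerate(probs))

def shannon_entropy(probs):
    """Shannon entropy in bits of one predicted class distribution.
    Higher values mean the prediction is spread over more classes."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def accuracy(prob_rows, labels):
    """Fraction of observations whose most probable class is the true class."""
    hits = sum(1 for probs, y in zip(prob_rows, labels)
               if max(range(len(probs)), key=probs.__getitem__) == y)
    return hits / len(labels)

def mean_entropy(prob_rows):
    """Average entropy over all observations."""
    return sum(shannon_entropy(p) for p in prob_rows) / len(prob_rows)

# Hypothetical class probabilities for three depth classes at four sites.
labels = [0, 1, 0, 2]
global_probs = [[0.40, 0.35, 0.25], [0.30, 0.40, 0.30],
                [0.34, 0.33, 0.33], [0.20, 0.30, 0.50]]
regional_probs = [[0.80, 0.10, 0.10], [0.10, 0.80, 0.10],
                  [0.70, 0.20, 0.10], [0.10, 0.10, 0.80]]

for name, rows in (("global", global_probs), ("regional", regional_probs)):
    mean_brier = sum(brier_score(p, y) for p, y in zip(rows, labels)) / len(labels)
    print(name, accuracy(rows, labels), round(mean_brier, 3), round(mean_entropy(rows), 3))
```

In this made-up example both sets of predictions pick the same classes, so accuracy is identical, while the sharper probabilities give lower Brier score and entropy. That is the kind of pattern the abstract describes: similar accuracy with less uncertainty.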

The Concept, Practice, Application and Results of Locally Based Monitoring of the Environment

Locally based monitoring is typically undertaken in areas in which communities have a close attachment to their natural resource base. We present a summary of work to develop a theoretical and practical understanding of locally based monitoring and we outline tests of this approach in research and practice over the past 20 years. Our tests show that locally based monitoring delivers credible data at local scale independent of external experts and can be used to inform local and national decision making within a short time frame. We believe that monitoring conducted by and anchored in communities will gain in importance where scientist-led monitoring is sparse or too expensive to sustain and for ecosystem attributes in cases in which remote sensing cannot provide credible data. The spread of smartphone technology and online portals will further enhance the importance and usefulness of this discipline.

Connecting Top-Down and Bottom-Up Approaches in Environmental Observing

Effective responses to rapid environmental change rely on observations to inform planning and decision-making. Reviewing literature from 124 programs across the globe and analyzing survey data for 30 Arctic community-based monitoring programs, we compare top-down, large-scale, program-driven approaches with bottom-up approaches initiated and steered at the community level. Connecting these two approaches and linking to Indigenous and local knowledge yields benefits including improved information products and enhanced observing program efficiency and sustainability. We identify core principles central to such improved links: matching observing program aims, scales, and ability to act on information; matching observing program and community priorities; fostering compatibility in observing methodology and data management; respecting Indigenous intellectual property rights and implementing free, prior, and informed consent; creating sufficient organizational support structures; and ensuring community members’ sustained commitment. Interventions to overcome challenges in adhering to these principles are discussed.

The Use of Digital Platforms for Community-Based Monitoring

Environmental observing programs that are based on Indigenous and local knowledge increasingly use digital technologies. Digital platforms may improve data management in community-based monitoring (CBM) programs, but little is known about how their use translates into tangible results. Drawing on published literature and a survey of 18 platforms, we examine why and how digital platforms are used in CBM programs and illuminate potential challenges and opportunities. Digital platforms make it easy to collect, archive, and share CBM data, facilitate data use, and support understanding larger-scale environmental patterns through interlinking with other platforms. Digital platforms, however, also introduce new challenges, with implications for the sustainability of CBM programs and communities’ abilities to maintain control of their own data. We expect that increased data access and strengthened technical capacity will create further demand within many communities for ethically developed platforms that aid in both local and larger-scale decision-making.

Amalgamation and harmonization of soil survey reports into a multi-purpose database

There is a growing demand for standardized, easily accessible, and detailed information pertaining to soil and its variability across the landscape. Typically, this information is only available for selected areas in the form of local or regional soil survey reports, which are difficult and costly to develop. Additionally, soil surveying protocols have changed with time, resulting in inconsistencies between surveys conducted over different periods. This article describes systematic procedures applied to generate an aspatial database for forest soils, consistent in terminology and units, from county-based soil survey reports for the province of New Brunswick, Canada. The procedures involved (i) amalgamating data from individual soil surveys following a hierarchical framework, (ii) summarizing and grouping soil information by soil associations, (iii) assigning correct soil associates to each association, with each soil associate distinguished by drainage classification, (iv) assigning pedologically correct horizon sequences, as identified in the original soil surveys, to each soil associate, (v) assigning horizon descriptors and measured soil properties to each horizon, as outlined by the Canadian System of Soil Classification, and (vi) harmonizing units of measurement for individual soil properties. Identification and summarization of all soil associations (and corresponding soil associates) was completed with reference to the principal soil-forming factors, namely soil parent material, topographic surface expressions, soil drainage, and dominant vegetation type(s). This procedure, utilizing 17 soil surveys, resulted in an amalgamated database containing 106 soil associations, 243 soil associates, and 522 soil horizon sequences summarizing the variability of forest soil conditions across New Brunswick.

Multisensor fusion of remotely sensed vegetation indices using space-time dynamic linear models

High spatiotemporal resolution maps of surface vegetation from remote sensing data are desirable for vegetation and disturbance monitoring. However, due to the current limitations of imaging spectrometers, remote sensing datasets of vegetation with high temporal frequency of measurements have lower spatial resolution, and vice versa. In this research, we propose a space-time dynamic linear model to fuse high temporal frequency data (MODIS) with high spatial resolution data (Landsat) to create high spatiotemporal resolution data products of a vegetation greenness index. The model incorporates the spatial misalignment of the data and models dependence within and across land cover types with a latent multivariate Matérn process. To handle the large size of the data, we introduce a fast estimation procedure and a moving window Kalman smoother to produce a daily, 30-m resolution data product with associated uncertainty.
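
The core idea of fusing a frequent, noisier series with a sparse, sharper one in a dynamic linear model can be sketched for a single pixel. This is a heavily simplified illustration, not the authors' model: it assumes a scalar random-walk (local-level) state, runs only the forward Kalman filter rather than a moving-window smoother, and leaves out the spatial misalignment and Matérn structure entirely. The function name `kalman_fuse`, its parameters, and the greenness values are all hypothetical.

```python
def kalman_fuse(obs_a, obs_b, var_a, var_b, process_var, m0=0.0, c0=1.0):
    """Forward Kalman filter for a scalar local-level model.

    obs_a, obs_b: per-time-step observations from two sensors (None if missing).
    var_a, var_b: observation noise variances of the two sensors.
    process_var:  variance of the random-walk step of the latent state.
    Returns (filtered means, filtered variances), one entry per time step.
    """
    means, variances = [], []
    m, c = m0, c0
    for ya, yb in zip(obs_a, obs_b):
        # Predict: the random-walk mean carries over, the variance grows.
        c = c + process_var
        # Update sequentially with whichever sensors reported at this step
        # (valid when the two sensors' noise terms are independent).
        for y, r in ((ya, var_a), (yb, var_b)):
            if y is None:
                continue
            gain = c / (c + r)
            m = m + gain * (y - m)
            c = (1.0 - gain) * c
        means.append(m)
        variances.append(c)
    return means, variances

# Hypothetical greenness index values for one pixel.
coarse = [0.30, 0.34, 0.31, 0.38, 0.36, 0.41]   # available every step, noisier
fine = [0.32, None, None, None, None, 0.40]     # available rarely, sharper
means, variances = kalman_fuse(coarse, fine, var_a=0.01, var_b=0.001,
                               process_var=0.0005, m0=0.3, c0=0.01)
for t, (m, v) in enumerate(zip(means, variances)):
    print(t, round(m, 3), round(v, 5))
```

The filtered variance drops on the steps where the sharper sensor reports and creeps back up in between, which is the per-step uncertainty a fused product would carry.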

Do soil health tests match farmer experience? Assessing biological, physical, and chemical indicators in the Upper Midwest United States

Soil health testing provides an integrated assessment of biological, physical, and chemical attributes to inform the sustainable management of farm fields. However, it is unclear how tests reflect farmers’ own assessments of soil quality and agronomic performance, which may disproportionately influence farm management practices. We asked farmers in three regions of Michigan to identify three fields to compare their own assessments against soil health tests: a “best,” a “worst,” and a “non-row crop” reference field. Each field was tested for soil aggregate stability, available water capacity, soil organic matter (SOM), mineralizable carbon (MinC), permanganate oxidizable carbon (POXC), pH, P, and K. We evaluated soil health scores using paired t tests to compare results from contrasting fields with farmers’ assessments of each field. Across all farms, the overall soil health test score for cropped fields was significantly higher on fields farmers rated as “Best.” This result was driven solely by physical and biological (including C) parameters; inorganic chemical tests did not distinguish among field types. On reference fields in all regions, biological parameters were consistently higher, but inorganic chemical and physical measures were not. The performance of soil C measures was inconsistent: SOM and MinC consistently detected significant differences between “Best” and “Worst” cropped fields, but POXC did not. Our results suggest that common soil health assays for physical and biological attributes generally align well with farmers’ assessments of their fields. That soil health tests match farmer experience reinforces the value of these tests as a meaningful guide for soil management decisions.

Evaluating three calibration transfer methods for predictions of soil properties using mid-infrared spectroscopy

Mid-infrared (MIR) spectroscopy models have been developed for rapid assessment of soils but are often soil and instrument specific because of differences in laboratory conditions and sensor setup. Calibration transfer is required to apply a spectral model such as partial least squares (PLS) regression developed from a primary instrument to a spectral dataset measured by a secondary instrument with statistically retained accuracy and precision. The study aimed to compare the performance of three transfer methods (i.e., direct standardization [DS], piecewise direct standardization [PDS], and spectral space transfer [SST]) and investigate the effects of transfer sample size and sample selection methods. The transfer methods were developed for predicting total C, clay, silt, and sand contents, cation exchange capacity (CEC), pH in water (pH(W)) and CaCl₂, CaCO₃ equivalent, and -1,500-kPa water retention using spectral measurements of a secondary instrument. Calibration transfer methods of three PLS models for estimating soil properties with a high (total C), intermediate (clay content), and low (pH(W)) predictability were discussed. The effect of sample size required for the development of the calibration transfer and the selection method of the transferred samples were investigated. It was found that SST was most favorable for a relatively small sample size used in calibration transfer (≤12 samples). The performance of transfer methods was optimal when the transfer samples accounted for the variability of MIR spectra from the secondary instrument. We conclude that SST and PDS have the potential to be applied in spectroscopy for predicting soil properties using secondary instruments.

Continuous in situ soil nitrate sensors: The importance of high-resolution measurements across time and a comparison with salt extraction-based methods

Soil NO₃⁻ affects microbial processes, plant productivity, and environmental N losses. However, the ability to measure soil NO₃⁻ is limited by labor-intensive sampling and laboratory analyses. Hence, temporal variation in soil solution NO₃⁻ concentration is poorly understood. We evaluated a new potentiometric sensor that continuously measures soil solution NO₃⁻ concentration with unprecedented specificity due to a novel membrane that serves as a barrier to interfering anions. First, we compared sensor and salt extraction-based measurements of soil NO₃⁻ in well-controlled laboratory conditions. Second, using 60 d of in situ soil NO₃⁻ measurements every 10 s, we quantified temporal variation and the effect of sampling frequency on field estimations of mean daily NO₃⁻ concentration both within and across days. In the laboratory, sensors measured soil NO₃⁻ concentration without significant difference from theoretical adjusted soil NO₃⁻ concentration or conventional salt extraction-based methods. In the field, the sensors demonstrated no within-day pattern in soil NO₃⁻ concentration, although individual measurements within a day differed by as much as 20% from the daily mean. Across days, when soil solution NO₃⁻ was dynamic (early spring) and the sampling interval was >5 d, estimates of mean daily NO₃⁻ concentration differed by >20% from the actual mean daily concentration. In situ soil sensors offer potential to improve fundamental and applied sciences. However, in most situations, sensors will measure soil properties in a different manner than conventional salt-extract soil sampling-based approaches. Research will be required to interpret sensor measurements and optimize sensor deployment.
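
The effect of sampling interval on an estimated mean can be illustrated with a small sketch. The abstract does not describe how the authors computed their error figures, so this is an assumed simplification: it compares the mean of every `step`-th value against the mean of the full series. The function name and the concentration values are hypothetical.

```python
import statistics

def relative_error_of_subsampled_mean(series, step, offset=0):
    """Relative error of the mean when only every `step`-th value is sampled.

    series: evenly spaced measurements (for example, daily mean concentrations).
    step:   sampling interval in units of the series spacing.
    offset: index of the first sampled value.
    """
    full_mean = statistics.fmean(series)
    sampled = series[offset::step]
    return abs(statistics.fmean(sampled) - full_mean) / abs(full_mean)

# Hypothetical daily mean concentrations (arbitrary units) during a dynamic
# period with a short-lived pulse.
daily = [10, 14, 22, 35, 30, 18, 12, 9, 8, 8, 7, 7]
for step in (1, 3, 6):
    print(step, round(relative_error_of_subsampled_mean(daily, step), 3))
```

With a pulse-like series, the error depends strongly on whether the sparse samples happen to land on the pulse, so trying several `offset` values for the same `step` gives a sense of the spread.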

Bayesian-based time-varying multivariate drought risk and its dynamics in a changing environment

Drought is one of the most damaging but least understood environmental disasters. The time-varying multivariate drought risk and its dynamics have remained unresolved in a changing environment. To this end, a Bayesian framework with time as a covariate in its location parameter was introduced in this study to construct time-varying distributions of drought duration and severity. In addition, the joint distribution of precipitation and runoff was developed by bivariate nonparametric kernel density estimation for the multivariate drought index NKMSDI (Nonparametric Kernel Multivariate Standardized Drought Index), and an Expected Waiting Time (EWT)-based return period was used to estimate drought risk. Finally, the time-varying risk trends were explored and verified via correlations between drought risk and Normalized Difference Vegetation Index series. Results indicate that: (1) the bivariate return period is more accurate than the univariate return period for drought risk assessment, and return periods under a non-stationary assumption are more reasonable than those under a stationary assumption; (2) the multivariate drought risks show clearly increasing trends, and the western basin shows the highest rate of increase; and (3) the increasing drought risks exhibit strong association with sunspot activities and local vegetation dynamics. In general, this study provides new insights into drought risk and its dynamics under time-varying drought properties, which is highly important for robust and effective management practices.
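
An Expected Waiting Time return period can be sketched as follows. The abstract does not give the formula, so this uses one common formulation as an assumption: if p_t is the probability of the event in year t, the expected number of years until the first event is 1 + sum over x >= 1 of the product (1 - p_1)...(1 - p_x). How the yearly probabilities would come out of the Bayesian model is not reproduced here; the probability functions below are hypothetical.

```python
def expected_waiting_time(p_of_t, tol=1e-12, max_years=100_000):
    """Expected waiting time (in years) until the first event.

    p_of_t: function giving the event probability in year t (t = 1, 2, ...).
    The infinite sum is truncated once the probability of having seen no
    event so far falls below `tol`, or after `max_years` terms.
    """
    survival = 1.0   # probability that no event has occurred through year t
    ewt = 1.0        # the x = 0 term of the sum of P(waiting time > x)
    for t in range(1, max_years + 1):
        survival *= 1.0 - p_of_t(t)
        ewt += survival
        if survival < tol:
            break
    return ewt

# Stationary case: constant yearly probability 0.1 gives the classic 1/p.
stationary = expected_waiting_time(lambda t: 0.1)

# Hypothetical non-stationary case: the yearly probability rises over time.
rising = expected_waiting_time(lambda t: min(1.0, 0.1 + 0.01 * t))

print(round(stationary, 3), round(rising, 3))
```

Under a constant probability this reduces to the familiar return period 1/p, while a rising probability shortens the expected wait, which is one way a non-stationary assumption changes the risk estimate.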

Transferring Hydrologic Data Across Continents - Leveraging Data-Rich Regions to Improve Hydrologic Prediction in Data-Sparse Regions

There is a drastic geographic imbalance in available global streamflow gauge and catchment property data, with additional large variations in data characteristics. As a result, models calibrated in one region cannot normally be migrated to another without significant modifications. Currently in these regions, non-transferable machine learning models are habitually trained over small local data sets. Here we show that transfer learning (TL), in the senses of weight initialization and weight freezing, allows long short-term memory (LSTM) streamflow models that were pretrained over the conterminous United States (CONUS, the source data set) to be transferred to catchments on other continents (the target regions), without the need for extensive catchment attributes available at the target location. We demonstrate this possibility for regions where data are dense (664 basins in Great Britain), moderately dense (49 basins in central Chile), and scarce with only remotely sensed attributes available (5 basins in China). In both China and Chile, the TL models showed significantly elevated performance compared to locally trained models using all basins. The benefits of TL increased with the amount of available data in the source data set, and seemed to be more pronounced with greater physiographic diversity. The benefits from TL were greater than from pretraining LSTM using the outputs from an uncalibrated hydrologic model. These results suggest hydrologic data around the world have commonalities which could be leveraged by deep learning, and synergies can be had with a simple modification of the current workflows, greatly expanding the reach of existing big data. Finally, this work diversified existing global streamflow benchmarks.

Plain Language Summary

We introduced a method to utilize available big data to better start and warm up a machine learning streamflow model that is later fine-tuned for prediction in basins on other continents (Asia, South America and Europe). This procedure noticeably improved streamflow volume prediction for different scenarios with varying amounts of data in the target basins (in terms of time period, length of collected data, and number of basins having data). This allows thousands of basins across the world with only a few years’ worth of streamflow observations to benefit from improved modeling and accuracy resulting from the use of deep learning.
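
The two transfer-learning ingredients named in the abstract, weight initialization and weight freezing, can be shown on a deliberately tiny stand-in. This is not an LSTM and not the authors' workflow: it is a two-parameter linear model y = w*x + b, trained by plain gradient descent, where the slope is pretrained on a data-rich "source" and then frozen while only the offset is fine-tuned on a single "target" observation. All names and numbers are hypothetical.

```python
def fit(xs, ys, w, b, train_w=True, train_b=True, lr=0.5, epochs=500):
    """Gradient descent on mean squared error for y = w*x + b.

    w, b:             starting values (weight initialization).
    train_w, train_b: set to False to freeze that parameter at its start value.
    """
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        if train_w:
            w -= lr * grad_w
        if train_b:
            b -= lr * grad_b
    return w, b

# Source region: plenty of data following y = 2x + 1.
src_x = [i / 10 for i in range(11)]
src_y = [2 * x + 1 for x in src_x]
w_src, b_src = fit(src_x, src_y, w=0.0, b=0.0)

# Target region: same slope, different offset (y = 2x + 3), one observation.
tgt_x, tgt_y = [0.5], [4.0]

# Transfer: start from the pretrained weights, freeze w, fine-tune only b.
w_tl, b_tl = fit(tgt_x, tgt_y, w=w_src, b=b_src, train_w=False)

# Local baseline: train both parameters from scratch on the target data only.
w_loc, b_loc = fit(tgt_x, tgt_y, w=0.0, b=0.0)

print("transfer:", round(w_tl, 3), round(b_tl, 3))
print("local:   ", round(w_loc, 3), round(b_loc, 3))
```

With one target observation the local model cannot separate slope from offset, while the frozen pretrained slope lets the single observation pin down the offset. In a deep learning framework the same idea is usually expressed by loading pretrained weights and excluding some layers from the optimizer.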