University of Illinois Chicago

Hydrometeorological Variables Predict Fecal Indicator Bacteria Densities in Freshwater: Data-driven Methods for Variable Selection

journal contribution
posted on 2014-04-15, 00:00 authored by Rachael M. Jones, Li Liu, Samuel Dorevitch
Statistical models of microbial water quality inform risk management for water recreation. Current research focuses on resource-intensive, location-specific data collection and water quality modeling, but this approach may be cost-prohibitive for risk managers responsible for numerous recreation sites. As an alternative, we tested the ability of two data-driven models, tree regression and random forests with conditional inference trees, to select readily available hydrometeorological variables for use in linear mixed effects (LME) models predicting bacterial density. The study included the Chicago Area Waterway System (CAWS) and Lake Michigan beaches and harbors in Chicago, Illinois, at which Escherichia coli and enterococci were measured seasonally in 2007-2009. Tree regression node variables reduced data dimensionality by >50%. Variable importance ranks from random forests were used in a forward-step selection based on R² and root mean squared prediction error (RMSPE). We found two to three variables explained bacteria densities well relative to random forests with all variables. LME models with tree- or forest-selected variables performed reasonably well (0.335 < R² < 0.658). LME models for Lake Michigan had good prediction accuracy with respect to the single sample maximum standard (72-77%), but limited sensitivity (23-62%). Results suggest that our alternative approach is feasible and performs similarly to more resource-intensive approaches.
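
The evaluation measures named in the abstract (R², RMSPE, and accuracy and sensitivity against a single sample maximum standard) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the sample values, and the 235 CFU/100 mL threshold are assumptions chosen for illustration, since the abstract states neither the threshold nor the data.

```python
import math


def rmspe(observed, predicted):
    """Root mean squared prediction error over paired values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def exceedance_metrics(observed, predicted, threshold):
    """Return (accuracy, sensitivity) for exceedance of a threshold.

    accuracy    = share of samples whose exceedance status is predicted correctly
    sensitivity = share of observed exceedances that are also predicted
    """
    tp = tn = fp = fn = 0
    for o, p in zip(observed, predicted):
        obs_exceeds, pred_exceeds = o > threshold, p > threshold
        if obs_exceeds and pred_exceeds:
            tp += 1
        elif obs_exceeds:
            fn += 1
        elif pred_exceeds:
            fp += 1
        else:
            tn += 1
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, sensitivity


# Hypothetical log10 densities (made up for illustration, not study data).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.3, 4.0]

# Assumed threshold: log10 of 235 CFU/100 mL (not stated in the abstract).
threshold = math.log10(235)

print(rmspe(observed, predicted))
print(r_squared(observed, predicted))
print(exceedance_metrics(observed, predicted, threshold))
```

In this made-up example one of the two observed exceedances is missed, so sensitivity falls well below accuracy, the same pattern the abstract reports for the Lake Michigan models.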

Funding

Metropolitan Water Reclamation District of Greater Chicago

History

Publisher Statement

Post print version of article may differ from published version. The final publication is available at springerlink.com; DOI:10.1007/s10661-012-2716-8

Publisher

Springer Verlag

Language

  • en_US

issn

0167-6369

Issue date

2013-03-01
