Digging Deep: An attempt to model groundwater quality with land use and more
Can we decipher groundwater quality by looking at the land? This is the key objective of this project. Uncovering the correlations between groundwater quality and its measurable driving factors, especially land use, will inform decision-making in land management to preserve groundwater resources. In addition, by establishing an understanding of how land use impacts groundwater quality, this research offers predictive insights into potential changes in groundwater as land use evolves over time.
This study involves analysing 19 years of groundwater quality data, examining the statistical significance of differences in water quality under each land use, and leveraging machine learning techniques to infer the contributing factors, including land use, while incorporating additional features that influence groundwater quality. The focus area of this study is the Canterbury region, which hosts substantial groundwater resources and experiences the highest groundwater usage for irrigation, industry and community water supply. Nitrate nitrogen concentration in groundwater was used as the water quality indicator and the target variable in statistical modelling due to its strong association with land use.
Four versions of land use classification results, spanning the period of the water quality measurements, were analysed to examine the influence of land use changes over time. For each version of the land use classification, the Kruskal-Wallis rank sum test was applied to detect differences in nitrate nitrogen concentration across land use types. Subsequent Dunn’s tests were conducted to identify the specific groups of land use types between which nitrate nitrogen concentrations differed significantly. The features used to train the machine learning models were land use classification, soil type classification, hydrogeological system classification, precipitation, season of measurement, ground elevation of the measuring site, territorial area, distance to coast, and depth of well screen. Linear regression, glmnet, sparse partial least squares, random forest and boosted linear models were selected as finalist models.
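The Kruskal-Wallis step described above can be sketched in standard-library Python. This is an illustrative sketch only, not the project's own code: the nitrate values are made-up placeholders, the label `high_producing_grassland` is a hypothetical third group, and the function names are assumptions introduced for the example.

```python
from collections import Counter

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis(groups):
    """Return (H, df) for a dict mapping group label -> list of measurements."""
    labels = list(groups)
    pooled = [x for label in labels for x in groups[label]]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for label in labels:
        size = len(groups[label])
        rank_sum = sum(ranks[start:start + size])
        total += rank_sum ** 2 / size
        start += size
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Standard tie correction: divide H by 1 - sum(t^3 - t) / (n^3 - n).
    tie_counts = Counter(pooled).values()
    correction = 1 - sum(t ** 3 - t for t in tie_counts) / (n ** 3 - n)
    return h / correction, len(labels) - 1

# Hypothetical nitrate nitrogen concentrations (mg/L) per land use type.
nitrate_by_land_use = {
    "low_producing_grassland": [0.2, 0.5, 0.4, 0.9],
    "high_producing_grassland": [2.1, 3.4, 1.8, 4.0],
    "annual_cropland": [6.5, 7.2, 5.9, 8.8],
}
h_stat, df = kruskal_wallis(nitrate_by_land_use)
print(f"H = {h_stat:.3f} with {df} degrees of freedom")
```

The statistic is compared against a chi-squared distribution with `df` degrees of freedom. In practice a library routine such as `scipy.stats.kruskal` returns both the statistic and the p-value, and a post hoc procedure such as Dunn’s test then identifies which pairs of groups differ.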
Of these models, the random forest from the ranger package was the most effective, with the lowest root mean squared error and the least variance in performance during 10-fold cross-validation. The model results show that land use has an important impact on groundwater quality. Specifically, low-producing grassland is generally associated with low nitrate nitrogen concentrations, and annual cropland is linked to high nitrate nitrogen concentrations. This result indicates that farming activities on annually cultivated land can lead to elevated nitrate concentrations, highlighting the need for strict management, particularly within the capture areas of wells used for domestic water supply.
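The selection criterion described above (lowest mean RMSE and least spread across 10 folds) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the two candidate "models" are simple stand-ins (a global mean and a per-land-use mean) rather than the ranger random forest, the data are synthetic, and all function names are hypothetical.

```python
import math
import random
import statistics

def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def cross_validated_rmse(fit, rows, targets, k=10, seed=0):
    """fit(train_rows, train_targets) must return a predict(row) function.
    Returns (mean RMSE, standard deviation of RMSE) over the k folds."""
    scores = []
    for fold in k_fold_indices(len(rows), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(rows)) if i not in held_out]
        predict = fit([rows[i] for i in train], [targets[i] for i in train])
        scores.append(rmse([targets[i] for i in fold],
                           [predict(rows[i]) for i in fold]))
    return statistics.mean(scores), statistics.stdev(scores)

def fit_global_mean(rows, targets):
    """Stand-in baseline: ignore land use and predict the overall mean."""
    mean = statistics.mean(targets)
    return lambda row: mean

def fit_group_mean(rows, targets):
    """Stand-in model: predict the mean nitrate level of the row's land use."""
    overall = statistics.mean(targets)
    by_group = {}
    for row, y in zip(rows, targets):
        by_group.setdefault(row, []).append(y)
    means = {g: statistics.mean(v) for g, v in by_group.items()}
    return lambda row: means.get(row, overall)

# Synthetic data: one land use label per row, hypothetical nitrate levels (mg/L).
rng = random.Random(42)
base = {"low_producing_grassland": 0.5,
        "high_producing_grassland": 3.0,
        "annual_cropland": 7.0}
rows = [label for label in base for _ in range(30)]
targets = [base[label] + rng.uniform(-0.5, 0.5) for label in rows]

candidates = [("global_mean", fit_global_mean), ("land_use_mean", fit_group_mean)]
results = {name: cross_validated_rmse(fit, rows, targets) for name, fit in candidates}
best = min(results, key=lambda name: results[name][0])
for name, (mean_rmse, sd_rmse) in results.items():
    print(f"{name}: mean RMSE {mean_rmse:.3f}, SD {sd_rmse:.3f}")
print("selected:", best)
```

Reporting the standard deviation alongside the mean makes the "least variance in performance" part of the criterion explicit: a model with a slightly lower mean RMSE but a much wider spread across folds may be the less dependable choice.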
ABOUT THE AUTHOR
Alice Zhao
At the time of applying to present at the eResearch Conference, Alice Zhao was pursuing a Master of Applied Data Science degree at the University of Canterbury. Her conference presentation is the result of her master’s project, the final step in completing the degree. Starting her professional journey as a data scientist, Alice worked with the data science team at ESR on this groundwater quality modelling project for a few months. She gained an understanding of groundwater dynamics in Canterbury and approached the project in an innovative way by applying her expertise in data analysis and statistical modelling. Alice has developed a strong interest in environmental science through her involvement in the ESR project as well as several projects led by Environment Canterbury. She is inspired to apply data science techniques to tackle critical environmental challenges.
Alice Zhao is currently working in the School of Mathematics and Statistics at the University of Canterbury. Her primary roles involve course content development and teaching for the Master of Applied Data Science programme delivered through UC Online.
-----
For more information about the eResearch NZ / eRangahau Aotearoa conference, visit:
https://eresearchnz.co.nz/