AKN Publications
Data mining for discovery of pattern and process in ecological systems
Authors: W.M. Hochachka, R. Caruana, D. Fink, A. Munson, M. Riedewald, D. Sorokina, S. Kelling
Publication: Journal of Wildlife Management, in press for the August 2007 issue.
Description: This paper is an introduction to the philosophy and basic methodologies of data mining, for an audience that is largely unfamiliar with the strengths of data mining as a complementary analytical tool to the statistical techniques commonly used by ecologists. Three fundamental strengths of data mining are illustrated with examples using bagged decision trees as the method of analysis. The strengths are: (1) making accurate predictions, (2) discovery of important predictors, and (3) description of functional forms of relationships between predictors and response.
keywords/keyphrases: bagging, data mining, decision trees, exploratory data analysis, hypothesis generation, machine learning, prediction
Mining Citizen Science Data to Predict Prevalence of Wild Bird Species (PDF)
Authors: R. Caruana, M. Elhawary, A. Munson, M. Riedewald, D. Sorokina, D. Fink, W. Hochachka, S. Kelling
Publication: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'06), Philadelphia, PA, 2006
Description: Ecologists are interested in identifying which features have the strongest effect on the distribution and abundance of bird species, as well as describing the forms of these relationships. We show how data mining can be successfully applied to environmental data sets, enabling ecologists to discover unanticipated relationships. We compare a variety of methods for measuring attribute importance with respect to the probability of a bird being observed at a feeder and discuss the biological relevance of the results.
keywords/keyphrases: attribute importance, bagging, decision trees, model inspection, partial dependence function, sensitivity analysis