
AKN Publications

 

Data mining for discovery of pattern and process in ecological systems

Authors: W.M. Hochachka, R. Caruana, D. Fink, A. Munson, M. Riedewald, D. Sorokina, S. Kelling

Publication: Journal of Wildlife Management, in press for the August 2007 issue.

Description: This paper introduces the philosophy and basic methodologies of data mining for an audience largely unfamiliar with its strengths as an analytical tool that complements the statistical techniques commonly used by ecologists. Three fundamental strengths of data mining are illustrated with examples that use bagged decision trees as the method of analysis: (1) making accurate predictions, (2) discovering important predictors, and (3) describing the functional forms of relationships between predictors and response.

keywords/keyphrases: bagging, data mining, decision trees, exploratory data analysis, hypothesis generation, machine learning, prediction
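
As a rough illustration of the bagging idea named in the description above (not the authors' implementation), the following Python sketch trains an ensemble of one-split decision trees ("stumps", used here in place of full decision trees to keep the example short) on bootstrap resamples and averages their votes. The toy data set, the meaning of its two features, and all function names are invented for illustration.

```python
import random

def majority(ys, default=0):
    """Return the most common binary label (ties go to 1); default if empty."""
    if not ys:
        return default
    return 1 if sum(ys) * 2 >= len(ys) else 0

def fit_stump(rows, labels):
    """Fit a one-split tree: the feature/threshold pair with the fewest errors."""
    best = None  # (errors, feature, threshold, left_label, right_label)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            left_label, right_label = majority(left), majority(right)
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    return best[1:]

def predict_stump(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label

def fit_bagged(rows, labels, n_trees=25, seed=0):
    """Bagging: fit each stump on a bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    ensemble = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(fit_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx]))
    return ensemble

def predict_proba(ensemble, row):
    """Fraction of stumps voting 1, read as a predicted probability."""
    return sum(predict_stump(s, row) for s in ensemble) / len(ensemble)

# Hypothetical data: feature 0 drives the response, feature 1 is irrelevant.
rows = [[x, e] for x in range(10) for e in (0, 1)]
labels = [1 if x >= 5 else 0 for x, e in rows]

ensemble = fit_bagged(rows, labels)
print(predict_proba(ensemble, [9, 0]))  # high value of feature 0
print(predict_proba(ensemble, [0, 1]))  # low value of feature 0
```

Averaging over resampled models is what distinguishes bagging from a single tree: each stump sees a slightly different sample, and the ensemble vote smooths out their individual quirks.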

 

Mining Citizen Science Data to Predict Prevalence of Wild Bird Species (PDF)

Authors: R. Caruana, M. Elhawary, A. Munson, M. Riedewald, D. Sorokina, D. Fink, W. Hochachka, S. Kelling.
Publication: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'06), Philadelphia, PA, 2006

Description: Ecologists are interested in identifying which features have the strongest effect on the distribution and abundance of bird species, as well as describing the forms of these relationships. We show how data mining can be successfully applied to environmental data sets, enabling ecologists to discover unanticipated relationships. We compare a variety of methods for measuring attribute importance with respect to the probability of a bird being observed at a feeder and discuss the biological relevance of the results.

keywords/keyphrases: attribute importance, bagging, decision trees, model inspection, partial dependence function, sensitivity analysis
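
To make the "partial dependence function" keyword concrete, here is a minimal Python sketch of the general idea: fix one feature at a chosen value for every record, average the model's predictions, and repeat over a grid of values. The model, data, and function names are hypothetical, and using the range of the resulting curve as an importance score is a simplifying assumption for this example, not necessarily one of the measures compared in the paper.

```python
def partial_dependence(predict, rows, feature, values):
    """Average prediction when `feature` is forced to each value in `values`."""
    curve = []
    for v in values:
        total = 0.0
        for row in rows:
            modified = list(row)      # copy so the original data is untouched
            modified[feature] = v
            total += predict(modified)
        curve.append(total / len(rows))
    return curve

def toy_model(row):
    """Hypothetical fitted model: predicted probability of observing a bird.

    It depends only on feature 0 and ignores feature 1 entirely.
    """
    return 0.75 if row[0] >= 5 else 0.25

# Hypothetical records with two features each.
rows = [[2, 0], [4, 1], [6, 0], [8, 1]]
grids = {0: [0, 9], 1: [0, 1]}  # values at which to evaluate each feature

# Crude sensitivity score (an assumption of this sketch): curve range.
importance = {}
for feature, values in grids.items():
    curve = partial_dependence(toy_model, rows, feature, values)
    importance[feature] = max(curve) - min(curve)

print(importance)
```

A flat curve, as for feature 1 here, suggests the model's predictions do not respond to that feature, while the shape of a non-flat curve describes the form of the relationship.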