
Statistical Methods for Mn/Model Phase 4

Date Created
2007-08
Description
Mn/Model is a project that combines landscape and archaeological databases in a Geographic Information System (GIS) with statistical prediction methods to estimate the risk that a given location contains archaeological artifacts. Planners use these estimates both to seek out areas of low risk and to accommodate areas of high risk when planning transportation projects. More accurate risk estimates lead to improved planning and reduced costs. Mn/Model is about to move into its fourth phase, which will include improved landscape and archaeological data. At this time, Mn/DOT wishes to reconsider the statistical prediction methods used in Phase 3 to determine whether better alternatives are available. This project proposed and compared eight prediction methods: the Phase 3 method and seven alternatives. The methods were logistic regression with BIC model selection (the Phase 3 approach), logistic regression with Bayesian model averaging, naïve Bayes classification, tree-structured regression, "bumped" trees, "bagged" trees, "double bagged" trees, and "boosted" trees. Bumping, bagging, and boosting are examples of "perturb and aggregate" methods, which repeatedly modify the data in minor ways and then combine the predictions from the modified data sets. Overall, bagging, double bagging, and boosting had the best predictive ability. We recommend that bagged trees, or bagging, be the default prediction method for Phase 4. Bagging is easier than boosting to carry out in S-Plus (the statistical software used) and easier to implement in the GIS framework. It also provides a substantial improvement in predictive capability over the Phase 3 method. Tree-structured models are also fairly easy to explain to the general public. Double bagging provides a small improvement over bagging, but at the cost of substantially more implementation effort.
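
To illustrate the "perturb and aggregate" idea behind bagging, here is a minimal sketch in Python. It is not the project's S-Plus implementation. It assumes each location is a pair of numeric landscape features and a 0/1 site indicator, and it uses single-split trees ("stumps") as a stand-in for full tree-structured models. The function names `fit_stump` and `fit_bagged` and the toy data are hypothetical.

```python
import random

def fit_stump(rows):
    """Fit a one-split tree on rows of (features, y), where y is 0 or 1.

    Returns a function mapping a feature list to an estimated risk in [0, 1].
    """
    best = None
    n_features = len(rows[0][0])
    for j in range(n_features):
        for threshold in sorted({x[j] for x, _ in rows}):
            left = [y for x, y in rows if x[j] <= threshold]
            right = [y for x, y in rows if x[j] > threshold]
            if not left or not right:
                continue
            p_left = sum(left) / len(left)
            p_right = sum(right) / len(right)
            # Squared error when each leaf predicts its own mean response.
            sse = (sum((y - p_left) ** 2 for y in left)
                   + sum((y - p_right) ** 2 for y in right))
            if best is None or sse < best[0]:
                best = (sse, j, threshold, p_left, p_right)
    if best is None:
        # No usable split: fall back to the overall rate.
        rate = sum(y for _, y in rows) / len(rows)
        return lambda x: rate
    _, j, threshold, p_left, p_right = best
    return lambda x: p_left if x[j] <= threshold else p_right

def fit_bagged(rows, n_models=25, seed=0):
    """Bagging: fit one stump per bootstrap resample, then average predictions."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Perturb: resample the data with replacement (same size as original).
        sample = [rng.choice(rows) for _ in rows]
        models.append(fit_stump(sample))
    # Aggregate: the bagged risk estimate is the mean of the stump estimates.
    return lambda x: sum(m(x) for m in models) / len(models)

# Hypothetical toy data: the first feature drives site presence,
# the second feature is uninformative.
toy_rows = [([i / 20, i % 3], 1 if i >= 10 else 0) for i in range(20)]
predict_risk = fit_bagged(toy_rows)
```

Calling `predict_risk([0.95, 1])` returns an averaged risk estimate for a new location. Averaging over resamples smooths the hard threshold that any single tree would impose, which is one common explanation for why bagged trees tend to predict better than a single tree.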