Here are some notes on machine learning models.

Concepts Behind Decision Trees

Bagging (bootstrap aggregation): Randomly sample with replacement, fit a tree to each sample, and average the results.
Majority vote: The most commonly occurring prediction.
Internal node: Where the splits occur.
Branches: Segments that connect the nodes.
Terminal node (leaf, region): Where the observations end up. The average of the responses (or the majority vote) in a terminal node is the prediction for future observations that land there.
Gini index: \(G_m = \sum_k \hat{p}_{mk}(1 - \hat{p}_{mk})\), where \(m\) is the leaf, \(k\) is the class (0 or 1 for binary classification, but it can be extended to multiple classes), and \(\hat{p}_{mk}\) is the proportion of observations in leaf \(m\) that belong to class \(k\).
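To make three of these definitions concrete, here is a minimal Python sketch. It assumes the usual form of the Gini index for a leaf (the sum over classes of p(1 - p), where p is the share of the leaf's observations in that class). The function names `gini_index`, `majority_vote`, and `bootstrap_sample` are made up for illustration and do not come from any particular library.

```python
from collections import Counter
import random


def gini_index(labels):
    """Gini index of one leaf: sum over classes k of p_k * (1 - p_k)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return sum((c / n) * (1 - c / n) for c in counts.values())


def majority_vote(predictions):
    """Most commonly occurring prediction (ties go to the value seen first)."""
    return Counter(predictions).most_common(1)[0][0]


def bootstrap_sample(rows, rng):
    """Draw len(rows) items with replacement: the 'bootstrap' in bagging."""
    return [rng.choice(rows) for _ in rows]


# A leaf holding three observations of class 1 and one of class 0.
leaf = [1, 1, 1, 0]
print(gini_index(leaf))     # 0.375
print(majority_vote(leaf))  # 1

# One bootstrap resample of the same observations (seeded so it repeats).
print(bootstrap_sample(leaf, random.Random(0)))
```

A pure leaf (all observations in one class) scores 0, and a 50/50 binary leaf scores 0.5, which is why a lower Gini index means a cleaner split.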

The goal of this tutorial is to do the following:

Collect addresses (via Google Forms)
Download to R (via googlesheets)
Geocode them (via geocode)
Plot them (using leaflet)
Get driving distance between them (via gmapsdistance)
Cluster them (kmeans)
Making the leaflet plot fancy

1. Collect

Perhaps in a future post I’ll explore googleformr. For now, I create forms the old-school way.
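The clustering goal listed above relies on R's kmeans in the tutorial itself. As a language-neutral illustration of what that step does, here is a small Python sketch of k-means (Lloyd's algorithm) on geocoded points. Everything in it is an assumption made for illustration: the function name `kmeans`, the naive "first k points" initialization, the made-up coordinates, and the use of squared straight-line distance on raw latitude and longitude (a rough approximation, and not the driving distance mentioned in the list).

```python
def kmeans(points, k, iterations=20):
    """Minimal k-means on (lat, lon) tuples; returns (centroids, labels)."""

    def nearest(p, centroids):
        # Index of the centroid with the smallest squared distance to p.
        return min(
            range(len(centroids)),
            key=lambda j: (p[0] - centroids[j][0]) ** 2
            + (p[1] - centroids[j][1]) ** 2,
        )

    # Naive initialization (an assumption): use the first k points.
    centroids = list(points[:k])

    for _ in range(iterations):
        # Assignment step: group each point with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)

        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j, members in enumerate(clusters):
            if members:
                lat = sum(m[0] for m in members) / len(members)
                lon = sum(m[1] for m in members) / len(members)
                new_centroids.append((lat, lon))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid

        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids

    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Two made-up groups of (latitude, longitude) pairs, for illustration only.
addresses = [
    (38.90, -77.03), (38.91, -77.04), (38.89, -77.02),
    (39.29, -76.61), (39.30, -76.62), (39.28, -76.60),
]
centroids, labels = kmeans(addresses, k=2)
print(labels)
```

The cluster numbers themselves are arbitrary; what matters is which addresses end up sharing a label.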

Bryan Whiting

father, innovator, data scientist

Data Scientist

Washington, D.C.