The goal of this tutorial is to do the following: collect addresses (via Google Forms); download them to R (via googlesheets); geocode them (via geocode); plot them (using leaflet); get the driving distance between them (via gmapsdistance); cluster them (kmeans); and make the leaflet plot fancy.

1. Collect

Perhaps in a future post I’ll explore googleformr. For now, I create forms the old-school way.

The beauty of open source is “Oh, let me just download that package and I can do amazing things!”. The reality is “ok, I downloaded it, and I got the ‘hello world’ example working. But now to actually get it to do what I want in the environment that I want takes like…now 30 hours? Just one more bug and I’ll finally give up…”

Bugs I hit:

I hit a lot of bugs when building my Leaflet tutorial.

Last March, I wanted to break into the Data Science community but was struggling with confidence and starting to doubt I’d make it. I had been reading blog after blog about data science, practicing coding, and learning new methods at night. After rejection upon rejection, I wondered if I was on the right path. Now that I'm on the other side, hindsight is 20/20. I write this post to my past self, a person seeking a job they couldn’t seem to get.

If you’ve ever wanted to tag your data science model, you’ve probably wondered how to version it. Which will it be: vx.4.1, v34.1231.51.21, or v91.x4.dev34? After reading about semantic versioning, I propose a method for versioning data science models.

Semantic Versioning for Software

Semantic versioning proposes the following: Given a version number MAJOR.MINOR.PATCH, increment the:

MAJOR version when you make incompatible API changes,
MINOR version when you add functionality in a backwards-compatible manner, and
PATCH version when you make backwards-compatible bug fixes.
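
As a rough illustration of the MAJOR.MINOR.PATCH rule quoted above, here is a minimal Python sketch. The names `Version`, `parse`, and `bump` are hypothetical helpers made up for this example, not part of any library or of the post itself. It also assumes the convention from the Semantic Versioning specification that lower-order components reset to zero when a higher-order one is incremented.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """A MAJOR.MINOR.PATCH version number (hypothetical helper type)."""
    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def parse(text: str) -> Version:
    """Parse a plain 'MAJOR.MINOR.PATCH' string; raises ValueError otherwise."""
    major, minor, patch = (int(part) for part in text.split("."))
    return Version(major, minor, patch)


def bump(version: Version, change: str) -> Version:
    """Return the next version for a 'major', 'minor', or 'patch' change."""
    if change == "major":
        # Incompatible API change: reset MINOR and PATCH.
        return Version(version.major + 1, 0, 0)
    if change == "minor":
        # Backwards-compatible new functionality: reset PATCH.
        return Version(version.major, version.minor + 1, 0)
    if change == "patch":
        # Backwards-compatible bug fix.
        return Version(version.major, version.minor, version.patch + 1)
    raise ValueError(f"unknown change type: {change!r}")


if __name__ == "__main__":
    current = parse("1.4.2")
    for change in ("patch", "minor", "major"):
        print(change, "->", bump(current, change))
```

The sketch only handles plain three-part versions; pre-release and build-metadata suffixes are deliberately left out.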

So you want to buy a car, but you don’t know anything about them? Welcome to my life. You show up at the dealer and there’s a sticker on the window. You know the difference between make and model, but you soon learn what a trim is. Some versions come with leather. Some have a sunroof. Some have all-wheel drive. Some have 20k miles, and a similarly priced car in a higher trim has 40k miles.

In another post, I describe how I use the data that I’ve scraped, but I wanted to provide a more in-depth tutorial for those interested in how I got the data. Note that this data belongs to Truecar, so all uses herein are for personal and academic purposes only.

Get the data

In order to do any good analysis, you first need data. I prefer to have more data rather than less, where possible.

Bryan Whiting

father, innovator, data scientist

Data Scientist

Washington, D.C.