Last March, I wanted to break into the data science community, but I was struggling with confidence and starting to doubt I’d make it. I had been reading blog after blog about data science, practicing coding, and learning new methods at night. After rejection after rejection, I wondered if I was on the right path. Now that I’m on the other side, hindsight is 20/20. I write this post to my past self, a person seeking a job they couldn’t seem to get.

If you’ve ever wanted to tag your data science model, you’ve probably wondered how to version it. Which will it be: vx.4.1, v34.1231.51.21, or v91.x4.dev34? After reading about semantic versioning, I propose a method for versioning data science models.
Semantic Versioning for Software
Semantic versioning proposes the following: given a version number MAJOR.MINOR.PATCH, increment the MAJOR version when you make incompatible API changes, the MINOR version when you add functionality in a backwards-compatible manner, and the PATCH version when you make backwards-compatible bug fixes.
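As a rough illustration of the MAJOR.MINOR.PATCH rules quoted above, here is a minimal Python sketch. It covers only the software rules, not the model-versioning method proposed in the full post, and it assumes the usual semantic versioning convention that lower-order parts reset to zero on a bump. The function name `bump` and the change labels are hypothetical, chosen only for this example.

```python
def bump(version: str, change: str) -> str:
    """Return the next MAJOR.MINOR.PATCH version for a given kind of change.

    `change` is a hypothetical label: "major", "minor", or "patch".
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":   # incompatible API change
        return f"{major + 1}.0.0"
    if change == "minor":   # backwards-compatible new functionality
        return f"{major}.{minor + 1}.0"
    if change == "patch":   # backwards-compatible bug fix
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")


print(bump("1.4.2", "minor"))  # prints 1.5.0
```

Keeping the version as a plain string and parsing it on each call keeps the sketch short; a real project would more likely lean on an existing versioning library.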

So you want to buy a car, but you don’t know anything about them? Welcome to my life. You show up at the dealer and there’s a sticker on the window. You know the difference between make and model, but you soon learn what a trim is. Some versions come with leather. Some have a sunroof. Some have all-wheel drive. Some have 20k miles, and a similarly priced car in a higher trim has 40k miles.

In another post, I describe how I use the data that I’ve scraped, but I wanted to provide a more in-depth tutorial for those interested in how I got it. Note that this data belongs to Truecar, so all uses herein are for personal and academic purposes only.
Get the data
In order to do any good analysis, you first need data. I prefer to have more data rather than less, where possible.

I believe that each creation has four phases: dreaming, planning, acting, and reflecting. Think about it: is there anything you’ve ever made that didn’t follow this path? It first entered your mind, you came up with a game plan, you carried it out, and when you were done you could see what went well and where you improved. Isn’t this what scrums are really about? I wanted a scrum for my personal life, but I didn’t find it practical to use the many online resources available.

This post will outline what I see as the differences between Hugo and Jekyll, some benefits and drawbacks of hosting on Netlify vs. GitHub Pages, and how to launch the Hugo Tranquilpeak theme from scratch.
Why Hugo?
One of my first posts was about blogging with Jekyll hosted on GitHub. About six months after writing that post, I hit a few bugs and got frustrated trying to debug them, because I had already forgotten everything I had binge-learned earlier.

R users fall in love with ggplot2, the growing standard for data visualization in R. The ability to quickly visualize trends, and to customize just about anything you’d want, makes it a powerful tool. Yet this week, I made a discovery that may reduce how much I use ggplot2. Enter plot_ly(). For this post, I assume that you have a working knowledge of the dplyr (or magrittr) and ggplot2 packages.


Bryan Whiting

father, innovator, data scientist

Data Scientist

Washington, D.C.