Part I - Who are GitHub Users?

In this series of posts, we work through an exploratory data analysis from start to finish. With these pieces, we hope to give our current and future clients a glimpse of the work that goes on behind the scenes, and to give aspiring data scientists brief tutorials on how to approach an exploratory data problem.

Part II - Gathering the Data

11 MIN READ - 1/2/2017

This is the second post in a series of data science tutorials using GitHub profile data. We used Ruby to pull from GitHub’s API at the maximum rate and then used Unix tools and R to combine the data into one CSV.

Continue Reading ->

Part III - Localizing GitHub Profile Data

6 MIN READ - 1/3/2017

This is the third post in a series of data science tutorials using GitHub profile data. By running Wikipedia queries on location names from GitHub profiles, we created a dataset of longitudes and latitudes for GitHub users worldwide.

Continue Reading ->

Part IV - Mapping the USA

12 MIN READ - 1/4/2017

This is the fourth post in a series of data science tutorials using GitHub profile data. Using our dataset of GitHub user longitudes and latitudes from Part III, we visualize and analyze the distribution of user locations using R with maps and ggplot2.

Continue Reading ->