In this series of posts, we work through an exploratory data analysis from start to finish. With these pieces, we hope to give our current and future clients a glimpse into the type of work that goes on behind the scenes as well as provide brief tutorials to aspiring data scientists on how to approach an exploratory data problem.
This is the second post in a series of data science tutorials involving GitHub profile data. We used Ruby to pull from GitHub’s API at the maximum rate and then used Unix tools and R to combine the data into one CSV.
This is the third post in a series of data science tutorials using GitHub profile data. By running Wikipedia queries on location names from GitHub profiles, we created a dataset of longitudes and latitudes for GitHub users worldwide.
This is the fourth post in a series of data science tutorials using GitHub profile data. Using our dataset of GitHub user longitudes and latitudes from Part 3, we visualize and analyze the distribution of user locations in R with the maps and ggplot2 packages.