The most common environment for data analysis, if not the de facto standard, is Anaconda with Jupyter Notebook. In short, Anaconda is a package and environment manager, and Jupyter Notebook is a web-based tool that combines code, plots, and Markdown. This post introduces the basic concepts and commands of Anaconda.
Summary: Based on data from 2005 to 2014, the story shows the annual performance of Prosper, a P2P online lending platform, in listing volume, default rate, and lenders’ estimated return, broken down by rating and occupation. The story finds the occupations most likely to default, then focuses on the top three occupations by loan volume and explains their contribution to the platform.
This exploratory data analysis focuses on US Prosper loan data from 2005 to 2014. Prosper is a P2P online lending and investing platform for small businesses and individuals. The analysis focuses on two key ratings: “Credit Score” for borrowers and “Prosper Rating” for listings. Through EDA I would like to answer how Credit Score affects borrowers’ loans and how Prosper Rating affects investors.
This post is like a 101 course in data visualization. Anyone who starts exploring data analysis soon becomes familiar with the gapminder data set. I still remember taking the course at Udacity. Hans Rosling’s video is so cool that I would like to share a similar animated plot in R showing how the world has developed in both health and wealth over the past few decades.