


To demonstrate some of the practical attributes of RStudio, we implement below some of the machine learning models that Varian (2014) provides as R scripts alongside his Journal of Economic Perspectives paper.

Machine Learning - Revisiting Munnell, Tootell, Browne and McEneaney (1996)

We apply the Tidyverse suite to perform exploratory data analysis and then apply the machine learning libraries that Varian (2014) presents in an interesting and thought-provoking primer. Varian (2014) revisits the classic mortgage lending discrimination study by Munnell, Tootell, Browne, and McEneaney (1996) of the Boston Federal Reserve to show how it could be redone using machine learning techniques: conditional inference tree estimation and random forests. In this case, Varian (2014) observes that fitting several tree models to the 1990 mortgage origination data, some omitting race as an explanatory variable and one including it, produced mixed results and somewhat conflicting evidence. He notes that the race variable (black) shows up far down the conditional inference tree and seems to be relatively unimportant. To gauge its importance, Varian (2014) excludes race from the prediction and compares the result with the performance of the model that included it. When this is done, the accuracy of the tree-based model doesn't change at all: exactly the same cases are misclassified. Varian concedes, of course, that it is perfectly possible that there was racial discrimination elsewhere in the mortgage process, or that some of the included variables are highly correlated with race.
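The with-and-without comparison described above can be sketched in R as follows. This is a minimal illustration, not Varian's own script: it swaps in a CART tree from `rpart` (which ships with standard R distributions) in place of a conditional inference tree, it measures in-sample error only, and the helper name `compare_with_without` is hypothetical. The commented call at the end assumes the `Ecdat` package is installed and that its `Hdma` data set has columns named `deny` and `black`.

```r
library(rpart)  # CART trees; used here as a stand-in for a conditional inference tree

# Hypothetical helper: fit a classification tree with and without `drop_var`
# and compare the in-sample misclassifications of the two fits.
compare_with_without <- function(data, outcome, drop_var) {
  data <- data[complete.cases(data), ]           # drop rows with missing values
  tree_formula <- as.formula(paste(outcome, "~ ."))
  reduced <- data[, setdiff(names(data), drop_var), drop = FALSE]

  fit_full    <- rpart(tree_formula, data = data,    method = "class")
  fit_reduced <- rpart(tree_formula, data = reduced, method = "class")

  actual       <- as.character(data[[outcome]])
  pred_full    <- as.character(predict(fit_full,    newdata = data,    type = "class"))
  pred_reduced <- as.character(predict(fit_reduced, newdata = reduced, type = "class"))

  wrong_full    <- pred_full    != actual
  wrong_reduced <- pred_reduced != actual

  list(
    error_full               = mean(wrong_full),     # share misclassified, all variables
    error_reduced            = mean(wrong_reduced),  # share misclassified, drop_var omitted
    same_cases_misclassified = identical(wrong_full, wrong_reduced)
  )
}

# Example call (assumes Ecdat is installed; column names are assumptions):
# data(Hdma, package = "Ecdat")
# compare_with_without(Hdma, outcome = "deny", drop_var = "black")
```

If `same_cases_misclassified` comes back `TRUE`, the omitted variable made no difference to which cases the tree gets wrong, which is the pattern Varian (2014) reports for race. To follow his approach more closely, `ctree()` from the `partykit` package could replace `rpart()`.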

RStudio Cloud is a hosted version of RStudio that makes it easy to engage in data analysis on the go, even for the tablet warriors among you. Cloud-based computing means storing and accessing data and programs over the Internet instead of on your computer's hard drive, which allows for greater mobility and more opportunities to collaborate. If you are just running R scripts, RStudio Cloud is a natural choice for setting up and executing data projects. It replicates the Desktop experience seamlessly, even on basic tablets, and has the virtue of permitting projects to be saved to the cloud. It also permits real-time collaboration among colleagues working on shared projects.
