On Programming Languages; Why My Dad Went From Programming to Driving a Bus

In Data Science from Scratch, a book introducing data science using Python, Joel Grus said the following about R (pg. 302):

Although you can totally get away with not learning R, a lot of data scientists and data science projects use it, so it’s worth getting familiar with it.

In part, this is so that you can understand people’s R-based blog posts and examples and code; in part, this is to help you better appreciate the (comparatively) clean elegance of Python; and in part, this is to help you be a more informed participant in the never-ending “R versus Python” flamewars.



Where to Go from Here? Tips for Building Up R Experience

At the University of Utah, I teach the R lab that accompanies MATH 3070, “Applied Statistics I.” None of my students are presumed to have any programming experience, and they never hesitate to remind me of that fact, especially when they are starting out. When I create assignments and pick problems, I can often write a one- to three-line solution in thirty seconds to a problem that students will sometimes spend four hours trying to solve. They then see my solution and slap their foreheads at its simplicity. I can be tricky with my solutions. For example, suppose you wish to find the sample proportion for a certain property. A common approach (or at least the one used in the textbook our course uses, Using R for Introductory Statistics by John Verzani) looks like this:


SSA Baby Names Visualization with R and Shiny

Capital One contacted me a few months ago and asked me to apply for an internship with them in a data-science-related position. I never got the job (nor did I really want it; I had already agreed to teach during the summer, and I was apprehensive about leaving people hanging and about moving), but I did go through part of the interview process. Capital One had me complete their data science challenge, which consisted of problems that were supposedly common tasks in data science. I was not well equipped for some of it, such as regression; I was used to regression from an econometric point of view, not a computer science or data science point of view, and I was still learning. But there was one part of the challenge that I remember very well, and I was very happy with my solution.
