Around September of 2016 I wrote two articles on using Python for accessing, visualizing, and evaluating trading strategies (see part 1 and part 2). These have been my most popular posts, up until I published my article on learning programming languages (featuring my dad’s story as a programmer), and has been translated into both Russian (which used to be on backtest.ru at a link that now appears to no longer work) and Chinese (here and here). R has excellent packages for analyzing stock data, so I feel there should be a “translation” of the post for using R for stock data analysis.
In Data Science from Scratch, a book introducing data science using Python, Joel Grus said the following about R (pg. 302):
Although you can totally get away with not learning R, a lot of data scientists and data science projects use it, so it’s worth getting familiar with it.
In part, this is so that you can understand people’s R-based blog posts and examples and code; in part, this is to help you better appreciate the (comparatively) clean elegance of Python; and in part, this is to help you be a more informed participant in the never-ending “R versus Python” flamewars.
At the University of Utah, I teach the R lab that accompanies MATH 3070, “Applied Statistics I.”” None of my students are presumed to have any programming experience, and they never hesitate to remind me of that fact, especially when they are starting out. When I create assignments and pick problems, I often can write a one- or three-line solution in thirty seconds that students will sometimes spend four hours trying to solve. They then see my solution and slap their foreheads at its simplicity. I can be tricky with my solutions. For example, suppose you wish to find the sample proportion for a certain property. A common approach (or at least the one used in the textbook our course uses, Using R for Introductory Statistics by John Verzani) looks like this:
This post is the first in a two-part series on stock data analysis using Python, based on a lecture I gave on the subject for MATH 3900 (Data Science) at the University of Utah. In these posts, I will discuss basics such as obtaining the data from Yahoo! Finance using pandas, visualizing stock data, moving averages, developing a moving-average crossover strategy, backtesting, and benchmarking. The final post will include practice problems. This first post discusses topics up to introducing moving averages.
NOTE: The information in this post is of a general nature containing information and opinions from the author’s perspective. None of the content of this post should be considered financial advice. Furthermore, any code written here is provided without any form of guarantee. Individuals who choose to use it do so at their own risk.