Around September 2016 I wrote two articles on using Python to access and visualize stock data and to evaluate trading strategies (see part 1 and part 2). These were my most popular posts until I published my article on learning programming languages (featuring my dad’s story as a programmer), and they have been translated into both Russian (formerly on backtest.ru, at a link that no longer appears to work) and Chinese (here and here). R has excellent packages for analyzing stock data, so I feel there should be a “translation” of the post for using R for stock data analysis.
This post is the first in a two-part series on stock data analysis using Python, based on a lecture I gave on the subject for MATH 3900 (Data Science) at the University of Utah. In these posts, I will discuss basics such as obtaining the data from Yahoo! Finance using pandas, visualizing stock data, moving averages, developing a moving-average crossover strategy, backtesting, and benchmarking. The final post will include practice problems. This first post discusses topics up to introducing moving averages.
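As a rough preview of where the series is headed, here is a minimal sketch of a moving-average crossover signal in pandas. It is only an illustration under stated assumptions: the prices are synthetic random-walk data rather than Yahoo! Finance quotes, and the function name `moving_average_crossover`, the 20- and 50-day windows, and the `regime` column are choices made for this example, not necessarily what the posts themselves use.

```python
import numpy as np
import pandas as pd

# Synthetic daily closing prices (made-up random walk, NOT real market data).
rng = np.random.default_rng(0)
dates = pd.bdate_range("2016-01-04", periods=250)
close = pd.Series(100 + rng.normal(0, 1, size=250).cumsum(),
                  index=dates, name="close")


def moving_average_crossover(close, fast=20, slow=50):
    """Hypothetical helper: compute fast/slow moving averages and a regime.

    regime is +1 when the fast average is above the slow one, -1 when it is
    below, and 0 when they are equal or the slow average is not yet defined.
    """
    out = pd.DataFrame({"close": close})
    out["fast_ma"] = close.rolling(window=fast).mean()
    out["slow_ma"] = close.rolling(window=slow).mean()
    diff = out["fast_ma"] - out["slow_ma"]
    # np.sign keeps NaN for the warm-up rows; treat those as "no position".
    out["regime"] = np.sign(diff).fillna(0).astype(int)
    return out


signals = moving_average_crossover(close)
print(signals.tail())
```

A change in `regime` from one day to the next marks a crossover, which is the kind of event a crossover strategy would trade on. Swapping the synthetic series for real closing prices should only require replacing `close`.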
NOTE: The information in this post is of a general nature containing information and opinions from the author’s perspective. None of the content of this post should be considered financial advice. Furthermore, any code written here is provided without any form of guarantee. Individuals who choose to use it do so at their own risk.
CapitalOne contacted me a few months ago and asked me to apply for an internship in a data-science-related position. I never got the job (nor did I really want it; I had already agreed to teach during the summer, and I was apprehensive about leaving people hanging and about moving), but I did go through part of the interview process. CapitalOne had me complete their data science challenge, which consisted of problems that were supposedly common tasks in data science. I was not well equipped for some of it, such as regression; I was used to regression from an econometric point of view, not a computer science or data science point of view, and I was still learning. But there was one part of the challenge that I remember very well, and I was very happy with my solution.