Stock Data Analysis with Python (Second Edition)


This is a lecture for MATH 4100/CS 5160: Introduction to Data Science, offered at the University of Utah, introducing time series data analysis applied to finance. It also updates my earlier blog posts on the same topic, combining them into one. I strongly advise referring to this post instead of the previous ones (which I am leaving unaltered for the sake of preserving a record). The code should work as of July 7th, 2018. (And sorry for some of the formatting; the blogging platform’s free version doesn’t play nice with code or tables.)



Transaction Costs are Not an Afterthought; Transaction Costs in quantstrat

DISCLAIMER: Any losses incurred based on the content of this post are the responsibility of the trader, not the author. The author takes no responsibility for the conduct of others nor offers any guarantees.

Introduction: Efficient Market Hypothesis

Burton Malkiel, in the finance classic A Random Walk Down Wall Street, made the accessible, popular case for the efficient market hypothesis (EMH). One can sum up the EMH as, “the price is always right.” No trader can know more about the market; the market price for an asset, such as a stock, is always correct. This means that trading, which relies on forecasting the future movements of prices, is as profitable as forecasting whether a coin will land heads-up; in short, traders are wasting their time. The best one can do is buy a large portfolio of assets representing the composition of the market and earn the market return rate (about 8.5% a year). Don’t try to pick winners and losers; just pick a low-expense, “dumb” fund, and you’ll do better than any highly-paid mutual fund manager (who isn’t smart enough to be profitable).


An Introduction to Stock Market Data Analysis with Python (Part 2)

THIS POST IS OUT OF DATE: AN UPDATE OF THIS POST’S INFORMATION IS AT THIS LINK HERE! (Also I bet that just garbled the code in this post.)

I’m keeping this post up for the sake of preserving a record.

This post is the second in a two-part series on stock data analysis using Python, based on a lecture I gave on the subject for MATH 3900 (Data Mining) at the University of Utah (read part 1 here). In these posts, I will discuss basics such as obtaining the data from Yahoo! Finance using pandas, visualizing stock data, moving averages, developing a moving-average crossover strategy, backtesting, and benchmarking. This second post covers devising a moving-average crossover strategy, backtesting, and benchmarking, along with practice problems for readers to ponder.

NOTE: The information in this post is of a general nature containing information and opinions from the author’s perspective. None of the content of this post should be considered financial advice. Furthermore, any code written here is provided without any form of guarantee. Individuals who choose to use it do so at their own risk.
