Curtis Miller's Personal Website

2017-05-22T10:09:49-06:00

I gave up on quantstrat 5 years ago when I was still in uni. I never liked its architecture and could never tell what was going to work and what was not. Also, its R code is packed into long functions/methods. At one point I even wrote my own 500-line backtester, which at the time was tens of times faster than quantstrat, but then I gave up because I realised the following:

A proper object-oriented approach helps in building a flexible and extensible framework. And I agree with you: Python is your friend. I would recommend QSTrader. I have not looked at it properly yet, but its event-driven approach will bring you closer to real trading (at the expense of speed, of course). Zipline (better developed and more functional) is a quite mature framework, but its architecture is a bit too messy for what it provides. In my opinion, that is because it was developed to suit the needs of Quantopian rather than to provide a “fork and go” package that is easy to use and modify. Having said that, I know a few guys who are using it successfully and not complaining. 🙂


2017-05-22T13:06:34-06:00

Once again, I really liked reading your blog post on trading in R. I really hope you do not stick only to Python, as I work only in R. That said, I hope you try programming trading models again with tidyquant/tidyverse (“Let’s try re-imagining what a better R backtesting package might look like, one that takes lessons from the tidyverse.”).


2017-05-22T16:03:14-06:00

I would like to revisit it at some point. I think that if I wrote a package that does what I described in the post, it could be really popular and it would be my way of leaving my mark on the R universe, perhaps leading to jobs, talks, seminars and book deals. That opportunity really appeals to me. (If someone else writes it, I’d at least hope they’d mention me in credits.) And there are things that R does really well, even quantstrat; I was impressed by the use of quote() and functions that in principle anyone could write and substitute in, and I think R would handle that kind of functionality better than Python. But it would be a project pushing my limits, I doubt I know enough right now to write the best package, I’m already overburdened with other obligations, and perhaps by seeing how things are done in Python I will have better ideas of what a tidyverse version of a backtesting package would do.


2017-05-22T17:58:33-06:00

Nice post Curtis. You’re definitely onto something. These packages are a big commitment. Python may have some advantages in OO, but R has advantages in infrastructure and time series handling that you might not be considering in the switch. At any rate, I like your idea and would like to see where you’d take it.


2017-05-22T18:08:46-06:00

The loss of the time series functionality is not lost on me. I have no idea how that would be replaced in Python besides writing it up from scratch, or using rpy2 to use the R stuff. I would not be too surprised if I end up coming back to R.

One way or another I was going to explore the Python backtesting universe anyway, since I think there’s a lot of demand for that content. I’ve been planning to look at it for a while now.


2017-05-22T19:23:00-06:00

Let us know how you make out with Python. I’m a bit biased, but I agree that exploration will only help.

Btw, the big advantage of tidyquant is scalability. Not that you can’t scale from 1 to 500+ stocks in quantmod/xts; tq just makes things a bit easier. In R there are always at least six ways to do something. As you show with quantstrat, sometimes it can be overly cumbersome. tq aims to make the motion from import (tq_get), to transform (tq_transmute), to aggregate (tq_portfolio), to analyze/model (tq_performance) as simple and modular as possible.

For your post, I don’t think tidyquant will offer as much benefit since you are analyzing a single stock. With that said, tq is under major development. Let’s see where it goes 😉


2017-05-22T18:19:08-06:00

To make a backtesting framework fast you need to drop cumbersome objects like xts altogether; they are very slow. Single-item indexing is slow, while vectorized stuff is a bit faster. R is a statistical toolkit, not a development language. Only recently have I started seeing posts about test-driven development in R. How can you build software in an interpreted language without tests and without leaving a trail of bugs? I am a fan of R myself and have been using it for many years now, but use the right tool for the right job. E.g., I use it as a source repository for algorithms I can’t find anywhere else and sometimes call them from Python. Its main purpose is data analysis and statistics. I understand the temptation to build everything in R, but don’t be religious and/or lazy: try other languages and compare.


2017-05-27T11:55:17-06:00

I’m very interested in specific examples of what makes xts objects cumbersome and slow, because the entire point of the package is to make handling time-series data fast and easy.

You can certainly write simple backtesters using a matrix (which will sometimes be faster than xts). And you can write *really* fast backtests if you assume that all operations are *not* path dependent. But almost all of the backtests the developers of quantstrat are concerned with involve order management, which is path-dependent by definition. You need to know where your orders are.
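To make the path-dependence point concrete, here is a minimal base-R sketch; it is not quantstrat code. The simulated prices, the 10/30 moving-average rule, and the 5% stop-loss are all assumptions chosen purely for illustration. The first backtest is a handful of vector operations; the second needs a loop because today's position depends on where the trade was entered.

```r
# Toy daily prices (simulated; not real market data)
set.seed(1)
price <- 100 + cumsum(rnorm(250))

# Trailing simple moving average (NA until n observations are available)
sma <- function(x, n) as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))
fast <- sma(price, 10)
slow <- sma(price, 30)

# Signal at each close: 1 = want to be long, 0 = want to be flat
signal <- ifelse(!is.na(slow) & fast > slow, 1, 0)
ret <- c(0, diff(price) / head(price, -1))

# 1) Path-independent: today's position is just yesterday's signal
pos_vec <- c(0, head(signal, -1))
pnl_vec <- sum(pos_vec * ret)

# 2) Path-dependent: add a 5% stop-loss measured from the entry price.
#    The position now depends on when we entered, so we walk bar by bar.
pos_loop <- numeric(length(price))
holding <- FALSE
stopped <- FALSE
entry <- NA_real_
for (t in 2:length(price)) {
  want <- signal[t - 1] == 1
  if (!want) {
    holding <- FALSE
    stopped <- FALSE              # signal reset re-arms future entries
  } else if (holding && price[t - 1] <= 0.95 * entry) {
    holding <- FALSE
    stopped <- TRUE               # stop hit: stay flat until the signal resets
  } else if (!holding && !stopped) {
    holding <- TRUE
    entry <- price[t - 1]
  }
  pos_loop[t] <- as.numeric(holding)
}
pnl_loop <- sum(pos_loop * ret)
```

The stop can only remove exposure, so `pos_loop` is never larger than `pos_vec`. Real order management (partial fills, order sizing, multiple open orders) adds far more state than this single flag.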


2017-05-27T12:25:38-06:00

I have not had any complaints about xts. It seems to make working with data pretty easy. That said, I’ve never worried about performance.


2017-05-27T15:08:37-06:00

Ok, lets make it clear. I am not saying that xts is very inefficient. I am sure the people who wrote it did their best to make it efficient. Can it be more efficient? Probably, but I am not sure. It is the same with Python’s pandas library. Both build on base data types to provide new functionality and convenience methods. Convenience is inversely related to speed. Whatever you do, xts will always be slower than a simple vector.

I am not suggesting that a replacement for xts should be developed. What I am saying is that for an efficient backtesting system (probably non-event-driven) I would use a simple vector. xts is one-size-fits-all at the expense of speed. Writing your own custom code will make things faster, as you will not be calling and checking things that suit others but not you.

These times might be small. However, if you are relying on optimization and parameter mining rather than on the complexity of your algorithms, this will make an enormous difference. I tested this approach myself quite a while ago, and for simple moving average strategies it ran dozens of times faster than quantstrat.

And yeah, for non-vectorized indexing xts is terrible. I know that for vectorized stuff it works well. Also, its binary (?) search is reasonably fast too, so it is a good tool for indexing on dates if you cannot avoid it or are simply doing some data analysis.

But as I said, to make your backtester more efficient you simply don’t index on dates…

require(quantmod)

xts_object = getSymbols("YHOO", src = "google", auto.assign = FALSE)
mat = coredata(xts_object)
vec = mat[, 1]

> system.time(replicate(10000, {xts_object[1, 1]}))
   user  system elapsed
  0.297   0.002   0.311
> system.time(replicate(10000, {mat[1, 1]}))
   user  system elapsed
  0.015   0.000   0.015
> system.time(replicate(10000, {vec[1]}))
   user  system elapsed
  0.006   0.000   0.007


2017-07-10T12:22:21-06:00

In your last comment, you say, “Ok, lets make it clear. I am not saying that xts is very inefficient.” and then “for non-vectorized indexing xts is terrible.” I don’t consider “not very inefficient” and “terrible” as synonymous.

I recall having this same discussion with you privately via email about 3 years ago. In that conversation, I explained why xts is slower than matrix for this one specific case (i.e. x[1,1]): because the matrix subsetting C code has an optimization for it. In all other subsetting operations, xts is either similar to or faster than matrix:

	library(xts)
	library(microbenchmark)
	set.seed(21)
	m <- matrix(1,5e6,50)
	x <- .xts(m,1:nrow(m))
	i <- sort(sample(nrow(m),2e4))
	j <- sample(50, 10)
	I <- i[1]
	J <- j[1]
	microbenchmark(m[i,], x[i,], unit="us") # faster than matrix for rows
	microbenchmark(m[,J], x[,J], unit="us") # faster than matrix for 1 column
	microbenchmark(m[,j], x[,j], unit="us") # faster than matrix for columns
	microbenchmark(m[i,J], x[i,J], unit="us") # similar for rows and 1 column
	microbenchmark(m[i,j], x[i,j], unit="us") # faster for rows and columns
	microbenchmark(m[I,J], x[I,J], unit="us") # "terrible" for single-element


Perhaps you don’t recall our email exchange. But I wanted to point out that your comment essentially cherry-picked the only subsetting operation I can find where xts is noticeably slower than matrix. And the “terrible” performance is still something that only takes a handful of microseconds for a large object on my laptop… so it’s difficult to make a case that it’s a practical performance issue.

While you did get >10x performance improvement from your code by switching from xts to matrix, you also lost all safety that your market data are ordered correctly. That’s not a trade-off I would make.
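The trade-off described here can be sketched in base R. This is only an illustration of the two positions in this thread, not code from either commenter: the dates and prices are made up, and the up-front ordering check is a hypothetical stand-in for the guarantee that xts provides automatically.

```r
# Toy data held in plain vectors (simulated; not real market data)
dates <- as.Date("2017-01-02") + 0:9
close <- c(10, 11, 12, 11, 13, 14, 13, 15, 16, 15)

# A plain vector will not complain if rows are shuffled, so check the
# ordering once, up front. This is the safety that is lost without xts.
stopifnot(all(diff(as.numeric(dates)) > 0), length(dates) == length(close))

# Translate the dates of interest to integer positions once ...
start <- match(as.Date("2017-01-04"), dates)
end   <- match(as.Date("2017-01-09"), dates)

# ... then index by position inside the hot loop, never by date
total <- 0
for (i in start:end) total <- total + close[i]
```

The check runs once rather than on every subset, which is where the speed comes from; anything that reorders or appends data afterwards is no longer protected.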


2017-05-22T18:13:17-06:00

Nobody denies that quantstrat has a learning curve, as do other packages in the ReturnAnalytics family (e.g., PerformanceAnalytics, PortfolioAnalytics, etc.).
HOWEVER:
Note well that they are not packages deliberately made for retail consumption. They are packages created for industrial-level research, and are written by very high-level/senior quants and portfolio managers, and process backtests on swathes of tick data for trading analysis, or thousands of assets with complicated objective functions for portfolio management.
These tools have complexity because real-world trading gets very complex, and their features are necessary to model those dynamics.
I would recommend making liberal use of the R-SIG Finance mailing list for any inquiries you may have as to mechanics, as it may improve development in these packages, rather than giving up and moving on to more simplistic tools.
I’m not sure who wrote QSTrader in Python (if it’s Rob Carver, then it’s a safe bet to use), but zipline is behind quantstrat by a fair margin.


2017-05-22T18:18:49-06:00

Thanks for sharing this, Ilya.

I’ve signed up to the mailing list. Thanks for the tip.


2018-07-02T08:23:50-06:00

I’m not sure who wrote QSTrader in Python (if it’s Rob Carver, then it’s a safe bet to use)

Not me. It was Mike Halls-Moore, now at AHL:

https://github.com/mhallsmoore/qstrader


2017-05-27T11:56:36-06:00

Take a look at my blotter/R6 experiments in the sandbox/ directory of the blotter GitHub repo: https://github.com/braverock/blotter/blob/master/sandbox/blotter_R6.R


2017-05-27T14:18:15-06:00

I tried running the code in this file on my system, but I got errors for the Account and Portfolio objects:
Error in R6Class(classname = "Account", public = list(portfolios = NULL, :
All elements of public, private, and active must be named.

(Same for “Portfolio”)

I did read through the code and I like the idea. I like having portfolios and accounts be objects instead of locked away in an environment, and interacting with them using methods. (The Python packages I was looking at seem to use that approach, at least for strategies; backtrader I think treats strategies as objects with methods that you can customize. It is intuitive.)

I didn’t know about R6 until I saw this and I read through a demo of what it does, and I think it’s a good idea. Even if others don’t like that approach and want to use, say, magrittr pipes %>%, I don’t think it would be too difficult to write functions that serve as a “tidyverse interface” for the objects and thus allow that dialect to be used, if someone wanted. (Not that there’s anything wrong with method chaining, which I would like to see supported where appropriate.)

I thought about how that approach would solve the problem I set out initially to do with this post: the type of walk-forward analysis I wanted. I think it would be a lot easier using the method you’re suggesting: I could create a collection of these objects, backtest as I want in a loop, like lapply or something, and then collect the stats I want for each.

Basically, I think this approach is more flexible and more easily allows other tools to be combined with blotter’s objects, and I’m assuming that quantstrat strategies would also start to take a similar approach.
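As a rough illustration of that idea, here is a minimal sketch using the R6 package. The `Portfolio` class, its `add_txn()` and `equity()` methods, the toy prices, and the three windows are all hypothetical; none of this is blotter's actual API or the design in the sandbox file.

```r
library(R6)  # assumes the R6 package is installed

# Hypothetical, minimal Portfolio class (NOT blotter's API)
Portfolio <- R6Class("Portfolio",
  public = list(
    cash = NULL,
    position = 0,
    initialize = function(cash = 1e4) {
      self$cash <- cash
    },
    # Record a transaction; return the object invisibly so calls can chain
    add_txn = function(qty, price) {
      self$cash <- self$cash - qty * price
      self$position <- self$position + qty
      invisible(self)
    },
    # Mark-to-market equity at a given price
    equity = function(price) {
      self$cash + self$position * price
    }
  )
)

# Toy price path and three hypothetical walk-forward test windows
price <- c(100, 102, 101, 105, 107, 104, 108, 110, 109)
windows <- list(1:3, 4:6, 7:9)

# One independent Portfolio per window: buy 10 at the first bar,
# sell 10 at the last bar, then collect the profit for that window
results <- lapply(windows, function(idx) {
  first <- price[idx[1]]
  last <- price[idx[length(idx)]]
  p <- Portfolio$new(cash = 1e4)
  p$add_txn(10, first)$add_txn(-10, last)
  p$equity(last) - 1e4
})
profits <- unlist(results)
```

Because each window gets its own object, nothing has to be reset in a shared environment between runs, which is the flexibility described above.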


2017-05-29T06:54:27-06:00

Sorry it didn’t run. I should have noted that it’s 3+ years old and I haven’t run it since then. I mainly wanted to point you to the API and notes.

You should be able to use pipes with that design, since functions would generally return the object invisibly. While I generally prefer a pure functional design, I’m not sure it’s the best choice here. I don’t think a pure functional approach would be terrible for a strategy class, but it might be cumbersome for a portfolio class inside a strategy…


2017-05-31T11:34:53-06:00

I don’t see a problem with you writing your own. That’s the point of programming. If something doesn’t work for you, fix it. Give back to the community that allows you to receive a degree and an income. R and Python are best used together.

2019-06-28T15:15:47-06:00

I’m reading this post as a newcomer to quantstrat having exactly the same problem you described here, but just over two years after you posted this. So I’m interested in where you ended up. Did Josh Ulrich convince you to stick with it? Did Matt Dancho convince you to go whole-hog with tidyquant? Or are you running your customized Python version of a backtester? How about an update? 🙂

The End of the Honeymoon: Falling Out of Love with quantstrat


26 thoughts on “The End of the Honeymoon: Falling Out of Love with quantstrat”
