It’s been a long time since I wrote a blog post. Actually, this is my 100th post, according to my metrics. I wanted to write a fancy article about my history of blogging: how I started because I simply couldn’t justify *not* blogging anymore, how my articles on stock market data went surprisingly viral, brought in far more daily views than I ever expected, and even landed me book deals, and how I was never able to get hits like those again. I wanted a deep discussion of my blogging history and what I really like writing about.

# Statistics and Data Science

Blog posts related to statistics and data science.

# Learn Python Statistics and Machine Learning with My New Book: Training Systems using Python Statistical Modeling

Packt Publishing has turned another one of my video courses, *Training Your Systems with Python Statistical Modeling*, into a book! This book is now available for purchase.

# Grades Aren’t Normal

## Introduction

*This article is also available in PDF form.*

A while back someone posted on Reddit about the grading policies of their academic department. Specifically, the department chair made a statement claiming that grades should be Normally distributed with a C average. I responded, claiming that no statistician would ever take the idea that grades follow a Normal distribution seriously. Some asked for context, and I wrote a long response explaining my position. I repeat that argument here, and also give some R code demonstrations showing what curving grades does.

# CPAT and the Rényi-Type Statistic; End-of-Sample Change Point Detection in R

*This article is also available in PDF form.*

## Introduction

I started my first research project as a graduate student when I was only in the MSTAT program at the University of Utah, at the very end of 2015 (or very beginning of 2016; not sure exactly when) with my current advisor, Lajos Horváth. While I am disappointed it took this long, I am glad to say that the project is finished and I am finally published.

# What is the probability that in a box of a dozen donuts picked from 14 flavors there’s no more than 3 flavors in the box?

## Problem

Dave’s Donuts offers 14 flavors of donuts (consider the supply of each flavor as being unlimited). The “grab bag” box consists of flavors randomly selected to be in the box, each flavor equally likely for each one of the dozen donuts. What is the probability that at most three flavors are in the grab bag box of a dozen?
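One way to check an answer to this problem is to count directly. The sketch below is not taken from the original post; it assumes each of the 12 donuts is drawn independently and uniformly from the 14 flavors, as the problem states, and uses inclusion-exclusion to count the boxes that use exactly k flavors. The function names `surjections` and `prob_at_most` are made up for illustration.

```python
from fractions import Fraction
from math import comb


def surjections(n, k):
    """Number of ways to assign n distinguishable donuts to k flavors
    so that every one of the k flavors appears at least once
    (inclusion-exclusion over the flavors left unused)."""
    return sum((-1) ** j * comb(k, j) * (k - j) ** n for j in range(k + 1))


def prob_at_most(flavors, donuts, max_distinct):
    """Exact probability that a box of `donuts` donuts, each drawn
    independently and uniformly from `flavors` flavors, contains
    at most `max_distinct` distinct flavors."""
    favorable = sum(
        comb(flavors, k) * surjections(donuts, k)  # choose the k flavors, then use all of them
        for k in range(1, max_distinct + 1)
    )
    return Fraction(favorable, flavors ** donuts)


p = prob_at_most(14, 12, 3)
print(p, float(p))
```

Using `Fraction` keeps the result exact, which makes it easy to compare against a hand derivation before converting to a decimal.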

# Introducing Rank Data Analysis with Arkham Horror Data

## Introduction

Last week I analyzed player rankings of the Arkham Horror LCG classes. This week I explain what I did in the data analysis. As I mentioned, this is the first time that I attempted inference with rank data, and I discovered how rich the subject is. A lot of the tools for the analysis I had to write myself, so you now have the code I didn’t have access to when I started.

# Comparing the Classes of Arkham Horror; Why Survivors Need Work

## Introduction

This blog post was prompted by this meme posted in the Arkham Horror: The Card Game Facebook group:

# Problems in Estimating GARCH Parameters in R (Part 2; rugarch)

## Introduction

Now here is a blog post that has been sitting on the shelf far longer than it should have. Over a year ago I wrote an article about problems I was having when estimating the parameters of a GARCH(1,1) model in R. I documented the behavior of parameter estimates (with a focus on ) and perceived pathological behavior when those estimates are computed using **fGarch**. I called for help from the R community, including sending out the blog post over the R Finance mailing list.

# Making a Profit with Henry Wan in Arkham Horror: The Card Game

## Introduction

The *Forgotten Age* cycle of Arkham Horror is at a close and Fantasy Flight Games already announced the next cycle, *The Circle Undone*. Not only that, they’ve announced two mythos packs at a rate that… surprised me. A new cycle announcement and two mythos pack announcements in less than two months? Am I the only one who finds the new pace of announcements surprising? Perhaps that means they want to get product out at a faster pace?

# The Distribution of Time Between Recessions: Revisited (with MCHT)

## Introduction

These past few weeks I’ve been writing about a new package I created, **MCHT**. Those blog posts were basically tutorials demonstrating how to use the package. (Read the first in the series here.) I’m done for now explaining the technical details of the package. Now I’m going to use the package for the purpose I initially had: exploring the distribution of time separating U.S. economic recessions.