This news is a few weeks late, but better late than never!
At the end of June I published my first video course, Unpacking NumPy and Pandas. In the blog post announcement, I said that the course was the first in a series of courses on using Python for data analysis. I recently published the next course in the series with Packt Publishing, entitled Data Acquisition and Manipulation with Python.
This course is effectively two courses in one. Half of the course is devoted to working with what some may call “clean and structured” data, focusing on loading in this data and intermediate Pandas usage, such as aggregation, data reshaping, and data transformation. The other half of the course focuses on tapping the vast database otherwise known as the Internet to create new datasets. The Internet is filled with data that is “messy” and this course introduces viewers to the tools to grab and clean this data.
The course is roughly two-and-a-half hours and is divided into six sections. Their topics are:
- Loading data from “structured” sources, such as CSV files, JSON files, Excel workbooks, and especially databases (MySQL in particular).
- Data wrangling, including combining datasets, data reshaping and transformation, and string manipulation.
- Grouping and data aggregation, including creating pivot tables and performing cross-tabulation.
- Using BeautifulSoup for web scraping HTML.
- Using Selenium for web scraping and interacting with web pages “like a human.”
- Using Scrapy for crawling the web.
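To give a flavor of the grouping, pivot table, and cross-tabulation topics in the list above, here is a minimal sketch using Pandas. The data and column names (`region`, `year`, `state`, `population`) are made up for illustration; this is not the Census dataset or the code from the course, just one plausible way such operations look.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only
# (not the U.S. Census Bureau dataset used in the course).
df = pd.DataFrame({
    "region": ["West", "West", "South", "South", "West", "South"],
    "year": [2010, 2011, 2010, 2011, 2010, 2011],
    "state": ["A", "A", "B", "B", "C", "C"],
    "population": [100, 110, 200, 190, 50, 60],
})

# Grouping and aggregation: total population per region.
totals = df.groupby("region")["population"].sum()

# Pivot table: regions as rows, years as columns, summed population in cells.
pivot = df.pivot_table(index="region", columns="year",
                       values="population", aggfunc="sum")

# Cross-tabulation: number of rows for each region/year pair.
counts = pd.crosstab(df["region"], df["year"])

print(totals)
print(pivot)
print(counts)
```

The pivot table and the cross-tabulation share the same row and column layout; the difference is that `pivot_table` aggregates a value column, while `crosstab` counts occurrences by default.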
I load the videos with demonstrations and example code. Often a section includes a dataset (which comes with the course) or application that serves as a common thread through the videos. For instance, in the first three sections I frequently revisit a dataset filled with population data provided by the U.S. Census Bureau, and we see how to manipulate this dataset in ways that are sometimes insightful on their own and sometimes would serve as a first step for a more involved analysis. On the other hand, each of the last three sections has its own application, such as extracting Nobel laureate birthdays from Wikipedia, following Google’s “Searches related to” feature, or crawling through Reddit threads.
The videos in the course, narrated by me, include not only an explanation of the topic at hand but also interactive demonstrations, so viewers can see how to use the software and follow along if they so desire. The video course includes the Jupyter notebooks I use in my demonstrations; viewers can run my code blocks to replicate my results and edit them for their own experimentation.
You can buy the course on Packt’s website. If the price of the course is an obstacle, perhaps consider watching it on Mapt, Packt’s subscription service, which gives you access not only to my course but everything Packt has published (and they publish a ton of stuff), along with one free book of your choice to keep every month (likely without any DRM; just a plain ol’ PDF). (I also hear that Packt’s videos are available on other services, such as Lynda, but don’t quote me on that.)
I personally enjoyed writing this course more than Unpacking NumPy and Pandas, as it uses Python in a more creative way (though I consider that course to be a prerequisite). I enjoyed creating the examples seen in the videos, especially the web scraping examples. I hope my viewers enjoy the course as well.
I thank Packt for publishing this course. I also thank my editor, Viranchi Shetty, for offering feedback and keeping me on schedule. The editors at Packt had a big impact on the final product. I look forward to publishing my next course, on machine learning, with them. Expect that within a few months. I’m enjoying writing that course even more (yes, I’m working on it now), and I’m sure many will find it enlightening.
If you like my blog and would like to support me, perhaps consider purchasing the course. If you have no need for it or don’t have the money to spend (which I understand completely; I don’t live a life of glamour myself, being a graduate student), I’d love for you to spread the word about the course. Tell a friend wanting to get started in data analysis or data science, or even share this post on Facebook or Twitter or whatever your preferred social network is. Directing more eyes to the course helps. Write a review if you have watched it; I would love to hear your feedback, both positive and negative (though if negative, be gentle and constructive please).
My website has a new page for the new course here.
Thanks for reading!
If you want to know more about what the course is like, below are some of the videos included in the course.