Tom G Herman Harvard Data Mining Class

Harvard Extension School Data Mining for Business (CSCI E-96) Course Review

I took this class during the fall 2024 semester as a continuation of my quest to earn the Data Analytics Graduate Certificate from Harvard Extension School (HES). This experience was very different from my first class at HES (Intro to Statistics and Applied Data Analysis) the summer before.

My previous class was structured around weekly problem sets plus midterm and final exams. In this Data Mining class, my grade was based solely on my performance analyzing and presenting three fictional business cases over the semester.

The thing I appreciated most about this class was the professor’s wealth of real-world experience and how he was able to communicate complex ideas and programming concepts in human, relatable terms using real life examples. His style was also very reassuring, in that he would show us some complex code but say something like “You don’t need to be able to write this kind of code right now,” or “All I need from you right now is to just understand what the code is doing.”

The lectures were always super interesting and demystified many topics I had heard about previously (Decision Trees, Random Forest, Natural Language Processing and much more) complete with code to show how these things actually work.

The first business case was meant to get our “hands on keyboard” and focused on exploratory data analysis (EDA). This was the “easy” case that would help set expectations for the more difficult cases to come. Each case required a functioning R script, a written summary and a PowerPoint presentation (complete with a video presentation).

Each case took many hours and lots of troubleshooting, but it was exhilarating using actual data to put a data mining workflow into motion, perform EDA, and build machine learning models that predicted outcomes. These cases felt very applicable to what you might do on the job. I loved how relevant this course was to real life and my own data analytics ambitions.

Overall, this course really demystified the machine learning world for me, improved my confidence and reinforced my interest in the data science world. It made me feel like I am on the right path and I can’t wait to learn more through my future courses at HES.

Some keys for success that I would suggest to anyone taking this course:

  • Keep up. This seems obvious, but the course starts deceptively slow and steady, and builds like a snowball over the semester. Concepts build on each other and your R skills will need to continuously improve. Work through all the code examples and challenges, and don’t wait until a case due date is approaching to get serious, because you will be in for a world of pain at that point.
  • Start the cases early and put in the time. There are no quizzes, assignments or other “check-ins” to gauge your performance, so don’t get caught off guard, flailing to complete the case without any time to ask for help. The cases will also likely take a lot longer than you think, especially with all the separate components you need to deliver.
  • Consider skipping the textbook. It was listed as optional and I did not buy it. I don’t feel like I missed anything, as the professor provided everything I needed to know in class. It seems like an interesting book to have on the shelf, though, if you really want to go deep on this stuff.
  • Do the optional homework assignment! This is a straightforward assignment you can complete early in the course that will give you an extra 10 points toward your final grade. It will also help jumpstart your R programming skills.

Good Luck!

– Tom