DS7 Module 1 - Define ML Problems

Module Overview

In this module, you'll learn how to properly define machine learning problems. This is a crucial first step in any data science project, as a well-defined problem sets the foundation for all subsequent modeling decisions. You'll learn to choose appropriate targets, understand their distributions, and select evaluation metrics that align with your project goals.
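As a minimal sketch of what checking a target's distribution can look like, the snippet below counts class shares for a made-up binary target and computes the majority-class baseline. The list `y` and its values are hypothetical and are not taken from the guided project. With a pandas Series, `value_counts(normalize=True)` gives the same shares.

```python
from collections import Counter

# Hypothetical binary target (1 = positive class); values are made up for illustration.
y = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

# Share of each class in the target.
counts = Counter(y)
shares = {label: n / len(y) for label, n in counts.items()}
print(shares)

# Accuracy you would get by always predicting the most common class.
majority_label, majority_count = counts.most_common(1)[0]
majority_share = majority_count / len(y)
print(f"Majority class: {majority_label}, baseline accuracy: {majority_share:.2f}")
```

When one class dominates like this, accuracy alone can look strong for a model that has learned nothing, which is one reason the choice of evaluation metric matters.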

Learning Objectives

Choose a target to predict and check its distribution
Avoid leaking information from the test set into the training set, or from the target into the features
Choose an evaluation metric appropriate to the problem
Use the classification metric ROC AUC to interpret a classifier
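The objectives above can be sketched together in a few lines, assuming scikit-learn is available. The data here is synthetic (generated with `make_classification`), and the model choice, split size, and random seeds are illustrative assumptions rather than the guided project's actual setup. The split happens before any fitting, and the scaler sits inside a pipeline so it learns its statistics from the training rows only, which is one common way to keep test information out of training.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced binary classification data (an assumption for illustration).
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5,
    weights=[0.8, 0.2], random_state=42,
)

# Split first, so nothing fitted later ever sees the test rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42,
)

# The pipeline fits the scaler on the training data only, then the classifier.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# ROC AUC is computed from predicted probabilities for the positive class,
# not from hard class labels.
test_scores = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, test_scores)
print(f"Test ROC AUC: {auc:.3f}")
```

A ROC AUC of 0.5 corresponds to ranking positives no better than chance, and 1.0 to ranking every positive above every negative, so the score reads as a measure of how well the model orders examples rather than how often a particular threshold is right.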

Guided Project

Open DS_231_guided_project.ipynb in the GitHub repository below to follow along with the guided project:

GitHub: Define ML Problems Slides

Guided Project Video

Module Assignment

For this assignment, you'll apply what you've learned to your own portfolio dataset. This hands-on experience will solidify your understanding of the concepts and prepare you for real-world machine learning tasks.

Note: There is no video solution for this assignment as you will be working with your own dataset and defining your own machine learning problem.

Module 1 Assignment
