
Charting KNN’s Performance Odyssey: Navigating Setups

This work uses parallel coordinates on the Wisconsin Diagnostic Breast Cancer dataset to plot KNN regression configurations as intuitive paths, spotlighting at a glance how training and test RMSE rise and fall across setups.
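The pipeline behind these paths can be sketched as below. This is a minimal, assumed reconstruction, not the exact code behind the figures: the grids of training proportions, neighbor counts, and fold counts are illustrative values, and the cross-validation RMSE column is one plausible way to make the fold count part of each configuration.

```python
# Sketch: sweep KNN regression configurations on the Wisconsin Diagnostic
# dataset and draw each configuration as a path in a parallel-coordinates plot.
# The parameter grids below are illustrative assumptions.
from itertools import product

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs headless
import pandas as pd
from pandas.plotting import parallel_coordinates
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsRegressor

X, y = load_breast_cancer(return_X_y=True)  # Wisconsin Diagnostic data

rows = []
for train_frac, k, folds in product([0.6, 0.7, 0.8], [1, 5, 15], [3, 5, 10]):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=train_frac, random_state=42)
    model = KNeighborsRegressor(n_neighbors=k).fit(X_tr, y_tr)
    rows.append({
        "train_frac": train_frac,
        "k": k,
        "folds": folds,
        # RMSE computed as sqrt(MSE) for compatibility across sklearn versions
        "train_rmse": mean_squared_error(y_tr, model.predict(X_tr)) ** 0.5,
        "test_rmse": mean_squared_error(y_te, model.predict(X_te)) ** 0.5,
        # fold count enters via a cross-validated RMSE on the training split
        "cv_rmse": -cross_val_score(
            model, X_tr, y_tr, cv=folds,
            scoring="neg_root_mean_squared_error").mean(),
    })

configs = pd.DataFrame(rows)

# One colored path per configuration; paths are grouped (colored) by k here.
plot_df = configs.assign(label="k=" + configs["k"].astype(str))
parallel_coordinates(
    plot_df, "label",
    cols=["train_frac", "k", "folds", "train_rmse", "test_rmse"],
    colormap="viridis")
```

Sorting `configs` by `test_rmse` then picks out the best- and worst-generalizing paths shown in the figures below.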

[Figure: P000006_a.jpg]

Test RMSE Extremes

This figure shows the best- and worst-performing KNN setups on the test set, revealing how adjusting the training proportion, neighbor count, and number of cross-validation folds influences the model's generalization across scenarios.

[Figure: P000006_b.jpg]

Training RMSE Extremes

Unlike the previous figure, which shows extremes in test RMSE, this one shows that the lowest training RMSE does not guarantee the lowest test error. At the same training proportion, a balanced k and more cross-validation folds improve training performance.
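The gap between training and test RMSE is easy to reproduce in a quick sketch. With k = 1, a KNN regressor predicts each training point from itself, so training error collapses toward zero while test error need not follow. The split size and k values here are hypothetical, not the author's exact configuration:

```python
# Sketch: training RMSE vs. test RMSE for a few neighbor counts.
# With k = 1 each training point is its own nearest neighbor, so the
# training RMSE memorizes the data; test RMSE tells a different story.
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, train_size=0.7, random_state=0)  # illustrative split

results = {}
for k in (1, 5, 15):
    model = KNeighborsRegressor(n_neighbors=k).fit(X_tr, y_tr)
    train_rmse = mean_squared_error(y_tr, model.predict(X_tr)) ** 0.5
    test_rmse = mean_squared_error(y_te, model.predict(X_te)) ** 0.5
    results[k] = (train_rmse, test_rmse)
    print(f"k={k:2d}  train RMSE={train_rmse:.3f}  test RMSE={test_rmse:.3f}")
```

Ranking these runs by training RMSE and by test RMSE gives different orderings, which is exactly the point the figure makes.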

[Figure: P000006_c.jpg]

Neighbor Count Extremes

This figure compares two k settings with the same training size and cross-validation fold count, showing that the setting with the minimal k (= 1) ends up with higher RMSE on both the training and test data.

Inspired by the Linear Regression, Classification, and Resampling session of the Machine Learning Software Foundations Certificate at the Data Sciences Institute, University of Toronto.


© 2025 by Chun-Yuan Chen. Powered and secured by Wix. Licensed under CC BY-NC-ND 4.0.
