Statistical Data Visualization with Seaborn From UST

Welcome to this Guided Project on Statistical Data Visualization with Seaborn, from UST.

For more than 20 years, UST has worked side by side with the world’s best companies to make a real impact through transformation. Powered by technology, inspired by people and led by their purpose, they partner with clients from design to operation.

With this Guided Project from UST, you can quickly build in-demand job skills and expand your career opportunities in the Data Science field. Producing visualizations is an important first step in exploring and analyzing real-world data sets. As such, visualization is an indispensable method in any data scientist's toolbox, as well as a powerful tool for identifying problems in analyses and illustrating results.

In this project, we will employ the statistical data visualization library, Seaborn, to discover and explore the relationships in the Breast Cancer Wisconsin (Diagnostic) data set.
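As a rough preview of what this looks like in practice, the sketch below loads the copy of the Breast Cancer Wisconsin (Diagnostic) data set that ships with scikit-learn and draws one Seaborn plot from it. The choice of scikit-learn as the data source, the `diagnosis` column name, and the choice of a violin plot of `mean radius` are assumptions made for illustration; the project itself may load and plot the data differently.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import seaborn as sns
from sklearn.datasets import load_breast_cancer

# scikit-learn bundles a copy of the Breast Cancer Wisconsin (Diagnostic) data set
data = load_breast_cancer(as_frame=True)
df = data.frame.copy()  # 30 numeric features plus a numeric "target" column

# In scikit-learn's encoding, 0 means malignant and 1 means benign.
# "diagnosis" is a hypothetical column name added here for readable plot labels.
df["diagnosis"] = df["target"].map({0: "malignant", 1: "benign"})

# Compare the distribution of one feature across the two diagnoses
ax = sns.violinplot(data=df, x="diagnosis", y="mean radius")
ax.set_title("Mean radius by diagnosis")
```

In a notebook, the figure renders inline once the backend line is removed; in a script, `ax.figure.savefig(...)` writes it to disk.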

Using the exploratory data analysis (EDA) results from the Breast Cancer Diagnosis – Exploratory Data Analysis Guided Project, you will practice dropping correlated features and apply several feature selection and extraction methods, including correlation-based feature selection, univariate feature selection, recursive feature elimination, principal component analysis (PCA), and tree-based feature selection.
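To give a feel for two of these methods, here is a minimal sketch of correlation-based and univariate feature selection using pandas and scikit-learn. The 0.9 correlation threshold, the choice of `k=5`, and the ANOVA F-test as the scoring function are illustrative assumptions, not values taken from the project.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# Correlation-based selection: for each highly correlated pair,
# drop the feature that appears later in the column order.
corr = X.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))  # keep upper triangle only
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]  # 0.9 is an assumed threshold
X_reduced = X.drop(columns=to_drop)

# Univariate selection: keep the k features with the highest ANOVA F-scores.
selector = SelectKBest(score_func=f_classif, k=5)  # k=5 is an assumed value
selector.fit(X_reduced, y)
selected = X_reduced.columns[selector.get_support()].tolist()

print("Dropped as highly correlated:", to_drop)
print("Top features by F-score:", selected)
```

Dropping correlated features first keeps the univariate step from selecting several near-duplicate measurements, such as radius and perimeter, that carry largely the same information.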

Lastly, we will build a boosted decision tree classifier with XGBoost to classify tumors as either malignant or benign. By the end of this Guided Project, you should feel more confident working with data and creating visualizations for data analysis, and you will have practiced several methods that apply to a Data Scientist’s role.

Let's get started!