Understand Customer Churn With The Chi-squared Test Statistic

Introduction The chi-squared statistic is a useful tool for understanding the relationship between two categorical variables. For example, say you work for a tech company that has rolled out a new product, and you want to assess the relationship between this product and customer churn. In the age of data, tech […]
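As a rough illustration of the idea in this teaser, here is a minimal sketch that computes the chi-squared statistic for a 2x2 table of product adoption against churn. The counts are hypothetical and made up for this example, and the helper names `chi_squared_statistic` and `degrees_of_freedom` are assumptions of this sketch rather than anything taken from the post itself.

```python
# Hypothetical counts (not from the post):
# rows    = adopted the new product (yes, no)
# columns = churned (yes, no)
observed = [
    [30, 170],
    [90, 210],
]


def chi_squared_statistic(table):
    """Sum of (observed - expected)^2 / expected over every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (obs - expected) ** 2 / expected
    return statistic


def degrees_of_freedom(table):
    """(rows - 1) * (columns - 1) for a contingency table."""
    return (len(table) - 1) * (len(table[0]) - 1)


statistic = chi_squared_statistic(observed)
dof = degrees_of_freedom(observed)
print(f"chi-squared = {statistic:.2f}, degrees of freedom = {dof}")
```

With one degree of freedom, the critical value at the 5% level is about 3.84, so a statistic well above that would suggest product adoption and churn are not independent in this made-up table.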

How to Visualize Multiple Regression in 3D

Introduction No matter your exposure to data science & the world of statistics, you’ve very likely heard of regression. In this post we’ll be talking about multiple regression. As a precursor, you’ll definitely want some familiarity with simple linear regression. If you aren’t familiar, you can start here! Otherwise, let’s dive […]

Visualizing Multiple Linear Regression with Heatmaps

Introduction No matter your exposure to data science & the world of statistics, it’s likely that at some point, you’ve at the very least heard of regression. As a precursor to this quick lesson on multiple regression, you should have some familiarity with simple linear regression. If you aren’t, you can start here! Otherwise, let’s […]

The Intuitive Explanation of Logistic Regression

Introduction Logistic regression can be pretty difficult to understand! As such, I’ve put together a very intuitive explanation of the why, what, and how of logistic regression. We’ll start with some building blocks that should lend themselves to a clearer understanding, so hang in there! Through the course of the post, I hope to send you […]

Multiple Regression in R

Introduction No matter your exposure to data science & the world of statistics, it’s likely that at some point, you’ve at the very least heard of regression. As a precursor to this quick lesson on multiple regression, you should have some familiarity with simple linear regression. If you aren’t, you can start here! Otherwise, let’s […]

What Every Data Scientist Needs to Know About Clustering

Introduction to Machine Learning Machine learning is a frequently buzzed-about term, yet its different areas are often poorly understood. One of the first distinctions made within machine learning is between what’s called supervised and unsupervised learning. Having a basic understanding of this distinction and the purposes and applications of each will be […]

Building a Regression Model with Categorical Factors

Introduction Regression is a staple in the world of data science, and as such it’s useful to understand it in its simplest form. I recently wrote a post that went into more detail on regression. You can find that here. To follow on from the ideas that we explored there, today we will be exploring the […]

Build, Evaluate, and Interpret a Linear Regression Model in Minutes

Intro Regression is central to so many of the statistical analysis & machine learning tools that we leverage as data scientists. Stated simply, we use regression techniques to model Y through some function of X. We’ll take a look at some additional ideas to set up the premise of regression, and then we’ll take a […]
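To make "model Y through some function of X" concrete, here is a minimal sketch of simple linear regression fitted by ordinary least squares. The data points are invented for illustration, and `fit_line` is a hypothetical helper name; it is not the approach or code used in the post.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y over the variance of x
    # (the shared 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data that lies exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(xs, ys)
print(f"Y is modeled as {intercept:.1f} + {slope:.1f} * X")
```

Real data will not sit exactly on a line, so the fitted slope and intercept would then describe the best straight-line approximation in the least-squares sense.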

Understanding The General Modeling Framework

When it comes to building statistical models, we do so with the purpose of understanding or approximating some aspect of our world. The concept of the general modeling framework lends itself well to breaking down the purposes and approaches that we might take to generate said understanding. What is the General Modeling Framework? Take a look […]

COVID-19: Data Visualization Mastery

I recently made a post where we explored the data put out by Johns Hopkins University on COVID-19. While we were able to make some interesting discoveries, it seemed pertinent to gather data that provided a fuller picture. In my search I came across the following dataset acquired and distributed by Tableau. This […]