Introduction to Forecasting with ARIMA in R

What Makes ARIMA & XTS Objects So Useful for Forecasting. XTS Objects. If you’re not using XTS objects to perform your forecasting in R, then you are likely missing out! The major benefit that we’ll explore throughout is that these objects are a lot easier to work with when it comes to modeling, forecasting, & […]

Foundations of Probability that Every Data Scientist Should Know

Understanding Random Events. Customers and Your Application. Let’s say that there is some random likelihood that a user will click on a call-to-action (I’ll call it a CTA from here on out, but this is any time you invite the reader to buy, shop, give an email, etc.) within your application. Once they have clicked the call-to-action […]

Tired of Nested ifelse in Dplyr?

Using Mutate to Feature Engineer a New Categorical. Among the most helpful functions from dplyr is mutate; it allows you to create new variables, typically by layering some logic on top of the other variables in your dataset. Quick Example. Let’s say that you’re analyzing user data and you want to categorize users according to […]

You’re Not a Data Scientist Until You Understand the Binomial Distribution

Inference at the heart of data analysis. What is the point of inference? Inference is about drawing conclusions about a greater population via some sample of observed data. For example, you have some sample of the country’s opinion on the president and you’d like to draw some conclusions about the population at large. Obviously you […]

Data Visualization for Product Managers

A Few Rules of Thumb to Make You Dangerous. Chances are, if you’re reading this, you’re a product manager or in some way a contributor to a product team, and you would like to give yourself a leg up when it comes to understanding the data that is coming your way. I’m going to give […]

Machine Learning, Simplified. Be a Part of the Conversation.

What’s all the buzz about? Machine learning is a concept, and a frequently dropped buzzword in today’s tech environment, that leaves a lot to be desired as far as explanation goes. People often refer to machine learning algorithms as a black box, and while there may be certain aspects of machine learning that lack […]

A Must-have Algorithm for Your Machine Learning Toolbox: XGBoost

One of the most performant machine learning algorithms. XGBoost is a supervised learning algorithm that can be used for both regression & classification. Like all algorithms, it has its virtues & drawbacks, which we’ll be sure to walk through. For this post, we’ll just be learning about XGBoost in the context of classification problems. […]

What is Bootstrap Replication & How Do I Use It?

What is bootstrap replication? For those catching up here, bootstrap sampling refers to the process of sampling a given dataset ‘with replacement’... and this is where most people get lost. You take many samples and build a distribution to mark your confidence interval. Let’s take a quick example. Crypto at College. Let’s say that you […]

Build Your First Chatbot in Three Minutes

30 sec explanation of how Chatbots work. Whether you’re a data scientist, data analyst, or software engineer, and whether or not you have a strong handle on NLP tools and approaches, if you’re here, you’ve likely wondered how a chatbot works and how to build one, but haven’t ever had the need or chance. Well… you’re here […]

Three Key Charts for Visualizing Proportion Data

Proportion data examples. Whatever your application of data analytics & data science, there are proportions everywhere. Proportions are all about understanding the different parts that make up a whole. Proportions are pretty much just a count of something across a given categorical variable. That could be the number of customers across different industries, the number of sales […]