As some of you may know, I have been experimenting with YouTube videos. The latest one is about standardizing data, and you can view it now.
This video belongs to a series I call "Data Science: The Missing Pieces" (link). Standardizing data is built into almost all data modeling pipelines, whether you are building deep learning, machine learning, or traditional statistical models. Everyone does it but, like much in data science, the reason for doing it is rarely explained well. In this series, I intend to fill these kinds of gaps and help you understand why you are doing something you've been told to do. "Because it works" is not a sufficient answer!
This video is an extension of a recent blog post in which I explained why standardizing data is not about turning data into a normal distribution, contrary to what you may have been told. I get into why we standardize data, the basic formula for standardization, and when you shouldn't be using it.
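For readers who want something concrete before watching, here is a minimal sketch of the usual standardization formula, z = (x - mean) / standard deviation. The function name `standardize`, the sample values, and the choice of the population standard deviation (rather than the sample one) are my assumptions for illustration; the video may present it slightly differently.

```python
import statistics


def standardize(values):
    """Return z-scores: subtract the mean, then divide by the standard deviation.

    Assumption: uses the population standard deviation (divide by n).
    Swap in statistics.stdev for the sample version (divide by n - 1).
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        # Every value is identical, so there is no spread to scale by.
        raise ValueError("cannot standardize data with zero standard deviation")
    return [(v - mean) / sd for v in values]


# Hypothetical example data, not taken from the video.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
z_scores = standardize(heights_cm)
print(z_scores)
```

After standardizing, the values have a mean of 0 and a standard deviation of 1, but their ordering and relative spacing are unchanged. The transformation only shifts and rescales the data; it does not reshape the distribution.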
***
If you have a topic you'd like me to cover in the future, please comment below or send me an email.