Common Data and Model Problems with Solutions in Data Analytics and Data Science
Common issues in Data Analytics and Data Science projects you should be aware of, how to check for them, and how to fix them
Common Data and Model issues you should be aware of and check when working with data and when training Machine Learning or Deep Learning models.
“Expect problems and eat them for breakfast.” - Alfred A. Montapert
In this article, we will talk about the most common data problems you should know about and check for when conducting data analysis and modelling, as well as the most common model issues that arise when training a Machine Learning or Deep Learning model.
- Data Problems
  - Missing Data
  - Insufficient Data
  - Errors in Data
  - Imbalanced Data
  - Biased Sample
- Model Problems
  - Model Assumptions
  - Not Sustainable Model in Long Term
  - Not Compatible Model
Online, at universities, in courses, and in bootcamps, we talk a lot about ML and DL models for solving various business and software problems. We often assume that the chosen model is universal, or that the data is in perfect shape, cleaned and checked, stored somewhere and ready to be used.
Sadly, this couldn’t be further from the truth. Most of the time, the data is dirty and contains many issues, which should be checked and resolved before even considering using it for training a model and making formal recommendations to the Product Team.
Yet we rarely discuss the set of possible problems that one should be aware of and check for when implementing an ML or DL model.
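As a rough illustration of what "checking the data" can mean in practice, here is a minimal sketch using only the Python standard library. The function name `quick_data_report`, the toy records, and the `churned` label column are all hypothetical, made up for this example; a real project would more likely use a dataframe library and far more thorough checks.

```python
from collections import Counter

def quick_data_report(rows, label_key):
    """Summarise a few basic data-quality signals for a list of dict records."""
    n_rows = len(rows)

    # Count missing values (None or empty string) per column.
    missing = Counter()
    for row in rows:
        for key, value in row.items():
            if value is None or value == "":
                missing[key] += 1

    # Count exact duplicate records (assumes all values are hashable).
    seen = Counter(
        tuple(sorted(row.items(), key=lambda kv: kv[0])) for row in rows
    )
    duplicates = sum(count - 1 for count in seen.values())

    # Share of each label value, a first hint of class imbalance.
    labels = Counter(row.get(label_key) for row in rows)
    label_share = {label: count / n_rows for label, count in labels.items()}

    return {
        "n_rows": n_rows,
        "missing": dict(missing),
        "duplicates": duplicates,
        "label_share": label_share,
    }

# Toy records, invented purely for illustration.
rows = [
    {"age": 34, "income": 52000, "churned": 0},
    {"age": None, "income": 48000, "churned": 0},
    {"age": 34, "income": 52000, "churned": 0},
    {"age": 51, "income": "", "churned": 1},
]

report = quick_data_report(rows, "churned")
print(report)
```

On these toy records the report flags one missing value in each of `age` and `income`, one duplicated record, and a 75/25 split in the label. These are the kinds of signals worth looking at before any model is trained.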
There are usually two types of problems:
1: Data Problems
2: Model Problems