A Question for Data Scientists: Data Science in Shoelaces?

Mohammed Ouahman
Nov 29, 2021

Data science in shoelaces: let’s see how good your intuition is at solving real-world data problems!

enjoyable=> {

have fun;

}

This item is by Matteo Goi.

Nike has hired you as a data science consultant to help them save money on shoe materials. Your first assignment is to review a model one of their employees built to predict how many shoelaces they’ll need each month. The features going into the machine learning model include:

  • The current month (January, February, etc.)
  • Advertising expenditures in the previous month
  • Various macroeconomic features (like the unemployment rate) as of the beginning of the current month
  • The amount of leather they ended up using in the current month

The results show the model is almost perfectly accurate if you include the feature about how much leather they used. But it is only moderately accurate if you leave that feature out. You realize this is because the amount of leather they use is a perfect indicator of how many shoes they produce, which in turn tells you how many shoelaces they need.
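To see that accuracy gap for yourself, here is a minimal sketch on synthetic data. Everything in it is an assumption made for illustration: the numbers, the coefficients, the idea that leather use is a fixed amount per shoe, and the choice of plain least squares in place of whatever model the employee actually built. It only reproduces the symptom described above, so the question below is still yours to answer.

```python
import numpy as np

rng = np.random.default_rng(0)
n_months = 240  # synthetic history, purely illustrative

# Hypothetical features; all values are made up for this sketch.
month = rng.integers(1, 13, size=n_months)
ad_spend = rng.normal(100.0, 20.0, size=n_months)
unemployment = rng.normal(5.0, 1.0, size=n_months)

# Assumed data-generating process: shoe production depends partly on the
# features above and partly on noise that none of them can explain.
shoes = (1000.0 + 3.0 * ad_spend - 20.0 * unemployment
         + rng.normal(0.0, 80.0, size=n_months))
leather = 0.5 * shoes    # assumed: a fixed amount of leather per shoe
shoelaces = 2.0 * shoes  # two laces per shoe pair, also an assumption


def r_squared(features, target):
    """Fit least squares on the first half, return R^2 on the second half."""
    X = np.column_stack([np.ones(len(target)), *features])
    split = len(target) // 2
    coef, *_ = np.linalg.lstsq(X[:split], target[:split], rcond=None)
    resid = target[split:] - X[split:] @ coef
    total = target[split:] - target[split:].mean()
    return 1.0 - (resid @ resid) / (total @ total)


r2_without = r_squared([month, ad_spend, unemployment], shoelaces)
r2_with = r_squared([month, ad_spend, unemployment, leather], shoelaces)

print(f"R^2 without leather: {r2_without:.3f}")
print(f"R^2 with leather:    {r2_with:.3f}")
```

With the leather column included the held-out score should be essentially perfect, while without it the score drops noticeably, which mirrors the result described in the scenario.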

After reading it, tell me: which feature (or features) is a source of data leakage?

Explain why and how in the comments!

Mohammed Ouahman

Data Scientist, Machine Learning Enthusiast, Passionate about E-commerce industry.