This is Karma. Karma is not a machine learning classifier 🐕‍🦺

Karma is a real dog trained to detect drugs. However, he would fail the simplest tests we apply in ML...

Let me take you through this story from the eyes of an ML engineer.

https://t.co/WAXRUlTvSI

Thread 🧵

Story TLDR 🔖

The story is about police dogs trained to sniff drugs. The problem is that the dogs often signal drugs even if there are none. Then innocent people land in jail for days.

The cops even joke about the “probable cause on four legs”.

Let's see why that is 👇

1. Sampling Bias 🤏

Drugs were found in 64% of the cars Karma identified, which was praised by the police as very good. After all, most people don't carry drugs in their cars, so 64% seems solid.

There was a sampling problem though... 👇

The cars were not sampled at random! The police only did the sniff test if there was a serious suspicion that something was wrong.

The chance there are drugs in the car is much higher in this case!
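
To see how much the base rate matters, here is a minimal sketch using Bayes' rule. The dog's hit rate and false alarm rate below are made-up numbers for illustration, not figures from the story. The same dog looks far more precise when it is only deployed on cars that are already suspicious.

```python
def precision_given_base_rate(base_rate, sensitivity, false_alarm_rate):
    """Fraction of alerts that are correct (precision), via Bayes' rule."""
    true_alerts = sensitivity * base_rate
    false_alerts = false_alarm_rate * (1.0 - base_rate)
    return true_alerts / (true_alerts + false_alerts)


# Hypothetical dog (assumed numbers): alerts on 90% of cars with drugs
# and on 40% of cars without drugs.
SENSITIVITY = 0.9
FALSE_ALARM_RATE = 0.4

# Base rate = share of sniffed cars that really contain drugs.
for base_rate in (0.02, 0.30, 0.50):
    p = precision_given_base_rate(base_rate, SENSITIVITY, FALSE_ALARM_RATE)
    print(f"base rate {base_rate:.0%} -> precision {p:.0%}")
```

The dog's skill never changes here. Only the mix of cars it gets to sniff does.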

2. Evaluation Metrics 🔍

The police referred to a 2014 study from Poland measuring the efficacy of sniffer dogs. The problem was that every test actually contained drugs!

This means there was no chance to measure false positives from the dogs! Only recall, not precision 🤦‍♂️
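
A small sketch of why such a test set hides the problem. The "dog" here is a hypothetical one that alerts on every single sample. On a positives-only test it looks perfect, because false positives cannot occur. The sample counts are made up for illustration.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall from boolean labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return precision, recall


def always_alert(n):
    """Hypothetical dog that signals drugs on every sample."""
    return [True] * n


# Test set where every sample contains drugs (as in the study described above).
only_positives = [True] * 10
# Assumed mixed test set: 2 samples with drugs, 8 without.
mixed = [True] * 2 + [False] * 8

print(precision_recall(only_positives, always_alert(10)))
print(precision_recall(mixed, always_alert(10)))
```

Recall stays perfect in both cases. Only the mixed set exposes the false alarms.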

3. Leaking Training Data 🚰

Another study found that the dogs learned to recognize the emotions of their handlers during tests. They felt that their human wanted them to find drugs in the specific test scenario, so they did.

The trainer leaked the ground truth during testing.

4. Overfitting ➿

Similar to the one above, in many cases, the dog saw that their handler wanted to find drugs in a car during a traffic stop. So it would raise an alarm.

The dog was rewarded before the car was actually searched! It found an easy signal giving it a reward.

Summary 🏁

It is fascinating how many of the problems with sniffer dogs are well known to machine learning engineers (and, of course, mathematicians). Some of them are even common sense...

Avoid these problems not only when training your model, but also in life 😃

If you liked this thread and want to read more about self-driving cars and machine learning, follow me @haltakov!

More from Vladimir Haltakov

Machine Learning Paper Reviews 🔎📜

Check out this thread for short reviews of some interesting Machine Learning and Computer Vision papers. I explain the basic ideas and main takeaways of each paper in a Twitter thread.

👇 I'm adding new reviews all the time! 👇

AlexNet - the paper that started the deep learning revolution in Computer Vision!


DenseNet - reducing the size and complexity of CNNs by adding dense connections between layers.


Playing for data - generating synthetic GT from a video game (GTA V) and using it to improve semantic segmentation models.


Transformers for image recognition - a new paper with the potential to replace convolutions with a transformer.

Let's talk about a common problem in ML - imbalanced data ⚖️

Imagine we want to detect all pixels belonging to a traffic light from a self-driving car's camera. We train a model with 99.88% accuracy. Pretty cool, right?

Actually, this model is useless ❌

Let me explain 👇


The problem is the data is severely imbalanced - the ratio between background pixels and traffic light pixels is 800:1.

If we don't take any measures, our model will learn to classify each pixel as background giving us 99.88% accuracy. But it's useless!
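
A quick sanity check of that number, as a minimal sketch. It assumes exactly 1 traffic light pixel per 800 background pixels and a "model" that predicts background everywhere. The label names are made up for illustration.

```python
BACKGROUND, TRAFFIC_LIGHT = 0, 1

# Assumed toy image: 1 traffic light pixel for every 800 background pixels.
labels = [TRAFFIC_LIGHT] * 1 + [BACKGROUND] * 800

# Trivial "model": classify every pixel as background.
predictions = [BACKGROUND] * len(labels)

correct = sum(p == t for p, t in zip(predictions, labels))
accuracy = correct / len(labels)

# How many traffic light pixels did the model actually find?
found = sum(
    p == TRAFFIC_LIGHT and t == TRAFFIC_LIGHT
    for p, t in zip(predictions, labels)
)

print(f"accuracy: {accuracy:.2%}")  # prints "accuracy: 99.88%"
print(f"traffic light pixels found: {found}")
```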

What can we do? 👇

Let me tell you about 4 ways of dealing with imbalanced data:

▪️ Choose the right evaluation metric
▪️ Undersample your dataset
▪️ Oversample your dataset
▪️ Adapt the loss

Let's dive in 👇

1️⃣ Evaluation metrics

Looking at the overall accuracy is a very bad idea when dealing with imbalanced data. There are other measures that are much better suited:
▪️ Precision
▪️ Recall
▪️ F1 score
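
Here is a minimal sketch of these three metrics computed from raw counts. The confusion counts are hypothetical, chosen to match the 800:1 ratio (100 traffic light pixels, 80,000 background pixels). Returning 0.0 when a metric is undefined is just one common convention, not the only option.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from true/false positive and false negative counts.

    Undefined ratios (division by zero) are reported as 0.0 by convention.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical model: finds 60 of 100 traffic light pixels and
# wrongly flags 40 of 80,000 background pixels.
tp, fn = 60, 40
fp, tn = 40, 79_960

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)

print(f"accuracy:  {accuracy:.4f}")
print(f"precision: {precision:.2f}")
print(f"recall:    {recall:.2f}")
print(f"F1:        {f1:.2f}")
```

Accuracy is dominated by the background pixels, while precision, recall and F1 only look at how the model handles the rare class.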

I wrote a whole thread on


2️⃣ Undersampling

The idea is to throw away samples of the overrepresented classes.

One way to do this is to randomly throw away samples. However, ideally, we want to make sure we are only throwing away samples that look similar.
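
The random variant can be sketched in a few lines. This is an illustrative baseline only: the function name `random_undersample` and the toy data are assumptions, and every class is simply cut down to the size of the smallest one.

```python
import random


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples so every class keeps as many as the smallest class."""
    rng = random.Random(seed)

    # Group samples by their class label.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    # Every class is reduced to the size of the rarest class.
    target = min(len(items) for items in by_class.values())

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        kept = rng.sample(items, target)  # sampling without replacement
        out_samples.extend(kept)
        out_labels.extend([label] * target)
    return out_samples, out_labels


# Toy data (assumed): 800 background samples, 1 traffic light sample.
samples = list(range(801))
labels = ["background"] * 800 + ["traffic_light"] * 1

balanced_samples, balanced_labels = random_undersample(samples, labels)
print(balanced_labels.count("background"), balanced_labels.count("traffic_light"))
```

Note that this random version makes no attempt to drop only samples that look similar.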

Here is a strategy to achieve that 👇
