Lesson 6 of 8

Bias in Machine Learning

5 minutes to complete

Understand what ML bias is and how to avoid it.

Fairness in Machine Learning

So far, this course has shown how machine learning can enhance your work, from saving precious time on existing tasks to opening up new opportunities. ML can do a lot for you, but it comes with challenges you shouldn't overlook.

To address those challenges, a growing number of researchers and practitioners focus on the topic of "fairness" in machine learning. Its guiding principle is that ML should equally benefit everyone, regardless of the societal categories that structure and impact our lives.

What is bias?

What are the negative consequences that might derive from the use of machine learning? The short answer is: Bias.

As humans, we all have our biases. They are tools our brain uses to deal with the information that is thrown at it every day.

Take this example: close your eyes and picture a shoe. Most likely you pictured a sneaker. Maybe a leather men's shoe. It's less likely that you thought of a high-heeled women's shoe. We may not even know why, but each of us is biased toward one shoe over the others.

Now imagine that you want to teach a computer to recognise a shoe. You may end up exposing it to your own bias. That's how bias happens in machine learning. Even with good intentions, it's impossible to separate ourselves from our own biases.

Three types of bias

There are different ways in which our own biases risk becoming part of the technology we create:

Interaction bias

Take the earlier example: if we train a model to recognise shoes with a dataset made up mostly of pictures of sneakers, the system is unlikely to learn to recognise high heels as shoes.

Latent bias

If you train an ML system on what a scientist looks like using pictures of famous scientists from the past, your algorithm will probably learn to associate scientists with men only.

Selection bias

Say you're training a model to recognise faces. If the data you use to train it over-represents one population, the model will work better for that population at the expense of others, with potentially racist consequences.
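One way to make this kind of imbalance visible is to measure how well a model performs for each group separately, rather than looking at a single overall score. The snippet below is only a minimal sketch in plain Python: the function name, the group names and the evaluation results are all hypothetical, made up for illustration.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Return {group: accuracy} for (group, true_label, predicted_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, predicted in records:
        total[group] += 1
        if truth == predicted:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}

# Hypothetical evaluation results: (group, true label, model prediction).
results = [
    ("group_a", "face", "face"),
    ("group_a", "face", "face"),
    ("group_a", "face", "face"),
    ("group_a", "face", "face"),
    ("group_b", "face", "face"),
    ("group_b", "face", "no_face"),
]

per_group = accuracy_by_group(results)
for group, accuracy in per_group.items():
    print(f"{group}: {accuracy:.0%} correct")
```

In this made-up data the model is right five times out of six overall, which hides the fact that it is right only half the time for the smaller group.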

So what can we do to avoid these biases?

Asking the right questions to avoid bias

As a journalist, a first line of defence against bias is firmly within your reach: the same values and ethical principles you apply every day in your profession should extend to assessing the fairness of any new technology that is added to your toolbox. Machine learning is no exception.

Furthermore, in all cases you should start by considering whether the consequences might negatively impact individuals’ economic or other important life opportunities. This is especially critical if the data you use includes sensitive personal information.

Often, the unfair impact isn't immediately obvious. Uncovering it requires asking nuanced social, political and ethical questions about how your machine learning system might allow bias to creep in.

Considering the main sources of bias

While no training data will ever be perfectly ‘unbiased’, you can greatly improve your chances of building a fair model if you carefully consider potential sources of bias in your data, and take steps to address them.

Bias most commonly creeps in when your training data isn't truly representative of the population that your model makes predictions on. Make sure you have enough data for each relevant group.
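A simple first check is to compare each group's share of the training data with its share of the population the model will serve. The following is a rough sketch under stated assumptions: the function name, the group labels and the population shares are hypothetical, and in practice the reference shares would have to come from a source you trust.

```python
from collections import Counter

def representation_gaps(training_groups, population_shares):
    """Return {group: share in training data minus share in population}.

    A positive gap means the group is over-represented in the training
    data; a negative gap means it is under-represented.
    """
    counts = Counter(training_groups)
    total = len(training_groups)
    return {
        group: counts.get(group, 0) / total - share
        for group, share in population_shares.items()
    }

# Hypothetical data: one group label per training example.
training_groups = ["group_a"] * 8 + ["group_b"] * 2
# Hypothetical reference shares for the population the model will serve.
population_shares = {"group_a": 0.5, "group_b": 0.5}

gaps = representation_gaps(training_groups, population_shares)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.0%} relative to the population")
```

Here group_a makes up 80% of the training examples but only 50% of the assumed population, so it is over-represented by about 30 percentage points and group_b is under-represented by the same amount.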

A different kind of bias manifests itself when some groups are represented less positively than others in the training data. You should consider reviewing your data before using it to train a model, in order to verify whether it carries any prejudices that might be learned and reproduced by the algorithm.
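If your training data is labelled, one possible way to start such a review is to compare how often each group receives a favourable label. This is an illustrative sketch only: the function name, the groups and the labels are hypothetical, and a difference in rates is a prompt for closer inspection, not proof of prejudice on its own.

```python
from collections import defaultdict

def positive_rate_by_group(examples, positive_label="positive"):
    """Return {group: fraction of that group's examples with the positive label}."""
    positives = defaultdict(int)
    totals = defaultdict(int)
    for group, label in examples:
        totals[group] += 1
        if label == positive_label:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}

# Hypothetical labelled training examples: (group, label).
examples = [
    ("group_a", "positive"),
    ("group_a", "positive"),
    ("group_a", "positive"),
    ("group_a", "negative"),
    ("group_b", "positive"),
    ("group_b", "negative"),
    ("group_b", "negative"),
    ("group_b", "negative"),
]

rates = positive_rate_by_group(examples)
for group, rate in rates.items():
    print(f"{group}: {rate:.0%} labelled positive")
```

In this made-up sample, three quarters of group_a's examples carry the positive label against one quarter of group_b's, which is the kind of gap worth investigating before training.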

Preventing bias: it starts with awareness

Bias can emerge in many ways: from training datasets, because of decisions made during the development of a machine learning system, and through complex feedback loops that arise when an ML system is deployed in the real world.

Some concrete questions you might want to ask in order to recognise potential bias include:

For what purpose was the data collected?
How was the data collected?
What is the goal of using this set of data and this particular algorithm?
How was the source of data assessed?
How was the process of data analysis defined before the analysis itself?

Bias is a complex issue and there is no silver bullet. The solution starts with awareness and with all of us being mindful of the risks and taking the right steps to minimise them.
