Understand in what cases ML might be the solution to your problem.
Now that you have a better sense of what machine learning is and of the different approaches to training a model, you are probably wondering how machine learning can help in your daily work. This lesson will address just that.
No one has framed this conversation more effectively than the Quartz AI Studio. In the following paragraphs we will borrow their model (with permission) to help you recognise some of the situations you might find yourself in, and the feelings that come with them, when machine learning could help.
Movies about journalism tend to glorify investigations where reporters spend months reading boxes of documents in a windowless room to uncover big stories of corruption. What if we could achieve the same results in a fraction of the time?
Machine learning can help you do exactly that, and for this reason it is already being used by investigative journalists all over the world.
In 2019, the International Consortium of Investigative Journalists (ICIJ) received more than 700,000 leaked documents, collectively known as the Luanda Leaks. In order to analyse all those files, the ICIJ partnered with Quartz, whose investigation team built a machine learning model to help journalists find the kinds of documents they expected in the cache of leaks.
Another challenge a reporter might face when working on a story is comparing a set of documents with a larger corpus of a similar nature. For example, a political reporter might want to compare one president's State of the Union speeches to all those delivered by other presidents, decade after decade.
As it happens, this is another challenge machine learning handles well.
Back in 2017, ProPublica used a computer model to analyse press releases from individual members of the US Congress in comparison with all Congressional press releases published during the same time. This allowed reporters to learn what topics members of Congress cared the most about, or at least talked about more than their peers.
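To make the idea of comparing one document against a larger corpus more concrete, here is a minimal sketch in Python. It is not ProPublica's actual method: it assumes a simple TF-IDF score, which rewards words that are frequent in one document but rare across the rest. The `releases` texts and the `tokenize` and `distinctive_terms` helpers are invented for illustration.

```python
import math
from collections import Counter

PUNCTUATION = ".,;:!?\"'()"


def tokenize(text):
    """Lowercase the text, split on whitespace and strip basic punctuation."""
    words = (word.strip(PUNCTUATION).lower() for word in text.split())
    return [word for word in words if word]


def distinctive_terms(corpus, index, top_n=3):
    """Rank the words of corpus[index] by TF-IDF against the whole corpus."""
    vocab_per_doc = [set(tokenize(doc)) for doc in corpus]
    counts = Counter(tokenize(corpus[index]))
    total = sum(counts.values())
    scores = {}
    for word, count in counts.items():
        # Number of documents (including this one) that contain the word.
        doc_freq = sum(1 for vocab in vocab_per_doc if word in vocab)
        # Words that appear in every document get an IDF of zero.
        idf = math.log(len(corpus) / doc_freq)
        scores[word] = (count / total) * idf
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Invented example texts standing in for press releases.
releases = [
    "The senator praised the farm bill and farm subsidies for rural farmers",
    "The senator discussed the budget and the tax plan",
    "The senator welcomed the new tax cuts in the budget",
]

print(distinctive_terms(releases, 0, top_n=1))  # ['farm']
```

Words such as "the" and "senator" appear in every release, so they score zero, while "farm" stands out for the first release only. A real project would need far more text and more careful cleaning than this toy example.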
Our world is photographed zillions of times a day, and this translates into an unprecedented number of images reporters might find stories in. If only there were a way to teach computers to find specific details in a database of visual information... You know where this is going: enter machine learning.
The Ukrainian data journalism agency Texty used machine learning to detect illegal amber mines across Ukraine. Combining different algorithms, they were able to train the ML system on existing examples of amber mining, so that it could find new examples in a set of satellite images.
The resulting story included an online map in which a viewer can zoom into pictures of amber mines across the country.
Words, images, and now numbers. One of the many things computers do better than humans is processing numeric data at scale. If you have thousands of numeric records to analyse, especially if you want to spot patterns and similarities, you are dealing with another case where machine learning can help.
That's what BuzzFeed News did in 2017 for their story on hidden spy planes, which drew wide attention as one of the early high-profile examples of journalism applying machine learning to reporting.
They trained a computer to find surveillance aircraft by letting a "random forest" algorithm sift for planes with flight patterns that resembled those operated by the FBI and the Department of Homeland Security.
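The intuition behind a "random forest" can be sketched in a few lines of Python: many simple classifiers are each trained on a random resample of the labelled examples, and they vote on every new case. The sketch below is a deliberately simplified stand-in, not BuzzFeed News' model: it uses one-split "stumps" instead of full decision trees, and the two features (turns per hour, average altitude in km) and all the numbers are invented for illustration.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """Find the threshold on one feature that best separates labels 0 and 1."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        for above_label in (0, 1):
            # Predict `above_label` at or above the threshold, the other label below.
            correct = sum(
                1
                for row, label in zip(rows, labels)
                if (above_label if row[feature] >= threshold else 1 - above_label) == label
            )
            if best is None or correct > best[0]:
                best = (correct, threshold, above_label)
    return feature, best[1], best[2]


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap resample and one randomly chosen feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        picks = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(len(rows[0]))
        forest.append(
            fit_stump([rows[i] for i in picks], [labels[i] for i in picks], feature)
        )
    return forest


def predict(forest, row):
    """Return the label that most stumps vote for."""
    votes = Counter(
        above_label if row[feature] >= threshold else 1 - above_label
        for feature, threshold, above_label in forest
    )
    return votes.most_common(1)[0][0]


# Invented training data: [turns per hour, average altitude in km].
flights = [
    [30, 1.5], [28, 1.8], [35, 1.2], [32, 1.6],    # circling, low-flying planes
    [2, 10.0], [3, 11.0], [1, 9.5], [4, 10.5],     # ordinary flights
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = known surveillance aircraft

forest = fit_forest(flights, labels)
print(predict(forest, [40, 1.0]))   # a new plane that circles at low altitude
print(predict(forest, [2, 12.0]))   # a new plane that flies straight and high
```

In practice a library such as scikit-learn provides a full random forest implementation, and the hard work lies in choosing features and checking the model's candidates by hand before reporting on them.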
Amber mines, corruption scandals, spy planes and State of the Union speeches. As you can see, machine learning can be quite handy in supporting your work by augmenting your ability to find and tell important stories with data.
By now, though, it should also be clear that machine learning is not magic. You might even say that it can't do anything you couldn't do – if you just had a thousand tireless interns working for you.
It's still entirely up to you to consider whether machine learning is the right tool to aid the story you want to report. After that assessment is made, you can count on machine learning to help you sift through an unmanageable amount of information and empower your journalism with the findings.