There were gasps of amazement in the #GIJC19 auditorium as Paul Myers, the BBC’s online investigation expert, pulled up a personal search page on the big screen, featuring a very familiar name.
He had just found seven phone numbers, eight email addresses, and many other private details for one of the most prominent and closely guarded celebrities in Britain. (He asked that the person’s name not be printed.)
And it had taken Myers just three minutes, using two search tools and the specialist know-how to wield them like a scalpel.
But then there were moans of exasperation in the audience when the online sleuth ran a basic search for what should have been the most accessible information imaginable: who owns the BBC’s website domain. The result that came back: “REDACTED FOR PRIVACY.”
The two searches seemed to summarize Myers’ central messages: that while there are potent new digital tools for unearthing data for those with the training or patience to use them, new laws, company restrictions, and a global movement to protect privacy have also stripped whole categories of data from the journalists’ research arsenal.
This follows Europe’s recent General Data Protection Regulation (GDPR) and its ripple effects for privacy protections elsewhere, as well as the removal of key search features by Google and the sudden shutdown of Facebook’s powerful Graph Search function.
This could be good news for criminals and corrupt officials hoping to avoid detection — unless, of course, reporters are “sneaky,” Myers said, and know how to work around many of those restrictions.
He demonstrated these skills in a masterclass lecture and a workshop at the Global Investigative Journalism Conference, in which he presented dozens of tools and techniques with the power to find and verify information on wrongdoers. (And, of course, he did rapidly trace the owner of the BBC domain using advanced methods, showing that training can at least make up for some of the new data access limitations.)
Beyond the advanced coding and paid-for services that reporters can use, Myers also emphasized the importance of basic tools and old-fashioned curiosity for effective searches. For instance, having found a Facebook image of Paul Whelan (the alleged American spy now jailed in Russia) inside an aircraft, he wanted to know where Whelan was at the time. He noticed that part of an in-flight map, washed out by glare, was visible behind Whelan’s head, showing the word “Georgian” and a coastal inlet. So he found an uncropped version of the same image online, pasted it into Photoshop, adjusted the contrast and brightness settings, and, voilà, found the town names that showed that Whelan was flying toward Michigan.
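For readers without Photoshop, the contrast and brightness step can be approximated in a few lines of code. The sketch below is only an illustration of the general idea, not a description of what Photoshop does internally or of Myers’ exact settings: it assumes a grayscale image held as rows of 8-bit values, and the function name, the sample patch, and the chosen settings are all hypothetical.

```python
def adjust(pixels, contrast=1.0, brightness=0):
    """Apply a linear contrast/brightness change to 8-bit grayscale pixels.

    Each value is scaled around mid-gray (128) by `contrast`, shifted by
    `brightness`, then clamped to the valid 0..255 range.
    """
    adjusted = []
    for row in pixels:
        new_row = []
        for value in row:
            out = contrast * (value - 128) + 128 + brightness
            new_row.append(max(0, min(255, round(out))))
        adjusted.append(new_row)
    return adjusted


# Hypothetical washed-out patch: faint lettering (200) against glare (230).
glare_patch = [
    [230, 230, 200, 230],
    [230, 200, 200, 230],
]

# Darken the patch and stretch its contrast so the lettering stands out.
recovered = adjust(glare_patch, contrast=4.0, brightness=-300)
```

In this made-up patch the lettering and the glare start only 30 levels apart; after the adjustment they are 120 levels apart, which is the kind of separation that can make faint place names legible.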
Then there was a lighter moment, when he searched his own details on pipl.com and discovered that he had an “additional name” on the internet. Myers’ alias was listed there as “Hugh Jarse,” an unflattering name when said quickly, as the audience realized to much laughter. He recalled that he’d given himself the handle long ago on an Angry Birds video game.
Myers, who offers a wealth of tips and tool suggestions on his website, spoke to GIJN about how online investigation is changing.
How has the recent western move toward data privacy affected investigative journalism?
My personal opinion as a researcher is that it’s very well-meaning to be taken with people’s privacy, but I fear that if you go too far with protecting people’s privacy, you’ll protect the privacy also of criminals, extremists, terrorist recruiters, and everything else you need to investigate. You’ll only leave investigations in the hands of official law enforcement, and ultimately this will be detrimental to society. We are certainly headed that way. There is a war between privacy and transparency.
Let’s say you’re investigating a company owned by the mafia, and they had locations all around the world but their headquarters were in Warsaw. What you used to be able to do was search for employees of that company based in Warsaw. Maybe because of the outcry over privacy, you can no longer do that search on Facebook, so you’re less likely to prove your case, and then people will continue to have their money stolen, or [worse]. Maybe you could draw a middle way, and say Facebook should provide a special research portal for people who are not stalkers, but doing legitimate research to make society better.
The even broader challenge, of course, is that so many countries have not yet digitized their data, so you can’t search for it … or have governments that close down access to the internet.
What should social media companies and regulators realize about their restrictions – and what could they do to find a better balance?
For instance, Bellingcat journalists were in the middle of an investigation in Yemen, crowdsourcing witnesses using Facebook, when, on the 6th of June, Facebook – without any warning or leniency whatsoever – just flicked a switch and everything was cancelled. It was a great day for lots of people abusing human rights in Yemen — probably threw a massive big party. You have to draw a balance.
It’s the same with domain registration. If you, in the past, bought some trainers from whatever-dot-com online, and they stole your money, you could look up the domain registration details and get their address and seek recourse. But thanks to GDPR, you can’t do it. Because American internet authorities are worried about the 20 million euro fine, and they don’t want to risk it. Rather than assessing who is a company and who is a private individual, they just stop it for everything.
There are ways they could change this. They could make a system where people could opt out of having your details available, offer it to everyone, and if they don’t choose that then they tacitly agree to share their details. That system has existed now for decades, and to me, it sounds like a reasonable thing to do. If you deny companies the ability to privately register, I don’t see anything wrong with that; they wouldn’t be covered by GDPR. But, now, it’s like finding the address of a company registered in the Cayman Islands.
If you had to recommend just four search tools to a journalist with limited internet skills, which would you choose?
If they could pay for it – or perhaps share the expense with others — I really like a tool called pipl.com. It seems to have everyone’s cell phone numbers. It’s hugely brilliant. There were other tools, but about a billion European numbers were deleted off them thanks to GDPR.
For reverse image searches, I like Yandex.
Also, don’t forget about the advanced search pages on Google – there is so much there that you normally can’t get, like the ability to date-range, the ability to choose a language or a region. Advanced search on Twitter allows you to track conversations on Twitter between two accounts that reference a third account – those are brilliant. There is no longer an advanced search on Facebook, sadly, but there is one on LinkedIn that allows you to specify individual fields that can differentiate between a first name and a surname, like Harrison Ford versus Rex Harrison, for example.
The great free tool to search for domain names is DomainBigData.com, which is very flexible.
And use Photoshop to crop things out when you’re doing your reverse image searches!
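The Twitter search Myers describes, a conversation between two accounts that references a third, can also be typed directly into the search box using operators. The sketch below assumes the `from:`, `to:`, and `@` operators behave as the advanced search form suggests, and that the search page accepts the query in a `q` URL parameter; the function name and the account handles are hypothetical.

```python
from urllib.parse import urlencode


def conversation_queries(account_a, account_b, mentioned):
    """Build one search query per direction of a two-account conversation.

    Each query asks for posts from one account, addressed to the other,
    that also mention a third account.
    """
    a, b, m = (name.lstrip("@") for name in (account_a, account_b, mentioned))
    return [
        f"(from:{a}) (to:{b}) (@{m})",
        f"(from:{b}) (to:{a}) (@{m})",
    ]


def search_url(query):
    # Assumption: the search page reads the query from the `q` parameter.
    return "https://twitter.com/search?" + urlencode({"q": query})


# Hypothetical handles, for illustration only.
queries = conversation_queries("@reporter_one", "source_two", "@company_three")
urls = [search_url(q) for q in queries]
```

Running both directions separately keeps each query simple, at the cost of checking two result pages.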
What habits should journalists develop to assist their online searches?
Beyond learning the tools and [research skills], just try to phrase things in a Google way, rather than the first thing that pops into your head. So, if you want to know who came to the conference, don’t just type in “list people GIJC” – that won’t be very good. Rather try something like “GIJC participants” or “attendees” – just think of the words that are likely to be on the page you’re looking for. That is the secret that so few people realize. We researchers tend to order things in our heads differently.
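Myers’ advice to search for the words likely to appear on the target page can be turned into a small query builder. This is a minimal sketch that assumes Google’s `OR` operator and quoted phrases work as commonly documented; the function name is hypothetical, and “speaker list” is an added example term that Myers did not mention.

```python
def on_page_query(topic, likely_words):
    """Combine a topic with alternative words the target page might contain.

    Multi-word alternatives are wrapped in quotes so they match as phrases,
    and the alternatives are joined with OR.
    """
    quoted = [f'"{word}"' if " " in word else word for word in likely_words]
    return f"{topic} " + " OR ".join(quoted)


# The two wordings Myers suggests, tried in a single query.
query = on_page_query("GIJC", ["participants", "attendees"])

# A hypothetical multi-word alternative, quoted as a phrase.
wider_query = on_page_query("GIJC", ["participants", "attendees", "speaker list"])
```

Here `query` comes out as `GIJC participants OR attendees`, which tries both of the wordings Myers suggests in one search.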
Rowan Philp was chief reporter for South Africa’s Sunday Times for a decade, a period book-ended by fellowships at the Washington Post and MIT. Rowan has reported from 27 countries, and his 2014 investigative report revealing Russia’s secret effort to sell eight nuclear reactors to the South African government for some $70 billion was credited for a role in the scrapping of that deal last year. He is a regular contributor to GIJN.