If you’d like to learn about the basics of verification, you’ve come to the right place!
This web page, a part of the KID project, aims to provide a short, down-to-earth guide to analyzing people/institutions, claims, and (audio)visual content on the internet.
What we’ve collected here will be helpful if you’re somewhat media- and tech-savvy (but not a geek), and if you keep mumbling things to yourself like “there’s something wrong with what Jack posted, but I can’t put my finger on it; how can I find out more?” The KID toolbox may also be a good resource for teachers and trainers.
To revisit some technical terms first, simply scroll down and read on.
Alternatively, go straight to the ➜ Tools & services section.
A couple of key terms and explanations
➜ Misinformation, disinformation, and types of fake content
➜ Media and information literacy (MIL), verification, fact checking, and debunking
➜ Open Source Intelligence (OSINT)
➜ Artificial Intelligence (AI)
Misinformation, disinformation, and types of fake content
Investigative journalists and verification experts try to avoid the term “fake news”. That’s because it is vague, misleading, and has been frequently hijacked by those who want to discredit truthful sources. Instead, the verification community prefers to talk about misinformation, disinformation, and different types of fake content.
Misinformation usually refers to the careless, unintentional sharing of false information.
Disinformation is a more serious issue: It refers to the willful dissemination (and production) of bogus content.
According to Claire Wardle/FirstDraft (and Global Voices), there are basically seven types of mis- and disinformation that can be ranked on a scale (1: least harmful, 7: most harmful):
- False connection (headlines, visuals etc. don’t support the content)
- Misleading content (framing of issues or individuals)
- False context (genuine content shared with false context info)
- Impostor content
- Manipulated content
- Fabricated content
Elaborating on a concept by Eliot Higgins/Bellingcat, Wardle has also come up with eight causes behind mis- and disinformation, the eight “Ps”:
- Poor journalism
- Political Influence
Types and causes can be juxtaposed on a matrix. We find Wardle’s typology very helpful and will try to stick to it whenever possible.
Media and information literacy (MIL), verification, fact checking, and debunking
There has been a fair amount of debate regarding the scope of MIL and the exact definitions of verification, fact checking, and debunking.
As for MIL, we’re oriented towards a (tried and tested) DW Akademie concept which rests on five key competences. Media literate people can:
- Access information/media (find accurate information; fact check and find original sources).
- Analyze information (question the consumed content; ask why a specific angle has been taken; once again check sources).
- Create content (e.g. a social media post, a web article, a video, a podcast).
- Reflect (e.g. understand personal rights, understand obligations as a media consumer; ask themselves how a piece of journalism could have been done better; determine the motive behind the published information).
- Act on conclusions (consciously consume media; participate; report misinformation and/or hate speech; demand transparency, protect personal data).
While verification, fact checking, and debunking are certainly rooted in the domain of information analysis, they’ve come to play an important role across categories and competences – and should never be looked at in isolation. Furthermore, making an exact distinction between the three terms isn’t easy:
In the Verification Handbook, Craig Silverman says that verification and fact-checking are often used interchangeably – which may lead to confusion. He then quotes Bill Adair and states that verification is “the editorial technique used by journalists — including fact-checkers — to verify the accuracy of a statement”. Silverman adds that “verification is a discipline that lies at the heart of journalism, and that is increasingly being practiced and applied by other professions.”
Fact checking is, on the other hand, “a specific application of verification in the world of journalism” (Silverman). According to both Adair and Silverman, verification is “a fundamental practice” and “enables” fact checking.
There are, of course, different views on the issue. Alexios Mantzarlis, for example, says that verification is something that’s usually done ex ante, on user-generated content (UGC), focused on seeking primary evidence, and leading to a story being either published or pulled. In contrast, fact-checking is something that’s usually done ex post, on claims of public relevance, focused on consulting expert sources, and leading to a conclusion regarding the veracity of the claim. Trying to indirectly define yet another term, Mantzarlis says that debunking is the overlap of verification and fact-checking – and concerned with countering false news and viral hoaxes. Mantzarlis’ concept is outlined in this (somewhat controversial) Venn diagram.
While we certainly can’t and won’t settle the debate, we’d like to offer our own, rather open and inclusive view on verification and fact-checking.
To us, fact-checking is more concerned with verifying (or falsifying) certain details of a story, like: Did the President actually arrive at the conference at 9:30h CET? Did he talk to a security guy before entering the building? Was it raining? Did a group of protestors hold up a banner and shout slogans? Did somebody throw a tomato at the President? Were three protestors arrested by the police?
Verification, then again, serves as our umbrella term. It’s about:
- investigating the details (see above)
- AND checking possible UGC (is the amateur footage of the scene uploaded to YouTube authentic or not?)
- AND establishing if the story basically holds up, even if some of the facts are still unclear (maybe the thrown object was a peach, maybe there were only two arrests – that doesn’t kill the news report).
Open Source Intelligence (OSINT)
OSINT refers to data that’s collected from publicly available sources and used in an intelligence context. Some people also use the term when they talk about software that can process and analyze OSINT data. In our case, the intelligence context is investigative journalism and media research.
It’s important to point out that the “open source” in OSINT is different from the one in “open source software”. The former just means that data and tools are overt and free to use (like the Google reverse image search service) whereas the latter implies that all code is non-proprietary and open to peer review at any time (like the Linux Kernel).
Artificial Intelligence (AI)
One of the aims of the KID project is to find out where and how artificial intelligence (AI) can help identify manipulated or fabricated content. There’s one catch, though: AI is another term that’s hard to define.
Consider the classic quote by John McCarthy: “As soon as it works, no one calls it AI any more”. And this joke: “What’s the definition of AI? Cool things that computers can’t do.”
The point is that large-scale database queries and sophisticated scripts, once considered pure computer magic, have long since become standard repertoire. On the other hand, there’s still no AI on the planet that could do the work of a smart, tech-savvy investigative journalist.
In order to avoid a hard definition of AI, experts (like the creators of the University of Helsinki AI online course) have suggested looking for “properties characteristic to AI”, like autonomy and adaptivity. Autonomy is “the ability to perform tasks in complex environments without constant guidance by a user” whereas adaptivity is “the ability to improve performance by learning from experience.” The Finnish information science experts also talk about “AIness” or “a pinch of AI”.
In that sense, we’d like to explore and list very AI-ish tools (like the reverse image search services based on artificial neural networks), but also consider more traditional applications and approaches that draw on the “simpler” automation and streamlining of digital workflows.
Gut check, mindset, tools & services
What would you like to analyze?
In verification, there are basically five categories:
The meta category of
- identity (the sender/publisher of the message)
And four content categories – which can (and will) of course be mashed up:
- text (claims, quotes, articles)
- sound files
Two verification super tools
Before using any external software to analyze the piece of information you’re looking at, it’s a best practice to fire up two excellent tools of your very own operating system: your gut and your common sense.
In a lot of cases, you can spot mis-/disinformation just by pausing for a couple of seconds, taking a step back, and asking questions like “can this be real?” or “isn’t that too good/too crazy to be true?”
The classic “I know it when I see it” often applies to fake/false/doctored digital content as well.
If your built-in tools set off the misinformation alarm bells, but you can’t be sure whether the content you’re looking at is authentic and/or truthful, it’s time to make use of the old school 5W1H principle. Look at the piece of information once again – and ask yourself the following questions:
- Who produced this?
- What am I looking at exactly?
- When was this produced?
- Where was this produced?
- Why was this produced?
- How was this produced?
In an updated version of 5W1H specifically geared to digital images on the internet, FirstDraft came up with the very useful “five pillars of visual verification”:
- Provenance (Where does this come from? Is it the original account, article, piece of content?)
- Source (Who created this?)
- Date (When was it created?)
- Location (Where was the account/article/piece of content created?)
- Motivation (Why was it created?)
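If you like to keep structured notes while you investigate, the five pillars can be turned into a tiny checklist. What follows is only a sketch of one possible way to do it: the class name `PillarNotes`, its fields, and the sample values are our own illustration and not part of the FirstDraft material.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class PillarNotes:
    """Working notes for one piece of content, one field per pillar."""
    provenance: Optional[str] = None   # Is this the original account/article/content?
    source: Optional[str] = None       # Who created it?
    date: Optional[str] = None         # When was it created?
    location: Optional[str] = None     # Where was it created?
    motivation: Optional[str] = None   # Why was it created?

    def open_questions(self) -> List[str]:
        """Return the names of the pillars that have no answer yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Hypothetical notes: only two of the five pillars are answered so far.
notes = PillarNotes(source="account @example_user (made up)", date="2021-03-04")
print(notes.open_questions())
```

Anything the checklist still reports as open is a question you need to take to the tools listed below.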
Useful OSINT tools by category
The 5W1H questions usually can’t be answered on the spot. They need to be investigated. And this is where OSINT tools can be very useful.
There are a lot of them out there by now. And they fall into different, sometimes overlapping categories. We’ve tried our best to list the most useful, basic and free tools on these pages, always keeping in mind the content categories – as well as the 5W1H questions and/or 5 pillars.
Before you start verifying, a trenchant quote:
“The reporters who are best at this work have their own processes and gadgets to get there, but really (…) obsession and (virtual) shoe leather yield the best results.”
(Brandy Zadrozny in the Verification Handbook)
- Look closely at the URL: Are you dealing with an official, legitimate website?
- Look closely at the logos used: Are they warped or pixelated? Do the colors look weird?
- Use services like whois.com or dnshistory.org to find out who owns a website and how often it has been changed. What can you derive from that information?
- Use the Internet Archive Wayback Machine to check old versions/screenshots of the website you’re looking at. What was published here a week, a month or a year ago? The Wayback Machine is also available as a plug-in for Chrome.
- Use services like Face Comparing (Face++) to find out if the irritating or compromising pictures you’ve been presented actually show the same person.
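If you need to check archived versions of many pages, the Wayback Machine lookup mentioned above can also be scripted. The sketch below assumes the Internet Archive's public availability endpoint (`https://archive.org/wayback/available`) and its JSON reply format; the function names are our own, and the network call is left commented out so the example runs offline.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Internet Archive's public "availability" endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(page_url: str, timestamp: str = "") -> str:
    """Build the request URL; timestamp is optional, e.g. YYYYMMDD."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text: str) -> Optional[dict]:
    """Pick the closest archived snapshot out of the JSON reply, if there is one."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return {"url": closest["url"], "timestamp": closest["timestamp"]}
    return None


def lookup(page_url: str, timestamp: str = "") -> Optional[dict]:
    """Ask the Wayback Machine for the snapshot nearest to the timestamp (needs network)."""
    with urlopen(build_query(page_url, timestamp), timeout=10) as reply:
        return closest_snapshot(reply.read().decode("utf-8"))


# Example call (requires internet access):
# print(lookup("example.com", "20200101"))
```

A result of `None` only means that no snapshot was reported for that address; it says nothing about whether the site is legitimate.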
- Check the age of the account that has become the centre of attention (it’s displayed in the header of an account’s profile page): How long has this account been around? What does that say about the tweet in question?
- Look for the blue checkmark: Is the account officially considered authentic? Be careful: The mark only means the person is who they claim to be; it doesn’t mean that person doesn’t spread false information.
- Use the Botometer: Is there a human behind the account – or more likely a piece of software?
- Use twitteraudit.com (free version has limited features): How many fake followers does the account in question have?
- Use the Treeverse service: What can conversations and links on Twitter tell you about an account?
- Use tools like accountanalysis or Truthnest.com for comprehensive Twitter research: What can tweet frequency, hashtags, tweet types etc. tell you about a user?
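To get a feeling for what account age and tweet frequency look like as numbers, here is a minimal sketch. It assumes you have already noted down the account's creation date and a handful of post timestamps (by hand or via an export from one of the tools above). The function names and the sample data are made up for illustration, and no platform API is called.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Dict, List


def account_age_days(created_at: datetime, now: datetime) -> int:
    """Age of the account in whole days."""
    return (now - created_at).days


def posts_per_day(timestamps: List[datetime]) -> Dict[str, int]:
    """Count posts per calendar day, using UTC dates."""
    days = (ts.astimezone(timezone.utc).date().isoformat() for ts in timestamps)
    return dict(Counter(days))


# Hypothetical data, noted down from a profile page.
created = datetime(2021, 1, 1, tzinfo=timezone.utc)
checked = datetime(2021, 1, 31, tzinfo=timezone.utc)
posts = [
    datetime(2021, 1, 30, 8, 0, tzinfo=timezone.utc),
    datetime(2021, 1, 30, 8, 5, tzinfo=timezone.utc),
    datetime(2021, 1, 30, 8, 10, tzinfo=timezone.utc),
    datetime(2021, 1, 31, 12, 0, tzinfo=timezone.utc),
]

print(account_age_days(created, checked))
print(posts_per_day(posts))
```

A very young account that posts in tight bursts isn't proof of automation, but it is a good reason to take a closer look.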
Facebook and Instagram
- Due to the closed nature of the platforms and a number of changes “under the hood”, there are currently no tools that allow for a simple, yet comprehensive analysis of a Facebook or Instagram account.
- However, try and experiment with tools like Who Posted What? and graph.tips. Among other things, they help you find out about IDs, connections to other users, and posts published from a certain location and within a specific time range that are related to a specific keyword.
- Search Users helps you find people on Instagram and offers basic post statistics
Use the WhatsMyName tool to find (and identify) users across a range of social media and web services.
Text (claims, quotes, articles)
- Copy the claim/quote/piece of text you want to verify and do a Standard or Advanced Google search (don’t forget to add relevant keywords if necessary). In what context does the text appear? What do others have to say about it? To refine your search, use more sophisticated search operators. If you still can’t find anything, try other search engines, like Bing, Baidu, or Yandex.
- Another convenient way to get more context is to run a post through the CrowdTangle Link Checker – and thus learn how often it has been shared (on different platforms), by whom, and with what kind of comments.
- Here’s a list of popular, trustworthy blogs/databases and fact checking platforms that can help you find dis- and misinformation (English only):
Believe it or not, there are actually no OSINT tools for audio verification yet – only commercial services for the forensics industry.
The Digger Project aims to change that. In a first step, Fraunhofer audio forensic tools will become a part of the Truly Media platform (which also isn’t OSINT, but easily accessible for journalists and fact checkers).
If you have a bit of time and want to work your way into (more or less) unassisted audio analytics, you can try this:
- download and install Audacity
- read a tutorial on speech/audio analytics, e.g. this one
- also check the Audacity manual, especially the section on analysis tools
The best way to learn more about an image on the web is to do a reverse image search. And the (arguably) best tool for that investigation method is the InVID/WeVerify Plugin – because it includes all relevant image search engines: Baidu, Bing, Google (always the first place you should look), Karma Decay (for Reddit searches), TinEye, Yahoo, and Yandex. Furthermore, the plugin features:
- an image magnifier
- meta data analysis (only useful for images that weren’t uploaded via a standard social media platform)
- a set of forensic tools
Even though we recommend installing the plugin, you can also access the services via individual websites. Here’s a list of links:
An alternative to the plugin’s image magnifier is LunaPic Zoom.
Meta data (if available) can also be analyzed with Jeffrey’s Image Metadata Viewer.
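Before you upload a file to a metadata viewer, you can quickly check whether it carries any Exif metadata at all. The following sketch walks through the segment structure of a JPEG file and looks for an APP1 segment that starts with the Exif identifier. It is a simplified illustration (the function name is our own, and unusual files with padding bytes between segments aren't handled), not a replacement for a proper metadata tool.

```python
import struct


def has_exif(jpeg_bytes: bytes) -> bool:
    """Return True if the JPEG data contains an APP1 segment holding Exif data."""
    if jpeg_bytes[:2] != b"\xff\xd8":           # every JPEG starts with the SOI marker
        raise ValueError("not a JPEG file")
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:             # lost track of the segment structure
            break
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:                      # start of scan: image data follows
            break
        # The two-byte length counts itself plus the payload, but not the marker.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return True
        pos += 2 + length                       # jump to the next segment
    return False


# Example call with a hypothetical file name:
# with open("photo.jpg", "rb") as handle:
#     print(has_exif(handle.read()))
```

If the function returns `False`, the metadata was probably stripped somewhere along the way (for example by a social media platform), and a metadata viewer won't tell you much.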
FotoForensics is a good choice for further investigations.
The InVID/WeVerify Plugin is also your best bet when it comes to quick and hands-on video verification. The continuously updated tool currently offers:
- video fragmentation/keyframes/analysis
- a reverse image search based on above-mentioned key frames
- information on video meta data/video rights
An alternative to using the plugin would be the following workflow:
- get a tool like 4k Video Downloader (also a good idea if you need to reliably archive content)
- run the clip in question on the VLC media player
- pause the video whenever necessary and create stillframes/screenshots
- run a reverse image search (see above)
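If pausing and taking screenshots by hand gets tedious, the still frame step can be automated. Please note that the sketch below swaps in a different tool: it assumes that ffmpeg (which isn't part of the workflow above) is installed and on your PATH, and it uses ffmpeg's `fps` filter to save one frame per second. The function names and file names are made up for illustration.

```python
import subprocess
from typing import List


def frame_export_command(video_path: str,
                         out_pattern: str = "frame_%04d.png",
                         frames_per_second: float = 1.0) -> List[str]:
    """Build an ffmpeg call that saves still frames at a fixed rate."""
    return [
        "ffmpeg",
        "-i", video_path,                   # the downloaded clip
        "-vf", f"fps={frames_per_second}",  # keep this many frames per second
        out_pattern,                        # numbered image files
    ]


def export_frames(video_path: str) -> None:
    """Run the command; raises an error if ffmpeg is missing or fails."""
    subprocess.run(frame_export_command(video_path), check=True)


# Example call with a hypothetical file name (requires ffmpeg):
# export_frames("clip.mp4")
print(" ".join(frame_export_command("clip.mp4")))
```

The exported images can then go through the reverse image search described above, just like manually created screenshots.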
Created by Alexander Plaum, Tilman Wagner and DW Innovation.
Images used on this page (top to bottom) courtesy of Emiliano Vittoriosi, Alfons Morales, Markus Winkler, Firmbee, NASA, Hal Gatewood, Fleur, Peter Stumpf, Markus Winkler, Dewang Gupta, Marvin Meyer, Alex Iby, 99.films, Ritupom Baishya, Rich Soul, NordWood Themes.
The KID Project and the Verification Toolbox are funded by: