Journalists have used scrapers to collect data that rooted out extremist cops, tracked lobbyists, and uncovered an underground market for adopted children
By: The Markup Staff
The fruits of web scraping—using code to harvest data and information from websites—are all around us.
People build scrapers that can find every Applebee’s on the planet or collect congressional legislation and votes or track fancy watches for sale on fan websites. Businesses use scrapers to manage their online retail inventory and monitor competitors’ prices. Lots of well-known sites use scrapers to do things like track airline ticket prices and job listings. Google is essentially a giant, crawling web scraper.
Scrapers are also the tools of watchdogs and journalists, which is why The Markup filed an amicus brief in a case before the U.S. Supreme Court this week that threatens to make scraping illegal.
The case itself, Van Buren v. United States, is not about scraping. It turns on a legal question regarding the prosecution of a Georgia police officer, Nathan Van Buren, who was bribed to look up confidential information in a law enforcement database. Van Buren was prosecuted under the Computer Fraud and Abuse Act (CFAA), which prohibits unauthorized access to a computer network, as in hacking, where someone breaks into a system to steal information (or, as dramatized in the 1980s classic movie “WarGames,” potentially start World War III).
In Van Buren’s case, since he was allowed to access the database for work, the question is whether the court will broadly define his troubling activities as “exceeding authorized access” to extract data, which is what would make it a crime under the CFAA. And it’s that definition that could affect journalists.
Or, as Justice Neil Gorsuch put it during Monday’s oral arguments, it could lead in the direction of “perhaps making a federal criminal of us all.”
Investigative journalists and other watchdogs often use scrapers to illuminate issues big and small, from tracking the influence of lobbyists in Peru by harvesting the digital visitor logs for government buildings to monitoring and collecting political ads on Facebook. In both of those instances, the pages and data scraped are publicly available on the internet—no hacking necessary—but sites involved could easily change the fine print on their terms of service to label the aggregation of that information “unauthorized.” And the U.S. Supreme Court, depending on how it rules, could decide that violating those terms of service is a crime under the CFAA.
“A statute that allows powerful forces like the government or wealthy corporate actors to unilaterally criminalize newsgathering activities by blocking these efforts through the terms of service for their websites would violate the First Amendment,” The Markup wrote in our brief.
What sort of work is at risk? Here’s a roundup of some recent journalism made possible by web scraping:
- The COVID Tracking Project, from The Atlantic, collects and aggregates data from around the country on a daily basis, serving as a means of monitoring where testing is happening, where the pandemic is growing, and the racial disparities in who’s contracting and dying from the virus.
- A project from Reveal scraped extremist Facebook groups and compared their membership rolls to those of law enforcement groups on Facebook, and found a lot of overlap.
- Reveal also used scrapers to find that hundreds of millions of dollars in property taxes should never have been charged to Detroit residents who then lost their homes through foreclosure.
- The Markup’s recent investigation into Google’s search results found that it consistently favors its own products, leaving some websites from which the web giant itself scrapes information struggling for visitors and, therefore, ad revenue. The U.S. Department of Justice cited the issue in an antitrust lawsuit against the company.
- In Copy, Paste, Legislate, USA Today found a pattern of cookie-cutter laws, pushed by special interest groups, circulating in legislatures around the country.
- Reuters scraped social media and message boards to find an underground market for adopted children whose parents, who had usually adopted the children from abroad, decided the children were too much for them. A couple featured in the piece was later convicted of kidnapping as a result of the investigation.
- Gizmodo was able to use similar tools to find the probable locations of tens of thousands of Ring surveillance cameras.
- The Trace and The Verge, using scrapers, found people using an online market to sell guns without a license and without performing background checks.
This article was originally published on The Markup and was republished under the Creative Commons Attribution-NonCommercial-NoDerivatives license.