Is Web Scraping Legal? The Definitive Legal Guide for 2026

Web scraping publicly available data is generally legal in the United States, following the Ninth Circuit's 2022 ruling in hiQ Labs v. LinkedIn. However, legality depends on what you scrape, how you scrape it, and what you do with the data. Scraping behind login walls, copying copyrighted content, or collecting personal data protected by GDPR/CCPA introduces significant legal risk.

The legality of web scraping is one of the most frequently misunderstood topics in technology and data law. The confusion stems from the fact that there is no single "web scraping law." Instead, scraping legality is determined by the intersection of multiple legal frameworks: computer fraud statutes, copyright law, contract law (Terms of Service), privacy regulations, and trespass to chattels. Each framework applies differently depending on the specific scraping scenario, creating a complex matrix of legality that requires case-by-case analysis.

This guide provides a comprehensive legal analysis of web scraping in 2026, covering the major court decisions that have shaped the law, the key legal frameworks that apply, jurisdiction-specific rules, and practical compliance guidelines. It is not legal advice—consult an attorney for your specific situation—but it provides the factual foundation you need to understand the legal landscape.

The Legal Frameworks That Apply to Web Scraping

Web scraping does not exist in a single legal category. Five distinct legal frameworks can apply, and a scraping operation may be legal under one framework while violating another. Understanding each framework is essential for assessing the legality of any scraping activity.

Legal Framework	Key Law/Regulation	What It Covers	Risk Level for Scraping
Computer Fraud	CFAA (US), CMA (UK)	Unauthorized access to computer systems	Medium — clarified by hiQ ruling
Copyright	US Copyright Act, EU Copyright Directive	Reproduction of copyrighted content	High — if scraping protected content
Contract Law	Terms of Service / Terms of Use	Breach of website agreements	Medium — enforceability varies
Data Privacy	GDPR, CCPA, LGPD	Collection and processing of personal data	High — strict requirements for personal data
Trespass to Chattels	Common law tort	Interference with computer systems	Low — requires actual damage

Computer Fraud and Abuse Act (CFAA)

The CFAA, enacted in 1986, prohibits accessing a computer "without authorization" or "exceeding authorized access." For decades, companies used the CFAA to argue that web scraping constituted unauthorized access to their servers. The critical question was whether accessing a publicly available website could ever be "without authorization."

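The hiQ Labs v. LinkedIn case largely answered this question for publicly available data, at least within the Ninth Circuit. In 2022, reaffirming its 2019 decision after the Supreme Court's ruling in Van Buren v. United States, the Ninth Circuit held that scraping data anyone can access without credentials likely does not violate the CFAA, because the statute's "without authorization" concept does not fit pages that are open to the general public. The court reasoned that the CFAA was designed to prevent hacking into systems protected by authentication, not to prevent accessing data that is already public. The ruling came at the preliminary injunction stage, and the district court later found that hiQ had breached LinkedIn's User Agreement, so the decision limits CFAA claims without shielding scrapers from contract claims.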

However, the CFAA still applies when scraping involves circumventing access controls. If a website requires a login and you scrape data behind that login without an account, the CFAA may be implicated. Scraping through your own account in a way that violates its terms is less settled under the CFAA after Van Buren v. United States, but it exposes you to breach of contract claims. The key distinction is between public data (no authorization needed, scraping likely legal under the CFAA) and restricted data (authorization required, scraping may violate the CFAA).

Copyright Law

Copyright law protects original creative works from unauthorized reproduction. When you scrape a website, you are technically making copies of its content. If that content is copyrighted—articles, photographs, videos, creative writing, database structures—the reproduction may constitute copyright infringement.

The fair use doctrine provides some protection for scraping, particularly when the scraped data is used for transformative purposes such as research, analysis, or building new products that do not compete with the original content. However, fair use is a legal defense, not a license. Courts decide it after the fact, based on four factors: (1) the purpose and character of the use, (2) the nature of the copyrighted work, (3) the amount and substantiality of the portion copied, and (4) the effect on the market for the original work.

Factual data generally is not copyrightable. The Supreme Court's decision in Feist Publications v. Rural Telephone Service (1991) established that compilations of facts are only copyrightable if they involve creative selection, coordination, or arrangement. Raw factual data—names, addresses, prices, specifications—cannot be copyrighted. This distinction is critical for B2B data scraping: scraping factual business information (company names, employee titles, public financial data) carries lower copyright risk than scraping creative content (blog posts, product descriptions, images).

Terms of Service

Most websites include Terms of Service (ToS) that explicitly prohibit scraping, crawling, or automated data collection. The legal question is whether these Terms of Service are enforceable against scrapers who never explicitly agreed to them.

Courts have taken different positions on this question. "Clickwrap" agreements (where users must click "I agree" before accessing content) are generally enforceable. "Browsewrap" agreements (where terms are posted on the website but users are not required to acknowledge them) have weaker enforceability. Several courts have ruled that simply visiting a website does not constitute agreement to its ToS, particularly when the terms are not prominently displayed.

However, even if ToS are not technically enforceable, violating them creates legal risk. Companies can use ToS violations as evidence of bad faith in other legal claims (trespass, unfair business practices), and the mere threat of litigation can be costly to defend against regardless of the outcome.

Data Privacy Regulations (GDPR, CCPA)

Privacy regulations add a critical layer of complexity to scraping legality, particularly when the scraped data includes personal information. The GDPR (EU), CCPA (California), LGPD (Brazil), and similar laws regulate the collection, processing, and storage of personal data regardless of how the data is obtained.

Under the GDPR, scraping personal data from the internet requires a lawful basis for processing. The two most relevant bases are consent (impractical for scraped data) and legitimate interest (possible but requires a balancing test). The GDPR applies to personal data whether it is provided directly by individuals or scraped from public sources: being publicly accessible does not exempt data from its rules, and Article 14 adds a duty to inform individuals whose data was obtained from a source other than themselves.

The CCPA takes a slightly different approach. It gives California residents the right to know what personal information businesses collect about them and to request its deletion. If you scrape personal data about California residents, you may be required to respond to these requests, which creates a significant operational burden for large-scale scraping operations.

Key Court Cases That Define Scraping Legality

Case	Year	Court	Key Ruling	Impact
hiQ Labs v. LinkedIn	2022	9th Circuit	Scraping public data likely doesn't violate CFAA (preliminary injunction stage)	Landmark pro-scraping precedent; hiQ later lost on breach of contract
Van Buren v. United States	2021	Supreme Court	CFAA "exceeds authorized access" narrowly defined	Narrowed CFAA scope significantly
Meta v. Bright Data	2024	N.D. Cal.	Meta's terms did not bar logged-out scraping of public data (contract claim, not CFAA)	Limits ToS reach over logged-out scraping of public pages
Ryanair v. PR Aviation	2015	EU Court of Justice	ToS can restrict scraping of unprotected databases	EU precedent — ToS more enforceable
Feist v. Rural Telephone	1991	Supreme Court	Facts are not copyrightable	Foundational — protects factual data scraping
eBay v. Bidder's Edge	2000	N.D. Cal.	Excessive scraping can be trespass to chattels	Server overload creates liability
Clearview AI (Various)	2020–2025	Multiple	Scraping biometric data violates privacy laws	GDPR/BIPA violations for facial data

Scraping Legality by Data Type

The type of data you are scraping is one of the strongest predictors of legal risk. Public factual data carries the lowest risk, while personal biometric data carries the highest.

Data Type	Legal Risk	Key Concerns	Practical Advice
Public business data (company names, addresses)	Low	Minimal — factual data, not copyrightable	Generally safe; respect rate limits
Product pricing and availability	Low–Medium	Copyright on database structure possible	Safe for comparison; avoid reproducing entire databases
Public profiles (LinkedIn, etc.)	Medium	GDPR for EU subjects; ToS violations	Use for legitimate B2B purposes; comply with GDPR
News articles and blog posts	Medium–High	Copyright on creative content	Fair use for analysis; don't republish full content
User-generated content (reviews, comments)	Medium	Copyright belongs to users; ToS issues	Aggregate analysis OK; don't republish verbatim
Behind-login content	High	CFAA authorization issues; ToS breach	High risk — get legal advice before proceeding
Personal data (emails, phone numbers)	High	GDPR, CCPA, state privacy laws	Must have lawful basis; provide opt-out
Biometric data (photos for recognition)	Very High	BIPA, GDPR Art. 9, specific biometric laws	Avoid entirely without explicit consent
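
The table above can also be encoded as a simple lookup for triaging projects. The following is a minimal sketch, not a legal test: the dictionary keys, the `Risk` levels, and the "High or above needs legal review" threshold are all assumptions made for illustration, and split ratings such as Low to Medium are rounded up to the higher level.

```python
from enum import IntEnum


class Risk(IntEnum):
    """Ordered risk levels; split ratings in the table use the higher level."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    VERY_HIGH = 4


# Keys are hypothetical labels for the rows of the data type table.
DATA_TYPE_RISK = {
    "public_business_data": Risk.LOW,
    "product_pricing": Risk.MEDIUM,          # table: Low to Medium
    "public_profiles": Risk.MEDIUM,
    "news_articles": Risk.HIGH,              # table: Medium to High
    "user_generated_content": Risk.MEDIUM,
    "behind_login_content": Risk.HIGH,
    "personal_data": Risk.HIGH,
    "biometric_data": Risk.VERY_HIGH,
}


def project_risk(data_types):
    """Return the highest risk among the data types a project collects."""
    if not data_types:
        raise ValueError("list at least one data type")
    return max(DATA_TYPE_RISK[t] for t in data_types)


def needs_legal_review(data_types):
    """Flag projects that reach High or above (an assumed threshold)."""
    return project_risk(data_types) >= Risk.HIGH
```

A project is rated by its riskiest data type, so a price monitor that also collects contact emails would be treated as High rather than Low to Medium.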

Scraping Legality by Jurisdiction

Web scraping legality varies significantly by country. The United States provides the most permissive environment for scraping public data, while the European Union imposes stricter requirements through the GDPR, and some countries have additional database protection laws.

Jurisdiction	Public Data Scraping	Personal Data	Key Laws	Notes
United States	Generally legal	CCPA (California)	CFAA, Copyright Act	Most permissive after hiQ ruling
European Union	Legal with restrictions	Strict (GDPR)	GDPR, Database Directive	Database rights add extra protection
United Kingdom	Legal with restrictions	Strict (UK GDPR)	CMA, UK GDPR, CDPA 1988	Post-Brexit, similar to EU
Australia	Generally legal	Moderate (Privacy Act)	Privacy Act, Copyright Act, Criminal Code Act 1995	Criminal Code computer offences cover unauthorised access
Canada	Legal with restrictions	Strict (PIPEDA)	PIPEDA, Copyright Act	Similar to EU approach
Japan	Generally legal	Moderate (APPI)	APPI, Copyright Act	2020 amendment increased data protections
Brazil	Legal with restrictions	Strict (LGPD)	LGPD	GDPR-inspired, strong personal data protection
India	Generally legal	Evolving (DPDP Act)	IT Act, DPDP Act 2023	New privacy law still being implemented

Practical Compliance Framework

Based on current case law and regulatory guidance, here is a practical compliance framework for web scraping operations. Following these guidelines does not guarantee legal protection, but it significantly reduces risk and demonstrates good faith.

1. Only scrape publicly available data. Avoid scraping behind login walls, paywalls, or any access restriction. The hiQ ruling addresses only data that is available to anyone on the open internet. The moment you circumvent any access control, the CFAA risk increases dramatically.

2. Respect robots.txt. While robots.txt is not legally binding in most jurisdictions, respecting it demonstrates good faith and reduces the likelihood of legal action. Ignoring robots.txt can be used as evidence of intentional disregard for the website's wishes in trespass or unfair business practice claims.

3. Implement reasonable rate limiting. Scraping at high rates that degrade website performance can constitute trespass to chattels (as in eBay v. Bidder's Edge). Space your requests to avoid impacting the target server's performance. A good rule of thumb is no more than one request per second per domain for most websites.

4. Comply with GDPR/CCPA for personal data. If you are scraping personal data, you must have a lawful basis for processing under applicable privacy laws. For B2B data, legitimate interest is the most common basis, but it requires a documented balancing test. Maintain records of your processing activities and provide opt-out mechanisms.

5. Do not reproduce copyrighted creative content. Scraping factual data (prices, specifications, business information) carries low copyright risk. Scraping and republishing creative content (articles, images, descriptions) carries high risk. If you scrape creative content, use it for analysis only, not republication.

6. Document your compliance efforts. Maintain records of your scraping policies, rate limiting configurations, robots.txt compliance, GDPR assessments, and any legal review. If challenged, these records demonstrate that you operated in good faith and made reasonable efforts to comply with applicable laws.
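
Guidelines 2 and 3 can be enforced in code before any request is sent. The following is a minimal sketch using Python's standard `urllib.robotparser`. The `PoliteGate` class, the `example-research-bot` user agent, and the choice to refuse URLs on domains whose robots.txt has not been loaded are assumptions made for illustration, and the one-second default mirrors the rule of thumb in guideline 3. Downloading robots.txt and the pages themselves is left to the caller.

```python
import time
from urllib import robotparser
from urllib.parse import urlsplit

USER_AGENT = "example-research-bot"  # hypothetical user agent string


class PoliteGate:
    """Checks robots.txt rules and spaces out requests per domain."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests to one domain
        self._clock = clock
        self._sleep = sleep
        self._robots = {}        # domain -> RobotFileParser
        self._last_request = {}  # domain -> time of the last request

    def load_robots(self, domain, robots_txt):
        """Parse robots.txt text that the caller has already downloaded."""
        parser = robotparser.RobotFileParser()
        parser.parse(robots_txt.splitlines())
        self._robots[domain] = parser

    def allowed(self, url):
        """True only if robots.txt was loaded for the domain and permits the URL."""
        parser = self._robots.get(urlsplit(url).netloc)
        return parser is not None and parser.can_fetch(USER_AGENT, url)

    def wait_turn(self, url):
        """Sleep until min_interval has passed since the last request to the domain."""
        domain = urlsplit(url).netloc
        last = self._last_request.get(domain)
        if last is not None:
            remaining = self.min_interval - (self._clock() - last)
            if remaining > 0:
                self._sleep(remaining)
        self._last_request[domain] = self._clock()
```

A crawler would call `allowed(url)` first, skip the URL if it returns False, and call `wait_turn(url)` immediately before each request. The clock and sleep functions are injectable so the spacing logic can be tested without real delays.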

For teams using scraped B2B data for outreach, platforms like Sales.co handle data compliance as part of their integrated workflow, ensuring that contact data used in campaigns meets privacy requirements and was collected through compliant methods.

Common Scraping Scenarios: Legal Analysis

Scraping product prices for comparison: Generally legal. Price data is factual and not copyrightable. The main risks are ToS violations and trespass to chattels if scraping at high volume. Use reasonable rate limiting and you are in a strong legal position.

Scraping LinkedIn for B2B leads: Medium risk. Profiles visible without logging in are publicly available data (supported by hiQ), but LinkedIn's ToS prohibit scraping, and hiQ itself was later found to have breached LinkedIn's User Agreement. GDPR applies if scraping EU profiles. Use the data for legitimate B2B purposes, provide opt-out, and do not scrape behind the login wall.

Scraping news articles for sentiment analysis: Low to medium risk. The scraping itself is likely legal; the use of the content determines copyright risk. Extracting sentiment is a transformative use and has a stronger fair use argument than republishing summaries. Do not reproduce substantial portions of articles.

Scraping real estate listings: Low risk for factual data (addresses, prices, square footage). Medium risk for creative descriptions and photographs. Aggregate data analysis is lower risk; reproducing full listings, including their descriptions and photographs, is not.

Scraping social media posts: Medium risk. Public posts are publicly available, but privacy expectations vary by platform and user. Aggregate analysis is lower risk than individual targeting. GDPR applies to EU users regardless of platform.

The Bottom Line

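Web scraping is legal in most circumstances when you scrape publicly available data, respect robots.txt, implement reasonable rate limits, and comply with applicable privacy laws. The hiQ v. LinkedIn ruling provides strong precedent for scraping public data in the United States, although it binds only courts in the Ninth Circuit and addresses the CFAA rather than contract, copyright, or privacy claims. Other jurisdictions, particularly the EU and UK, impose stricter conditions through data protection law and database rights.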

The legal risk increases significantly when you scrape behind access controls, collect personal data without a lawful basis, reproduce copyrighted creative content, or overwhelm target servers with excessive request volumes. Each of these scenarios introduces a different legal framework with different requirements and penalties.

The practical approach is to assess each scraping project against the five legal frameworks (CFAA, copyright, ToS, privacy, trespass), document your compliance efforts, and consult an attorney for high-risk scenarios. The law in this area is still evolving, with new court decisions and regulations being issued regularly. Staying informed and maintaining good-faith compliance practices is the best risk mitigation strategy available.
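
One way to make that assessment repeatable is to keep a small structured record per project. The following is a minimal sketch, assuming a hypothetical `ScrapingAssessment` record with one free-text finding per framework; the field names and the JSON layout are illustrative and not a required or standard format.

```python
import json
from dataclasses import dataclass, field, asdict

# The five frameworks named in this guide.
FRAMEWORKS = ("cfaa", "copyright", "tos", "privacy", "trespass")


@dataclass
class ScrapingAssessment:
    project: str
    reviewed_on: str                              # ISO date, supplied by the reviewer
    findings: dict = field(default_factory=dict)  # framework -> short note

    def record(self, framework, note):
        """Store the reviewer's note for one framework."""
        if framework not in FRAMEWORKS:
            raise ValueError(f"unknown framework: {framework}")
        self.findings[framework] = note

    def missing(self):
        """Frameworks not yet assessed, in the order listed in FRAMEWORKS."""
        return [f for f in FRAMEWORKS if f not in self.findings]

    def to_json(self):
        """Serialize the record so it can be filed with other compliance documents."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)
```

A review is complete when `missing()` returns an empty list, and the JSON output can be stored alongside rate limiting configurations and any legal review notes.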