Data gathering, data extraction, data collection, web scraping: whatever you call it, this common practice goes by many names. The Internet is a resource, and it only makes sense to use it as such. There are many tools for this, such as Smartproxy, a provider that offers premium proxies and an award-winning service that helps its clients safely conduct market research, localize their ads, and audit websites.
However, everything has its limit, and web scraping can become a cautionary tale if it doesn’t comply with the GDPR, or, most recently, with the CCPA.
What is the CCPA and how is it different from the GDPR?
The California Consumer Privacy Act has been in effect since January 1, 2020. It’s meant to protect consumers’ privacy and their personal details from online exploitation. It gives people the right to know whether their information is collected and how it is used, to access and request a copy of that data, and to request its deletion. Even though its goal is similar to that of the GDPR, it has a few crucial differences worth noting.
The GDPR has a broader territorial reach, whilst the CCPA protects only California residents. Besides this, the GDPR regulates data controllers, while the CCPA regulates for-profit businesses. The two also differ in how they treat collection: the GDPR lets businesses collect data about their consumers only under certain circumstances, while the CCPA requires businesses to let users opt out of the sale of their data by clearly offering this choice on their websites.
The CCPA and its terms
Let’s look at the CCPA and its terms more closely.
According to the legislation, personal information is anything that relates to or identifies you, such as your name, postal address, IP address, social security number, passport number, and any financial, social, or health insurance information. Your internet activity (browser history, interactions with apps) and any inferences about your beliefs and preferences drawn from your information can also be classified as personal information.
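For anyone scraping data, the categories above translate into a practical step: dropping personal fields before a record is stored. The snippet below is a minimal sketch of that idea, not a compliance tool. The field names in PERSONAL_FIELDS and the helper functions are hypothetical and would need to match your own data schema, and the IPv4 pattern is deliberately simple.

```python
import re

# Hypothetical field names chosen for illustration; adapt to your own schema.
PERSONAL_FIELDS = {
    "name",
    "postal_address",
    "ip_address",
    "ssn",
    "passport_number",
    "browser_history",
}

# Simple IPv4 pattern for illustration; it does not validate octet ranges.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def strip_personal_fields(record):
    """Return a copy of the record without keys listed in PERSONAL_FIELDS."""
    return {k: v for k, v in record.items() if k.lower() not in PERSONAL_FIELDS}


def mask_ip_addresses(text):
    """Replace anything that looks like an IPv4 address in free text."""
    return IPV4_PATTERN.sub("[redacted-ip]", text)


def sanitize(record):
    """Drop personal fields, then mask IP addresses in remaining string values."""
    cleaned = strip_personal_fields(record)
    return {
        k: mask_ip_addresses(v) if isinstance(v, str) else v
        for k, v in cleaned.items()
    }


scraped = {
    "product": "Widget",
    "price": 9.99,
    "name": "Jane Doe",
    "ip_address": "203.0.113.7",
    "note": "seen from 203.0.113.7",
}
print(sanitize(scraped))
```

A key-based filter like this only catches fields you already know about, so free-text values still deserve a manual review.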
The CCPA covers for-profit businesses that do business in California and meet at least one of several thresholds, such as having gross annual revenue above $25 million or earning at least half of their annual revenue by selling consumers’ personal information. The penalties for not respecting the act include fines of up to $7,500 for each intentional violation.
It’s no wonder that Google and Facebook opposed this bill: it’s hard to track all the data that flows through their systems, and the consequences don’t stop at the aforementioned financial sanctions. The CCPA also gives more power to Internet users, as they can sue companies that don’t comply with the legislation.
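The coverage test and the fine arithmetic can be sketched in a few lines of Python. This is a rough illustration under stated assumptions: the two revenue criteria are treated as alternatives (meeting either one is enough), the $7,500 figure is treated as a per-violation ceiling for intentional violations, and any other thresholds in the act are left out. The Business class and function names are hypothetical.

```python
from dataclasses import dataclass

REVENUE_THRESHOLD_USD = 25_000_000      # gross annual revenue threshold
DATA_SALES_SHARE_THRESHOLD = 0.5        # share of revenue from selling personal info
MAX_INTENTIONAL_PENALTY_USD = 7_500     # assumed ceiling per intentional violation


@dataclass
class Business:
    operates_in_california: bool
    gross_annual_revenue_usd: float
    revenue_from_selling_personal_info_usd: float


def is_likely_covered(business):
    """Rough check: does business in California and meets either revenue criterion."""
    if not business.operates_in_california:
        return False
    gross = business.gross_annual_revenue_usd
    over_revenue = gross > REVENUE_THRESHOLD_USD
    # Avoid dividing by zero for a business with no revenue.
    share = business.revenue_from_selling_personal_info_usd / gross if gross > 0 else 0.0
    return over_revenue or share >= DATA_SALES_SHARE_THRESHOLD


def max_intentional_penalty(violations):
    """Upper bound on fines if every violation is found to be intentional."""
    return violations * MAX_INTENTIONAL_PENALTY_USD


data_broker = Business(True, 1_000_000, 600_000)
print(is_likely_covered(data_broker))
print(max_intentional_penalty(3))
```

Because the fine applies per violation, exposure grows linearly with the number of affected records, which is why even a small per-violation amount matters at scraping scale.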
As always, there is one major highlight to note – the CCPA gives a clear definition of personal data and how it’s protected. However, here’s the catch: the act does not touch on the issue of public data.
Data scraping has always been tricky, and its legality can be questioned in certain cases. This can be seen from disputes that reached the courtroom, such as Facebook’s claims against Power Ventures, LinkedIn’s case against hiQ Labs, and Ryanair’s well-known lawsuit against Expedia.
Facebook faced Power Ventures in court in 2009, claiming that the company had breached the U.S. Computer Fraud and Abuse Act. Power Ventures had scraped the website for information that it used on its own aggregation page. The court decided that the claim was legitimate and that Power Ventures had to stop. The case is famous because it raises questions about what public data really is and whether it can be used by competitors.
The LinkedIn versus hiQ Labs case was very similar yet yielded a different outcome. In this case, information on LinkedIn was deemed public, which meant that it could be scraped by other parties. The case indicated that scraping public data can be legal if it is done carefully and follows the requirements of the Computer Fraud and Abuse Act (CFAA) and other legislation.
Seeing as both the CCPA and the GDPR have a defined territorial reach, can data laws be applied internationally? The Irish company Ryanair sued Expedia in 2019, and the court established that the CFAA (a U.S. law) can also be applied to American companies acting internationally. The companies settled the case between themselves, and the details remain private.
Does the CCPA pose a threat to web scraping?
The CCPA does not change the web scraping game – Publicly Available Information stays public and open to scrutiny. Nevertheless, privacy legislation is a step in the right direction when it comes to protecting consumers’ rights, and other states besides California are also expected to issue their own laws this year.
The CCPA and future legislation will not make a big difference to web scraping, as gathering public information is a common and legal practice. Market leaders like Smartproxy let their clients do it successfully and with ease. They offer a pool of over 40 million proxy IPs, over 195 locations to choose from, and the best prices in the market. As long as private information is not involved, analyzing competition and collecting data will get a green light.