How much personal information do you share on your social media profile pages?
Name, location, age, job role, legal status, headshot? The amount of data people are comfortable posting online varies.
But most people accept that whatever we put on our public profile page is out in the public domain.
So, how would you feel if all of your information was cataloged by a hacker and put into a monster spreadsheet with many entries, to be sold online to the highest-paying cyber-criminal?
That’s what a hacker calling himself Tom Liner did last month “for fun” when he compiled a database of 700 million LinkedIn users from all over the world, which he’s selling for around $5,000 (£3,600; €4,200).
The incident, and other similar cases of social media scraping, have sparked a fierce debate about whether the basic personal information we share publicly on our profiles should be better protected.
In the case of Mr. Liner, his latest exploit was announced at 08:57 BST in a post on a notorious hacking forum.
It was an unusually civilized hour for hackers, but of course we have no idea which time zone the hacker who calls himself Tom Liner lives in.
“Hi, I have 700 million 2021 LinkedIn records”, he wrote.
Included within the post was a link to a sample of 1,000,000 records and an invitation for other hackers to contact him privately and make him offers for his database.
Understandably the sale caused a stir within the hacking world and Tom tells me he’s selling his haul to “multiple” happy customers for around $5,000 (£3,600; €4,200).
He won’t say who his customers are, or why they might want this information, but he says the data is likely being used for further malicious hacking campaigns.
The news has also set the cyber-security and privacy world alight with arguments about whether we should be worried about this growing trend of mega scrapes.
What’s important to know here is that these databases aren’t being created by breaking into the servers or websites of social networks.
They are largely constructed by scraping the public-facing surface of platforms, using automatic programs to take whatever information is freely available about users.
In theory, most of the data being compiled could be found by simply picking through individual social media profile pages one by one. In practice, though, it would take multiple lifetimes to gather as much data as the hackers are able to.
So far this year, there have been at least three other major “scraping” incidents.
In April, a hacker sold another database of around 500 million records scraped from LinkedIn.
In the same week, another hacker posted a database of scraped information from 1.3 million Clubhouse profiles on a forum for free.
Also in April, 533 million Facebook user details were compiled from a mix of old and new scraping before being given away on a hacking forum with an invitation for donations.
The hacker who says he’s responsible for that Facebook database calls himself Tom Liner.
I spoke with Tom over three weeks via Telegram, a cloud-based instant messenger app. Some messages and even missed calls came in the middle of the night, and others during working hours, so there was no clue as to his location.
The only clues to his normal life were when he said he couldn’t talk on the phone as his wife was sleeping, and that he had a daytime job and hacking was his “hobby”.
Tom told me he created the 700 million LinkedIn database using “almost the exact same technique” that he used to create the Facebook list.
He said: “It took me several months to do. It was very complex. I had to hack the API of LinkedIn. If you do too many requests for user data at one time then the system will permanently ban you.”
API stands for application programming interface and most social networks sell API partnerships, which enable other companies to access their data, perhaps for marketing purposes or for building apps.
Tom says he found a way to trick the LinkedIn API software into giving him the large tranche of records without setting off alarms.
Privacy Shark, which first discovered the sale of the database, examined the free sample and found it included full names, email addresses, gender, phone numbers, and industry information.
LinkedIn insists that Tom Liner didn’t use their API but confirmed that the dataset “includes information scraped from LinkedIn, as well as information obtained from other sources”.
It adds: “This wasn’t a LinkedIn data breach and no private LinkedIn member data was exposed. Scraping data from LinkedIn is a violation of our Terms of Service and we are constantly working to make sure our members’ privacy is protected.”
In response to its April data scare, Facebook also brushed off the incident as an old scrape.
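For readers wondering what the “alarms” Tom mentions might look like, here is a minimal Python sketch of the kind of request limit he describes: too many requests in a short period get refused, and repeat offenders are banned. It is purely illustrative and is not LinkedIn’s actual system; the class name, thresholds and ban rule are all assumptions made for this example.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical per-client request limiter (illustrative only)."""

    def __init__(self, max_requests, window_seconds, strikes_before_ban=3):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.strikes_before_ban = strikes_before_ban
        self.history = defaultdict(deque)  # client -> timestamps of accepted requests
        self.strikes = defaultdict(int)    # client -> number of refused requests
        self.banned = set()                # clients that are permanently blocked

    def allow(self, client_id, now=None):
        """Return True if the request is accepted, False if it is refused."""
        now = time.monotonic() if now is None else now
        if client_id in self.banned:
            return False
        recent = self.history[client_id]
        # Forget accepted requests that have fallen out of the time window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            # Over the limit: record a strike, and ban after too many strikes.
            self.strikes[client_id] += 1
            if self.strikes[client_id] >= self.strikes_before_ban:
                self.banned.add(client_id)
            return False
        recent.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=2, window_seconds=10.0, strikes_before_ban=2)
results = [limiter.allow("client-a", now=t) for t in (0.0, 1.0, 2.0)]
print(results)
```

In this sketch, a client that paces its requests stays under the limit, while one that sends bursts collects strikes and is eventually blocked for good, which is the behaviour Tom says he had to work around.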