How data brokers track your digital footprint, and profit from it

Sriram Sharma July 27, 2017

“I get pissed off with the number of calls that I am getting, because this DND registration has not worked at all. They don’t give a damn. I have DND on my phone, but still I get calls. So it’s not working at all,” says a livid Anil Raina, on a phone call discussing how much phone spam he gets.

I found Raina’s number on a sample spreadsheet of Bengaluru-based HNIs (high net worth individuals) provided by a Noida-based data broker. The spreadsheet, on offer to any client who asks for a sample, contained his personal details such as name, email, phone number, industry, and address. The selling price for details on 10,000 HNIs from Bengaluru was Rs 2,000. One lakh contacts would cost me just Rs 7,000.

Truecaller, an immensely popular caller ID app in India, is Raina’s method of coping with a daily barrage of phone calls from telemarketers. “It has become indispensable to me. I block four to five numbers every day. I must have blocked 200 numbers so far, still I get two to three calls a day,” he says. Raina reckons that his details probably got leaked through a job portal.

“I am surprised as to who and why this data is being collected, and for what purpose,” says Sankaranarayanan R, another individual whose details checked out from the HNI spreadsheet.

“I can slightly sense that my sons applied for some course. There are particular queries in that application form. And he comes and calls me, and asks me what is my income. So, I am not sure if there are such people (in educational institutions) who are leaking this information, or probably another group where there is a database collected for credit card. These are the two sources I can think of where data is being collated unscrupulously by people who have good or bad intentions. God only knows,” he says.

From the same data broker, FactorDaily was able to obtain sample spreadsheets of online shoppers, with details of client and consignee, home address, and phone numbers. The spreadsheet was accurate, with everyone we contacted confirming the veracity of the information contained within. The spreadsheet contained details of deliveries made by ecommerce companies such as Xiaomi, Limeroad, Paytm, Shopclues, and Yepme. The asking price for one lakh online shoppers on a spreadsheet was just Rs 3,000.

If you’ve ever gotten unsolicited phone calls, spam emails, and SMSes from marketers and spammers, your details probably came from one of these data brokers. And it’s quite likely that they will sell this data to pretty much anyone who has the money for it.

Data broking: A multi-billion dollar industry

“It’s very easy to get out of a ping tree. Names, email addresses, last name, city, etc,” says Manan Shah, cofounder and CEO at Avalance Global Solutions. He explained the concept of a “ping post”, or “ping tree”, a type of lead distribution software that’s used by the data broking industry. “A ping tree is a model for lead generation, it’s a model for data. Data gets uploaded here, there are bids for it, and (it’s) sold to the highest seller.” This Quora post has a good explainer on the same. It’s a thin line between marketing tech and data broking. A 2014 Guardian article, which examined how ping tree software was used by the payday loan industry, called it “breathtakingly rapacious and amoral”.
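
As a rough illustration of the bidding Shah describes, the sketch below models a ping tree in Python. It is a simplified assumption of how such software might work, not a description of any real product: the buyer names, bid amounts, and the city filter are all hypothetical, and the lead is made-up test data.

```python
from dataclasses import dataclass


@dataclass
class Buyer:
    name: str
    max_bid: float       # most this buyer will pay for one lead (hypothetical)
    wanted_cities: set   # hypothetical filter on which leads the buyer wants


def ping(buyer, summary):
    """Return the buyer's bid for a partial lead, or 0.0 if not interested."""
    if summary["city"] in buyer.wanted_cities:
        return buyer.max_bid
    return 0.0


def run_ping_tree(lead, buyers):
    """Offer a lead to every buyer and return (winner, bid), or None."""
    # "Ping": buyers only see a partial summary of the lead before bidding.
    summary = {"city": lead["city"]}
    bids = [(buyer.name, ping(buyer, summary)) for buyer in buyers]
    winner, best_bid = max(bids, key=lambda bid: bid[1])
    if best_bid <= 0:
        return None  # nobody bid, so the lead is not sold
    # "Post": in a real system the full record would now go to the winner.
    return winner, best_bid


buyers = [
    Buyer("Lender A", 3.0, {"Mumbai"}),
    Buyer("Lender B", 5.0, {"Bengaluru"}),
    Buyer("Lender C", 4.0, {"Bengaluru", "Delhi"}),
]
lead = {"name": "Test Person", "phone": "0000000000", "city": "Bengaluru"}
result = run_ping_tree(lead, buyers)
unsold = run_ping_tree({"name": "Test Person", "city": "Chennai"}, buyers)
```

The point of the sketch is that the person whose details are being auctioned never appears as a participant: the only parties are the uploader and the bidders.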

The Data and Marketing Association estimates that the data-driven market economy added $202 billion in revenue to the US economy in 2014. There are an estimated 5,000 data brokers worldwide, and nearly 10 million open datasets published by government agencies and non-governmental organisations (NGOs), according to Gartner.

An infographic from FTC’s report on the data broking industry, which explains how a data broker collects information

In an industry that vast, not all data brokers operate at the level of spreadsheets, and almost no one refers to themselves as a data broker. Depending on their degree of sophistication, they describe themselves as a “customer engagement company” or a “consumer data collection company”, or say they are in “information services”, “consumer risk management”, or “marketing automation”.

A 2014 report by the US FTC investigated nine companies in this space — Acxiom, Corelogic, Datalogix, eBureau, ID Analytics, Intelius, PeekYou, Rapleaf, and Recorded Future. The report found that these companies package the data as marketing, fraud and risk mitigation, and “people search” products, collecting consumer data from many sources, without any consent from the people whose personal information is being profiled. They’re also incredibly well networked internally — seven of the nine data brokers provide data to each other, the study notes.

Data brokers can combine multiple data sets, including online and offline data, to build a strong profile on a person based on commercial, government, and publicly available sources. This information is analysed to make potentially sensitive inferences, related to ethnicity, income, and health conditions. The report also examines how data brokers that operate as marketing tech companies are able to target consumers across multiple websites using cookies, based on their search or browsing history.
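
To make the idea of combining data sets concrete, here is a minimal Python sketch that merges records from two sources into one profile per person. It assumes the phone number is the matching key and uses made-up field names and test data; the FTC report does not say which keys or fields any particular broker uses.

```python
def build_profiles(*datasets):
    """Merge records from several sources into one profile per phone number."""
    profiles = {}
    for records in datasets:
        for record in records:
            # Assumption: the phone number identifies the same person
            # across otherwise unrelated data sets.
            profile = profiles.setdefault(record["phone"], {})
            profile.update(record)
    return profiles


# Hypothetical sources with obviously fake test data.
shopping = [
    {"phone": "0000000001", "name": "Test Person", "last_purchase": "phone case"},
]
job_portal = [
    {"phone": "0000000001", "industry": "IT", "city": "Bengaluru"},
]

profiles = build_profiles(shopping, job_portal)
```

Neither source alone says much, but the merged record ties a name to a purchase, an industry, and a city, which is the kind of profile the report describes.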

Companies that don’t invest in database security also leak user data, says Shah. “Data can be exfiltrated through social engineering as well. Internal teams can also leak data,” he says. Data is also easily scraped from websites with poor security policies. “Sometimes there is zero percent security on a database. Banks, of course, know their security, but on a lot of hosting sites, the FTP (file transfer protocol) login is admin/admin,” he says.

A fool and his data are soon parted

The more unscrupulous data brokers can also create large databases through fraudulent websites. A week ago, we’d spotted one that impersonates Flipkart, and promises implausible GST sale offers. We had also pointed out a fake WhatsApp website that had a similar modus operandi in a news story earlier this month.

“There’s gonna be a Kaun Banega Crorepati scam for sure. Every time there’s a KBC show, there is a flurry of fake mails from scammers saying you’ve been selected, there’s a percentage you will need to pay us to participate, as there will be a minimum amount you win. They’ll ask for a security deposit,” says Shah.

True to his claim, we found a number of sites (http://www.kbcsonyregistration.com/), (kbcofficial.in) and (kbcliv.in), which seem to be potential scams, trademark infringement notwithstanding. The Whois details on these domains don’t lend much confidence to their legitimacy, and certainly don’t compare to those of the official domain.

The fake site beats the original site on Google search for this important keyword

In an amazing demonstration of SEO prowess, kbcliv.in is the second result in Google for the “KBC 2017” keyword, following the official site, kbc.sonyliv.com, on an anonymised search. More crucially, for the keyword “kbc official website”, it’s surprisingly the first search result. KBC’s official site received 19.8 million registrations, so there’s no telling how many people have relinquished their data on kbcliv.in. Sony Entertainment Television confirmed to us over email that these domains do not belong to them. Registration is only through www.setindia.com, or by downloading the Sony Liv app.

No legal recourse?

“There are no data protection laws in India save for Section 43A and Section 72A of the IT Act. Neither of which have been used as far as I know to punish anyone who has disclosed any material containing breach of personal information,” says Mishi Choudhary, legal director, Software Freedom Law Center, in an email discussing the data broking industry with FactorDaily.

Sharanya G Ranga, corporate lawyer at Advaya Legal, says that there is no legal framework specifically dealing with data brokers in India. Data brokering is still a largely grey area and not quite regulated, she says. “We have a very rudimentary (and almost outdated) framework relating to data protection under the Information Technology Act, 2000 and the rules relating to reasonable practices and procedures and sensitive personal data or information that are in force since 2011.”

She also discussed how something as ubiquitous as an Android smartphone, with all its free apps, could prove to be a leaky bucket, privacy-wise. “Most apps are collecting tons and tons of data about each user. Without a fair amount of user information permitted by the user, the app’s functionality is reduced. Of course, the standard disclaimers, privacy policy and consent terms are there. While it may be argued that user consent has been freely provided, can such consent be treated as specific and informed consent?” she asks.

An infographic from the FTC report on how data brokers help businesses target users for online advertising

From a citizen’s perspective, one of the key issues is around consent for collection, storage and use of her personal data and ensuing disclosures, Ranga says. “Some data mining bloke somewhere will have our personal profiles completely mapped — who we are, what we do, where do we live, what we buy, where we travel, etc. As a user, if I give my consent to a particular entity to SMS me promotional alerts, I have no control on who gets that information, how that is mined, broken down and shared, commissioned, and monetised with my personal data serving as ‘raw material’! Also, even if user consent is obtained under contract by a user clicking the ‘I accept’ box on the website/app’s terms and conditions, how does the user get to know about any misuse of her data? How will she monitor that? At the end of the day, it is my data, but I have no control over who has access to it, where it is disseminated, the contours of such usage, etc.”

How other countries have tackled this issue

When it comes to data protection regulations, the gold standard is the EU GDPR (the European Union’s General Data Protection Regulation), says Ranga. The US Federal Trade Commission even penalised certain data brokers for unauthorised usage of data in November 2016, but most data protection provisions seem to be state-specific, she said.

“The EU-US Privacy Shield also comes to mind but its status is not clear in the Trump era, so that may not be a great example for now. While the GDPR does not define data brokering as a concept, it refers to the ‘right to be forgotten’ where the user has the right to get his data deleted under certain conditions,” Ranga says.

In the US, the assumption is that processing of data is permitted, whereas in the EU, all personal data processing needs a legal justification, says Choudhary. “The flow of data between sources, data brokers and their customers needs its own legal justification, sometimes requiring consent. These limits apply even to the collection of public records and publicly available data,” she said, speaking about the EU’s data protection laws.

“In the US, there is no single, comprehensive national law regulating the collection and use of personal data. There is a patchwork of sectoral laws, federal and state laws and regulations, guidelines and frameworks. There has been several calls for action by FTC, but nothing so far,” she says.

In India, cases in respect of the right to be forgotten have recently come up in three high courts of India, Ranga says. “While the Gujarat High Court partially granted the right in favour of the petitioner, the Karnataka High Court dismissed it and the case before the Delhi High Court is still pending.”

Ranga is hopeful about the ongoing privacy debate in the Supreme Court, and about media reports that a data protection bill is in the works. “Hopefully, this paves the way for a robust legislative framework for data protection in tune with the times. Why is it essential? Take the example of Acxiom, one of the leading data brokers in the US which reportedly has data of over 500 million users worldwide. Come to think of it, this is the data of almost 7% of the population of the world, with one company making profits out of it!” she says.

How to reduce your digital footprint

For those looking to reduce their digital footprint, Choudhary recommends reading privacy policies, using ad-blockers, using privacy-respecting applications like Signal, auditing your social media accounts, and disabling third-party apps that you may not be using.

“While it’s not simple to ensure no personal data is shared, we can also try being mindful of what we share,” Choudhary says. “We frequently share our personal details like mobile phone numbers with any restaurant, shop, office reception that asks us. We should ask why such information is needed and not share it if not for a defined purpose.”

She also exhorts users to demand better practices, and transparency from data brokers. “Require data brokers to disclose the names of their sources of data, demand companies to provide notice about how consumer data is shared and how,” she says.

Updated at 12:07 pm on July 27th with inputs from Sony Entertainment Television.
Lead visual: Nikihl Raj

Disclosure: FactorDaily is owned by SourceCode Media, which counts Accel Partners, Blume Ventures and Vijay Shekhar Sharma among its investors. Accel Partners is an early investor in Flipkart. Vijay Shekhar Sharma is the founder of Paytm. None of FactorDaily’s investors have any influence on its reporting about India’s technology and startup ecosystem.