Bengaluru’s traffic is soul-sucking. Going from Bannerghatta Road to Koramangala on the other side of town in peak hours takes close to 90 minutes and a lot of patience. Yet, Shailendra Kumar Sharma makes a round trip every day without fail. His new job is like that.
Sharma leads a team of about 50 engineers whose job is to train algorithms that will make some 40 million users of news app DailyHunt spend more time on it reading news, watching videos and, sometimes, killing time in traffic. At stake is the future of India’s biggest news app, by one account a billion-dollar company in the making.
About 40 million people, or nearly 10% of India’s internet users, use DailyHunt in a month. An average user spends about 20 minutes on it. The plan is to grow to about 250-300 million monthly active users and 100 million daily active users in 2-3 years.
That will put DailyHunt in the league of Facebook, which has over 250 million users, or WhatsApp, which has over 200 million users in the country. If all goes well, DailyHunt will become one of the few homegrown companies that challenge the Google-Facebook duopoly on India’s digital advertising market, pegged at $1.2 billion.
To get there, DailyHunt will have to get a few things right: drive first-time internet users to download the app, increase engagement, grow video and hyper-local content and make users come back for more. All this will work only if they get one thing right: the content graph.
“We’re building the largest content graph of users,” says Sharma, vice president – personalisation and search at DailyHunt. The social graph, or the connection between users on a social network, popularised by Facebook, is a familiar concept for most people now. The content graph is a relatively new term for tying a user to his content consumption patterns.
Building a recommendation engine based on your content graph, and not on your friends’ tastes, is what content companies like Netflix are famous for: because you watched The Wire, you might also be interested in watching Luther, another police procedural that shares some of its cast with The Wire.
Newer companies like Snapchat are also building on these ideas. Last year, Evan Spiegel, the founder of Snapchat, pointed out in an opinion piece that the form of personalisation which relies on your past behaviour is a “far better predictor of what you’re interested in than anything your friends are doing”. China’s biggest content app, Toutiao, keeps its users hooked for as much as 74 minutes every day using a content graph-based recommendation engine.
“Toutiao and Facebook are the best in the industry when it comes to machine learning,” says Sandeep Amar, a top media executive. “Toutiao not only uses personalisation models but also takes into account geolocation and the user’s surroundings,” adds Amar, who is currently building Inaaj, a startup that uses artificial intelligence to serve better content to users.
At DailyHunt, similar efforts are underway at scale. “What you want to read may not be what your friends are reading,” says Sharma. That means understanding the user and his surroundings deeply. Sharma used to work for a US firm called Trilogy, where he led a team that had to analyse user behaviour and recommend products to buy. But this time around, it’s a lot more complicated because local languages are hard to crack.
Sunayana Sitaram, Senior Applied Scientist at Microsoft Research India, explains this well: “Unfortunately, there has not been a lot of data collected due to various reasons – most of the data is in languages that the US is interested in, not Indian languages.” The other challenge is the sheer number of languages in India. “There’s a big data scarcity. We don’t have anywhere close to the data that other European and Asian languages have,” she says. India has 22 officially recognised languages and thousands of dialects. According to the census of 2001, 422 million speak Hindi, 83 million Bengali, 74 million Telugu and 71 million Marathi. Some 150 million speak and understand English, a number second only to that of the US, but that market is in the stranglehold of Google and Facebook.
Shailendra Sharma, VP – User Engagement and Personalisation at DailyHunt. Photo: Rajesh Subramanian
Luckily for Sharma, DailyHunt has a large hoard of data from nearly 1,000 publisher partners in 14 different languages. Sharma and his team used about 32 million stories over 30 months to train their machine learning algorithms for 500 hours.
“We identify named entities like celebrities, or important places such as parliament house, locations, whether it’s a mass article or local article, with all the information, and end-user interests,” says Sharma. When stories come in, they are enriched using nearly 10,000 tags in each language. This, combined with an algorithm which learns a user’s past behaviour, can help predict articles that the user is most likely to read.
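As a rough illustration of how tags and past behaviour can combine into a prediction, here is a minimal sketch. It is not DailyHunt’s system: the tag names, the article ids and the scoring rule (the share of a user’s tag history that a candidate story covers) are all assumptions made for this example.

```python
from collections import Counter


def build_profile(read_articles):
    """Count how often each tag appears in the stories a user has read."""
    profile = Counter()
    for tags in read_articles:
        profile.update(tags)
    return profile


def score(profile, article_tags):
    """Share of the user's tag counts that the candidate story covers."""
    total = sum(profile.values())
    if total == 0:
        return 0.0
    return sum(profile[tag] for tag in set(article_tags)) / total


def rank(profile, candidates):
    """Return candidate story ids, highest score first."""
    return sorted(candidates, key=lambda aid: score(profile, candidates[aid]), reverse=True)


# Hypothetical reading history: one set of tags per story read.
history = [
    {"cricket", "virat-kohli", "national"},
    {"cricket", "ipl", "bengaluru"},
    {"bollywood", "national"},
]

# Hypothetical incoming stories, already enriched with tags.
candidates = {
    "a1": {"cricket", "ipl"},
    "a2": {"parliament", "national"},
    "a3": {"monsoon", "kerala"},
}

profile = build_profile(history)
ranking = rank(profile, candidates)
print(ranking)
```

A production system would learn these weights rather than count tags, but the shape is the same: enrich the story, summarise the user, then score one against the other.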
In March 2014, researchers at Google published a paper that helps machines understand the relationship between words at an unprecedented scale and speed. The technique, called Word2vec, uses a neural network to embed words as vectors so that machines can capture the context in which they are used. DailyHunt has extended the same technique to documents, so that a machine can tell the relationship between various articles.
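To make the idea of document-level vectors concrete, here is a small sketch. How DailyHunt extends Word2vec to documents is not described, so this example swaps in the simplest stand-in: averaging word vectors and comparing the averages with cosine similarity. The three-dimensional vectors are invented for illustration; real embeddings are learned from text and have hundreds of dimensions.

```python
import math

# Toy word vectors, made up for this example (not learned from data).
WORD_VECTORS = {
    "cricket": [0.9, 0.1, 0.0],
    "match": [0.8, 0.2, 0.1],
    "batsman": [0.9, 0.0, 0.1],
    "budget": [0.0, 0.9, 0.2],
    "tax": [0.1, 0.8, 0.1],
    "minister": [0.1, 0.7, 0.3],
}


def doc_vector(text):
    """Average the vectors of known words; unknown words are skipped."""
    vectors = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


sports_a = doc_vector("Batsman wins cricket match")
sports_b = doc_vector("Cricket match abandoned")
politics = doc_vector("Minister defends budget tax")

# The two sports headlines should sit much closer to each other
# than either does to the politics headline.
print(cosine(sports_a, sports_b), cosine(sports_a, politics))
```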
DailyHunt also tracks the user’s actions – clicks, swipes, time spent, shares and so on – closely. Every minute, a user’s profile is updated. “We call it near real-time profiles,” says Sharma. Marrying this understanding with content is what works.
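One common way to keep such a profile fresh is to let old interests fade while new actions add weight. The sketch below is an assumption-laden illustration, not DailyHunt’s implementation: the event weights, the one-hour half-life and the `UserProfile` class are all hypothetical, and it updates on every event rather than once a minute.

```python
# Hypothetical weights: a share signals more interest than a click,
# and swiping a story away counts slightly against its tags.
EVENT_WEIGHTS = {"click": 1.0, "share": 3.0, "swipe_away": -0.5}

# Hypothetical half-life: an interest loses half its weight every hour.
HALF_LIFE_SECONDS = 3600.0


class UserProfile:
    """Per-tag interest scores that decay exponentially over time."""

    def __init__(self):
        self.scores = {}
        self.last_update = None

    def _decay(self, now):
        """Shrink all scores according to the time since the last update."""
        if self.last_update is not None:
            factor = 0.5 ** ((now - self.last_update) / HALF_LIFE_SECONDS)
            for tag in self.scores:
                self.scores[tag] *= factor
        self.last_update = now

    def record(self, event, tags, now):
        """Decay existing scores, then add the event's weight to each tag."""
        self._decay(now)
        weight = EVENT_WEIGHTS[event]
        for tag in tags:
            self.scores[tag] = self.scores.get(tag, 0.0) + weight


profile = UserProfile()
profile.record("click", ["cricket"], now=0.0)
profile.record("share", ["budget"], now=3600.0)  # one half-life later
print(profile.scores)
```

After the second event, the older cricket click has faded to half its weight while the fresh share dominates, which is the behaviour a near real-time profile is meant to capture.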
The personalisation engine also powers notifications that are sent to users. Earlier, users were notified of stories handpicked by campaign managers. But now, they use an assistive console that suggests stories to campaign managers. DailyHunt also runs a set of notifications generated only by machines. In cases like breaking news, humans bring better results. But otherwise, the machines work better. “Eventually, we want the AI to do all notifications. The system is showing better promises and eventually, it will take over,” said Sharma.
For business, the personalisation engine becomes critical. “I’m trying to fight every other eyeball channel. It’s not about an app or three apps. I’m trying to give the user a better experience and that will come with a 360-degree view and that’s what my investment in tech is,” says Virendra Gupta, the 45-year-old founder of DailyHunt.
Valued at nearly $350 million last year, the company is his biggest venture so far. Back in 2012, he’d set out to build a news app after acquiring NewsHunt (as it was called then). Now his ambitions have grown: he wants DailyHunt to be the largest local language platform in India.
In its early days, DailyHunt solved the problem of rendering fonts on different phones – the market wasn’t dominated by Android phones like it is now. Those were the days of Nokia and Samsung and their walled gardens. The device ecosystem was fractured and it was difficult to get apps to work well across operating systems. Once that was solved, the company’s focus shifted to getting more publishers on the app.
Many things are going right for Gupta. In 2015, Goldman Sachs said DailyHunt could become a billion-dollar company from India. It is reported that Alibaba is likely to invest in the company at a valuation of $500 million (Gupta declined to comment on this). Moreover, local language consumption on the internet is on the rise. Nine of every 10 new internet users in India are likely to be Indian language users, according to a KPMG-Google report in April 2017.
Its biggest and most audacious phase is yet to come: a pivot to video. The idea is to nearly double the time people spend on the app in the next 12 months by showing them relevant videos. Getting its video play right will also be crucial to DailyHunt’s ability to attract advertisers, in a market dominated by Facebook and Google.
Videos are played nearly 400-500 million times a month on the app now. As data becomes cheaper and more and more Indians consume video, Gupta wants to take it to a billion plays soon. Here again, Sharma’s team is building models to extract frames from videos that machines will ingest to throw up relevant videos. That’s still only a sliver of online viewership in India, where news networks on YouTube generated 18 billion views in 2017.
Ver Se Innovation, the company which owns DailyHunt, also runs DailyHunt Lite, a lighter version of DailyHunt; Newzly, a short format news app and OneIndia, a local language news company it bought into in 2016. Together, the company has 89 million monthly active users.
“As you start hitting those numbers, you become a significant player for advertisers. Also, the kind of audience we deliver is unique. This is the Bharat audience,” says Gupta, who not only wants to play in the top end of the market (the top 500 companies in India who spend on advertising) but also sell to small and medium-sized businesses.
But will advertisers pay for the attention of these users? “Language doesn’t define the wealth of a person. You’ll find the highest number of Mercedes (cars) in Ludhiana. If you go to Aurangabad and Nagpur, the kind of wealth that exists there is amazing,” says Gupta.
Browsing through DailyHunt can be a letdown sometimes, as our trial showed. We tried the app out in English and Hindi over a week, and it is clear that the long tail of Indian media, the content farms and churnalists, is in full flow here. Sometimes, you wish there was some way to remove them from the feed. We tried to train the algorithm by reading only tech and business news stories, but Bollywood and clickbait wouldn’t go away.
The personalised “For You” feed occasionally throws up cringe-worthy content. For instance, a story about a well-endowed 16-year-old who is going viral showed up at the top of our feed. It’s not clear why the algorithm determined that this was the most important story for us to read. Perhaps because we didn’t use it long enough, DailyHunt somehow failed to understand that we’re serious about our news.
This is where an important distinction needs to be made. There are broadly two categories of consumers – those who are particular about their news and then the casual news consumers. “The former can be users of news aggregators, the latter is currently being served by social networks and messaging apps,” says Ankit Gupta, the co-founder of Pulse, a news aggregator acquired by LinkedIn.
The casual news consumers are the ones who get the most out of personalisation. “But you need a lot of data to have high accuracy here. (That’s the) reason why social networks are able to do it well,” says Gupta of Pulse. People who truly care about their news care less about personalisation because they care a lot about the sources of their news, their authenticity and biases.
“Machine learning can maybe improve their experience a little but what they really need is high-quality news and analysis,” he points out. This partly explains why our experience with the feed wasn’t the best. However, it seems to be working well for millions of users.
Since May last year, the company has seen an increase of 33% in time spent, to 20 minutes, and of about 70% in click-through rates. It has also seen the retention rate go up by 20%, says Sharma, 37. “We haven’t solved 100% of these problems,” he admits. “We’re constantly building models that can understand the user better.”
Gupta also faces challenges from Chinese rivals. Toutiao, despite being an investor in DailyHunt, has plans to launch on its own in India. UC News and NewsDog are also making aggressive moves in India, striking partnerships with publishers and acquiring new users at a fast clip. The two feature regularly in the list of top content destinations in the country, according to App Annie rankings.
Indian language internet users have grown from 42 million in 2011 to 234 million in 2016, according to data from the Google-KPMG report. By 2021, this number is expected to go up to 536 million. At that point, large advertising budgets are expected to shift to local language platforms. If Gupta manages to keep his technology engine in good shape, he stands a fair chance of fending off newer rivals and also breaking the Google-Facebook chokehold on India’s digital advertising market.
Updated at 03:44 pm on July 23, 2018 to change legend of the chart Defining the Indian Language User.
Disclosure: FactorDaily is owned by SourceCode Media, which counts Accel Partners, Blume Ventures and Vijay Shekhar Sharma among its investors. Accel Partners is an early investor in Flipkart. Vijay Shekhar Sharma is the founder of Paytm. None of FactorDaily’s investors have any influence on its reporting about India’s technology and startup ecosystem.