Google’s fewer than 74,000 employees have generated a market cap of around $725 billion, about one-third of India’s GDP. Few companies in history have had the power of the Mountain View, California tech behemoth, leading to comparisons with god.
If “In Google’s search results, we trust”, the search giant itself is reposing its trust in artificial intelligence: self-learning code that spots patterns in large sets of data and suggests solutions.
At a press meet at the Google office in Roppongi Hills, Tokyo on Tuesday, journalists from the Asia Pacific were given a glimpse of the tech giant’s vision to reinvent itself around AI. It’s been a little over a year since its CEO Sundar Pichai shared his vision for turning the company from mobile-first to AI-first. Google has made significant strides in that direction: through its apps (Gmail, Google Photos, Google Translate), personal assistant (Google Assistant), and its hardware portfolio (Google Home, Pixel, Pixel Buds).
At the event, titled MadeWithAI, its product and engineering heads showed off Google’s AI-driven features, highlighting how AI is the centerpiece of their software and hardware portfolio.
Jeff Dean, Senior Google Fellow, started off the conference by stating Google’s three goals with AI: to make its products more useful, help businesses and developers innovate, and provide researchers tools to tackle humanity’s big challenges. His definitions of AI (the science of making things smart) and machine learning (teaching computers to learn without having to program rules) were followed by a deep dive into Google’s ML tools such as TensorFlow and its AI-driven cloud services.
Talent is the biggest barrier to entry in AI. Only a thousand Googlers had machine learning skills in 2012. Google has since trained 18,000 of its employees through an internal course, which it plans to put online for free early next year. While the course will help more people understand how to leverage machine learning, building models seems like a skill that will soon be replaced by AI.
In a talk about existing AI challenges, Dean gave a brief explanation of Google’s work on AutoML, which uses machine learning models to generate machine learning models. “Essentially, the way that works is [that] we generate ten machine learning models, and then we train those on a problem we care about. And we see which ones work well, and which ones don’t. And we can use that signal as feedback to the machine learning generating model, where it can say, ‘These ones didn’t work very well, so don’t do many like this; and this one worked really well, so do more kind of in that thing.’ And then we repeat this,” says Dean. “We might generate 15,000 or 20,000 models and train them on a problem we care about. This is actually a lot like a human machine-learning expert does, when they’re trying to find a good model. This is essentially automated now. And it works really, really well.”
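The generate, train and feed-back loop Dean describes can be illustrated with a toy sketch. This is not Google's AutoML: the search space (`DEPTHS`, `WIDTHS`), the scoring function `toy_score` and the weight-update rule are all hypothetical stand-ins chosen to keep the example small, and a real system would train each candidate on actual data.

```python
import random

# Hypothetical search space: each "model" is just a choice of depth and width.
DEPTHS = [1, 2, 3, 4]
WIDTHS = [8, 16, 32, 64]


def toy_score(depth, width):
    """Stand-in for 'train the model and measure accuracy' (an assumption:
    a real system would train each candidate on a real problem)."""
    return 1.0 - abs(depth - 3) * 0.1 - abs(width - 32) / 100.0


def search(rounds=20, batch=10, seed=0):
    rng = random.Random(seed)
    # The "generator" here is a table of preference weights per choice.
    depth_w = {d: 1.0 for d in DEPTHS}
    width_w = {w: 1.0 for w in WIDTHS}
    best = (None, float("-inf"))
    for _ in range(rounds):
        # 1. Generate a batch of candidate models from current preferences.
        candidates = [
            (
                rng.choices(DEPTHS, weights=[depth_w[d] for d in DEPTHS])[0],
                rng.choices(WIDTHS, weights=[width_w[w] for w in WIDTHS])[0],
            )
            for _ in range(batch)
        ]
        # 2. "Train" each one and record how well it works.
        scored = [(c, toy_score(*c)) for c in candidates]
        mean = sum(s for _, s in scored) / len(scored)
        # 3. Feed the signal back: favour choices that beat the batch average.
        for (d, w), s in scored:
            factor = 1.2 if s > mean else 0.9
            depth_w[d] *= factor
            width_w[w] *= factor
            if s > best[1]:
                best = ((d, w), s)
    return best


if __name__ == "__main__":
    config, score = search()
    print("best (depth, width):", config, "score:", round(score, 3))
```

With the defaults, the loop evaluates 200 candidates (20 rounds of 10), far fewer than the 15,000 to 20,000 models Dean mentions, but the shape of the procedure is the same: generate, evaluate, and bias the next batch toward what worked.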
Dean highlighted AutoML’s superiority with an ImageNet classification chart, which compares the computational cost and accuracy of different kinds of models that have been developed over the last four to five years. NASNet, a novel architecture that uses Google’s AutoML, beats results produced by the top research labs and universities from around the world.
“The Y-axis is accuracy, higher is better. Obviously, you like lower computational cost. The place you’d like to be is the upper left corner. Every black dot there is the work of months or years of accumulated research to produce a new kind of model that has better results than the previous best,” Dean says, adding that AutoML is getting higher accuracy with less computational cost, which could help its deployment on low-power devices such as smartphones. AutoML will initially work on a restricted class of machine learning problems but, eventually, more and more kinds of ML problems could be solved automatically.
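One way to read a chart like the one Dean describes is to ask which models are not beaten on both axes at once, that is, which sit closest to the upper left corner. The sketch below picks out that set; the model names and numbers are invented for illustration and are not taken from Google's chart.

```python
def pareto_front(models):
    """Return names of models that no other model beats on both axes.

    A model is dominated if another has lower-or-equal cost and
    higher-or-equal accuracy, and is strictly better on at least one.
    """
    front = []
    for name, cost, acc in models:
        dominated = any(
            c2 <= cost and a2 >= acc and (c2 < cost or a2 > acc)
            for _, c2, a2 in models
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical (name, computational cost, accuracy) points.
models = [
    ("A", 5.0, 0.70),
    ("B", 10.0, 0.78),
    ("C", 12.0, 0.76),
    ("D", 20.0, 0.80),
    ("E", 6.0, 0.74),
]

print(pareto_front(models))  # ['A', 'B', 'D', 'E']
```

Model C drops out because B is both cheaper and more accurate; every other point represents a trade-off that nothing else in the list strictly improves on.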
Dean says that Google is using AutoML internally for a lot of their problems and that they plan to bring this technology to external organisations as well. “We think there’s a lot of people who have machine learning problems and have the expertise to maybe label the data for a supervised learning problem, but maybe don’t have as much skill in how do you actually create a machine learning model,” he says. “I think these kinds of approaches can perhaps broaden the use of machine learning to a much wider set of developers and others. This won’t be a ten-year thing, it might be a relatively short-term thing where it’s starting to have an impact on all.”
In another feat, published a month ago, Google one-upped AlphaGo, its computer program that plays the game of Go better than any human being, with AlphaGo Zero, which defeated the previous version 100-0. While previous versions relied on training data from thousands of games played by humans, AlphaGo Zero mastered the game simply by playing against itself.
When Google execs opened the floor to journalists, the first question was about how Google balances user privacy with its need for user data, in light of recent revelations that Google was tracking location data on Android even when location services were turned off.
“I do think in a lot of cases, many of the problems that we’re tackling don’t require human or user-specific data, and actually don’t require very large amounts of data,” says Dean, adding that a lot of Google’s algorithmic and computational improvements were not necessarily achieved through massive data sets. “I think, obviously, we want to use user data to improve a lot of our products but give users control over that thing,” he adds.
While Google spent a lot of time talking about the democratisation of AI, it is also putting a considerable amount of effort into blending AI into its hardware products, to produce features that cannot be easily cloned. Exclusive AI-driven features (such as Google Lens, AutoHDR+ mode) grace its Pixel line of smartphones, features which other Android OEMs will struggle to match. The camera on the Pixel 2, which uses a combination of machine learning and computational photography, has been getting rave reviews for its DSLR-like quality photos and nighttime photo quality.
Pure hardware innovation is largely over, says Isaac Reynolds, Product Manager, Pixel Camera. “All premium smartphones kind of look and feel the same. Luckily there’s a lot of innovation left, if you look at the intersection of AI and software, with hardware,” he says. Reynolds showed off photos of Tokyo that he had taken using the Pixel 2 camera, comparing them with shots taken with a DSLR camera. He gave us a deep dive into the portrait mode feature, which fakes the background blur one gets from a large lens through a tight integration of machine learning and hardware.
“As a human being, you have two eyes, and they give you a left and a right view, and that’s responsible for your sense of depth perception. Pixel 2 has two eyes as well, it has a special image sensor. On most image sensors, each pixel is a square. On Pixel 2, we split that square into two left and right sub-pixels that are each a little rectangle. We end up getting two images of the same instance, one through the left side of the lens, and one through the right. It’s having two different views of the same world, through two sides of a very small single camera. Once we have those two views, we can generate a rough depth map, which shows what’s close and what’s far,” he says.
The Pixel 2 also generates what’s called a segmentation mask, using a machine learning algorithm that looks at every pixel in the image and determines whether it’s part of the person (not blurred) or part of the background (blurred). The segmentation algorithm is trained on almost a million images with all kinds of people, clothing, and backgrounds. Its mask is combined with the rough depth map to create a highly refined depth map, which is used to fake the blur effect.
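A minimal sketch of how a person mask and a rough depth map might be combined into a per-pixel blur amount is shown below. The combination rule (no blur on person pixels, blur proportional to distance from the focus depth elsewhere) and the function name `blur_strength` are assumptions made for illustration, not the Pixel 2's actual pipeline.

```python
def blur_strength(person_mask, rough_depth, focus_depth):
    """Per-pixel blur amount from a segmentation mask and a depth map.

    person_mask: 2D list, truthy where the pixel belongs to the person.
    rough_depth: 2D list of depth estimates, same shape as the mask.
    focus_depth: depth of the subject, which should stay sharp.
    """
    out = []
    for mask_row, depth_row in zip(person_mask, rough_depth):
        row = []
        for is_person, depth in zip(mask_row, depth_row):
            # Person pixels stay sharp; background blur grows with
            # distance from the plane of focus.
            row.append(0.0 if is_person else abs(depth - focus_depth))
        out.append(row)
    return out


# Tiny 2x2 example with made-up values.
mask = [[1, 0],
        [0, 0]]
depth = [[1.0, 1.5],
         [3.0, 5.0]]

print(blur_strength(mask, depth, focus_depth=1.0))
# [[0.0, 0.5], [2.0, 4.0]]
```

A renderer would then blur each pixel by its computed amount, so that far-away background is softened more than objects just behind the subject.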
Linne Ha, Director of Research Programs, who leads Google teams around the world working on applications of AI for linguistic problems, shared a human interest story that’s as dramatic and flattering as Lion was for Google Maps, to highlight the importance of Google’s mission to organise the world’s information, and make it universally accessible and useful.
While smartphones have brought more internet users online, web content has not caught up. Ha shared a data point highlighting this disparity: over 50% of the internet is in English, while Hindi, which is the fourth largest language in the world in terms of speakers, is barely in the top 30 online. Ha shared some of Google’s efforts to bring all languages online, such as open sourcing Noto fonts, shorthand for No Tofu (named after the little boxes that show up when your device doesn’t have the font to display the text), in over 800 languages. In a language like Khmer, which has 70 characters, machine learning helps create usable keyboards thanks to word prediction that makes smartphone typing much easier.
She also shared Google’s efforts with Project Unison, which uses machine learning to build text-to-speech engines, for under-resourced languages (like Bengali, Khmer, and Javanese), and crowdsources input using about $2,000 worth of hardware. “We got Bangladeshi Googlers, about 15 of them who auditioned, each spoke about 45 minutes, and we got 2,000 phrases in Bangla and English, then we had the wider community vote as to which voice they liked the best,” Ha says. The TTS voice needs to sound human, but can’t sound like a real person, she says. So speakers of similar vocal profiles were adapted to create a blended (average) voice.
In a group interview, a band of journalists from India grilled Pravir Gupta, Engineering Director, Google Assistant, about privacy concerns in light of an incident earlier this year in which prosecutors in a US murder case demanded information from an Amazon Echo’s history.
“Privacy comes front and centre for us, we want to let the user have all the tools and controls of whatever interactions they have had with any part of Google,” says Gupta. “You can go to the ‘My Activity’ page on the assistant, and delete previous activity. Much like search, providing that transparency and control is front and centre.”
And what if the user had made a query such as “How to hide a dead body”, and is accused of murder? “If it’s deleted, it’s deleted,” Gupta says. The PR rep, who had been listening to our conversation from the sidelines, chips in: “If you go into my account, and see your activity, and you delete it, then it’s deleted, and it’s gone from our systems.”
Some of the first batch of Google Home Mini devices, which were sent to tech journalists earlier this year, had a hardware glitch that turned them into always-on devices. Google fixed the glitch less than a week later with an update that removed the top touch functionality. “Unless the hot word is activated, nothing is actually sent to the server for speech recognition,” Gupta assured us. “Is it always listening? No, it’s not, unless it is invoked,” he says.
Lily Peng, Product Manager on the medical imaging team, highlighted how Google is solving some of humanity’s big challenges in the areas of environmental protection, energy, and healthcare. Machine learning was used to spot sea cows in aerial photographs, to keep track of their populations. In another well-publicised case, machine learning from DeepMind was used to reduce the amount of energy used for cooling Google’s data centres by 40%. Peng’s team is using machine learning on two healthcare challenges: diagnosing diabetic retinopathy and detecting cancer.
“Training the machine learning algorithm is one part of it but actually putting that in a clinic is really a bulk of the work. Our group looks at the research parts of it, prototype solutions, and as the solutions become viable, we will usually partner with healthcare organisations,” she says.
“Building medical devices is a very daunting endeavour. There are a lot of people already doing this, so it actually makes a lot of sense to partner with the people who are already invested in the space and have the deployment and development channels,” says Peng. “That’s how our partnership with Verily and Nikon came about. There will be a portion of the work that we hope will be viable, commercial product, and they are really the best folks to take this to the market.”