One of the just-announced Pixels’ most intriguing features is Hold for Me, a Google Assistant-powered service that waits on hold when you call a retailer, utility, airline, or other business’ toll-free support number. When a human comes on the line and is ready to talk, Hold for Me — which will launch in preview in the U.S. in English before expanding to other regions and devices — notifies you with sound, vibration, and a prompt on your screen.
It wasn’t immediately clear how Hold for Me worked, but Google responded to a list of VentureBeat’s questions in the hours following the event. According to a spokesperson, Hold for Me is powered by Google’s Duplex technology, which not only recognizes hold music but also understands the difference between a recorded message — for example, “Hello, thank you for waiting” — and a representative on the line. (That said, a support page admits Hold for Me’s detection accuracy might not remain high “in every situation”.) Google says it gathered feedback from a number of companies, including Dell and United, as well as from studies with customer support representatives to help design Hold for Me’s interactions.
“Every business’s hold loop is different and simple algorithms can’t accurately detect when a customer support representative comes onto the call,” Google told VentureBeat. “Consistent with our policies to be transparent, we let the customer support representative know that they are talking to an automated service that is recording the call and waiting on hold on a user’s behalf.”
Hold for Me is an optional feature that must be enabled in a supported device’s settings menu and activated manually during each call. In the interest of privacy, Google says the audio processing that Google Assistant performs to determine when a representative is on the line happens entirely on-device and doesn’t require a Wi-Fi or data connection. Effectively, no audio from the call is shared with Google or saved to a Google account unless a user explicitly decides to share it to help improve the feature. (Call data like recordings, transcripts, phone numbers, greetings, and disclosures are stored on Google servers for 90 days before deletion.) If a user doesn’t opt to share, interactions between Hold for Me and support representatives are wiped after 48 hours; returning to the call stops the audio processing.
Google claims its embrace of techniques like on-device processing and federated learning minimizes the exchange of data between users’ devices and its servers. For instance, its Now Playing feature on Pixel phones, which shows what song might be playing nearby, leverages federated analytics to analyze data in a decentralized way. Under the hood, Now Playing taps an on-device database of song fingerprints to identify music near a phone without the need for an active network connection.
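To make the idea of an on-device fingerprint lookup concrete, here is a minimal Python sketch. It is not Google’s implementation: the `fingerprint` and `identify` functions, the windowed-hash scheme, the overlap threshold, and the song data are all hypothetical stand-ins, and real audio fingerprinting works on spectral features rather than raw integer samples. The sketch only illustrates that a clip can be matched against a local table with no network request.

```python
import hashlib
from typing import Dict, FrozenSet, List, Optional


def fingerprint(samples: List[int], window: int = 4) -> FrozenSet[str]:
    """Toy fingerprint (assumption): hash every overlapping window of samples."""
    hashes = set()
    for i in range(len(samples) - window + 1):
        chunk = ",".join(str(s) for s in samples[i:i + window])
        hashes.add(hashlib.sha1(chunk.encode()).hexdigest()[:8])
    return frozenset(hashes)


def identify(
    clip: List[int],
    database: Dict[str, FrozenSet[str]],
    min_overlap: float = 0.5,
) -> Optional[str]:
    """Return the best-matching title from the local database, or None."""
    clip_fp = fingerprint(clip)
    if not clip_fp:
        return None  # clip too short to fingerprint
    best_title, best_score = None, 0.0
    for title, song_fp in database.items():
        # Fraction of the clip's hashes that also appear in this song.
        score = len(clip_fp & song_fp) / len(clip_fp)
        if score > best_score:
            best_title, best_score = title, score
    return best_title if best_score >= min_overlap else None


# Hypothetical local database, built ahead of time and stored on the device.
song_a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
song_b = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
database = {"Song A": fingerprint(song_a), "Song B": fingerprint(song_b)}

match = identify(song_a[2:8], database)          # excerpt of Song A
no_match = identify([42, 43, 44, 45, 46], database)  # unknown audio
```

Because the lookup reads only from the local `database` dictionary, nothing about the clip leaves the device in this sketch.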
Google’s Call Screen feature, which screens and transcribes incoming calls, also runs on-device, as do Live Caption, Smart Reply, and Face Match. That’s thanks in part to offline language and computer vision models that power, among other things, the Google Assistant experience on smartphones like the Pixel 4, Pixel 4a and 4a (5G), and Pixel 5.