In December 2010, journalist Matt Thompson predicted a future in which “automatic speech transcription will become fast, free, and decent.” He called this future the “Speakularity” – playing on the concept of the technological singularity.
In Thompson’s words:
“So much of the raw material of journalism consists of verbal exchanges — phone conversations, press conferences, meetings. One of journalism’s most significant production challenges, even for those who don’t work at a radio company, is translating these verbal exchanges into text to weave scripts and stories out of them.”
“After the Speakularity, much more of this raw material would become available. It would render audio recordings accessible to the blind and aid in translation of audio recordings into different languages. Obscure city meetings could be recorded and auto-transcribed; interviews could be published nearly instantly as Q&As; journalists covering events could focus their attention on analyzing rather than capturing the proceedings.”
“Imagine if that capability were opened up to citizens — if every on-air utterance of every pundit, politician, or policy wonk were searchable on Google.”
But that capability wouldn’t be open to just good citizens…
Writing in the September 3rd issue of Nautilus, James Somers takes a deep dive into the idea and asks: will recording every spoken word help or hurt us?
He imagines a near future in which all business meetings are transcribed as part of “The Record”.
“We are going to start recording and automatically transcribing most of what we say. Instead of evaporating into memory, words spoken aloud will calcify as text, into a Record that will be referenced, searched, and mined. It will happen by our standard combination of willing and allowing. It will happen because it can. It will happen sooner than we think.”
“It will make incredible things possible. Think of all the reasons that you search through your email. Suddenly your own speech will be available in just the same way.”
Sounds wonderful? Not so fast.
“[Consider] what it might be like to live in a society where everything is recorded. There is an episode of the British sci-fi series Black Mirror set in a world where Google Glass–style voice and video recording is ubiquitous. It is a kind of hell.”
A kind of hell. Indeed. It’s already difficult enough for people to forgive and forget. Imagine the difficulty of doing so if your spoken words are immortalized.
But let’s not overreact just yet.
“Between these visions of heaven and hell lies the likely truth: When something like the Record comes along, it won’t reshape the basic ways we live and love. It won’t turn our brains to mush, or make us supermen. We will continue to be our usual old boring selves, on occasion deceitful, on occasion ingenuous. Yes, we will have new abilities—but what we want will change more slowly than what we can do.”
Yes, let’s hope.
CBC’s Spark has also interviewed Somers.
Something like the Speakularity is getting closer, I think. Auto-transcribing business conferencing services seem very much like something the lifeloggers among us will want to use if and when they become available. And when there are people who want to use them, somebody will build them.
So that causes me to consider this important question… can the Speakularity be secured?
It’s not difficult to imagine people discussing corporate strategy and other sensitive proprietary information indiscriminately within earshot of a microphone. After all, they already do. (Phones.) Fortunately, it’s not trivial to hack everybody’s phone and it generally requires expensive tools.
But imagine if searchable audio of personal, corporate, and government speech were just sitting somewhere in the cloud, there for the taking. Today’s data breaches could look like small potatoes by comparison.