What’s the story on Speech Server’s future?
Over the last few months I have kept getting inquiries from people about the future of Speech Server 2007. They are interested in starting new IVR projects, converting MSS 2004 apps, or moving from other platforms to Microsoft. Disappointingly, Microsoft is keeping very quiet on this topic when asked, and promises more news only in Q1 of 2010.
Yet there is a lot of information out there in the public domain, and I thought it would be useful to compile it all here until we get word from Microsoft on their roadmap.
1) Microsoft is in the Speech business to stay
Microsoft has been investing in speech technology since 1993, when XD Huang and other brains behind the best speech engine in the world (CMU’s Sphinx II) joined Microsoft. In 1999 it obtained the world-renowned Hidden Markov Model Toolkit (HTK) through its acquisition of Entropic, and it shelled out around 800 million dollars to acquire Tellme in March 2007.
And there are solid proof points that this investment is still considered valuable. Zig Serafin, the new General Manager of the Speech at Microsoft group, which united Tellme with the Speech Components Group, gives a long list of key Microsoft products using Microsoft’s speech technology in the Microsoft PressPass article. Speech is part of Windows 7 and of Bing 411 directory assistance, and Bing for Mobile (now also for the iPhone!) has voice search. Exchange 2010 ships with Voice Mail Preview, a feature that transcribes voice mails into readable text, in three languages. Outlook Voice Access, a voice user interface to Exchange, now ships with support for 26 languages. And Tellme is apparently switching its platform over to Microsoft’s speech technology, phasing out Nuance’s ASR and AT&T’s TTS.
The most interesting announcement in 2009, however, was that Xbox’s Project Natal will use speech recognition.
So, for all those tired of Nuance, Microsoft still seems to be the best bet.
2) Speech Server 2007 is still going to be supported for quite a while
Though there have been no changes in Speech Server since it shipped with OCS 2007, it can still compete with any IVR platform out there. To be honest, Speech Server is not the only IVR platform that is in maintenance mode. In the last few years most IVR platforms have disappeared from the market, or were bought by companies that have no interest in developing them further. Yet the critical question is: how long will Speech Server 2007 be supported?
Well, Speech Server shipped as part of OCS 2007 and was reshipped unmodified in OCS 2007 R2. Microsoft server products follow strict support rules with a minimum of ten years of support: 5 years of mainstream support and 5 years of extended support. Mainstream support for OCS 2007 R2, however, does not extend beyond the support term of OCS 2007.
So Speech Server 2007 is a supported product until January 2013, and there are 5 more years of extended support beyond that date.
Communications Server 2007
Communications Server 2007 R2
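To make the arithmetic of that support rule concrete, here is a minimal sketch in Python. It assumes the January 2013 mainstream end date mentioned above; the day of the month and the helper name `extended_support_end` are my own placeholders for illustration, not anything published by Microsoft.

```python
from datetime import date

# Assumption: extended support runs for 5 years after mainstream support ends,
# as described in the post. The exact day of the month is a placeholder.
EXTENDED_YEARS = 5


def extended_support_end(mainstream_end: date, years: int = EXTENDED_YEARS) -> date:
    """Return the date `years` after `mainstream_end` (hypothetical helper)."""
    try:
        return mainstream_end.replace(year=mainstream_end.year + years)
    except ValueError:
        # Feb 29 in a leap year has no counterpart in a non-leap target year.
        return mainstream_end.replace(year=mainstream_end.year + years, day=28)


mainstream_end = date(2013, 1, 1)  # assumed: "January 2013", day unknown
extended_end = extended_support_end(mainstream_end)
print(f"Mainstream support ends: {mainstream_end:%B %Y}")
print(f"Extended support ends:   {extended_end:%B %Y}")
```

Under those assumptions, extended support for Speech Server 2007 would run into January 2018. Check the official lifecycle pages for the exact dates.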
FYI, if you are still using Microsoft Speech Server 2004 or Speech Server 2004 R2, mainstream support either ran out in July 2009 or will run out in October 2010:
Speech Server 2004
Speech Server 2004 R2
3) The UC Managed API is the future
In October of 2006 Microsoft announced at SpeechTEK that Speech Server was to be integrated into its Unified Communications platform. On GotSpeech, Microsoft gave more details in response to my blog: the next generation of the UC Managed API would have a VoiceXML browser, continued support for the Windows Workflow Foundation, and 26 languages. In a later blog post Microsoft asked for help prioritizing the speech tools.
Many people do not have access to the content that was presented at TechEd in May. Yet the best information on what is to come was presented there in the talk by Vishwa Ranjan and Albert Kooiman (subscribers only!). That talk contains a couple of interesting slides:
- Exchange 2010 SP1 Unified Messaging will move from Speech Server to the UC Managed API. That implies that by the time of Exchange 2010 SP1 the UC Managed API will be a viable alternative to Speech Server.
- There are detailed slides comparing UCMA with Speech Server, including forward-looking information regarding VXML and the speech tools.
- The details on the expanded language portfolio are in there as well:
- At the end there is guidance on how to build applications so as to minimize the future upgrade effort from Speech Server (2007) to UCMA:
There is a lot in these slides that is not as I would want it to be. According to these slides, Microsoft will not provide the following:
- An application hosting process. When you write a UCMA application you need to write a Windows Service. Doing that well is not trivial. Michael Dunn has written a sample on MSDN Code Gallery, but there is a major gap to overcome for average speech developers.
- There is not going to be a prompt engine either. The Speech Server prompt engine has shortcomings: there is no good version control, and it is not based on an open database. Yet again, for the average speech developer the Speech Server prompt engine is good enough, and it has some great tools built into Visual Studio.
- Application provisioning is overly complex in UCMA as well. The slide implies that this is not going to be fixed either. Again, quite a stumbling block for the average speech developer.
All in all, UCMA will be great for those who have the know-how to build .NET Windows Server applications, but it will take a lot of effort for the average speech developer.
Yet there are a lot of great things in UCMA as well that I would like to call out:
- First of all, the price of UCMA is unbeatable: speech applications built on the UC Managed API, including the speech engines, can be distributed without licensing fees to Microsoft. A more flexible SIP stack for communications and collaboration and top-class speech engines, all as a free redistribution. That frees up a lot of $$ for building a great platform.
- I think UCMA is the most visionary and compelling multi-channel platform for self service. UCMA is capable of much more than the legacy IVR. Take a look at this video of Clarity Consulting’s Clarity Connect, a hosted ACD that is enabled using OCS Federation and uses Communicator and Silverlight.
UCMA supports not only the voice channel but also web-based IM, as well as bots and features like call-back.
- Development in managed code with the UC Managed API using Visual Studio is a pleasure. The UC Managed API 2.0 comes with great examples, and the API is extremely powerful. It has a SIP stack, a media stack and great speech recognition (if only we could have Tellme’s voice font Zira in UCMA, I would claim the same greatness for Microsoft’s speech synthesis!).
- For those who develop in VoiceXML on Speech Server 2007, it is great to see that the VXML 2.1 browser is making its comeback. Microsoft just introduced a new Speech Technology section on MSDN. I have great expectations that Microsoft will bring together the Tellme experience with the Unified Communications technology.
4) See the future: join the Wave 14 Metro Program
At PDC, at the end of his presentation on the Wave 14 Unified Communications Platform (slide 25, video here), Chris Mayo announced that there will be a Unified Communications Metro program for Wave 14 starting at the end of Q1 2010 (there is that date again!). This program will encompass not only OCS, but also the extensibility of Communicator (see that session here) and of Exchange (session video here). Interested partners and customers can ask their Microsoft account manager to nominate them for Metro. If you do not know your partner or customer account manager, you can send mail to metroreq at microsoft.com to apply on behalf of your company (no individuals are admitted, and an NDA is required!). I suggest all of us in the GotSpeech.NET community join this program!
That way we not only get early access to OCS ‘14’ and the next version of UCMA, but can also use this opportunity to give clear feedback on what we expect from Microsoft when it comes to speech technology going forward!
For more proof that Speech Server is alive and well, you don’t have to look any further than GotSpeech.NET. Check out these recent posts by Ken on Using Windows 7 For Speech Server Development and the series by Brian entitled Speech Server Marries FreeSwitch.
As always I would love to hear about your future Voice development plans.