The Speech Server Scoop on OCS R2

OK, the word is getting out about Office Communications Server 2007 R2, so I thought I would give you some details on how it will affect Speech Server developers.

First let me say that there are no changes for Speech Server in OCS R2. It will still be a separate install and the bits will be the same. You will still use Visual Studio 2005 and all of the development tools are the same.

Now for the cool news about the R2 release.

R2 will include the new UC Managed API 2.0. The API shows the new approach for developing speech applications going forward: speech technology will be an integrated developer capability across the whole UC platform. UCMA 2.0 consists of three major pieces: Core (including a SIP signaling stack and a media stack), a managed server Speech API, and UC Workflow Activities that are built on top of both the Core and server Speech managed APIs. Together they make up the one UC Managed API 2.0.

The UCMA 2.0 Server Speech SDK will support 12 languages with both ASR and TTS: US English, UK English, Canadian French, French (France), German (Germany), Mexican Spanish, Brazilian Portuguese, Italian, Japanese, Korean, and Mandarin in two variants (simplified/mainland and traditional/Taiwanese).

And get this: it will support Visual Studio 2008! The UC Workflow Activities also include activities for both speech and IM automated agents (a.k.a. bots).

More info –

  1. You can develop Speech Server (2007) applications just like you have in the past (using VS 2005)
  2. You can now develop speech “bots” using the new Workflow Activities on top of UCMA 2.0, or in managed code only using the Core and Speech APIs, if you are really hardcore.
  3. The UCMA Speech SDK will be missing some of the tools that you are used to having. For example, there is no grammar tool, but SRGS grammars are still supported and you can use the existing Grammar Editor (in VS 2005) or your favorite XML editor to create grammars.
  4. Conversational grammars may or may not work due to changes in the way the engine works.
  5. OCS 2007 R2 has no VXML support on top of UCMA 2.0. This might change in the future ’14’ release. SALT has definitely been dropped from the roadmap.
  6. The UCMA is much closer to SIP but will still be familiar to you. It will be able to manipulate the SIP stack and the media stack as well.
  7. In the next ’14’ release (the one after R2) Speech Server will no longer be a standalone install but will be an integral part of OCS.
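
Since item 3 says you can write SRGS grammars in any XML editor, here is a rough idea of what a tiny one looks like. This is only a minimal sketch in Python that builds a W3C SRGS 1.0 yes/no grammar as a string. The rule name `yesno` and the helper `build_yes_no_grammar` are made up for illustration, and nothing here is specific to UCMA or Speech Server.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the W3C SRGS 1.0 and XML specifications.
SRGS_NS = "http://www.w3.org/2001/06/grammar"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_yes_no_grammar(lang="en-US"):
    """Return a minimal SRGS XML grammar (hypothetical helper) that accepts 'yes' or 'no'."""
    # Serialize SRGS elements without a namespace prefix.
    ET.register_namespace("", SRGS_NS)
    grammar = ET.Element(
        f"{{{SRGS_NS}}}grammar",
        {
            "version": "1.0",
            "root": "yesno",   # illustrative rule name, must match the rule id below
            "mode": "voice",
            f"{{{XML_NS}}}lang": lang,
        },
    )
    rule = ET.SubElement(
        grammar, f"{{{SRGS_NS}}}rule", {"id": "yesno", "scope": "public"}
    )
    one_of = ET.SubElement(rule, f"{{{SRGS_NS}}}one-of")
    for phrase in ("yes", "no"):
        ET.SubElement(one_of, f"{{{SRGS_NS}}}item").text = phrase
    return ET.tostring(grammar, encoding="unicode")


grammar_xml = build_yes_no_grammar()
print(grammar_xml)
```

The printed XML can be saved to a `.grxml` file and opened in whatever editor you like. Whether a particular engine accepts it as-is is something you would need to test.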

You are probably wondering how you can get your hands on Office Communications Server 2007 R2.
The official launch date will be early February. Until then there is only a very small private beta.

There is, however, a developer program called Metro (http://www.discovermetro.net) for managed Microsoft accounts.

Managed ISVs and corporate developers just need to get in touch with their Microsoft (Partner) Account Manager and ask to be admitted to the Metro program. The program gives access to Hyper-V images of a complete OCS 2007 R2 developer setup (including speech), training around the world on the complete platform, and an email-only help desk, in exchange for a commitment to build applications on the UC (OCS 2007 R2 and Exchange) platform.

I am really excited about this as it will allow us Speech Server developers better access to the core OCS components and will give us a new way to develop speech applications. For now the best approach will probably be to keep developing the way you have in the past and start experimenting with the new stuff before settling on it for all of your development. Or at least that is the approach I plan on using.

Gold Systems (the company I work for) has OCS R2 up and running in production and we are very excited about the new release.

I’ll blog more on the UCMA later.

Microsoft looking for our input

Several months ago I started a thread on GotSpeech called Long Compile Times. In that thread I described some problems I was having with long load times when trying to load a Speech project into Visual Studio. I was experiencing compile times of around 10 minutes and long load times when running the application on a production server. I also had some rather long times loading pages as the caller moved through the application.

I contacted Microsoft about this, a ticket was opened, and I’ve stayed in touch with them since. It seems the problem revolves around using subflows (or whatever you want to call them) and the nesting of subflows. I’m not going to delve any further into the problem here, as you can just read the thread (and others on GotSpeech) to see what others are experiencing.

Now Microsoft is soliciting more information on this topic, and Anthony has posted to the thread describing what is happening and asking us to fill in some of the details. So if you have encountered this issue, head on over to the thread and make yourself heard. Let’s make this an active thread.

Microsoft is also working on a whitepaper dealing with this. No firm date on when it will be published but I’m guessing it will depend somewhat on how soon Anthony gets the answers to his questions.

Here’s your chance to make a difference so go for it.

Back on the Road

I’m in Minneapolis next week conducting some Speech Server training so if you live in the area and want to meet up to talk about Speech Server or OCS then shoot me an email.

It would be great to see what is happening in the area. Am I Done? and I will meet for dinner Monday evening and it will be great to catch up on things. Got to have something to fill my time on these business trips. :-)