

SERVER
  • 400 MHz CPU and 256 KB L2 cache
  • 128 MB RAM
  • 300 MB hard disk space per language
  • Windows NT 4.0 Server SP6a, Windows 2000 Server SP2, Windows XP Pro

CLIENT
  • 200 MHz CPU and 256 KB L2 cache
  • 128 MB RAM
  • 10 MB hard disk space
  • Windows NT 4.0 SP6a, Windows 2000 SP2, Windows XP Pro



 
About WizzScribe
WizzScribe for Windows is a server-based implementation that converts audio from a variety of sources into text for a wide range of applications. Powered by IBM's proven large-vocabulary voice recognition technology, WizzScribe uses a server architecture to add mobility, flexibility and scalability, opening this technology to a growing list of uses that go far beyond the traditional mindset associated with dictation. Because it performs recognition offline (deferred, a.k.a. "speak now, reco later"), WizzScribe supports a wide variety of input devices and methods, and can add speech recognition to "industrial strength" server-based applications whenever large volumes of audio need to be converted into text for searching, data mining, storage and documentation. Although it is basically a speaker-dependent technology, WizzScribe may also be deployed in a speaker-independent environment for applications where recognition accuracy is not paramount, e.g. data mining and searches.
Licensing
WizzScribe Server is licensed in two steps. The first step is to purchase an SDK license, which provides the software environment for internal development and testing. Each SDK comes with sample runtimes in the supported languages. The second step is to purchase a license to deploy runtimes (engine, etc.) internally or distribute them externally. Please note that all types of internal deployment and external distribution require licensing. The most common categories are runtime deployment within an enterprise and runtime distribution with a voice-enabled product. Wizzard offers licensing models and packages to cover a wide spectrum of scenarios.

WizzScribe for Windows Server Software Developers Kit (SDK) package: $995.00
  • License for internal development use
  • IBM ViaVoice Topic Factory Developer's Guide
  • WizzScribe Server Sample Guide
  • WizzScribe Server Programmer's Guide
  • Sample Client (Visual C++)
  • Sample source code and five sample client applications
   

Runtime deployment/distribution licensing package: Prices are based on volume & scenario
To obtain a license for deployment or distribution of WizzScribe for Windows Server, click "inquire" or call Wizzard Sales at 954-678-4155 with your usage scenario. In either case, a sales professional will answer your questions and assist with your purchase.


Note: All licensing is based on volume grids in which the per-unit price decreases as volume increases. Each runtime license covers one speech engine on one processor. For example, WizzScribe Speaker Dependent for Windows Server licenses start at $2,400.00 for one processor.

Supported Languages
  • US English
  • US English with 8 kHz telephony acoustic model
  • French*
  • UK English*
  • German*
  • Brazilian Portuguese
  • Italian*
  • Japanese*
  • Castilian Spanish*
  • Simplified Chinese (Mandarin/China)*
  • Traditional Chinese (Mandarin/Taiwan)*

    *Special bid only


Note: The WizzScribe API Reference and sample client application source code provide information needed to understand the details of the server API and how to use it.

Functional Attributes
  • Powered by state-of-the-art IBM ViaVoice speech recognition technology

  • Provides speech recognition and conversion to text at the request of a client application (not provided as part of WizzScribe) through the server API.

  • Rich set of COM/DCOM interfaces that support OLE automation. Clients can be implemented in any language that supports automation, including C/C++ and Visual Basic. DCOM provides the transport layer so that a client can access the server remotely. The following types of services are available through the API:
    • creating and managing speech user profiles
    • personalizing speech profiles
    • transcribing audio to text
    • basic server management
    • results log reporting

  • Processes a variety of audio inputs and converts them directly to text
  • Ability to continually improve recognition accuracy over time to achieve optimum results
  • Supports acoustic adaptation so audio can be captured in different environments
  • Ability to add custom words and custom pronunciations to the vocabulary
  • Ability to adapt word usage depending on context
  • Handles user enrollments, creates and manages user profiles, and supports custom pronunciations for each user
  • Supports scalability that allows multiple WizzScribe Servers to handle large volumes of audio processing. When used for dictation applications, multiple WizzScribe Servers can be easily configured to share user profiles and provide scalability for customers that require faster turnaround. A configuration utility provides easy setup for access to speech user profiles shared on a network.
Basic Operational Theory and Workflow
Voice audio is captured by a workflow application running on a client system. The audio file, in .wav format, can originate from a variety of devices, ranging from noise-canceling microphones for optimum recognition to digital recorders, telephones and, in some cases, even mobile phones, although accuracy may degrade substantially at this end of the input device range.

Next, the workflow application submits a transcription request to WizzScribe, using the application programming interfaces (APIs) provided, so that processing can begin. One piece of information critical to this client/server request is the user associated with the .wav file. The server maintains a database of user profiles, each containing the user's name (a single user may have multiple profiles for different acoustic environments or enrollments) and the language used for the transcription (each enrollment may have only one language associated with it).
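The profile rules described above (several profiles per user, exactly one language per enrollment) can be illustrated with a minimal Python sketch. This is not the WizzScribe API, which is exposed through COM/DCOM; every class, field and method name here is hypothetical, and the in-memory store merely stands in for the server's profile database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechProfile:
    """One enrollment: a user, an acoustic environment, and exactly one language."""
    user_name: str
    environment: str  # hypothetical label such as "headset" or "telephone"
    language: str     # one language per enrollment


@dataclass(frozen=True)
class TranscriptionRequest:
    """What the client supplies with the audio: the file and the profile to apply."""
    wav_path: str
    profile: SpeechProfile


class ProfileStore:
    """In-memory stand-in (an assumption) for the server's user-profile database."""

    def __init__(self):
        self._profiles = {}  # (user_name, environment) -> SpeechProfile

    def enroll(self, user_name, environment, language):
        # Re-enrolling the same user/environment pair replaces the old profile,
        # so an enrollment never carries more than one language.
        profile = SpeechProfile(user_name, environment, language)
        self._profiles[(user_name, environment)] = profile
        return profile

    def profiles_for(self, user_name):
        # A single user may hold several profiles, one per environment.
        return [p for (user, _), p in self._profiles.items() if user == user_name]

    def build_request(self, wav_path, user_name, environment):
        key = (user_name, environment)
        if key not in self._profiles:
            raise KeyError(f"no enrollment for {user_name!r} in {environment!r}")
        return TranscriptionRequest(wav_path, self._profiles[key])
```

A client would enroll a user once per acoustic environment and then name the matching profile each time it submits a .wav file.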

A client workflow application may be attached to multiple servers; in that case, balancing the workload across those servers is the responsibility of the client application.
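Since workload balancing is left to the client, one simple policy a workflow application might adopt is to send each new job to the server with the fewest jobs still pending. The Python sketch below shows only that bookkeeping; the class name, method names and server names are hypothetical, and nothing here calls WizzScribe itself.

```python
class LeastLoadedDispatcher:
    """Client-side bookkeeping: track pending jobs per server, pick the least busy."""

    def __init__(self, server_names):
        if not server_names:
            raise ValueError("at least one server is required")
        # Insertion order is preserved, so ties go to the earliest-listed server.
        self._pending = {name: 0 for name in server_names}

    def assign(self):
        """Choose a server for the next job and count the job as pending."""
        name = min(self._pending, key=self._pending.get)
        self._pending[name] += 1
        return name

    def complete(self, name):
        """Record that a job on the named server has finished."""
        if self._pending[name] == 0:
            raise ValueError(f"no pending jobs recorded for {name!r}")
        self._pending[name] -= 1

    def pending(self, name):
        return self._pending[name]
```

The client would call assign() before submitting a .wav file and complete() when the text comes back. Other policies, such as round-robin or weighting by server speed, would fit the same interface.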

Once the client/server session is established, the server accepts the .wav file and transcribes it into text. This text is then returned to the workflow application on the client, where it can be forwarded to reviewers for editing. The edited text comes back to the workflow application, which can then make it available to the user.

Typically, a single server is used to handle transcription loads for small to medium-sized transcription services. For larger installations, where the volume of transcription requests is high and turnaround time is critical, the user workflow application on the client machine can be modified to manage multiple transcription servers. Each server machine is treated as a separate operational entity, and user profiles can be associated with a single server or applied to multiple servers.
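Because a profile may live on one server or on several, a client that manages multiple servers has to restrict each job to the servers that hold the relevant profile. The Python sketch below assumes the client keeps its own mapping from profile to servers; the function names, profile identifiers and server names are all hypothetical.

```python
def eligible_servers(profile_servers, profile_id, all_servers):
    """Return the servers holding the profile, in the order given by all_servers."""
    holders = profile_servers.get(profile_id, set())
    return [server for server in all_servers if server in holders]


def pick_server(profile_servers, profile_id, all_servers, pending):
    """Among the servers holding the profile, choose the one with the fewest pending jobs."""
    candidates = eligible_servers(profile_servers, profile_id, all_servers)
    if not candidates:
        raise LookupError(f"profile {profile_id!r} is not available on any server")
    # Servers missing from the pending map are treated as idle.
    return min(candidates, key=lambda server: pending.get(server, 0))
```

For example, a profile held only on one server always routes there regardless of load, while a profile applied to several servers goes to whichever of them is least busy.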

If you have any questions after reviewing this information, please don't hesitate to contact us.



Wizzard Software has been building speech applications, and helping developers build them, for more than ten years, and we can help you with your project in a variety of ways.