Multimodal Web User Agent

Codename

Scope

Synopsis

A multimodal user agent for the web. Scope aims to combine a multimodal web browser and media player into a single application with a minimal appearance. It will be able to render formats such as SVG, VoiceXML, X3D and multimedia as well as the traditional XHTML/CSS/JavaScript. The format used to represent a web resource will be agreed between the user agent and the web server using HTTP content negotiation. The user agent runs in full screen mode by default, and is intended to run on an information appliance or to replace a traditional desktop environment and window manager, but it can also be run as a traditional desktop application if desired.

Rationale

The web is often assumed to be a collection of web pages, but really it is a collection of resources identified by Uniform Resource Identifiers (URIs); a web page is only one representation of an abstract resource. These resources are not limited to being represented by text and images you "browse" on your "desktop". You should be able to walk around the web, listen to it, watch it, have a conversation with it, interact with it and change it. You should be able to carry it in your pocket and hang it on your wall.

"A computer terminal is not some clunky old television with a typewriter in front of it. It is an interface where the mind and body can connect with the universe and move bits of it about." -- Douglas Adams

Scope aims to be a general-purpose user agent which can render resources in many formats, such as a vector image, voice synthesis, a 3D environment and perhaps some kind of tactile interface in the future. The format used for representation can be negotiated with a web server using content negotiation.

Features

Rendering Engines

  • Content & Structure
    • XHTML - Structured text (Gecko)
    • SVG - Vector image
    • VoiceXML - Voice synthesis
    • X3D - 3D environment
    • Binary Enclosures
      • PNG & JPEG - Image
      • MP3, OGG & FLAC - Sound
      • MPEG - Video
  • Layout & Style
    • CSS - layout and style for the above
  • Logic
    • JavaScript - logic for the above

Widgets

  • Back button
  • Forward button
  • Home button
  • Graphical address bar
  • Tabbed browsing
  • Combined command box, suggestion menu and progress bar
  • Clock
  • Power button
  • Error and notification message stripes

Content Negotiation

Scope will implement under-used parts of the HTTP specification for content negotiation, so that the representation format of a resource is chosen according to the user's preferences, abilities, environment or usage scenario. This feature will require server-side logic on the web server, where a resource may either be transformed from a base format to the user's preferred format using XSLT or simply be stored in multiple formats. An implementation of XSLT inside the client itself, for client-side transformation, may also prove useful.
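As a rough illustration of the idea, the sketch below shows how a user agent might state its format preferences in an HTTP Accept header and how a server might pick a stored representation from it. The function names, the q-values and the list of stored formats are assumptions made for this example, not part of any Scope implementation, and the matching is deliberately simpler than the full rules in the HTTP specification.

```python
# Illustrative sketch only: function names and preference values are assumptions.

def build_accept_header(preferences):
    """Build an Accept header value from (media_type, q) pairs."""
    parts = []
    for media_type, q in preferences:
        # A q-value of 1.0 is the default, so it can be left out.
        parts.append(media_type if q >= 1.0 else f"{media_type};q={q:.1f}")
    return ", ".join(parts)


def parse_accept_header(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for item in header.split(","):
        fields = [field.strip() for field in item.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unreadable q-value as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so equal q-values keep the order the client sent.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_representation(accept_header, available):
    """Return the best stored media type, or None if nothing is acceptable."""
    for media_type, q in parse_accept_header(accept_header):
        if q <= 0:
            continue
        if media_type == "*/*" and available:
            return available[0]
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # e.g. "image/*" becomes "image/"
            for candidate in available:
                if candidate.startswith(prefix):
                    return candidate
        if media_type in available:
            return media_type
    return None


# A voice-first device prefers VoiceXML, then XHTML, then SVG.
voice_first = [
    ("application/voicexml+xml", 1.0),
    ("application/xhtml+xml", 0.8),
    ("image/svg+xml", 0.5),
]
header = build_accept_header(voice_first)

# Assume the server only stores XHTML and SVG versions of the resource.
stored = ["application/xhtml+xml", "image/svg+xml"]
print(header)
print(choose_representation(header, stored))
```

In this example the server has no VoiceXML version, so it falls back to the XHTML representation. A server that has only a base format could instead run an XSLT transformation at the point where the sketch returns None.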

Natural language command line

Before graphical user interfaces, computers were operated through a "command line": a series of text-based commands with a strict structure, where individual commands had to be memorised. Scope will include an experimental "natural command line", which could be presented as a text box or used in combination with speech recognition, to enable a user to give commands to software in natural language. Example commands would be "email jack", "turn off the lounge light" or "play some classical music". The commands are input into the user agent, which may need to carry out additional processing such as speech recognition. The command could then be passed as a string of text over HTTP with a URI such as:

http://example.uk.home/?q=command goes here

It is then up to a server-side application to interpret this command and execute an action as a result.
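Because a natural language command contains spaces and possibly other reserved characters, it has to be percent-encoded before it can be placed in the query string. The following is a minimal sketch of that step using Python's standard library; the function names are hypothetical, and the base URI and the "q" parameter are simply taken from the example above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_command_uri(base_uri, command):
    """Client side: encode a natural language command into a request URI."""
    query = urlencode({"q": command.strip()})  # spaces are encoded as "+"
    return f"{base_uri}?{query}"


def extract_command(uri):
    """Server side: recover the command text from the request URI."""
    values = parse_qs(urlsplit(uri).query).get("q", [])
    return values[0] if values else None


uri = build_command_uri("http://example.uk.home/", "turn off the lounge light")
print(uri)                   # http://example.uk.home/?q=turn+off+the+lounge+light
print(extract_command(uri))  # turn off the lounge light
```

The server-side application receives the original text unchanged and is free to interpret it however it likes; the sketch makes no attempt at that interpretation.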

Use Cases

Although the initial development will be for common desktop operating systems like Windows, Mac and GNU/Linux, the primary intended platform is a new breed of "information appliances". This will include portable devices, television-like devices, touchscreens and a wide range of less common hardware. Use cases include:

  • Voice interface for hands-free operation or for visually impaired users
  • The use of a 3D headset and hand sensors to navigate a 3D environment
  • Full screen video playback
  • Multimodal interaction with a combination of web pages and voice
  • A screen on your fridge or TV
  • A media player on a small portable device

Implementation

See Webtop, previously Webscope
