Department of Engineering

IT Services

Web Mashups (DRAFT)

The term mashup has become popular and is now sometimes applied to traditional integrated solutions. Strictly, though, mashups are integrated programs that are quick to write and easy to use - good at what they do but rather limited. They're usually WWW applications that combine data (often dynamic web feeds) from one or more sources. They can be hacked together by hand, or built with a mashup editor. Such editors are being produced by some of the best-known computing companies, and are often designed to work with the APIs of the same company's products.

Mashup Editors

  • Yahoo Pipes - "a powerful composition tool to aggregate, manipulate, and mashup content from around the web" [1]. It "aims to let users design data-processing pipelines that filter, transform, enrich and combine data feeds and are again exposed as RSS feeds" ([1], p.46). The drag-and-drop editor works within a browser. Server-side: hosted on a Yahoo server.
  • Google App Engine - Uses Python or Java. Free hosting until storage and bandwidth exceed limits (5 million page views a month might still be free).
  • Popfly (Microsoft) - Component-based. The "blocks" can process data feeds, be JavaScript functions, or be visual components. The drag-and-drop editor works within a browser, but needs a Microsoft plug-in. Hosted on a Microsoft server, executed client-side.
  • Intel Mashmaker - needs a browser plug-in
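
The "filter, transform, enrich and combine" pipeline idea behind an editor such as Yahoo Pipes can be sketched in a few lines of Python. This is only an illustration of the dataflow style, not how Pipes itself is implemented: the two feeds are hard-coded lists standing in for fetched data, and the names tag_source, keep_matching and run_pipeline are hypothetical.

```python
from itertools import chain

# Hard-coded stand-ins for two fetched feeds (made-up sample data).
NEWS_FEED = [
    {"title": "New mashup editor released", "date": "2009-03-02"},
    {"title": "Campus network maintenance", "date": "2009-03-01"},
]
BLOG_FEED = [
    {"title": "Building a mashup in an afternoon", "date": "2009-03-03"},
]


def tag_source(items, source):
    """Enrich: yield a copy of each item labelled with its source feed."""
    for item in items:
        yield {**item, "source": source}


def keep_matching(items, keyword):
    """Filter: yield only the items whose title mentions the keyword."""
    for item in items:
        if keyword.lower() in item["title"].lower():
            yield item


def run_pipeline(keyword):
    """Combine both feeds, filter by keyword, and sort newest first."""
    combined = chain(tag_source(NEWS_FEED, "news"), tag_source(BLOG_FEED, "blog"))
    matching = keep_matching(combined, keyword)
    # ISO-style dates sort correctly as plain strings.
    return sorted(matching, key=lambda item: item["date"], reverse=True)


if __name__ == "__main__":
    for item in run_pipeline("mashup"):
        print(item["date"], item["source"], item["title"])
```

Each stage takes a stream of items and returns a stream of items, so stages can be rearranged or added much as blocks are wired together in a drag-and-drop editor.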

[1] attempts a summary of these approaches, classifying them in terms of "Orchestration between components" (flow-based, event-based, or layout-based), "Data-passing style" (dataflow or global variables), etc. [2] introduces some of the principles and technologies.

Standardisation is being attempted - e.g. the OpenAjax Alliance have produced OpenAjax Hub 2.0 which includes "SMash" (Secure mash-up technology), "OpenAjax Metadata" and "OpenSearch".


  • Ajax - comprises several technologies (XHTML, CSS, DOM, asynchronous data exchange (typically of XML data), and client-side scripting, primarily JavaScript) that together create a smooth, cohesive Web experience for the user by exchanging small amounts of data with the content servers rather than reloading and re-rendering the entire page after each user action. Ajax toolkits and libraries include Sajax and Zimbra, and are usually implemented in JavaScript.
  • Atom/RSS - Both are syndication formats that let a site offer data (often just text) to other sites. Atom is newer, and seeks to maintain better metadata than RSS.
  • API - Application Programming Interface - a set of routines for interfacing with an application.
  • Client-side scripting - code (e.g. JavaScript) supplied on a web page and given to the browser to run. The alternative is server-side scripting (e.g. PHP), where the code is run on the web server before the resulting HTML is sent to the browser. The advantage of client-side scripting is that you don't need any special facilities on the web server. The disadvantage is that users can deliberately disable their browser's JavaScript, and browsers may be buggy or old. If the server-side scripts produce JavaScript (mashups often do) you have a hybrid system.
  • EMML - Enterprise Mash-up Mark-up Language
  • JSON - JavaScript Object Notation. A lightweight data-exchange format, often used in Ajax instead of XML.
  • Lash-ups - Like mash-ups, but more of a hack
  • OMA - Open Mashup Alliance
  • RDF - Resource Description Framework. A W3C family of specifications establishing syntactic structures (a grammar) that describe data.
  • REST - Representational State Transfer (an architectural style for communicating with remote services, rather than a protocol in its own right). Typically uses plain HTTP, often with XML or JSON payloads, and is simpler than SOAP. Programs that use it are said to be RESTful.
  • Screen scraping - using software tools to parse and analyze content that was originally written for human consumption (if there's no API).
  • SOAP - Simple Object Access Protocol (a protocol for communicating with remote services). SOAP APIs for Web services are described by WSDL documents.
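
Several of the terms above (RSS, XML, JSON) meet in a typical mashup step: reading a feed and re-exposing it in a format that client-side JavaScript can consume easily. The following is a minimal sketch using only the Python standard library. The feed is a small hand-written RSS 2.0 fragment rather than a real one, and the function name rss_to_json and the shape of the JSON output are assumptions made for illustration.

```python
import json
import xml.etree.ElementTree as ET

# A tiny hand-written RSS 2.0 document standing in for a fetched feed.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First item</title>
      <link>http://example.org/1</link>
    </item>
    <item>
      <title>Second item</title>
      <link>http://example.org/2</link>
    </item>
  </channel>
</rss>"""


def rss_to_json(rss_text):
    """Parse RSS text and return the feed title and items as a JSON string."""
    root = ET.fromstring(rss_text)  # root is the <rss> element
    items = [
        {"title": item.findtext("title"), "link": item.findtext("link")}
        for item in root.iter("item")
    ]
    feed = {"feed": root.findtext("channel/title"), "items": items}
    return json.dumps(feed, indent=2)


if __name__ == "__main__":
    print(rss_to_json(SAMPLE_RSS))
```

A server-side script along these lines, fed by a real feed instead of the sample string, could supply JSON to an Ajax page, giving the hybrid arrangement described under client-side scripting.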

See Also

Much of this page's material derives from [1] and [2]