Abstract
For years, people have dreamt of the concept of a universal information database - information not only accessible to people around the world, but information that would link easily to other pieces of information so that only the most important information would be quickly found by a user. This idea was explored in the 1960s, leading to visions of a 'docuverse' that people could swim through, revolutionising all aspects of human-information interaction, particularly in the educational field. Only now has the technology caught up with these dreams, making it possible to implement them on a global scale.
The World-Wide Web (WWW, W3 or Web) is a distributed hypermedia document system developed at CERN in Geneva, Switzerland. The original definition describes the WWW as a 'wide area hypermedia information retrieval initiative aiming to give universal access to a large universe of documents' - the first true global hypermedia network. The number of research institutes, universities and commercial companies around the world joining the World-Wide Web is increasing rapidly, creating an electronic cyberspace that spans the globe. It is becoming widely used and useful in a wide range of applications including commerce, entertainment, education and research.
This paper provides an insight into the workings and some emerging applications of the World-Wide Web.
Introduction
Imagine thousands of computers world-wide wishing to view the same group of documents. Without any type of communication system, this information would have to be located on each computer, requiring tremendous disk space and numerous copies of each document. Now imagine a system of common links connecting all of these computers together, the Internet1. These computers can now 'talk' to each other, and can be given access to the same documents without needing to store a copy on their own computer. But how do they get these documents? There are many ways computers can communicate (protocols), but each computer would need to have access to every protocol to obtain every available document. Instead, they could have access to only one information system that could support almost every protocol available, the World-Wide Web.
The official definition of the World-Wide Web describes it as a 'wide-area hypermedia information retrieval initiative aiming to give universal access to a large universe of documents'. What the World-Wide Web project has done is provide users on computer networks with a consistent and simplified means of accessing a variety of resources. Using a popular software interface to the Web called Mosaic, the Web project has changed the way people view and create information - it has created the first true global hypermedia network.
The operation of the Web relies on hypertext (or hypermedia2) as its means of interacting with users. Hypertext is basically the same as regular text - it can be stored, read, searched, or edited - with an important exception: hypertext contains connections within the text to other documents. For instance, suppose you were able to somehow select (with a mouse or with your finger) the word 'hypertext' in the sentence before this one. In a hypertext system, you would then have one or more documents related to hypertext appear before you - a history of hypertext, for example, or the author's definition of hypertext. These new texts would themselves have links and connections to other documents - continually selecting text would take you on a free-associative tour of information. In this way, hypertext links, called hyperlinks, can create a complex virtual web of connections. Hypertext is:
The World-Wide Web
The World-Wide Web (WWW) is a 'distributed heterogeneous collaborative multimedia information system'. It can be described as a seamless world in which all information, from any source, can be accessed in a simple and consistent manner.
The concept of the WWW was first developed at CERN, the European Particle Physics Laboratory in Geneva, Switzerland, in March 1989. The intention was to create a wide-area hypermedia information retrieval system, giving universal access to large volumes of information (documents) on distributed projects. Originally aimed at the High Energy Physics community, it has spread to other areas and attracted much interest in user support, resource discovery and collaborative work. Currently, it is the most advanced information system deployed on the Internet and is equipped to embrace many future advances in technology, including new networks, protocols and data formats.
To summarise, a WWW 'client' program (or browser) runs on your computer. It displays an object, normally a 'document' with text and possibly images, obtained from another computer, the 'server'. The user can either request a search, typing in plain text (or complex commands) to send to the server, or can follow a link from a highlighted phrase to another document. In either case, the client sends a request to the server, often a completely different machine in some other part of the world, and within (typically) a few seconds, the related information, whether hypertext again, plain text or multimedia, is presented. This is done repeatedly, and by a sequence of selections and searches one can find anything that is 'out there'.
Some important things to point out are:
· Access to information is ubiquitous as electronic links, hidden to the user, globally interconnect electronic information.
· A node of information is provided or made available by a WWW server program. In simple cases, the server program can generate a hypertext view representing the directory structure of an existing file store.
· WWW server programs are freely available on the Internet for almost all platforms and run as daemon processes.
· There are many WWW client programs (for almost all GUI-based platforms) giving the user easy access to hypertext databases (nodes) as well as routes to FTP3, Gopher4, Usenet (discussion) Groups, WAIS5 and MBONE6. Figure 1 shows Mosaic7, the most popular interface to the WWW.
· As hypertext information is transmitted on the network in logical (mark-up) form, each client can interpret this in a way natural for the given platform, making optimal use of fonts, colours and other human interface resources available on that platform.


What does WWW define?
The WWW is a client-server based architecture, excellent for retrieving information from remote sites, and it also allows the user to interact with such sites. The resources most commonly accessed through the WWW are documents written using the HyperText Mark-up Language (HTML). With HTML, parts of a document can be treated as hyperlinks (or references) to other WWW resources. These hyperlinks, often textual, sometimes graphical, are indicated by highlights. When the user clicks on such a link, the browser automatically retrieves the information pertaining to that link.
The WWW can be characterised in the following way:
· The idea of a world in which all information items have a reference by which they can be retrieved; an addressing system (URL) makes this possible despite the many different protocols in use.
· A network protocol (HTTP) used by native WWW servers giving performance and features not otherwise present.

Figure 2: The World-Wide Web client-server architecture. For information to be universally available, WWW relies on a common addressing syntax, a set of common protocols and negotiation of data formats.
HyperText Mark-up Language
The standard language the WWW uses for creating and recognising hypermedia documents is the HyperText Mark-up Language (HTML). It adheres to the Standard Generalised Mark-up Language (SGML), a document formatting language used widely in some computing circles.
HTML is widely praised for its ease of use. WWW documents are typically written in HTML and are usually named with the suffix '.html'. HTML documents are nothing more than standard 7-bit ASCII files with formatting codes that contain information about layout (text styles, document titles, paragraphs, lists, etc.) and hyperlinks. Many free software converters are available for translating documents in foreign formats (e.g., LaTeX, RTF) to HTML. Figure 3 shows a small example of HTML.
<HTML>
<HEAD> <TITLE> This is the title of my document </TITLE> </HEAD>
<BODY>
<H1> This is a level one heading </H1>
Here is some normal text. We can have <B> bold </B> or <I> italic </I> text. We can put paragraphs at the end of our text <P>
We can include a GIF picture here:
<IMG SRC="photo.gif">
We can make links to another web page by doing <A HREF="page.html"> this </A>
</BODY>
</HTML>
Figure 3: The language used to write a page of Web data is called HTML, the HyperText Mark-up Language.
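Because hyperlinks and images are ordinary mark-up inside an ASCII file, a program can find them mechanically. The following is a minimal sketch in Python, using the standard html.parser module, of how a client might pick out the link targets and inline images from a page. The class name LinkCollector, the function collect_links and the abridged SAMPLE page are illustrative assumptions, not part of any WWW software described here.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect hyperlink targets (A HREF) and inline images (IMG SRC)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag and attribute names in lower case.
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.links.append(attributes["href"])
        elif tag == "img" and attributes.get("src"):
            self.images.append(attributes["src"])


def collect_links(html_text):
    """Return (links, images) found in a piece of HTML text."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links, parser.images


# Abridged, hypothetical page in the style of the HTML figure.
SAMPLE = """<HTML>
<HEAD> <TITLE> This is the title of my document </TITLE> </HEAD>
<BODY>
<H1> This is a level one heading </H1>
We can include a GIF picture here:
<IMG SRC="photo.gif">
We can make links to another web page by doing <A HREF="page.html"> this </A>
</BODY>
</HTML>"""

links, images = collect_links(SAMPLE)
print(links)   # ['page.html']
print(images)  # ['photo.gif']
```

A browser does considerably more than this (layout, fonts, fetching the image), but the sketch shows that the hyperlink itself is nothing more than an attribute in the text.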
The initial HTML standard introduces basic hypermedia document creation and layout. It includes simple structure elements, such as several levels of headings, bulleted lists, menus and compact lists, all of which are useful when presenting choices and in on-line documents. Under development is a much more enriched version of HTML, called HTML+. It will support interactive forms for the entry of data by users, defined 'hotspots' in images, more versatile layout and formatting options and styles, formatted tables and mathematical formulae, among many other improvements. Currently many browsers support a subset of the HTML+ features along with the core HTML set.
Uniform Resource Locator
HTML uses Uniform Resource Locators (URLs) to represent hypermedia links and links to network services within documents; a URL is, in effect, the address of a resource such as a web page. It is possible to represent nearly any file or service on the Internet with a URL. The URL syntax allows objects (menus, documents, images, etc.) to be addressed not only using HTTP, but also using the other common networked information protocols in use today (FTP, NNTP, Gopher and WAIS). For example, the URL of the main page for the WWW project is:
This makes it easy to address an object anywhere on the Internet. It also makes the WWW system scalable and keeps the information space independent of the network and server topology.
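To make the addressing scheme concrete, the sketch below splits a few URLs into the protocol, host, port and path that a client would need in order to fetch the object. It uses Python's standard urllib.parse module; the helper name describe_url is an assumption made for illustration, and the URLs are taken from the starting points listed in the appendix.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the parts a client needs to locate the object."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,   # which protocol to speak (http, gopher, ...)
        "host": parts.hostname,   # which server to contact
        "port": parts.port,       # None means "use the protocol's default port"
        "path": parts.path,       # which object to ask that server for
    }


for url in (
    "http://www.ericsson.se:80/",
    "gopher://info.itu.ch/1",
    "http://uu-gna.mit.edu:8001/uu-gna/text/cc/index.html",
):
    print(describe_url(url))
```

The same function handles the HTTP and Gopher addresses alike, which is the point of a common syntax: only the scheme tells the client which protocol to use.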
HyperText Transfer Protocol
The language that WWW clients and servers use to communicate with each other is called the HyperText Transfer Protocol (HTTP). All WWW clients and servers must be able to speak HTTP to send and receive hypermedia documents. For this reason, WWW servers are often called HTTP servers. The phrase 'World Wide Web' is often used to refer to the collective network of servers speaking HTTP as well as the global body of information available using the protocol.
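As a rough illustration of what 'speaking HTTP' involves, the sketch below builds the text a client could send to a server to ask for a document. It assumes the simple HTTP/1.0 request form (a request line followed by a blank line), which is not spelled out in this paper; the function name build_get_request is hypothetical, and nothing is actually sent over the network.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Return (host, port, request_bytes) for fetching an http URL.

    Assumption: HTTP/1.0-style request line, no optional headers.
    """
    parts = urlsplit(url)
    if parts.scheme != "http":
        raise ValueError("only http URLs are handled in this sketch")
    path = parts.path or "/"          # an empty path means the server's root
    if parts.query:
        path += "?" + parts.query     # search terms travel in the query part
    request = f"GET {path} HTTP/1.0\r\n\r\n"
    # Port 80 is the default for HTTP when the URL names no port.
    return parts.hostname, parts.port or 80, request.encode("ascii")


host, port, request = build_get_request("http://www.sun.com/")
print(host, port, request)
```

A client would open a TCP connection to the returned host and port, write the request bytes, and read the document that comes back.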
Application
Before the WWW, finding information on computer databases scattered around the world required knowing arcane addresses and commands like 'Telnet 192.100.81.100'. The WWW lets computer users simply click a mouse on words or images on their computer screens to summon text, sound and images from many of the hundreds of databases on the Internet.
The WWW offers a very easy-to-use interface to the traditionally hard-to-master resources on the Internet. It is probably this ease of use, as well as the popularity of many graphical interfaces to the WWW, that caused the explosion of WWW traffic in 1993. The potential of using networked hypertext and multimedia has prompted many users to create and explore countless innovative applications on the Internet. Some of these applications are described below.
Education and Training
Some applications of the World-Wide Web are showing how information useful for teaching and learning about business, telecommunications, data communications, etc. may be effectively shared over the Internet. Materials such as the following might be collaboratively provided, shared, and used by authors, instructors, and students world-wide:
For example, an introductory course 'Object Oriented Programming Using C++' ('http://uu-gna.mit.edu:8001/uu-gna/text/cc/index.html') is provided by GNA. It is a self-paced course sponsored through the Macvicar School of Education and Technology, one of many GNA schools delivering education on-line. The course is built around a hypertext book, mailing lists, a virtual classroom (interactive) and student projects.
The Hypertext Book
The C++ tutorial comes with a wealth of sample programs. Here the advantage of the WWW access to course material is obvious: programs can be accessed directly from within the text page and formatted as elaborately as one wants, e.g. to mark language keywords, statements and user-defined variables. Alternatively, the student can cut and paste the ASCII source text, or look up language specifications in a glossary of C++ terms to be made searchable later on. Within the course, an alternative approach using automated hypertextification of a full C++ class library is tested.
The presentation of C++ source code on the WWW is an interesting topic by itself: students of the course have already contributed conversion programs. Teaching and learning with the World-Wide Web offers two advantages: diversity and homogeneity.
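One way such a conversion program might work is sketched below in Python: it escapes the characters that have a special meaning in HTML and wraps C++ keywords in bold tags. This is an illustrative assumption, not one of the students' programs; the function name cpp_to_html is hypothetical, and the keyword list is a deliberately small subset of the language.

```python
import html
import re

# Small illustrative subset of C++ keywords (an assumption, not a full list).
CPP_KEYWORDS = ("class", "int", "public", "return", "void")

_KEYWORD_RE = re.compile(r"\b(" + "|".join(CPP_KEYWORDS) + r")\b")


def cpp_to_html(source):
    """Turn C++ source text into an HTML fragment with keywords in bold."""
    # Escape <, > and & first so the source cannot be mistaken for mark-up.
    escaped = html.escape(source, quote=False)
    # Then wrap each whole-word keyword in <B>...</B>.
    marked = _KEYWORD_RE.sub(r"<B>\1</B>", escaped)
    # <PRE> keeps the original line breaks and indentation.
    return "<PRE>\n" + marked + "\n</PRE>"


print(cpp_to_html("int main() { return 0; }"))
```

The sketch is naive in one respect: it also marks keywords that happen to appear inside comments or string literals, which a real converter would need to skip.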
Librarians to Cybrarians
In early 1993, C.I.C.G. and I.M.A.G. decided to work together on a project called 'Bibliothèque Virtuelle' (Virtual Library) - library to the desktop. The aim was to provide access, not only to catalogues of bibliographic references, but also to the actual grey literature (theses, technical reports and other publications) produced by researchers in I.M.A.G., Grenoble and even elsewhere.
With the development of the use of the network and network retrieving tools within libraries, the librarian's job is certain to be among those most subject to change in the coming years. Up till now the library has been a place where documentation was centralised, and the librarians organised it. Now, with the whole Internet virtually accessible on the user's desktop, documentation will be decentralised, but the problem of providing a coherent system for finding pertinent information is all the more crucial. Librarians will now be cataloguing and manipulating 'electronic objects' as easily as they catalogued and manipulated paper documents yesterday. They must take possession of information servers, and create 'organiser points' in the 'creative anarchy' of the Internet today.
Today a number of library gateways are already available on the WWW. The Web is a new medium transforming the way libraries provide access to information. The hyperlinked structure will create a smooth interface, allowing users to find what they need easily.
Other applications
The WWW can be used as a front end to other applications, where it acts as a messenger returning the values generated by underlying applications. Such an application may be for example an object oriented database that stores information and allows sophisticated queries of the information. User queries could be submitted to the database using the query language via the WWW, and the return values of the query displayed back to the user.
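The messenger role described above can be sketched as a small gateway function: it passes a query to a database and formats the returned values as HTML for the browser. The sketch is in Python and, as a stand-in for an object-oriented database, uses the relational sqlite3 module from the standard library. The table name reports, its contents and both function names are hypothetical.

```python
import html
import sqlite3


def make_demo_database():
    """Create a tiny in-memory database with made-up sample rows."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE reports (title TEXT, year INTEGER)")
    connection.executemany(
        "INSERT INTO reports VALUES (?, ?)",
        [("Hypertext & libraries", 1993), ("Web gateways", 1994)],
    )
    return connection


def run_query_as_html(connection, sql, parameters=()):
    """Run a query and return its rows as an HTML bulleted list."""
    items = []
    for row in connection.execute(sql, parameters):
        # Escape each value so data cannot be mistaken for mark-up.
        cells = ", ".join(html.escape(str(value)) for value in row)
        items.append("<LI>" + cells + "</LI>")
    return "<UL>\n" + "\n".join(items) + "\n</UL>"


database = make_demo_database()
print(run_query_as_html(
    database,
    "SELECT title, year FROM reports WHERE year >= ? ORDER BY year",
    (1993,),
))
```

In a real gateway the query would arrive from the browser in a request and the returned HTML would be sent back by the server; passing user-supplied values as parameters, rather than pasting them into the query text, keeps the two apart.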
Some of the innovative, emerging types of applications on the Web are:
Users of the WWW include computer scientists, librarians, software developers, magazine publishers, record companies and catalogue distributors, all of whom see the WWW as the first step towards the emerging super highway. For specific end-users, access to current and accurate information of broad variety is essential to the quality of results and systems produced. This leads to a need for support systems that provide rapid access to such information. The WWW platform can be used as a means of providing this support.
Recent Developments
In July 1994 CERN and MIT announced the WWW Organisation (W3O), an initiative to further the development and standardisation of the Web. Dr. Martin Bangemann, the Commissioner of the European Union in charge of Industrial Policy, Information Technologies and Telecommunications, commented: 'The European Union intends to support this co-operative activity as an important step toward the Global Information Society'.
Mosaic Communications Corporation ('http://www.mcom.com/') announced in October 1994 that it is offering its newly introduced 'Netscape'9 network navigator (browser) free to users via the Internet. The new Internet navigator (see Figure 4), developed by the six-month-old Silicon Valley company led by Silicon Graphics founder Jim Clark and NCSA Mosaic creator Marc Andreessen, is available immediately for free downloading by individual, academic and research users.
Mosaic Communications' browser achieves performance improvements through new capabilities such as:

Figure 4: A view of the Netscape Mosaic WWW browser for X. An interactive UK guide to World-Wide Web servers is shown to the user.
The Future
Some future directions of the WWW include: commercial publishing; collaborative work (multiple authors, conferences, etc.); object-oriented databases that store non-document objects manipulated by users; access from schools and homes (education on-line, distance learning, etc.); the involvement of librarians; fast-changing information (financial, news, etc.); reference 'books' (all disciplines); and the merging of mail, net-news and the WWW. The list is endless.
So much for the web as it is. Everything we have seen so far is information distributed by server managers to clients everywhere. A next step is the move to universal authorship, in which everyone involved in an area can contribute to the electronic representation of the group knowledge.
Conclusions
The World-Wide Web project began at the CERN high-energy physics laboratory in 1989. However, the rapid growth of information providers and information users on the WWW did not begin until NCSA released the first version of its Web client, Mosaic, in November 1992. While the WWW is still a relatively new Internet information retrieval system, a conservative estimate would number its users at more than two million.
In addition to its expanding user base, the WWW has many features that make it attractive for networked information dissemination. Its multimedia capabilities support the retrieval and display of text, graphics, animation, and the playback of sound. These capabilities will likely seem crude in the future, but today they put the Web head and shoulders above the other networked information systems in widespread use. The WWW is also a multiplatform system, with user software available for use with MS Windows, Macintosh, X and VT100 terminals (the latter is limited to text display). This makes web-based information available to a very broad range of Internet users.
In 1994, commercial software for the WWW started to appear. The majority of the software for use on the WWW is freely distributed via anonymous FTP for non-commercial applications. The software does not require vast system resources, and in many cases binary executable files can be downloaded and run with essentially no modification. This equates to a low start-up cost for organisations that want to explore the use of the system.
The usability of the WWW is due in part to its ability to handle multiple application protocols. The Web clients provide user access to HTTP (the defining protocol of WWW), anonymous FTP, Gopher, and WAIS data servers, to name a few. The HTTP (HyperText Transfer Protocol) gives the system its global hypermedia functionality. The hyperlinks between files and documents from servers around the world make the system work as if it were one huge information web.
The WWW system is evolving continuously as new capabilities are added and previous limitations are overcome. There are user-end requirements and network considerations that effectively limit its use by some individuals and organisations. In order for the information files to reach the end user, that user must have local access to a computer which has a unique IP address. Today, this represents a minority of homes and businesses. In addition, even those who have the necessary network access may be at the end of one or more slower network segments (such as 56 kbaud). For those users, it can take a considerable time to download large files such as WWW animation, sound and large graphics files. Relatively few people today have Internet access from their homes (although this is changing fairly rapidly in the United States).
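The effect of a slow network segment is easy to estimate. The sketch below, in Python, computes an idealised transfer time from a file size and a link speed. It assumes that the 56 kbaud figure corresponds to 56,000 bits per second and ignores protocol overhead and competing traffic, so real transfers would take longer; the function name and the one-megabyte file size are illustrative assumptions.

```python
def transfer_seconds(size_bytes, link_bits_per_second):
    """Idealised time to move a file across a link, ignoring all overhead."""
    return size_bytes * 8 / link_bits_per_second


# A hypothetical 1,000,000-byte sound or graphics file over a 56,000 bit/s link:
seconds = transfer_seconds(1_000_000, 56_000)
print(round(seconds))  # 143 seconds, i.e. well over two minutes
```

By the same arithmetic, a 20,000-byte text page crosses the same link in about three seconds, which is why text remains usable on connections where animation and sound are not.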
Today's limitations on network access can be expected to fade away in the future. The information transport companies are working vigorously to expand high-speed network access beyond the universities, government laboratories and large corporations and into homes and schools. As more and more organisations begin to provide public information services via the Web (as well as other Internet systems) the demand for ready access to the Internet will grow exponentially. Today's barriers to information access will begin to crumble as public libraries, schools, and government agencies continue to join the WWW as information providers and users.
The WWW initiative occupies the meeting point of many fields of technology. Users put pressure and effort into bringing about the adoption of WWW in new areas. Apart from being a place of communication and learning and a new market place, the WWW is proving to be fertile ground for new developments in information technology.
Appendix:
Getting started
Freely distributable WWW servers are available for most operating systems, including many flavours of Unix, PC/Windows and Macintosh System 7. However, TCP/IP is required on the host system. WWW servers use the transfer protocol HTTP and listen on TCP port 80 for incoming connections from WWW clients.
As far as security10 is concerned, sites can use their own firewall to block out incoming traffic on all ports except port 80. Some WWW server software (CERN's HTTPD for example) allows you to use access control lists to determine what part of your WWW is visible to specific users at specific sites.
CERN's WWW server is available by anonymous FTP from 'info.cern.ch'. NCSA's 'Mosaic' browser for WWW is available for X, Mac or PC/Windows by anonymous FTP from 'ftp.ncsa.uiuc.edu', currently without charge for academic users.
Writing Web Documents
To write WWW documents, you'll need to learn how to use HTML, the HyperText Mark-up Language that defines how a WWW document looks. There are a number of places where you can learn about HTML.
A Beginner's Guide to HTML, 'http://www.ncsa.uiuc.edu/demoweb/html-primer.html'
HTML Documents (tutorials for writing HTML), 'http://fire.clarkson.edu/doc/html/htut.html'
Extensions to HTML (HTML 2.0, W3O), 'http://home.mcom.com/home/services_docs/html-extensions.html'
Creating High-Impact Documents, 'http://home.mcom.com/home/services_docs/impact_docs/creating-high-impact-docs.html'
Once you've written your HTML, you will want to learn about the more advanced topics, such as forms and GIF and JPEG images, that every WWW author should know.
World Wide Web (a pointer to more information), 'http://akebono.stanford.edu/yahoo/Computers/World_Wide_Web/'
WWW Development (Published by GNN, the Global Network Navigator), 'http://www.charm.net/~web/Vlib.html'
HTML Editors (Mac, Windows, X, etc.), 'http://akebono.stanford.edu/yahoo/Computers/World_Wide_Web/HTML_Editors/'
Surf the Net
Here are a couple of good starting points:
One site is the NASA WWW home page; 'http://hypatia.gsfc.nasa.gov/NASA_homepage.html'.
Another is Cardiff's Movie Database Browser; 'http://www.msstate.edu/Movie/'.
Also, the AT&T Bell Laboratories Research WWW home page; 'http://www.research.att.com/'.
Sun Microsystems' Web Server; 'http://www.sun.com/'.
Broadcom Éireann's WWW home page; 'http://www.broadcom.ie/'.
Ericsson Netherland's WWW Server; 'http://www.ericsson.se:80/'.
The ATM Forum; 'http://www.atmforum.com/'.
International Telecom Union (ITU); 'gopher://info.itu.ch/1'.
The European Commission; 'http://www.earn.net/EC/'.
European Electronic Information Market; 'http://www.echo.lu/'.
IEEE gopher (International); 'gopher://info.computer.org/1'.
Pathfinder from Time Warner; 'http://www.timeinc.com/pathfinder/Welcome.html'.
The NetMarket Company, the interactive market on the Internet; 'http://www.netmarket.com/'.
The Internet Plaza (on-line shopping experience); 'http://plaza.xor.com/'.
Planet Earth WWW home page; 'http://white.nosc.mil/info.html'.
The WWW Virtual Library; 'http://info.cern.ch/hypertext/DataSources/bySubject/Overview2.html'.