How the World Wide Web works


The World Wide Web works much like other Internet applications: here too the system is based on the interaction between a client and a server. The communication protocol the two modules use to interact is called HyperText Transfer Protocol (HTTP). The only specific difference, though an important one, is the special format in which documents placed on the Web must be stored, called HyperText Markup Language (HTML).


Web clients are the interface tools between the user and the system; the main functions they perform are:

  • receive user commands

  • request documents from servers

  • interpret the format of the documents received and present them to the user.

In telematic jargon these programs are also called browsers, from the English verb to browse, since they let you leaf through documents. When the user activates a connection, either by following a link or by explicitly specifying the address of a document, the client sends a request to a specific server, indicating the file it wants to receive.
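To make the idea of a request more concrete, here is a minimal sketch in Python of the text a client might send to a server to ask for a document. The function name build_get_request is our own invention for illustration, and the request is deliberately reduced to the bare essentials (an HTTP/1.0 request line plus a Host header); real browsers send several more headers.

```python
def build_get_request(host: str, path: str) -> bytes:
    """Return the bytes of a minimal HTTP/1.0 GET request for `path` on `host`."""
    lines = [
        f"GET {path} HTTP/1.0",  # request line: method, document, protocol version
        f"Host: {host}",         # header naming the server being addressed
        "",                      # an empty line marks the end of the headers
        "",
    ]
    # HTTP separates lines with carriage return + line feed.
    return "\r\n".join(lines).encode("ascii")


request = build_get_request("www.liberliber.it", "/index.htm")
print(request.decode("ascii"))
```

The sketch only builds the request; actually sending it would require opening a network connection to the server, which is left out here.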

The Web server, or more precisely the HTTP server, handles the management, retrieval and delivery of the individual documents requested by clients. Naturally, it can serve multiple requests at the same time. But a server can perform other functions as well. A typical task of HTTP servers is interacting with other programs, which makes it possible to produce documents dynamically. Let's look more closely at what this means.

A Web document is, of course, a file which, once prepared and put online, remains available to users 'as is' until the system manager decides to modify or remove it. There are cases, however, in which the content of a document needs to change dynamically, at fixed intervals or as the result of some operation: for example every time the document is accessed, or when the data in a table must be updated automatically after a calculation program has recomputed the corresponding values, or when the results of a database search must be sent, inserted in an appropriate context. The Web server can carry out these operations through the so-called Common Gateway Interface (CGI), a set of standard commands that lets it communicate with other applications and programs (for example, to run an automatic search on a database) and produce on the fly Web documents suited to the operation performed (for example, containing the search results). Naturally, all of this is completely transparent to the end user.
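As a rough illustration of dynamic document production, the following Python sketch imitates what a small CGI-style program might do: it reads the search parameters passed on by the server, looks them up, and returns a freshly built HTML page. Everything specific here is an assumption made for the example: the CATALOGUE dictionary stands in for a real database, and handle_query and the 'author' parameter are hypothetical names.

```python
import os
from html import escape
from urllib.parse import parse_qs

# Hypothetical stand-in for a real database.
CATALOGUE = {
    "Manzoni": ["I promessi sposi"],
    "Boccaccio": ["Decameron"],
}


def handle_query(query_string: str) -> str:
    """Build the program's output: a header block, a blank line, an HTML page."""
    params = parse_qs(query_string)
    author = params.get("author", [""])[0]
    titles = CATALOGUE.get(author, [])
    items = "".join(f"<li>{escape(title)}</li>" for title in titles)
    body = f"<ul>{items}</ul>" if titles else "<p>No results.</p>"
    page = f"<html><body><h1>Results for {escape(author)}</h1>{body}</body></html>"
    # The header tells the server (and the browser) that HTML follows.
    return "Content-Type: text/html\r\n\r\n" + page


if __name__ == "__main__":
    # Under CGI the server passes the query in the QUERY_STRING variable;
    # the fallback value is only there so the sketch runs on its own.
    print(handle_query(os.environ.get("QUERY_STRING", "author=Boccaccio")), end="")
```

The user never sees this program: the browser simply receives a page containing the search results, exactly as if it were a stored file.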

Another typical function performed by the server is the management of financial transactions, such as recording a purchase made with a credit card. From a technical point of view, this operation does not differ much from ordinary consultation or updating of a database. But the reliability and security problems are obviously far more serious in this case: after all, being told that Manzoni wrote the Decameron would be considered far less serious than being charged a million dollars for a book, or discovering that our credit card number has fallen into the hands of a skilled cyber scammer. For this reason, HTTP servers have been developed that specialize in handling secure financial transactions through complex data encryption technologies (more on this later).

HyperText Markup Language


HyperText Markup Language (HTML) is the format in which hypermedia Web documents are stored. It is a markup language, designed specifically for describing text documents. HTML is based on the syntax of the Standard Generalized Markup Language (SGML), a metalanguage for defining markup systems, which we will discuss more fully later.

But what does 'markup language' mean? The idea of 'markup' in an electronic document derives from the symbols that writers and proofreaders use in traditional printing to tell the compositor and typographer how to treat graphically the parts of a text that perform particular functions: for example, underlining to indicate italics. Similarly, markup languages are made up of a set of instructions, called tags (markers), which serve to describe the structure, composition and layout of the document. Markers are sequences of ordinary ASCII characters, and are inserted into the document, according to a certain syntax, next to the portion of text to which they refer.

An HTML document is therefore a text file that contains, together with the actual textual content, the markers that describe its structure. For example, it is possible to indicate the different levels of headings in a document, the style of the characters (italics, bold ...), the paragraphs, and the presence of lists (numbered or not). To create a hypermedia document, there are also specific markers for defining hypertext links and for inserting images. Of course, the images are not an integral part of the HTML file, which as such is a simple text file. Graphic files are sent by the server as separate objects, and inserted into a Web page only when the browser displays it. The standard digital image formats on the Web are GIF and JPEG. These are graphic encoding systems that can significantly reduce the size of the file, and are therefore particularly suitable for use on a network.
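To show what such markers look like in practice, here is a short Python sketch that holds a tiny HTML document as a string and uses the standard html.parser module to list the markers it contains, together with the addresses referenced by its link and its image. The sample page is invented for this example, and the file name logo.gif is purely hypothetical.

```python
from html.parser import HTMLParser

# An invented sample page: headings, character styles, a list, a link, an image.
SAMPLE = """<html>
<head><title>A sample page</title></head>
<body>
<h1>Main title</h1>
<p>A paragraph with <i>italic</i> and <b>bold</b> text.</p>
<ul><li>first item</li><li>second item</li></ul>
<a href="http://www.liberliber.it/index.htm">a hypertext link</a>
<img src="logo.gif">
</body>
</html>"""


class MarkerLister(HTMLParser):
    """Collect the opening markers and any link or image addresses."""

    def __init__(self):
        super().__init__()
        self.markers = []
        self.targets = []

    def handle_starttag(self, tag, attrs):
        self.markers.append(tag)
        for name, value in attrs:
            # 'href' holds a link's destination, 'src' an image's location.
            if name in ("href", "src"):
                self.targets.append(value)


lister = MarkerLister()
lister.feed(SAMPLE)
print(lister.markers)
print(lister.targets)
```

Note that the image itself appears nowhere in the text: the document only names the file, which the browser must request separately from the server.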

Through HTML commands it is also possible to specify interactive structures such as input forms, through which the user can send commands and information to the server and activate special procedures (database searches, e-mail messages and even credit card payments), or to draw tables.

An Internet user who only wants to search for information on the net, and not to produce it, can get by without worrying about how HTML works. Note, however, that one of the fundamental characteristics of the Internet is the extreme ease with which anyone can become an active participant in the exchange of information. If you want to take this decisive step, a little familiarity with HTML is required. There is no need to be afraid: HTML is not a programming language, and its basic instructions are very simple. Learning the rudiments of HTML is no more complicated than learning to use and interpret the main symbols and abbreviations used by proofreaders. For these reasons in the

HTML was born and developed together with the World Wide Web. In its first version, the language made no provision for representing complex textual and editorial phenomena. As a result, its specifications have undergone several revisions and extensions, which have given rise to three official versions, as well as a series of extensions introduced by various commercial Web browser manufacturers.

Official revisions are currently managed by the W3 Consortium, founded by Tim Berners-Lee, an organization that brings together representatives of more than forty research bodies and companies interested in the development of information systems on the Internet. These successive refinements, responding to the requests of an increasingly vast and varied community of users, have progressively introduced elements dedicated to the formal control of the text.

However, while the official committees worked slowly on the revision of the standard, the explosion of the Internet phenomenon and the widespread demand for tools capable of making Web pages spectacular (rather than well structured as documents) led the browser manufacturers, and in particular Netscape and Microsoft, to introduce a number of extensions of their own to the language. The hope (in the case of Netscape, crowned by some success so far) was also to win a de facto monopoly in the market, since the extensions introduced by a given manufacturer were, at least at first, recognized and interpreted correctly only by its own browser.

The most recent official version of the language released by the W3C, called HTML 3.2, has restored some order by incorporating many of the most interesting innovations. Nonetheless, the evolution of document encoding systems on the Web is one of the most stimulating issues in the debate on the future of the network: we will deal with it in the section dedicated to the new frontiers of the Internet.

Uniform Resource Locator


A particular aspect of how the World Wide Web works is its document addressing technique, that is, the way in which a particular document can be identified among all those published on the network.

The solution adopted to meet this important need is called the Uniform Resource Locator (URL). The URL of a document essentially corresponds to its network address; every information resource (computer or file) present on the Internet is located and reached by our client programs through its URL. Before the introduction of this technique, there was no way to formally indicate where a certain information resource was on the Internet.

A URL has a very simple syntax, which in its normal form consists of three parts:

servertype://hostname/filename

The first part indicates with a keyword the type of server you are pointing to (it can be a gopher server, an HTTP server, an FTP server, and so on). The second indicates the symbolic name of the host on which the addressed file is located; the numerical address may be given instead of the name. The third indicates the name and location ('path') of the individual document or file referred to. The characters '://' must be inserted between the first and second parts. An example of a URL is the following:

http://www.liberliber.it/index.htm

The keyword 'http' indicates that we are referring to a Web server, located on the computer called 'www.liberliber.it', from which we are requesting the HTML file named 'index.htm'. By changing the keyword, it is also possible to refer to other types of Internet services:

  • 'ftp' for FTP servers

  • 'gopher' for gopher servers

  • 'telnet' for telnet servers

  • 'wais' for WAIS servers.

It should be noted that this syntax can be used both in the hypertext instructions of HTML files and with the commands that individual clients, each in its own way, make available for reaching a particular server or document. It is therefore worthwhile for even the ordinary Internet user to learn to use it correctly.
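The three-part structure described above can be checked with a few lines of Python, using the urlsplit function of the standard urllib.parse module. The helper name describe is our own, introduced only for this sketch.

```python
from urllib.parse import urlsplit


def describe(url: str) -> tuple:
    """Split a URL into server type, host name and file path."""
    parts = urlsplit(url)
    # scheme = type of server, hostname = host name, path = location of the file
    return (parts.scheme, parts.hostname, parts.path)


server_type, host, path = describe("http://www.liberliber.it/index.htm")
print(server_type)  # http
print(host)         # www.liberliber.it
print(path)         # /index.htm
```

The same function works unchanged for addresses that begin with 'ftp' or 'gopher', since only the keyword before '://' differs.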

Some programs for using the World Wide Web


The main tool for browsing the pages of the World Wide Web is, as we have mentioned several times, a 'browser': a program capable of requesting the page we want from the remote server that hosts it, receiving it and displaying it correctly (text, images, hyperlinks, backgrounds ... all laid out by following the instructions provided, in the form of HTML markers, by whoever created that particular page). The first Web browsers (like Mosaic) were born in university research labs. The explosion of the Internet phenomenon, largely linked to the World Wide Web, has multiplied the initiatives to develop new programs or improve existing ones, and above all has revealed the commercial potential of this field. This has attracted the attention of many software companies, and has led many of the university pioneers to found new ones (the most sensational case is that of the often cited Netscape Corporation). One of the strategic battles for the future of information technology and telematics is currently being fought in this sector.

Consequently, the programs available today for accessing the World Wide Web are quite numerous, some free, others sold under various commercial arrangements. As with the other network services seen so far, there are browsers for all the most popular platforms and operating systems.

Using these programs is, in principle, quite easy: a simple click of the mouse is enough to connect to a computer on the other side of the world. Furthermore, as we have already seen, a good Web client can access FTP and gopher servers in a completely transparent way, show newsgroup messages and manage e-mail; and, as we will see, the most recent versions can also automatically receive information 'channels' through the information push mechanism. In short, a Web client can integrate the main functions made available by the Internet. Remember that you can use a graphical browser only if you have a direct connection to the network, or a connection using the PPP or SLIP protocols. Once connected to the network, just launch the client on your computer and start browsing the millions of Web servers scattered across the Internet.

In the following pages we will review some of the most popular browsers, showing their main features. Our choice was guided by the technological level and the popularity of the programs at the time this manual was written. But remember that any attempt at systematization in this area is futile. Any advice on which client to choose, any detailed illustration of one of them, risks becoming obsolete very quickly. The only advice we feel we can give without fear is this: the best way to learn how to use all the tools of the Internet world is to use them, driven by a good dose of curiosity. Or, in the words of Galileo, "trying and trying again".

Character interface programs


Obviously, to use an information system such as the World Wide Web and to exploit its hypertext and multimedia features fully, a client with a graphical interface is needed. But surfing the World Wide Web, albeit in an extremely limited way, is also possible for those who have neither a direct connection nor SLIP/PPP. There are in fact browsers based on a character interface that can be used through a simple terminal connection to an Internet host. Any communication program that emulates a VT100 or VT102 terminal, two very common emulations, together with a modem, even a not particularly fast one, is sufficient to connect to the host.

Of course, a version of the client must be installed on the host you are connecting to. To find out if your network access provider has one, ask the system administrator directly, or customer support in the case of a commercial provider.

If a local client is not available, you can use one of the hosts that allow free access to a Web client through a simple telnet connection. You can, for example, connect via telnet to the address telnet.w3.org. There you will find the character-based WWW client developed at the CERN laboratories in Geneva. But the best character-based client for navigating the World Wide Web is probably Lynx. The program was written by three University of Kansas programmers, Michael Grobe, Lou Montulli, and Charles Rezac, and there are versions of it for many platforms, including one for DOS. Let's look at it a little more closely. In the following figure you can see a screenshot of Lynx in a Unix environment, by far the most widely used version.






