Department of Engineering, University of Cambridge
HTML is the "mark-up language" used for writing WWW pages. It has evolved from a simple version 1 through versions 2 (1995) and 3.2 (Jan 1997) to version 4 (late 1997) then version 4.01. "xhtml 1.0" is like HTML 4.01 except that it's XML-based (which for our purposes means it's more rigorous).
This talk (which assumes some knowledge of HTML) deals with the reasons for adopting HTML4/xhtml and suggests ways to begin using it.
H1 tag, etc.
Some tags (e.g. <CENTER>) are being phased out anyway.
Most significantly, you can reliably use style-sheets. These make it easier to maintain stylistic uniformity in a document and can also be used to give the same 'look' to a whole site. But if you've never seen much point in using Word's styles, you may remain unconvinced. So first, something about styles in general.
Suppose all subsection headings in a Word document needed to be centred, bold 14pt. How do you do this? You could go through the document, finding all the subsection headings and changing them to be centred, bold 14pt. Alternatively, you could create a subsection heading style and set all the subsection headings to that style. The advantages are that
When an author produces a text they "mark it up" (indicate how the parts of the text should be treated). In the past this was done using a "mark-up language" (a set of embedded formatting characters added to the raw text). Programs like Word claim to be 'WYSIWYG' (What You See is What You Get) - you don't see all the embedded formatting characters in the Word file, you just see their effects. If you're producing a paper leaflet and you want a word to be emphasised you might drag the mouse over it and change it to blue text. What you'd see on the screen is much like the final printed version.
An alternative way of working is WYSIWYM (What You See is What You Mean). If the leaflet ends up on the web and is found by a blind person using voice synthesis software, the word's unlikely to be emphasised. In a WYSIWYM system authors would indicate that the word should be emphasised but would leave the details of how it's emphasised to the program that finally presents the file to the user - on-screen it might be blue, on a mobile device it might be flashing, with audio it might be loud.
The first way of working is called visual (or physical) markup.
The second way is called semantic (or logical) markup. HTML offers a
choice of visual markup options (e.g. <b>) and semantic
ones (like <strong>). In Word you can work both ways too, but
the bias is towards visual mark-up whereas with HTML the bias is
increasingly towards semantic mark-up.
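As a small illustration of the difference (the sentences are invented for this sketch), the two paragraphs below usually look the same in a graphical browser, but only the second one tells a speech or small-screen renderer that the word matters:

```html
<!-- Visual markup: asks for bold type and says nothing about why -->
<p>Switch the power off <b>before</b> opening the case.</p>

<!-- Semantic markup: marks the word as strongly emphasised and
     leaves the presentation to whatever displays or reads the page -->
<p>Switch the power off <strong>before</strong> opening the case.</p>
```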
If your software produces HTML for you, you're likely to be producing HTML 4 files anyway, but if you've been hand-editing files you may still be writing old-style HTML. Until recently, browsers didn't behave reliably with pages that used style-sheets. Even now many people use browsers that have the odd problem, but we're now reaching the stage where style-sheets are the best and easiest option in most situations.
If you want to use automated page checkers, they'll work much better if your pages are conformant, and you'll earn the right to use the logo that's at the foot of this page.
Some of the rules concerning the basic commands (the "tags") have changed or have been tightened up. Here are some commonly encountered problems:
- Upper-case tag and attribute names. <a href=... is needed, rather than <A HREF=...
- Missing close-tags, such as an omitted </LI> at the end of a listed item. </li> is now obligatory. Even <br> needs a matching close-tag, though <br></br> can be abbreviated to <br />
- Badly nested tags, such as <I><B>Bold and Italic Text!</I></B>. Now the nesting rules are more strictly enforced - you need to close the <b> tag before the <i> tag.
- Using a bare & to separate parameters in a URL. It should be written as &amp;

Before you can use styles effectively you need to know something about how an HTML page is put together. With HTML4 in particular it's important to be aware of the structure of a document. Like a program, a document is likely to have large units inside which there are smaller and smaller units. The top level should consist of an html element containing a head section followed by a body section.
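The top-level structure can be sketched as below. This is a minimal xhtml 1.0 Strict page, not a copy of any page on this site: the title, the text and the URL search.cgi are invented for illustration. It also follows the tightened rules (lower-case tags, every tag closed, tags closed in reverse order of opening, &amp; between URL parameters).

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <!-- information about the document -->
    <title>Example page</title>
  </head>
  <body>
    <!-- the material to be displayed -->
    <h1>Example page</h1>
    <p>Tags are lower case, and <b><i>nested tags</i></b> close in
       reverse order.<br />
       <a href="search.cgi?term=html&amp;page=2">A link with two parameters</a></p>
    <ul>
      <li>every list item is closed</li>
    </ul>
  </body>
</html>
```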
The head section contains information about the document (the title, etc) and the body contains the material to be displayed. This material
consists of blocks (paragraphs, for example) and lists. These blocks shouldn't overlap but they
can often contain other blocks (paragraphs can't contain other paragraphs, but the
<body> block can contain many paragraphs).
Every element of a document has "attributes" (properties). You need to read
the documentation to see what the properties of
any particular element are (to see the properties of the <p>
element for example, see the W3C specification).
Some of these properties (e.g. color) affect the
appearance in an obvious way. Some affect positioning or behaviour. Some
are more general. All of these properties are under the authors' control,
though the default values are usually ok. Each property has a name and
a value. Here are some examples -
- Appearance: color.
- Positioning: border-bottom-width, border-right-width, etc. Text can be made to wrap around a box by setting the box's float property to right (as with this document's table of contents) or left.
- Behaviour: there are properties, such as onClick and onMouseOver, that provide a way of making something happen when the user clicks on an element or moves the mouse pointer over it.
- General: class=style, id=name, etc. If the page is going to be 'dynamic' (e.g. if you want a particular object to change color at the click of a button) you need some way to refer to the object, which is where names come in handy.

In this document we'll look at appearance and positioning properties rather than behavioural ones.
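A rough sketch of how such properties look when attached to one element follows. The id and class values (contents, sidebox) and the colours are invented for illustration. The box floats to the right and the following paragraph wraps around it.

```html
<!-- id and class are general-purpose attributes; style sets
     appearance and positioning properties for this one element -->
<div id="contents" class="sidebox"
     style="float: right; color: navy;
            border-style: solid;
            border-bottom-width: thick; border-right-width: thick;">
  Table of contents
</div>
<p>The text of this paragraph wraps around the floated box, which
   sits at the right-hand edge of the page.</p>
```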
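Each element of a page has a list of associated properties. To change something you need to specify the element, and then list the properties and their new values. The format is the element's name followed, in curly brackets, by each property and its new value, with a colon after the property name and a semicolon after the value. The default background and right margin of the h3 element, for example, can be changed by a single rule of this form (note that HTML isn't fussy about line-breaks). There are 3 ways to use such lines:
- Put them in a separate file, and refer to that file from the head of each document that needs them.
- Put them in the head of the document, setting up styles in a <style> section. This document does that: you'll see that fonts for <p> are defined, but <li> is left as the default, which is why the font in this document keeps switching.
- Put them in the body, attached to a single piece of the document. For a fragment of text use span, or for a block (a 'division') use div.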
If there's a clash between these different methods (if for example a piece of text is set to be different colours) then the most local setting takes priority.
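The three approaches, and the way a clash is resolved, might look like this in one small page. It is only a sketch: the file name site.css, the colours and the margin sizes are assumptions made for the example, and site.css is assumed not to set paragraph colours itself.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Three ways to apply styles</title>
    <!-- 1. Rules kept in a separate file (site.css is a hypothetical name) -->
    <link rel="stylesheet" type="text/css" href="site.css" />
    <!-- 2. Rules in a style section in the head of this document -->
    <style type="text/css">
      h3 { background: yellow; margin-right: 20%; }
      p  { color: black; }
    </style>
  </head>
  <body>
    <h3>A subsection heading</h3>
    <p>This paragraph is black, as the style section asks.</p>
    <!-- 3. Styles attached to one element. The style attribute is more
         local than the rule for p in the head, so it takes priority -->
    <p style="color: blue;">This paragraph is blue, apart from
       <span style="color: red;">this red fragment</span>.</p>
    <div style="margin-left: 10%;">
      <p>A block that is indented as a whole.</p>
    </div>
  </body>
</html>
```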
You can start simply with a one-line style-sheet then add features when you're ready.
If you look at the source of this page you'll see how the headers, paragraph indentation, etc were set up. Even without using stylesheets, HTML4 lets you access several new features.
This source code shows some of them.

Ordered lists numbered with roman numerals, the second list starting at 3:

<ol type="i">
<li> test</li>
<li> test</li>
</ol>
Some non-itemed text ...
<ol type="i" start="3">
<li> test</li>
<li> test</li>
</ol>

Ordered lists whose numbering is set with list-style-type:

<ol style="list-style-type: lower-roman;" >
<li>roman numerals</li>
<li>roman numerals</li>
</ol>
<ol style="list-style-type: lower-alpha;" >
<li>letters</li>
<li>letters</li>
</ol>

Unordered lists with a chosen or a graphical bulletmark:

<ul style="list-style-type: circle;" >
<li>circular bulletmark</li>
</ul>
<ul style="list-style-image: url(tpl.gif);">
<li>graphical bulletmark</li>
</ul>

A table whose borders have differing widths:

<table style="border-right-width: thick;
border-bottom-width: thick;
border-left-width: thin;
border-top-width: thin;
border-style:ridge;" >
<tr><td>flashy</td><td>borders</td></tr>
</table>
Material can be centred with
<div align="center"> ... </div>
or <div style="text-align: center;"> ... </div>
Styles can also be collected into named classes. For example, a class defined by
.rednarrow {color: red ; margin-left:35%; margin-right:35%; }
gives red text with wide margins on both sides. Headers or various other objects could also be put into this class.
A class defined by
p.bluewide {color: blue ; margin-left:5%; margin-right:5%; }
gives blue text with narrow margins. The p. at the start of the definition means that only paragraphs can be put into this class.
Some properties are quite specialised - first-line and first-letter, for instance, affect only certain characters of an element.
Old browsers won't understand the style-sheet commands. By enclosing the commands in an HTML comment you can ensure that old browsers will ignore them (treating them as comments) and new browsers will cope ok.
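The sketch below pulls these points together under some assumptions: the class names bigstart and capsline, and all the sample text, are invented for the example. The comment markers inside the style section are the usual HTML 4.01 way of hiding the rules from browsers that don't know about style-sheets. An HTML 4.01 DOCTYPE is used here because a browser treating the page as XML could take the comment literally and ignore the rules.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <title>Classes and specialised properties</title>
    <style type="text/css">
    <!--
      /* any element can join this class */
      .rednarrow { color: red;  margin-left: 35%; margin-right: 35%; }
      /* only paragraphs can join this one */
      p.bluewide { color: blue; margin-left: 5%;  margin-right: 5%; }
      /* hypothetical classes that affect only certain characters */
      h3.bigstart:first-letter { font-size: 200%; }
      h3.capsline:first-line   { font-variant: small-caps; }
    -->
    </style>
  </head>
  <body>
    <p class="rednarrow">A red paragraph with wide margins.</p>
    <h3 class="rednarrow">Headers can join this class too</h3>
    <p class="bluewide">A blue paragraph with narrow margins.</p>
    <h3 class="bigstart">The first letter of this header is enlarged</h3>
    <h3 class="capsline">The first line of this header is in small capitals</h3>
  </body>
</html>
```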