A markup language is a set of tags and/or a set of rules for creating tags that can be embedded in digital text to provide additional information about the text in order to facilitate automated processing of it, including editing and formatting for display or printing.
Markup languages are fundamental to displaying documents in web browsers, and they are also employed by word processing programs and by many other programs that display formatted text. However, such languages and their tags are typically hidden from the user.
By far the most familiar markup language is HTML (hypertext markup language), which is used to allow documents to be displayed in web browsers. A newer and much more flexible (but also more difficult to learn and use) approach is to use languages based on XML (extensible markup language), which is a standard for creating languages that describe the content of documents rather than how they should be displayed.
Both HTML and XML are descendants of SGML (standard generalized markup language), which was published as a standard by the International Organization for Standardization (ISO) in 1986 to facilitate the sharing of machine-readable documents in large projects in government agencies, in the aerospace industry and in the legal field. SGML has also been used extensively in the printing and publishing industries. However, its complexity has prevented its widespread use for small-scale and general-purpose applications.
A tag is a special string (i.e., sequence of characters). In HTML, XML and related languages, every tag begins with a left angle bracket (<), contains one or more alphanumeric characters, and ends with a right angle bracket (>). These brackets indicate to the browser or other program that renders the document (i.e., converts it to its final form to be viewed by users) that they, along with the enclosed characters, are instructions for the computer rather than ordinary text and that they are not to be visible in the rendered document.
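The distinction between tags and ordinary text can be sketched in a few lines of Python. This is only an illustration, assuming the standard library's html.parser module as a stand-in for a real renderer; the names TagSplitter and split_markup are invented for this example.

```python
from html.parser import HTMLParser


class TagSplitter(HTMLParser):
    """Collects tag names and visible text separately (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.tags = []  # tag names in order; end tags are prefixed with "/"
        self.text = []  # character data that a renderer would display

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_endtag(self, tag):
        self.tags.append("/" + tag)

    def handle_data(self, data):
        self.text.append(data)


def split_markup(source):
    """Return (tags, visible_text) for a fragment of HTML-like markup."""
    parser = TagSplitter()
    parser.feed(source)
    parser.close()
    return parser.tags, "".join(parser.text)


tags, visible = split_markup("<p>Markup is <b>hidden</b> from readers.</p>")
print(tags)     # ['p', 'b', '/b', '/p']
print(visible)  # Markup is hidden from readers.
```

The tags are consumed as instructions, and only the text between them would reach the reader.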
Among the aspects of a document's display that tags are used to indicate are its layout (including headings, paragraphs and margins), the characteristics of the characters in the text (such as typeface, size, style, and whether they are subscripts or superscripts), the positioning of images, and the locations of (and other information about) hyperlinks (i.e., links to other documents or to other locations in the same document).
Most tags are designed to be used in pairs (consisting of a start tag and an end tag) and to enclose text within the pair. An example is the pair of HTML tags that is used to indicate bold text: <b>bold text</b>
Some tags are designed to be used individually because they do not enclose any content. An example is
The tag pair
A major advantage of using markup languages that can describe content is that it becomes practical to automatically manipulate the content. For example, a tag pair such as
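As a rough illustration of such automated manipulation, the Python sketch below assumes a hypothetical XML vocabulary with catalog, item, name and price tags. These tag names and the function total_price are invented for the example and do not come from any standard. Because the tags say what the data is rather than how it looks, a program can pick out the prices and total them.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical content-describing markup; the tag names are assumptions.
CATALOG = """<catalog>
  <item><name>Notebook</name><price>4.50</price></item>
  <item><name>Pen</name><price>1.25</price></item>
</catalog>"""


def total_price(xml_text):
    """Sum the text of every <price> element in the document."""
    root = ET.fromstring(xml_text)
    return sum((Decimal(p.text) for p in root.iter("price")), Decimal("0"))


print(total_price(CATALOG))  # 5.75
```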
Markup is also used to indicate special characters to display or print, including those that are not available on a standard keyboard. An example is the copyright symbol, whose markup in HTML is &copy;
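A minimal Python sketch of how such markup for special characters is resolved, assuming HTML-style character references (the named form &copy; and the numeric form &#169; both denote the copyright symbol). The helper names decode_references and encode_non_ascii are invented for this example; the underlying calls are from the Python standard library.

```python
import html


def decode_references(text):
    """Replace character references such as &copy; with the characters they name."""
    return html.unescape(text)


def encode_non_ascii(text):
    """Replace every non-ASCII character with a numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(decode_references("&copy; 2006"))   # © 2006
print(decode_references("&#169; 2006"))   # © 2006
print(encode_non_ascii("\u00a9 2006"))    # &#169; 2006
```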
The use of markup languages is becoming increasingly common, and numerous XML markup languages have been developed for specific types of applications. Although most describe text, the great versatility of XML also allows a greater range of applications. For example, SVG (scalable vector graphics) allows complex two-dimensional images to be described completely by text. This makes it very easy to manipulate them, including increasing size without loss of quality (in sharp contrast to conventional bitmap images).
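Because an SVG image is just text, resizing it can be as simple as editing two attributes. The Python sketch below is one way to show this, assuming a small hand-written SVG document with a viewBox attribute; the function name scale_svg and the sample drawing are invented for the example.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# A circle described entirely by text.
SVG_SOURCE = (
    '<svg xmlns="http://www.w3.org/2000/svg" '
    'width="100" height="100" viewBox="0 0 100 100">'
    '<circle cx="50" cy="50" r="40" fill="none" stroke="black"/>'
    '</svg>'
)


def scale_svg(svg_text, factor):
    """Return the SVG text with its display width and height multiplied by factor."""
    ET.register_namespace("", SVG_NS)  # keep SVG as the default namespace on output
    root = ET.fromstring(svg_text)
    for attr in ("width", "height"):
        root.set(attr, f"{float(root.get(attr)) * factor:g}")
    return ET.tostring(root, encoding="unicode")


print(scale_svg(SVG_SOURCE, 3))
```

Only the width and height change. The viewBox and the circle's geometry are untouched, so a viewer redraws the same shape at the larger size instead of stretching pixels.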
XHTML (extensible HTML) is a reformulation of HTML in order to make it an XML language. It was developed as the successor to HTML and as a transitional step towards making XML languages the standard for web pages in order to simplify browser design and improve the ability to find and manipulate data. However, it has not caught on as fast as had been hoped, and now there is talk of developing new versions of HTML.
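One practical consequence of XHTML being an XML language is that a generic XML parser can read a page and locate data in it. The Python sketch below illustrates this under stated assumptions: the page content and file names are made up, and extract_links is a hypothetical helper written for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# A small well-formed XHTML page; the linked file names are hypothetical.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body>
    <p>See <a href="first.html">the first page</a><br />
    and <a href="second.html">the second page</a>.</p>
  </body>
</html>"""


def extract_links(xhtml_text):
    """Return (href, link text) pairs for every <a> element in the page."""
    root = ET.fromstring(xhtml_text)
    return [(a.get("href"), a.text) for a in root.iter("{%s}a" % XHTML_NS)]


print(extract_links(PAGE))
# [('first.html', 'the first page'), ('second.html', 'the second page')]
```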
The term markup is derived from the traditional publishing practice of marking up manuscripts by writing instructions for typefaces, fonts, sizes, styles, etc. for each section in the margins for the typesetters to use when manually setting the lead type. The specialists who did the markup were known as markup men.
Created December 13, 2006.