
Markup language


A specialized markup language based on SGML is used to write the electronic version of the Oxford English Dictionary. This enables sophisticated queries to be performed, as well as easy translation into HTML.

A markup language is a kind of text encoding that represents text as well as details about the structure and appearance of the text. A widely used modern example is HTML, discussed below.

Markup languages are used, for example, by the publishing industry to communicate printed works among authors, editors, and printers.

1 History

The term "markup" is derived from the traditional publishing practice of "marking up" a manuscript, that is, adding printer's instructions in the margins of a paper manuscript. For centuries, this task was done by specialists known as "markup men" who marked up text to indicate what typeface, font, style, and size should be applied to each part, and then handed off the manuscript to someone else for the tedious task of typesetting by hand. Even as late as 1980, such "markup men" were apparently still in use, for a very lucky one named Chiarella had his conviction for insider trading reversed by the U.S. Supreme Court that year.

The idea of "markup languages" was apparently first presented by William W. Tunnicliffe at a conference in 1967. However, IBM researcher Charles Goldfarb is more commonly seen today as the "father" of markup languages (because his research transformed the idea into an actual working product).

Some early examples of markup languages available outside the publishing industry can be found in typesetting tools on Unix systems such as troff and nroff. In these systems, formatting commands were inserted into the document text so that typesetting software could format the text according to the editor's specifications. Getting a document to print correctly was an iterative process of trial and error. The availability of WYSIWYG ("what you see is what you get") publishing software supplanted much use of these languages among casual users, though serious publishing work still uses markup to specify the non-visual structure of texts.

Another major publishing standard was TeX, created and continuously refined by Donald Knuth in the 1970s and 80s. TeX concentrated on detailed layout of text and font descriptions in order to typeset mathematical books in professional quality. This required Knuth to spend considerable time investigating the art of typesetting. However, TeX requires considerable skill from the user, so that it is mainly used in academia.

The first language to make a clear and clean distinction between structure and presentation was Scribe, developed by Brian Reid and described in his doctoral thesis in 1980. Scribe was revolutionary in a number of ways, not least in that it introduced the idea of styles separated from the marked-up document. Scribe influenced the development of GML (later SGML) and is a direct ancestor of HTML and LaTeX. LaTeX is a de facto standard in many scientific disciplines.

The idea that markup should be focused on the structural aspects of a document and leave the visual presentation of that structure to the interpreter led to the creation of SGML (Standard Generalized Markup Language), which specified a syntax for including the markup in documents, as well as another system (a so-called "metalanguage") for separately describing what the markup meant. This allowed authors to create and use any markup they wished, selecting tags that made the most sense to them. Examples of such markup languages based on the SGML system are TEI and DocBook. SGML was promulgated as an International Standard in 1986.
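The separation described above, where markup records structure and the interpreter decides the presentation, can be illustrated with a small sketch. It assumes Python's standard `xml.etree.ElementTree` module and uses XML-style syntax rather than full SGML; the tag names `article`, `title` and `para` and both rendering functions are hypothetical, invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the tags describe structure only, not appearance.
DOC = "<article><title>Markup</title><para>Tags describe structure.</para></article>"


def render_plain(root):
    """One interpreter: plain text with an upper-case, underlined title."""
    lines = []
    for child in root:
        text = child.text or ""
        if child.tag == "title":
            lines.append(text.upper())
            lines.append("=" * len(text))
        elif child.tag == "para":
            lines.append(text)
    return "\n".join(lines)


def render_html(root):
    """Another interpreter: the same structure mapped onto HTML tags.

    No escaping is done; the sample text contains no special characters.
    """
    mapping = {"title": "h1", "para": "p"}
    parts = []
    for child in root:
        tag = mapping[child.tag]
        parts.append(f"<{tag}>{child.text or ''}</{tag}>")
    return "".join(parts)


root = ET.fromstring(DOC)
print(render_plain(root))
print(render_html(root))
```

The document itself never changes; only the interpreter chosen at output time determines how a title or paragraph looks.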

However, SGML was generally found to be cumbersome, a side effect of attempting to do too much and be too flexible. For example, SGML made end tags optional in certain contexts, because it was thought that markup would be done by overworked support staff who would appreciate saving a few keystrokes here and there.

By 1991, it appeared that SGML would be limited to niche uses while WYSIWYG tools (storing documents in proprietary binary formats) would take over the vast majority of document processing.

The situation changed dramatically when Sir Tim Berners-Lee used some of the SGML syntax, without the metalanguage, to create HTML (Hypertext Markup Language). In HTML the markup consists of a set of "known" tags that handle common formatting tasks. However, the language was originally created to mark up simple scientific papers and had to be greatly expanded to offer the rich content the web has today. For this reason the additions often follow no logical design, although recent efforts have attempted to address this. HTML is likely the most used document format in the world today.

Another, newer markup language that is growing in importance is XML (Extensible Markup Language). Unlike HTML, which uses a set of "known" tags, XML allows you to create any tag you wish (thus it is extensible) and then describe those tags in a metalanguage known as the DTD (Document Type Definition). However, DTDs were difficult to write because their syntax differs from that of XML, so they have been supplemented by XML schemas, a metalanguage defined in terms of XML itself.
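As a rough illustration of this extensibility, the sketch below parses a document whose tags were made up on the spot. It assumes Python's standard `xml.etree.ElementTree` module; the vocabulary (`recipe`, `ingredient`, `qty`) is hypothetical. Note that this parser only checks that the document is well-formed; it does not validate it against a DTD or schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: "recipe", "ingredient" and "qty" are invented here.
DOC = """<recipe name="toast">
  <ingredient qty="2">bread slices</ingredient>
  <ingredient qty="1">butter pat</ingredient>
</recipe>"""

# The parser needs no prior knowledge of these tag names.
root = ET.fromstring(DOC)

# Collect (quantity, text) pairs from every direct <ingredient> child.
ingredients = [(item.get("qty"), item.text) for item in root.findall("ingredient")]

print(root.tag, root.get("name"))
for qty, name in ingredients:
    print(qty, name)
```

What the tags mean is left entirely to the application reading the parsed tree, which is the gap a DTD or schema is meant to fill.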

XML is similar in concept to SGML; in fact, XML is a simplified subset of SGML. It is not, strictly speaking, a superset of HTML, since HTML permits constructs such as omitted end tags that XML forbids. The main purpose of XML (as opposed to using SGML) is to keep the system simpler by focusing on a particular problem: documents on the Internet. By doing so, its designers hoped to avoid the feature creep that complicated SGML. XHTML (Extensible Hypertext Markup Language) is a more rigorous reformulation of HTML that is in fact XML; like any XML, it requires documents to be "well-formed", but it uses mostly the familiar HTML tags. The main difference between HTML and XHTML from the standpoint of coding is that all tags must be closed, including so-called "empty" tags such as <br>, which, not being a "container tag", must be closed in every instance, like this: <br />.
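The well-formedness requirement can be checked with any XML parser. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module; the helper `is_well_formed` and the two sample fragments are hypothetical, written only for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# HTML style: the empty element is never closed.
html_style = "<p>line one<br>line two</p>"
# XHTML style: the empty element closes itself.
xhtml_style = "<p>line one<br />line two</p>"

print(is_well_formed(html_style))   # False: </p> arrives while <br> is still open
print(is_well_formed(xhtml_style))  # True
```

An HTML parser would accept both fragments, whereas an XML parser rejects the first outright.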


