| title | chunk | source | category | tags | date_saved | instance |
|---|---|---|---|---|---|---|
| Lightweight markup language | 1/2 | https://en.wikipedia.org/wiki/Lightweight_markup_language | reference | science, encyclopedia | 2026-05-05T08:03:31.907383+00:00 | kb-cron |
A lightweight markup language (LML), also termed a simple or humane markup language, is a markup language with simple, unobtrusive syntax. It is designed to be easy to write using any generic text editor and easy to read in its raw form. It is used in applications where people may need to read the raw document as well as the final rendered output. For instance, a person downloading a software library might prefer to read the documentation in a text editor rather than a web browser. Another use for such languages is data entry in web-based publishing, as in blogs and wikis, where the input interface is a simple text box. The server software then converts the input into a common document markup language like HTML.
== History ==
Lightweight markup languages were originally used on text-only displays which could not display characters in italics or bold, so informal methods to convey this information had to be developed. This formatting choice was naturally carried forth to plain-text email communications. Console browsers may also resort to similar display conventions. In 1986, the international standard SGML provided facilities to define and parse lightweight markup languages using grammars and tag implication. XML, published by the W3C in 1998, is a profile of SGML that omits these facilities. However, no SGML document type definition (DTD) for any of the languages listed below is known.
== Types ==
Lightweight markup languages can be categorized by their tag types. Like HTML (<b>bold</b>), some languages use named elements that share a common format for start and end tags (e.g., BBCode [b]bold[/b]), whereas proper lightweight markup languages are restricted to ASCII-only punctuation marks and other non-letter symbols for tags. Some mix both styles (e.g., Textile bq.) or allow embedded HTML (e.g., Markdown), possibly extended with custom elements (e.g., MediaWiki <ref>'''source'''</ref>). Most languages distinguish between markup for lines or blocks and markup for shorter spans of text, but some only support inline markup.
Some markup languages are tailored for a specific purpose, such as documenting computer code (e.g., POD, reST, RD) or being converted to a certain output format (usually HTML or LaTeX) and nothing else, while others are more general in application. Languages also differ in whether they are oriented toward textual presentation or data serialization.
Presentation-oriented languages include AsciiDoc, atx, BBCode, Creole, Crossmark, Djot, Epytext, Haml, JsonML, MakeDoc, Markdown, Org-mode, POD (Perl), reST (Python), RD (Ruby), Setext, SiSU, SPIP, Xupl, Texy!, Textile, txt2tags, UDO and Wikitext.
Data-serialization-oriented languages include Curl (homoiconic, but also reads JSON; every object serializes), JSON, and YAML.
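The named-element style can be illustrated with a minimal Python sketch that maps BBCode tags such as [b]bold[/b] onto HTML elements of the same shape. The function name `bbcode_to_html` and the four-tag whitelist are assumptions made for illustration, not part of any BBCode implementation, and the sketch does not check that tags are balanced.

```python
import html
import re

# Hypothetical whitelist: BBCode tag name -> HTML element name.
BBCODE_TO_HTML = {"b": "b", "i": "i", "u": "u", "s": "s"}

# Matches an opening or closing tag such as [b] or [/b].
_TAG = re.compile(r"\[(/?)([a-z]+)\]")


def bbcode_to_html(text: str) -> str:
    """Translate whitelisted [tag] and [/tag] markers into <tag> and </tag>."""
    # Escape raw HTML first so only the tags emitted below are live markup.
    escaped = html.escape(text, quote=False)

    def swap(match: re.Match) -> str:
        slash, name = match.group(1), match.group(2)
        element = BBCODE_TO_HTML.get(name)
        if element is None:
            return match.group(0)  # unknown tag: leave the text untouched
        return f"<{slash}{element}>"

    return _TAG.sub(swap, escaped)


print(bbcode_to_html("[b]bold[/b] and [i]italic[/i]"))
```

Because start and end tags share one format, a single pattern covers both; punctuation-based languages generally need one rule per delimiter instead.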
== Comparison of language features ==
Markdown's own syntax does not support class attributes or id attributes; however, since Markdown supports the inclusion of native HTML code, these features can be implemented using direct HTML. (Some extensions may support these features.)
txt2tags' own syntax does not support class attributes or id attributes; however, since txt2tags supports inclusion of native HTML code in tagged areas, these features can be implemented using direct HTML when saving to an HTML target.
DokuWiki does not support HTML import natively, but HTML to DokuWiki converters and importers exist and are mentioned in the official documentation.
DokuWiki does not support class or id attributes, but can be set up to support HTML code, which does support both features. HTML code support was built-in before release 2023-04-04. In later versions, HTML code support can be achieved through plug-ins, though it is discouraged.
== Comparison of implementation features ==
== Comparison of lightweight markup language syntax ==
=== Inline span syntax ===
Although usually documented as yielding italic and bold text, most lightweight markup processors output the semantic HTML elements em and strong instead. Monospaced text may result in either semantic code or presentational tt elements. Few languages make a distinction (e.g., Textile) or allow the user to configure the output easily (e.g., Texy!). LMLs sometimes differ in how they handle multi-word markup: some require the markup characters to replace the inter-word spaces (infix). Some languages require one character as prefix and suffix, while others need two or even three, or support several lengths with slightly different meanings, e.g., different levels of emphasis.
Gemtext does not have any inline formatting. Monospaced text (called preformatted text in the context of Gemtext) must have the opening and closing ``` on their own lines.
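The line-based fencing rule for Gemtext can be sketched in a few lines of Python. The function name `split_gemtext` and the `(kind, text)` output shape are assumptions made for illustration; the sketch only tracks the toggle between normal and preformatted mode and ignores every other Gemtext line type.

```python
def split_gemtext(source: str) -> list[tuple[str, str]]:
    """Return (kind, text) pairs, where kind is "text" or "pre"."""
    lines = []
    in_pre = False
    for line in source.splitlines():
        if line.startswith("```"):
            in_pre = not in_pre  # a fence line toggles preformatted mode
            continue             # the fence line itself carries no content
        lines.append(("pre" if in_pre else "text", line))
    return lines


sample = "hello\n```\nx = 1\n```\nbye"
for kind, text in split_gemtext(sample):
    print(kind, repr(text))
```

Since the fence must sit on its own line, the parser never has to look inside a line, which is why no inline scanning is needed.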
==== Emphasis syntax ====
In HTML, text is emphasized with the <em> and <strong> element types, whereas <i> and <b> traditionally mark up text to be italicized or bold-faced, respectively. Microsoft Word and Outlook, and accordingly other word processors and mail clients that strive for a similar user experience, support the basic convention of using asterisks for boldface and underscores for italic style. While Word removes the characters, Outlook retains them.
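As a rough illustration of the asterisk and underscore convention, the following Python sketch renders single-character delimiters into the semantic em and strong elements. The function name `render_emphasis` and the exact delimiter rules (no whitespace directly inside the markers, underscores only at word boundaries) are assumptions chosen for this example; real processors differ in these details.

```python
import html
import re

# Assumed rules: the span must start and end with a non-space character.
_STRONG = re.compile(r"\*(\S(?:.*?\S)?)\*")
# Underscores count only at word boundaries, so snake_case is left alone.
_EM = re.compile(r"\b_(\S(?:.*?\S)?)_\b")


def render_emphasis(text: str) -> str:
    """Render *strong* and _emphasis_ spans as semantic HTML."""
    escaped = html.escape(text, quote=False)  # neutralize raw HTML first
    escaped = _STRONG.sub(r"<strong>\1</strong>", escaped)
    return _EM.sub(r"<em>\1</em>", escaped)


print(render_emphasis("*bold* and _italic_"))
```

Requiring a non-space character next to each delimiter keeps expressions such as `2 * 3 * 4` from being treated as markup.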
==== Editorial syntax ====
In HTML, removed or deleted text and inserted text are marked up with the <del> and <ins> element types, respectively. However, the legacy element types <s> or <strike> and <u> are also still available for stricken and underlined spans of text.
AsciiDoc, ATX, Creole, MediaWiki, PmWiki, reST, Slack, Textile and WhatsApp do not support dedicated markup for underlining text. Textile does, however, support insertion via the +inserted+ syntax.
ATX, Creole, MediaWiki, PmWiki, reST and Setext do not support dedicated markup for striking through text.
DokuWiki supports HTML-like <del>stricken</del> syntax, even with embedded HTML disabled.
AsciiDoc supports stricken text through a built-in text span prefix: [.line-through]#stricken#.
==== Programming syntax ====
Quoted computer code is traditionally presented in typewriter-like fonts where each character occupies the same fixed width. HTML offers the semantic <code> and the deprecated, presentational <tt> element types for this task.
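A minimal Python sketch of inline code rendering follows, assuming Markdown-style single backticks as delimiters. The function name `render_code_spans` and its `element` parameter are hypothetical; the parameter only illustrates the choice between the semantic `<code>` element and the presentational `<tt>` element.

```python
import html
import re

# Assumed delimiter: a pair of single backticks around non-backtick text.
_CODE = re.compile(r"`([^`]+)`")


def render_code_spans(text: str, element: str = "code") -> str:
    """Wrap backtick-delimited spans in the chosen HTML element."""
    if element not in ("code", "tt"):
        raise ValueError("element must be 'code' or 'tt'")
    # Escaping first keeps characters such as < inside code spans literal.
    escaped = html.escape(text, quote=False)
    return _CODE.sub(rf"<{element}>\1</{element}>", escaped)


print(render_code_spans("use `a < b` here"))
print(render_code_spans("use `a < b` here", element="tt"))
```

Escaping before substitution matters here because quoted code frequently contains angle brackets and ampersands.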
MediaWiki and Gemtext do not provide lightweight markup for inline code spans.