CML Reference Guide

Chapter 4.16:  Text Filters


$t2hbr(stuff)
Turns plain text stuff (which may contain newlines) into HTML.  It turns each newline into a <BR>.  It also turns each of the special characters <, ", and > into their HTML special codes (unless escaped by a "\").  Example:
  " $t2hbr( $shell(cat mytext) )
displays the text of an ordinary file mytext as HTML.
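
The transformation described above can be modeled roughly in Python. This is only an illustrative sketch, not the Caucus implementation: it assumes the "HTML special codes" are the standard entities &lt;, &gt;, and &quot;, and it does not model the backslash escape.

```python
def t2hbr(stuff: str) -> str:
    """Rough model of $t2hbr(): escape <, ", > and turn newlines into <BR>."""
    # Assumption: the special codes are the standard HTML entities.
    # The backslash escape mentioned in the text is NOT modeled here.
    codes = {"<": "&lt;", ">": "&gt;", '"': "&quot;"}
    escaped = "".join(codes.get(ch, ch) for ch in stuff)
    # Each newline is replaced by a <BR> tag.
    return escaped.replace("\n", "<BR>")


print(t2hbr('a < b\nsay "hi"'))  # a &lt; b<BR>say &quot;hi&quot;
```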

$cleanhtml(prohibit text)
"Clean HTML" filter.  Filters the HTML fragment in text, according to the rules in the string named prohibit.  Provides a way to filter out certain HTML tags that you may not wish to be displayed in a response, such as applets, JavaScript, or even annoying tags such as <BLINK>.

Here are the sample contents of a prohibit string:

    applet,prohibit,all   script,allow,all   blink,prohibit,tag

This means that everything between <APPLET> and </APPLET> is ignored, that the <SCRIPT> tag is allowed, and that the <BLINK> tag (but only the tag, not the text that follows it) is ignored.  Normally if something is allowed it does not need to be in the list, but advanced uses of this feature can support lists of tags that can be individually allowed or prohibited at run time.
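
To make the shape of a prohibit string concrete, here is a small Python sketch that splits the sample above into its entries. The function name parse_prohibit and the reading of each entry as a "tag,action,scope" triplet are assumptions drawn from the sample, not a description of how Caucus parses the string.

```python
def parse_prohibit(prohibit: str) -> dict:
    """Split a prohibit string into {tag: (action, scope)} entries."""
    rules = {}
    # Entries are separated by whitespace; fields within an entry by commas.
    for triplet in prohibit.split():
        tag, action, scope = triplet.split(",")
        rules[tag.lower()] = (action, scope)
    return rules


rules = parse_prohibit("applet,prohibit,all   script,allow,all   blink,prohibit,tag")
print(rules["blink"])  # ('prohibit', 'tag')
```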

$cleanhtml() includes all of the safety features of $safehtml(), such as automatic tag closing and mismatched quote correction.

$safehtml(prop stuff)
"Safe HTML" filter.  Obsolete form of $cleanhtml().  Filters the HTML fragment in stuff, making it "safe" to include in an existing HTML page.  Specifically, it removes the tags <HTML>, </HTML>, <HEAD>, </HEAD>, <BODY>, and </BODY>.  It "closes" any open tags (such as <B>) that don't have a matching closing tag (such as </B>).  It looks for mismatched quotes inside a tag, and adds an extra quote if necessary.  (For example, <A HREF="junk> becomes <A HREF="junk">.)

Prop is a number that controls certain properties of $safehtml().  It is the sum of a set of bitmasks (powers of 2); each bit controls a particular property.  The properties are:

    1 allow <FORM>s.  Otherwise <FORM> tags are removed, like <BODY>.
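
Because Prop is a sum of bitmasks, a single property is tested with a bitwise AND.  The Python sketch below illustrates the idea; the names ALLOW_FORMS and allows_forms are hypothetical, and only bit 1 is documented here.

```python
ALLOW_FORMS = 1  # the only bit documented above


def allows_forms(prop: int) -> bool:
    """True if the 'allow <FORM>s' bit is set in prop."""
    return bool(prop & ALLOW_FORMS)


print(allows_forms(0))  # False
print(allows_forms(1))  # True
```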

$rhtml(stuff)
Obsolete form of $safehtml(), without the Prop argument.  $rhtml(stuff) is equivalent to $safehtml(0 stuff).

$t2html(stuff)
Attempts an "intelligent" filtering of plain text stuff into HTML.  Blank lines become <P>'s.  Parses and translates URLs into anchored links with the same names.  (See $t2url().)

$t2url(stuff)
Translates URLs in stuff into anchored links (that pop up a new window) with the same names.  Both this function and $t2html() translate URLs that begin with any of the schemes http:/, gopher:/, telnet:/, ftp:/, or mailto:.

$wrap2html()
A more intelligent filtering of plain text into HTML than $t2html().  Acts as much as possible like a typical word-processor.  Each single "hard" RETURN in the original text translates into a <BR>; multiple RETURNs become sequences of "&nbsp;<P>".  Groups of N spaces become N-1 "&nbsp;"s plus a regular space.  A tab is treated as a group of 5 spaces.  Parses and translates URLs into anchored links.
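
The space-handling rule can be sketched in Python as follows.  This models only the space and tab rule, not the RETURN or URL handling; the function name pad_spaces is hypothetical, and placing the "&nbsp;"s before the regular space is an assumption.

```python
import re


def pad_spaces(text: str) -> str:
    """Replace each run of N >= 2 spaces with N-1 '&nbsp;' plus one space."""
    def repl(match):
        n = len(match.group(0))
        return "&nbsp;" * (n - 1) + " "

    # A tab counts as a group of 5 spaces, per the description above.
    expanded = text.replace("\t", " " * 5)
    return re.sub(r" {2,}", repl, expanded)


print(pad_spaces("a   b"))  # a&nbsp;&nbsp; b
```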

Special note: The three URL-translating functions above ($t2html(), $t2url(), and $wrap2html()) also recognize and translate special "caucus" URLs of the form "http:/caucus...", into a reference to a particular Caucus CML page on the current host (and with the current swebs subserver).  For example, "http:/caucus" becomes a reference to the Caucus Center page, i.e. center.cml, and "http:/caucus/conf_name" becomes a reference to confhome.cml for conference conf_name.  This is one of the very few instances in which the CML interpreter assumes knowledge of the names and arguments of the actual CML files.  (Normally this would be a bad idea, but in this case the feature is so powerful and useful as to allow the exception.)

$t2amp(stuff)
Translates all "&"s in stuff into "&amp;".  Useful to "pre-escape" HTML code that is going to be "unescaped" when displayed by a browser.  (This pre-escaping is essential when using Caucus to edit a response containing HTML code.  Without it, any escaped HTML special sequences like "&gt;" would lose their meaning after one edit.)

$t2esc(stuff)
Translates all instances of "&", "<", and ">" in stuff into their HTML code equivalents (&amp; &lt; and &gt;).  Useful to "pre-escape" HTML code that is going to be "unescaped" when displayed by a browser. 
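
The reason pre-escaping matters can be seen in a short Python sketch.  The functions below only model the replacements described for $t2amp() and $t2esc(); html.unescape stands in for the browser's "unescaping" step.

```python
import html


def t2amp(stuff: str) -> str:
    """Model of $t2amp(): escape only ampersands."""
    return stuff.replace("&", "&amp;")


def t2esc(stuff: str) -> str:
    """Model of $t2esc(): escape &, <, and >."""
    # "&" is replaced first so the "&" in "&lt;" / "&gt;" is not escaped twice.
    return stuff.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


stored = "a &gt; b"            # response text containing an escaped ">"
pre = t2amp(stored)            # "a &amp;gt; b"
assert html.unescape(pre) == stored  # one round of unescaping restores it
print(t2esc("<b>&"))           # &lt;b&gt;&amp;
```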

$xmlesc(stuff)
Same as $t2esc, but also escapes ASCII characters with values > (octal) 176 into their entity equivalents.  E.g. octal 177 = hex 7F, which becomes "&#x7f;".  As the name suggests, this encodes character data so that it may be used in XML.

$escquote(text)
Translates all double-quotes in text to the HTML special sequence "&quot;".  This is primarily useful for placing text (that contains double-quotes) inside a double-quote-delimited field inside an HTML <INPUT> tag.

$escsingle(text)
Translates all single quotes in text to the sequence "\'" (backslash single-quote).  Also translates all newlines to the sequence "\n" and all returns to the sequence "\r".  This is primarily useful for placing text (that contains single quotes) inside a single-quote delimited string -- a common need, especially inside JavaScript inside HTML inside CML.
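
A minimal Python model of the three replacements, for illustration only.  As in the description above, backslashes already present in text are left alone.

```python
def escsingle(text: str) -> str:
    """Model of $escsingle(): escape single quotes, newlines, and returns."""
    return (text.replace("'", "\\'")    # '   becomes  \'
                .replace("\n", "\\n")   # newline becomes the two characters \n
                .replace("\r", "\\r"))  # return  becomes the two characters \r


# The result can sit inside a single-quoted JavaScript string literal.
print("alert('" + escsingle("it's\nhere") + "');")  # alert('it\'s\nhere');
```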

$t2mail(address)
Attempts to translate address into a "mailto:" URL.  (For example, if address is "joe@xyz.com", $t2mail() generates "<a href="mailto:joe@xyz.com">joe@xyz.com</A>".)  If address does not appear to be an e-mail address, it is passed through unchanged.

$wraptext(width text)
Word-wraps text to width (single-width-character) columns by inserting newlines in the appropriate places.
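
For comparison, Python's standard textwrap module performs a similar word wrap.  This is an analogy rather than a model of $wraptext(): textwrap.fill also collapses runs of whitespace and existing newlines, which $wraptext() may or may not do.

```python
import textwrap


def wraptext(width: int, text: str) -> str:
    """Word-wrap text so that no line is longer than width columns."""
    return textwrap.fill(text, width=width)


wrapped = wraptext(15, "the quick brown fox jumps over the lazy dog")
print(wrapped)
```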

$mac_define(name text)        {protected}
Defines a CML macro name that expands to text.  See the CML macros chapter for more information.  If name is already defined, the original definition is erased and replaced by the new one. 

$mac_expand(text)
Expands any macro invocations in text.  Evaluates to text, with the macro invocations replaced by the expansion of the macros.  See the CML macros chapter for more information. 

$eval(text)
Evaluates to nothing.  The functional equivalent of the eval directive.  Particularly useful in macro definitions when some other function needs to be evaluated, but the result "thrown away".

$unhtml(taglist text)
Evaluates to text, but with all HTML tags removed.  Tags in taglist are replaced with newlines, all other tags are replaced with blanks.  Taglist is a comma-separated list of tag names, e.g. "p,br,tr,ul,ol".
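
The tag replacement rule can be sketched in Python with a regular expression.  This is a simplified model under stated assumptions: tag names are matched without regard to case, closing tags are treated like their opening tags, and constructs such as comments are not handled.

```python
import re


def unhtml(taglist: str, text: str) -> str:
    """Replace tags named in taglist with newlines, all other tags with blanks."""
    newline_tags = {t.strip().lower() for t in taglist.split(",") if t.strip()}

    def repl(match):
        name = match.group(1).lower()
        return "\n" if name in newline_tags else " "

    # Matches opening and closing tags such as <P>, </B>, or <A HREF="x">.
    return re.sub(r"</?\s*([A-Za-z][A-Za-z0-9]*)[^>]*>", repl, text)


print(repr(unhtml("p,br", "<P>one<BR>two <B>bold</B>")))  # '\none\ntwo  bold '
```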

$url_decode(text)
Evaluates to the decoded form of "url encoded" text.
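
As an illustration of what "url encoded" text looks like, the Python standard library decodes it as shown below.  Whether $url_decode() also turns "+" into a space, as unquote_plus does, is an assumption.

```python
from urllib.parse import unquote_plus


def url_decode(text: str) -> str:
    """Decode %XX escapes (and, by assumption, '+' as a space)."""
    return unquote_plus(text)


print(url_decode("a%20b%26c+d"))  # a b&c d
```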

$h2url(text)
Looks for URLs in HTML text, and where they are not part of existing <A HREF...> tags, translates them into tags that link to those URLs.  Evaluates to the thus-translated value of text.

(This function is used, for example, to post-process the output from the Caucus "richtext" editor, to automatically "blue" URLs when needed.)

$addtarget(text)
Looks for <A HREF...> tags in HTML text, and if the referenced URL is something viewable (as defined by the New_Win_For parameter in the Caucus swebd.conf configuration file) and does not have an explicit TARGET reference, adds a TARGET that opens a new window.  Evaluates to the thus-translated value of text.

(This is another function that was added to help post-process the output from the Caucus "richtext" editor.)

$encode64(filename)
Evaluates to the base64-encoded contents of filename (full pathname).  Particularly useful for generating attachments to emails.
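
The equivalent operation in Python is sketched below for illustration.  Whether $encode64() breaks its output into fixed-length lines (as mail attachments usually require) is not stated here, so this sketch returns a single unbroken string.

```python
import base64


def encode64(filename: str) -> str:
    """Return the base64-encoded contents of the file at filename."""
    with open(filename, "rb") as f:  # read as bytes; the file may be binary
        return base64.b64encode(f.read()).decode("ascii")
```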