RTF (Rich Text Format) is an ancient and horrible format for "word processing" documents, and a quasi-standard from Microsoft. It can prove useful when you need to generate documents that people will open with Word. It is much harder to take a modern version of Word and make comprehensive, reversible RTF, but creating simple things with RTF that can be seen on any version of Word should be possible.

In case the point is missed here, when you are required to create some unholy output that some wretch needs to see with Microsoft Word because they haven’t the wherewithal to handle it any other way, using RTF you might just be able to synthesize something using only wholesome tools that could work. For example, a Python program, an XSLT transformation, or just by hand in Vim.

If you need more of the ghastly details, here is the full RTF specification.

Viewing

Install catdoc (apt-get install catdoc) for a simple conversion of RTF to plain text.

Conventions

RTF measures distances in twips (1/20th of a point, aka 1/1440th of an inch). 1 pt = 20 twips. 1 inch = 1440 twips. Plenty of precision there.

\fs28

Font size is specified in half points (14 point here).
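The unit arithmetic is easy to get wrong by hand, so here is a small Python sketch of it. The function names are made up for this example and are not part of any RTF library.

```python
TWIPS_PER_POINT = 20
TWIPS_PER_INCH = 1440

def inches_to_twips(inches):
    """Convert inches to twips, rounded to the nearest whole twip."""
    return round(inches * TWIPS_PER_INCH)

def points_to_twips(points):
    """Convert points to twips, rounded to the nearest whole twip."""
    return round(points * TWIPS_PER_POINT)

def font_size(points):
    """Return an \\fsN command; N is in half points, not twips."""
    return "\\fs%d" % round(points * 2)

print(inches_to_twips(0.5))   # 720
print(points_to_twips(12))    # 240
print(font_size(14))          # \fs28
```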

Minimal RTF Code

Here is a very small sample document.

{\rtf1\ansi\deff0 {\fonttbl {\f0 Times New Roman;}} \f0\fs24 Sample\line Minimal RTF Document}
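As a sketch of the "Python program" route, the following builds the same sample document as a string. The function name and its parameters are invented for illustration, and it assumes the text is plain ASCII with no backslashes or braces in it.

```python
def minimal_rtf(lines, font="Times New Roman", size_pt=12):
    """Build a one-font RTF document; the lines are joined with \\line.

    Assumes plain ASCII text containing no backslashes or braces.
    """
    header = r"{\rtf1\ansi\deff0 {\fonttbl {\f0 %s;}}" % font
    # Font size is given in half points, so 12 pt becomes \fs24.
    body = r"\f0\fs%d " % (size_pt * 2) + r"\line ".join(lines)
    return header + " " + body + "}"

doc = minimal_rtf(["Sample", "Minimal RTF Document"])
print(doc)
```

Writing the result out with open("sample.rtf", "w") and opening that file in Word is a quick way to check it.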

Here’s some stuff to include for polite operation of a general document. Put this in before the first real content:

\deflang1033 \plain \fs26 \widowctrl \hyphauto \ftnbj
{\header \pard\qr\plain\f0\fs16 \chpgn \par}

This resets the formatting and sets a default language (1033 is US English). It also sets a default header. The \chpgn inserts the current page number.

Structure

Whitespace is mostly forgiving, but it matters wherever it ends up as literal text in your document. Newlines in the source don’t usually cause trouble.

Braces are nested properly and the entire document must be enclosed in curly braces.

Major RTF component classes:

commands

Does stuff. Matches /\\[a-z]+(-?[0-9]+)? ?/

escapes

Specify the hard to specify stuff. See below.

groups

Use braces to save state ({) and then do something and then revert state (}). Think gsave and grestore in PostScript.

plaintext

Literal text. The actual content. What a concept!
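The four classes above can be picked apart with a rough tokenizer. This is only a sketch built around the command pattern given above; the escape pattern and the handling of newlines are my assumptions, and it makes no attempt to cover binary data or other corners of the spec.

```python
import re

TOKEN = re.compile(r"""
    (?P<command>\\[a-z]+(?:-?[0-9]+)?(?:[ ]|\r?\n)?)  # word, optional number, one delimiter
  | (?P<escape>\\'[0-9a-fA-F]{2}|\\[^a-z])            # hex escape, or backslash plus one other character
  | (?P<group>[{}])                                   # save or restore state
  | (?P<text>[^\\{}]+)                                # literal text
""", re.VERBOSE)

def tokenize(rtf):
    """Yield (kind, value) pairs for an RTF source string."""
    for match in TOKEN.finditer(rtf):
        kind, value = match.lastgroup, match.group()
        if kind == "command":
            value = value.rstrip(" \r\n")      # drop the delimiter that ended the command
        elif kind == "text":
            value = value.replace("\r", "").replace("\n", "")
            if not value:
                continue                       # the text was nothing but newlines
        yield kind, value

for token in tokenize("{\\b Bold\\'a9}"):
    print(token)
```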

Commands

Commands end in a space or a newline. Otherwise newlines don’t do anything in the text at all. Many spaces after a command will be inserted literally except for the first one which ends the command. That first space could have been a newline with the same effect. You can use newlines pretty safely for formatting the source.

\rtf1

RTF version 1 document follows. Required first thing. Always use 1.

\ansi

Windows Code Page 1252 character set.

\deff0

Default font is font #0 in the font table. See next item.

{\fonttbl {\f0 Times New Roman;}}

This is the font table defining font #0 as TNR.

\fs24

Set text to 12 point.

\i

Italic.

\b

Bold.

\ul

Underline.

\super

Superscript. I’m super, thanks for asking!

\sub

Subscript.

\scaps

Small capitals. Sort of an oxymoron.

\strike

Strike through. Maybe good for generating fake <hr> sorts of things. Or try {\pard \brdrb \brdrs \brdrw10 \brsp20 \par} {\pard\par}.

\plain

Turns off any formatting stuff and resets stuff in general.

\line

Start a new line within the paragraph.

\qc

"Quadding" (alignment) center.

\ql

Left justify.

\qr

Right justify.

\qj

Full justify.

\pagebb

Page break before this paragraph. For a break right here, use \page.

\keep

Don’t split this paragraph across pages.

\keepn

Keep this paragraph with the next one.

\widctlpar

Turns on a reluctance to break near the first or last sentence (or two?). Widow/orphan control it’s called.

\nowidctlpar

Turns off widctlpar if it’s on.

\hyphpar

Turns on automatic hyphenation for the paragraph. With parameter 0 (\hyphpar0) it is canceled for this paragraph.

\hyphauto

Turns on automatic hyphenation for the whole doc if used near the beginning.

\sl480\slmult1

Double line spacing. 1.5 spacing is 360. Single spacing (240) is the default.

\pvpg\phpg\posxX\posyY\abswW

Exact paragraph positioning X,Y from top left in twips and W wide.

Those Microsoft people love their backslashes.
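Here is a hedged Python sketch of using a few of these commands from a generator. Both helper names are made up for this example. The first wraps text in a group so that the formatting ends at the closing brace; the second relies on the \sl values listed above, where 240 is single spacing and \slmult1 is in use.

```python
def styled(text, *words):
    """Wrap text in a group with the given command words applied.

    Assumes the text is already safe RTF plaintext.
    """
    return "{" + "".join("\\" + w for w in words) + " " + text + "}"

def line_spacing(multiple):
    """Return \\slN\\slmult1 for a multiple of single spacing (240)."""
    return "\\sl%d\\slmult1" % round(240 * multiple)

print(styled("Important", "b", "i"))   # {\b\i Important}
print(line_spacing(1.5))               # \sl360\slmult1
```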

Escapes

RTF only uses ASCII values 32-126. To get something else use this kind of escape:

\'a9

Where a9 is a hex number (representing the copyright symbol).

Escapes of interest:

\~

Non-breaking space, like &nbsp; in HTML.

\-

Optional hyphenation point.

\_

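Real hyphen that should not be broken.

\*

Marks the group it starts as ignorable: a reader that doesn’t recognize the command that follows skips the whole group. See the comment trick under Info Section.

A space after an escape sequence is not swallowed the way it is after a command; it shows up as literal text.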

Unicode works and is done like this:

\uc1\u2620*

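The star does not end it. \uc1 says that one fallback character follows each \u escape, and the star is that fallback, shown by readers that can’t do Unicode. The number is decimal, not hex, so U+2620, for example, would be \u9760. Values above 32767 are written as negative numbers, and there are some other funky rules for very high Unicode values.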
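For generated documents it helps to push all text through one escaping function. The sketch below is an assumption-laden example, not a complete implementation: the name rtf_escape is invented, it assumes \uc1 is in effect, it puts a ? where the sample above has a star, and it turns every non-ASCII character into a \u escape instead of bothering with \'hh.

```python
def rtf_escape(text):
    """Escape a Python string for use as RTF plaintext."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)            # literal backslash or brace
        elif 32 <= ord(ch) <= 126:
            out.append(ch)                   # printable ASCII passes through
        else:
            # Emit each UTF-16 code unit as a signed 16-bit decimal number.
            units = ch.encode("utf-16-le")
            for i in range(0, len(units), 2):
                n = int.from_bytes(units[i:i + 2], "little", signed=True)
                out.append("\\u%d?" % n)
    return "".join(out)

print(rtf_escape("a{b}"))       # a\{b\}
print(rtf_escape("\u00a9"))     # \u169?
print(rtf_escape("\ufb01"))     # \u-1279?
```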

Fonts

In theory a document should have a font table.

Something like this will do:

{\fonttbl {\f0\froman Times;} {\f1\fswiss Helvetica;} {\f2\fmodern Courier;} }

And then you can do {\f1 A Swiss typeface.}.

Paragraphs

A basic paragraph syntax looks like this:

{\rtf1\ansi\deff0 {\fonttbl {\f0 Times New Roman;}}\f0\fs24
{\pard Some lengthy explanation of how paragraphs in RTF are
structured is probably occurring right now somewhere.\par}
}

Note that there needs to be a literal space after are in the source because the newline isn’t converted to it. It is simply dropped from the source.

The "d" in pard means "default" and it is implied that the paragraph defaults are set at the start of a paragraph. YMMV.

Here is a 20pt centered title line using the paragraph delimiters:

{\pard \qc \fs40 Chris X Edwards\par}

I don’t know if it makes a difference which kind of formatting command comes first in cases like this.

Padding

To get vertical space around paragraphs, use:

\sbN

Add N twips of space before the paragraph.

\saN

Add N twips of space after the paragraph.

Best to add space after for best general effect, but this can be done at the beginning of the paragraph definition:

{\pard\sa720 \fs24 Some text is followed by 1/2".\par}

Indenting

\liN

Indent lines N twips of space from the left margin.

\fiN

Indent the first line N twips relative to the left indent (set with \liN). You can use a negative number to have the first line be not as indented as the rest.

\riN

Indent lines N twips of space from the right margin.
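The paragraph, padding, and indenting commands combine naturally in one helper. This is a minimal Python sketch; the function and its keyword names are invented here, and it assumes the text passed in is already safe RTF plaintext.

```python
def paragraph(text, align="l", before=0, after=0, left=0, first=0, right=0):
    """Wrap text in a {\\pard ... \\par} group.

    align is one of l, c, r, j. Distances are in twips; zeros are left out.
    """
    if align not in ("l", "c", "r", "j"):
        raise ValueError("align must be one of l, c, r, j")
    parts = ["\\pard", "\\q" + align]
    for word, value in (("sb", before), ("sa", after),
                        ("li", left), ("fi", first), ("ri", right)):
        if value:
            parts.append("\\%s%d" % (word, value))
    return "{" + "".join(parts) + " " + text + "\\par}"

print(paragraph('Some text is followed by 1/2".', after=720))
# {\pard\ql\sa720 Some text is followed by 1/2".\par}
print(paragraph("Hanging indent.", left=720, first=-720))
# {\pard\ql\li720\fi-720 Hanging indent.\par}
```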

Margins

\margtN

Where N is the distance in twips from the top.

\margbN

Where N is the distance in twips from the bottom.

\marglN

Where N is the distance in twips from the left.

\margrN

Where N is the distance in twips from the right.

US Letter:

\paperw12240 \paperh15840

Landscape US Letter: \paperw15840 \paperh12240 \margl1440 \margr1440 \margt1800 \margb1800 \landscape

Also: \twoonone God help us.
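Page setup is mostly inch-to-twip arithmetic, so it can be sketched in Python like this. The function name and its defaults are assumptions for illustration, taking US Letter as 8.5 by 11 inches and borrowing the margin values from the landscape line above.

```python
TWIPS_PER_INCH = 1440

def page_setup(width_in=8.5, height_in=11.0, landscape=False,
               left=1.0, right=1.0, top=1.25, bottom=1.25):
    """Return paper size and margin commands; all arguments are in inches."""
    w = round(width_in * TWIPS_PER_INCH)
    h = round(height_in * TWIPS_PER_INCH)
    if landscape:
        w, h = h, w                          # the long edge becomes the width
    words = [("paperw", w), ("paperh", h),
             ("margl", round(left * TWIPS_PER_INCH)),
             ("margr", round(right * TWIPS_PER_INCH)),
             ("margt", round(top * TWIPS_PER_INCH)),
             ("margb", round(bottom * TWIPS_PER_INCH))]
    out = "".join("\\%s%d" % pair for pair in words)
    return (out + "\\landscape") if landscape else out

print(page_setup())
print(page_setup(landscape=True))
```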

Columns

\colsN

Where N is the number of columns you want.

\colsxN

Where N is the spacing between columns in twips.

\linebetcol

Draw a line between columns.

Tables

Tables can be done. Not fun but can be done. Here’s an example:

\trowd \trgaph180
\cellx1440\cellx2880
\pard\intbl Content of upper left box.\cell
\pard\intbl Content of upper right box.\cell
\row

\trowd \trgaph180
\cellx1440\cellx2880
\pard\intbl Content of lower left box.\cell
\pard\intbl Content of lower right box.\cell
\row
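Since every row repeats the same boilerplate, tables are a good candidate for generation. The Python sketch below follows the layout of the example above; the function name is invented, the cell text is assumed to be safe RTF plaintext, and the one thing to notice is that each \cellxN is a right edge measured from the left side of the table, so column widths have to be accumulated.

```python
def table_row(cells, widths, gap=180):
    """Build one table row; widths are per-column widths in twips."""
    if len(cells) != len(widths):
        raise ValueError("need exactly one width per cell")
    edge = 0
    edges = []
    for width in widths:
        edge += width                        # running right edge of this cell
        edges.append("\\cellx%d" % edge)
    lines = ["\\trowd \\trgaph%d" % gap, "".join(edges)]
    for cell in cells:
        lines.append("\\pard\\intbl " + cell + "\\cell")
    lines.append("\\row")
    return "\n".join(lines)

print(table_row(["Upper left.", "Upper right."], [1440, 1440]))
print(table_row(["Lower left.", "Lower right."], [1440, 1440]))
```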

Info Section

You can have an optional info section of metadata. This looks like:

{\info
{\title THE DOCUMENT TITLE}
{\author CHRIS X EDWARDS}
{\company PRIVACY LEAK DETECTIVES, LLC}
{\creatim\yr2012\mo8\dy20\hr12\min06}
{\doccomm Any random comment, URL, version number, etc.}
}
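A generator can fill in the info section from real values, including the creation time. This is a small Python sketch with an invented function name; it only covers the fields shown above and assumes the strings are safe RTF plaintext.

```python
from datetime import datetime

def info_group(title, author, created, comment=None):
    """Build an {\\info ...} group; created is a datetime."""
    stamp = "\\creatim\\yr%d\\mo%d\\dy%d\\hr%d\\min%d" % (
        created.year, created.month, created.day,
        created.hour, created.minute)
    lines = ["{\\info",
             "{\\title %s}" % title,
             "{\\author %s}" % author,
             "{%s}" % stamp]
    if comment:
        lines.append("{\\doccomm %s}" % comment)
    lines.append("}")
    return "\n".join(lines)

print(info_group("THE DOCUMENT TITLE", "CHRIS X EDWARDS",
                 datetime(2012, 8, 20, 12, 6), comment="Version 1"))
```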

Speaking of comments, it seems that RTF does not really support them (since they never expected people like me to compose human readable documents directly in RTF from an editor). But there is some weird syntax designed to allow backwards compatibility. The idea is newfangled commands can get parsed and attended to by newfangled parsers. But old parsers were instructed to be ready for newfangled commands and to skip over them if they didn’t understand them. So basically if you just make up a new command with this syntax, you can have what amounts to a comment. Here’s how:

{\*\xednote{Chris X Edwards}}
{\*\xednote{As long as "xednote" isn't a real function in a parser,
everything in this block should be ignored.}}
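To see roughly what a reader that knows none of the starred commands would do, here is a Python sketch that drops every group starting with {\* by counting braces. The function name is made up, and this is a simplification: a real reader keeps the starred groups it does understand and throws away only the ones it doesn't.

```python
def strip_ignorable(rtf):
    """Remove every {\\*...} group, honoring nested and escaped braces."""
    out = []
    i = 0
    while i < len(rtf):
        if rtf.startswith("{\\*", i):
            depth = 0
            while i < len(rtf):
                ch = rtf[i]
                if ch == "\\":
                    i += 2                   # skip the backslash and what it escapes
                    continue
                if ch == "{":
                    depth += 1
                elif ch == "}":
                    depth -= 1
                    if depth == 0:
                        i += 1               # step past the closing brace
                        break
                i += 1
        elif rtf[i] == "\\":
            out.append(rtf[i:i + 2])         # keep escapes and command starts intact
            i += 2
        else:
            out.append(rtf[i])
            i += 1
    return "".join(out)

print(strip_ignorable("{\\rtf1 a{\\*\\xednote{hi \\} there}}b}"))   # {\rtf1 ab}
```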