# TeX: Typesetting

TeX is a typesetting system: it was especially designed to handle complex mathematics, as well as most ordinary text typesetting.

TeX is a batch language, like C or Pascal, and not an interactive "word processor": you compile a TeX input file into a corresponding device-independent (DVI) file (and then translate the DVI file to the commands for a particular output device). This approach has both considerable disadvantages and considerable advantages. For a complete description of the TeX language, see The TeXbook (see section References). Many other books on TeX, introductory and otherwise, are available.

## `tex` invocation

TeX (usually invoked as `tex`) formats the given text and commands, and outputs a corresponding device-independent representation of the typeset document. This section merely describes the options available in the Web2c implementation. For a complete description of the TeX typesetting language, see The TeXbook (see section References).

TeX, Metafont, and MetaPost process the command line (described here) and determine their memory dump (fmt) file in the same way (see section Memory dumps). Synopses:

```
tex [option]... [texname[.tex]] [tex-commands]
tex [option]... \first-line
tex [option]... &fmt args
```

TeX searches the usual places for the main input file texname (see section `Supported file formats' in Kpathsea), extending texname with `.tex' if necessary. To see all the relevant paths, set the environment variable `KPATHSEA_DEBUG` to `-1' before running the program.

After texname is read, TeX processes any remaining tex-commands on the command line as regular TeX input. Also, if the first non-option argument begins with a TeX escape character (usually `\`), TeX processes all non-option command-line arguments as a line of regular TeX input.

If no arguments or options are specified, TeX prompts for an input file name with `**'.
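The three synopses and the prompting rule can be modeled with a small sketch. This is only an illustration of the rules stated above, not the Web2c implementation: the function name `classify_command_line` is hypothetical, option handling is reduced to "anything starting with `-`", and joining arguments with single spaces is an assumption.

```python
def classify_command_line(argv):
    """Describe how a list of command-line arguments would be read.

    Illustrative model only: every argument starting with '-' is
    treated as an option and ignored.
    """
    args = [a for a in argv if not a.startswith("-")]
    if not args:
        # Nothing to read: TeX prompts for a file name with '**'.
        return ("prompt", "**")
    first = args[0]
    if first.startswith("\\"):
        # First argument starts with the escape character: all
        # non-option arguments form one line of TeX input.
        return ("first-line", " ".join(args))
    if first.startswith("&"):
        # '&fmt' selects a format; the rest are passed along.
        return ("format", first[1:], args[1:])
    # Otherwise the first argument is texname and the remainder
    # is processed as TeX input after the file is read.
    return ("file", first, " ".join(args[1:]))


print(classify_command_line(["story", "\\bye"]))
print(classify_command_line(["\\input", "plain", "\\dump"]))
```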

TeX writes the main DVI output to the file `basetexname.dvi', where basetexname is the basename of texname, or `texput' if no input file was specified. A DVI file is a device-independent binary representation of your TeX document. The idea is that after running TeX, you translate the DVI file using a separate program to the commands for a particular output device, such as a PostScript printer (see section `Introduction' in Dvips) or an X Window System display (see xdvi(1)).
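As a rough illustration of the naming rule, the following sketch derives the DVI file name from texname. The helper `dvi_name` is hypothetical, and it only strips a trailing `.tex'; how other extensions are treated is not covered here.

```python
import posixpath


def dvi_name(texname=None):
    """Return the DVI file name for a given main input file name."""
    if texname is None:
        # No input file was specified.
        return "texput.dvi"
    base = posixpath.basename(texname)  # drop any directory part
    if base.endswith(".tex"):
        base = base[: -len(".tex")]     # drop the '.tex' extension
    return base + ".dvi"


print(dvi_name("/home/u/docs/story.tex"))
print(dvi_name())
```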

TeX also reads TFM files for any fonts you load in your document with the `\font` primitive. By default, it runs an external program named `mktextfm' to create any nonexistent TFM files. You can disable this at configure-time or runtime (see section `mktex configuration' in Kpathsea). This is enabled mostly for the sake of the EC fonts, which can be generated at any size.

TeX can write output files via the `\openout` primitive. This opens a security hole vulnerable to a Trojan horse attack: an unwitting user could run a TeX program that overwrites, say, `~/.rhosts'. (MetaPost has a `write` primitive with similar implications.) To alleviate this, there is a configuration variable `openout_any`, which selects one of three levels of security. When it is set to `a' (for "any"), no restrictions are imposed. When it is set to `r' (for "restricted"), filenames beginning with `.' are disallowed (except `.tex', because LaTeX needs it). When it is set to `p' (for "paranoid"), additional restrictions are imposed: an absolute filename must refer to a file in `TEXMFOUTPUT` or one of its subdirectories, and any attempt to go up a directory level is forbidden (that is, paths may not contain a `..' component). The paranoid setting is the default. (For backwards compatibility, `y' and `1' are synonyms of `a', while `n' and `0' are synonyms of `r'.)

In any case, all `\openout` filenames are recorded in the log file, except those opened on the first line of input, which is processed when the log file has not yet been opened. (If you as a TeX administrator wish to implement more stringent rules on `\openout`, modifying the function `openoutnameok` in `web2c/lib/texmfmp.c' is intended to suffice.)
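The three `openout_any` levels described above can be sketched as a single check. This is a simplified model of the stated rules, not the code of `openoutnameok`: the function name `openout_allowed` is hypothetical, the leading-dot test is applied to the last path component (an assumption), and paths are treated as POSIX paths.

```python
import posixpath

# Backwards-compatible synonyms for the security levels.
_SYNONYMS = {"y": "a", "1": "a", "n": "r", "0": "r"}


def openout_allowed(filename, level="p", texmfoutput=None):
    """Return True if the file name passes the given openout_any level."""
    level = _SYNONYMS.get(level, level)
    if level not in ("a", "r", "p"):
        raise ValueError("unknown openout_any value: %r" % level)
    if level == "a":
        return True  # "any": no restrictions

    # "restricted" and "paranoid": no dot files, except '.tex'.
    base = posixpath.basename(filename)
    if base.startswith(".") and base != ".tex":
        return False
    if level == "r":
        return True

    # "paranoid": no '..' component anywhere in the path ...
    if ".." in filename.split("/"):
        return False
    # ... and absolute names must lie below TEXMFOUTPUT.
    if posixpath.isabs(filename):
        if not texmfoutput:
            return False
        root = texmfoutput.rstrip("/") + "/"
        return filename.startswith(root)
    return True


print(openout_allowed("/home/u/.rhosts", "r"))
print(openout_allowed("../notes.aux", "p"))
```

Under the default paranoid level, both a dot file and a path that climbs out of the current directory are refused, while an ordinary relative name such as `notes.aux` is accepted.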

The program accepts the following options, as well as the standard `-help' and `-version' (see section Common options):

`-kpathsea-debug=number'
`-ini'
`-fmt=fmtname'
`-progname=string'
These options are common to TeX, Metafont, and MetaPost. See section Common options.
`-ipc'
`-ipc-start'
With either option, TeX writes its DVI output to a socket as well as to the usual `.dvi' file. With `-ipc-start', TeX also opens a server program at the other end to read the output. See section IPC and TeX. These options are available only if the `--enable-ipc' option was specified to `configure` during installation of Web2c.
`-mktex=filetype'
`-no-mktex=filetype'
Turn on or off the `mktex' script associated with filetype. The only values that make sense for filetype are `tex' and `tfm'.
`-mltex'
If `INITEX` (see section Initial and virgin), enable MLTeX extensions such as `\charsubdef`. Implicitly set if the program name is `mltex`. See section MLTeX: Multi-lingual TeX.
`-output-comment=string'
Use string as the DVI file comment. Ordinarily, this comment records the date and time of the TeX run, but if you are doing regression testing, you may not want the DVI file to have this spurious difference. This is also taken from the environment variable and config file value `output_comment'.
`-shell-escape'
Enable the `\write18{shell-command}' feature. This is also enabled if the environment variable or config file value `shell_escape' is set to `t'. (For backwards compatibility, `y' and `1' are accepted as synonyms of `t'.) It is disabled by default to avoid security problems. When enabled, the shell-command string (which first undergoes the usual TeX expansions, just as in `\special') is passed to the command shell (via the C library function `system'). The output of shell-command is not diverted anywhere, so it will not appear in the log file. The system call either happens at `\output' time or right away, according to the absence or presence of the `\immediate' prefix, as usual for `\write`. (If you as a TeX administrator wish to implement more stringent rules on what can be executed, you will need to modify `tex.ch'.)

## `initex` invocation

`initex` is the "initial" form of TeX, which does lengthy initializations avoided by the "virgin" (`vir`) form, so as to be capable of dumping `.fmt' files (see section Memory dumps). For a detailed comparison of virgin and initial forms, see section Initial and virgin.

For a list of options and other information, see section `tex` invocation.

Unlike Metafont and MetaPost, many format files are commonly used with TeX. The standard one implementing the features described in the TeXbook is `plain.fmt', also known as `tex.fmt' (again, see section Memory dumps). It is created by default during installation, but you can also do so by hand if necessary (e.g., if an update to `plain.tex' is issued):

```
initex '\input plain \dump'
```

(The quotes prevent the shell from interpreting the backslashes.) Then install the resulting `plain.fmt' in `$(fmtdir)' (`/usr/local/share/texmf/web2c' by default), and link `tex.fmt' to it.

The necessary invocation for generating a format file differs for each format, so consult the instructions that come with the format. The top-level `web2c' Makefile has targets for making the most common formats: `plain', `latex', `amstex', `texinfo', and `eplain'. See section Formats, for more details on TeX formats.

## `virtex` invocation

`virtex` is the "virgin" form of TeX, which avoids the lengthy initializations done by the "initial" (`ini`) form, and is thus what is generally used for production work. For a detailed comparison of virgin and initial forms, see section Initial and virgin.

For a list of options and other information, see section `tex` invocation.

## Formats

TeX formats are large collections of macros, possibly dumped into a `.fmt' file (see section Memory dumps) by `initex` (see section `initex` invocation). A number of formats are in reasonably widespread use, and the Web2c Makefile has targets to make the versions current at the time of release. You can change which formats are automatically built by setting the `fmts` Make variable; by default, only the `plain' and `latex' formats are made.

You can get the latest versions of most of these formats from the CTAN archives in subdirectories of `CTAN:/macros' (for CTAN info, see section `unixtex.ftp' in Kpathsea). The archive `ftp://ftp.tug.org/tex/lib.tar.gz` (also available from CTAN) contains most of these formats (although perhaps not the absolute latest version), among other things.

latex
The most widely used format. The current release is named `LaTeX 2e'; new versions are released approximately every six months, with patches issued as needed. The old release was called `LaTeX 2.09', and is no longer maintained or supported. LaTeX attempts to provide generic markup instructions, such as "emphasize", instead of specific typesetting instructions, such as "use the 10pt Computer Modern italic font".
amstex
The official typesetting system of the American Mathematical Society, used to produce nearly all of its publications, e.g., Mathematical Reviews. Like LaTeX, it encourages generic markup commands. The AMS also provides a LaTeX package for authors who prefer LaTeX (see the `amslatex' item below).
texinfo
The documentation system developed and maintained by the Free Software Foundation for their software manuals. It can be automatically converted into plain text, a machine-readable on-line format called `info', HTML, etc.
eplain
The "expanded plain" format provides various common features (e.g., symbolic cross-referencing, tables of contents, indexing, citations using BibTeX), for those authors who prefer to handle their own high-level formatting.
lamstex
Augments AMSTeX with LaTeX-like features.
amslatex
A LaTeX package (see the `latex' item above) that augments LaTeX with AMSTeX-like features.
slitex
An obsolete LaTeX 2.09 format for making slides. It is replaced by the `slides' document class.

## Languages and hyphenation

### MLTeX: Multi-lingual TeX

Multi-lingual TeX (`mltex`) is an extension of TeX originally written by Michael Ferguson and now updated and maintained by Bernd Raichle. It allows the use of non-existing glyphs in a font by declaring glyph substitutions. These are restricted to substitutions of an accented character glyph, which need not be defined in the current font, by its appropriate `\accent` construction using a base and accent character glyph, which do have to exist in the current font. This substitution is automatically done behind the scenes, if necessary, and thus MLTeX additionally supports hyphenation of words containing an accented character glyph for fonts missing this glyph (e.g., Computer Modern). Standard TeX suppresses hyphenation in this case.

MLTeX works at `.fmt'-creation time: the basic idea is to specify the `-mltex' option to TeX when you `\dump` a format. Then, when you subsequently invoke TeX and read that `.fmt` file, the MLTeX features described below will be enabled.

Generally, you use special macro files to create an MLTeX `.fmt` file. See:

```
CTAN:/systems/generic/mltex
ftp://ftp.univ-rennes1.fr/pub/GUTenberg/french/
```

The sections below describe the two new primitives that MLTeX defines. Aside from these, MLTeX is completely compatible with standard TeX.

#### `\charsubdef`: Character substitutions

The most important primitive MLTeX adds is `\charsubdef`, used in a way reminiscent of `\chardef`:

```
\charsubdef composite [=] accent base
```

Each of composite, accent, and base is a font glyph number, expressed in the usual TeX syntax: `` `\e `` symbolically, `'145` for octal, `"65` for hex, `101` for decimal.
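The four notations above all denote the same value, 101, the ASCII code of lowercase `e'. A minimal sketch of how such constants can be read is shown here; the function `parse_tex_number` is hypothetical and handles only these four simple forms, not the full TeX number syntax (for instance, TeX itself expects uppercase hex digits, while this sketch accepts either case).

```python
def parse_tex_number(text):
    """Convert a simple TeX numeric constant to an integer."""
    text = text.strip()
    if text.startswith("`"):
        # `c or `\c : the character code of c.
        char = text[1:]
        if char.startswith("\\") and len(char) == 2:
            char = char[1:]
        if len(char) != 1:
            raise ValueError("expected a single character: %r" % text)
        return ord(char)
    if text.startswith("'"):
        return int(text[1:], 8)   # octal: '145 = 1*64 + 4*8 + 5
    if text.startswith('"'):
        return int(text[1:], 16)  # hexadecimal: "65 = 6*16 + 5
    return int(text, 10)          # plain decimal


for constant in ["`\\e", "'145", '"65', "101"]:
    print(constant, "->", parse_tex_number(constant))
```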

MLTeX's `\charsubdef` declares how to construct an accented character glyph (not necessarily existing in the current font) using two character glyphs (that do exist). Thus it defines whether a character glyph code, either typed as a single character or using the `\char` primitive, will be mapped to a font glyph or to an `\accent` glyph construction.

For example, if you assume glyph code 138 (decimal) for an e-circumflex and you are using the Computer Modern fonts, which have the circumflex accent in position 18 and lowercase `e' in the usual ASCII position 101 decimal, you would use `\charsubdef` as follows:

```
\charsubdef 138 = 18 101
```

For the plain TeX format to make use of this substitution, you have to redefine the circumflex accent macro `\^` in such a way that if its argument is the character `e', the expansion `\char138 ` is used instead of `\accent18 e`. Similar `\charsubdef` declarations and macro redefinitions have to be done for all other accented characters.

To disable a previous `\charsubdef c`, redefine c as a pair of zeros. For example:

```
\charsubdef '321 = 0 0  % disable N tilde
```

(Octal '321 is the ISO Latin-1 value for the Spanish N tilde.)
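The declaration and disabling rules can be pictured as a lookup table from composite glyph codes to (accent, base) pairs. The sketch below is a conceptual model only, not how MLTeX stores substitutions: the names `charsubdef` and `resolve` and the representation of a font as a set of available glyph codes are assumptions made for illustration, as is the rule that a glyph present in the font takes precedence over a substitution.

```python
def charsubdef(table, composite, accent, base):
    """Record a substitution, or remove it when given a pair of zeros."""
    if accent == 0 and base == 0:
        table.pop(composite, None)
    else:
        table[composite] = (accent, base)


def resolve(table, font_glyphs, code):
    """Decide how a glyph code would be typeset in a given font."""
    if code in font_glyphs:
        return ("char", code)              # the font has the glyph itself
    sub = table.get(code)
    if sub and sub[0] in font_glyphs and sub[1] in font_glyphs:
        return ("accent", sub[0], sub[1])  # build it with \accent
    return ("missing", code)               # no glyph and no substitution


table = {}
computer_modern = {18, 101}          # circumflex accent and 'e' only
charsubdef(table, 138, 18, 101)      # \charsubdef 138 = 18 101
print(resolve(table, computer_modern, 138))
charsubdef(table, 138, 0, 0)         # \charsubdef 138 = 0 0
print(resolve(table, computer_modern, 138))
```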

`\charsubdef` commands should only be given once. Although in principle you can use `\charsubdef` at any time, the result is unspecified. If `\charsubdef` declarations are changed, usually either incorrect character dimensions will be used or MLTeX will output missing character warnings. (The substitution of a `\charsubdef` is used by TeX when appending the character node to the current horizontal list, to compute the width of a horizontal box when the box gets packed, and when building the `\accent` construction at `\shipout`-time. In summary, the substitution is accessed often, so changing it is not desirable, nor generally useful.)

#### `\tracingcharsubdef`: Substitution diagnostics

To help diagnose problems with `\charsubdef`, MLTeX provides a new primitive parameter, `\tracingcharsubdef`. If it is positive, every use of `\charsubdef` will be reported. This can help track down when a character is redefined.

In addition, if the TeX parameter `\tracinglostchars` is 100 or more, the character substitutions actually performed at `\shipout`-time will be recorded.

### Patgen: Creating hyphenation patterns

Patgen creates hyphenation patterns from dictionary files for use with TeX. Synopsis:

```
patgen dictionary patterns output translate
```

Each argument is a filename. No path searching is done. The output is written to the file output.

In addition, Patgen prompts interactively for other values.

For more information, see Word hy-phen-a-tion by com-puter by Frank Liang (see section References), and also the `patgen.web' source file.

The only options are `-help' and `-version' (see section Common options).

## IPC and TeX

(Sorry, but I'm not going to write this unless someone actually uses this feature. Let me know.)

This functionality is available only if the `--enable-ipc' option was specified to `configure` during installation of Web2c (see section Installation).

If you define `IPC_DEBUG` before compilation (e.g., with `make XCFLAGS=-DIPC_DEBUG'), TeX will print messages to standard error about its socket operations. This may be helpful if you are, well, debugging.

## TeX extensions

The base TeX program has been extended in many ways. Here's a partial list. Please send information on extensions not listed here to the address in section `Reporting bugs' in Kpathsea.

e-TeX
Adds many new primitives, including right-to-left typesetting. Available from `http://www.vms.rhbnc.ac.uk/e-TeX/` and `CTAN:/systems/e-tex'.
Omega
Adds Unicode support, right-to-left typesetting, and more. Available from `http://www.ens.fr/omega` and `CTAN:/systems/omega'.
PDFTeX
A variant of TeX that produces PDF instead of DVI files. It also includes primitives for hypertext. Available from `CTAN:/systems/pdftex'.
`TeX--XeT'
Adds primitives and DVI opcodes for right-to-left typesetting (as used in Arabic, for example). An old version for TeX 3.1415 is available from `CTAN:/systems/knuth/tex--xet'. A newer version is included in e-TeX.
File-handling TeX
Adds primitives for creating multiple DVI files in a single run; and appending to output files as well as overwriting. Web2c implementation available in the distribution file `web2c/contrib/file-handling-tex'.