
Miftran Users Guide: 3 The RC File

3 The RC File

The rc file controls the translation of your MIF file. The syntax of the rc file is identical to the syntax of a MIF file (the same parser is used internally), but the command keywords are different.

Commands in a MIF file (and thus in an rc file) are surrounded by angle brackets (<>), and strings are surrounded by back- and forward-quotes (`'). Words which contain only alphabetic characters do not need to be quoted. Within the angle brackets, the command consists of a command keyword followed by the data (arguments) for that command, which are either integers, floats, or strings. For example, the infilename command in the rc file looks like this:

<infilename `foo.mif'>

Commands in a real MIF file are similar, except that they may contain nested arguments. Commands in an rc file do not contain nested arguments.
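The syntax described above can be sketched in a few lines of Python. This is only an illustration of the rules as stated here (angle brackets, back- and forward-quoted strings, bare words, integers and floats); it is not miftran's parser, and the names parse_command and convert are made up for this example.

```python
import re

# A token is either a `quoted string' or a bare run of non-space characters.
TOKEN = re.compile(r"`([^']*)'|(\S+)")

def convert(word):
    """Turn a bare word into an int or float when it looks like one."""
    if re.fullmatch(r"[+-]?\d+", word):
        return int(word)
    if re.fullmatch(r"[+-]?(\d+\.\d*|\.\d+)", word):
        return float(word)
    return word

def parse_command(text):
    """Split one non-nested command into (keyword, [arguments])."""
    text = text.strip()
    if not (text.startswith("<") and text.endswith(">")):
        raise ValueError("a command must be wrapped in angle brackets")
    tokens = TOKEN.findall(text[1:-1])
    # Each token is a (quoted, bare) pair; the keyword must be a bare word.
    if not tokens or not tokens[0][1]:
        raise ValueError("a command must start with a keyword")
    keyword = tokens[0][1]
    args = [convert(bare) if bare else quoted for quoted, bare in tokens[1:]]
    return keyword, args

print(parse_command("<infilename `foo.mif'>"))
```

Because quoted strings are matched before bare words, angle brackets inside a quoted argument (as in the stringsub example later in this chapter) do not confuse the sketch.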

3.1 Comments

You can include comments in your rc file by starting a line with the # character.

A comment, starting with the # character, can also be placed on the same line as a command. The remainder of the line is treated as a comment.

In a MIF file, lines whose first character is & or = are treated as comment lines. MIF uses these lines for insets (such as an embedded PostScript file), which miftran does not handle.

3.2 Commands

This section describes the commands that you can put into your rc file.

3.2.1 altoutfilename

Specifies a filename pattern for an alternate set of output files. These files are typically used for generating auxiliary data such as indexes. For example, you might specify

<altoutfilename `d%d.txt'>

Then in a substitution string you could use

`%1.1A'

to switch to the output file d1.txt.

3.2.2 eprint

Prints the specified message to stderr. For example:

<eprint `Using RC file foo.rc'>

3.2.3 include

Includes the contents of the specified file as if it were part of the current file. Include statements can be nested. For example,

<include `std.rc'>

will include the standard rc file.

Note that if you mistakenly use the C preprocessor syntax (#include), the line will be treated as a comment, which makes the mistake much less obvious.

3.2.4 includedir

Specifies a directory in which to search for files specified with the <include> command. Multiple directories may be specified. The -I command line option can also be used to specify these directories.

3.2.5 infilename

Specifies the name of the input MIF file. If this command is not specified, the input filename can be specified on the command line.

<infilename `miftran.mif'>

3.2.6 init

Processes the specified control string immediately. The tag and data strings will be blank. This is useful for initialization steps such as pre-loading registers or setting up output files.

3.2.7 outfilename

Specifies a filename pattern for the output files. If there is a single output file, the given name is that filename. If there are multiple output files, the pattern should include %d in it somewhere. For example,

<outfilename `chap%d.html'>

Then you would switch to a new output file with a translation string such as

`%+1.1F'

3.2.8 print

Prints the specified message to stdout. For example:

<print `Using RC file foo.rc'>

3.2.9 stringsub

Specifies a string translation. This command allows you to translate special characters or strings in the MIF file into other special strings in the output file. The first argument is the string to look for in the input, the second argument is what to replace it with in the output. For example, the < character must be escaped in HTML files. This is accomplished with the command

<stringsub `<' `&lt;'>

String substitution is part of the substitution stage of the translation process (see Section 1.1, `The Translation Process'). It occurs after type substitution and before formatting. String substitution is applied to the data strings from the MIF file before they are used in the formatting step.
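The effect of a set of stringsub commands on one data string can be modelled roughly as follows. This is a sketch under assumptions: it applies the substitutions one after another in the order listed, which may not be how miftran does it, and the & and > pairs are ordinary HTML escapes added here for illustration rather than taken from any miftran rc file.

```python
def apply_stringsubs(data, subs):
    """Apply (search, replacement) pairs to a data string, in list order."""
    for search, replacement in subs:
        data = data.replace(search, replacement)
    return data

# With sequential replacement, & must come first so that the & in the
# replacement text &lt; is not escaped a second time.
html_subs = [("&", "&amp;"), ("<", "&lt;"), (">", "&gt;")]

print(apply_stringsubs("a < b & c", html_subs))
```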

3.2.10 tagalias

Specifies that the first tag name is to be treated the same as if it were the second tag name. This makes it much easier to set up your rc file for your document, since it means that you can specify in a single line how miftran should handle each of your paragraph tags (assuming miftran has a standard set of commands for dealing with that paragraph type). For example, the line

<tagalias chapter MiftranChapter>

tells miftran that your chapter paragraphs are tagged `chapter', and that it should handle those using its standard MiftranChapter definition.

3.2.11 typestringsub

Specifies a string substitution, to be executed only when in a paragraph of the specified type. This command allows you to translate special characters or strings only when they appear in a specified paragraph type. For example, you could use the following command to turn tabs into spaces only in the `chapter' paragraph type:

<typestringsub chapter `\\x11 ' ` '>

3.2.12 typesub

Specifies a type substitution. This command allows you to specify what gets output for specific paragraph and font tags. This is the primary way in which you can customize your rc file for your document.

The result of the translation step of the entire translation process is a series of commands, each indicating a state transition, a tag string (such as a paragraph tag), and a data string. You can see this data stream by using the -tran command line option, which tells miftran to stop processing immediately after the translation step and output those results. A sample of this output looks like this:

startpgf Chapter
string Chapter Overview
endpgf Chapter
switchpgf Chapter.Body

In each line, the first item is the state transition name, the second is the tag, and the third is the data string. In the typesub command, you specify the same three pieces of data in the same order. The difference is that, in the typesub command, the third piece of data is typically a format string that determines what gets output. For example, you might include the command

<typesub string Chapter `<h1>%s</h1>'>

which would result in the output string

<h1>Overview</h1>

in your output file, given the above example as input.
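The lookup can be pictured with a small Python model. It is a sketch, not miftran's code: the function names are invented, only the %s control is expanded, and it assumes that a line with no matching typesub produces no output. The table is keyed on the string state, since that is the only line in the sample above that carries a data string.

```python
def parse_tran_line(line):
    """Split a -tran output line into (state, tag, data)."""
    parts = line.split(None, 2)
    state = parts[0]
    tag = parts[1] if len(parts) > 1 else ""
    data = parts[2] if len(parts) > 2 else ""
    return state, tag, data

def expand(typesubs, line):
    """Look up the (state, tag) pair and expand %s in its format string."""
    state, tag, data = parse_tran_line(line)
    fmt = typesubs.get((state, tag))
    if fmt is None:
        return ""  # assumption: unmatched lines produce no output
    return fmt.replace("%s", data)

typesubs = {("string", "Chapter"): "<h1>%s</h1>"}

for line in ["startpgf Chapter", "string Chapter Overview", "endpgf Chapter"]:
    print(repr(expand(typesubs, line)))
```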

You can use a `*' in place of the tag to specify that that typesub should match on all tag types (sorry, no partial patterns are allowed).

The switchpgf state uses a dual paragraph tag of the form fromtag.totag, as in the above example. In this case, you can specify a `*' in place of either tag to indicate that the typesub should match all paragraph tags for that part of the transition. For example, you could specify

<typesub switchpgf `*.Bullet' `<ul>'>

which would output <ul> whenever a paragraph of type Bullet followed a paragraph of any other type.
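The matching rules for `*' can be summarized in a short Python sketch. It assumes that paragraph tags themselves contain no dot, and the function name and its is_switchpgf flag are invented for this example; how miftran chooses between a wildcard typesub and a more specific one is not covered here.

```python
def tag_matches(pattern, tag, is_switchpgf=False):
    """Return True if a typesub tag pattern matches an actual tag."""
    if pattern == "*":
        return True
    if is_switchpgf and "." in pattern and "." in tag:
        # Dual tags are compared half by half; either half may be `*'.
        p_from, p_to = pattern.split(".", 1)
        t_from, t_to = tag.split(".", 1)
        return p_from in ("*", t_from) and p_to in ("*", t_to)
    # No partial patterns: anything else must match exactly.
    return pattern == tag

print(tag_matches("*.Bullet", "Body.Bullet", is_switchpgf=True))
print(tag_matches("Chap*", "Chapter"))
```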

For a list of all of the state names that can be specified in a typesub command, see Section 3.3, `State Names'. For more information about the format string (the third argument to typesub), see Section 3.4, `Format Strings'.

3.3 State Names

The first argument to the typesub command is a state name as output by the tran stage of the translation process. The set of state names is defined in C code. An advanced user can expand the set of state names by editing the C source (see Chapter 4, `Advanced Customization'). The currently defined state names are described in this section.

In the following descriptions, MIF commands in the input file are referred to with angle brackets around them, such as <Char> and <PgfTag>.

These commands are specified in a transition table in miftran. The table is hierarchical. In the description of the state transitions given below, only the last (leaf) command is mentioned, although the complete command hierarchy must appear in the input file. For example, the <XRefSrcText> command mentioned below is actually nested. To be more precise, it should be noted as <Para>/<ParaLine>/<XRef>/<XRefSrcText>. In practice, it is not necessary to know this, since it is how the MIF files are generated anyway. For the advanced user who is examining the C source code, the structure is apparent from the hierarchical definition tables.

Unless otherwise specified, the tag argument is the current paragraph tag, as set by the most recent <PgfTag> command in the input.

3.3.1 aframe

Generated in response to the <AFrame> command in a text flow. The data string is the aframe id number.

3.3.2 aframefile

Generated in response to the <ImportObjFile> command during AFrame definition. The data string is the filename given in the command.

3.3.3 aframeid

Generated in response to the <ID> command during AFrame definition. The data string is the aframe id number.

3.3.4 char

Generated in response to the <Char> command in the MIF input. The data string is the single argument to the <Char> command. Note that some of the more common <Char> commands are translated differently, such as <Char Tab> and <Char HardReturn>.

3.3.5 emdash

Generated in response to a <Char Emdash> command. The data string is blank.

3.3.6 endash

Generated in response to a <Char Endash> command. The data string is blank.

3.3.7 endfile

Generated at the end of the input file. The tag is blank, and the data string is the name of the input file.

3.3.8 endfont

Generated when switching out of a font. The tag is the font tag. The data string is blank.

3.3.9 endpgf

Generated when finishing a paragraph. The tag is for the paragraph being finished. The data string is blank.

3.3.10 hardhyphen

Generated in response to a <Char HardHyphen> command. The data string is blank.

3.3.11 hardreturn

Generated in response to a <Char HardReturn> command. The data string is blank.

3.3.12 hardspace

Generated in response to a <Char HardSpace> command. The data string is blank.

3.3.13 hypertext

Generated in response to a <hypertext> command. A typical hypertext command in a MIF file looks like this:

<hypertext `gotolink target_label'>

The tag is the hypertext command word (such as `gotolink' or `openlink'); the data string is the remainder of the hypertext command.
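In Python terms, the split into tag and data string looks roughly like this. The helper name is made up, and splitting at the first space is an assumption based on the example above.

```python
def split_hypertext(argument):
    """Split a hypertext argument into (command word, remainder)."""
    command, _, rest = argument.strip().partition(" ")
    return command, rest.strip()

print(split_hypertext("gotolink target_label"))
```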

3.3.14 markertext

Generated in response to a <MText> command. The tag is a string version of the marker type from the previous <MType> command. The data string is the text from the <MText> command.

3.3.15 pgfnumstring

Generated in response to a <PgfNumString> command. The data string is the string given in that command.

3.3.16 startfile

Generated at the start of reading the input file. The tag is blank, and the data string is the name of the input file.

3.3.17 startfont

Generated when switching into a font. The tag is the font tag. The data string is blank.

3.3.18 startfontangle

Generated in response to a <FAngle> command. The tag is the font angle name. The data string is blank.

3.3.19 startfontweight

Generated in response to a <FWeight> command. The tag is the font weight name. The data string is blank.

3.3.20 startpgf

Generated when starting a paragraph. The tag is for the paragraph being started. The data string is blank.

3.3.21 startpsfont

Generated in response to a <FPostScriptName> command. The tag is the PostScript font name. The data string is blank.

3.3.22 string

Generated in response to a <String> command. The data string is the string given in that command.

3.3.23 switchpgf

Generated when switching from one paragraph type to another. The tag is a dual paragraph tag of the form fromtag.totag, for example

Chapter.Body

The data string is blank.

You can use an asterisk as a wildcard to specify that this substitution is valid for any paragraph type. For example,

Chapter.*

specifies a transition from a paragraph of type Chapter to any other paragraph type.

3.3.24 tab

Generated in response to a <Char Tab> command. The data string is blank.

3.3.25 textflowtag

Generated in response to a <TFTag> command. The tag is the specified text flow name. The data string is blank.

3.3.26 vardef

Generated in response to a <VariableDef> command. The data string is the variable definition.

3.3.27 varname

Generated in response to a <VariableName> command. The data string is the variable name.

3.3.28 varref

Generated in response to the <VariableName> command referencing a variable in a text flow. The data string is the variable name.

3.3.29 xrefend

Generated in response to an <XRefEnd> command. The tag and data string are both blank.

3.3.30 xreftext

Generated in response to an <XRefSrcText> command. The tag is blank. The data string is the string given in the <XRefSrcText> command.

3.4 Format Strings

The third argument to the typesub command (see Section 3.2.12, `typesub') is a format string. This string controls what is actually written to the output file, and also controls other aspects, such as when to switch output files.

The string (within quotes `') can contain any printing characters, plus backslash-escaped characters (such as \n for a newline) plus % formatting commands. Any printing characters are sent to the current output file, backslash-escaped characters are interpreted and sent on to the output file, and % formatting is processed, which may or may not result in additional data being sent to the output file.

The different parts of the format string are discussed in the following sections.

3.5 Backslash Sequences

A string can be continued on the next line of input by placing a backslash as the last character on the line. The backslash and following newline are not part of the resulting string.

The valid characters which can appear after a backslash (\) are:

The result of putting any other character after a backslash is officially undefined, although the current implementation will output that character.

3.6 Percent Sequences

The % formatting control characters are divided into two classes: those with side effects, and those without. The formatting controls without side effects only output data to the current output file, whereas the ones with side effects can also change miftran's state, for example by switching output files or loading registers. Control characters which have side effects are all upper case, such as %A and %F. Control characters without side effects are all lower case, such as %s and %f.

The actual % format is similar to printf, but with an extension to allow a literal string to be specified. It contains the following parts:

  1. The initial % character.
  2. An optional + or - character.
  3. An optional integer. This number is referred to as N1.
  4. An optional dot and integer. This number is referred to as N2.
  5. An optional double-quote character, followed by a string, followed by a closing double quote character. A double quote character can be included in the string by preceding it with a backslash. A backslash can be included in the string by using a double-backslash. This string is referred to as S.
  6. The format control character.
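The six parts listed above can be picked apart with a regular expression. The following Python sketch is an illustration only, not miftran's own parser; the name parse_percent is invented, a missing N1 is reported as 0 (this chapter states that %r is equivalent to %0r), and a missing N2 or S is reported as None because their defaults are not stated here.

```python
import re

PERCENT = re.compile(
    r'%'                               # 1. the initial % character
    r'(?P<sign>[+-])?'                 # 2. optional + or -
    r'(?P<n1>\d+)?'                    # 3. optional integer N1
    r'(?:\.(?P<n2>\d+))?'              # 4. optional dot and integer N2
    r'(?:"(?P<s>(?:[^"\\]|\\.)*)")?'   # 5. optional double-quoted string S
    r'(?P<ctl>[A-Za-z%])'              # 6. the format control character
)

def parse_percent(text):
    """Parse one % sequence at the start of text into its parts."""
    m = PERCENT.match(text)
    if m is None:
        raise ValueError("not a % sequence: " + repr(text))
    s = m.group("s")
    if s is not None:
        # Undo the two escapes described above: \" and \\.
        s = re.sub(r'\\(["\\])', r'\1', s)
    return {
        "sign": m.group("sign"),
        "n1": int(m.group("n1")) if m.group("n1") else 0,
        "n2": int(m.group("n2")) if m.group("n2") else None,
        "s": s,
        "ctl": m.group("ctl"),
    }

print(parse_percent('%+1.1F'))
print(parse_percent('%1"DOC_TITLE"E'))
```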

Some of the commands access two banks of registers, one for integer data and one for string data. Each bank has 99 registers, numbered from 1 to 99. References to illegal register numbers are quietly ignored. You can change the number of registers by recompiling miftran; see Chapter 4, `Advanced Customization'.

3.7 Format Control Characters

The format control characters are described in the following paragraphs.

3.7.1 %%

The % format control character causes a single % character to be written to the output.

3.7.2 %A

The A format control character directs output to one of the alternate output files. N1 is the value to plug into a %d in the output filename, as specified by the altoutfilename command in the rc file. If a + or - is specified, then N1 is added to the current alternate file number, and the result is used as the new current alternate file number. N2 is the mode: 0 means switch to an already open file; 1 means open with mode w; 2 means open with mode a. By definition, the string %A is equivalent to %+0A.

3.7.3 %E

The E format control character stores the value of the environment variable specified by S into string register N1. For example, the command

<init `%1"DOC_TITLE"E'>

gets the value of the environment variable DOC_TITLE and places it into string register 1 during initialization.

If the + character is specified, then the value of the environment variable is passed to atoi (which converts the string into an integer), and the return value of atoi is stored in integer register N1. If the string is blank (such as if the environment variable is unset), then 0 is stored into integer register N1. If the string is not a valid integer, the results are undefined.
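The two behaviours of %E can be modelled in Python as follows. This is a sketch with an invented function name; it returns the value rather than storing it into a register. Note that where miftran leaves a non-integer value undefined, Python's int() raises ValueError instead.

```python
import os

def env_to_register(name, as_integer, env=os.environ):
    """Return the value that %E would store for environment variable name."""
    value = env.get(name, "")  # an unset variable is treated as blank
    if not as_integer:
        return value           # goes into the string register
    if value.strip() == "":
        return 0               # blank or unset: 0 goes into the integer register
    return int(value)          # raises ValueError where miftran is undefined

print(env_to_register("DOC_TITLE", False, {"DOC_TITLE": "Miftran Users Guide"}))
```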

3.7.4 %F

The F format control character directs output to one of the primary output files. N1 is the value to plug into a %d in the output filename, as specified by the outfilename command in the rc file. If a + or - is specified, then N1 is added to the current primary file number, and the result is used as the new current primary file number. N2 is the mode: 0 means switch to an already open file; 1 means open with mode w; 2 means open with mode a. By definition, the string %F is equivalent to %+0F.
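The way %F arrives at a file name can be sketched like this in Python. The function name and its return shape are invented for illustration, and treating - as subtracting N1 from the current file number is one reading of the description above rather than a confirmed detail.

```python
def select_file(pattern, current, sign, n1, n2):
    """Return (file number, file name, open mode) for a %F sequence."""
    if sign == "+":
        number = current + n1
    elif sign == "-":
        number = current - n1   # assumption: - means subtract N1
    else:
        number = n1             # no sign: N1 is the file number itself
    # N2 is the mode: 0 = already open, 1 = open for write, 2 = append.
    mode = {0: None, 1: "w", 2: "a"}[n2]
    return number, pattern.replace("%d", str(number)), mode

# `%+1.1F' while file 1 is current, with <outfilename `chap%d.html'>
print(select_file("chap%d.html", 1, "+", 1, 1))
```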

3.7.5 %H

The H format control character causes miftran to do a callback to the typesub for `hypertext endanchor' on the next font change. This allows simple rc definitions to handle the hyperlink commands in the MIF file. The standard include file std.rc has definitions to handle these commands. Look for the word `hypertext' in that file to see the definitions.

3.7.6 %L

The L format control character loads literal data into an internal register. N1 specifies the register number. If the + character is specified, then N2 is loaded into integer register N1. If no + character is specified, then S is loaded into string register N1.

3.7.7 %O

The O format control character performs an operation on two registers. If the + character is specified then the operation is performed on the integer registers, else the operation is performed on the string registers. N1 and N2 specify the two registers to use and S specifies the name of the operation to perform. The results are placed into register N1. The operation names are listed in Section 3.8, `Register Operations For Integers', and Section 3.9, `Register Operations For Strings'.

3.7.8 %R

The R format control character loads data into an internal register. N1 specifies the register number. If the + character is specified, then integer data is extracted from the data string and stored into the specified register. N2 is used to determine which integer to extract, and is 0 based. For example, if N2 is 1 and the data string is `a12b25c30', then the number 25 is stored into the specified integer register.

If no + character is specified, the current data string is stored into the specified register.

The data string used as the source for this command is the current string data for the translation being formatted. If N1 is not in the legal range, it is silently ignored. Note that %R is equivalent to %0R and will thus be ignored.
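The integer extraction done by %+R can be illustrated with a short Python sketch. It assumes that an integer is a run of decimal digits and that nothing is stored when the data string holds too few integers; neither detail is confirmed by this guide, and the function name is invented.

```python
import re

def extract_integer(data, n2):
    """Return the n2-th (0-based) integer in data, or None if there is none."""
    numbers = re.findall(r"\d+", data)
    if 0 <= n2 < len(numbers):
        return int(numbers[n2])
    return None

print(extract_integer("a12b25c30", 1))
```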

3.7.9 %S

The S format control character directs the output stream to string register N1. All output goes into that register until redirected by another %S or %A command. This is used, for example, to implement part headings in the standard include file part.rc.

3.7.10 %X

The X format control character executes an extended command. The name of the command is given by S. The list of extended commands is given in Section 3.10, `Extended Commands'.

3.7.11 %f

The f format control character prints the primary output filename to the output. N1 is the value to plug into a %d in the output filename, and is calculated in the same way as for the %F format.

3.7.12 %n

The n format control character outputs a newline if the last character of the associated data is a space. This hack provides a way to output a newline at a safe place in the output file. If the data does not end in a space, it may be in the middle of a word, so it is not safe to output a newline.

3.7.13 %r

The r format control character prints out the contents of the register specified by N1. If this register has never been set, or if N1 is not a valid register number, then nothing is output. Note that %r is equivalent to %0r, and does nothing.

3.7.14 %s

The s format control character prints out the entire data string.

3.8 Register Operations For Integers

This section describes the register operations that can be specified for integer registers in a %O control sequence.

3.8.1 add

The integer add (or +) command adds the value in integer register N2 to the value in integer register N1 and places the result in integer register N1.

3.8.2 subtract

The integer subtract (or -) command subtracts the value in integer register N2 from the value in integer register N1 and places the result in integer register N1.

3.8.3 copy

The integer copy command sets integer register N1 to the value in integer register N2.
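The three integer operations can be modelled in Python with a dictionary standing in for the integer register bank. This is a sketch: the function name is invented, and reading a register that has never been set as 0 and ignoring unknown operation names are assumptions.

```python
NUM_REGISTERS = 99  # registers are numbered 1 to 99

def int_op(registers, op, n1, n2):
    """Apply a %+O operation in place; registers maps number -> int."""
    legal = range(1, NUM_REGISTERS + 1)
    if n1 not in legal or n2 not in legal:
        return  # illegal register numbers are quietly ignored
    a = registers.get(n1, 0)
    b = registers.get(n2, 0)
    if op in ("add", "+"):
        registers[n1] = a + b
    elif op in ("subtract", "-"):
        registers[n1] = a - b
    elif op == "copy":
        registers[n1] = b

regs = {1: 10, 2: 3}
int_op(regs, "add", 1, 2)
print(regs)
```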

3.9 Register Operations For Strings

This section describes the register operations that can be specified for string registers in a %O sequence.

3.9.1 cat

The string cat command places the concatenation of string register N1 followed by string register N2 into string register N1.

3.9.2 copy

The string copy command makes a copy of string register N2 and places it into string register N1.

3.9.3 get

The string get command gets the value of the symbol named by string register N2 and places it into string register N1.

3.9.4 set

The string set command sets the symbol named by string register N2 to the value in string register N1.
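The string operations, including the symbol table used by get and set, can be modelled the same way. In this Python sketch the function name is invented, and treating unset registers and unknown symbols as empty strings is an assumption.

```python
NUM_STRING_REGISTERS = 99  # registers are numbered 1 to 99

def str_op(registers, symbols, op, n1, n2):
    """Apply a %O string operation in place."""
    legal = range(1, NUM_STRING_REGISTERS + 1)
    if n1 not in legal or n2 not in legal:
        return  # illegal register numbers are quietly ignored
    a = registers.get(n1, "")
    b = registers.get(n2, "")
    if op == "cat":
        registers[n1] = a + b
    elif op == "copy":
        registers[n1] = b
    elif op == "get":
        registers[n1] = symbols.get(b, "")  # look up the symbol named by N2
    elif op == "set":
        symbols[b] = a                      # symbol named by N2 gets N1's value

regs = {1: "Part ", 2: "One"}
symbols = {}
str_op(regs, symbols, "cat", 1, 2)
print(regs[1])
```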

3.10 Extended Commands

This section describes the extended commands available by using a %X sequence.

3.10.1 skip

The skip command skips through the MIF input file until the end of the current level is reached. This is useful for skipping disabled conditional text.
