mfirstuc.sty v2.06: uppercasing first letter
Nicola L.C. Talbot
Dickimaw Books
http://www.dickimaw-books.com/
2017-11-14
The mfirstuc package was originally part of the glossaries bundle for use with commands like \Gls, but as the commands provided by mfirstuc may be used without glossaries, the two have been split into separately maintained packages.
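Here are some examples of semantic commands. The first is a one-argument command that encloses its argument in double quotes (or use the csquotes package). With this, applying \makefirstuc to a word inside that command works and produces:
“Word”
whereas placing the quote characters directly in the argument, as in \makefirstuc{``word''}, fails (no case-change and double open quote becomes two single open quotes):
‘‘word”
Once a further semantic command has been defined in the same way, applying \makefirstuc to a word inside it is also possible. This produces
Word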
Define these semantic commands robustly if you intend using any of the commands that fully expand their argument (\emakefirstuc, \ecapitalisewords and \ecapitalisefmtwords).
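As a minimal sketch of what such definitions might look like: the names \qt and \alert below are assumptions for illustration (they are not defined by mfirstuc), and \newrobustcmd comes from the etoolbox package, which is loaded explicitly here.

```latex
\documentclass{article}
\usepackage{xcolor}   % only needed for the hypothetical \alert command
\usepackage{etoolbox} % provides \newrobustcmd
\usepackage{mfirstuc}

% Hypothetical semantic commands (names are illustrative only)
\newrobustcmd*{\qt}[1]{``#1''}                 % quoted material
\newrobustcmd*{\alert}[1]{\textcolor{red}{#1}} % highlighted material

\begin{document}
\makefirstuc{\qt{word}}    % case change is applied inside the argument of \qt

\makefirstuc{\alert{word}} % case change is applied inside the argument of \alert
\end{document}
```

Because both commands take a single argument, \makefirstuc can look inside the group that follows the control sequence instead of trying to upper case the quote characters or the colour specification.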
A simple word can be capitalised just using the standard LaTeX upper casing command. For example, \MakeUppercase word converts the first letter, but for commands like \Gls the word may be embedded within the argument of another command, such as a font changing command. This makes things more complicated for a general purpose solution, so the mfirstuc package provides \makefirstuc{⟨stuff⟩}.
This makes the first object of ⟨stuff⟩ upper case unless ⟨stuff⟩ starts with a control sequence followed by a non-empty group, in which case the first object in the group is converted to upper case. No expansion is performed on the argument.
Examples:
\makefirstuc{{\em abc}} produces ABC (first object is {\em abc} so this is equivalent to \MakeUppercase{\em abc}), and
{\makefirstuc{\em abc}} produces abc (\em doesn’t have an argument therefore first object is \em and so is equivalent to {\MakeUppercase{\em}abc}).
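For comparison, a short sketch of the two straightforward cases covered by the rule above; the sample word is arbitrary.

```latex
\documentclass{article}
\usepackage{mfirstuc}
\begin{document}
\makefirstuc{abc}        % first object is the letter "a": gives Abc

\makefirstuc{\emph{abc}} % control sequence followed by a non-empty group:
                         % the case change is applied inside the group,
                         % giving an emphasised Abc
\end{document}
```

Using a text-block command such as \emph rather than a declaration such as \em avoids both of the problem cases listed above.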
Note that non-Latin or accented characters appearing at the start of the text should be placed in a group (even if you are using the inputenc package). The reason for this restriction is detailed in §4 UTF-8.
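A minimal sketch of the grouping described above, using an accent command; the word itself is just an example.

```latex
\documentclass{article}
\usepackage{mfirstuc}
\begin{document}
% The accented letter is grouped, so the whole group {\'e}
% is the first object passed to the upper casing command.
\makefirstuc{{\'e}lan}
\end{document}
```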
New to version 2.04: There is now limited support for UTF-8 characters with the inputenc package, provided that you load datatool-base (at least v2.24) before mfirstuc (datatool-base is loaded automatically with newer versions of glossaries). If it is available, mfirstuc will now use datatool-base’s \dtl@getfirst@UTFviii command, which is still experimental. See the datatool manual for further details.
(Package ordering is important.)
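A preamble sketch of the load order this implies, assuming a UTF-8 source file processed with inputenc. The fontenc line is an extra assumption, added so that accented glyphs are available in the font.

```latex
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}   % assumption: not required by mfirstuc itself
\usepackage{datatool-base} % v2.24 or later, must come before mfirstuc
\usepackage{mfirstuc}
```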
Note also that, if \abc is defined to expand to abc, then
\makefirstuc{\abc}
produces: ABC. This is because the first object in the argument of \makefirstuc is \abc, so it does \MakeUppercase{\abc}. Whereas:
\expandafter\makefirstuc\expandafter{\abc}
produces: Abc. There is a short cut command which will do this:
\xmakefirstuc{⟨stuff⟩}
This is equivalent to \expandafter\makefirstuc\expandafter{⟨stuff⟩}. So
\xmakefirstuc{\abc}
produces: Abc.
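As from version 1.10, there is now a command, \emakefirstuc, that fully expands the entire argument before applying \makefirstuc.
Examples: if \abc is defined to expand to \xyz a, and \xyz to expand to xyz, then applying \makefirstuc, \xmakefirstuc and \emakefirstuc in turn to \abc
produces: No expansion: XYZA. First object one-level expansion: XYZa. Fully expanded: Xyza.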
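The three levels of expansion can be compared side by side. This is a sketch: the definitions of \abc and \xyz are assumptions chosen to be consistent with the output quoted above.

```latex
\documentclass{article}
\usepackage{mfirstuc}
\newcommand*{\abc}{\xyz a} % assumed definitions for illustration
\newcommand*{\xyz}{xyz}
\begin{document}
No expansion: \makefirstuc{\abc}.
First object one-level expansion: \xmakefirstuc{\abc}.
Fully expanded: \emakefirstuc{\abc}.
\end{document}
```

In the first case the first object is \abc itself, in the second it is \xyz, and only in the third is it the letter x.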
If you use mfirstuc without the glossaries package, the standard \MakeUppercase command is used. If used with glossaries, \MakeTextUppercase (defined by the textcase package) is used instead. If you are using mfirstuc without the glossaries package and want to use \MakeTextUppercase instead, you can redefine \glsmakefirstuc, the command that applies the case change.
For example, redefine \glsmakefirstuc so that it applies \MakeTextUppercase to its argument.
Remember to also load textcase (glossaries loads this automatically).
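A sketch of such a redefinition, assuming \glsmakefirstuc takes a single argument and that its default definition simply places \MakeUppercase in front of that argument:

```latex
\usepackage{textcase} % provides \MakeTextUppercase
\usepackage{mfirstuc}
% No braces around #1, so only the first object is converted
\renewcommand{\glsmakefirstuc}[1]{\MakeTextUppercase #1}
```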
New to mfirstuc v1.06: \capitalisewords{⟨text⟩}
This command applies \makefirstuc to each word in ⟨text⟩ where the space character is used as the word separator. Note that it has to be a plain space character, not another form of space, such as ~ or \space. Note that no expansion is performed on ⟨text⟩. See §3.2 Excluding Words From Case-Changing for excluding words (such as “of”) from the case-changing.
The actual capitalisation of each word is done using \MFUcapword{⟨word⟩} (new to version 2.03).
This just does \makefirstuc{⟨word⟩} by default, but its behaviour is determined by the conditional \ifMFUhyphen.
If you want to title case each part of a compound word containing hyphens, you can enable this using \MFUhyphentrue
or switch it back off again using \MFUhyphenfalse.
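Compare \capitalisewords{server-side includes},
which produces:
Server-side Includes
with \MFUhyphentrue \capitalisewords{server-side includes},
which produces:
Server-Side Includes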
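Since \MFUhyphentrue sets an ordinary TeX conditional, it can be limited to part of a document by placing it in a group. A sketch, assuming the default setting is off:

```latex
\documentclass{article}
\usepackage{mfirstuc}
\begin{document}
\capitalisewords{server-side includes} % default: Server-side Includes

{\MFUhyphentrue % the switch is set locally inside this group
 \capitalisewords{server-side includes}} % Server-Side Includes

\capitalisewords{server-side includes} % back to the default outside the group
\end{document}
```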
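Formatting for the entire phrase must go outside \capitalisewords (unlike \makefirstuc). Compare placing a text-block command inside the argument, as in \capitalisewords{\textbf{a sample phrase}},
which produces:
A sample phrase
with placing it outside, as in \textbf{\capitalisewords{a sample phrase}},
which produces:
A Sample Phrase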
As from version 2.03, there is now a command for phrases that may include a formatting command: \capitalisefmtwords{⟨phrase⟩},
where ⟨phrase⟩ may be just words (as with \capitalisewords) or may be entirely enclosed in a formatting command in the form ⟨cs⟩{⟨phrase⟩}.
The command also has a starred form, \capitalisefmtwords*, which only permits a text-block command at the start of the phrase.
Examples:
produces:
A Small Book Of Rhyme
produces:
A Small Book Of Rhyme
produces:
A Small Book Of Rhyme
produces:
A Small Book of Rhyme
produces:
A Small Book of Rhyme
produces:
A Small Book of Rhyme
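A sketch of input in the forms described above. The choice of \textbf and \emph is an assumption, and \MFUnocap is the exclusion command covered in the section on excluding words from case-changing; the phrases are chosen to be consistent with the output listed above.

```latex
\documentclass{article}
\usepackage{mfirstuc}
\begin{document}
% Phrase entirely enclosed in a text-block command
\capitalisefmtwords{\textbf{a small book of rhyme}}

% Exclude "of" from the case change (local to this group)
{\MFUnocap{of}%
 \capitalisefmtwords{\textbf{a small book of rhyme}}}

% Starred form: text-block command only at the start of the phrase
\capitalisefmtwords*{\emph{a small book of rhyme}}
\end{document}
```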
If there is a text-block command within the argument of the starred form, it’s assumed to be at the start of the argument. Unexpected results can occur if there are other commands. For example
produces:
A Small Book Of rhyme
produces:
A Very small Book of Rhyme
Grouping causes interference:
produces:
A Small book Of Rhyme
produces:
a Small book Of Rhyme
Avoid complicated commands in the unstarred version. For example, the following breaks:
However it works okay with the starred form and the simpler \capitalisewords:
Produces:
A okBo Of Rhyme
A okBo Of Rhyme
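A sketch of the kind of command that causes this, assuming a hypothetical two-argument \swap that typesets its arguments in reverse order. The name and definition are assumptions, chosen to be consistent with the output above.

```latex
\documentclass{article}
\usepackage{mfirstuc}
% Hypothetical command: typesets its two arguments in reverse order
\newcommand*{\swap}[2]{{#2}{#1}}
\begin{document}
% \capitalisefmtwords{a \swap{bo}{ok} of rhyme} % unstarred form: breaks

\capitalisefmtwords*{a \swap{bo}{ok} of rhyme}

\capitalisewords{a \swap{bo}{ok} of rhyme}
\end{document}
```

In the two working cases the case change is applied to the first object of the first argument of \swap, which is why the capital letter ends up in the middle of the typeset word.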
\xcapitalisewords{⟨text⟩} is a short cut for \expandafter\capitalisewords\expandafter{⟨text⟩}.
As from version 1.10, there is now a command, \ecapitalisewords{⟨text⟩}, that fully expands the entire argument before applying \capitalisewords.
There are also similar shortcut commands for the version that allows text-block commands: \xcapitalisefmtwords{⟨text⟩} and its starred form.
The unstarred version is a short cut for \expandafter\capitalisefmtwords\expandafter{⟨text⟩}. Similarly the starred version of \xcapitalisefmtwords uses the starred version of \capitalisefmtwords.
For full expansion, use \ecapitalisefmtwords{⟨text⟩}.
Take care with this as it may expand non-robust semantic commands to replacement text that breaks the functioning of \capitalisefmtwords. Use robust semantic commands where possible. Again this has a starred version that uses the starred form of \capitalisefmtwords.
Examples: applying \capitalisewords, \xcapitalisewords and \ecapitalisewords in turn to the same command
produces:
No expansion: ONE TWO THREE FOUR FIVE.
First object one-level expansion: ONE TWO THREE four Five.
Fully expanded: One Two Three Four Five.
(Remember that the spaces need to be explicit. In the second case above, using \xcapitalisewords, the space before “four” has been hidden within \space so it’s not recognised as a word boundary, but in the third case, \space has been expanded to an actual space character.)
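A sketch of input that gives this behaviour. The definitions of \abc and \xyz are assumptions chosen to be consistent with the output above; note the \space hidden inside \abc.

```latex
\documentclass{article}
\usepackage{mfirstuc}
\newcommand*{\abc}{\xyz\space four five} % assumed definitions
\newcommand*{\xyz}{one two three}
\begin{document}
No expansion: \capitalisewords{\abc}.

First object one-level expansion: \xcapitalisewords{\abc}.

Fully expanded: \ecapitalisewords{\abc}.
\end{document}
```

After one level of expansion the only explicit space is the one before “five”, so the first word seen by \xcapitalisewords runs from \xyz up to “four”.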
Examples:
produces: A Book Of Rhyme.
produces: A Book of Rhyme.
produces: A BOOK OF RHYME. (No expansion is performed on \mytitle.) Compare with next example:
produces: A Book of Rhyme.
However
produces: A Book Of Rhyme. (\space has been expanded to an actual space character.)
produces:
A okBo Of Rhyme
A OKbo Of Rhyme
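For the \mytitle case above, a sketch of the difference between no expansion and one-level expansion. The definition of \mytitle and the use of \MFUnocap{of} are assumptions chosen to be consistent with the output listed.

```latex
\documentclass{article}
\usepackage{mfirstuc}
\newcommand*{\mytitle}{a book of rhyme} % assumed definition
\begin{document}
\MFUnocap{of}

\capitalisefmtwords{\mytitle}  % no expansion: \mytitle is treated as one object

\xcapitalisefmtwords{\mytitle} % \mytitle is expanded once before the words are split
\end{document}
```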
If you want to provide an alternative for the PDF bookmark, you can use hyperref’s \texorpdfstring command. For example:
Alternatively, you can use hyperref’s mechanism for disabling commands within the bookmarks. For example:
See the hyperref manual for further details.
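A sketch showing both approaches in one document. The heading text is only an example, and the redefinition inside \pdfstringdefDisableCommands is one possible choice: it makes \capitalisewords return its argument unchanged when the bookmark string is built.

```latex
\documentclass{article}
\usepackage{mfirstuc}
\usepackage{hyperref}
% Alternative: make \capitalisewords just return its argument in bookmarks
\pdfstringdefDisableCommands{%
  \def\capitalisewords#1{#1}%
}
\begin{document}
% Explicit bookmark text for a single heading
\section{\texorpdfstring{\capitalisewords{a book of rhyme}}{A Book of Rhyme}}
\end{document}
```

With \texorpdfstring the bookmark text is given per heading, whereas the \pdfstringdefDisableCommands setting applies to every bookmark but leaves the text in its original case.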
As from v1.09, you can specify words which shouldn’t be capitalised unless they occur at the start of ⟨text⟩ using \MFUnocap{⟨word⟩}.
This only has a local effect. The global version is \gMFUnocap{⟨word⟩}.
For example:
produces:
The Wind In The Willows
The Wind in the Willows
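The two lines of output above are consistent with input along these lines (a sketch; the exact input is an assumption):

```latex
\documentclass{article}
\usepackage{mfirstuc}
\begin{document}
\capitalisewords{the wind in the willows}

\MFUnocap{in}%
\MFUnocap{the}%

\capitalisewords{the wind in the willows}
\end{document}
```

The leading “the” is still capitalised in the second case because excluded words are only left alone when they are not at the start of the text.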
You can also simply place an empty group in front of a word if you don’t want that specific instance to be capitalised. For example, \capitalisewords{the {}wind in the willows}
produces:
The wind In The Willows
produces
This Is Section Excluding Words From Case-Changing.
The package mfirstuc-english loads mfirstuc and uses \MFUnocap to add common English articles and conjunctions, such as “a”, “an”, “and”, “but”. You may want to add other words to this list, such as prepositions but, as there’s some dispute over whether prepositions should be capitalised, I don’t intend to add them to this package.
If you want to write a similar package for another language, all you need to do is create a file with the extension .sty that starts with \NeedsTeXFormat{LaTeX2e}.
The next line should identify the package. For example, if you have called the file mfirstuc-french.sty then you need \ProvidesPackage{mfirstuc-french}.
It’s a good idea to also add a version in the final optional argument, in the form [⟨date⟩ ⟨version⟩ ⟨description⟩].
Next load mfirstuc with \RequirePackage{mfirstuc}.
Now add all your \MFUnocap commands, one for each word that should be excluded.
At the end of the file add \endinput.
Put the file somewhere on TeX’s path, and now you can use this package in your document. You might also consider uploading it to CTAN in case other users find it useful.
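Put together, such a file might look like the following sketch. The date, version string and word list are placeholders for illustration, not a complete or authoritative list of French words.

```latex
% mfirstuc-french.sty (sketch; date, version and word list are placeholders)
\NeedsTeXFormat{LaTeX2e}
\ProvidesPackage{mfirstuc-french}[2017/11/14 v0.1 (example) French no-cap words]
\RequirePackage{mfirstuc}
% Illustrative entries only
\MFUnocap{de}
\MFUnocap{la}
\MFUnocap{le}
\MFUnocap{et}
\endinput
```

A document would then load it with \usepackage{mfirstuc-french} instead of loading mfirstuc directly.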
The \makefirstuc command works by utilizing the fact that, in most cases, TeX doesn’t require a regular argument to be enclosed in braces if it only consists of a single token. (This is why you can do, say, \frac12 instead of \frac{1}{2} or x^2 instead of x^{2}, although some users frown on this practice.)
A simplistic version of the \makefirstuc command is \FirstUC, a one-argument command whose definition is \MakeUppercase #1 (with no braces around #1).
Here
\FirstUC{abc}
is equivalent to
\MakeUppercase abc
and since \MakeUppercase requires an argument, it grabs the first token (the character “a” in this case) and uses that as the argument so that the result is: Abc.
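As a complete document, this simplified illustration (which is not the actual definition of \makefirstuc) can be sketched as:

```latex
\documentclass{article}
% Simplified illustration only; this is not how mfirstuc defines \makefirstuc
\newcommand*{\FirstUC}[1]{\MakeUppercase #1}
\begin{document}
\FirstUC{abc} % same as \MakeUppercase abc, so only "a" is converted: Abc
\end{document}
```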
The glossaries package needs to take into account the fact that the text may be contained in the argument of a formatting command, such as \acronymfont, so \makefirstuc has to be more complicated than the trivial \FirstUC shown above, but at its basic level, \makefirstuc uses this same method and is the reason why, in most cases, you don’t need to enclose the first character in braces. So if
Try the following document:
This will result in the error:
This is why \makefirstuc{ãbc} won’t work. It will only work if the character ã is placed inside a group.
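A sketch of a test document of the kind described above, assuming pdfLaTeX with inputenc and a UTF-8 encoded source file. The failing line is commented out so that the file compiles.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\begin{document}
% Fails: \MakeUppercase takes only the first octet of the two-octet character
% \MakeUppercase ãbc

% Works: the group keeps both octets together
\MakeUppercase{ã}bc
\end{document}
```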
This error occurs because TeX was written before Unicode was invented. Although ã may look like a single character in your text editor, from TeX’s point of view it’s two tokens. So when \MakeUppercase grabs a single token as its argument, it takes only the first of the two and separates it from the second.
Note that XeTeX (and therefore XeLaTeX) is a modern implementation of TeX designed to work with Unicode and therefore doesn’t suffer from this drawback. Now let’s look at the XeLaTeX equivalent of the above example:
This works correctly when compiled with XeLaTeX. This means that \makefirstuc{ãbc} will work provided you use XeLaTeX and the fontspec package.
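A sketch of the XeLaTeX version, assuming a UTF-8 source file and the default fontspec font setup:

```latex
% Compile with XeLaTeX
\documentclass{article}
\usepackage{fontspec}
\usepackage{mfirstuc}
\begin{document}
\MakeUppercase ãbc

\makefirstuc{ãbc}
\end{document}
```

Here ã reaches TeX as a single token, so no grouping is needed in either line.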
Version 2.24 of datatool-base added the command \dtl@getfirst@UTFviii which attempts to grab both octets. If this command has been defined, mfirstuc will use it when it tries to split the first character from the rest of the word. See the datatool documented code for further details.