html
head
title [Why (and how) I am using XML and MathML]
author [Richard Kaye, School of Mathematics, University of Birmingham]
contributor ~http://web.mat.bham.ac.uk/R.W.Kaye/
date [2006-01-17]
keywords [MathML XML mathematics publishing gloss]
body {
p [This article is a follow-up to that by Peter Rowlett
(MSOR Connections Nov 2005, vol 5 no 4, 25–6). I give
some personal reasons to add to Rowlett's comments on
why mathematicians should be using XML and MathML much more,
and outline my own attempts to make XML/MathML publishing
accessible to the mathematics community.]
section {
title[Why]
p [Mathematicians have been served well by TeX and LaTeX for their
mathematical typesetting. Too well, perhaps. At least, if a
dedicated TeXnician of the last ten years has a chance to \\relax
and look about himself he will see that the rest of the world
has moved on in several ways that are incompatible with the cosy world of
TeX.]
p [We probably were drawn to TeX in the first place because for the first time
it provided all the mathematical characters we needed in a series of 7-bit
fonts. But nowadays there seems to be a consensus on how characters should be
encoded in computer documents. The Unicode standard is out there and
increasingly being used to provide standardised encodings of characters beyond
the basic ASCII character set, and sadly Unicode is incompatible with TeX's
7-bit kludge. Search engines, like Google, do remarkably well in indexing PDF
and PS files, but they would do even better on an en-unicoded (or is that
uni-encoded?) HTML web page, especially if it involves special mathematical
characters or accented characters.]
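p [As a small illustration of the point about standardised encodings, the
following Python sketch (not part of the original article; the helper name
describe is an assumption made for this example) looks up the Unicode code
point and official character name of a few mathematical symbols, together
with the bytes each one occupies in UTF-8.]

```python
import unicodedata


def describe(character):
    """Return (code point, Unicode name, UTF-8 bytes in hex) for one character."""
    code_point = "U+{:04X}".format(ord(character))
    name = unicodedata.name(character)
    utf8_hex = character.encode("utf-8").hex()
    return code_point, name, utf8_hex


if __name__ == "__main__":
    # Escapes keep this source file plain ASCII: summation sign,
    # infinity sign and horizontal ellipsis.
    for symbol in ("\u2211", "\u221e", "\u2026"):
        print(symbol, *describe(symbol))
```

p [Because each symbol has one agreed code point, a search engine or a screen
reader can recognise it without knowing anything about a particular font.]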
p [And then there is future-proofing. XML is just a data format for whatever
data you happen to have, but comes with many standard tools for its display
and transformation into other formats, such as CSS and XSLT. CSS enables web
browsers to display XML, and XSLT is a sort of [q[macro processor]] for
transformation. The other good news is that XML has a system of
[q[namespaces]] that ensures that mark-up elements (roughly equivalent to TeX's
[q[macros]]) from different sources with the same name do not clash with each
other, a problem I for one have encountered with TeX as I loaded up yet another
macro-package only to find it breaks an existing one because of a
name-conflict. So even if MathML turns out [i[not]] to be the flavour of, say, the
2030s, then there will be ways to convert our documents into the new format
and preserve the intended meaning. And it is [q[intended meaning]] that
prevents any good way of converting (La)TeX to XHTML: (La)TeX is simply not
rich enough to present the meaning of subexpressions in a mathematical formula
to allow translation to MathML, let alone to a computer algebra system or
other such program.]
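p [To make the remark about namespaces concrete, here is a minimal Python sketch
using the standard library XML parser. It is an illustration only: the
lecture-notes namespace URI and the function name titles_by_namespace are
hypothetical, while the XHTML and MathML namespace URIs are the standard W3C
ones. Two elements are both called title, yet they never clash because each
carries its own namespace.]

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
MATHML = "http://www.w3.org/1998/Math/MathML"
NOTES = "urn:example:lecture-notes"  # hypothetical vocabulary, for illustration only

DOC = f"""<html xmlns="{XHTML}" xmlns:m="{MATHML}" xmlns:n="{NOTES}">
  <head><title>Sequences and Series</title></head>
  <body>
    <n:title>Theorem 4.23</n:title>
    <p>The term
      <m:math><m:mfrac><m:mn>1</m:mn><m:mi>n</m:mi></m:mfrac></m:math>
      tends to zero.</p>
  </body>
</html>"""


def titles_by_namespace(xml_text):
    """Map namespace URI -> text of each element whose local name is 'title'."""
    root = ET.fromstring(xml_text)
    found = {}
    for element in root.iter():
        # ElementTree spells a qualified name as "{namespace-uri}local-name".
        if element.tag.startswith("{"):
            uri, local = element.tag[1:].split("}", 1)
        else:
            uri, local = "", element.tag
        if local == "title":
            found[uri] = element.text
    return found


if __name__ == "__main__":
    for uri, text in titles_by_namespace(DOC).items():
        print(uri, "->", text)
```

p [The parser keeps the two title elements apart by their namespace URIs, which
is exactly the protection that a TeX macro name does not have.]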
p [I could go on. For example, I believe strongly that, in the academic community, all
of us have a duty to make our documents as widely accessible as possible to
all in the world, irrespective of language or disability. Paper-based PDF
documents are not the way to go here, though they may still be the final
medium chosen for printing the documents out for the majority of us.
(I recently had an email from a prospective student who is blind. He wanted
to know how many lecturers here were using MathML, as he had software that could
read such documents out loud. Failing that, he was just able to [q[read]]
TeX source files, but PDF documents were quite impossible.)]
p [Academics must stick to standards where they exist, to enable global searches,
automatic translations, or other automatic transformations to aural or other
formats wherever possible. The nature of our subject is not to pre-judge the future.
And if, by sensible use of technology, our documents reach a wider readership, then
that is good for us too.]
p [I first heard about XML and MathML over 10 years ago. One puzzle is, given
that it is so much superior to TeX, why are so few people in the mathematics
community using it? One reason is that only recently have [q[mainstream]]
browsers such as Firefox ([uri[http://www.mozilla.com/firefox/]])
been able to display MathML. And why has that taken
so long? Well, in part it has to do with Microsoft's battles with Netscape
over the browser market, but mostly it is because we, the mathematical
community, have not seen the need for it. So to help promote MathML, as well
as for the other reasons given above, I have started to write my lecture notes for the
new module I will be teaching in the new year in XHTML+MathML, and direct my
students to those places on the internet where they can find a suitable browser and
the fonts required, rather than providing PDF [q[translations]] for them.]
p [You can see the results of this work at [uri[http://web.mat.bham.ac.uk/R.W.Kaye/seqser/]]
for yourself. This address gives the home page for my module on Introductory
Real Analysis (Sequences and Series). The home page itself contains no
mathematics, but towards the bottom it links to exercise sheets and lecture
notes. As I write, there are five such pages linked in. By the time you read
this there should be many more.]
p [If you do decide you want to look at these pages and you haven't looked at MathML
before with a web browser, you will need to ensure you have the correct software on your
computer. In my view, the best browser for mathematics is Mozilla Firefox
([uri[http://www.mozilla.com/firefox/]]). The main
reason is that it can display web pages with mathematics directly without having to make
last-minute behind-the-scenes translations. Unfortunately, mathematics will not display
properly unless additional fonts are installed, and the web page at
[uri[http://www.mozilla.org/projects/mathml/fonts/]] details what is
needed here. (A future version of Firefox with all the required fonts included is
promised some time in 2006.) On my MS-Windows machine, font
installation went smoothly. It was a little more
awkward on my Linux machines. For people using MS Internet Explorer, mathematics
support has improved considerably since I last looked at it a year and a half ago
with MathPlayer 2.0 ([uri[http://www.dessci.com/en/products/mathplayer/]]). My main
gripes with this setup are: (1) the way MathPlayer has to make a transformation
of the source document before it is displayed (so [q[view > page source]] [i[doesn't]]
show the source but in fact shows an intermediate form); and (2)
the fact that Internet Explorer does not seem to be fully XHTML-compliant yet;
in particular, it cannot display the XML-standard entity [q[&apos;]].]
}
section {
title[How]
p [The main disadvantage of XML, and of MathML in particular, is how verbose it
is. It was never designed for direct entry from a text editor, in the way
LaTeX is commonly typed, or HTML can be. It seems that the W3C (an independent
consortium who publish the web standards, including XML and MathML) never
expected XML or MathML to be typed in directly. Instead, they rather expect
XML authors to use specialised XML editors with drop-down menus using the
mouse presenting a [q[palette]] of options available in that context, rather
similar to the equation editor in Word. There are such editors available, and
one or two free ones that work across several platforms, including the
equation editor in OpenOffice (which can export MathML) and
W3C's own Amaya ([uri[http://www.w3.org/Amaya/]])
browser/editor (which can author XHTML with embedded MathML directly).]
p [I for one find mouse-based editing
tedious in the extreme as it can be very slow to use and rather limiting
in that only the combinations available in the palettes can be used.
In principle, it would be possible to input a special vocabulary of
TeX and allow TeX itself to convert this text to
XML, and this approach has certainly been advocated.
I decided to experiment with a more flexible approach and wrote
a Java program called [i[gloss]] to convert a text file into XML: the
conversion process itself is controlled by an XML file called a [q[modular
vocabulary]] ([q[MV]]) and by writing different MVs it is in principle
possible to convert other types of text file to XML. The main
application at present is authoring XHTML and MathML by writing plain
text in a text editor and converting to XML with [i[gloss]].
[i[Gloss]] is still in early days—any information that is
available can be accessed via my home page at [uri[http://web.mat.bham.ac.uk/R.W.Kaye/]].
The subject of text-input with [i[gloss]] is too large and still too experimental
for this article, but my experience is that it really can provide a format in which
mathematical text can be typed in a text editor as quickly as LaTeX can be, and
the source file is at least as legible as LaTeX is. The processing stage is slightly
slower than LaTeX, but still only a matter of a couple of seconds or so for
a typical document. This document has been typed using emacs and converted
using [i[gloss]], and will be made available on the web via my
[a @href[http://web.mat.bham.ac.uk/R.W.Kaye/][Home Page]].]
; para {
p [As I have mentioned, the notes I have been writing have been on Sequences
and Series for First Years. So I have had to be able to write statements like]
math
series {mrow n = 1} infin {mfrac 1 {mrow n sup 2}}
= {mfrac 1 1} + {mfrac 1 4} + {mfrac 1 9} + {mfrac 1 16} + hellip
lt {mfrac 1 1} + {mfrac 1 2} + {mfrac 1 6} + {mfrac 1 12} + hellip
= 1 + series { mrow n = 2 } infin { mfrac 1 { mrow n (n - 1) } }
.
p [In MathML this is stored as]
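pre !CDATA[<math xmlns="http://www.w3.org/1998/Math/MathML">
<munderover><mo>∑</mo>
<mrow><mi>n</mi><mo>=</mo><mn>1</mn></mrow>
<mi>∞</mi></munderover>
<mfrac><mn>1</mn>
<mrow><msup><mi>n</mi><mn>2</mn></msup></mrow></mfrac>
<mo>=</mo>
<mfrac><mn>1</mn><mn>1</mn></mfrac>
<mo>+</mo>
<mfrac><mn>1</mn><mn>4</mn></mfrac>
<mo>+</mo>
<mfrac><mn>1</mn><mn>9</mn></mfrac>
<mo>+</mo>
<mfrac><mn>1</mn><mn>16</mn></mfrac>
<mo>+</mo>
<mo>…</mo>
<mo>&lt;</mo>
<mfrac><mn>1</mn><mn>1</mn></mfrac>
<mo>+</mo>
<mfrac><mn>1</mn><mn>2</mn></mfrac>
<mo>+</mo>
<mfrac><mn>1</mn><mn>6</mn></mfrac>
<mo>+</mo>
<mfrac><mn>1</mn><mn>12</mn></mfrac>
<mo>+</mo>
<mo>…</mo>
<mo>=</mo>
<mn>1</mn>
<mo>+</mo>
<munderover><mo>∑</mo>
<mrow><mi>n</mi><mo>=</mo><mn>2</mn></mrow>
<mi>∞</mi></munderover>
<mfrac><mn>1</mn>
<mrow><mi>n</mi>
<mo>(</mo>
<mi>n</mi><mo>-</mo><mn>1</mn>
<mo>)</mo></mrow></mfrac>
<mo>.</mo>
</math>
]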
p [whereas in the [i[gloss]] system I have been using this is entered as]
pre !CDATA[math
series \{mrow n = 1\} infin \{mfrac 1 \{mrow n sup 2\}\}
= \{mfrac 1 1\} + \{mfrac 1 4\} + \{mfrac 1 9\} + \{mfrac 1 16\} + hellip
< \{mfrac 1 1\} + \{mfrac 1 2\} + \{mfrac 1 6\} + \{mfrac 1 12\} + hellip
= 1 + series \{mrow n = 2\} infin \{mfrac 1 \{mrow n (n - 1)\}\}
]
p [You can probably make reasonable guesses as to how the first is a translation
of the second. Suffice it to say that the typing can be done in a reasonable
amount of time.]
; };para
p [One of the joys of being able to write students' lecture notes as web pages
is the extra facility that hyperlinks provide. For example, in the
very first lecture I discussed how difficult it is to tell from
numerical experiments whether
[math series { mrow n = 1 } infin { mfrac 1 { mrow n sup 2 } }] and
[math series { mrow n = 1 } infin { mfrac 1 n }] converge. In a web page
I could just link the computer program and its output as hyperlinks
for only those students who are curious enough to see them, thus having
just the essential material on the main page. Similarly, rather than just
quoting [q[Theorem 4.23]] and expecting the students to find it in their
notes I can use a hyperlink.]
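p [A numerical experiment of the kind described above might look like the
following Python sketch. It is not the program linked from the lecture notes;
the function names and the choice of cut-offs are assumptions made for this
illustration. It tabulates partial sums of the two series at increasing
numbers of terms.]

```python
def partial_sum(term, n_terms):
    """Return term(1) + term(2) + ... + term(n_terms)."""
    return sum(term(n) for n in range(1, n_terms + 1))


def inverse_square(n):
    """General term 1/n^2 of the first series."""
    return 1.0 / (n * n)


def reciprocal(n):
    """General term 1/n of the second series."""
    return 1.0 / n


def experiment(cutoffs=(10, 100, 1000, 10000, 100000)):
    """Return rows (N, partial sum of 1/n^2, partial sum of 1/n)."""
    return [
        (n_terms,
         partial_sum(inverse_square, n_terms),
         partial_sum(reciprocal, n_terms))
        for n_terms in cutoffs
    ]


if __name__ == "__main__":
    print("      N   sum 1/n^2     sum 1/n")
    for n_terms, squares, reciprocals in experiment():
        print(f"{n_terms:>7}  {squares:>10.6f}  {reciprocals:>10.6f}")
```

p [Both columns keep growing at every cut-off. The second grows by roughly 2.3
each time the number of terms is multiplied by ten, but nothing in a finite
table can prove that one series converges and the other does not, which is the
difficulty the lecture set out to show.]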
p [So far, all my MathML experiments have been with presentation-MathML, which focuses
on the presentational aspects of mathematics. There is a parallel form of
MathML, Content-MathML, that focuses on meanings rather than presentation.
In time I do want to look at such semantic aspects. However, I am rather taken
with OpenMath ([uri[http://www.openmath.org/]]), a rather elegant and much more
flexible XML mathematics format that concentrates on semantic aspects only and
which can be used in conjunction with presentation-MathML, and expect to be looking
into this rather more in the future. However, whatever the future of my own experiments
with [i[gloss]], MathML and OpenMath, the documents I will be writing in the near future
will conform to existing standards and can be viewed, transformed, saved and edited
in many readily-available editors.]
} ; section
hr
section {
title[Afterword: publication details]
p [The above article appeared in
MSOR Connections Vol 6 No 1 Feb 2006, 20–22. The only changes I
have made are to update some of the web links from my experimental
server at mat140.bham.ac.uk to the main School of Mathematics server at
web.mat.bham.ac.uk. The article was written using an early version of
my [q[gloss]] system and the source code, as well as translations into
other formats, and any other information on the article, should be
available in [a @href[.][this directory]].]
p [
[b [Richard Kaye]][br]
[i [School of Mathematics]][br]
[i [University of Birmingham] ][br]
[tt [[a @href[http://web.mat.bham.ac.uk/R.W.Kaye/][http://web.mat.bham.ac.uk/R.W.Kaye/]]]]]
};section
} ; body