WYSIWYG Editors versus Markup Editors
Posted by RobBrown on Thursday December 20, @08:09PM
from the what-you-read-is-what-you-need-to-know dept.
First, I've used word processors (WYSIWYG and otherwise) from Wordstar on. I used one of the first "true" WYSIWYG PC editors capable of doing math and equations (T3). I've used most of the versions of Word (starting with what I think was version 1) in passing at least, Word Perfect, and around a half dozen others in various incarnations (PC-Write -- anybody remember that? -- StarOffice, Abiword, Applixware and a couple of PC transients whose names I can no longer recall).

I've taught secretaries to use at least a couple of these tools, taught my kids, used them for personal work, to write physics papers, to write books and stories and poetry and papers, both in physics and other fields. I'm just reciting this litany to communicate that I really am not mindlessly biased against WYSIWYG editors out of ignorance or a lack of expertise. I've been using them as long as at least some of the likely readers of this document have been alive. If my life relies on it, I can still hack out a decently formatted document using Word or Word Perfect or etc. No surprise there -- "anybody can", right?

Second, I've also used text (mostly or only) typing editors for far longer, starting with a typewriter and a variety of erasers -- liquid paper was a godsend, I remember, once upon a time long before many of you were born. [Parenthetically: I suspect that very few of those reading these words have any concept of how clumsy (type)"written" communication was back in the pre-1982 days. Typing and editing this response would once have taken me several days of work (especially if I count the number of times I back up, change a word, and zip right along and review the whole damn article and add a line, edit a line, delete a line, add a smiley:-). Heck -- it's taken me four or five hours even with an editor!] Just having a simple, fast, full screen, line-wrapping text editor is a true blessing. The first editors I used were line oriented: the QED editor in TSO/MVS, for example, which was so bad I wrote a whole front end in BASICA on my first PC to transform it into a virtual fullscreen editor. The PC-DOS edlin (IIRC the name) was a joke -- sort of a crippled unix ed.

Shortly thereafter things began to improve and "real" text-mostly or text-only editors appeared. PC-Write was actually not too bad a text editor and I used it mostly for this purpose. In fact, I vaguely remember using it and one other tool to write most of the actual C code I wrote for/on the PC. Naturally, I've also used a variety of text editors on Unixoid systems over the last fifteen years or so -- vi (yuk!), emacs, jove (which is my personal favorite, which proves that I'm a bit iconoclastic even in the realm of Unix editors). I also started using PC-TeX in roughly 1985 or 1986, switched over to LaTeX once I learned that in the realm of typesetting LaTeX is to TeX as a compiler is to an assembler (a crude analogy, but adequate) shortly after writing one or two papers the "hard way".

Third, I would guess that I write more (on average) than any three Sethdot readers put together. My current writing includes a wide range of material from formal letters (of recommendation, for example), email traffic (LOTS of email traffic), poetry, presentations, stories, physics papers, quizzes and exams, and even a book or article or two. I haven't done a keystroke count (I probably should just for grins) but I seriously guesstimate an average output of somewhere between five and ten single space typed pages a day.

Hopefully this will establish a reasonable set of credentials -- I write a lot by anyone's standards and am quite familiar with a wide variety of editing and text processing tools.

My own opinion on the WYSIWYG vs Markup discussion is as follows:

1) 95% of anybody's needs are satisfied by any decent text editor. The miracle of computers with respect to writing isn't, really, that they gave to each human writer the electronic equivalent of a typesetting shop that would have cost (twenty years ago) several million dollars and given full time employment to a trained staff of ten or twenty people. It is that one can type one's compositions (on the probably flawed but what can you do QWERTY keyboard:-) without taking as long to fix a simple typographical error or make a simple editorial revision as it would take you to type five or ten more lines of text (and leave a nasty smudge, a hole in your paper, a patch of crusty dried paper, or worse, behind in your work).

I'm tempted to give the youngsters on the list a historical overview of the "old days" when a sipmle tranpsosition of hcaracters on a typed line cost one around a MINUTE of work. If you were lucky. In words, it cost one 40 to 60 words of productive output and destroyed the flow of creative output. Now any text editing tool allows simple transposition of characters to be corrected with a few keystrokes -- the "cost" is no more than four or five words of streaming output. Editorial revision is similarly blindingly simple and many, many times faster.

Note that one doesn't need EITHER a word processor OR markup to do a gangbusters job of communicating with text. I can write anything from poetry:

  Word processing apps:
    Miracle and burden both
      As tools often are

to formatted formal letters with nothing but text. I certainly use text almost exclusively in email for the simple reason that folks that send formatted documents, whether they are in html, sgml, xml, doc, or any other of the various open or proprietary ways of encapsulating text with formatting instructions, will both annoy and fail to communicate with a large segment of the folks that receive them. Text is the lowest common denominator of the written word, sure, but it is a damn high lowest common denominator! We rarely need anything more, no matter how much we think we do.

2) I will propose, for the sake of argument, that 5% of an author's effort is spent on the actual format and/or layout of the document in question. In reality the actual time consumed by this is, or should be, much smaller than 5% of the document creation time. The 5% figure thus is weighted, somewhat, with its relative importance and augmented a bit to include time spent thinking about appearance and polishing the final product. It does not include general editing time -- most editing is structural and associated with real semantic content, not whether the paragraphs are lined up just right and printed in the "perfect" font.

The argument that seems to have developed on the list and elsewhere in the universe is whether WYSIWYG "word processors" (integrated formatting, typesetting editors) are "better" or "worse" than simple text editors with some sort of ASCII markup.

My thesis, developed in some detail below, is twofold. First, there are no perfect document composition tools or methods. Both WYSIWYG WP's and text editing/markup methods (as commonly and commercially implemented) are seriously flawed. Furthermore, they have flaws that are virtually complementary. Fortunately, there is some small hope that those flaws will be remedied in the not infinitely distant future.

Of the two, text editing with markup (whether for general purpose document creation or for typesetting formatted documents) is probably superior for the purpose of creating portable, archivable, editable, properly formatted documents that can be acted upon with a wide range of tools in many environments. Straight text editors (with markup) are also often far, far faster to use to create and format complex documents as straight typing is considerably faster than using a mouse -- I can create {\em emphasized} text or <b>bold</b> text in markup in the time that a user is just reaching for the mouse in a word processor.

The hand-integration of a text editor with markup and a post-processor is, however, considerably more difficult to set up and learn to use. Quite a bit of that learning barrier is associated with learning how to properly format a document in the first place, which one might well argue is an important first step in producing a decent looking document with correct style and consistent appearance in any environment. The rest is associated with the relative clumsiness and multiple steps of the edit/compile/display/debug cycle (which resembles that of writing a computer program more than that of "word processing" a document) compared to the "instant gratification" of making a change and seeing the result directly on a word processor.

Let me go into some detail on this.

First, Joe is quite correct when he says that learning or using a markup or typesetting language is a huge barrier for most users. He is equally correct when he says that most of those users will never learn to use a markup or typesetting language when they can open up a word processor and crank out documents formatted exactly the way that they want them to be.

Unfortunately, very few of those same, essentially lazy users have any idea of how to correctly format a formal document, let alone typeset it. Infinite power, zero knowledge, negative discipline. This isn't really their fault. Once upon a time (usually around 8th grade) those users were probably taught how to write the most common formal, formatted documents. There was "The business letter". There was "The Paper". There was "The Story". There might have been "The Report". Only a few actually learned how to do any of these things from these classes, but this is normal enough. They had other things they cared about -- at the time.

As far as I recall, at least the schools I attended were utterly devoid of classes covering advanced document composition, typesetting and font selection. No classes, no rules, not even coverage of the concepts. Of course this was perfectly reasonable at the time. Most documents were handwritten, or if you were lucky and knew how, typed by hand on a machine where the muscle-driven keys actuated little mechanical levers that caused a letter cast to strike a piece of paper through a strip of inked ribbon.

The cost of typesetting a simple, short document was thousands of dollars or more and a job for trained professionals with advanced tools who did this for a living. When writing real manuscripts on a typewriter (yes, I DID take a class in typing, anticipating that the keyboard and I had a real future together, and used to own a Smith-Corona portable that would be worth literally hundreds of dollars as an antique today:-), one used caps for emphasis, ordinary text the rest of the time, or (if one was writing a book or the like) some sort of primitive handwritten or typed markup to indicate where italics and the like were needed.

One might well argue that it is {\em still} perfectly reasonable for most folks, including those that use word processors and produce all sorts of formal documents, to not have to learn about advanced document composition, typesetting, font selection, and the esthetics and practical aspects of document layout. Reasonable or not, most folks simply won't bother anyway. They don't have the time and, sadly, the tools they use don't encourage structured writing, let alone formatted structured writing.

Compare the division of structure and function (necessary for a clean and consistent document format). Clearly and obviously

\title{This is a title}

and

<title>This is also a title</title>

but who knows

<block font=times-roman-bold; size=18pt; centered>If this is a title?</block>

Yet I'll bet that 90% of the titles that ever make it into a Word or Word Perfect document are of the latter sort. Click on font, size, centered text, type, and click back to normal text and type. Get annoyed when you try to insert text in the wrong place and the font comes out "wrong". Get REALLY annoyed when you realize that you need 18 point for section titles (since you are using 14 point for subsections and 12 for text) and that all the chapter titles and the book title have to be redone. Get REALLY annoyed when trying to do a keyword search on "title" -- only there is no such entity. Get REALLY annoyed when the editor of the magazine or the journal or the book changes your font sizes or whatever, as this (correctly) suggests that your choices were not competently made.
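
To make the keyword-search complaint concrete, here is a minimal Python sketch. It is my own illustration, not how any real word processor or converter works; the `find_titles` helper and the three sample strings (lifted from the examples above) are assumptions made purely for the demonstration.

```python
import re

# The three ways of marking up a title shown above.
latex_doc = r"\title{This is a title}"
xml_doc = "<title>This is also a title</title>"
font_doc = ("<block font=times-roman-bold; size=18pt; centered>"
            "If this is a title?</block>")


def find_titles(source):
    """Return the text of every element explicitly marked as a title.

    Hypothetical helper: it only knows the LaTeX title command and an
    XML-style title element, which is all a functional markup needs.
    """
    patterns = [
        r"\\title\{([^}]*)\}",      # LaTeX: \title{...}
        r"<title>(.*?)</title>",    # XML/HTML style: <title>...</title>
    ]
    found = []
    for pattern in patterns:
        found.extend(re.findall(pattern, source))
    return found


print(find_titles(latex_doc))
print(find_titles(xml_doc))
print(find_titles(font_doc))  # nothing in the font-only markup says "title"
```

The first two searches find their titles; the third comes back empty, because a font, a size and a centering instruction never actually say what the text is.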

For better or worse, WYSIWYG editors have jumped straight from supporting simple formatted documents with templates to putting a full typesetting toolset at the disposal of each and every user, where one can create documents that look just like kidnapper ransom notes pasted together by some madman with access to a whole library to cut up, not just the Sunday paper. This leaves most users of word processors just as bewildered and clueless as they would have been if they had wandered into a typesetting shop twenty five years ago and been given carte blanche to compose and typeset a letter to their mother or a list of things to do today. Where do we start? Ooo, look at these cool fonts!

In my own experience (both as a user and as a teacher of others of many different document manipulation tools) VERY few users actually use the formal document templates provided in a WYSIWYG WP, since there is a very definite learning curve associated with using them and an equally severe documentation deficit. Who reads the Word manual(s)? Nobody -- why bother getting a WYSIWYG, mouse driven product if you're going to have to actually learn to use it right? I'm certain that when document templates are used correctly they indeed encapsulate a structured markup. But they aren't easy to use, and used incorrectly, they often break badly and are very frustrating. Most PC users have nearly zero tolerance for figuring out complex things, curiously enough.

So they stick to picking fonts randomly to serve various purposes -- this chapter title they'll set in 24 point, this section header they'll set in 18 point, the text they'll set in 10, or 12, or 14 point text depending on whether they are trying to maximize the size of the document produced with respect to the number of actual words (as my kids all too often want to do for class papers, where twelve point type leaves a two hundred word paper a paltry two or three inches long in printed text and where they show a definite propensity toward the "exotic" fonts in their finished products:-). Perhaps they'll use boldface for emphasis or maybe they'll use italics. A lot of folks turn off the perfectly decent default Times-Roman font and select a sans-serif font, or mix fonts in weird ways throughout their documents. A number (and you know who you are, if you are on this list!) are ultimately evil and use the #$!@ script font! Death, death, kill, kill.

This problem isn't confined to the WYSIWYG editors, although they actively encourage it by the very ease of use and wide palette of formatting choices that make them popular. TeX has exactly the same problem. Raw TeX is a low level typesetting language. Font selections and so forth must be made for every line of the document. This is so difficult (and so prone to abuse) that early TeX documents were often "beautiful" by typewriter standards (proportional type, perfect margins, lovely fonts in a wide range of sizes) but a nightmare in terms of those little formatting/typesetting considerations.

TeX styles were the first effort to provide encapsulated structure to prevent a bit of this abuse and save work. However, even with a fair amount of encapsulation, there was too easy access to those low-level commands (for TeX is basically a very low level language) and hence a direct temptation to generate one-of-a-kind documents that look-like-I-think-it-should instead of the way that a compositional style guide written by experts says that it should. We are all free rebels in a free rebel culture, and we have a God-given right to make up document formats as we go along. Don't we?

LaTeX was developed to encapsulate and hide most of TeX from the writer, and in doing so to solve two key problems. First, TeX commands are moderately complicated and it took an inordinate amount of tweaking to get a particular look. Second, most users had no clear idea of what kind of look they should be trying to get and were making it up as they went along. LaTeX addressed both.

LaTeX functions much more like a markup language than a typesetting language, so much so that latex <-> *ml translation is generally a simple linear transform between things like {\em emphasized} and <emphasize>emphasized</emphasize>. Furthermore, important document entities are clearly identified and consistently transformed into a professionally written, internally consistent layout. Users do not generally need to tweak things like the fonts used to present titles versus chapter titles versus section headers, versus footnotes versus ordinary text -- this is all laid out ahead of time and looks better by default than all but the most carefully crafted WYSIWYG documents.
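
As a rough illustration of how simple that transform can be, here is a toy Python sketch covering only the emphasis case. The function names `em_to_ml` and `ml_to_em` are made up for this example, and the sketch assumes un-nested groups; a real converter would need a proper parser.

```python
import re


def em_to_ml(text):
    """Rewrite LaTeX emphasis groups as <emphasize> elements (no nesting)."""
    # Matches {\em some words} where the words contain no further braces.
    return re.sub(r"\{\\em\s+([^{}]*)\}", r"<emphasize>\1</emphasize>", text)


def ml_to_em(text):
    """Rewrite <emphasize> elements back into LaTeX emphasis groups."""
    return re.sub(
        r"<emphasize>(.*?)</emphasize>",
        lambda m: "{\\em " + m.group(1) + "}",
        text,
    )


sample = r"some {\em emphasized} text"
as_ml = em_to_ml(sample)
print(as_ml)
print(ml_to_em(as_ml))
```

Each direction is a single substitution, which is the sense in which the mapping is "linear": one named entity on one side corresponds to one named entity on the other.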

In addition to providing excellent support for the creation of correctly formatted "formal" documents, LaTeX has the additional advantage of providing unparalleled support for the creation of technical documents -- ones containing equations or math of virtually any sort. Things like font decisions, how high to raise or lower super or subscripts, how to size parentheses in nested expressions, how to align repeated equations -- all handled transparently, automatically, and perfectly. LaTeX is literally without peer in this -- WYSIWYG math editors are a joke in comparison and are by no means easy to learn to use or fast to use in operation. LaTeX equations can be typed without a mouse in seconds at an ordinary keyboard.

Once typed, as already noted in previous list discussion, one can email a LaTeX equation like:

\begin{equation}
  \vec{F} = \frac{G M m \hat{r}}{r^2}
\end{equation}

(Newton's law of gravitation) and anybody can both receive and read it. Without any tools at all.

TeX/LaTeX are by far the best thing around for this purpose. MathML transforms this single line (the markup reduces to $$...$$ around the actual equation) into about a page and a half of markup that no human can read or generate. Word (or any other WYSIWYG editor that will let you work with equations at all) produces some totally illegible binary that can only be regenerated and viewed with or edited by the precisely correct version of the program that created it. Although not everybody needs to work with documents that need math or algebra, (La)TeX's power with documents that do is one significant reason for its widespread adoption.

For all of these reasons LaTeX has been nearly universally adopted in the math, physics and engineering communities for advanced technical document composition. If you work with equations a lot, you have to be masochistic to do so with Word and friends. TeX-based systems are also fairly widely used in real publishing venues, where Word is not. For example, check out:

http://guzdial.cc.gatech.edu:8080/personal.138

Now, as Joe also already pointed out, Word and other decent WYSIWYG editors also have tools intended to provide structure to writers so that they DON'T have to figure out what font to use for titles and so forth. They have templates, crossreferencing tools, table builders, embeddable rulers, they manage itemized lists automagically (sometimes horribly, horribly, incorrectly, but that's another story:-). A user who takes the time can get a very LaTeX-like professionally formatted output by using all of this. A skilled Word user can therefore produce decent looking documents. [Note: I'm not really picking on Word, much as I'd like to -- the same is true for any of the major WP's and indeed if you've seen/used one you've seen/used them all.]

Note the key phrases here: "takes the time" and "skilled". The same learning barrier that makes Word the easy way to create documents for most users keeps them from ever learning to do it right. When they do, they may find that it actually takes as long or much longer than it takes to do similar serious formatting tasks in e.g. latex. Most folks haven't GOT the time and don't have the faintest idea of how to acquire advanced formatting skills, and this sort of thing isn't generally particularly well documented either externally or internally. Isn't that what the toolbar with fonts and everything is for? What are those menu items for if not to be used? Users who actually try to use some of those advanced features on their own quickly learn that with some of them (if you ever have the temerity to touch the ruler line, for example) you'll probably break things so badly that you'll often have to throw the document out and start over. This sort of meltdown failure doesn't exactly encourage one to persevere and surmount the hump.

The problem is that typesetting and formatting a formal document is complex and a lot of that complexity is irreducible and, in the case of a WYSIWYG editor, hidden and inaccessible except through complex, menu driven tools. It is difficult with a markup language. It is difficult with a typesetting language. It is difficult in a GUI word processor. I honestly think that it is by far the {\em most} difficult in a GUI WYSIWYG editor -- most folks give up long before mastering "advanced Word". OTOH I know lots of folks who have mastered LaTeX or a markup language in that they can confidently learn and use a new feature once they have the basic set of format markup commands down. This is because markups are often actually conceptually simpler and more elemental in their description of the complexity, with simple English markup tags instead of threaded paths through menus and option panels with unexpected side effects.

Finally, there is the problem of storing and transmitting the document to others, which is actually a major reason to have this whole discussion. As always, this requires a bit of historical examination, especially since this was one of the major reasons that I ultimately abandoned WYSIWYG.

In the beginning PC word processors were very simple things. Wordstar, for example, had only a handful of things one could do with fonts -- italic, boldface and regular. It had a superscript and subscript mode that pretty much sucked, mostly because of the 9 pin dot matrix printers one used to print out the final documents, although even the really expensive 24 pin dmp's left a fair amount to be desired. It wasn't a WYSIWYG -- it presented a marked up version of the text on the display because most displays of the day were text-only anyway (or a whopping 640x200 pixels of slow 8-bit graphics).

The early WP's basically saved their documents as a compressed binary markup. By compressed binary I mean that they used straight ascii for the text portions but turned on the high bit for "special" characters that literally marked up the text. A special character would turn on boldface and another (or the same one a second time) would turn it off.
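
To picture what that looked like on disk, here is a toy Python sketch of the idea. The control byte value `0x82` and the helper names are hypothetical, chosen only for illustration; each real product used its own (usually undocumented) codes.

```python
# Hypothetical control byte with the high bit set; toggles boldface on/off.
BOLD_TOGGLE = 0x82


def encode(segments):
    """Pack (text, is_bold) pairs into ASCII text plus high-bit toggle bytes."""
    out = bytearray()
    for text, is_bold in segments:
        if is_bold:
            out.append(BOLD_TOGGLE)          # turn boldface on
        out.extend(text.encode("ascii"))
        if is_bold:
            out.append(BOLD_TOGGLE)          # same byte again turns it off
    return bytes(out)


def to_ascii_markup(data):
    """Expand the toggle bytes into readable <b>...</b> tags."""
    out = []
    bold = False
    for byte in data:
        if byte == BOLD_TOGGLE:
            out.append("</b>" if bold else "<b>")
            bold = not bold
        elif byte < 0x80:
            out.append(chr(byte))
        # Any other high-bit byte would be some other formatting code;
        # this sketch simply ignores it.
    return "".join(out)


def strip_formatting(data):
    """Recover the bare prose by dropping every byte with the high bit set."""
    return bytes(b for b in data if b < 0x80).decode("ascii")


doc = encode([("This is ", False), ("bold", True), (" text.", False)])
print(len(doc), "bytes in the compressed binary form")
print(to_ascii_markup(doc))
print(strip_formatting(doc))
```

The binary form spends one byte per toggle where the readable tags spend three or four, which was the whole attraction when a floppy held 360 KB.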

This was done for three reasons. The zeroth is that all formatted non-binary-image documents are always saved as markup -- they have to be. That is what markup is and what it does -- encapsulates text with instructions for its use or presentation.

The first reason was that storage was expensive -- my original PC had 360 KB floppy disks and 64 KB of memory (which I upgraded as fast as I could afford it -- over several years -- to a whopping 640K total, spending more on the memory expansion card alone than I spent on a whole computer this year, in 1983-1985 dollars). Spending 3.5 bytes or more on a straight ascii markup e.g. <b>bold</b> was unthinkable. READABLE straight ascii markup like <bold>bold</bold> was absolutely out of the question. It just plain cost too much valuable space on relatively expensive and comparatively tiny (in terms of storage space, not physical size) floppies.

The second was that they could make their output formats proprietary and "secret", as from the earliest days the "set the standard" armwrestling corporate games were being played between vendors seeking to become the next Lotus, with a much flatter mat than today.

Make no mistake about it -- those proprietary formats, both then and today, a) are basically pure markup -- that's what markup IS and what it DOES; and b) are still binary even now, nearly twenty years later, only to make life difficult for reverse engineers and guard those proprietary edges.

Storage, both non-volatile and dynamic, has long since become absurdly inexpensive. That same computer that cost as much as a multifunction card for an original PC came with 256 MB of memory -- 4000x as much memory as the PC. Its permanent storage is 20 GB (not including the CD-ROM and floppy), compared to 0.7 MB -- close to 30,000x as much. The text documents one creates with a word processor, however, are still, at most, a few hundred pages long. The unmarked up text for a fullsize novel is on the order of a megabyte. If one spent one character on markup for every character spent on text nobody would care, especially since compression programs can easily squeeze out that information-theoretically wasted space and then some (in the text itself) for transparent, compressed, archival storage.
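
For anyone who wants to check the arithmetic, a short Python sketch follows. It assumes binary units (1 KB = 1024 bytes), which is my reading rather than anything stated above, and the compression demonstration uses a deliberately repetitive made-up sample, so real prose will compress far less dramatically.

```python
import zlib

KB, MB, GB = 1024, 1024**2, 1024**3   # binary units (an assumption)

memory_ratio = (256 * MB) / (64 * KB)      # new machine vs. original PC memory
storage_ratio = (20 * GB) / (0.7 * MB)     # hard disk vs. two 360 KB floppies
print(f"memory:  {memory_ratio:,.0f}x")
print(f"storage: {storage_ratio:,.0f}x")

# Toy demonstration that verbose, readable markup costs little once compressed.
# The sample is one sentence repeated, so it is far more compressible than
# real writing would be.
paragraph = "Text is the lowest common denominator of the written word. "
plain = (paragraph * 500).encode("ascii")
marked = ("<paragraph>" + paragraph + "</paragraph>\n").encode("ascii") * 500
compressed = zlib.compress(marked, 9)

print("plain text bytes:       ", len(plain))
print("marked-up bytes:        ", len(marked))
print("marked-up, compressed:  ", len(compressed))
```

The ratios come out at 4096x for memory and a bit over 29,000x for storage, in line with the round figures quoted above.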

This, then, is the primary remaining piece of evil in most proprietary word processors. If you wish to save a document with all those carefully made WYSIWYG formatting decisions preserved (with the full span of choices and parametric values encapsulated by the markup tools provided in the GUI) then you pretty much have to save in the WP's "native" format, because all WP companies (not being stupid) ensure that their products lose format information when "exporting" into foreign formats. Even with the same company's WP, next generation, that markup often breaks horribly when being "upgraded" because a tiny change in the spacing of proportional font text can cause the whole document to come out wrong. Indeed, as one might expect, the probability that it horribly breaks is proportional to the amount of microscopic justification you do by hand on a WYSIWYG basis as those decisions often make assumptions about the anchors of a block of text that can change. Documents with a strong style and real markup (generated by WYSIWYG or not) are more robust.

Let's see, how can we count the depth of this particular Evil? As a binary format, it cannot be emailed in an ordinary message. It cannot be read by an ordinary plaintext browser. It cannot be printed in a legible form in source form. It cannot be edited by an ordinary plaintext editor. As a proprietary/hidden binary format with a constantly (often deliberately) churning encapsulated markup, it cannot be easily displayed or printed with even a dedicated tool!

One especially evil feature that has burned me badly in the past is that a document saved in just about any proprietary non-ascii markup format will be impossible to recover (in the sense of being able to extract a useful version of your document) if you archive it for, say, ten years. Don't sneer at this. I've lost a whole set of the prettiest damn lecture notes you ever saw in exactly this way. I've also come damn near losing three short stories -- salvaged only by basically typing them all over again from hard copy. Your WP may be really cool and popular THIS year, but when you archive your writing and leave it for a decade or so and then decide that you need that Wordstar or Word version 1 or PC-Write or Word Perfect document, will you still be able to recover even a CRUDE ascii-only version of the text? Quite possibly not, not even if you are still using Word or Word Perfect. In the case of my lecture notes, well, T3 is something most readers on this list probably never heard of.

This sort of thing is inevitable on the commercial side of things. The company that developed the proprietary format has a vested interest in ensuring BOTH that you damn well cannot export to any other format and preserve the beauty of your hard won layout AND that nobody else can import from that format and correctly display or edit the result. They have an additional interest in both adding features and value (good) and causing incompatible feature creep to force you to buy upgrades and give them more money (bad).

They largely succeed. Big surprise. So don't blame me if you archive documents for twenty years and then look at them longingly because they contain twenty thousand words of prose and a few hundred nicely laid out equations and figures and they might as well be gone forever for all of your ability to extract them and put them to use with modern tools.

 

With ALL of the above background, we are finally at the point where I am prepared to state what, in my opinion, is wrong -- and right -- about both WYSIWYG and raw text+markup approaches, and where they may yet one day come together and give us a sane word processing toolset.

WYSIWYG

Good:

Easy to start learning. Mouse driven, intuitive, toolbars with icons let one quickly explore cool things you can do to text. You get to write and see a reasonable facsimile of what the finished product will look like when "published" to a high resolution printer. Sometimes have powerful tools and integration features that let you relatively easily do complex things, like build tables or even include "dynamic" tables that are coupled to distinct entities, like spreadsheets.

Bad:

VERY difficult to finish learning, as there is little incentive to learn to do complex tasks when one can "fake it" with things from the toolbar. Mouse driven editing can be learned in a day and costs one productivity forever, as every time one's fingers leave the keyboard costs time (generally many times as much time as it would take to navigate with control keystrokes). Generally (in the hands of novices) produces documents with only a crude font/size sort of markup rather than a proper functional markup, which means that the result is neither portable nor (almost certainly) properly or consistently formatted. Encourages the ill-conceived blend of formatting decisions and the actual writing, when only rare individuals have any skill or talent or knowledge of how to do both. Finally, as a general rule the documents produced are saved in proprietary, closed, binary formats that are neither portable nor useable outside the context of the WP tool itself (or other tools in the same vendor's document suite, at the same relative revision). Note that this final feature can bite you five, ten, even twenty years after you use a product, and is one of the strongest arguments in favor of an ascii-text markup for final documents. You may not get all of the markup the way you got it originally, but the all-valuable prose content is always recoverable.

Text/Markup

Good:

Even easier to start learning (as I include unmarked text -- like this -- in this category). Very fast. Very portable. Can be edited, browsed, searched, filed, compressed, printed on any platform. Can generally be read and sensibly interpreted even when marked up. Can manage equations. Strongly encourages structured writing, as markup is generally functional rather than microscopic (which I think is a good thing, but don't care to argue with those who really like going around trying to get things to line up the way you want them to in Word or whatever). Generally clearly documented and, surprisingly, rather simple. Good markup or typesetting implementations are very powerful -- they yield truly professional typesetting level control while sparing one the hassle of doing most of the hard work.

Finally, a marked up ascii document is timeless and can be usefully and extractably archived essentially "forever" -- we'll NEVER have WORSE filtering languages than we do right now, and it is a simple matter to write a script even now to strip out nearly any sort of ascii markup and render a document down to its composite ascii text -- the raw prose. Some of the ascii markup systems have proven amazingly powerful and robust -- I still have physics papers I wrote with latex more than a decade ago, and they still "work" with at most a trivial port to 2e form. This is simply not true for most of the alternatives available at that time (although my mother-in-law still uses Wordstar -- and her old wordstar files -- because I started her off with it back in 1984 or so -- she just lives inside a dos compatibility window and a time-warp...:-)
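
The claim that stripping ASCII markup is a simple scripting job can be sketched in a few lines of Python. This is a rough filter of my own, not a real converter: the `strip_markup` name is made up, it assumes simple LaTeX-style commands and XML-style tags, and it will happily eat a literal `<` or `\` that was meant as prose.

```python
import re


def strip_markup(source):
    """Reduce LaTeX-style or tag-style marked-up text to its raw prose."""
    # Drop XML/HTML style tags such as <b> and </b>.
    text = re.sub(r"<[^>]+>", "", source)
    # Drop LaTeX environment delimiters such as \begin{equation}.
    text = re.sub(r"\\(?:begin|end)\{[^}]*\}", "", text)
    # Drop LaTeX command names such as \em or \title, keeping their arguments.
    text = re.sub(r"\\[A-Za-z]+\s*", "", text)
    # Drop the grouping braces that are left behind.
    text = text.replace("{", "").replace("}", "")
    # Tidy up runs of spaces and tabs.
    return re.sub(r"[ \t]+", " ", text).strip()


print(strip_markup(r"\title{This is a title}"))
print(strip_markup(r"I can create {\em emphasized} text or <b>bold</b> text"))
```

Crude as it is, it gets the words back, which is exactly what a ten year old binary file will not let you do.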

Bad:

Generally they separate the writing/formatting/previewing/changing steps and use different tools for each step, loosely conjoined. Lots of things to learn before one can get from a straight-ascii document like this to the simplest formal formatted document (although, as noted above, once you can do one you can learn and do lots of them and it all is suddenly rather easy). Tables tend to be clumsy to build and get right. Embedded documents (e.g. figures, images, spreadsheets) often relatively poorly implemented (although this isn't really a fault of the markup concept). Finally, when they get things wrong they (like WP's, actually) are remarkably difficult to "fix" -- if you don't like where latex or a web browser leaves your figure on a page, it is not at all easy to force it to move the figure where you think it should be.

 

From the above, I can safely conclude that BOTH WYSIWYG AND text/markup have good features, but both of them really suck in most implementations that I've seen so far. Would I like to be able to enter latex style equations in one window and see them magically appear in another WYSIWYG? I would, although frankly that's damn near what happens with a properly designed integration of make, emacs or jove, and xdvi. Such an integration isn't easy to build, however. Would I like to see the opposite happen -- read in Word documents with equations and see them rendered in LaTeX (or any other decent markup)? Damn straight I would.

Would I like to be able to go into the preview window and drop in a general figure image and have the application determine how to encapsulate it and place it in my ascii-only sources? I would. Would I like to be able to grab that figure and move it up three lines and have the text rebreak and rewrap around it? I would. Would I like to be able to link in spreadsheet objects? Sure. Would I like a nifty toolbar to turn e.g. fonts, sizes, and so forth on and off? Not really -- boldface and emphasis are fine, but overall I'd rather have tools for collecting e.g. "title" and "author" and other named entities and sticking them in as properly formatted boilerplate. Even boldface is a heck of a lot faster to control as {\bf boldface} than by clicking the damn mouse.

Can all this be done, now? Sure. It wouldn't even be that difficult, and lyx is making a stab at doing it, although they've created yet another markup language (.lyx) for the purpose. lyx is fairly stringently designed to "strongly" encourage the use of document templates, and is "sort of" WYSIWYG in the editor, with an integrated interface to a proper xdvi previewer. [Parenthetically, I almost NEVER print out paper drafts with latex. xdvi yields a "perfect" representation of the page on a high res monitor. There is just no point in printing it.] OTOH, XML or LaTeX/TeX, either one (or both), could just as easily serve as a foundation for the document markup, and everything else is just providing the appropriate GUI tools and glue.

One beauty of markup languages is that they are almost always one-to-one maps on a large part of their terrain and -- where they are extensible -- can be relatively easily adapted to be one-to-one everywhere. XML is by design the ultimate extensible markup system. One could very easily create, for example, "LML" (the LaTeX markup language, which would be precisely latex within the XML conventions) and/or LyxML, a translation of Lyx's Latex encapsulation into the XML conventions. Or probably WordXML or WPXML or whatever you like.
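To make the one-to-one idea concrete, here is a rough Python sketch of what a toy "LML" mapping might look like. Everything about it is an assumption for illustration: no LML specification exists, the `<lml>` root element and the rule "each `\name{...}` becomes a `<name>` element" are invented here, and only that single command form is handled.

```python
import xml.etree.ElementTree as ET


def _append_text(parent: ET.Element, text: str) -> None:
    """Attach text after the last child, or as leading text if there is none."""
    children = list(parent)
    if children:
        children[-1].tail = (children[-1].tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def _parse(source: str, pos: int, parent: ET.Element) -> int:
    """Consume input until an unmatched '}' or the end; return the position."""
    while pos < len(source):
        ch = source[pos]
        if ch == "}":
            return pos
        if ch == "\\":
            end = pos + 1
            while end < len(source) and source[end].isalpha():
                end += 1
            name = source[pos + 1:end]
            if name and end < len(source) and source[end] == "{":
                # \name{...} becomes a child element called <name>.
                child = ET.SubElement(parent, name)
                pos = _parse(source, end + 1, child)
                if pos >= len(source):
                    raise ValueError("missing closing brace")
                pos += 1  # skip the closing brace
                continue
        _append_text(parent, ch)
        pos += 1
    return pos


def latex_to_lml(source: str) -> ET.Element:
    """Map \\name{...} commands onto elements under a hypothetical <lml> root."""
    root = ET.Element("lml")
    if _parse(source, 0, root) != len(source):
        raise ValueError("unbalanced closing brace")
    return root


def lml_to_latex(element: ET.Element) -> str:
    """Inverse mapping: rebuild the \\name{...} source from the element tree."""
    parts = [element.text or ""]
    for child in element:
        parts.append("\\" + child.tag + "{" + lml_to_latex(child) + "}")
        parts.append(child.tail or "")
    return "".join(parts)


source = r"\title{Markup} is \emph{portable}"
tree = latex_to_lml(source)
print(ET.tostring(tree, encoding="unicode"))
print(lml_to_latex(tree) == source)
```

Because the mapping is one-to-one on this subset, the round trip through XML gives back exactly the source that went in.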

One day most of these folks will be dragged, kicking and screaming, into using an XML-type universal markup syntax (it isn't a markup language itself -- it is a specification for creating a syntactically logical and systematically parseable markup language). It is, really, silly and expensive to maintain disparate parsing systems that all do the same thing, and proprietary formats will actually LOSE market share because of their lack of portability and the limitations on integrating Cool Tools. With the advent of glade (with its embedded GUI XML for saving a GUI itself) one could actually see truly amazing things like documents with an embedded application GUI (let alone simple things like a cross-formatted spreadsheet or graph).

 

To conclude, it is true that I intensely dislike Word and Word-like WYSIWYG WP's, but I have a lot of fairly carefully thought out reasons and many, many years of experience backing that dislike. I also clearly recognize their virtues -- a ten year old can use one, while a ten year old cannot use raw latex. Alas, few ten year olds can afford Word, and quite a few older users never get significantly further than the level of usage of that ten year old: pick a nice big font so a "one page paper" is as short as possible, and wouldn't this look cool in old english?

However, I also dislike some things about both latex and the generalized document markup languages. They CAN'T be learned or used by a ten year old, for one, and alas, there are plenty of folks who never get significantly more advanced in their document creation skills.

I do very much and even religiously support open source, open standard efforts wherever they exist. For this reason alone, something like lyx would get the nod over Word or Word Perfect. The proprietary non-ascii formats are also EXTREMELY dangerous formats to use for documents expected to have a long shelf life, and some of what I write has a shelf life in decades (and I don't know what part that might be FOR decades) which also argues strongly against proprietary tools.

However, I'll freely admit that the document editing and manipulation system of my imagination and dreams doesn't yet exist. In the meantime, on the whole, I do think that markup approaches in general are superior to WYSIWYG editors in general, for those that can manage them (that is, are willing to invest the time to learn to use either tool properly). Certainly latex (with tex embedded for full low level control) is immensely powerful while still retaining strong structuring. It isn't easy to master, though, and is certainly expert-friendly. Neither were all those tools in the old typesetting shops, I'm sure.

Maybe one day soon.

WYSIWYG Editors versus Markup Editors
The Fine Print: The following comments are the stupid ideas of whoever posted them. We do not agree with them in any way.
haiku (Score:-5, Funny)
by mike on Monday December 20, @10:10AM

I just finished. This was really inspiring.... See?

Rob Brown does go on
insightful comments to spare
I am so tired

yes and no (Score:-4, Insightful)
by physicsboy on Monday December 20, @10:15AM

Markup languages are great as long as you've got a preset layout available for you, as with something like REVTeX. As soon as you start making documents with unique design or layout features, you're back into the pit of pain which is creating layout in a markup language.

Brownbot (Score:-3, Funny)
by CmdrTaco on Monday December 20, @10:17AM

Dear Sethdotters,

The Brownbot is loose! No longer is it content with spamming Sethdot with long stories of dubious worth, now it's taking up flamebait topics.

I'm afraid now that we will need two squadrons of "snub" fighters - one to place a proton torpedo in the "logic hole" vulnerability of the original Brownbot, and a second to fly into the enormous, empty meta-brain of the second one and blow up the power station's cooling tower.

But it won't be that tough. I used to bulls-eye womp-rats in my T-38 back home. They're not much bigger than two meters...

---Free as in 'shoplifting'?

You might look at this! (Score:-2, Informative)
by BrownNose on Monday December 20, @10:30AM
I recently found conglomerate [conglomerate.org]. Maybe this is more what you were thinking about?

- look, ma! clever .sig!

And then there's XML (Score:-1, Interesting)
by yadayada on Monday December 20, @10:53AM

I'm not sure that proprietary formats will lose market share. As long as businesses are around, they will have a greater incentive to have their own format rather than a common one which everyone can use. I figure a "universal" markup language would either be so large and unwieldy as to be unusable or be limited to the point that people would continually be looking for ways around the language.

Actually, XML falls into a third category: so nebulous as to have the possibility of being both proprietary and non-proprietary. XML doesn't mean anything without definitions, and there's nothing forcing companies to make their XML definitions available to all.

what we really need... (Score:-1, Insightful)
by me on Monday December 20, @10:62AM

What I want is more separation of layout and content, or at least more documents in ASCII. Most of the documents at my company are in Word and Wordperfect format, neither of which is terribly portable or searchable. Lots of documentation is rotting on random hard drives, or being lost as machines are surplused.

Currently our "documentation management" system appears to consist of a few web developers who cut and paste ASCII text from documents and HTML format it.

I offered to start an Open Source project that would be an XML based document database (possibly using the Star Office DTD as a base) figuring that more than a few organizations could use a tool like this. I received the "but if you leave we won't have a vendor to turn to! Where can we buy a support contract?" and "We don't do internal development, we outsource programming tasks" and "We can't afford to retrain everyone." (Ignoring the fact that they are retraining everyone from Wordperfect to Word as we speak.)

I gave up.

WOW! (Score:-1, Funny)
by hah on Monday December 20, @11:01AM
Damn, Rob, why don't you write more in your articles? I mean, c'mon!

Summary (Score:-3, Informative)
by snarf on Monday December 20, @11:05AM

For those who don't have a lot of time, here's the Reader's Digest version of the article:

I use editors. I know how they work.

I don't like editors.

WYSIWYG? All implementations of WYSIWYG editors suck.

Markup editors? Same deal.

But markup's the way to go because Word and WordPerfect will EAT YOUR BRANE.

Hopefully the editor faeries will soon appear and grant my utopian dream of an über-editor.