Poetry, punctuation, markup & screen readers

I’ve seen poetry markup discussed a couple of times lately, but the discussion never seems to reach a definite conclusion. Most recently, it came up on the Web Standards Group mailing list, in a thread titled “Marking up poems”.

For some, it is obvious that using a pre element around a poem is correct as it maintains spacing and shape. For others, the semantic richness of using more typical HTML elements, such as paragraphs, is the obvious choice.

Anyway, I’ve done some brief tests of poetry with screen readers to see how these two semantic constructs are handled.

Some options

Poems are typically broken up into blocks called stanzas, the paragraphs of poetry. So, in terms of typographic layout on the Web, paragraphs are probably what we’re dealing with here; specifically, paragraphs with single-line boundaries, the sort of layout we typically see browsers render by default.
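
As a rough sketch of the paragraph approach, each stanza becomes a p element and each line within it ends in a br. The lines below are placeholder text I made up for illustration, not a real poem.

```html
<!-- One p per stanza, one br per line break inside the stanza -->
<p>
  A first line, flush with the margin,<br>
  a second line that follows on,<br>
  a third to close the stanza.
</p>
<p>
  A second stanza starts here,<br>
  and ends on this line.
</p>
```

Rendered with browser defaults, this gives the single-line gap between stanzas described above, but any indentation in the source is collapsed.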

I started out writing this thinking that paragraph and break elements were the answer, but layout and spacing are particularly important to poetry. Some poems have integral layout and deliberate spacing—lines with different amounts of indent, distinct overall shape—which should be maintained if possible. While we may be able to achieve the required results using CSS, would we be unnecessarily abstracting part of the content in doing so? In essence, a poem is pre-formatted text, so I can see why pre is a natural choice; even the W3C use a poem as an example of using pre.
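
For comparison, here is a minimal sketch of the same idea using pre. Again, the lines are invented placeholders; the point is only that the spaces and line breaks in the source are what the browser displays.

```html
<!-- Whitespace inside pre is preserved, so the indents survive as typed -->
<pre>
A first line, flush with the margin,
    a second line set in a little,
        a third set in further still.

A second stanza, after a blank line,
    keeps its own shape too.
</pre>
```

The blank line between the two groups is the only thing marking the stanza boundary here; there is no element around each stanza.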

Another option is to use a blockquote. That works when you are quoting someone else’s work, but what markup do you then use for your own poetry? There’s nothing wrong with using blockquote in this way, but it depends on the context of your writing.

Practicalities

First off, we don’t want to complicate things too much. There are no grounds for developing poem-specific markup in HTML 5 and suchlike, and using overly complicated HTML constructs to avoid the use of br elements is just plain silly in my opinion.

Paragraphs and line breaks make sense from a semantic point of view, but if we want to preserve spacing, do we really want to be mucking about with CSS to achieve specific effects? Does whitespace beyond paragraph boundaries (indents and overall shape) really matter that much? Yes, I think it does.
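
To give a feel for what that CSS route might involve, here is one possible sketch. The class names (stanza, indent-1, indent-2) and the indent sizes are my own assumptions, chosen purely for illustration, and the lines are placeholder text.

```html
<style>
  /* Hypothetical class names; the indent sizes are arbitrary */
  .stanza { margin: 0 0 1em; }
  .stanza .indent-1 { padding-left: 2em; }
  .stanza .indent-2 { padding-left: 4em; }
</style>

<p class="stanza">
  A first line, flush with the margin,<br>
  <span class="indent-1">a second line set in a little,</span><br>
  <span class="indent-2">a third set in further still.</span>
</p>
```

Note that the shape of the poem now lives in the stylesheet rather than in the text itself, so it is lost wherever the styles are not applied. Also, because the padding sits on an inline span, a line long enough to wrap would only be indented where it starts.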

By using pre elements for poems, we are able to preserve the author’s work more easily. However, with no paragraph elements available to us, we lose a bit of the semantic richness that HTML allows: Only a selection of inline elements are permitted inside a pre element, so you can’t validly put paragraph elements in there.
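
A small sketch of that restriction, using placeholder lines: the first pre should validate, while the second should not, because p is not among the elements allowed inside pre.

```html
<!-- Fine: an inline element such as em is allowed inside pre -->
<pre>
A first line, flush with the margin,
    a second line with <em>emphasis</em> on one word.
</pre>

<!-- Not valid: a block-level element such as p is not allowed inside pre -->
<pre>
<p>A stanza wrapped in a paragraph.</p>
</pre>
```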

Still torn?

Let’s take off our semantic hats for a moment, depart from the debate a little and look at how screen readers handle things.

Picking a short section of a poem featuring lines with and without punctuation at their ends, I used the two markup methods discussed and ran recent versions of JAWS and Window-Eyes over them. I recorded the speech output, so you can hear the results for yourself if you really want to.

In summary, the speech output from the two markup methods sounds pretty much the same. Judging by these brief tests, screen readers don’t seem to pay much attention to carriage returns, line feeds or HTML br elements when they speak. Lines of poems run together, but punctuation causes screen readers to pause or announce the mark, which is how they should behave in my opinion. I don’t know enough about the inner workings of screen readers to give an authoritative answer, but how the stanzas of a poem are marked up doesn’t seem to make any difference to how it is spoken by modern screen readers.

Being a standards-based kind of guy, when I started testing I did not expect screen readers to be able to navigate the contents of a pre in the same way they can a bunch of paragraphs. I had imagined that long poems with little markup would be a bit of a hassle for screen reader users. I should know better.

Screen reader users can skip from one paragraph (or stanza) to the next (Control+Down Arrow in JAWS). I wasn’t sure this would be possible for content inside a pre. A quick test reminded me that, as is the case with browsers, screen readers have had to deal with a lot of crap markup. Quite possibly with a helping hand from the browser, screen readers know what looks like a paragraph. So, within a pre element, a screen reader user can still navigate by paragraph, skip to the next line (Down Arrow in JAWS), and so on.

What does it all mean, Jon?

It basically means that it doesn’t really matter which markup method you use for poetry when it comes to screen readers. There is something I’d like to address here though: Some people seem to think they need to put in special punctuation for screen readers; commas at the end of each line, for example.

Punctuation marks, being the signposts of the written word, guide us through bodies of text. When it comes to poetry, line breaks add extra guidance, but how they should be interpreted may be ambiguous. Should they incur a pause? I’m sure that people will read the same poem differently, perhaps putting a pause in where the author didn’t mean one to exist, thus altering the rhythm.

The thing is, the rhythm of a poem is at the artistic discretion of its author. You cannot go sticking commas into someone’s poem because you think there should be a pause at the end of a line, or at the end of every line. The line breaks in a poem aid understanding; they often highlight rhyme and indicate rhythm. If we’re specifically concerned with how screen readers speak the lines of a poem, I don’t think we can realistically do any more than let the author’s punctuation lead the way, and neither should we.

So, with that out of the way, we’re just left with making a decision as to which method to use. This blog post is really just a long-winded way of saying that, fundamentally, I doubt it matters a great deal. If you want to preserve spacing and shape, it’s probably best to use pre, but otherwise, I don’t think there’s anything wrong with using p and br elements (and perhaps blockquote as well).