Greek /h/


The history of the written representation of Greek /h/ (when it had it, and when it no longer did) is replete with complications. Anyone using the standard script is unaware of these complications; they have complications enough to deal with in the placement of the breathing diacritics, U+0313 Combining Comma Above and U+0314 Combining Reversed Comma Above (Smooth and Rough Breathing), in titlecase and on diphthongs. But the breathings weren't always commas, and weren't always diacritics. And typographical treatment of earlier stages of Greek occasionally preserves the former status of /h/.

1. The 'istory of 'eta

To understand the diversity of typographical treatments of /h/, we need to trace the history of the development of /h/ in Greek script: the typographical forms merely sample earlier stages of this development.

We start with a bifurcation. The startling innovation of Greek was to turn Phoenician consonants into vowels:

Phoenician Letter	Pronunciation	Hebrew Glyph	Greek Letter	Pronunciation	Greek Glyph
ʾālep̱	ʔ	א	alpha	a	Α
hēʾ	h	ה	ei (later epsilon)	e	Ε
yôḏ	j	י	iota	i	Ι
ʿayin	ʕ	ע	ou (later omicron)	o	Ο
wāw	w	ו	u (later upsilon)	u	Υ
ḥêṯ	x	ח	(h)eta	??	Η

As Semiticists out there know, the innovation wasn't all that startling, as the Phoenicians already had matres lectionis: consonants used as vowels in certain contexts. The Greek innovation was to make this systematic. Greek also split wāw into the vowel u and the consonant wau, later named digamma.

Phoenician had a letter left over, ḥêṯ. The majority of Greek dialects needed a letter for /h/; since hēʾ was already taken for /e/, and Greek was not to get /x/ for centuries yet, they decided to use ḥêṯ for /h/, and accordingly called it heta. This is the tradition of the Greek alphabet that was transmitted to Italy, and thence gave rise to Latin H.

The dialects of Ionia (the western coast of Asia Minor) and Crete, however, were psilotic. If you've been following your Greek Unicode nomenclature and are not a classicist, you can guess what that means. Since you may not: everything in those dialects was psili (smooth breathing), meaning there were no rough breathings, meaning there were no aitches: those dialects 'ad dropped their aitches early. In those dialects, it made no sense for ḥêṯ to stand for /h/, since there was no /h/ there for it to stand for. Instead, those dialects decided to allocate it to the long vowel they had lying around in the language, /ɛː/. And since those dialects had no aitches, they weren't going to call the letter heta, but 'eta.

Since those dialects were now differentiating length orthographically, they carried the distinction further, by also distinguishing /o/ from /ɔː/; most of them—but not all!—used omega for the latter. Later on Greek was to get new long vowels, /eː/ and /oː/; but with the peculiar exception of Corinth, it never made up distinct letters for them, using digraphs instead.

The year after the defeat of Athens in the Peloponnesian War, 403/402 BC, the archon Euclides officially changed the alphabet used in Athens to the Ionic alphabet of Miletus (although there are indications that the Ionic alphabet had already been in use in Athens for a couple of decades beforehand). Though Athens lost militarily, it prevailed culturally, and within the century its new alphabet had displaced all the epichoric alphabets of Greece.

But while Greek thereby gained orthographic differentiation between long and short vowels, it lost the ability to write /h/, something it still required for another few centuries: while Greek used to write /héllɛːn/ "Hellene" as ΗΕΛΛΕΝ, now it wrote it as ΕΛΛΗΝ.

Or at least something close to ΗΕΛΛΕΝ: there is huge diversity in the epichoric alphabets. The family of scripts conflated in Unicode as Old Italic took on the Western Greek alphabetic values, as well as their lack of uniform letter shapes; the same word would be spelled in Old Italic (Astral Planes alert) something like 𐌇𐌄𐌋𐌋𐌄𐌍. (Yes, that's a variant heta.)

At this time, the Southern Italian colonies of Heracleia and Tarentum adopted Η with the meaning /ɛː/, but held on to their own epichoric heta, shaped as ⊢, to express /h/.

This means that Heracleia and Tarentum had both a heta and an eta. They weren't the only places with this idea: Delphi, in the same predicament, used boxed heta as distinct from normal eta (Buck 1955:240), and Cnidus invented a new box glyph for eta (Jeffery 1990:351). Rhodes however used the same glyph, a boxed or normal eta, for both /h/ and /ɛː/ (Jeffery 1990:345), which is just plain ornery. There were even instances where heta/eta was used to notate heta and eta simultaneously—Η = /hɛː/.

This idea caught on when h's started being dropped throughout Greek, and the codifiers of literary Greek needed a diacritic to indicate where /h/ used to be pronounced. The Heracleian symbol started being used to indicate /h/ as a diacritic, and its reverse, ⊣, to indicate the absence of /h/. This gives rise to the notion that Η was split in two to derive the new signs, and this may indeed have been what the Alexandrine grammarians had in mind; but there is no indication that the Heracleians ever split Η in two themselves, or that ⊣ had ever been used as a letter, as distinct from a diacritic.

So by the time Greek starts showing up in papyri, the new diacritics were in use: first in their tack form, x҅ x҆ [assuming the reference glyphs for U+0485 and U+0486], then ^⌞ and ^⌟, and by the twelfth century the comma forms we know today. Because of the nature of Classical Greek phonology, the breathing ended up restricted to the beginnings of words, though very occasionally heta did appear elsewhere in a word.

2. Who Writes Heta

Greek has a standard way of writing /h/: the comma-form breathing mark. So in all normal usage, including all normal publishing of Ancient Greek text, the comma-form breathing mark is what you would use. The only time you would not use the breathing mark is if you want to emphasise that the source document was written in a different version of the Greek script.

The most trivial difference is to leave the breathing mark as a diacritic, but to use the older, tack form of the diacritic rather than the mediaeval comma. I have only ever seen this done once, and unfortunately for you I haven't taken down the details; it was a German edition from the early 1900s of a mathematical work, and the text was done entirely in uncased uncials, with diacritics of the time. In effect, the text was given in a cleaned up version of 100 AD Greek script, rather than the 12th century version in universal use. This is a one-off, and you will not come across it anywhere but in histories of Greek script, and editions of Ancient grammars discussing the diacritics as they were shaped back then:

Προϲῳδίαι εἰϲὶ δέκα· ὀξεῖα ΄, βαρεῖα `, περιϲπωμένη ῀, μακρά –, βραχεῖα ˘, δαϲεῖα ҅, ψιλή ҆, ἀπόϲτροφοϲ ’, ὑφέν ‿ , ὑποδιαϲτολή ˒ . [τούτων εἰϲὶν ϲημεῖα τάδε· ὀξεῖα οἷον Ζεύϲ, βαρεῖα οἷον Πὰν, περιϲπωμένη οἷον πῦρ, μακρὰ οἷον Ἥρα ̄, βραχεῖα οἷον γά ̆ρ, δαϲεῖα οἷον ῥῆμα, ψιλὴ οἷον ἄρτοϲ, ἀπόϲτροφοϲ οἷον ὣϲ ἔφατ’, ὑφὲν ὡϲ παϲι‿μέλουϲα, ὑποδιαϲτολὴ «Δία δ’ οὐκ ἔχεν˒ἥδυμοϲ ὕπνοϲ».] (Anonymi Grammatici, Supplementa artis Dionysianae vetusta 1.1.106)

There are ten diacritics: the acute ΄, the grave `, the circumflex ῀, the macron –, the breve ˘, the rough ҅, the smooth ҆, the apostrophe ’, the hyphen ‿ , the hypodiastole ˒ . Their signs are as follows: acute as in Ζεύϲ, grave as in Πὰν, circumflex as in πῦρ, macron as in Ἥρα ̄, breve as in γά ̆ρ, rough as in ῥῆμα, smooth as in ἄρτοϲ, apostrophe as in ὣϲ ἔφατ’, hyphen as in παϲι-μέλουϲα, hypodiastole as in Δία δ’ οὐκ ἔχεν/ἥδυμοϲ ὕπνοϲ.

Sneakily, the editor gives the breathings in isolation by their contemporary form on the papyrus, but cites their illustrations in words with the normalised comma form. The hyphen and hypodiastole were optional word non-break and word break diacritics; space as a word delimiter had not yet been invented.

However as we saw, when the text cited is an ancient inscription, in which heta is still a distinct letter, the text will be cited with heta as a distinct letter, and the inscription will not be normalised into the modern script. This is important for epigraphers, as they need to apply diacritics or brackets to the letter to indicate textual certainty—which would be impractical if /h/ itself were a diacritic: underdot to represent partial damage, square brackets to indicate editorial addition, braces to indicate scribal deletion, and so forth.

Now, Western Greek ΗΕΛΛΕΝ and normal Greek script Ἕλλην are the selfsame word. If you are entering instances of inscriptions containing ΗΕΛΛΕΝ into a dictionary, Ἕλλην is the keyword they would appear under. So the use of heta is restricted to the verbatim citation of the form on the stone; any discussion about the language of the stone in general, or comparison to literary Greek, will impose the normal diacritic. While Greek linguists use normal breathings when discussing Ancient dialect words in the abstract, however, they usually respect the epigraphers' convention when it comes to citing the inscriptions themselves, and use heta like them.

So if Unicode is dealing with the normalised script, you would treat [heta]ΕΛΛΕΝ as a funny way of spelling Ἕλλην, and normalise it. Whether you normalise it through a font (calling the heta letter a funny way of writing rough breathing), or with an ad hoc character mapping, is presumably up to you. Note that if you were to call heta a funny version of U+0314 Combining Reversed Comma Above, you would not escape the fact that your heta is still a combining character as far as Unicode is concerned; what it will do with the acute next to it is something best not left to the imagination. Besides, Unicode philosophy is to differentiate between diacritics only according to what they look like, not their function. So the conflation would not be welcome.
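To make the point about combining characters concrete, here is a minimal sketch in Python, using only the standard unicodedata module. It decomposes the normal spelling Ἕλλην and lists the pieces of its first letter. The variable names are mine, chosen for illustration.

```python
import unicodedata

# Ἕλλην, typed with the precomposed capital epsilon with dasia and oxia (U+1F1D)
word = "\u1f1d\u03bb\u03bb\u03b7\u03bd"
decomposed = unicodedata.normalize("NFD", word)

# The first three codepoints are the base letter and its two diacritics.
first_letter = decomposed[:3]
for ch in first_letter:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))
```

The breathing and the acute both come out as combining marks of the same combining class, stacked after the base letter in that order. A heta passed off as a rough breathing would have to take the place of the first of those marks.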

3. Latin Heta

So let's accept that we don't treat the Η as a diacritic, but as an alternative spelling, which you have custom software to map from. The issue is still what glyph (and codepoint) it should be. You'd think it would obviously be U+0397 Greek Capital Letter Eta. But this leads to a nasty ambiguity, with the conventional eta, which we would rather avoid.

We are left with three choices. The first choice, which no one ever takes, is to use the archaic variant of the (h)eta glyph, the boxed heta, featured in Old Italic as U+10307 Old Italic Letter He, 𐌇. This is in fact what was done in Delphic and Rhodian inscriptions, but it is not done in modern editions. Presumably it would be hard to integrate a boxed heta into Modern Greek typography, and editors wouldn't want the hassle of inventing a brand new symbol anyway.

The second choice is what epigraphers usually do. What epigraphers want is a character that may well look just like Η as a capital, but in lower case—which will be used just about always in transcription—looks distinct from η, and cannot be confused with it—and is also readily available in the local printer's tray.

I have, of course, just described the offspring of heta, U+0068 Latin Small Letter H:

[Σῖμον, ὀρκhε̄στὰν ἄριστον,] ναὶ <μὰ >τὸν Δελπhίνιον,
ἐ͂ ̄Κρίμο̄ν τε͂̄δε ὀ͂̄πhε, παῖδα Βαθυκλέος, ἀδελπhεόν. (Iambica Adespota 29Aa, from Thera, 8th century BC: Inscriptiones Graecae xii.3.537)

(What the inscription actually looked like:)
ΝΑΙΤΟΝΔΕΛΠΗΙΝΙΟΝΕΚΡΙΜΟ
ΝΤΕΔΕΟΙΠΗΕΠΑΙΔΑΒΑΘΥΚΛΕΟΣΑΔΕΛΠΗΕΟ (Jeffery 1990:Plate 61; second line is backwards—boustrophedon)

ναι τον Δελπ⊢ινιον ε̣ Κριμο̄ν
τε̄δε ο̄ιπ⊢ε παιδα, Βαθυκλεος αδελπ⊢εο[ν] (Jeffery 1990:413)

(Standard Greek script:)
[Σῖμον, ὀρχηστὰν ἄριστον,] ναὶ <μὰ> τὸν Δελφίνιον,
ἦ Κρίμων τῆδε ᾦφε παῖδα Βαθυκλέος, ἀδελφεόν.

Invoking the Delphic Apollo, truly have I, Crimon, here [ritually] copulated with a boy, [Simus, an excellent dancer,] Son of Bathycles, brother of... (See analysis at the Leslie-Lohman Gay Art Foundation and in Thomas Hubbard's Homosexuality in Greece and Rome (§2.22.537a); the initial phrase is speculative by the editor, since in inscription 540.III Crimon says he "delighted Simias with his lascivious dance".)

And what a bonanza of archaism we have here: /kh/, /ph/ being written as ΚΗ, ΠΗ instead of Χ, Φ; the long vowels Η, Ω written as Ε, Ο—with the editor employing macrons to differentiate them from their short vowels; and the normal diacritics of Greek superimposed on the unaccented original, resulting in triple deep diacritics.

Epigraphers have been quite happily using this convention for decades, and it turns up far and wide—even in grammars of New Testament Greek; in discussing the development of the verb ἱστάναι "stand", Blass & Debrunner (p. 49) cite an inscription from Argos as hα στάλα ἔσστα "was set up". It is without question the main way of doing heta, and it involves a loan from a script whose characters are readily available.

Since the lowercase letter <h> is absent from Greek, no ambiguity results, and for the most part editors are content to leave the letter as is, although one will occasionally see the <h> altered. Peek uses a script h for heta (Peek, W. 1955. Griechische Vers-Inschriften. Band I: Grab-Epigramme. Berlin: Akademie-Verlag); while some editions of Heraclean inscriptions in the Supplementum Epigraphicum Graecum (e.g. vol. 47, 1997, p. 401) use an <h> shrunk to x-height, presumably so it is less obtrusive in Greek text.

The capital version of Latin Heta is H, however, and this leads to pernicious ambiguity with capital Eta Η. Usually in epigraphy this is not an issue, since a document will rarely have both heta and eta; but this did happen on occasion, as we saw, and Buck at least quite cheerfully uses H as the capital of h in inscriptions which also have eta:

θοῖναι δὲ ταίδ[ε νόμιμ]οι· Ἀπέλλαι καὶ Β[ουκά]τια, Hηραῖα, Δαιδαφ[όρια], Ποιτρόπια, Βυσίου [μην]ὸς τὰν hεβδέμαν καὶ [τ]ὰν hενάταν ... (Delphi; Buck 1955:242)

It is lawful to feast on the following days: the Apellae and the Boucatia, the Heraea [Ἡραῖα], the Daedaphoria, the Poetropia, the seventh and ninth of the month Bysios...

Ἀνέγραψαν τοὶ ὀρισταὶ τοὶ hαιρεθέντες ἐπὶ τὼς χώρως τὼς hιαρὼς τὼς τῶ Διονύσω, Φιλώνυμος Ζωπυρίσκω, Ἀπολλώνιος Hηρακλήτω, Δάζιμος Πύρρω, Φιλώτας Hιστιείω, Hηρακλείδας Ζωπύρω, καθὰ [ὤρ]ιξαν και ἐτέρμαξαν καὶ συνεμέτρησαν καὶ ἐμέριξαν τῶν Hηρακλείων διακνόντων ἐν κατακλήτωι ἀλίαι. (Heraclea; Buck 1955:273)

Inscribed by the surveyors chosen for the places consecrated to Dionysus: Philonymus son of Zopyriscus, Apollonius son of Heraclitus, Dazimus son of Pyrrhus, Philotas son of Histieias, Heracleidas son of Zopyrus, according to what they have determined, measured and apportioned, with the Heracleans assenting in a deliberative assembly.

It's lucky for Buck that he had no instances of initial eta without a rough breathing, so there are no capital etas in his texts with heta. Even if there were, there would be no ambiguity in titlecase: capital eta would have a smooth breathing, capital heta does not. So Hηραῖα would contrast with Ἠχώ. This kind of context-sensitivity is unpleasant as a general solution, though, and would fail in all-caps. So while in lowercase Latin heta is unproblematic, in uppercase it joins the other instances of Latin-Greek capital ambiguity where one is likelier to see the uppercase glyph avoided entirely.
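The capital ambiguity is at least detectable by machine, since Latin H and Greek Η are different codepoints even where the glyphs are identical. The sketch below flags Latin letters inside words that otherwise contain Greek letters. Python's standard library has no Script property lookup, so the script is guessed from the character name, which is a crude assumption of mine; the function names are likewise hypothetical.

```python
import unicodedata


def script_of(ch):
    """Crude script guess from the character name (not a real Script property lookup)."""
    name = unicodedata.name(ch, "")
    if name.startswith("GREEK"):
        return "Greek"
    if name.startswith("LATIN"):
        return "Latin"
    return "Other"


def latin_letters_in_greek_word(word):
    """Return the positions of Latin letters in a word that also contains Greek letters."""
    scripts = [script_of(ch) for ch in word]
    if "Greek" not in scripts:
        return []  # a purely Latin word is not suspicious
    return [i for i, script in enumerate(scripts) if script == "Latin"]


# Buck's Hηραῖα, with Latin H (U+0048) as capital heta
print(latin_letters_in_greek_word("H\u03b7\u03c1\u03b1\u1fd6\u03b1"))
# Ἠχώ, with a genuine Greek capital eta
print(latin_letters_in_greek_word("\u1f28\u03c7\u03ce"))
```

This works in all-caps as well, where the eye cannot tell the two apart; but it only tells you that a Latin letter is present, not that the editor meant it as heta.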

One would have thought that epigraphers would titlecase the initial vowel, and precede it with the lowercase h, Irish lenition style: hἝλλε̄ν. John Mansfield of Cornell has indicated to me that this does happen on occasion, e.g. hΑβρο-.

4. Tack heta

There is a third alternative, though, which is to take the Heraclean form of heta: the tack formed by splitting Η in half. The Delphic solution (use the archaic, box heta) is felt too disruptive to Greek text; the Heraclean isn't necessarily. So if your inscription has a heta distinct from eta, the tack is the best solution on offer. The Heraclean and Delphi inscriptions are too late for Jeffery's survey, but she does cite an inscription from Cnidus using both eta and heta:

⊢ο Μικος ⊢ο Μ[αγν]ητος τἀθαναιαι̣ μ' α[νεθηκε] (Jeffery 1990:415)
Micus son of Magnes dedicated me to Athena.

The tack heta, however, is not at all in widespread use, particularly as it has the disadvantage over Latin heta of not being readily available to classical typographers. Curiously Johnston, in prefacing the supplement to Jeffery (1990:423), says s/he will use Latin heta rather than Jeffery's tack heta—but does so only once. In fact, the only place I have seen the tack used instead of the Latin <h> with any regularity is in publications of inscriptions... from Heracleia and Tarentum. Where the tack used to be used, some epigraphers still use it; everywhere else, especially when the inscriptions are treated as text and not as illustrations of the history of the alphabet, the Latin form has won the day.

Unlike the Latin heta, there is no problem of ambiguity with capital tack heta; but one might wonder if there is a capital version of the tack heta at all, particularly given that it ultimately engendered a diacritic, which is caseless. At least in Jeffery's case, a distinction is made: the lowercase tack is the same height as an omicron, the uppercase the same height as a capital omicron.

The same glyph was used briefly in Boeotia to represent a raised /e/ before a vowel; as noted elsewhere, though, it is less confusing to conflate this with Corinthian EI than to multiply instances of the tack—especially since, as a vowel, the tack can bear accents (Buck 1955:22), unlike heta.

5. Unicode heta

There are two related questions that arise out of all this for Unicode: how to represent the different forms of /h/ in Greek, and whether any of these representations should be conflated.

If we do not conflate any of the representations of /h/, but keep each glyph in a distinct codepoint, there are already Unicode codepoints that can be used for all but one:

/h/ rendering	Codepoint	Capital Version
Comma diacritic	`U+0314 Combining Reversed Comma Above`	—
Tack diacritic	`U+0485 Combining Cyrillic Dasia Pneumata`	—
Boxed heta	`U+10307 Old Italic Letter He`	—
Latin heta	`U+0068 Latin Small Letter H`	`U+0048 Latin Capital Letter H`
Tack heta	???	???

The tack heta does not have a codepoint for its glyph as such in Unicode. I have used U+22A2 Right Tack ⊢ here, though arguably the narrower U+22A6 Assertion ⊦ is closer to the symbol used typographically. But these are mathematical symbols, which are not intended to be mixed in with normal text. The character properties of those codepoints mean they will not be recognised as alphabetic characters, and their typical typography, with much thinner lines than alphabetic letters, would make their mixing look awkward. In addition, the casing distinction between tack hetas is not one the mathematical symbols are equipped to handle. So if the tack hetas were adopted into Unicode as distinct characters, they would require new codepoints to be allocated, rather than reusing existing ones.
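The character properties in question can be inspected directly. The following is a small sketch using Python's standard unicodedata module; the labels in the dictionary are my own shorthand for the renderings in the table above. Note that str.isalpha() is Python's approximation based on the General Category, not the Unicode Alphabetic property proper.

```python
import unicodedata

candidates = {
    "comma diacritic": "\u0314",
    "tack diacritic": "\u0485",
    "boxed heta": "\U00010307",
    "Latin heta": "h",
    "right tack": "\u22a2",
    "assertion": "\u22a6",
}


def describe(ch):
    """Return (name, General Category, letter-ness as Python sees it)."""
    return (unicodedata.name(ch), unicodedata.category(ch), ch.isalpha())


for label, ch in candidates.items():
    name, category, alpha = describe(ch)
    print(f"{label:16} U+{ord(ch):04X} {name} {category} letter={alpha}")
```

The two mathematical tacks come out as math symbols (Sm) rather than letters, so word-boundary detection, searching and casing would all treat a tack heta typed this way as punctuation-like debris in the middle of a word.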

The tack heta is unserifed in its typographical history, but that follows from the fact that epigraphy is done in the Greek version of sans-serif typefaces anyway.

This case is quite different from that of mixing Latin and Greek scripts, which Latin heta would require, and which I argue to be acceptable. The behaviour of Latin heta is fully that of Latin H: it is typographically equivalent, it has the same casing behaviour, the same semantics, and conflating the two is both practical and the best way to forestall confusion between identical characters cloned across scripts.

The question then becomes whether any of the five renderings can be conflated. Tempting though it is, the conflation of the diacritics and the letters is not possible within Unicode: the distinction between diacritic and letter is a normative property, and Unicode codepoints may not blur that distinction. So while Ἕλλην is indeed the normalised version of Hέλλε̄ν, and some process needs to be able to reduce the latter to the former, that process cannot lie within Unicode.
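For what such a process outside Unicode might look like, here is a rough sketch in Python. It is an illustration under my own assumptions, not anyone's production code: it handles only word-initial Latin h or H before a Greek vowel, moves the breathing onto the second vowel of a diphthong, and leaves everything else alone. In particular it does not touch medial h (as in κh for χ), and it does not convert macron-marked ε̄ and ο̄ into η and ω. All names in it are hypothetical.

```python
import unicodedata

ROUGH = "\u0314"      # Combining Reversed Comma Above (rough breathing)
DIAERESIS = "\u0308"  # marks the second vowel as NOT part of a diphthong
VOWELS = set("αεηιουω")
# Diphthongs carry the breathing on their second vowel.
DIPHTHONGS = {"αι", "ει", "οι", "υι", "αυ", "ευ", "ου", "ηυ"}


def heta_to_breathing(text):
    """Rewrite word-initial Latin h/H + vowel as vowel + rough breathing."""
    src = unicodedata.normalize("NFD", text)
    out = []
    i = 0
    while i < len(src):
        ch = src[i]
        prev = src[i - 1] if i > 0 else ""
        # A preceding letter or combining mark means we are mid-word (e.g. κh).
        mid_word = prev.isalpha() or (prev != "" and unicodedata.combining(prev) != 0)
        first = src[i + 1] if i + 1 < len(src) else ""
        if ch in "hH" and not mid_word and first.lower() in VOWELS:
            second = src[i + 2] if i + 2 < len(src) else ""
            third = src[i + 3] if i + 3 < len(src) else ""
            if ch == "H":
                first = first.upper()  # capital heta capitalises the vowel
            if first.lower() + second in DIPHTHONGS and third != DIAERESIS:
                out.append(first + second + ROUGH)
                i += 3
            else:
                out.append(first + ROUGH)
                i += 2
            continue
        out.append(ch)
        i += 1
    return unicodedata.normalize("NFC", "".join(out))


for sample in ("hέλλην", "Hηραῖα", "hαιρεθέντες"):
    print(sample, "->", heta_to_breathing(sample))
```

The breathing is inserted immediately after the base vowel, ahead of any accent, because that is the order in which the precomposed polytonic characters decompose; the final NFC pass then recomposes them.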

5.1. Diacritics

The two versions of the diacritic, the comma and the tack, have been kept distinct because of Unicode's policy of allocating diacritic codepoints by what they look like, not by semantics. (This policy has been explicitly violated by U+0342 Combining Greek Perispomeni, which conflated the typographically dissimilar renderings of the diacritic, U+0303 Combining Tilde and U+0311 Combining Inverted Breve.) U+0314 is a comma, and is not necessarily restricted to Greek; its counterpart U+0313 is already allocated to the Americanist phonetic notation of ejectives and glottalisation. The Cyrillic diacritic has a reference glyph of a tack, and is intended to look different from the Greek comma, although font designers are free to put forward other distinct designs, and Everson Mono Unicode has a swash horizontal version of the comma. (At least, as swash as you can get in a monospace font.)

It would seem obvious to associate the comma diacritics with Greek script, and restrict the tack diacritics to Cyrillic script. This would be convenient for avoiding the proliferation of distinct diacritics to generalise over. That is not the intent for these characters, even though the association of the comma diacritics with Greek in particular is cemented through the formidable array of Greek Extended precomposed characters. One should be free to affix any diacritic to any letter—though Unicode won't take responsibility for the visual presentation of some of the more unfathomable combinations, such as Devanagari vowel diacritics on Runic letters. And the diacritic codepoint chosen should provide full information as to what the diacritic actually looked like in the source. So when Old Cyrillic shifted from the tack to the comma, the reasoning goes, the transcription should follow suit by shifting codepoints; the transcription will no longer have a single codepoint for both, but that's the fault of the transcription refusing to impose uniformity on the diacritics, not of the Unicode diacritic inventory. As long as your diacritic glyphs are distinct, the codepoints will be distinct as well, and you will not be able to do a single search encompassing both using Unicode on its own. Whatever process reduces them to the one codepoint lies outside Unicode.

The only way to enforce the tack and the comma being treated the same within the framework of Unicode, and not disrupting what is already in place for e.g. Amerindian transcriptions, is to make their Greek versions glyph variants of a single overriding diacritic, semantically defined as is currently the case with Perispomeni. Any such attempt which moves anything away from U+0313 and U+0314 will break compatibility with the existing implementations and texts of polytonic Greek (for the sake of quite marginal usage), and is thus a non-starter. Any subsuming of the Cyrillic breathings to the Greek would possibly mean the end of the tack as distinct from the comma; at any rate there is presumably already usage in place for the Cyrillic breathings, which cannot be annulled. So it does not seem that there will be any solution; implementations can only be made aware that there is a semantic though not a graphemic equivalence between the two pairs of diacritics. Fortunately the tack diacritic is so rare in Greek that a failure to do this does not cause any tangible problems.
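In practice, making an implementation aware of the equivalence can be as simple as folding the tack breathings into the comma breathings when building a search key. The sketch below does this in Python; the mapping table and function name are my own, and the folding is an application-level convention, not anything Unicode normalisation provides.

```python
import unicodedata

# Tack breathings (Cyrillic block) mapped to the comma breathings used for Greek.
TACK_TO_COMMA = str.maketrans({
    "\u0485": "\u0314",  # dasia: tack -> reversed comma (rough)
    "\u0486": "\u0313",  # psili: tack -> comma (smooth)
})


def fold_breathings(text):
    """Return a search key in which tack and comma breathings compare equal."""
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFC", decomposed.translate(TACK_TO_COMMA))


comma_form = "\u1f51\u03c0\u03cc"       # ὑπό with the usual comma rough breathing
tack_form = "\u03c5\u0485\u03c0\u03cc"  # the same word with the tack dasia

# Normalisation alone keeps the two spellings apart; the fold unifies them.
print(unicodedata.normalize("NFKC", tack_form) == unicodedata.normalize("NFKC", comma_form))
print(fold_breathings(tack_form) == fold_breathings(comma_form))
```

The fold should only ever be applied to a search key, never to the stored text, since the whole point of keeping the codepoints distinct is to record what the source looked like.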

5.2. Heta

U+0370 Greek Capital Letter Heta [Ͱ], U+0371 Greek Small Letter Heta [ͱ]

Of the three forms, the fate of boxed heta, and whether it stays with Old Italic or gets brought in from the cold back into Greek, is moot, since there is no tradition in Greek typography of using boxed heta to differentiate heta from eta. (There was such a tradition in the stones of Delphi; but the business of Unicode is to encode editions, not stones.)

If we keep the Latin and the tack heta separate, we might unify the latter with U+1FFE Greek Dasia. As a spacing codepoint, Greek Dasia is not doing anything integral to Unicode: its original major function (as the left-modifying titlecase diacritic) is a misfeature given that all diacritics in Unicode postmodify, and its compatibility decomposition as U+0020 Space, U+0314 Combining Reversed Comma Above shows that, if you want to talk about the glyph in isolation, you need only combine it with space. So the presence of U+1FFE in Unicode contravenes the spirit of the standard, and the codepoint isn't being put to good use. If (as John Cowan has suggested to me) we treat the tack heta as a glyph variant of U+1FFE, sidestepping its compatibility decomposition, we have no need to find a new codepoint for it.

However, this will not work. The normative General Category of U+1FFE is Sk (Symbol, modifier), which is not quite letter status; so tack heta wouldn't be treated as a letter in searches. This might still be patched up, if an overwhelming case were to be made that the codepoint is only to be used for heta qua letter. But this has not been the case historically, and even though the codepoint is a misfeature, it is unlikely to become the case in the future. More damaging is the fact that at least sometimes tack heta has case. Diacritics do not have case (the one attempt to do so in Unicode involved adscripts, and I believe shouldn't have); so we would have to unify only uppercase tack heta with dasia (as a titlecase letter), and leave lowercase heta as a separate codepoint. By this stage, we might as well just have two heta codepoints, lowercase and uppercase, and be done with it.
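The properties of U+1FFE that sink the idea are easy to check. This is a minimal sketch with Python's standard unicodedata module; the constant name is mine.

```python
import unicodedata

DASIA = "\u1ffe"  # Greek Dasia, the spacing rough breathing

name = unicodedata.name(DASIA)
category = unicodedata.category(DASIA)            # General Category
decomposition = unicodedata.decomposition(DASIA)  # compatibility decomposition
compat_form = unicodedata.normalize("NFKD", DASIA)

print(name, category, decomposition)
print([f"U+{ord(ch):04X}" for ch in compat_form])
print("letter:", DASIA.isalpha(), "cased:", DASIA.upper() != DASIA.lower())
```

Any software that applies compatibility normalisation would turn a tack heta stored as U+1FFE into a space followed by a combining comma, which is about as far from a letter as one can get.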

There is one more conflation we could attempt, which is to merge the two hetas, the Latin and the tack, as the single codepoint Greek Letter Heta. The two look nothing like each other, but they represent the same value, have the same behaviour as characters (both are cased letters used in epigraphy), and are mutually exclusive, belonging to different epigraphical traditions of representing the same group of Greek squiggles. Were the Latin heta on its own, I believe it would be only proper for it to be conflated with its Latin counterpart; but conflating it with the tack makes much more sense than conflating it with Latin H, since the two hetas really are the same character. If epigraphers still insist on using Latin H after such a conflation, there's no harm done: not all equivalences to be drawn between Greek glyphs need be resolved within the confines of Unicode codepoints; programmers still need some employment.

In feedback I have received from epigraphers, there appears to be support for such a distinct heta codepoint even if people go on using the Latin <h>. Several digital epigraphy projects encode heta in non-Unicode schemes (typically variants of Beta code) as a Greek codepoint, distinct from Latin <h>. Even if the projects reflect publications using the <h> glyph, they would rather hang on to a distinct codepoint. That said, the head of the most prominent corpus of Greek inscriptions, the Inscriptiones Graecae, has objected to a tack heta proposal to me, arguing that the tack is virtually unused (which is true), and that it risks confusion with the numeric sign U+10142 Greek Acrophonic Attic One Drachma.

With the eventual advent of smart fonts, epigraphers would be able to pick between the Latin and tack glyph as stylistic variants, much as smart fonts will also solve the problem of Italic Serbian. (To minimise confusion with non-hellenists, though, the reference glyph should probably be the tack, even though the Latin glyph is what is used most commonly.)

Based on this document, I submitted a proposal to the UTC for heta as a distinct codepoint: Proposal to add Greek Letter Lowercase Heta and Greek Letter Capital Heta, with versions L2/04-388, L2/05-002. The letter was accepted in Unicode 5.1.0, April 2008.
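The properties of the accepted characters can be checked in any environment whose Unicode data is at version 5.1 or later, which includes every current Python 3 release. A short sketch, with constant names of my own choosing:

```python
import unicodedata

CAPITAL_HETA = "\u0370"
SMALL_HETA = "\u0371"

for ch in (CAPITAL_HETA, SMALL_HETA):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

# Unlike the mathematical tack, the pair behaves as a cased letter.
print(SMALL_HETA.upper() == CAPITAL_HETA, CAPITAL_HETA.lower() == SMALL_HETA)

# No decomposition ties heta to Latin h or to the rough breathing.
print(repr(unicodedata.decomposition(SMALL_HETA)))
```

So the letter is cased and alphabetic, and nothing within Unicode maps it to Latin h or to the dasia; any such equivalence is still left to the application.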

Nick Nicholas, opoudjis [AT] optusnet . com . au
Created: 2003-08-10; Last revision: 2008-05-14
URL: http://www.opoudjis.net/unicode/unicode_aitch.html
