There are three ways to deal with the umlauts in alphabetic sorting.
The first is to treat them like their base characters, as if the umlaut were not present (DIN 5007-1, section 6.1.1.4.1). This is the preferred method for dictionaries, where umlauted words ("Füße", feet) should appear near their origin words ("Fuß", foot).
Under this method, when two words differ only in that one has an umlaut and the other its base character (e.g., "Müll" vs. "Mull"), the word with the base character comes first.
The second is to decompose them (invisibly) into vowel plus e (DIN 5007-2, section 6.1.1.4.2). This is often preferred for personal and geographical names, in which the characters are used unsystematically, as in German telephone directories ("Müller, A.; Mueller, B.; Müller, C.").
The third is to treat them as extra letters, placed either after their base letters (Austrian phone books have ä between az and b, etc.) or at the end of the alphabet (as in Swedish or in extended ASCII).
Microsoft Windows in German versions offers the choice between the first two variants in its internationalisation settings.
Eszett is sorted as though it were ss. Occasionally it is treated as s, but this is generally considered incorrect. Eszett is not used at all in Switzerland.
Accents in French loan words are always ignored in collation.
In rare contexts (e.g. in older indices) sch (equivalent in sound to English sh) and likewise st and ch are treated as single letters, but the vocalic digraphs ai, ei (historically ay, ey), au, äu, eu and the historic ui and oi never are.
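The three umlaut treatments can be sketched as sort keys. This is a minimal illustration in Python, not a full DIN 5007 implementation: the function names (key_base, key_plus_e, key_extra_letter) are hypothetical, only ä, ö, ü and ß are handled, and the third key implements just the "after the base letter" placement.

```python
# Illustrative sort keys for the three umlaut treatments described above.
# All names here are hypothetical; none come from a standard library.

_BASE = str.maketrans("äöü", "aou")
_PLUS_E = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue"})
_UMLAUT_BASE = {"ä": "a", "ö": "o", "ü": "u"}


def _fold(word):
    """Lower-case the word and sort eszett as though it were ss."""
    return word.lower().replace("ß", "ss")


def key_base(word):
    """Variant 1: umlauts sort like their base characters.

    The second tuple element only breaks ties, so that the word with
    the base character ("Mull") precedes the umlauted one ("Müll").
    """
    folded = _fold(word)
    return (folded.translate(_BASE), tuple(ch in _UMLAUT_BASE for ch in folded))


def key_plus_e(word):
    """Variant 2: umlauts are decomposed to vowel plus e."""
    return _fold(word).translate(_PLUS_E)


def key_extra_letter(word):
    """Variant 3: each umlaut is an extra letter right after its base letter."""
    return tuple((_UMLAUT_BASE.get(ch, ch), ch in _UMLAUT_BASE) for ch in _fold(word))


names = ["Müller, C.", "Mueller, B.", "Müller, A."]
print(sorted(names, key=key_plus_e))  # ['Müller, A.', 'Mueller, B.', 'Müller, C.']
print(sorted(names, key=key_base))    # ['Mueller, B.', 'Müller, A.', 'Müller, C.']
print(sorted(["b", "ä", "az", "a"], key=key_extra_letter))  # ['a', 'az', 'ä', 'b']
```

With the vowel-plus-e key, "Müller" and "Mueller" collapse to the same spelling and interleave by initial, as in the telephone directory example; with the base-character key, "Mueller" sorts ahead of both "Müller" entries because e precedes l.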