Ada Reference Manual (Ada 2022)

2.1 Character Set

1/5
The character repertoire for the text of an Ada program consists of the entire coding space described by the ISO/IEC 10646:2020 Universal Coded Character Set. This coding space is organized in planes, each plane comprising 65536 characters.
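As an informal illustration of the plane organization described above, the following Python sketch splits a code point into its plane number and its relative code point within that plane. The function name `plane_and_offset` is hypothetical, and the upper bound 16#10FFFF# (17 planes) is taken from ISO/IEC 10646 itself rather than from this subclause.

```python
PLANE_SIZE = 65536  # characters per plane, as stated in the paragraph above

# Assumption: the coding space ends at 16#10FFFF# (17 planes), per ISO/IEC 10646.
MAX_CODE_POINT = 0x10FFFF


def plane_and_offset(code_point: int) -> tuple[int, int]:
    """Return (plane number, relative code point within that plane)."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"code point out of range: {code_point:#x}")
    return divmod(code_point, PLANE_SIZE)
```

For example, `plane_and_offset(0x1F600)` yields plane 1 with relative code point 16#F600#.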

Syntax

Paragraphs 2 and 3 were deleted. 
3.1/5
A character is defined by this Reference Manual for each cell in the coding space described by ISO/IEC 10646:2020, regardless of whether or not ISO/IEC 10646:2020 allocates a character to that cell. 

Static Semantics

4/5
The coded representation for characters is implementation defined (it can be a representation that is not defined within ISO/IEC 10646:2020). A character whose relative code point in its plane is 16#FFFE# or 16#FFFF# is not allowed anywhere in the text of a program. The only characters allowed outside of comments are those in categories other_format, format_effector, and graphic_character.
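The rule on relative code points 16#FFFE# and 16#FFFF# applies in every plane, not only the first. A minimal Python sketch of that check follows; the name `is_forbidden_in_program_text` is hypothetical and the function covers only this one rule, not the category restrictions.

```python
def is_forbidden_in_program_text(code_point: int) -> bool:
    """True if the relative code point in its plane is 16#FFFE# or 16#FFFF#.

    Such characters are not allowed anywhere in the text of a program,
    including inside comments.
    """
    relative = code_point % 65536  # position within the plane
    return relative in (0xFFFE, 0xFFFF)
```

Under this sketch, both 16#FFFE# (plane 0) and 16#1FFFF# (plane 1) are rejected, while 16#FFFD# is not.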
4.1/5
 The semantics of an Ada program whose text is not in Normalization Form C (as defined by Clause 22 of ISO/IEC 10646:2020) is implementation defined. 
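A tool that wants to warn about source text outside Normalization Form C could test for it roughly as follows. This is a sketch using Python's standard `unicodedata` module (Python 3.8 or later for `is_normalized`); the helper name `is_nfc` is hypothetical, and the Unicode data bundled with a given Python release may not correspond exactly to ISO/IEC 10646:2020.

```python
import unicodedata


def is_nfc(program_text: str) -> bool:
    """True if the text is already in Normalization Form C."""
    return unicodedata.is_normalized("NFC", program_text)


def to_nfc(program_text: str) -> str:
    """Return the text converted to Normalization Form C."""
    return unicodedata.normalize("NFC", program_text)
```

An identifier written with a precomposed "é" (16#E9#) passes the check; the same identifier written as "e" followed by COMBINING ACUTE ACCENT (16#301#) does not, and `to_nfc` converts the latter into the former.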
5/5
The description of the language definition in this document uses the character properties General Category, Simple Uppercase Mapping, Uppercase Mapping, and Special Case Condition of the documents referenced by Clause 2 of ISO/IEC 10646:2020. The actual set of graphic symbols used by an implementation for the visual representation of the text of an Ada program is not specified.
6/3
Characters are categorized as follows: 
7/2
This paragraph was deleted.
8/2
letter_uppercase

Any character whose General Category is defined to be “Letter, Uppercase”.
9/2
letter_lowercase

Any character whose General Category is defined to be “Letter, Lowercase”. 
9.1/2
 letter_titlecase

Any character whose General Category is defined to be “Letter, Titlecase”.
9.2/2
 letter_modifier

Any character whose General Category is defined to be “Letter, Modifier”.
9.3/2
 letter_other

Any character whose General Category is defined to be “Letter, Other”.
9.4/2
 mark_non_spacing

Any character whose General Category is defined to be “Mark, Non-Spacing”.
9.5/2
 mark_spacing_combining

Any character whose General Category is defined to be “Mark, Spacing Combining”.
10/2
number_decimal

Any character whose General Category is defined to be “Number, Decimal”.
10.1/2
  number_letter

Any character whose General Category is defined to be “Number, Letter”.
10.2/2
  punctuation_connector

Any character whose General Category is defined to be “Punctuation, Connector”.
10.3/2
  other_format

Any character whose General Category is defined to be “Other, Format”.
11/2
separator_space

Any character whose General Category is defined to be “Separator, Space”.
12/2
separator_line

Any character whose General Category is defined to be “Separator, Line”.
12.1/2
  separator_paragraph

Any character whose General Category is defined to be “Separator, Paragraph”.
13/3
format_effector

The characters whose code points are 16#09# (CHARACTER TABULATION), 16#0A# (LINE FEED), 16#0B# (LINE TABULATION), 16#0C# (FORM FEED), 16#0D# (CARRIAGE RETURN), 16#85# (NEXT LINE), and the characters in categories separator_line and separator_paragraph.
13.1/2
  other_control

Any character whose General Category is defined to be “Other, Control”, and which is not defined to be a format_effector.
13.2/2
  other_private_use

Any character whose General Category is defined to be “Other, Private Use”.
13.3/2
  other_surrogate

Any character whose General Category is defined to be “Other, Surrogate”.
14/3
graphic_character

Any character that is not in the categories other_control, other_private_use, other_surrogate, format_effector, and whose relative code point in its plane is neither 16#FFFE# nor 16#FFFF#. 
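The categorization above can be approximated in a few lines of Python, shown here only as an informal sketch. It relies on `unicodedata.category`, whose two-letter codes (`Lu`, `Ll`, and so on) abbreviate the General Category names quoted above; the function name `ada_categories` and the table `GENERAL_CATEGORY_NAMES` are hypothetical, and the result reflects the Unicode version bundled with the Python release in use, not necessarily ISO/IEC 10646:2020.

```python
import unicodedata

# Two-letter General Category codes mapped to the category names used above.
GENERAL_CATEGORY_NAMES = {
    "Lu": "letter_uppercase", "Ll": "letter_lowercase",
    "Lt": "letter_titlecase", "Lm": "letter_modifier",
    "Lo": "letter_other", "Mn": "mark_non_spacing",
    "Mc": "mark_spacing_combining", "Nd": "number_decimal",
    "Nl": "number_letter", "Pc": "punctuation_connector",
    "Cf": "other_format", "Zs": "separator_space",
    "Zl": "separator_line", "Zp": "separator_paragraph",
    "Cc": "other_control", "Co": "other_private_use",
    "Cs": "other_surrogate",
}

# Code points listed explicitly in the definition of format_effector.
FORMAT_EFFECTOR_CODE_POINTS = {0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x85}

NOT_GRAPHIC = {"other_control", "other_private_use",
               "other_surrogate", "format_effector"}


def ada_categories(ch: str) -> set[str]:
    """Return every category name from the list above that applies to ch."""
    code_point = ord(ch)
    general_category = unicodedata.category(ch)
    categories = set()

    is_format_effector = (code_point in FORMAT_EFFECTOR_CODE_POINTS
                          or general_category in ("Zl", "Zp"))
    if is_format_effector:
        categories.add("format_effector")

    name = GENERAL_CATEGORY_NAMES.get(general_category)
    if name == "other_control":
        # other_control excludes characters defined to be format_effectors.
        if not is_format_effector:
            categories.add(name)
    elif name is not None:
        categories.add(name)

    # graphic_character: none of the excluded categories, and not FFFE/FFFF.
    if (not categories & NOT_GRAPHIC
            and code_point % 65536 not in (0xFFFE, 0xFFFF)):
        categories.add("graphic_character")
    return categories
```

Because graphic_character is defined by exclusion, a character can belong to it and to a General Category based category at once: under this sketch, `ada_categories("A")` contains both letter_uppercase and graphic_character, while a CHARACTER TABULATION is only a format_effector.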
15/5
The following names are used when referring to certain characters (the first name is that given in ISO/IEC 10646:2020):
  graphic symbol   name                       graphic symbol   name

        "          quotation mark                   :          colon
        #          number sign                      ;          semicolon
        &          ampersand                        <          less-than sign
        '          apostrophe, tick                 =          equals sign
        (          left parenthesis                 >          greater-than sign
        )          right parenthesis                _          low line, underline
        *          asterisk, multiply               |          vertical line
        +          plus sign                        /          solidus, divide
        ,          comma                            !          exclamation point
        -          hyphen-minus, minus              %          percent sign
        .          full stop, dot, point            [          left square bracket
        @          commercial at, at sign           ]          right square bracket

Implementation Requirements

16/3
An Ada implementation shall accept Ada source code in UTF-8 encoding, with or without a BOM (see A.4.11), where every character is represented by its code point. The character pair CARRIAGE RETURN/LINE FEED (code points 16#0D# 16#0A#) signifies a single end of line (see 2.2); every other occurrence of a format_effector other than the character whose code point position is 16#09# (CHARACTER TABULATION) also signifies a single end of line.
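The end-of-line rule above can be sketched in Python as follows. The function names `is_end_of_line` and `split_ada_lines` are hypothetical; the `utf-8-sig` codec is used because it accepts UTF-8 input both with and without a leading BOM. This illustrates only the rule stated here and is not a complete treatment of the lexical rules in 2.2.

```python
import unicodedata

# format_effector code points other than CHARACTER TABULATION (16#09#).
LINE_ENDING_CODE_POINTS = {0x0A, 0x0B, 0x0C, 0x0D, 0x85}


def is_end_of_line(ch: str) -> bool:
    """True for any format_effector other than CHARACTER TABULATION."""
    return (ord(ch) in LINE_ENDING_CODE_POINTS
            or unicodedata.category(ch) in ("Zl", "Zp"))


def split_ada_lines(source: bytes) -> list[str]:
    """Decode UTF-8 source (BOM optional) and split it into lines."""
    text = source.decode("utf-8-sig")  # strips a BOM if one is present
    lines, current = [], []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\r" and text[i + 1:i + 2] == "\n":
            # CARRIAGE RETURN/LINE FEED pair: a single end of line.
            lines.append("".join(current))
            current = []
            i += 2
        elif is_end_of_line(ch):
            # Every other such format_effector is also one end of line.
            lines.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    if current:
        lines.append("".join(current))
    return lines
```

With this sketch, the byte sequence for `a;` CR LF `b;` LF `c;` CR `d;` splits into four lines, and a CHARACTER TABULATION stays inside its line.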

Implementation Permissions

17/3
The categories defined above, as well as case mapping and folding, may be based on an implementation-defined version of ISO/IEC 10646 (2003 edition or later). 
18/2
NOTE   The characters in categories other_control, other_private_use, and other_surrogate are only allowed in comments.

Ada 2005 and 2012 Editions sponsored in part by Ada-Europe