Lexical structure

Programs

A C# program consists of one or more source files, known formally as compilation units (Compilation units). A source file is an ordered sequence of Unicode characters. Source files typically have a one-to-one correspondence with files in a file system, but this correspondence is not required. For maximal portability, it is recommended that files in a file system be encoded with the UTF-8 encoding.

Conceptually speaking, a program is compiled using three steps:

  1. Transformation, which converts a file from a particular character repertoire and encoding scheme into a sequence of Unicode characters.
  2. Lexical analysis, which translates a stream of Unicode input characters into a stream of tokens.
  3. Syntactic analysis, which translates the stream of tokens into executable code.

Grammars

This specification presents the syntax of the C# programming language using two grammars. The lexical grammar (Lexical grammar) defines how Unicode characters are combined to form line terminators, white space, comments, tokens, and pre-processing directives. The syntactic grammar (Syntactic grammar) defines how the tokens resulting from the lexical grammar are combined to form C# programs.

Grammar notation

The lexical and syntactic grammars are presented in Backus-Naur form using the notation of the ANTLR grammar tool.

Lexical grammar

The lexical grammar of C# is presented in Lexical analysis, Tokens, and Pre-processing directives. The terminal symbols of the lexical grammar are the characters of the Unicode character set, and the lexical grammar specifies how characters are combined to form tokens (Tokens), white space (White space), comments (Comments), and pre-processing directives (Pre-processing directives).

Every source file in a C# program must conform to the input production of the lexical grammar (Lexical analysis).

Syntactic grammar

The syntactic grammar of C# is presented in the chapters and appendices that follow this chapter. The terminal symbols of the syntactic grammar are the tokens defined by the lexical grammar, and the syntactic grammar specifies how tokens are combined to form C# programs.

Every source file in a C# program must conform to the compilation_unit production of the syntactic grammar (Compilation units).

Lexical analysis

The input production defines the lexical structure of a C# source file. Each source file in a C# program must conform to this lexical grammar production.

input
    : input_section?
    ;

input_section
    : input_section_part+
    ;

input_section_part
    : input_element* new_line
    | pp_directive
    ;

input_element
    : whitespace
    | comment
    | token
    ;

Five basic elements make up the lexical structure of a C# source file: line terminators (Line terminators), white space (White space), comments (Comments), tokens (Tokens), and pre-processing directives (Pre-processing directives). Of these basic elements, only tokens are significant in the syntactic grammar of a C# program (Syntactic grammar).

The lexical processing of a C# source file consists of reducing the file into a sequence of tokens, which becomes the input to the syntactic analysis. Line terminators, white space, and comments can serve to separate tokens, and pre-processing directives can cause sections of the source file to be skipped, but otherwise these lexical elements have no impact on the syntactic structure of a C# program.

In the case of interpolated string literals (Interpolated string literals), a single token is initially produced by lexical analysis, but is broken up into several input elements which are repeatedly subjected to lexical analysis until all interpolated string literals have been resolved. The resulting tokens then serve as input to the syntactic analysis.

When several lexical grammar productions match a sequence of characters in a source file, the lexical processing always forms the longest possible lexical element. For example, the character sequence // is processed as the beginning of a single-line comment because that lexical element is longer than a single / token.
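The longest-match rule applies to operators as well as comments. The following is a minimal sketch (the class and method names are illustrative, not part of the specification) showing that the character sequence +++ is split into the tokens ++ and +, never + and ++:

```csharp
using System;

static class LongestMatchDemo
{
    // Returns { value of the expression, value of i afterwards }.
    public static int[] Evaluate()
    {
        int i = 1, j = 10;
        // Lexed as the tokens  i  ++  +  j  because "++" is the longest
        // lexical element that can be formed at that position.
        int result = i+++j;
        return new[] { result, i };
    }

    static void Main()
    {
        int[] r = Evaluate();
        Console.WriteLine(r[0]); // 11 (1 + 10, using the value of i before the increment)
        Console.WriteLine(r[1]); // 2  (i was post-incremented)
    }
}
```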

Line terminators

Line terminators divide the characters of a C# source file into lines.

new_line
    : '<Carriage return character (U+000D)>'
    | '<Line feed character (U+000A)>'
    | '<Carriage return character (U+000D) followed by line feed character (U+000A)>'
    | '<Next line character (U+0085)>'
    | '<Line separator character (U+2028)>'
    | '<Paragraph separator character (U+2029)>'
    ;

For compatibility with source code editing tools that add end-of-file markers, and to enable a source file to be viewed as a sequence of properly terminated lines, the following transformations are applied, in order, to every source file in a C# program:

  • If the last character of the source file is a Control-Z character (U+001A), this character is deleted.
  • A carriage-return character (U+000D) is added to the end of the source file if that source file is non-empty and if the last character of the source file is not a carriage return (U+000D), a line feed (U+000A), a line separator (U+2028), or a paragraph separator (U+2029).
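The two transformations can be sketched as a small helper. This is an illustration only: the SourceNormalizer class and NormalizeEnd method are hypothetical names, and a real compiler would operate on its own input buffer rather than on a string.

```csharp
using System;

static class SourceNormalizer
{
    // Hypothetical helper: applies the two end-of-file transformations, in order.
    public static string NormalizeEnd(string source)
    {
        // 1. Delete a trailing Control-Z character (U+001A).
        if (source.Length > 0 && source[source.Length - 1] == '\u001A')
            source = source.Substring(0, source.Length - 1);

        // 2. Append a carriage return (U+000D) unless the file is empty or
        //    already ends with CR, LF, line separator, or paragraph separator.
        if (source.Length > 0)
        {
            char last = source[source.Length - 1];
            bool terminated = last == '\r' || last == '\n'
                           || last == '\u2028' || last == '\u2029';
            if (!terminated)
                source += '\r';
        }
        return source;
    }

    static void Main()
    {
        Console.WriteLine(NormalizeEnd("class C {}\u001A") == "class C {}\r"); // True
        Console.WriteLine(NormalizeEnd("class C {}\n") == "class C {}\n");     // True
        Console.WriteLine(NormalizeEnd("") == "");                             // True
    }
}
```

Because the steps run in order, a file consisting only of Control-Z becomes empty after the first step and therefore receives no carriage return.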

Comments

Two forms of comments are supported: single-line comments and delimited comments. Single-line comments start with the characters // and extend to the end of the source line. Delimited comments start with the characters /* and end with the characters */. Delimited comments may span multiple lines.

comment
    : single_line_comment
    | delimited_comment
    ;

single_line_comment
    : '//' input_character*
    ;

input_character
    : '<Any Unicode character except a new_line_character>'
    ;

new_line_character
    : '<Carriage return character (U+000D)>'
    | '<Line feed character (U+000A)>'
    | '<Next line character (U+0085)>'
    | '<Line separator character (U+2028)>'
    | '<Paragraph separator character (U+2029)>'
    ;

delimited_comment
    : '/*' delimited_comment_section* asterisk+ '/'
    ;

delimited_comment_section
    : '/'
    | asterisk* not_slash_or_asterisk
    ;

asterisk
    : '*'
    ;

not_slash_or_asterisk
    : '<Any Unicode character except / or *>'
    ;

Comments do not nest. The character sequences /* and */ have no special meaning within a // comment, and the character sequences // and /* have no special meaning within a delimited comment.

Comments are not processed within character and string literals.
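As a brief illustration of these rules (the class and method names below are illustrative only), comment delimiters lose their meaning inside the other comment form and inside string literals:

```csharp
using System;

static class CommentDemo
{
    public static string Text()
    {
        /* A delimited comment: the // inside it has no special meaning. */
        // A single-line comment: the /* inside it does not open a delimited comment.
        string s = "/* not a comment */ // also not a comment";
        return s;
    }

    static void Main()
    {
        // The string literal keeps the comment-like characters as ordinary text.
        Console.WriteLine(Text()); // /* not a comment */ // also not a comment
    }
}
```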

The example

/* Hello, world program
   This program writes "hello, world" to the console
*/
class Hello
{
    static void Main() {
        System.Console.WriteLine("hello, world");
    }
}

includes a delimited comment.

The example

// Hello, world program
// This program writes "hello, world" to the console
//
class Hello // any name will do for this class
{
    static void Main() { // this method must be named "Main"
        System.Console.WriteLine("hello, world");
    }
}

shows several single-line comments.

White space

White space is defined as any character with Unicode class Zs (which includes the space character) as well as the horizontal tab character, the vertical tab character, and the form feed character.

whitespace
    : '<Any character with Unicode class Zs>'
    | '<Horizontal tab character (U+0009)>'
    | '<Vertical tab character (U+000B)>'
    | '<Form feed character (U+000C)>'
    ;

Tokens

There are several kinds of tokens: identifiers, keywords, literals, operators, and punctuators. White space and comments are not tokens, though they act as separators for tokens.

token
    : identifier
    | keyword
    | integer_literal
    | real_literal
    | character_literal
    | string_literal
    | interpolated_string_literal
    | operator_or_punctuator
    ;

Unicode character escape sequences

A Unicode character escape sequence represents a Unicode character. Unicode character escape sequences are processed in identifiers (Identifiers), character literals (Character literals), and regular string literals (String literals). A Unicode character escape is not processed in any other location (for example, to form an operator, punctuator, or keyword).

unicode_escape_sequence
    : '\\u' hex_digit hex_digit hex_digit hex_digit
    | '\\U' hex_digit hex_digit hex_digit hex_digit hex_digit hex_digit hex_digit hex_digit
    ;

A Unicode escape sequence represents the single Unicode character formed by the hexadecimal number following the "\u" or "\U" characters. Since C# uses a 16-bit encoding of Unicode code points in characters and string values, a Unicode character in the range U+10000 to U+10FFFF is not permitted in a character literal and is represented using a Unicode surrogate pair in a string literal. Unicode characters with code points above 0x10FFFF are not supported.

Multiple translations are not performed. For instance, the string literal "\u005Cu005C" is equivalent to "\u005C" rather than "\". The Unicode value \u005C is the character "\".
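Both points can be checked with string lengths. The following is a small sketch with illustrative names; U+1F600 is used here only as an arbitrary example of a code point above U+FFFF.

```csharp
using System;

static class UnicodeEscapeDemo
{
    // "\u005C" becomes a backslash; the following "u005C" is NOT re-scanned
    // as an escape, so the string is a backslash plus the 5 characters u005C.
    public static int SingleTranslationLength() => "\u005Cu005C".Length;

    // A code point above U+FFFF is stored as a surrogate pair (two char values).
    public static int SupplementaryLength() => "\U0001F600".Length;

    static void Main()
    {
        Console.WriteLine(SingleTranslationLength());               // 6
        Console.WriteLine(SupplementaryLength());                   // 2
        Console.WriteLine(char.IsSurrogatePair("\U0001F600", 0));   // True
    }
}
```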

The example

class Class1
{
    static void Test(bool \u0066) {
        char c = '\u0066';
        if (\u0066)
            System.Console.WriteLine(c.ToString());
    }        
}

shows several uses of \u0066, which is the escape sequence for the letter "f". The program is equivalent to

class Class1
{
    static void Test(bool f) {
        char c = 'f';
        if (f)
            System.Console.WriteLine(c.ToString());
    }        
}

Identifiers

The rules for identifiers given in this section correspond exactly to those recommended by the Unicode Standard Annex 31, except that underscore is allowed as an initial character (as is traditional in the C programming language), Unicode escape sequences are permitted in identifiers, and the "@" character is allowed as a prefix to enable keywords to be used as identifiers.

identifier
    : available_identifier
    | '@' identifier_or_keyword
    ;

available_identifier
    : '<An identifier_or_keyword that is not a keyword>'
    ;

identifier_or_keyword
    : identifier_start_character identifier_part_character*
    ;

identifier_start_character
    : letter_character
    | '_'
    ;

identifier_part_character
    : letter_character
    | decimal_digit_character
    | connecting_character
    | combining_character
    | formatting_character
    ;

letter_character
    : '<A Unicode character of classes Lu, Ll, Lt, Lm, Lo, or Nl>'
    | '<A unicode_escape_sequence representing a character of classes Lu, Ll, Lt, Lm, Lo, or Nl>'
    ;

combining_character
    : '<A Unicode character of classes Mn or Mc>'
    | '<A unicode_escape_sequence representing a character of classes Mn or Mc>'
    ;

decimal_digit_character
    : '<A Unicode character of the class Nd>'
    | '<A unicode_escape_sequence representing a character of the class Nd>'
    ;

connecting_character
    : '<A Unicode character of the class Pc>'
    | '<A unicode_escape_sequence representing a character of the class Pc>'
    ;

formatting_character
    : '<A Unicode character of the class Cf>'
    | '<A unicode_escape_sequence representing a character of the class Cf>'
    ;

For information on the Unicode character classes mentioned above, see The Unicode Standard, Version 3.0, section 4.5.

Examples of valid identifiers include "identifier1", "_identifier2", and "@if".

An identifier in a conforming program must be in the canonical format defined by Unicode Normalization Form C, as defined by Unicode Standard Annex 15. The behavior when encountering an identifier not in Normalization Form C is implementation-defined; however, a diagnostic is not required.

The prefix "@" enables the use of keywords as identifiers, which is useful when interfacing with other programming languages. The character @ is not actually part of the identifier, so the identifier might be seen in other languages as a normal identifier, without the prefix. An identifier with an @ prefix is called a verbatim identifier. Use of the @ prefix for identifiers that are not keywords is permitted, but strongly discouraged as a matter of style.

The example:

class @class
{
    public static void @static(bool @bool) {
        if (@bool)
            System.Console.WriteLine("true");
        else
            System.Console.WriteLine("false");
    }    
}

class Class1
{
    static void M() {
        cl\u0061ss.st\u0061tic(true);
    }
}

defines a class named "class" with a static method named "static" that takes a parameter named "bool". Note that since Unicode escapes are not permitted in keywords, the token "cl\u0061ss" is an identifier, and is the same identifier as "@class".

Two identifiers are considered the same if they are identical after the following transformations are applied, in order:

  • The prefix "@", if used, is removed.
  • Each unicode_escape_sequence is transformed into its corresponding Unicode character.
  • Any formatting_characters are removed.
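A short sketch of the first two transformations in action (the names are illustrative; the @ prefix is applied to a non-keyword here purely for demonstration, which the specification permits but discourages as a matter of style):

```csharp
using System;

static class IdentifierEqualityDemo
{
    public static int Demo()
    {
        int @total = 1;     // verbatim identifier; the variable is named "total"
        tot\u0061l += 41;   // \u0061 is 'a', so this spelling also names "total"
        return total;       // all three spellings refer to the same variable
    }

    static void Main()
    {
        Console.WriteLine(Demo()); // 42
    }
}
```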

Identifiers containing two consecutive underscore characters (U+005F) are reserved for use by the implementation. For example, an implementation might provide extended keywords that begin with two underscores.

Keywords

A keyword is an identifier-like sequence of characters that is reserved, and cannot be used as an identifier except when prefaced by the @ character.

keyword
    : 'abstract' | 'as'       | 'base'       | 'bool'      | 'break'
    | 'byte'     | 'case'     | 'catch'      | 'char'      | 'checked'
    | 'class'    | 'const'    | 'continue'   | 'decimal'   | 'default'
    | 'delegate' | 'do'       | 'double'     | 'else'      | 'enum'
    | 'event'    | 'explicit' | 'extern'     | 'false'     | 'finally'
    | 'fixed'    | 'float'    | 'for'        | 'foreach'   | 'goto'
    | 'if'       | 'implicit' | 'in'         | 'int'       | 'interface'
    | 'internal' | 'is'       | 'lock'       | 'long'      | 'namespace'
    | 'new'      | 'null'     | 'object'     | 'operator'  | 'out'
    | 'override' | 'params'   | 'private'    | 'protected' | 'public'
    | 'readonly' | 'ref'      | 'return'     | 'sbyte'     | 'sealed'
    | 'short'    | 'sizeof'   | 'stackalloc' | 'static'    | 'string'
    | 'struct'   | 'switch'   | 'this'       | 'throw'     | 'true'
    | 'try'      | 'typeof'   | 'uint'       | 'ulong'     | 'unchecked'
    | 'unsafe'   | 'ushort'   | 'using'      | 'virtual'   | 'void'
    | 'volatile' | 'while'
    ;

In some places in the grammar, specific identifiers have special meaning, but are not keywords. Such identifiers are sometimes referred to as "contextual keywords". For example, within a property declaration, the "get" and "set" identifiers have special meaning (Accessors). An identifier other than get or set is never permitted in these locations, so this use does not conflict with a use of these words as identifiers. In other cases, such as with the identifier "var" in implicitly typed local variable declarations (Local variable declarations), a contextual keyword can conflict with declared names. In such cases, the declared name takes precedence over the use of the identifier as a contextual keyword.
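Because contextual keywords are not reserved, they remain usable as ordinary identifiers. A minimal sketch, with illustrative class and member names:

```csharp
using System;

static class ContextualKeywordDemo
{
    private static int stored;

    public static int Value
    {
        get { return stored; }    // "get" and "set" are special only in accessor position
        set { stored = value; }
    }

    public static int Sum()
    {
        // Outside a property declaration these are plain identifiers.
        int get = 1, set = 2, var = 3;
        Value = get + set + var;
        return Value;
    }

    static void Main()
    {
        Console.WriteLine(Sum()); // 6
    }
}
```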

Literals

A literal is a source code representation of a value.

literal
    : boolean_literal
    | integer_literal
    | real_literal
    | character_literal
    | string_literal
    | null_literal
    ;

Boolean literals

There are two boolean literal values: true and false.

boolean_literal
    : 'true'
    | 'false'
    ;

The type of a boolean_literal is bool.

Integer literals

Integer literals are used to write values of types int, uint, long, and ulong. Integer literals have two possible forms: decimal and hexadecimal.

integer_literal
    : decimal_integer_literal
    | hexadecimal_integer_literal
    ;

decimal_integer_literal
    : decimal_digit+ integer_type_suffix?
    ;

decimal_digit
    : '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
    ;

integer_type_suffix
    : 'U' | 'u' | 'L' | 'l' | 'UL' | 'Ul' | 'uL' | 'ul' | 'LU' | 'Lu' | 'lU' | 'lu'
    ;

hexadecimal_integer_literal
    : '0x' hex_digit+ integer_type_suffix?
    | '0X' hex_digit+ integer_type_suffix?
    ;

hex_digit
    : '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
    | 'A' | 'B' | 'C' | 'D' | 'E' | 'F' | 'a' | 'b' | 'c' | 'd' | 'e' | 'f'
    ;

The type of an integer literal is determined as follows:

  • If the literal has no suffix, it has the first of these types in which its value can be represented: int, uint, long, ulong.
  • If the literal is suffixed by U or u, it has the first of these types in which its value can be represented: uint, ulong.
  • If the literal is suffixed by L or l, it has the first of these types in which its value can be represented: long, ulong.
  • If the literal is suffixed by UL, Ul, uL, ul, LU, Lu, lU, or lu, it is of type ulong.

If the value represented by an integer literal is outside the range of the ulong type, a compile-time error occurs.
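The rules can be observed through type inference. In this sketch, StaticType is a hypothetical helper that reports the compile-time type inferred for its argument; it is not part of the specification or of .NET.

```csharp
using System;

static class IntegerLiteralTypes
{
    // Hypothetical helper: T is inferred from the literal's compile-time type.
    public static string StaticType<T>(T value) => typeof(T).Name;

    static void Main()
    {
        // No suffix: first of int, uint, long, ulong that can hold the value.
        Console.WriteLine(StaticType(2147483647));          // Int32
        Console.WriteLine(StaticType(2147483648));          // UInt32
        Console.WriteLine(StaticType(4294967296));          // Int64
        Console.WriteLine(StaticType(9223372036854775808)); // UInt64

        // U suffix: first of uint, ulong.
        Console.WriteLine(StaticType(1U));                  // UInt32
        Console.WriteLine(StaticType(4294967296U));         // UInt64

        // L suffix: first of long, ulong. UL suffix: always ulong.
        Console.WriteLine(StaticType(1L));                  // Int64
        Console.WriteLine(StaticType(1UL));                 // UInt64

        // Hexadecimal literals follow the same rules.
        Console.WriteLine(StaticType(0x7FFFFFFF));          // Int32
        Console.WriteLine(StaticType(0xFFFFFFFF));          // UInt32
    }
}
```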

As a matter of style, it is suggested that "L" be used instead of "l" when writing literals of type long, since it is easy to confuse the letter "l" with the digit "1".

To permit the smallest possible int and long values to be written as decimal integer literals, the following two rules exist:

  • When a decimal_integer_literal with the value 2147483648 (2^31) and no integer_type_suffix appears as the token immediately following a unary minus operator token (Unary minus operator), the result is a constant of type int with the value -2147483648 (-2^31). In all other situations, such a decimal_integer_literal is of type uint.
  • When a decimal_integer_literal with the value 9223372036854775808 (2^63) and no integer_type_suffix or the integer_type_suffix L or l appears as the token immediately following a unary minus operator token (Unary minus operator), the result is a constant of type long with the value -9223372036854775808 (-2^63). In all other situations, such a decimal_integer_literal is of type ulong.
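A brief sketch of these two special cases, again using a hypothetical StaticType helper that reports the inferred compile-time type of its argument:

```csharp
using System;

static class MinValueLiterals
{
    // Hypothetical helper: T is inferred from the argument's compile-time type.
    public static string StaticType<T>(T value) => typeof(T).Name;

    static void Main()
    {
        // Directly after a unary minus, 2147483648 yields an int constant.
        Console.WriteLine(StaticType(-2147483648));              // Int32
        Console.WriteLine(-2147483648 == int.MinValue);          // True

        // Likewise, 9223372036854775808 after a unary minus yields a long constant.
        Console.WriteLine(StaticType(-9223372036854775808));     // Int64
        Console.WriteLine(-9223372036854775808 == long.MinValue); // True

        // In any other position the same literals are uint and ulong.
        Console.WriteLine(StaticType(2147483648));               // UInt32
        Console.WriteLine(StaticType(9223372036854775808));      // UInt64
    }
}
```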

Real literals

Real literals are used to write values of types float, double, and decimal.

real_literal
    : decimal_digit+ '.' decimal_digit+ exponent_part? real_type_suffix?
    | '.' decimal_digit+ exponent_part? real_type_suffix?
    | decimal_digit+ exponent_part real_type_suffix?
    | decimal_digit+ real_type_suffix
    ;

exponent_part
    : 'e' sign? decimal_digit+
    | 'E' sign? decimal_digit+
    ;

sign
    : '+'
    | '-'
    ;

real_type_suffix
    : 'F' | 'f' | 'D' | 'd' | 'M' | 'm'
    ;

If no real_type_suffix is specified, the type of the real literal is double. Otherwise, the real type suffix determines the type of the real literal, as follows:

  • A real literal suffixed by F or f is of type float. For example, the literals 1f, 1.5f, 1e10f, and 123.456F are all of type float.
  • A real literal suffixed by D or d is of type double. For example, the literals 1d, 1.5d, 1e10d, and 123.456D are all of type double.
  • A real literal suffixed by M or m is of type decimal. For example, the literals 1m, 1.5m, 1e10m, and 123.456M are all of type decimal. This literal is converted to a decimal value by taking the exact value, and, if necessary, rounding to the nearest representable value using banker's rounding (The decimal type). Any scale apparent in the literal is preserved unless the value is rounded or the value is zero (in which latter case the sign and scale will be 0). Hence, the literal 2.900m will be parsed to form the decimal with sign 0, coefficient 2900, and scale 3.

If the specified literal cannot be represented in the indicated type, a compile-time error occurs.

The value of a real literal of type float or double is determined by using the IEEE "round to nearest" mode.

Note that in a real literal, decimal digits are always required after the decimal point. For example, 1.3F is a real literal but 1.F is not.
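The suffix rules and the preserved scale of 2.900m can be sketched as follows. StaticType is a hypothetical helper; decimal.GetBits is the .NET method that exposes the coefficient (elements 0 to 2) and the scale (bits 16 to 23 of element 3) of a decimal value.

```csharp
using System;

static class RealLiteralTypes
{
    // Hypothetical helper: T is inferred from the literal's compile-time type.
    public static string StaticType<T>(T value) => typeof(T).Name;

    // Extracts the scale (number of decimal places) stored in a decimal.
    public static int Scale(decimal d) => (decimal.GetBits(d)[3] >> 16) & 0xFF;

    static void Main()
    {
        Console.WriteLine(StaticType(1.5));    // Double (no suffix)
        Console.WriteLine(StaticType(1e10));   // Double
        Console.WriteLine(StaticType(1f));     // Single (float)
        Console.WriteLine(StaticType(1d));     // Double
        Console.WriteLine(StaticType(1m));     // Decimal

        // The trailing zeros of 2.900m are kept: coefficient 2900, scale 3.
        Console.WriteLine(decimal.GetBits(2.900m)[0]); // 2900
        Console.WriteLine(Scale(2.900m));              // 3
        Console.WriteLine(Scale(2.9m));                // 1
    }
}
```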

Character literals

A character literal represents a single character, and usually consists of a character in quotes, as in 'a'.

Note: The ANTLR grammar notation makes the following confusing. In ANTLR, \' stands for a single quote ' and \\ stands for a single backslash \. The first rule for a character literal therefore means that it starts with a single quote, then a character, then a single quote. The eleven possible simple escape sequences are \', \", \\, \0, \a, \b, \f, \n, \r, \t, \v.

character_literal
    : '\'' character '\''
    ;

character
    : single_character
    | simple_escape_sequence
    | hexadecimal_escape_sequence
    | unicode_escape_sequence
    ;

single_character
    : '<Any character except \' (U+0027), \\ (U+005C), and new_line_character>'
    ;

simple_escape_sequence
    : '\\\'' | '\\"' | '\\\\' | '\\0' | '\\a' | '\\b' | '\\f' | '\\n' | '\\r' | '\\t' | '\\v'
    ;

hexadecimal_escape_sequence
    : '\\x' hex_digit hex_digit? hex_digit? hex_digit?
    ;

A character that follows a backslash character (\) in a character must be one of the following characters: ', ", \, 0, a, b, f, n, r, t, u, U, x, v. Otherwise, a compile-time error occurs.

A hexadecimal escape sequence represents a single Unicode character, with the value formed by the hexadecimal number following "\x".

If the value represented by a character literal is greater than U+FFFF, a compile-time error occurs.

A Unicode character escape sequence (Unicode character escape sequences) in a character literal must be in the range U+0000 to U+FFFF.

A simple escape sequence represents a Unicode character encoding, as described in the table below.

Escape sequence    Character name     Unicode encoding
\'                 Single quote       0x0027
\"                 Double quote       0x0022
\\                 Backslash          0x005C
\0                 Null               0x0000
\a                 Alert              0x0007
\b                 Backspace          0x0008
\f                 Form feed          0x000C
\n                 New line           0x000A
\r                 Carriage return    0x000D
\t                 Horizontal tab     0x0009
\v                 Vertical tab       0x000B

The type of a character_literal is char.
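The different escape forms all denote ordinary char values, which can be checked by converting them to their numeric code. A minimal sketch with illustrative names:

```csharp
using System;

static class CharLiteralDemo
{
    // Returns the UTF-16 code unit of a character as an int.
    public static int Code(char c) => c;

    static void Main()
    {
        Console.WriteLine(Code('A'));        // 65
        Console.WriteLine(Code('\x41'));     // 65 (hexadecimal escape)
        Console.WriteLine(Code('\u0041'));   // 65 (Unicode escape)
        Console.WriteLine(Code('\''));       // 39 (simple escape, 0x0027)
        Console.WriteLine(Code('\t'));       // 9  (simple escape, 0x0009)
        Console.WriteLine(Code('\0'));       // 0  (simple escape, 0x0000)
        Console.WriteLine('\x41' == 'A');    // True
    }
}
```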

String literals

C# supports two forms of string literals: regular string literals and verbatim string literals.

A regular string literal consists of zero or more characters enclosed in double quotes, as in "hello", and may include both simple escape sequences (such as \t for the tab character), and hexadecimal and Unicode escape sequences.

A verbatim string literal consists of an @ character followed by a double-quote character, zero or more characters, and a closing double-quote character. A simple example is @"hello". In a verbatim string literal, the characters between the delimiters are interpreted verbatim, the only exception being a quote_escape_sequence. In particular, simple escape sequences, and hexadecimal and Unicode escape sequences are not processed in verbatim string literals. A verbatim string literal may span multiple lines.

string_literal
    : regular_string_literal
    | verbatim_string_literal
    ;

regular_string_literal
    : '"' regular_string_literal_character* '"'
    ;

regular_string_literal_character
    : single_regular_string_literal_character
    | simple_escape_sequence
    | hexadecimal_escape_sequence
    | unicode_escape_sequence
    ;

single_regular_string_literal_character
    : '<Any character except " (U+0022), \\ (U+005C), and new_line_character>'
    ;

verbatim_string_literal
    : '@"' verbatim_string_literal_character* '"'
    ;

verbatim_string_literal_character
    : single_verbatim_string_literal_character
    | quote_escape_sequence
    ;

single_verbatim_string_literal_character
    : '<any character except ">'
    ;

quote_escape_sequence
    : '""'
    ;

A character that follows a backslash character (\) in a regular_string_literal_character must be one of the following characters: ', ", \, 0, a, b, f, n, r, t, u, U, x, v. Otherwise, a compile-time error occurs.

The example

string a = "hello, world";                   // hello, world
string b = @"hello, world";                  // hello, world

string c = "hello \t world";                 // hello      world
string d = @"hello \t world";                // hello \t world

string e = "Joe said \"Hello\" to me";       // Joe said "Hello" to me
string f = @"Joe said ""Hello"" to me";      // Joe said "Hello" to me

string g = "\\\\server\\share\\file.txt";    // \\server\share\file.txt
string h = @"\\server\share\file.txt";       // \\server\share\file.txt

string i = "one\r\ntwo\r\nthree";
string j = @"one
two
three";

shows a variety of string literals. The last string literal, j, is a verbatim string literal that spans multiple lines. The characters between the quotation marks, including white space such as new line characters, are preserved verbatim.

Since a hexadecimal escape sequence can have a variable number of hex digits, the string literal "\x123" contains a single character with hex value 123. To create a string containing the character with hex value 12 followed by the character 3, one could write "\x00123" or "\x12" + "3" instead.
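The difference is easy to confirm by inspecting string lengths. A small sketch, with an illustrative class name:

```csharp
using System;

static class HexEscapeDemo
{
    public static int Greedy() => "\x123".Length;     // one character, U+0123
    public static int Padded() => "\x00123".Length;   // U+0012 followed by '3'

    static void Main()
    {
        Console.WriteLine(Greedy());                          // 1
        Console.WriteLine((int)"\x123"[0] == 0x123);          // True
        Console.WriteLine(Padded());                          // 2
        Console.WriteLine("\x00123" == "\x12" + "3");         // True
    }
}
```

Padding the escape to four hex digits ends it unambiguously, because a hexadecimal_escape_sequence consumes at most four digits.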

The type of a string_literal is string.

Each string literal does not necessarily result in a new string instance. When two or more string literals that are equivalent according to the string equality operator (String equality operators) appear in the same program, these string literals refer to the same string instance. For instance, the output produced by

class Test
{
    static void Main() {
        object a = "hello";
        object b = "hello";
        System.Console.WriteLine(a == b);
    }
}

True 因为两个文本引用相同的字符串实例。is True because the two literals refer to the same string instance.

内插字符串文本Interpolated string literals

内插字符串与字符串文本类似,但包含用 and 分隔的孔,其中的 { } 表达式可以出现。Interpolated string literals are similar to string literals, but contain holes delimited by { and }, wherein expressions can occur. 在运行时,将对表达式进行计算,目的是将其文本窗体替换为发生该洞的位置的字符串。At runtime, the expressions are evaluated with the purpose of having their textual forms substituted into the string at the place where the hole occurs. 字符串内插的语法和语义在 (内 插字符串) 部分中进行了介绍。The syntax and semantics of string interpolation are described in section (Interpolated strings).

与字符串文本一样,内插字符串文本可以是正则为或是原义字符串。Like string literals, interpolated string literals can be either regular or verbatim. 内插正则字符串文本由 $" 和分隔 " ,并由和分隔逐字字符串。 $@" "Interpolated regular string literals are delimited by $" and ", and interpolated verbatim string literals are delimited by $@" and ".

Like other literals, lexical analysis of an interpolated string literal initially results in a single token, as per the grammar below. However, before syntactic analysis, the single token of an interpolated string literal is broken into several tokens for the parts of the string enclosing the holes, and the input elements occurring in the holes are lexically analysed again. This may in turn produce more interpolated string literals to be processed, but, if lexically correct, will eventually lead to a sequence of tokens for syntactic analysis to process.

interpolated_string_literal
    : '$' interpolated_regular_string_literal
    | '$' interpolated_verbatim_string_literal
    ;

interpolated_regular_string_literal
    : interpolated_regular_string_whole
    | interpolated_regular_string_start  interpolated_regular_string_literal_body interpolated_regular_string_end
    ;

interpolated_regular_string_literal_body
    : regular_balanced_text
    | interpolated_regular_string_literal_body interpolated_regular_string_mid regular_balanced_text
    ;

interpolated_regular_string_whole
    : '"' interpolated_regular_string_character* '"'
    ;

interpolated_regular_string_start
    : '"' interpolated_regular_string_character* '{'
    ;

interpolated_regular_string_mid
    : interpolation_format? '}' interpolated_regular_string_characters_after_brace? '{'
    ;

interpolated_regular_string_end
    : interpolation_format? '}' interpolated_regular_string_characters_after_brace? '"'
    ;

interpolated_regular_string_characters_after_brace
    : interpolated_regular_string_character_no_brace
    | interpolated_regular_string_characters_after_brace interpolated_regular_string_character
    ;

interpolated_regular_string_character
    : single_interpolated_regular_string_character
    | simple_escape_sequence
    | hexadecimal_escape_sequence
    | unicode_escape_sequence
    | open_brace_escape_sequence
    | close_brace_escape_sequence
    ;

interpolated_regular_string_character_no_brace
    : '<Any interpolated_regular_string_character except close_brace_escape_sequence and any hexadecimal_escape_sequence or unicode_escape_sequence designating } (U+007D)>'
    ;

single_interpolated_regular_string_character
    : '<Any character except \" (U+0022), \\ (U+005C), { (U+007B), } (U+007D), and new_line_character>'
    ;

open_brace_escape_sequence
    : '{{'
    ;

close_brace_escape_sequence
    : '}}'
    ;
    
regular_balanced_text
    : regular_balanced_text_part+
    ;

regular_balanced_text_part
    : single_regular_balanced_text_character
    | delimited_comment
    | '@' identifier_or_keyword
    | string_literal
    | interpolated_string_literal
    | '(' regular_balanced_text ')'
    | '[' regular_balanced_text ']'
    | '{' regular_balanced_text '}'
    ;
    
single_regular_balanced_text_character
    : '<Any character except / (U+002F), @ (U+0040), \" (U+0022), $ (U+0024), ( (U+0028), ) (U+0029), [ (U+005B), ] (U+005D), { (U+007B), } (U+007D) and new_line_character>'
    | '</ (U+002F), if not directly followed by / (U+002F) or * (U+002A)>'
    ;
    
interpolation_format
    : ':' interpolation_format_character+
    ;
    
interpolation_format_character
    : '<Any character except \" (U+0022), : (U+003A), { (U+007B) and } (U+007D)>'
    ;
    
interpolated_verbatim_string_literal
    : interpolated_verbatim_string_whole
    | interpolated_verbatim_string_start interpolated_verbatim_string_literal_body interpolated_verbatim_string_end
    ;

interpolated_verbatim_string_literal_body
    : verbatim_balanced_text
    | interpolated_verbatim_string_literal_body interpolated_verbatim_string_mid verbatim_balanced_text
    ;
    
interpolated_verbatim_string_whole
    : '@"' interpolated_verbatim_string_character* '"'
    ;
    
interpolated_verbatim_string_start
    : '@"' interpolated_verbatim_string_character* '{'
    ;
    
interpolated_verbatim_string_mid
    : interpolation_format? '}' interpolated_verbatim_string_characters_after_brace? '{'
    ;
    
interpolated_verbatim_string_end
    : interpolation_format? '}' interpolated_verbatim_string_characters_after_brace? '"'
    ;
    
interpolated_verbatim_string_characters_after_brace
    : interpolated_verbatim_string_character_no_brace
    | interpolated_verbatim_string_characters_after_brace interpolated_verbatim_string_character
    ;
    
interpolated_verbatim_string_character
    : single_interpolated_verbatim_string_character
    | quote_escape_sequence
    | open_brace_escape_sequence
    | close_brace_escape_sequence
    ;
    
interpolated_verbatim_string_character_no_brace
    : '<Any interpolated_verbatim_string_character except close_brace_escape_sequence>'
    ;
    
single_interpolated_verbatim_string_character
    : '<Any character except \" (U+0022), { (U+007B) and } (U+007D)>'
    ;
    
verbatim_balanced_text
    : verbatim_balanced_text_part+
    ;

verbatim_balanced_text_part
    : single_verbatim_balanced_text_character
    | comment
    | '@' identifier_or_keyword
    | string_literal
    | interpolated_string_literal
    | '(' verbatim_balanced_text ')'
    | '[' verbatim_balanced_text ']'
    | '{' verbatim_balanced_text '}'
    ;
    
single_verbatim_balanced_text_character
    : '<Any character except / (U+002F), @ (U+0040), \" (U+0022), $ (U+0024), ( (U+0028), ) (U+0029), [ (U+005B), ] (U+005D), { (U+007B) and } (U+007D)>'
    | '</ (U+002F), if not directly followed by / (U+002F) or * (U+002A)>'
    ;

An interpolated_string_literal token is reinterpreted as multiple tokens and other input elements as follows, in order of occurrence in the interpolated_string_literal:

  • Occurrences of the following are reinterpreted as separate individual tokens: the leading $ sign, interpolated_regular_string_whole, interpolated_regular_string_start, interpolated_regular_string_mid, interpolated_regular_string_end, interpolated_verbatim_string_whole, interpolated_verbatim_string_start, interpolated_verbatim_string_mid and interpolated_verbatim_string_end.
  • Occurrences of regular_balanced_text and verbatim_balanced_text between these are reprocessed as an input_section (Lexical analysis) and are reinterpreted as the resulting sequence of input elements. These may in turn include interpolated string literal tokens to be reinterpreted.

Syntactic analysis will recombine the tokens into an interpolated_string_expression (Interpolated strings).
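The reinterpretation described above can be traced by hand on a small literal. The sketch below annotates one interpolated string with the pieces it would be split into under the grammar; the class and method names are hypothetical.

```csharp
static class ReinterpretationDemo
{
    public static string Greet(string first, string second)
    {
        // The single interpolated_string_literal token below is
        // reinterpreted, in order of occurrence, as:
        //   $          the leading $ sign
        //   "Hi {      interpolated_regular_string_start
        //   first      regular_balanced_text, re-lexed as an identifier
        //   } and {    interpolated_regular_string_mid
        //   second     regular_balanced_text, re-lexed as an identifier
        //   }!"        interpolated_regular_string_end
        return $"Hi {first} and {second}!";
    }

    public static string NoHoles()
    {
        // Without holes the literal is the $ sign followed by a single
        // interpolated_regular_string_whole.
        return $"no holes here";
    }
}
```

For example, Greet("a", "b") evaluates to "Hi a and b!".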

Examples TODO

The null literal

null_literal
    : 'null'
    ;

The null_literal can be implicitly converted to a reference type or nullable type.
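A minimal sketch of the two conversions, using hypothetical names, might look like this:

```csharp
static class NullLiteralDemo
{
    public static bool BothAreNull()
    {
        string text = null;   // null converted to a reference type
        int? count = null;    // null converted to a nullable type

        return text == null && !count.HasValue;
    }
}
```

Here BothAreNull() returns true, since neither variable holds a value.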

Operators and punctuators

There are several kinds of operators and punctuators. Operators are used in expressions to describe operations involving one or more operands. For example, the expression a + b uses the + operator to add the two operands a and b. Punctuators are for grouping and separating.

operator_or_punctuator
    : '{'  | '}'  | '['  | ']'  | '('   | ')'  | '.'  | ','  | ':'  | ';'
    | '+'  | '-'  | '*'  | '/'  | '%'   | '&'  | '|'  | '^'  | '!'  | '~'
    | '='  | '<'  | '>'  | '?'  | '??'  | '::' | '++' | '--' | '&&' | '||'
    | '->' | '==' | '!=' | '<=' | '>='  | '+=' | '-=' | '*=' | '/=' | '%='
    | '&=' | '|=' | '^=' | '<<' | '<<=' | '=>'
    ;

right_shift
    : '>>'
    ;

right_shift_assignment
    : '>>='
    ;

In the right_shift and right_shift_assignment productions, unlike other productions in the syntactic grammar, no characters of any kind (not even whitespace) are allowed between the tokens that make them up (> followed by >, or > followed by >=). These productions are treated specially in order to enable the correct handling of type_parameter_list constructs (Type parameters).
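The reason for the special treatment becomes visible when the same two characters appear in both roles. The following sketch uses hypothetical names, and the nested generic type List<List<int>> is only an illustrative choice.

```csharp
using System.Collections.Generic;

static class RightShiftDemo
{
    public static int Halve(int value)
    {
        // Two adjacent > characters form the right_shift operator.
        return value >> 1;
    }

    public static int Quarter(int value)
    {
        // > immediately followed by >= forms right_shift_assignment.
        value >>= 2;
        return value;
    }

    // In List<List<int>> the same two characters instead close two
    // nested generic type argument lists.
    public static int CountAll(List<List<int>> groups)
    {
        int total = 0;
        foreach (List<int> inner in groups)
        {
            total += inner.Count;
        }
        return total;
    }
}
```

Under these definitions Halve(8) is 4 and Quarter(8) is 2.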

Pre-processing directives

The pre-processing directives provide the ability to conditionally skip sections of source files, to report error and warning conditions, and to delineate distinct regions of source code. The term "pre-processing directives" is used only for consistency with the C and C++ programming languages. In C#, there is no separate pre-processing step; pre-processing directives are processed as part of the lexical analysis phase.

pp_directive
    : pp_declaration
    | pp_conditional
    | pp_line
    | pp_diagnostic
    | pp_region
    | pp_pragma
    ;

The following pre-processing directives are available:

  • #define and #undef, which are used to define and undefine, respectively, conditional compilation symbols (Declaration directives).
  • #if, #elif, #else, and #endif, which are used to conditionally skip sections of source code (Conditional compilation directives).
  • #line, which is used to control line numbers emitted for errors and warnings (Line directives).
  • #error and #warning, which are used to issue errors and warnings, respectively (Diagnostic directives).
  • #region and #endregion, which are used to explicitly mark sections of source code (Region directives).
  • #pragma, which is used to specify optional contextual information to the compiler (Pragma directives).

A pre-processing directive always occupies a separate line of source code and always begins with a # character and a pre-processing directive name. White space may occur before the # character and between the # character and the directive name.

A source line containing a #define, #undef, #if, #elif, #else, #endif, #line, or #endregion directive may end with a single-line comment. Delimited comments (the /* */ style of comments) are not permitted on source lines containing pre-processing directives.
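As a small illustration of the layout rules above, the following sketch places leading white space before directives and ends several directive lines with single-line comments. The symbol Verbose and the class name are hypothetical.

```csharp
#define Verbose   // a single-line comment may end a #define line

static class DirectiveLayoutDemo
{
    public static string Mode()
    {
        #if Verbose     // white space may precede the # character
        return "verbose";
        #else           // a comment may also end an #else line
        return "quiet";
        #endif          // and an #endif line
    }
}
```

Because Verbose is defined at the top of the file, Mode() returns "verbose".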

Pre-processing directives are not tokens and are not part of the syntactic grammar of C#. However, pre-processing directives can be used to include or exclude sequences of tokens and can in that way affect the meaning of a C# program. For example, when compiled, the program:

#define A
#undef B

class C
{
#if A
    void F() {}
#else
    void G() {}
#endif

#if B
    void H() {}
#else
    void I() {}
#endif
}

results in the exact same sequence of tokens as the program:

class C
{
    void F() {}
    void I() {}
}

Thus, although the two programs are quite different lexically, they are identical syntactically.

Conditional compilation symbols

The conditional compilation functionality provided by the #if, #elif, #else, and #endif directives is controlled through pre-processing expressions (Pre-processing expressions) and conditional compilation symbols.

conditional_symbol
    : '<Any identifier_or_keyword except true or false>'
    ;

A conditional compilation symbol has two possible states: defined or undefined. At the beginning of the lexical processing of a source file, a conditional compilation symbol is undefined unless it has been explicitly defined by an external mechanism (such as a command-line compiler option). When a #define directive is processed, the conditional compilation symbol named in that directive becomes defined in that source file. The symbol remains defined until an #undef directive for that same symbol is processed, or until the end of the source file is reached. An implication of this is that #define and #undef directives in one source file have no effect on other source files in the same program.

When referenced in a pre-processing expression, a defined conditional compilation symbol has the boolean value true, and an undefined conditional compilation symbol has the boolean value false. There is no requirement that conditional compilation symbols be explicitly declared before they are referenced in pre-processing expressions. Instead, undeclared symbols are simply undefined and thus have the value false.

The name space for conditional compilation symbols is distinct and separate from all other named entities in a C# program. Conditional compilation symbols can only be referenced in #define and #undef directives and in pre-processing expressions.
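The separation of name spaces can be sketched with a symbol and a class that share a name. Both uses of the name Logger below are hypothetical and exist only to show that they do not interact.

```csharp
#define Logger   // a conditional compilation symbol named Logger

// A class with the same name; it is unrelated to the symbol above.
static class Logger
{
    public static string Describe()
    {
#if Logger
        return "symbol Logger is defined";
#else
        return "symbol Logger is undefined";
#endif
    }
}
```

The #if directive consults only the conditional compilation symbol, so Logger.Describe() returns "symbol Logger is defined" regardless of the class declaration.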

Pre-processing expressions

Pre-processing expressions can occur in #if and #elif directives. The operators !, ==, !=, && and || are permitted in pre-processing expressions, and parentheses may be used for grouping.

pp_expression
    : whitespace? pp_or_expression whitespace?
    ;

pp_or_expression
    : pp_and_expression
    | pp_or_expression whitespace? '||' whitespace? pp_and_expression
    ;

pp_and_expression
    : pp_equality_expression
    | pp_and_expression whitespace? '&&' whitespace? pp_equality_expression
    ;

pp_equality_expression
    : pp_unary_expression
    | pp_equality_expression whitespace? '==' whitespace? pp_unary_expression
    | pp_equality_expression whitespace? '!=' whitespace? pp_unary_expression
    ;

pp_unary_expression
    : pp_primary_expression
    | '!' whitespace? pp_unary_expression
    ;

pp_primary_expression
    : 'true'
    | 'false'
    | conditional_symbol
    | '(' whitespace? pp_expression whitespace? ')'
    ;

When referenced in a pre-processing expression, a defined conditional compilation symbol has the boolean value true, and an undefined conditional compilation symbol has the boolean value false.

Evaluation of a pre-processing expression always yields a boolean value. The rules of evaluation for a pre-processing expression are the same as those for a constant expression (Constant expressions), except that the only user-defined entities that can be referenced are conditional compilation symbols.
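To make the evaluation rules concrete, the sketch below evaluates a few pre-processing expressions over the hypothetical symbols A, B, and C, where A and C are defined and B is not.

```csharp
#define A
#define C

static class PpExpressionDemo
{
    public static string Evaluate()
    {
        // (A && B) is (true && false), which is false.
        // (C == true) is (true == true), which is true.
#if (A && B) || (C == true)
        return "first";
#elif !B
        return "second";
#else
        return "third";
#endif
    }

    public static string Compare()
    {
        // A != C is (true != true), which is false, so #else is selected.
#if A != C
        return "different";
#else
        return "same";
#endif
    }
}
```

With these definitions Evaluate() returns "first" and Compare() returns "same". The #elif branch is never examined because the #if expression already yields true.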

Declaration directives

The declaration directives are used to define or undefine conditional compilation symbols.

pp_declaration
    : whitespace? '#' whitespace? 'define' whitespace conditional_symbol pp_new_line
    | whitespace? '#' whitespace? 'undef' whitespace conditional_symbol pp_new_line
    ;

pp_new_line
    : whitespace? single_line_comment? new_line
    ;

The processing of a #define directive causes the given conditional compilation symbol to become defined, starting with the source line that follows the directive. Likewise, the processing of an #undef directive causes the given conditional compilation symbol to become undefined, starting with the source line that follows the directive.

Any #define and #undef directives in a source file must occur before the first token (Tokens) in the source file; otherwise a compile-time error occurs. In intuitive terms, #define and #undef directives must precede any "real code" in the source file.

The example:

#define Enterprise

#if Professional || Enterprise
    #define Advanced
#endif

namespace Megacorp.Data
{
    #if Advanced
    class PivotTable {...}
    #endif
}

is valid because the #define directives precede the first token (the namespace keyword) in the source file.

The following example results in a compile-time error because a #define follows real code:

#define A
namespace N
{
    #define B
    #if B
    class Class1 {}
    #endif
}

A #define may define a conditional compilation symbol that is already defined, without there being any intervening #undef for that symbol. The example below defines a conditional compilation symbol A and then defines it again.

#define A
#define A

An #undef may "undefine" a conditional compilation symbol that is not defined. The example below defines a conditional compilation symbol A and then undefines it twice; although the second #undef has no effect, it is still valid.

#define A
#undef A
#undef A

Conditional compilation directives

The conditional compilation directives are used to conditionally include or exclude portions of a source file.

pp_conditional
    : pp_if_section pp_elif_section* pp_else_section? pp_endif
    ;

pp_if_section
    : whitespace? '#' whitespace? 'if' whitespace pp_expression pp_new_line conditional_section?
    ;

pp_elif_section
    : whitespace? '#' whitespace? 'elif' whitespace pp_expression pp_new_line conditional_section?
    ;

pp_else_section
    : whitespace? '#' whitespace? 'else' pp_new_line conditional_section?
    ;

pp_endif
    : whitespace? '#' whitespace? 'endif' pp_new_line
    ;

conditional_section
    : input_section
    | skipped_section
    ;

skipped_section
    : skipped_section_part+
    ;

skipped_section_part
    : skipped_characters? new_line
    | pp_directive
    ;

skipped_characters
    : whitespace? not_number_sign input_character*
    ;

not_number_sign
    : '<Any input_character except #>'
    ;

As indicated by the syntax, conditional compilation directives must be written as sets consisting of, in order, an #if directive, zero or more #elif directives, zero or one #else directive, and an #endif directive. Between the directives are conditional sections of source code. Each section is controlled by the immediately preceding directive. A conditional section may itself contain nested conditional compilation directives provided these directives form complete sets.

A pp_conditional selects at most one of the contained conditional_section elements for normal lexical processing:

  • The pp_expression of each #if and #elif directive is evaluated in order until one yields true. If an expression yields true, the conditional_section of the corresponding directive is selected.
  • If every pp_expression yields false, and if an #else directive is present, the conditional_section of the #else directive is selected.
  • Otherwise, no conditional_section is selected.

The selected conditional_section, if any, is processed as a normal input_section: the source code contained in the section must adhere to the lexical grammar; tokens are generated from the source code in the section; and pre-processing directives in the section have the prescribed effects.

The remaining conditional_section elements, if any, are processed as skipped_section elements: except for pre-processing directives, the source code in the section need not adhere to the lexical grammar; no tokens are generated from the source code in the section; and pre-processing directives in the section must be lexically correct but are not otherwise processed. Within a conditional_section that is being processed as a skipped_section, any nested conditional_section elements (contained in nested #if...#endif and #region...#endregion constructs) are also processed as skipped_section elements.

The following example illustrates how conditional compilation directives can nest:

#define Debug       // Debugging on
#undef Trace        // Tracing off

class PurchaseTransaction
{
    void Commit() {
        #if Debug
            CheckConsistency();
            #if Trace
                WriteToLog(this.ToString());
            #endif
        #endif
        CommitHelper();
    }
}

Except for pre-processing directives, skipped source code is not subject to lexical analysis. For example, the following is valid despite the unterminated comment in the #else section:

#define Debug        // Debugging on

class PurchaseTransaction
{
    void Commit() {
        #if Debug
            CheckConsistency();
        #else
            /* Do something else
        #endif
    }
}

Note, however, that pre-processing directives are required to be lexically correct even in skipped sections of source code.

Pre-processing directives are not processed when they appear inside multi-line input elements. For example, the program:

class Hello
{
    static void Main() {
        System.Console.WriteLine(@"hello, 
#if Debug
        world
#else
        Nebraska
#endif
        ");
    }
}

results in the output:

hello,
#if Debug
        world
#else
        Nebraska
#endif

In peculiar cases, the set of pre-processing directives that is processed might depend on the evaluation of the pp_expression. The example:

#if X
    /*
#else
    /* */ class Q { }
#endif

always produces the same token stream (class Q { }), regardless of whether or not X is defined. If X is defined, the only processed directives are #if and #endif, due to the multi-line comment. If X is undefined, then three directives (#if, #else, #endif) are part of the directive set.

Diagnostic directives

The diagnostic directives are used to explicitly generate error and warning messages that are reported in the same way as other compile-time errors and warnings.

pp_diagnostic
    : whitespace? '#' whitespace? 'error' pp_message
    | whitespace? '#' whitespace? 'warning' pp_message
    ;

pp_message
    : new_line
    | whitespace input_character* new_line
    ;

The example:

#warning Code review needed before check-in

#if Debug && Retail
    #error A build can't be both debug and retail
#endif

class Test {...}

always produces a warning ("Code review needed before check-in"), and produces a compile-time error ("A build can't be both debug and retail") if the conditional symbols Debug and Retail are both defined. Note that a pp_message can contain arbitrary text; specifically, it need not contain well-formed tokens, as shown by the single quote in the word can't.

Region directives

The region directives are used to explicitly mark regions of source code.

pp_region
    : pp_start_region conditional_section? pp_end_region
    ;

pp_start_region
    : whitespace? '#' whitespace? 'region' pp_message
    ;

pp_end_region
    : whitespace? '#' whitespace? 'endregion' pp_message
    ;

No semantic meaning is attached to a region; regions are intended for use by the programmer or by automated tools to mark a section of source code. The message specified in a #region or #endregion directive likewise has no semantic meaning; it merely serves to identify the region. Matching #region and #endregion directives may have different pp_message texts.

The lexical processing of a region:

#region
...
#endregion

corresponds exactly to the lexical processing of a conditional compilation directive of the form:

#if true
...
#endif
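A brief sketch of regions in use follows. The region messages and member names are hypothetical; the first pair deliberately uses different messages on #region and #endregion, and the second pair omits the message entirely.

```csharp
static class RegionDemo
{
    #region Doubling helpers
    public static int Twice(int n)
    {
        return n * 2;
    }
    #endregion end of doubling (the message need not match)

    #region
    public static int Thrice(int n)
    {
        return n * 3;
    }
    #endregion
}
```

The regions have no effect on the generated tokens: Twice(4) is 8 and Thrice(4) is 12, exactly as if the directives were absent.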

Line directives

Line directives may be used to alter the line numbers and source file names that are reported by the compiler in output such as warnings and errors, and that are used by caller info attributes (Caller info attributes).

Line directives are most commonly used in meta-programming tools that generate C# source code from some other text input.

pp_line
    : whitespace? '#' whitespace? 'line' whitespace line_indicator pp_new_line
    ;

line_indicator
    : decimal_digit+ whitespace file_name
    | decimal_digit+
    | 'default'
    | 'hidden'
    ;

file_name
    : '"' file_name_character+ '"'
    ;

file_name_character
    : '<Any input_character except ">'
    ;

When no #line directives are present, the compiler reports true line numbers and source file names in its output. When processing a #line directive that includes a line_indicator that is not default, the compiler treats the line after the directive as having the given line number (and file name, if specified).

A #line default directive reverses the effect of all preceding #line directives. The compiler reports true line information for subsequent lines, precisely as if no #line directives had been processed.

A #line hidden directive has no effect on the file and line numbers reported in error messages, but does affect source level debugging. When debugging, all lines between a #line hidden directive and the subsequent #line directive (that is not #line hidden) have no line number information. When stepping through code in the debugger, these lines will be skipped entirely.

Note that a file_name differs from a regular string literal in that escape characters are not processed; the "\" character simply designates an ordinary backslash character within a file_name.
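One way to observe the effect of a #line directive is through a caller info attribute. The sketch below assumes the CallerLineNumber attribute from System.Runtime.CompilerServices; the class and method names, and the line number 200, are hypothetical.

```csharp
using System.Runtime.CompilerServices;

static class LineDirectiveDemo
{
    // Returns the line number the compiler records for the call site.
    static int Here([CallerLineNumber] int line = 0)
    {
        return line;
    }

    public static int[] Remapped()
    {
#line 200
        int first = Here();    // treated as line 200
        int second = Here();   // treated as line 201
#line default
        // From here on, true line numbers are reported again.
        return new[] { first, second };
    }
}
```

Under these assumptions Remapped() returns the values 200 and 201, since the line after the directive takes the given number and subsequent lines count on from it.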

Pragma directives

The #pragma pre-processing directive is used to specify optional contextual information to the compiler. The information supplied in a #pragma directive will never change program semantics.

pp_pragma
    : whitespace? '#' whitespace? 'pragma' whitespace pragma_body pp_new_line
    ;

pragma_body
    : pragma_warning_body
    ;

C# provides #pragma directives to control compiler warnings. Future versions of the language may include additional #pragma directives. To ensure interoperability with other C# compilers, the Microsoft C# compiler does not issue compilation errors for unknown #pragma directives; such directives do however generate warnings.

Pragma warning

The #pragma warning directive is used to disable or restore all or a particular set of warning messages during compilation of the subsequent program text.

pragma_warning_body
    : 'warning' whitespace warning_action
    | 'warning' whitespace warning_action whitespace warning_list
    ;

warning_action
    : 'disable'
    | 'restore'
    ;

warning_list
    : decimal_digit+ (whitespace? ',' whitespace? decimal_digit+)*
    ;

A #pragma warning directive that omits the warning list affects all warnings. A #pragma warning directive that includes a warning list affects only those warnings that are specified in the list.

A #pragma warning disable directive disables all or the given set of warnings.

A #pragma warning restore directive restores all or the given set of warnings to the state that was in effect at the beginning of the compilation unit. Note that if a particular warning was disabled externally, a #pragma warning restore (whether for all or the specific warning) will not re-enable that warning.

The following example shows use of #pragma warning to temporarily disable the warning reported when obsoleted members are referenced, using the warning number from the Microsoft C# compiler.

using System;

class Program
{
    [Obsolete]
    static void Foo() {}

    static void Main() {
#pragma warning disable 612
    Foo();
#pragma warning restore 612
    }
}
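A warning list with more than one entry can be sketched in the same way. The numbers 168 and 219 are taken from the Microsoft C# compiler (an unused local variable, and a local variable that is assigned but never read); other compilers may number their warnings differently. The class and method names are hypothetical.

```csharp
static class PragmaWarningListDemo
{
    public static int Compute()
    {
        // Disable both warnings named in the list.
#pragma warning disable 168, 219
        int neverUsed;          // would otherwise report warning 168
        int assignedOnly = 1;   // would otherwise report warning 219
#pragma warning restore 168, 219
        // Both warnings are back in the state they had at the start
        // of the compilation unit.
        return 42;
    }
}
```

The directives affect only diagnostics; Compute() returns 42 whether or not the warnings are suppressed.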