String Functions and Operators
==============================


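This section describes the functions and operators for examining and manipulating string values. Strings in this context include values of the types ``character``, ``character varying``, and ``text``. Unless otherwise noted, all of the functions listed below work on all of these types, but be wary of the potential effects of automatic padding when using the ``character`` type. Some functions also work on bit-string types.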

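SQL defines some string functions with a special syntax in which certain keywords, rather than commas, separate the arguments. Details are in the table below. These functions are also available through the regular function-call syntax (see the table of other string functions).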

**Table. SQL String Functions and Operators**


.. list-table::
    :widths: auto
    :header-rows: 1

    * - Function
      - Return Type
      - Description
      - Example
      - Result
    * - string || string
      - text
      - String concatenation
      - 'Oushu' || 'DB'
      - OushuDB
    * - bit_length(string)
      - int 
      - Number of bits in the string
      - bit_length('jose')
      - 32
    * - char_length(string) or character_length(string)
      - int
      - Number of characters in the string
      - char_length('jose')
      - 4
    * - lower(string)
      - text
      - Convert the string to lower case
      - lower('TOM')
      - tom
    * - octet_length(string)
      - int
      - Number of bytes in the string
      - octet_length('jose')
      - 4
    * - overlay(string placing string from int [for int])
      - text
      - Replace a substring
      - overlay('Txxxxas' placing 'hom' from 2 for 4)
      - Thomas
    * - position(substring in string)
      - int
      - Location of the specified substring
      - position('om' in 'Thomas')
      - 3
    * - substring(string [from int][for int])
      - text
      - Extract a substring
      - substring('Thomas' from 2 for 3)
      - hom
    * - substring(string from pattern)
      - text
      - Extract the substring matching a POSIX regular expression
      - substring('Thomas' from '...$')
      - mas
    * - substring(string from pattern for escape)
      - text
      - Extract the substring matching an SQL regular expression
      - substring('Thomas' from '%#"o_a#"_' for '#')
      - oma
    * - trim([leading | trailing | both][characters] from string)
      - text
      - Remove the longest string containing only characters from ``characters`` (a space by default) from the start, the end, or both ends of ``string``
      - trim(both 'x' from 'xTomxx')
      - Tom
    * - upper(string)
      - text
      - Convert the string to upper case
      - upper('tom')
      - TOM
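
Most of the keyword-separated forms in the table above have a plain function-call counterpart. The queries below are a small sketch of that correspondence, assuming the functions behave as documented in this section; the column aliases are illustrative only, and the values in the comments follow from the examples in the tables.

.. code-block:: sql

    -- SQL-standard keyword syntax next to the ordinary function-call form.
    SELECT substring('Thomas' from 2 for 3) AS keyword_form,   -- hom
           substr('Thomas', 2, 3)           AS call_form;      -- hom

    -- position() and strpos() take their arguments in opposite order.
    SELECT position('om' in 'Thomas') AS keyword_form,         -- 3
           strpos('Thomas', 'om')     AS call_form;            -- 3

    -- trim(both ...) corresponds to btrim().
    SELECT trim(both 'x' from 'xTomxx') AS keyword_form,       -- Tom
           btrim('xTomxx', 'x')         AS call_form;          -- Tom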

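Additional string manipulation functions are available and are listed in the table below. Some of them are used internally to implement the SQL-standard string functions listed in the table above.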

**Table. Other String Functions**


.. list-table::
    :widths: auto
    :header-rows: 1

    * - Function
      - Return Type
      - Description
      - Example
      - Result
    * - ascii(string)
      - int
      - ASCII code of the first character of the argument
      - ascii('x')
      - 120
    * - btrim(string text [, characters text])
      - text
      - Remove the longest string consisting only of characters in ``characters`` (a space by default) from the start and end of ``string``
      - btrim('xyxtrimyyx', 'xy')
      - trim
    * - chr(int)
      - text
      - Character with the given ASCII code
      - chr(65)
      - A
    * - convert(string text, [src_encoding name,] dest_encoding name)
      - text
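      - Convert the string to the ``dest_encoding`` encoding. The original encoding is specified by ``src_encoding``. If ``src_encoding`` is omitted, the database encoding is assumed.
      - convert('text_in_utf8', 'UTF8', 'LATIN1')
      - text_in_utf8 represented in ISO 8859-1 encoding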
    * - encode(data bytea, type text)
      - text
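      - Encode binary data into a textual representation. Supported formats are: base64, hex, escape. The escape format converts zero bytes and high-bit-set bytes to octal sequences (``\nnn``) and doubles backslashes.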
      - encode( E'123\\\\000\\\\001', 'base64')
      - MTIzAAE=
    * - initcap(string)
      - text
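      - Convert the first letter of each word to upper case and the rest to lower case. Words are sequences of alphanumeric characters separated by non-alphanumeric characters.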
      - initcap('hi THOMAS')
      - Hi Thomas
    * - length(string)
      - int
      - Number of characters in ``string``
      - length('jose')
      - 4
    * - lpad(string text, length int [, fill text])
      - text
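      - Fill up ``string`` to length ``length`` by prepending the characters ``fill`` (a space by default). If ``string`` is already longer than ``length``, it is truncated on the right.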
      - lpad('hi', 5, 'xy')
      - xyxhi
    * - ltrim(string text [, characters text])
      - text
      - Remove the longest string containing only characters from ``characters`` (a space by default) from the start of ``string``
      - ltrim('zzzytrim', 'xyz')
      - trim
    * - md5(string)
      - text
      - Calculate the MD5 hash of ``string``, returning the result in hexadecimal
      - md5('abc')
      - 900150983cd24fb0d6963f7d28e17f72
    * - pg_client_encoding()
      - name
      - Current client encoding name
      - pg_client_encoding()
      - UTF8
    * - quote_ident(string)
      - text
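      - Return the given string suitably quoted to be used as an identifier in an SQL statement. Quotes are added only if necessary (that is, if the string contains non-identifier characters or would be case-folded). Embedded quotes are properly doubled.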
      - quote_ident('Foo bar')
      - "Foo bar"
    * - quote_literal(string)
      - text
      - Return the given string suitably quoted to be used as a string literal in an SQL statement. Embedded quotes and backslashes are properly doubled.
      - quote_literal(E'O\\'Reilly')
      - 'O''Reilly'
    * - regexp_replace(string text, pattern text, replacement text [,flags text])
      - text
      - Replace the substring(s) matching a POSIX regular expression with ``replacement``
      - regexp_replace('Thomas', '.[mN]a.', 'M')
      - ThM
    * - repeat(string text, number int)
      - text
      - Repeat ``string`` the specified ``number`` of times
      - repeat('Pg', 4)
      - PgPgPgPg
    * - replace(string text, from text, to text)
      - text
      - Replace all occurrences in ``string`` of the substring ``from`` with the substring ``to``
      - replace( 'abcdefabcdef', 'cd', 'XX')
      - abXXefabXXef
    * - rpad(string text, length int [, fill text])
      - text
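      - Fill up ``string`` to length ``length`` by appending the characters ``fill`` (a space by default). If ``string`` is already longer than ``length``, it is truncated on the right.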
      - rpad('hi', 5, 'xy')
      - hixyx
    * - rtrim(string text [, characters text])
      - text
      - Remove the longest string containing only characters from ``characters`` (a space by default) from the end of ``string``
      - rtrim('trimxxxx', 'x')
      - trim
    * - split_part(string text, delimiter text, field int)
      - text
      - Split ``string`` on ``delimiter`` and return the resulting field number ``field`` (counting from one)
      - split_part('abc~@~def~@~ghi', '~@~', 2)
      - def
    * - strpos(string, substring)
      - int
      - Location of the specified substring. Same as position(substring in string), but with the arguments in reverse order.
      - strpos('high', 'ig')
      - 2
    * - substr(string, from [, count])
      - text
      - Extract a substring. Same as substring(string from from for count).
      - substr('alphabet', 3, 2)
      - ph
    * - to_hex(number int or bigint)
      - text
      - Convert ``number`` to its hexadecimal representation
      - to_hex(2147483647)
      - 7fffffff
    * - translate(string text, from text, to text)
      - text
      - Replace any character in ``string`` that matches a character in ``from`` with the corresponding character in ``to``
      - translate('12345', '14', 'ax')
      - a23x5
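
A few behaviours described in the table are easy to overlook, such as ``lpad`` and ``rpad`` truncating an over-long input. The following queries are a minimal sketch, assuming the functions behave as described above; the column aliases are illustrative only.

.. code-block:: sql

    -- Padding repeats the fill text until the target length is reached.
    SELECT lpad('hi', 5, 'xy') AS padded_left,    -- xyxhi
           rpad('hi', 5, 'xy') AS padded_right;   -- hixyx

    -- An input longer than the target length is truncated on the right.
    SELECT lpad('hello', 3) AS truncated_left,    -- hel
           rpad('hello', 3) AS truncated_right;   -- hel

    -- split_part counts fields starting from 1.
    SELECT split_part('abc~@~def~@~ghi', '~@~', 2) AS second_field;  -- def

    -- quote_ident and quote_literal help when assembling SQL text.
    SELECT quote_ident('Foo bar')     AS ident,    -- "Foo bar"
           quote_literal('O''Reilly') AS literal;  -- 'O''Reilly'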


**Table. Built-in Encoding Conversions**

.. list-table::
    :widths: auto
    :header-rows: 1

    * - Conversion Name
      - Source Encoding
      - Destination Encoding
    * - ascii_to_mic
      - SQL_ASCII
      - MULE_INTERNAL
    * - ascii_to_utf8
      - SQL_ASCII
      - UTF8
    * - big5_to_euc_tw
      - BIG5
      - EUC_TW
    * - big5_to_mic
      - BIG5
      - MULE_INTERNAL
    * - big5_to_utf8
      - BIG5
      - UTF8
    * - euc_cn_to_mic
      - EUC_CN
      - MULE_INTERNAL
    * - euc_cn_to_utf8
      - EUC_CN
      - UTF8
    * - euc_jp_to_mic
      - EUC_JP
      - MULE_INTERNAL
    * - euc_jp_to_sjis
      - EUC_JP
      - SJIS
    * - euc_jp_to_utf8
      - EUC_JP
      - UTF8
    * - euc_kr_to_mic
      - EUC_KR
      - MULE_INTERNAL
    * - euc_kr_to_utf8
      - EUC_KR
      - UTF8
    * - euc_tw_to_big5
      - EUC_TW
      - BIG5
    * - euc_tw_to_mic
      - EUC_TW
      - MULE_INTERNAL
    * - euc_tw_to_utf8
      - EUC_TW
      - UTF8
    * - gb18030_to_utf8
      - GB18030
      - UTF8
    * - gbk_to_utf8
      - GBK
      - UTF8
    * - iso_8859_10_to_utf8
      - LATIN6
      - UTF8
    * - iso_8859_13_to_utf8
      - LATIN7
      - UTF8
    * - iso_8859_14_to_utf8
      - LATIN8
      - UTF8
    * - iso_8859_15_to_utf8
      - LATIN9
      - UTF8
    * - iso_8859_16_to_utf8
      - LATIN10
      - UTF8
    * - iso_8859_1_to_mic
      - LATIN1
      - MULE_INTERNAL
    * - iso_8859_1_to_utf8
      - LATIN1
      - UTF8
    * - iso_8859_2_to_mic
      - LATIN2
      - MULE_INTERNAL
    * - iso_8859_2_to_utf8
      - LATIN2
      - UTF8
    * - iso_8859_2_to_windows_1250
      - LATIN2
      - WIN1250
    * - iso_8859_3_to_mic
      - LATIN3
      - MULE_INTERNAL
    * - iso_8859_3_to_utf8
      - LATIN3
      - UTF8
    * - iso_8859_4_to_mic
      - LATIN4
      - MULE_INTERNAL
    * - iso_8859_4_to_utf8
      - LATIN4
      - UTF8
    * - iso_8859_5_to_koi8_r
      - ISO_8859_5
      - KOI8
    * - iso_8859_5_to_mic
      - ISO_8859_5
      - MULE_INTERNAL
    * - iso_8859_5_to_utf8
      - ISO_8859_5
      - UTF8
    * - iso_8859_5_to_windows_1251
      - ISO_8859_5
      - WIN1251
    * - iso_8859_5_to_windows_866
      - ISO_8859_5
      - WIN866
    * - iso_8859_6_to_utf8
      - ISO_8859_6
      - UTF8
    * - iso_8859_7_to_utf8
      - ISO_8859_7
      - UTF8
    * - iso_8859_8_to_utf8
      - ISO_8859_8
      - UTF8
    * - iso_8859_9_to_utf8
      - LATIN5
      - UTF8
    * - johab_to_utf8
      - JOHAB
      - UTF8
    * - koi8_r_to_iso_8859_5
      - KOI8
      - ISO_8859_5
    * - koi8_r_to_mic
      - KOI8
      - MULE_INTERNAL
    * - koi8_r_to_utf8
      - KOI8
      - UTF8
    * - koi8_r_to_windows_1251
      - KOI8
      - WIN1251
    * - koi8_r_to_windows_866
      - KOI8
      - WIN866
    * - mic_to_ascii
      - MULE_INTERNAL
      - SQL_ASCII
    * - mic_to_big5
      - MULE_INTERNAL
      - BIG5
    * - mic_to_euc_cn
      - MULE_INTERNAL
      - EUC_CN
    * - mic_to_euc_jp
      - MULE_INTERNAL
      - EUC_JP
    * - mic_to_euc_kr
      - MULE_INTERNAL
      - EUC_KR
    * - mic_to_euc_tw
      - MULE_INTERNAL
      - EUC_TW
    * - mic_to_iso_8859_1
      - MULE_INTERNAL
      - LATIN1
    * - mic_to_iso_8859_2
      - MULE_INTERNAL
      - LATIN2
    * - mic_to_iso_8859_3
      - MULE_INTERNAL
      - LATIN3
    * - mic_to_iso_8859_4
      - MULE_INTERNAL
      - LATIN4
    * - mic_to_iso_8859_5
      - MULE_INTERNAL
      - ISO_8859_5
    * - mic_to_koi8_r
      - MULE_INTERNAL
      - KOI8
    * - mic_to_sjis
      - MULE_INTERNAL
      - SJIS
    * - mic_to_windows_1250
      - MULE_INTERNAL
      - WIN1250
    * - mic_to_windows_1251
      - MULE_INTERNAL
      - WIN1251
    * - mic_to_windows_866
      - MULE_INTERNAL
      - WIN866
    * - sjis_to_euc_jp
      - SJIS
      - EUC_JP
    * - sjis_to_mic
      - SJIS
      - MULE_INTERNAL
    * - sjis_to_utf8
      - SJIS
      - UTF8
    * - tcvn_to_utf8
      - WIN1258
      - UTF8
    * - uhc_to_utf8
      - UHC
      - UTF8
    * - utf8_to_ascii
      - UTF8
      - SQL_ASCII
    * - utf8_to_big5
      - UTF8
      - BIG5
    * - utf8_to_euc_cn
      - UTF8
      - EUC_CN
    * - utf8_to_euc_jp
      - UTF8
      - EUC_JP
    * - utf8_to_euc_kr
      - UTF8
      - EUC_KR
    * - utf8_to_euc_tw
      - UTF8
      - EUC_TW
    * - utf8_to_gb18030
      - UTF8
      - GB18030
    * - utf8_to_gbk
      - UTF8
      - GBK
    * - utf8_to_iso_8859_1
      - UTF8
      - LATIN1
    * - utf8_to_iso_8859_10
      - UTF8
      - LATIN6
    * - utf8_to_iso_8859_13
      - UTF8
      - LATIN7
    * - utf8_to_iso_8859_14
      - UTF8
      - LATIN8
    * - utf8_to_iso_8859_15
      - UTF8
      - LATIN9
    * - utf8_to_iso_8859_16
      - UTF8
      - LATIN10
    * - utf8_to_iso_8859_2
      - UTF8
      - LATIN2
    * - utf8_to_iso_8859_3
      - UTF8
      - LATIN3
    * - utf8_to_iso_8859_4
      - UTF8
      - LATIN4
    * - utf8_to_iso_8859_5
      - UTF8
      - ISO_8859_5
    * - utf8_to_iso_8859_6
      - UTF8
      - ISO_8859_6
    * - utf8_to_iso_8859_7
      - UTF8
      - ISO_8859_7
    * - utf8_to_iso_8859_8
      - UTF8
      - ISO_8859_8
    * - utf8_to_iso_8859_9
      - UTF8
      - LATIN5
    * - utf8_to_johab
      - UTF8
      - JOHAB
    * - utf8_to_koi8_r
      - UTF8
      - KOI8
    * - utf8_to_sjis
      - UTF8
      - SJIS
    * - utf8_to_tcvn
      - UTF8
      - WIN1258
    * - utf8_to_uhc
      - UTF8
      - UHC
    * - utf8_to_windows_1250
      - UTF8
      - WIN1250
    * - utf8_to_windows_1251
      - UTF8
      - WIN1251
    * - utf8_to_windows_1252
      - UTF8
      - WIN1252
    * - utf8_to_windows_1253
      - UTF8
      - WIN1253
    * - utf8_to_windows_1254
      - UTF8
      - WIN1254
    * - utf8_to_windows_1255
      - UTF8
      - WIN1255
    * - utf8_to_windows_1256	
      - UTF8	
      - WIN1256
    * - utf8_to_windows_1257	
      - UTF8	
      - WIN1257
    * - utf8_to_windows_866	
      - UTF8	
      - WIN866
    * - utf8_to_windows_874	
      - UTF8	
      - WIN874
    * - windows_1250_to_iso_8859_2	
      - WIN1250	
      - LATIN2
    * - windows_1250_to_mic	
      - WIN1250	
      - MULE_INTERNAL
    * - windows_1250_to_utf8	
      - WIN1250	
      - UTF8
    * - windows_1251_to_iso_8859_5	
      - WIN1251	
      - ISO_8859_5
    * - windows_1251_to_koi8_r	
      - WIN1251	
      - KOI8
    * - windows_1251_to_mic	
      - WIN1251	
      - MULE_INTERNAL
    * - windows_1251_to_utf8	
      - WIN1251	
      - UTF8
    * - windows_1251_to_windows_866	
      - WIN1251	
      - WIN866
    * - windows_1252_to_utf8	
      - WIN1252	
      - UTF8
    * - windows_1256_to_utf8	
      - WIN1256	
      - UTF8
    * - windows_866_to_iso_8859_5	
      - WIN866	
      - ISO_8859_5
    * - windows_866_to_koi8_r	
      - WIN866	
      - KOI8
    * - windows_866_to_mic	
      - WIN866	
      - MULE_INTERNAL
    * - windows_866_to_utf8	
      - WIN866	
      - UTF8
    * - windows_866_to_windows_1251	
      - WIN866	
      - WIN1251
    * - windows_874_to_utf8	
      - WIN874	
      - UTF8
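
To check which conversions a given installation actually provides, the system catalog can be queried. This is only a sketch: it assumes that the PostgreSQL-style ``pg_conversion`` catalog and the ``pg_encoding_to_char`` function are available in this database. Neither is described in this section, so treat both as assumptions.

.. code-block:: sql

    -- List conversion names with their source and destination encodings.
    -- Assumes a PostgreSQL-compatible pg_conversion catalog.
    SELECT conname                             AS conversion_name,
           pg_encoding_to_char(conforencoding) AS source_encoding,
           pg_encoding_to_char(contoencoding)  AS destination_encoding
    FROM pg_conversion
    ORDER BY conname;

    -- Convert a string between two encodings listed in the table above.
    SELECT convert('text_in_utf8', 'UTF8', 'LATIN1');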