12.8 String Functions and Operators

String Functions and Operators

Name                  Description
'ASCII()'             Return numeric value of leftmost character
'BIN()'               Return a string containing binary representation of a number
'BIT_LENGTH()'        Return length of argument in bits
'CHAR()'              Return the character for each integer passed
'CHAR_LENGTH()'       Return number of characters in argument
'CHARACTER_LENGTH()'  Synonym for CHAR_LENGTH()
'CONCAT()'            Return concatenated string
'CONCAT_WS()'         Return concatenated string with separator
'ELT()'               Return string at index number
'EXPORT_SET()'        Return a string such that for every bit set in the value bits, you get an on string and for every unset bit, you get an off string
'FIELD()'             Index (position) of first argument in subsequent arguments
'FIND_IN_SET()'       Index (position) of first argument within second argument
'FORMAT()'            Return a number formatted to specified number of decimal places
'FROM_BASE64()'       Decode base64 encoded string and return result
'HEX()'               Hexadecimal representation of decimal or string value
'INSERT()'            Insert substring at specified position up to specified number of characters
'INSTR()'             Return the index of the first occurrence of substring
'LCASE()'             Synonym for LOWER()
'LEFT()'              Return the leftmost number of characters as specified
'LENGTH()'            Return the length of a string in bytes
'LIKE'                Simple pattern matching
'LOAD_FILE()'         Load the named file
'LOCATE()'            Return the position of the first occurrence of substring
'LOWER()'             Return the argument in lowercase
'LPAD()'              Return the string argument, left-padded with the specified string
'LTRIM()'             Remove leading spaces
'MAKE_SET()'          Return a set of comma-separated strings that have the corresponding bit in bits set
'MATCH()'             Perform full-text search
'MID()'               Return a substring starting from the specified position
'NOT LIKE'            Negation of simple pattern matching
'NOT REGEXP'          Negation of REGEXP
'OCT()'               Return a string containing octal representation of a number
'OCTET_LENGTH()'      Synonym for LENGTH()
'ORD()'               Return character code for leftmost character of the argument
'POSITION()'          Synonym for LOCATE()
'QUOTE()'             Escape the argument for use in an SQL statement
'REGEXP'              Whether string matches regular expression
'REPEAT()'            Repeat a string the specified number of times
'REPLACE()'           Replace occurrences of a specified string
'REVERSE()'           Reverse the characters in a string
'RIGHT()'             Return the specified rightmost number of characters
'RLIKE'               Whether string matches regular expression
'RPAD()'              Return the string argument, right-padded with the specified string
'RTRIM()'             Remove trailing spaces
'SOUNDEX()'           Return a soundex string
'SOUNDS LIKE'         Compare sounds
'SPACE()'             Return a string of the specified number of spaces
'STRCMP()'            Compare two strings
'SUBSTR()'            Return the substring as specified
'SUBSTRING()'         Return the substring as specified
'SUBSTRING_INDEX()'   Return a substring from a string before the specified number of occurrences of the delimiter
'TO_BASE64()'         Return the argument converted to a base-64 string
'TRIM()'              Remove leading and trailing spaces
'UCASE()'             Synonym for UPPER()
'UNHEX()'             Interpret each pair of hexadecimal digits as a byte and return the resulting string
'UPPER()'             Convert to uppercase
'WEIGHT_STRING()'     Return the weight string for a string

String-valued functions return 'NULL' if the length of the result would be greater than the value of the 'max_allowed_packet' system variable. See *note server-configuration::.
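
As a rough illustration of this limit, the following statements display the current 'max_allowed_packet' value and then request a string one byte longer than that value. By the rule just stated, the second query is expected to return 1 because the oversized result is 'NULL'. Any accompanying warning depends on the server version and SQL mode, so no output is shown here:

 mysql> SELECT @@max_allowed_packet;
 mysql> SELECT REPEAT('a', @@max_allowed_packet + 1) IS NULL;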

For functions that operate on string positions, the first position is numbered 1.

For functions that take length arguments, noninteger arguments are rounded to the nearest integer.
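
The two conventions above can be seen with a few short queries. The first three return positions or substrings counted from 1. The last passes a noninteger length, which by the rounding rule should behave like 'LEFT('foobarbar', 3)'; its output is omitted here and is best confirmed on your own server:

 mysql> SELECT LOCATE('bar', 'foobarbar');               -> 4
 mysql> SELECT INSTR('foobarbar', 'bar');                -> 4
 mysql> SELECT SUBSTRING('Quadratically', 5);            -> 'ratically'
 mysql> SELECT LEFT('foobarbar', 2.6);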

12.8.1 String Comparison Functions and Operators

String Comparison Functions and Operators

Name                  Description
'LIKE'                Simple pattern matching
'NOT LIKE'            Negation of simple pattern matching
'STRCMP()'            Compare two strings

If a string function is given a binary string as an argument, the resulting string is also a binary string. A number converted to a string is treated as a binary string. This affects only comparisons.

Normally, if any expression in a string comparison is case-sensitive, the comparison is performed in case-sensitive fashion.
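
The following sketch assumes a connection whose default collation is case-insensitive (for example, 'latin1_swedish_ci' or 'utf8_general_ci'); with a different default the first result may differ. Making one operand a binary string forces a case-sensitive comparison, and 'STRCMP()' reports the ordering of its two arguments:

 mysql> SELECT 'a' = 'A';                                -> 1
 mysql> SELECT BINARY 'a' = 'A';                         -> 0
 mysql> SELECT STRCMP('text', 'text2');                  -> -1
 mysql> SELECT STRCMP('text2', 'text');                  -> 1
 mysql> SELECT STRCMP('text', 'text');                   -> 0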

If a string function is invoked from within the *note 'mysql': mysql. client, binary strings display using hexadecimal notation, depending on the value of the '--binary-as-hex' option. For more information about that option, see *note mysql::.

12.8.2 Regular Expressions

Regular Expression Functions and Operators

Name                  Description
'NOT REGEXP'          Negation of REGEXP
'REGEXP'              Whether string matches regular expression
'RLIKE'               Whether string matches regular expression

A regular expression is a powerful way of specifying a pattern for a complex search. This section discusses the operators available for regular expression matching and illustrates, with examples, some of the special characters and constructs that can be used for regular expression operations. See also *note pattern-matching::.

MySQL uses Henry Spencer's implementation of regular expressions, which is aimed at conformance with POSIX 1003.2. MySQL uses the extended version to support regular expression pattern-matching operations in SQL statements. This section does not contain all the details that can be found in Henry Spencer's 'regex(7)' manual page. That manual page is included in MySQL source distributions, in the 'regex.7' file under the 'regex' directory.

Regular Expression Function and Operator Descriptions

Regular Expression Syntax

A regular expression describes a set of strings. The simplest regular expression is one that has no special characters in it. For example, the regular expression 'hello' matches 'hello' and nothing else.

Nontrivial regular expressions use certain special constructs so that they can match more than one string. For example, the regular expression 'hello|world' contains the '|' alternation operator and matches either 'hello' or 'world'.

As a more complex example, the regular expression 'B[an]*s' matches any of the strings 'Bananas', 'Baaaaas', 'Bs', and any other string starting with a 'B', ending with an 's', and containing any number of 'a' or 'n' characters in between.
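
The patterns just described can be tried directly with the 'REGEXP' operator. The test strings below are illustrative only. Note that 'REGEXP' succeeds when the pattern matches anywhere in the string; the last query returns 0 because no part of 'Bxs' consists of a 'B', then only 'a' or 'n' characters, then an 's':

 mysql> SELECT 'hello' REGEXP 'hello|world';             -> 1
 mysql> SELECT 'world' REGEXP 'hello|world';             -> 1
 mysql> SELECT 'Bananas' REGEXP 'B[an]*s';               -> 1
 mysql> SELECT 'Bs' REGEXP 'B[an]*s';                    -> 1
 mysql> SELECT 'Bxs' REGEXP 'B[an]*s';                   -> 0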

A regular expression for the 'REGEXP' operator may use any of the following special characters and constructs:

To use a literal instance of a special character in a regular expression, precede it by two backslash (\) characters. The MySQL parser interprets one of the backslashes, and the regular expression library interprets the other. For example, to match the string '1+2' that contains the special '+' character, only the last of the following regular expressions is the correct one:

 mysql> SELECT '1+2' REGEXP '1+2';                       -> 0
 mysql> SELECT '1+2' REGEXP '1\+2';                      -> 0
 mysql> SELECT '1+2' REGEXP '1\\+2';                     -> 1

12.8.3 Character Set and Collation of Function Results

MySQL has many operators and functions that return a string. This section answers the question: What is the character set and collation of such a string?

For simple functions that take string input and return a string result as output, the output's character set and collation are the same as those of the principal input value. For example, 'UPPER(X)' returns a string with the same character set and collation as X. The same applies for 'INSTR()', 'LCASE()', 'LOWER()', 'LTRIM()', 'MID()', 'REPEAT()', 'REPLACE()', 'REVERSE()', 'RIGHT()', 'RPAD()', 'RTRIM()', 'SOUNDEX()', 'SUBSTRING()', 'TRIM()', 'UCASE()', and 'UPPER()'.
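
A minimal check of this behavior, assuming the 'latin1' character set and its 'latin1_german1_ci' collation are available on the server (both are chosen here only for illustration):

 mysql> SELECT CHARSET(UPPER(_latin1'abc'));
         -> 'latin1'
 mysql> SELECT COLLATION(LOWER(_latin1'ABC' COLLATE latin1_german1_ci));
         -> 'latin1_german1_ci'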

Note:

The 'REPLACE()' function, unlike all other functions, always ignores the collation of the string input and performs a case-sensitive comparison.
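
For instance, the search string is matched case-sensitively even under a case-insensitive collation, so an uppercase 'W' finds nothing to replace in a lowercase string:

 mysql> SELECT REPLACE('www.mysql.com', 'w', 'Ww');      -> 'WwWwWw.mysql.com'
 mysql> SELECT REPLACE('www.mysql.com', 'W', 'Ww');      -> 'www.mysql.com'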

If a string input or function result is a binary string, the string has the 'binary' character set and collation. This can be checked by using the 'CHARSET()' and 'COLLATION()' functions, both of which return 'binary' for a binary string argument:

 mysql> SELECT CHARSET(BINARY 'a'), COLLATION(BINARY 'a');
 +---------------------+-----------------------+
 | CHARSET(BINARY 'a') | COLLATION(BINARY 'a') |
 +---------------------+-----------------------+
 | binary              | binary                |
 +---------------------+-----------------------+

For operations that combine multiple string inputs and return a single string output, the 'aggregation rules' of standard SQL apply for determining the collation of the result:

   * If an explicit 'COLLATE Y' occurs, use Y.

   * If explicit 'COLLATE Y' and 'COLLATE Z' occur, raise an error.

   * Otherwise, if all collations are Y, use Y.

   * Otherwise, the result has no collation.

For example, with 'CASE ... WHEN a THEN b WHEN b THEN c COLLATE X END', the resulting collation is X. The same applies for *note 'UNION': union, '||', 'CONCAT()', 'ELT()', 'GREATEST()', 'IF()', and 'LEAST()'.
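
The effect of an explicit 'COLLATE' clause on a combined result can be sketched as follows, again assuming the 'latin1' collations named here are available. In the first query the single explicit collation determines the result. The second query supplies two different explicit collations and is expected to fail with an illegal-mix-of-collations error rather than return a value:

 mysql> SELECT COLLATION(CONCAT(_latin1'a', _latin1'b' COLLATE latin1_bin));
         -> 'latin1_bin'
 mysql> SELECT CONCAT(_latin1'a' COLLATE latin1_bin,
     ->               _latin1'b' COLLATE latin1_german1_ci);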

For operations that convert to character data, the character set and collation of the strings that result from the operations are defined by the 'character_set_connection' and 'collation_connection' system variables that determine the default connection character set and collation (see *note charset-connection::). This applies only to 'CAST()', 'CONV()', 'FORMAT()', 'HEX()', and 'SPACE()'.

As of MySQL 5.7.19, an exception to the preceding principle occurs for expressions for virtual generated columns. In such expressions, the table character set is used for 'CONV()' or 'HEX()' results, regardless of connection character set.
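
One way to observe the dependence on the connection settings is to change them with 'SET NAMES' and inspect the result of a conversion function. This is only a sketch: 'SET NAMES' alters the connection character set for the rest of the session, and the reported values are expected to follow whichever character set was most recently selected:

 mysql> SET NAMES latin1;
 mysql> SELECT @@character_set_connection, CHARSET(HEX(255)), CHARSET(SPACE(3));
 mysql> SET NAMES utf8;
 mysql> SELECT @@character_set_connection, CHARSET(HEX(255)), CHARSET(SPACE(3));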

If there is any question about the character set or collation of the result returned by a string function, use the 'CHARSET()' or 'COLLATION()' function to find out:

 mysql> SELECT USER(), CHARSET(USER()), COLLATION(USER());
 +----------------+-----------------+-------------------+
 | USER()         | CHARSET(USER()) | COLLATION(USER()) |
 +----------------+-----------------+-------------------+
 | test@localhost | utf8            | utf8_general_ci   |
 +----------------+-----------------+-------------------+
 mysql> SELECT CHARSET(COMPRESS('abc')), COLLATION(COMPRESS('abc'));
 +--------------------------+----------------------------+
 | CHARSET(COMPRESS('abc')) | COLLATION(COMPRESS('abc')) |
 +--------------------------+----------------------------+
 | binary                   | binary                     |
 +--------------------------+----------------------------+
