Go to the first, previous, next, last section, table of contents.


10 MySQL server functions

10.1 What languages are supported by MySQL?

mysqld can issue error messages in the following languages: Czech, Dutch, English (the default), Estonia, French, German, Hungarian, Italian, Norwegian, Norwegian-ny, Polish, Portuguese, Spanish and Swedish.

To start mysqld with a particular language, use either the --language=lang or -L lang options. For example:

shell> mysqld --language=swedish

or:

shell> mysqld --language=/usr/local/share/swedish

Note that all language names are specified in lowercase.

The language files are located (by default) in `mysql_base_dir/share/LANGUAGE/'.

To update the error message file, you should edit the `errmsg.txt' file and execute the following command to generate the `errmsg.sys' file:

shell> comp_err errmsg.txt errmsg.sys

If you upgrade to a newer version of MySQL, remember to repeat your changes with the new `errmsg.txt' file.

10.1.1 The character set used for data and sorting

By default, MySQL uses the ISO-8859-1 (Latin1) character set. This is the character set used in the USA and western Europe.

The character set determines what characters are allowed in names and how things are sorted by the ORDER BY and GROUP BY clauses of the SELECT statement.

You can change the character set with the --default-character-set option when you start the server. The character sets available depend on the --with-charset=charset option to configure, and the character set configuration files listed in `SHAREDIR/charsets/Index'. See section 4.7.1 Quick installation overview.

When a client connects to a MySQL server, the server sends the default character set in use to the client. The client will switch to use this character set for this connection.

One should use mysql_real_escape_string() when escaping strings for a SQL query. mysql_real_escape_string() is identical to the old mysql_escape_string() function, except that it takes the MYSQL connection handle as the first parameter.

If the client is compiled with different paths than where the server is installed and the user that configured MySQL didn't included all character sets in the MySQL binary one must specify for the client where it can find the additional character sets it will need if the server runs with a different character set than the client.

On can specify this by putting in a MySQL option file:

[client]
default-set-dir=/usr/local/mysql/share/mysql/charsets

where the path points to where the dynamic MySQL character sets are stored.

One can force the client to use specific character set by specifying:

[client]
default-character-set=character-set-name

but normally this is never needed.

To add another character set to MySQL, use the following procedure:

10.1.2 Adding a new character set

  1. Choose a name for the character set, denoted MYSET below.
  2. Decide if the set is simple or complex. If the character set does not need to use special string collating routines for sorting, and does not need mulit-byte character support, it is simple. If it needs either of those features, it is complex.
  3. If the character set is simple, then create the file `sql/share/charsets/MYSET.conf', and add MYSET the `sql/share/charsets/Index' file. Read the `sql/share/charsets/README' for more instructions.
  4. Add the character set name to the CHARSETS_AVAILABLE and COMPILED_CHARSETS lists in configure.in.
  5. Reconfigure, recompile and test.
  6. If the character set is complex, create the file `strings/ctype-MYSET.c' in the MySQL source distribution.
  7. Add MYSET to the end of the `sql/share/charsets/Index' file. Take note of its position in the file - this is its character set number, denoted MYNUMBER below.
  8. Look at one of the existing `ctype-*.c' files to see what needs to be defined. Note that the arrays in your file must have names like ctype_MYSET, to_lower_MYSET and so on. Near the top of the file, place a special comment like this:
    /*
     * This comment is parsed by configure to create ctype.c,
     * so don't change it unless you know what you are doing.
     *
     * .configure. number_MYSET=MYNUMBER
     * .configure. strxfrm_multiply_MYSET=N
     * .configure. mbmaxlen_MYSET=N
     */
    
    The configure program uses this comment to include the character set into the MySQL library automatically. The strxfrm_multiply and mbmaxlen lines will be explained in the following sections. Only include them if you the string collating functions or the multi-byte character set functions, respectively. to_lower[] and to_upper[] are simple arrays that hold the lowercase and uppercase characters corresponding to each member of the character set. For example:
    to_lower['A'] should contain 'a'
    to_upper['a'] should contain 'A'
    
    sort_order[] is a map indicating how characters should be ordered for comparison and sorting purposes. For many character sets, this is the same as to_upper[] (which means sorting will be case insensitive). MySQL will sort characters based on the value of sort_order[character]. For more complicated sorting rules, see the discussion of string collating below. ctype[] is an array of bit values, with one element for one character. (Note that to_lower[], to_upper[] and sort_order[] are indexed by character value, but ctype[] is indexed by character value + 1. This is an old legacy to be able to handle EOF.) You can find the following bitmask definitions in `m_ctype.h':
    #define _U      01      /* Upper case */
    #define _L      02      /* Lower case */
    #define _N      04      /* Numeral (digit) */
    #define _S      010     /* Spacing character */
    #define _P      020     /* Punctuation */
    #define _C      040     /* Control character */
    #define _B      0100    /* Blank */
    #define _X      0200    /* heXadecimal digit */
    
    The ctype[] entry for each character should be the union of the applicable bitmask values that describe the character. For example, 'A' is an uppercase character (_U) as well as a hexadecimal digit (_X), so ctype['A'+1] should contain the value:
    _U + _X = 01 + 0200 = 0201
    
  9. Add support for the string collating or multi-byte features needed, as described in the following sections.
  10. Add the character set name to the CHARSETS_AVAILABLE and COMPILED_CHARSETS lists in configure.in.
  11. Reconfigure, recompile and test.

10.1.3 String collating support

If the sorting rules for your language are too complex to be handled with the simple sort_order[] table, you need to use the string collating functions.

Right now the best documentation on this is the character sets that are already implemented. Look at the big5, czech, gbk, sjis and tis160 character sets for examples.

You must specify the strxfrm_multiply_MYSET=N value in the special comment at the top of the file. N should be set to the maximum ratio the strings may grow during my_strxfrm_MYSET (it must be a positive integer).

10.1.4 Multi-byte character support

If your character set includes multi-byte characters, you need to use the multi-byte character functions.

Right now the best documentation on this is the character sets that are already implemented. Look at the euc_kr, gb2312, gbk, sjis and ujis character sets for examples.

You must specify the mbmaxlen_MYSET=N value in the special comment at the top of the file. N should be set to the size in bytes of the largest character in the set.

10.2 How big MySQL tables can be

MySQL 3.22 has a 4G limit on table size. With the new MyISAM in MySQL 3.23 the maximum table size is pushed up to 8 million terabytes (2 ^ 63 bytes).

Note however that operating systems have their own file size limits. Here are some examples:

Linux-Intel 2G (or 4G with reiserfs)
Linux-Alpha 8T (?)
Solaris 2.5.1 2G (possible 4G with patch)
Solaris 2.6 4G
Solaris 2.7 Intel 4G
Solaris 2.7 ULTRA-SPARC 8T (?)

This means that the table size for MySQL is normally limited by the operating system.

By default, MySQL tables have a maximum size of about 4G. You can check the maximum table size for a table with the SHOW TABLE STATUS command or with the myisamchk -dv table_name. See section 7.23 SHOW syntax (Get information about tables, columns,...).

If you need bigger tables than 4G (and your operating system supports this), you should set the AVG_ROW_LENGTH and MAX_ROWS parameter when you create your table. See section 7.7 CREATE TABLE syntax. You can also set these later with ALTER TABLE. See section 7.8 ALTER TABLE syntax.

If you need to have bigger tables than 2G / 4G

If your big table is going to be read-only, you could use myisampack to merge and compress many tables to one. myisampack usually compresses a table by at least 50%, so you can have, in effect, much bigger tables. See section 14.7 The MySQL compressed read-only table generator..

Another solution can be the included MERGE library, which allows you to handle a collection of identical tables as one. (Identical in this case means that all tables are created with identical column information.) Currently MERGE can only be used to scan a collection of tables because it doesn't support indexes. We will add indexes to this in the near future.


Go to the first, previous, next, last section, table of contents.