Internet Assigned Numbers Authority

Collation Registry for Internet Application Protocols

Last Updated
    2013-08-20

Registration Procedure(s)
    Expert Review

Description
    [RFC4790] creates an abstraction framework so that application
    protocols can precisely identify a comparison function and the
    repertoire of comparison functions can be extended in the future.
    This document defines an IANA-maintained registry of collations for
    comparing, searching and sorting international strings.

Available Formats
    CSV, XML, HTML, Plain text

Collation:   i;ascii-numeric
Reference:   [RFC4790]
Description: The "i;ascii-numeric" collation is a simple collation
             intended for use with arbitrary sized unsigned decimal
             integer numbers stored as octet strings. US-ASCII digits
             (0x30 to 0x39) represent digits of the numbers. Before
             converting from string to integer, the input string is
             truncated at the first non-digit character. All input is
             valid; strings which do not start with a digit represent
             positive infinity.

Collation:   i;ascii-casemap
Reference:   [RFC4790]
Description: The "i;ascii-casemap" collation is a simple collation which
             operates on octet strings and treats US-ASCII letters
             case-insensitively. It provides equality, substring and
             ordering operations. All input is valid. Note that letters
             outside ASCII are not treated case-insensitively.

Collation:   i;octet
Reference:   [RFC4790]
Description: The "i;octet" collation is a simple and fast collation
             intended for use on binary octet strings rather than on
             character data. Protocols that want to make this collation
             available have to do so by explicitly allowing it. If not
             explicitly allowed, it MUST NOT be used. It never returns
             an "undefined" result. It provides equality, substring and
             ordering operations.

Collation:   i;unicode-casemap
Reference:   [RFC5051]
Description: The "i;unicode-casemap" collation is a simple collation
             which is case-insensitive in its treatment of characters.
             It provides equality, substring, and ordering operations.
             The validity test operation returns "valid" for any input.
             This collation allows strings in arbitrary (and mixed)
             character sets, as long as the character set for each
             string is identified and it is possible to convert the
             string to Unicode. Strings which have an unidentified
             character set and/or cannot be converted to Unicode are
             not rejected, but are treated as binary.
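
The ordering behaviour of the three [RFC4790] collations described above can be illustrated with a minimal Python sketch. It is not a reference implementation. The function names are hypothetical, and two points are assumptions not stated in the registry text: two "positive infinity" values are treated as equal under i;ascii-numeric, and i;ascii-casemap is realised by upper-casing US-ASCII letters before an octet-wise comparison. Each comparison returns -1, 0 or 1 for "less", "equal" or "greater".

```python
from typing import Optional


def ascii_numeric_value(octets: bytes) -> Optional[int]:
    """Return the integer value of the input, or None for positive infinity."""
    digits = bytearray()
    for b in octets:
        if 0x30 <= b <= 0x39:      # US-ASCII digits '0'..'9'
            digits.append(b)
        else:
            break                  # truncate at the first non-digit octet
    if not digits:
        return None                # no leading digit: positive infinity
    # Note: recent Python versions cap str-to-int conversion length by default,
    # so extremely long digit strings would need a different approach.
    return int(digits.decode("ascii"))


def compare_ascii_numeric(a: bytes, b: bytes) -> int:
    va, vb = ascii_numeric_value(a), ascii_numeric_value(b)
    if va is None and vb is None:
        return 0                   # assumption: infinities compare equal
    if va is None:
        return 1
    if vb is None:
        return -1
    return (va > vb) - (va < vb)


def compare_octet(a: bytes, b: bytes) -> int:
    # Plain octet-by-octet comparison; every input is valid.
    return (a > b) - (a < b)


def compare_ascii_casemap(a: bytes, b: bytes) -> int:
    # bytes.upper() changes only the ASCII letters a-z, so octets outside
    # US-ASCII remain case-sensitive (assumption: folding to upper case).
    return compare_octet(a.upper(), b.upper())
```

For example, `compare_ascii_numeric(b"4294967298", b"04294967298b")` treats both inputs as the same number, because leading zeroes do not change the value and the trailing `b` is cut off by the truncation step.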