Class idna_convert

Encode/decode Internationalized Domain Names.

This class converts internationalized domain names (see RFC 3490 for details), as used by various registries worldwide, between their original (localized) form and the encoded form used in the DNS (Domain Name System).

The class provides two public methods, encode() and decode(), which do exactly what you would expect them to do. Both accept complete domain names, simple strings and complete email addresses. That means you can use any of the following notations:

www.nörgler.com
xn--nrgler-wxa
xn--brse-5qa.xn--knrz-1ra.info

Unicode input can be given as a UTF-8 string, a UCS-4 string or a UCS-4 array. Unicode output is available in the same formats. You can select your preferred format via set_parameter().

ACE input and output are always expected to be ASCII.
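
A minimal usage sketch of the round trip described above. It assumes the class file is reachable at the path given under "Located at", and that get_last_error() returns a falsy value when nothing went wrong (suggested by the `$_error` default of `false`, but not stated on this page). The helper `idna_round_trip()` is hypothetical and not part of the library.

```php
<?php
// Assumption: path taken from "Located at"; adjust to your installation.
$classFile = 'idn/idna_convert.class.php';
if (is_file($classFile)) {
    require_once $classFile;
}

/**
 * Hypothetical helper (not part of the library): encodes a name to ACE,
 * decodes it again and reports the library's last error, if any.
 */
function idna_round_trip($converter, $name)
{
    $ace = $converter->encode($name);
    // Assumption: a falsy value means "no error".
    $error = $converter->get_last_error();
    if ($error) {
        return array('error' => $error);
    }
    return array(
        'ace'     => $ace,
        'unicode' => $converter->decode($ace),
    );
}

// Only runs when the library could actually be loaded.
if (class_exists('idna_convert')) {
    $idn = new idna_convert();
    $idn->set_parameter('encoding', 'utf8'); // 'utf8' is the documented default
    print_r(idna_round_trip($idn, 'www.nörgler.com'));
}
```

For registration workflows, calling `set_parameter('strict', true)` before encoding may be the safer choice, since loose mode returns the original input instead of reporting a failure.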

Direct known subclasses

Net_IDNA_php4

Copyright: 2004-2007 phlyLabs Berlin, http://phlylabs.de
Author: Matthias Sommerfeld <mso@phlylabs.de>
Version: 0.5.1
Located at idn/idna_convert.class.php

Methods summary
`public`	# `idna_convert( $options = false )`
`public boolean`	# `set_parameter( mixed $option, string $value = false )` Sets a new option value. Available options and values: [encoding - Use either UTF-8, UCS4 as array or UCS4 as string as input ('utf8' for UTF-8, 'ucs4_string' and 'ucs4_array' respectively for UCS4); the output is always UTF-8] [overlong - Unicode does not allow unnecessarily long encodings of characters; set this parameter to true to allow them, else to false; default is false] [strict - true: strict mode, good for registration purposes, causes errors on failures; false: loose mode, ideal for "wildlife" applications, silently ignores errors and returns the original input instead] Parameters: `$option` `mixed` Parameter to set (string: single parameter; array of Parameter => Value pairs); `$value` `string` Value to use (if parameter 1 is a string). Returns: `boolean` true on success, false otherwise
`public string`	# `decode( string $input, $one_time_encoding = false )` Decode a given ACE domain name. Parameters: `$input` `string` Domain name (ACE string); `$one_time_encoding` `string` Desired output encoding, see `idna_convert::set_parameter()`. Returns: `string` Decoded domain name (UTF-8 or UCS-4)
`public string`	# `encode( string $decoded, $one_time_encoding = false )` Encode a given UTF-8 domain name. Parameters: `$decoded` `string` Domain name (UTF-8 or UCS-4); `$one_time_encoding` `string` Desired input encoding, see `idna_convert::set_parameter()`. Returns: `string` Encoded domain name (ACE string)
`public string`	# `get_last_error( )` Use this method to get the last error that occurred. Returns: `string` The last error that occurred
`public`	# `_decode( $encoded )` The actual decoding algorithm
`public`	# `_encode( $decoded )` The actual encoding algorithm
`public`	# `_adapt( $delta, $npoints, $is_first )` Adapt the bias according to the current code point and position
`public`	# `_encode_digit( $d )` Encode a certain digit
`public`	# `_decode_digit( $cp )` Decode a certain digit
`public`	# `_error( $error = '' )` Internal error handling method
`public string`	# `_nameprep( array $input )` Do Nameprep according to RFC3491 and RFC3454. Parameters: `$input` `array` Unicode characters. Returns: `string` Unicode characters, Nameprep'd
`public array`	# `_hangul_decompose( integer $char )` Decomposes a Hangul syllable (see http://www.unicode.org/unicode/reports/tr15/#Hangul). Parameters: `$char` `integer` 32-bit UCS4 code point. Returns: `array` Either the decomposed Hangul syllable or the original 32-bit value as a one-value array
`public array`	# `_hangul_compose( array $input )` Composes a Hangul syllable (see http://www.unicode.org/unicode/reports/tr15/#Hangul). Parameters: `$input` `array` Decomposed UCS4 sequence. Returns: `array` UCS4 sequence with syllables composed
`public integer`	# `_get_combining_class( integer $char )` Returns the combining class of a certain wide char. Parameters: `$char` `integer` Wide char to check (32-bit integer). Returns: `integer` Combining class if found, else 0
`public array`	# `_apply_cannonical_ordering( array $input )` Applies the canonical ordering to a decomposed UCS4 sequence. Parameters: `$input` `array` Decomposed UCS4 sequence. Returns: `array` Ordered UCS4 sequence
`public array`	# `_combine( array $input )` Do composition of a sequence of starters and non-starters. Parameters: `$input` `array` Decomposed UCS4 sequence. Returns: `array` Ordered UCS4 sequence
`public`	# `_utf8_to_ucs4( $input )` Converts a UTF-8 encoded string to its UCS-4 representation. By UCS-4 "strings" we mean arrays of 32-bit integers, each representing one of the "chars". This is due to PHP not being able to handle strings with a bit depth other than 8. The same applies to the reverse method _ucs4_to_utf8(). The following UTF-8 encodings are supported:

    bytes  bits  representation
    1       7    0xxxxxxx
    2      11    110xxxxx 10xxxxxx
    3      16    1110xxxx 10xxxxxx 10xxxxxx
    4      21    11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    5      26    111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
    6      31    1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx

Each x represents a bit that can be used to store character data. The five and six byte sequences are part of Annex D of ISO/IEC 10646-1:2000.
`public`	# `_ucs4_to_utf8( $input )` Convert UCS-4 string into UTF-8 string. See _utf8_to_ucs4() for details
`public`	# `_ucs4_to_ucs4_string( $input )` Convert UCS-4 array into UCS-4 string
`public`	# `_ucs4_string_to_ucs4( $input )` Convert UCS-4 string into UCS-4 array
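
The UTF-8 byte layouts listed for _utf8_to_ucs4() can be decoded as in the following sketch. It is an illustration of those layouts only, not the library's implementation: the function name `utf8_to_code_points()` is hypothetical, and the sketch deliberately leaves out the overlong-encoding and range checks that the `overlong` option governs.

```php
<?php
/**
 * Hypothetical stand-alone decoder: turns a UTF-8 byte string into an
 * array of integer code points, or returns false on malformed input.
 * Accepts the 1- to 6-byte forms; does NOT reject overlong encodings.
 */
function utf8_to_code_points($input)
{
    $out = array();
    $len = strlen($input);
    $i = 0;
    while ($i < $len) {
        $byte = ord($input[$i]);
        // The lead byte determines the payload bits and the number of
        // continuation bytes that must follow.
        if ($byte < 0x80) {                    // 0xxxxxxx
            $cp = $byte;        $extra = 0;
        } elseif (($byte & 0xE0) === 0xC0) {   // 110xxxxx
            $cp = $byte & 0x1F; $extra = 1;
        } elseif (($byte & 0xF0) === 0xE0) {   // 1110xxxx
            $cp = $byte & 0x0F; $extra = 2;
        } elseif (($byte & 0xF8) === 0xF0) {   // 11110xxx
            $cp = $byte & 0x07; $extra = 3;
        } elseif (($byte & 0xFC) === 0xF8) {   // 111110xx
            $cp = $byte & 0x03; $extra = 4;
        } elseif (($byte & 0xFE) === 0xFC) {   // 1111110x
            $cp = $byte & 0x01; $extra = 5;
        } else {
            return false;                      // stray continuation byte, 0xFE or 0xFF
        }
        if ($i + $extra >= $len) {
            return false;                      // sequence is cut off
        }
        for ($k = 1; $k <= $extra; $k++) {
            $next = ord($input[$i + $k]);
            if (($next & 0xC0) !== 0x80) {
                return false;                  // not of the form 10xxxxxx
            }
            $cp = ($cp << 6) | ($next & 0x3F); // append six payload bits
        }
        $out[] = $cp;
        $i += $extra + 1;
    }
    return $out;
}

// "nö" is the byte sequence 6E C3 B6.
print_r(utf8_to_code_points("n\xC3\xB6"));
```

Returning an array of integers mirrors the page's remark that PHP strings cannot hold units wider than 8 bits.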

Properties summary
`public array`	`$NP`	`array()`	# Holds all relevant mapping tables, loaded from a separate file on construct. See RFC3454 for details
`public string`	`$_punycode_prefix`	`'xn--'`	#
`public integer`	`$_invalid_ucs`	`0x80000000`	#
`public integer`	`$_max_ucs`	`0x10FFFF`	#
`public integer`	`$_base`	`36`	#
`public integer`	`$_tmin`	`1`	#
`public integer`	`$_tmax`	`26`	#
`public integer`	`$_skew`	`38`	#
`public integer`	`$_damp`	`700`	#
`public integer`	`$_initial_bias`	`72`	#
`public integer`	`$_initial_n`	`0x80`	#
`public integer`	`$_sbase`	`0xAC00`	#
`public integer`	`$_lbase`	`0x1100`	#
`public integer`	`$_vbase`	`0x1161`	#
`public integer`	`$_tbase`	`0x11A7`	#
`public integer`	`$_lcount`	`19`	#
`public integer`	`$_vcount`	`21`	#
`public integer`	`$_tcount`	`28`	#
`public integer`	`$_ncount`	`588`	#
`public integer`	`$_scount`	`11172`	#
`public boolean`	`$_error`	`false`	#
`public string`	`$_api_encoding`	`'utf8'`	#
`public boolean`	`$_allow_overlong`	`false`	#
`public boolean`	`$_strict_mode`	`false`	#
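
The undocumented numeric properties above appear to fall into two groups: the Punycode parameters of RFC 3492 (`$_base` through `$_initial_n`) and the Hangul constants of Unicode TR15 (`$_sbase` through `$_scount`). The sketch below shows how such values are conventionally used, following the pseudocode in those two specifications. The function and constant names are hypothetical, and this is not a copy of the class's `_adapt()` or `_hangul_decompose()`. It requires PHP 7 or later for `intdiv()`.

```php
<?php
// Values copied from the properties table; constant names are hypothetical.
const PUNY_BASE = 36;
const PUNY_TMIN = 1;
const PUNY_TMAX = 26;
const PUNY_SKEW = 38;
const PUNY_DAMP = 700;

const H_SBASE = 0xAC00;
const H_LBASE = 0x1100;
const H_VBASE = 0x1161;
const H_TBASE = 0x11A7;
const H_TCOUNT = 28;
const H_NCOUNT = 588;   // vcount * tcount = 21 * 28
const H_SCOUNT = 11172; // lcount * ncount = 19 * 588

/** Bias adaptation as given in the RFC 3492 pseudocode. */
function punycode_adapt($delta, $numPoints, $isFirst)
{
    // The first delta is damped heavily, later ones are halved.
    $delta = $isFirst ? intdiv($delta, PUNY_DAMP) : intdiv($delta, 2);
    $delta += intdiv($delta, $numPoints);

    $k = 0;
    $threshold = intdiv((PUNY_BASE - PUNY_TMIN) * PUNY_TMAX, 2);
    while ($delta > $threshold) {
        $delta = intdiv($delta, PUNY_BASE - PUNY_TMIN);
        $k += PUNY_BASE;
    }
    return $k + intdiv((PUNY_BASE - PUNY_TMIN + 1) * $delta, $delta + PUNY_SKEW);
}

/** Arithmetic Hangul syllable decomposition as described in TR15. */
function hangul_decompose($char)
{
    $sIndex = $char - H_SBASE;
    if ($sIndex < 0 || $sIndex >= H_SCOUNT) {
        return array($char); // not a precomposed syllable: return unchanged
    }
    $result = array(
        H_LBASE + intdiv($sIndex, H_NCOUNT),            // leading consonant
        H_VBASE + intdiv($sIndex % H_NCOUNT, H_TCOUNT), // vowel
    );
    $t = H_TBASE + $sIndex % H_TCOUNT;
    if ($t !== H_TBASE) {
        $result[] = $t;                                  // optional trailing consonant
    }
    return $result;
}

// U+D55C decomposes into U+1112, U+1161, U+11AB.
print_r(array_map('dechex', hangul_decompose(0xD55C)));
```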
