Class idna_convert
Encode/decode Internationalized Domain Names.
The class converts internationalized domain names (see RFC 3490 for details), as used by various registries worldwide, between their original (localized) form and the encoded form used in the DNS (Domain Name System).
The class provides two public methods: encode() converts a name to its encoded form, and decode() converts it back. Both accept complete domain names, simple strings, and complete email addresses. This means you can use any of the following notations:
- www.nörgler.com
- xn--nrgler-wxa
- xn--brse-5qa.xn--knrz-1ra.info
Unicode input may be given as a UTF-8 string, a UCS-4 string, or a UCS-4 array. Unicode output is available in the same formats. You can select your preferred format via set_parameter().
ACE input and output are always expected to be ASCII.
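Since ACE labels are plain ASCII and carry the 'xn--' prefix (the value of $_punycode_prefix), they can be recognized without decoding them. The following is a minimal sketch of such a check. The helper names looks_like_ace_label() and ace_labels() are hypothetical and are not part of idna_convert.

```php
<?php
// Hypothetical helpers for illustration only; not part of idna_convert.

// Returns true if a single label is pure ASCII and starts with 'xn--'.
function looks_like_ace_label(string $label): bool
{
    // Byte-wise check: every byte must be in the 7-bit ASCII range.
    $isAscii = preg_match('/^[\x00-\x7F]*$/', $label) === 1;
    return $isAscii && strncasecmp($label, 'xn--', 4) === 0;
}

// Splits a domain name on dots and returns only the ACE-looking labels.
function ace_labels(string $domain): array
{
    $labels = explode('.', $domain);
    return array_values(array_filter($labels, 'looks_like_ace_label'));
}
```

For the third notation listed above, ace_labels('xn--brse-5qa.xn--knrz-1ra.info') returns the two encoded labels and skips 'info'.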
Direct known subclasses
Net_IDNA_php4
Author: Matthias Sommerfeld <mso@phlylabs.de>
Version: 0.5.1
Located at idn/idna_convert.class.php
public boolean set_parameter( mixed $option, string $value = false )
Sets a new option value. Available options and values: [encoding - Use either UTF-8, UCS4 as array or UCS4 as string as input ('utf8' for UTF-8, ...
public array _hangul_decompose( integer $char )
Decomposes a Hangul syllable (see http://www.unicode.org/unicode/reports/tr15/#Hangul).
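The decomposition is pure arithmetic on the code point, using the Hangul constants listed among the properties ($_sbase, $_lbase, $_vbase, $_tbase, $_tcount, $_ncount, $_scount). The sketch below follows the procedure from the Unicode report linked above. It is not the class's own code, the function name hangul_decompose_sketch() is hypothetical, and returning non-Hangul input unchanged is an assumption.

```php
<?php
// Sketch of Hangul syllable decomposition; not the class's implementation.
function hangul_decompose_sketch(int $char): array
{
    $sBase = 0xAC00; $lBase = 0x1100; $vBase = 0x1161; $tBase = 0x11A7;
    $tCount = 28; $nCount = 588; $sCount = 11172;

    $sIndex = $char - $sBase;
    if ($sIndex < 0 || $sIndex >= $sCount) {
        // Assumption: anything outside the syllable block is passed through.
        return [$char];
    }

    $result = [
        $lBase + intdiv($sIndex, $nCount),           // leading consonant
        $vBase + intdiv($sIndex % $nCount, $tCount), // vowel
    ];
    $t = $tBase + $sIndex % $tCount;                 // trailing consonant
    if ($t !== $tBase) {
        $result[] = $t; // index 0 means "no trailing consonant"
    }
    return $result;
}
```

For example, U+D55C decomposes into U+1112, U+1161, U+11AB, while U+AC00 has no trailing consonant and yields only U+1100, U+1161.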
public array _hangul_compose( array $input )
Composes a Hangul syllable (see http://www.unicode.org/unicode/reports/tr15/#Hangul).
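Composition reverses the arithmetic: a leading consonant followed by a vowel forms an LV syllable, and an LV syllable followed by a trailing consonant forms an LVT syllable. The following sketch follows the procedure from the Unicode report linked above, with the constant values taken from the Hangul properties of this class. The function name hangul_compose_sketch() is hypothetical, and the code is an illustration rather than the class's implementation.

```php
<?php
// Sketch of Hangul syllable composition; not the class's implementation.
function hangul_compose_sketch(array $input): array
{
    $sBase = 0xAC00; $lBase = 0x1100; $vBase = 0x1161; $tBase = 0x11A7;
    $lCount = 19; $vCount = 21; $tCount = 28; $sCount = 11172;

    if (count($input) === 0) {
        return [];
    }
    $result = [$input[0]];
    $last = $input[0];
    for ($i = 1; $i < count($input); $i++) {
        $char = $input[$i];
        $lIndex = $last - $lBase;
        $vIndex = $char - $vBase;
        if ($lIndex >= 0 && $lIndex < $lCount && $vIndex >= 0 && $vIndex < $vCount) {
            // Leading consonant + vowel: replace the last entry with LV.
            $last = $sBase + ($lIndex * $vCount + $vIndex) * $tCount;
            $result[count($result) - 1] = $last;
            continue;
        }
        $sIndex = $last - $sBase;
        $tIndex = $char - $tBase;
        if ($sIndex >= 0 && $sIndex < $sCount && $sIndex % $tCount === 0
            && $tIndex > 0 && $tIndex < $tCount) {
            // LV syllable + trailing consonant: replace with LVT.
            $last += $tIndex;
            $result[count($result) - 1] = $last;
            continue;
        }
        // Nothing to combine: copy the code point through.
        $last = $char;
        $result[] = $char;
    }
    return $result;
}
```

Given U+1112, U+1161, U+11AB, the sketch returns the single code point U+D55C.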
public array _apply_cannonical_ordering( array $input )
Applies the canonical ordering to a decomposed UCS4 sequence.
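Canonical ordering sorts runs of combining marks by their combining class while leaving base characters (class 0) in place. The following sketch shows the idea as a simple exchange sort. The function name canonical_order_sketch() and the $classes parameter are hypothetical. In the class itself the combining classes would presumably come from the mapping tables held in $NP, which is an assumption here.

```php
<?php
// Sketch of canonical reordering; not the class's implementation.
// $classes maps code point => combining class; missing entries count as 0.
function canonical_order_sketch(array $input, array $classes): array
{
    $n = count($input);
    do {
        $swapped = false;
        for ($i = 1; $i < $n; $i++) {
            $prev = $classes[$input[$i - 1]] ?? 0;
            $cur  = $classes[$input[$i]] ?? 0;
            // Swap only if the current mark is a non-starter (class != 0)
            // and its class is lower than that of the preceding mark.
            if ($cur !== 0 && $prev > $cur) {
                $tmp = $input[$i - 1];
                $input[$i - 1] = $input[$i];
                $input[$i] = $tmp;
                $swapped = true;
            }
        }
    } while ($swapped);
    return $input;
}
```

With U+0301 (class 230) and U+0323 (class 220), the sequence 'a', U+0301, U+0323 is reordered to 'a', U+0323, U+0301.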
public _utf8_to_ucs4( $input )
Converts a UTF-8 encoded string to its UCS-4 representation. UCS-4 "strings" here are arrays of 32-bit integers, one for each of the "chars", because PHP cannot handle strings with a bit depth other than 8. The same applies to the reverse method, _ucs4_to_utf8().
The following UTF-8 encodings are supported:
bytes  bits  representation
1      7     0xxxxxxx
2      11    110xxxxx 10xxxxxx
3      16    1110xxxx 10xxxxxx 10xxxxxx
4      21    11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
5      26    111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
6      31    1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
Each x represents a bit that can be used to store character data. The five- and six-byte sequences are part of Annex D of ISO/IEC 10646-1:2000.
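The bit layouts above translate directly into a decoder: the lead byte determines how many continuation bytes follow, and each continuation byte contributes six bits. The following is a minimal sketch of that idea. The function name utf8_to_ucs4_sketch() is hypothetical, and unlike the class this sketch does not check for overlong sequences (compare $_allow_overlong) and simply throws on malformed input.

```php
<?php
// Sketch of UTF-8 to UCS-4 decoding; not the class's implementation.
function utf8_to_ucs4_sketch(string $input): array
{
    $output = [];
    $len = strlen($input);
    $i = 0;
    while ($i < $len) {
        $byte = ord($input[$i]);
        // Determine the payload bits of the lead byte and the number of
        // continuation bytes that must follow.
        if ($byte < 0x80) {
            $extra = 0; $value = $byte;
        } elseif (($byte & 0xE0) === 0xC0) {
            $extra = 1; $value = $byte & 0x1F;
        } elseif (($byte & 0xF0) === 0xE0) {
            $extra = 2; $value = $byte & 0x0F;
        } elseif (($byte & 0xF8) === 0xF0) {
            $extra = 3; $value = $byte & 0x07;
        } elseif (($byte & 0xFC) === 0xF8) {
            $extra = 4; $value = $byte & 0x03;
        } elseif (($byte & 0xFE) === 0xFC) {
            $extra = 5; $value = $byte & 0x01;
        } else {
            throw new InvalidArgumentException("Invalid lead byte at offset $i");
        }
        if ($i + $extra >= $len) {
            throw new InvalidArgumentException("Truncated sequence at offset $i");
        }
        for ($j = 1; $j <= $extra; $j++) {
            $next = ord($input[$i + $j]);
            if (($next & 0xC0) !== 0x80) {
                throw new InvalidArgumentException("Invalid continuation byte");
            }
            // Each continuation byte carries six payload bits.
            $value = ($value << 6) | ($next & 0x3F);
        }
        $output[] = $value;
        $i += $extra + 1;
    }
    return $output;
}
```

For instance, the bytes 0x6E 0xC3 0xB6 ("nö" in UTF-8) decode to the code points 0x6E and 0xF6.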
public array $NP = array()
Holds all relevant mapping tables, loaded from a separate file on construction. See RFC 3454 for details.
public string $_punycode_prefix = 'xn--'
public integer $_invalid_ucs = 0x80000000
public integer $_max_ucs = 0x10FFFF
public integer $_base = 36
public integer $_tmin = 1
public integer $_tmax = 26
public integer $_skew = 38
public integer $_damp = 700
public integer $_initial_bias = 72
public integer $_initial_n = 0x80
public integer $_sbase = 0xAC00
public integer $_lbase = 0x1100
public integer $_vbase = 0x1161
public integer $_tbase = 0x11A7
public integer $_lcount = 19
public integer $_vcount = 21
public integer $_tcount = 28
public integer $_ncount = 588
public integer $_scount = 11172
public boolean $_error = false
public string $_api_encoding = 'utf8'
public boolean $_allow_overlong = false
public boolean $_strict_mode = false