Class idna_convert

Encode/decode Internationalized Domain Names.

This class converts internationalized domain names (see RFC 3490 for details), as used by various registries worldwide, between their original (localized) form and their encoded form as used in the DNS (Domain Name System).

The class provides two public methods, encode() and decode(), which do exactly what you would expect them to do. You may pass complete domain names, simple strings, or complete email addresses. That means you can use any of the following notations:

  • www.nörgler.com
  • xn--nrgler-wxa
  • xn--brse-5qa.xn--knrz-1ra.info

Unicode input can be given as a UTF-8 string, a UCS-4 string, or a UCS-4 array. Unicode output is available in the same formats. You can select your preferred format via set_parameter().

ACE input and output are always expected to be ASCII.
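
To illustrate the round trip between the localized and the ACE form, here is a minimal sketch. It does not use this PHP class; it assumes Python and its standard-library "idna" codec (an RFC 3490 implementation), and the helper names to_ace() and to_unicode() are made up for the example.

```python
# Illustration only: Python's standard-library "idna" codec, not idna_convert.

def to_ace(domain: str) -> str:
    """Convert a localized domain name to its ASCII-compatible (ACE) form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Convert an ACE domain name back to its localized form."""
    return domain.encode("ascii").decode("idna")


print(to_ace("www.nörgler.com"))                     # www.xn--nrgler-wxa.com
print(to_unicode("xn--brse-5qa.xn--knrz-1ra.info"))  # börse.knürz.info
```

Each label is converted separately, so labels that are already plain ASCII (such as "www" and "com") pass through unchanged.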

Direct known subclasses

Net_IDNA_php4
Copyright: 2004-2007 phlyLabs Berlin, http://phlylabs.de
Author: Matthias Sommerfeld <mso@phlylabs.de>
Version: 0.5.1
Located at idn/idna_convert.class.php
Methods summary
public
# idna_convert( $options = false )
public boolean
# set_parameter( mixed $option, string $value = false )

Sets a new option value. Available options and values:

  • encoding: input format, one of 'utf8' (UTF-8), 'ucs4_string' (UCS4 as string) or 'ucs4_array' (UCS4 as array); the output is always UTF-8.
  • overlong: Unicode does not allow unnecessarily long encodings of characters; set this parameter to true to allow them, otherwise to false. The default is false.
  • strict: true selects strict mode, good for registration purposes, which causes errors on failures; false selects loose mode, ideal for "wildlife" applications, which silently ignores errors and returns the original input instead.

Parameters

$option
mixed
Parameter to set (string: single parameter; array of Parameter => Value pairs)
$value
string
Value to use (if parameter 1 is a string)

Returns

boolean
true on success, false otherwise
public string
# decode( string $input, $one_time_encoding = false )

Decode a given ACE domain name

Parameters

$input
string
Domain name (ACE string)
$one_time_encoding
string
Desired output encoding, see idna_convert::set_parameter()

Returns

string
Decoded Domain name (UTF-8 or UCS-4)
public string
# encode( string $decoded, $one_time_encoding = false )

Encode a given UTF-8 domain name

Parameters

$decoded
string
Domain name (UTF-8 or UCS-4)
$one_time_encoding
string
Desired input encoding, see idna_convert::set_parameter()

Returns

string
Encoded Domain name (ACE string)
public string
# get_last_error( )

Use this method to get the last error that occurred

Returns

string
The last error that occurred
public
# _decode( $encoded )

The actual decoding algorithm

public
# _encode( $decoded )

The actual encoding algorithm

public
# _adapt( $delta, $npoints, $is_first )

Adapt the bias according to the current code point and position
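
As a rough illustration of what bias adaptation involves, the following sketch follows the bias adaptation function of RFC 3492 (section 6.1), using the parameter values listed under "Properties summary". It is written in Python and is an assumption about the general technique, not a transcription of this method.

```python
# Punycode parameters, matching $_base, $_tmin, $_tmax, $_skew and $_damp.
BASE, TMIN, TMAX, SKEW, DAMP = 36, 1, 26, 38, 700


def adapt(delta: int, npoints: int, is_first: bool) -> int:
    """Return the new bias after one delta has been processed (RFC 3492, 6.1)."""
    # The very first delta is damped heavily; later deltas are halved.
    delta = delta // DAMP if is_first else delta // 2
    # Compensate for the string having grown by one code point.
    delta += delta // npoints
    k = 0
    # Divide delta down until it falls into the range the final formula expects.
    while delta > ((BASE - TMIN) * TMAX) // 2:
        delta //= BASE - TMIN
        k += BASE
    return k + ((BASE - TMIN + 1) * delta) // (delta + SKEW)


print(adapt(591, 5, True))
```

The call shown uses the delta that arises when encoding "börse" (591, with 5 code points handled so far); the resulting bias is 0.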

public
# _encode_digit( $d )

Encode a certain digit

public
# _decode_digit( $cp )

Decode a certain digit
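
The digit mapping used by Punycode (RFC 3492) can be sketched as follows: digit values 0 to 25 map to 'a' to 'z', and 26 to 35 map to '0' to '9'. This is a Python sketch under that assumption; the function names and the choice to take a one-character string (rather than a code point) are illustrative and not taken from this class.

```python
def encode_digit(d: int) -> str:
    """Map a digit value 0..35 to its basic character ('a'..'z', '0'..'9')."""
    if not 0 <= d < 36:
        raise ValueError("digit value out of range: %r" % d)
    return chr(d + ord("a")) if d < 26 else chr(d - 26 + ord("0"))


def decode_digit(ch: str) -> int:
    """Inverse of encode_digit(); letters are accepted in either case."""
    cp = ord(ch)
    if ord("0") <= cp <= ord("9"):
        return cp - ord("0") + 26
    if ord("A") <= cp <= ord("Z"):
        return cp - ord("A")
    if ord("a") <= cp <= ord("z"):
        return cp - ord("a")
    raise ValueError("not a Punycode digit: %r" % ch)


# The suffix "wxa" of xn--nrgler-wxa consists of the digit values 22, 23, 0.
print([decode_digit(c) for c in "wxa"])
```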

public
# _error( $error = '' )

Internal error handling method

public string
# _nameprep( array $input )

Do Nameprep according to RFC3491 and RFC3454

Parameters

$input
array
Unicode Characters

Returns

string
Unicode Characters, Nameprep'd
public array
# _hangul_decompose( integer $char )

Decomposes a Hangul syllable (see http://www.unicode.org/unicode/reports/tr15/#Hangul)

Parameters

$char
integer
32bit UCS4 code point

Returns

array
Either Hangul Syllable decomposed or original 32bit value as one value array
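
The arithmetic decomposition described in the Unicode report linked above can be sketched as follows. The constants mirror $_sbase, $_lbase, $_vbase, $_tbase and the count properties of this class; the Python function itself is an illustrative assumption, not the class's code.

```python
from typing import List

S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT  # 588
S_COUNT = L_COUNT * N_COUNT  # 11172


def hangul_decompose(char: int) -> List[int]:
    """Split a precomposed Hangul syllable into its jamo code points."""
    s_index = char - S_BASE
    if not 0 <= s_index < S_COUNT:
        return [char]  # not a Hangul syllable: return it as a one-value list
    result = [
        L_BASE + s_index // N_COUNT,              # leading consonant
        V_BASE + (s_index % N_COUNT) // T_COUNT,  # vowel
    ]
    t_index = s_index % T_COUNT
    if t_index:                                   # optional trailing consonant
        result.append(T_BASE + t_index)
    return result


print([hex(cp) for cp in hangul_decompose(0xD4DB)])
```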
public array
# _hangul_compose( array $input )

Composes a Hangul syllable (see http://www.unicode.org/unicode/reports/tr15/#Hangul)

Parameters

$input
array
Decomposed UCS4 sequence

Returns

array
UCS4 sequence with syllables composed
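
The reverse direction can be sketched in the same way: a leading consonant followed by a vowel forms an LV syllable, and an LV syllable followed by a trailing consonant forms an LVT syllable. Again this is a Python illustration of the arithmetic from the Unicode report, with a hypothetical function name, not the code of this method.

```python
from typing import List

S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
S_COUNT = L_COUNT * V_COUNT * T_COUNT  # 11172


def hangul_compose(seq: List[int]) -> List[int]:
    """Combine jamo sequences in seq into precomposed Hangul syllables."""
    if not seq:
        return []
    result = [seq[0]]
    for ch in seq[1:]:
        last = result[-1]
        l_index, v_index = last - L_BASE, ch - V_BASE
        if 0 <= l_index < L_COUNT and 0 <= v_index < V_COUNT:
            # Leading consonant + vowel -> LV syllable.
            result[-1] = S_BASE + (l_index * V_COUNT + v_index) * T_COUNT
            continue
        s_index, t_index = last - S_BASE, ch - T_BASE
        if (0 <= s_index < S_COUNT and s_index % T_COUNT == 0
                and 0 < t_index < T_COUNT):
            # LV syllable + trailing consonant -> LVT syllable.
            result[-1] = last + t_index
            continue
        result.append(ch)
    return result


print([hex(cp) for cp in hangul_compose([0x1111, 0x1171, 0x11B6])])
```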
public integer
# _get_combining_class( integer $char )

Returns the combining class of a certain wide char

Parameters

$char
integer
Wide char to check (32bit integer)

Returns

integer
Combining class if found, else 0
public array
# _apply_cannonical_ordering( array $input )

Applies the canonical ordering to a decomposed UCS4 sequence

Parameters

$input
array
Decomposed UCS4 sequence

Returns

array
Ordered UCS4 sequence
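
Canonical ordering, as defined by Unicode normalization, swaps two adjacent characters when the first has a higher combining class than the second and the second is not a starter (class 0). A minimal Python sketch, assuming the standard-library unicodedata module as the source of combining classes (this class uses its own tables instead):

```python
import unicodedata
from typing import List


def canonical_order(seq: List[int]) -> List[int]:
    """Reorder combining marks in seq by ascending combining class."""
    result = list(seq)
    swapped = True
    while swapped:  # simple bubble pass; starters (class 0) are never moved
        swapped = False
        for i in range(len(result) - 1):
            left = unicodedata.combining(chr(result[i]))
            right = unicodedata.combining(chr(result[i + 1]))
            if right != 0 and left > right:
                result[i], result[i + 1] = result[i + 1], result[i]
                swapped = True
    return result


# U+0301 (class 230) followed by U+0323 (class 220) gets reordered.
print([hex(cp) for cp in canonical_order([0x61, 0x0301, 0x0323])])
```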
public array
# _combine( array $input )

Do composition of a sequence of starter and non-starter

Parameters

$input
array
UCS4 Decomposed sequence

Returns

array
Ordered UCS4 sequence
public
# _utf8_to_ucs4( $input )

Converts a UTF-8 encoded string to its UCS-4 representation. UCS-4 "strings" here mean arrays of 32-bit integers, each representing one of the "chars". This is because PHP cannot handle strings with a bit depth other than 8. The same applies to the reverse method, _ucs4_to_utf8().

The following UTF-8 encodings are supported:

  bytes  bits  representation
  1      7     0xxxxxxx
  2      11    110xxxxx 10xxxxxx
  3      16    1110xxxx 10xxxxxx 10xxxxxx
  4      21    11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
  5      26    111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
  6      31    1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx

Each x represents a bit that can be used to store character data. The five- and six-byte sequences are part of Annex D of ISO/IEC 10646-1:2000.
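
The byte layouts listed for this method can be turned into a decoder along the following lines. This is a Python sketch with a hypothetical function name; it accepts the one- to six-byte forms but, unlike a strict decoder, does not reject overlong encodings, and it makes no claim about how this method handles errors.

```python
from typing import List

# (lead-byte shift, expected prefix after shift, sequence length, payload mask)
_LEAD_FORMS = [
    (7, 0b0, 1, 0x7F),
    (5, 0b110, 2, 0x1F),
    (4, 0b1110, 3, 0x0F),
    (3, 0b11110, 4, 0x07),
    (2, 0b111110, 5, 0x03),
    (1, 0b1111110, 6, 0x01),
]


def utf8_to_ucs4(data: bytes) -> List[int]:
    """Decode UTF-8 bytes into a list of 32-bit code point integers."""
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        for shift, prefix, length, mask in _LEAD_FORMS:
            if lead >> shift == prefix:
                break
        else:
            raise ValueError("invalid lead byte at offset %d" % i)
        tail = data[i + 1:i + length]
        # Every following byte must have the form 10xxxxxx.
        if len(tail) != length - 1 or any(b >> 6 != 0b10 for b in tail):
            raise ValueError("truncated or malformed sequence at offset %d" % i)
        value = lead & mask
        for b in tail:
            value = (value << 6) | (b & 0x3F)
        out.append(value)
        i += length
    return out


print(utf8_to_ucs4("nörgler".encode("utf-8")))
```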

public
# _ucs4_to_utf8( $input )

Convert UCS-4 string into UTF-8 string. See _utf8_to_ucs4() for details.

public
# _ucs4_to_ucs4_string( $input )

Convert UCS-4 array into UCS-4 string

public
# _ucs4_string_to_ucs4( $input )

Convert UCS-4 string into UCS-4 array

Properties summary
public array $NP array()

Holds all relevant mapping tables, loaded from a separate file on construction. See RFC 3454 for details.

public string $_punycode_prefix 'xn--'
public integer $_invalid_ucs 0x80000000
public integer $_max_ucs 0x10FFFF
public integer $_base 36
public integer $_tmin 1
public integer $_tmax 26
public integer $_skew 38
public integer $_damp 700
public integer $_initial_bias 72
public integer $_initial_n 0x80
public integer $_sbase 0xAC00
public integer $_lbase 0x1100
public integer $_vbase 0x1161
public integer $_tbase 0x11A7
public integer $_lcount 19
public integer $_vcount 21
public integer $_tcount 28
public integer $_ncount 588
public integer $_scount 11172
public boolean $_error false
public string $_api_encoding 'utf8'
public boolean $_allow_overlong false
public boolean $_strict_mode false
SimplePie Documentation API documentation generated by ApiGen 2.4.0