This is a wrapper around PHP’s mb_convert_encoding and iconv functions. This library adds:
- fallback from mb to iconv for encodings it does not support
- conversion of warnings to proper exceptions.
The recommended way to install the Transcoder library is through Composer:
$ composer require fossar/transcoder
This command requires you to have Composer installed globally, as explained in the installation chapter of the Composer documentation.
Create the right transcoder for your platform and convert an ISO-8859-1 encoded string to the transcoder's default target encoding:
use Ddeboer\Transcoder\Transcoder;
$transcoder = Transcoder::create();
$result = $transcoder->transcode('España', 'iso-8859-1');
You can also manually instantiate a transcoder of your liking:
use Ddeboer\Transcoder\MbTranscoder;
$transcoder = new MbTranscoder();
Or:
use Ddeboer\Transcoder\IconvTranscoder;
$transcoder = new IconvTranscoder();
The second argument specifies the source encoding. It can be omitted or passed as null:
$transcoder->transcode('España');
In that case, however, the behaviour is backend-specific:
- IconvTranscoder will use the encoding of the current locale of the process.
- MbTranscoder will try to detect the encoding from a list based on the value of the mbstring.language setting. By default, this tries ASCII, followed by UTF-8. The number of supported languages is limited, though, and the encoding tables often overlap, so the detection might be unreliable.
Consequently, automatic detection is of little use for Western languages. You will get much more reliable results when you specify the source encoding explicitly.
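To illustrate why detection can fail, here is a minimal sketch that calls PHP's mbstring functions directly rather than this library. It assumes the mbstring extension is loaded and that the detection list is ASCII followed by UTF-8, as described above. The variable names are made up for illustration.

```php
<?php
// 'España' encoded as ISO-8859-1: the 'ñ' is the single byte 0xF1.
$latin1 = "Espa\xF1a";

// Strict detection against ASCII and UTF-8 fails, because 0xF1 followed
// by 'a' is neither ASCII nor a valid UTF-8 sequence.
$detected = mb_detect_encoding($latin1, ['ASCII', 'UTF-8'], true);
var_dump($detected); // bool(false)

// Naming the source encoding explicitly gives a well-defined result.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
var_dump(bin2hex($utf8));
```

With an explicit source encoding there is nothing to guess, so the result does not depend on the locale or on mbstring settings.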
Specify a default target encoding as the first argument to create():
use Ddeboer\Transcoder\Transcoder;
$isoTranscoder = Transcoder::create('iso-8859-1');
Alternatively, specify a target encoding as the third argument in a transcode() call:
use Ddeboer\Transcoder\Transcoder;
$transcoder->transcode('España', 'iso-8859-1', 'UTF-8');
PHP’s mb_convert_encoding and iconv are inconvenient to use because they generate notices and warnings instead of proper exceptions. This library fixes that:
use Ddeboer\Transcoder\Exception\UndetectableEncodingException;
use Ddeboer\Transcoder\Exception\UnsupportedEncodingException;
use Ddeboer\Transcoder\Exception\IllegalCharacterException;
$input = 'España';
try {
    $transcoder->transcode($input, 'utf-8', 'not-a-real-encoding');
} catch (UnsupportedEncodingException $e) {
    // ‘not-a-real-encoding’ is an unsupported encoding
}
try {
    $transcoder->transcode('Illegal quotes: ‘ ’', 'utf-8', 'iso-8859-1');
} catch (IllegalCharacterException $e) {
    // Curly quotes ‘ ’ are illegal in ISO-8859-1
}
try {
    $transcoder->transcode($input);
} catch (UndetectableEncodingException $e) {
    // Failed to automatically detect $input’s encoding (mb) or not a valid string in the current locale (iconv)
}
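For readers curious how warnings can be turned into exceptions in plain PHP, the following is a rough sketch of the general technique using set_error_handler around iconv. It is not taken from this library's source: the convertOrThrow helper is hypothetical, and it throws a generic RuntimeException instead of the library's specific exception classes.

```php
<?php
// Hypothetical helper: not part of the Transcoder API.
function convertOrThrow(string $string, string $from, string $to): string
{
    // Turn any notice or warning raised during the call into an exception.
    set_error_handler(static function (int $severity, string $message): bool {
        throw new RuntimeException($message, $severity);
    });

    try {
        $result = iconv($from, $to, $string);
    } finally {
        // Always put the previous error handler back.
        restore_error_handler();
    }

    if ($result === false) {
        throw new RuntimeException('Conversion failed without a diagnostic.');
    }

    return $result;
}

try {
    convertOrThrow('España', 'utf-8', 'not-a-real-encoding');
} catch (RuntimeException $e) {
    echo 'Caught: ', $e->getMessage(), PHP_EOL;
}
```

Restoring the handler in a finally block keeps the temporary handler from leaking into unrelated code when the conversion throws.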
In general, mb_convert_encoding is faster than iconv. However, as iconv supports more encodings than mb_convert_encoding, it makes sense to combine the two.
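The idea of combining the two can be sketched in plain PHP as follows. This is a simplified illustration under stated assumptions, not the library's actual implementation: the convertWithFallback helper is hypothetical, and it decides which backend to use by checking the names returned by mb_list_encodings().

```php
<?php
// Hypothetical helper: a simplified illustration, not the library's code.
function convertWithFallback(string $string, string $from, string $to): string
{
    // mb_list_encodings() returns canonical names only, so aliases such as
    // "utf8" are not recognised here and simply take the iconv path.
    $mbEncodings = array_map('strtolower', mb_list_encodings());
    $mbSupports = static function (string $encoding) use ($mbEncodings): bool {
        return in_array(strtolower($encoding), $mbEncodings, true);
    };

    // Prefer mbstring when it knows both encodings.
    if ($mbSupports($from) && $mbSupports($to)) {
        return mb_convert_encoding($string, $to, $from);
    }

    // Otherwise fall back to iconv, which knows more encodings.
    $result = iconv($from, $to, $string);
    if ($result === false) {
        throw new RuntimeException("Cannot convert from $from to $to.");
    }

    return $result;
}

echo bin2hex(convertWithFallback('España', 'UTF-8', 'ISO-8859-1')), PHP_EOL;
```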
So, the Transcoder returned from create():