Ticket #820 (closed defect: fixed)
multibyte characters become garbled when imported via yaml fixture file
| Reported by: | malte | Owned by: | lsmith |
|---|---|---|---|
| Priority: | major | Milestone: | 0.10.3 |
| Component: | Import/Export | Version: | 0.9.0 |
| Severity: | | Keywords: | |
| Cc: | | Has Test: | |
| Status: | | Has Patch: | |
Description
If you put a few multibyte characters like "åäö£" (Swedish characters and the pound sign) in a row in a table and then dump the data to a fixture file, all is well: the characters are written cleanly to the YAML fixture file, and they occupy 2 bytes each, as expected. But if you import this data back into the db, the multibyte characters become garbled: "åäö£" becomes "Ã¥Ã¤Ã¶Â£". The latter string occupies 16 bytes, since it is now 8 multibyte characters instead of 4.
I have tested and verified the broken-ness of this on Linux (Debian lenny) and Mac OS X (Leopard).
I think this is a major problem with Doctrine, since it corrupts your data if you are working with languages and symbols other than US English. Not even British English is safe (the pound sign is broken).
DB used is MySQL; the collation is MySQL's default, latin1_swedish_ci. That collation can definitely handle the characters above, so the error is not with the db or the OS (tested on multiple platforms). PHP version is 5.2.5.
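The byte counts above match what you would see if the UTF-8 bytes from the fixture file were read as latin1 somewhere along the import path and then stored again. That is only a guess about the cause, not something verified against Doctrine's code. A minimal Python sketch of the effect, independent of Doctrine and MySQL (all variable names are made up for illustration):

```python
# Hypothetical reproduction of the symptom; not Doctrine code.
original = "åäö£"

# In the fixture file each of these characters is stored as 2 bytes of UTF-8.
utf8_bytes = original.encode("utf-8")

# Assumption: on import the bytes are interpreted as latin1,
# so every single byte turns into a character of its own.
garbled = utf8_bytes.decode("latin-1")

# Writing the garbled text out as UTF-8 again doubles its size.
garbled_bytes = garbled.encode("utf-8")

print(len(original), len(utf8_bytes))    # 4 characters, 8 bytes
print(garbled)
print(len(garbled), len(garbled_bytes))  # 8 characters, 16 bytes

# Reversing the wrong step restores the original text.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired == original)
```

If this is what happens, the data is recoverable as long as it has only been mis-decoded once, which the last two lines illustrate.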