[egenix-users] ODBC encoding problems

Martijn Pieters mj at zopatista.com
Mon Oct 29 10:44:26 CET 2007


On 10/28/07, Martijn Pieters <mj at zopatista.com> wrote:
> Here are the first 2 rows as comma-separated values, which happen to
> be public company names.
> 2, SuperOffice Norge AS, Østnorsk, 2
> 3, Norsk Jazzforum, Østnorsk, 1453
>
> In case email encoding gets mucked up, that's 2 times O with /
> (unicode 00D8) in the 'department' column (don't ask why the
> department column holds a region in Norway).

I used the TDSDUMP environment variable to trace the problem. isql
sets the client character set to UTF-8, while for mx.ODBC the client
character set appears to be US-ASCII!

Here is the relevant section for the mx.ODBC query 'select department
from CRM5.contact where contact_id=2':

09:39:20.790301 Received packet
0000 81 01 00 00 00 09 00 a7-27 00 06 04 d0 00 00 0a |........ '.......|
0010 64 00 65 00 70 00 61 00-72 00 74 00 6d 00 65 00 |d.e.p.a. r.t.m.e.|
0020 6e 00 74 00 d1 08 00 d8-73 74 6e 6f 72 73 6b fd |n.t..... stnorsk.|
0030 10 00 c1 00 01 00 00 00-                        |........|

09:39:20.790339 processing result tokens.  marker is  81(TDS7_RESULT)
09:39:20.790461 tds_free_all_results()
09:39:20.790490 processing TDS7 result. set current_results to tds->res_info
09:39:20.790512 adjust_character_column_size:
        Server charset: CP1252
        Server column_size: 39
        Client charset: US-ASCII
        Client column_size: 39

When querying from isql instead, the same section of the log reads:

09:35:07.428719 Received packet
0000 81 01 00 00 00 09 00 a7-27 00 06 04 d0 00 00 0a |........ '.......|
0010 64 00 65 00 70 00 61 00-72 00 74 00 6d 00 65 00 |d.e.p.a. r.t.m.e.|
0020 6e 00 74 00 d1 08 00 d8-73 74 6e 6f 72 73 6b ff |n.t..... stnorsk.|
0030 11 00 c1 00 01 00 00 00-79 00 00 00 00 fe 00 00 |........ y.......|
0040 e0 00 01 00 00 00      -                        |......|

09:35:07.428756 processing result tokens.  marker is  81(TDS7_RESULT)
09:35:07.428796 tds_free_all_results()
09:35:07.428820 processing TDS7 result. set current_results to tds->res_info
09:35:07.428840 adjust_character_column_size:
        Server charset: CP1252
        Server column_size: 39
        Client charset: UTF-8
        Client column_size: 156
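
To make the difference concrete, here is a minimal Python sketch of what
the two client charsets mean for this one value. It assumes the eight
bytes following the "d1 08 00" prefix in the dump are the column data in
the server charset (CP1252), and that the jump from 39 to 156 reflects
reserving up to 4 bytes per character for UTF-8. Both are my reading of
the log, not something the log states outright.

```python
# Column data as it appears in both packet dumps: d8 73 74 6e 6f 72 73 6b
raw = bytes.fromhex("d873746e6f72736b")

# The log reports "Server charset: CP1252", so decode with that.
text = raw.decode("cp1252")

# A UTF-8 client can represent the value; the O-with-stroke becomes 2 bytes.
utf8_bytes = text.encode("utf-8")

# A US-ASCII client cannot represent U+00D8 at all.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Assumption: the client column size is the server size times the
# maximum bytes per character of the client charset (4 for UTF-8).
server_column_size = 39
utf8_client_column_size = server_column_size * 4
```

With a US-ASCII client charset there is simply no valid target for the
0xD8 byte, which matches the symptom.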

The question remains, then, how FreeTDS determines the client charset.
I'll experiment with setting the client charset explicitly.
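
To compare dumps between experiments without reading them by eye, a
small script can pull out the negotiated client charset. This is only a
sketch: the helper name client_charsets is made up for illustration, and
it assumes the dump keeps the "Client charset:" line format shown above.

```python
import re

def client_charsets(log_text):
    """Return every 'Client charset' value found in a TDSDUMP log, in order."""
    # Hypothetical helper; matches lines like "        Client charset: UTF-8".
    return re.findall(r"^\s*Client charset:\s*(\S+)", log_text, flags=re.MULTILINE)

# Excerpt copied from the mx.ODBC dump above.
sample = """\
09:39:20.790512 adjust_character_column_size:
        Server charset: CP1252
        Server column_size: 39
        Client charset: US-ASCII
        Client column_size: 39
"""

print(client_charsets(sample))
```

In practice the string would come from reading the file named in
TDSDUMP rather than from an inline sample.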

-- 
Martijn Pieters

