[egenix-users] ODBC encoding problems

M.-A. Lemburg mal at egenix.com
Mon Oct 29 10:48:59 CET 2007


On 2007-10-29 09:44, Martijn Pieters wrote:
> On 10/28/07, Martijn Pieters <mj at zopatista.com> wrote:
>> Here are the first 2 rows as comma-separated values, which happen to
>> be public company names.
>> 2, SuperOffice Norge AS, Østnorsk, 2
>> 3, Norsk Jazzforum, Østnorsk, 1453
>>
>> In case email encoding gets mucked up, that's 2 times O with /
>> (unicode 00D8) in the 'department' column (don't ask why the
>> department column holds a region in Norway).
> 
> I used the TDSDUMP environment variable to trace the problem. isql
> sets the character set to UTF-8, while for mx.ODBC, the character set
> appears to be set to ASCII!

mxODBC doesn't set the character set, it only makes an assumption
of what the character set is in order to deal with Unicode
conversions.

This is what we get in the TDS log file when starting up
our test suite:

util.c:288:Starting log file for FreeTDS 0.64
        on 2007-10-29 10:30:40 with debug flags 0x4fff.
iconv.c:195:names for ISO-8859-1: ISO-8859-1
iconv.c:195:names for UTF-8: UTF-8
iconv.c:195:names for UCS-2LE: UCS-2LE
iconv.c:195:names for UCS-2BE: UCS-2BE
iconv.c:361:iconv to convert client-side data to the "ISO-8859-1" character set
iconv.c:514:tds_iconv_info_init: converting "ISO-8859-1"->"UCS-2LE"
iconv.c:514:tds_iconv_info_init: converting "ISO-8859-1"->"UCS-2LE"

and later on:

token.c:3395:adjust_character_column_size:
        Server charset: CP1252
        Server column_size: 100
        Client charset: ISO-8859-1
        Client column_size: 100

The Latin-1 encoding used is in line with our test results.
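
One caveat worth keeping in mind: CP1252 (the server charset in the log) and ISO-8859-1 are not identical. They differ in the 0x80-0x9F range, so a Latin-1 client charset covers the Ø in the sample rows but not every character the server can store. A minimal Python 3 sketch of the difference, using only the standard codecs; the helper name `latin1_safe` is made up for illustration:

```python
def latin1_safe(text):
    """Return True if every character of text can be encoded as ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


# Byte 0xD8 means the same thing (Ø) in CP1252 and in ISO-8859-1 ...
o_slash = b"\xd8".decode("cp1252")

# ... but byte 0x80 is the euro sign in CP1252 and has no Latin-1 equivalent.
euro = b"\x80".decode("cp1252")

print(latin1_safe(o_slash + "stnorsk"))
print(latin1_safe(euro))
```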

> Here is the relevant section for the mx.ODBC query 'select department
> from CRM5.contact where contact_id=2':
> 
> 09:39:20.790301 Received packet
> 0000 81 01 00 00 00 09 00 a7-27 00 06 04 d0 00 00 0a |........ '.......|
> 0010 64 00 65 00 70 00 61 00-72 00 74 00 6d 00 65 00 |d.e.p.a. r.t.m.e.|
> 0020 6e 00 74 00 d1 08 00 d8-73 74 6e 6f 72 73 6b fd |n.t..... stnorsk.|
> 0030 10 00 c1 00 01 00 00 00-                        |........|
> 
> 09:39:20.790339 processing result tokens.  marker is  81(TDS7_RESULT)
> 09:39:20.790461 tds_free_all_results()
> 09:39:20.790490 processing TDS7 result. set current_results to tds->res_info
> 09:39:20.790512 adjust_character_column_size:
>         Server charset: CP1252
>         Server column_size: 39
>         Client charset: US-ASCII
>         Client column_size: 39
> 
> When querying from isql instead, the same section of the log reads:
> 
> 09:35:07.428719 Received packet
> 0000 81 01 00 00 00 09 00 a7-27 00 06 04 d0 00 00 0a |........ '.......|
> 0010 64 00 65 00 70 00 61 00-72 00 74 00 6d 00 65 00 |d.e.p.a. r.t.m.e.|
> 0020 6e 00 74 00 d1 08 00 d8-73 74 6e 6f 72 73 6b ff |n.t..... stnorsk.|
> 0030 11 00 c1 00 01 00 00 00-79 00 00 00 00 fe 00 00 |........ y.......|
> 0040 e0 00 01 00 00 00      -                        |......|
> 
> 09:35:07.428756 processing result tokens.  marker is  81(TDS7_RESULT)
> 09:35:07.428796 tds_free_all_results()
> 09:35:07.428820 processing TDS7 result. set current_results to tds->res_info
> 09:35:07.428840 adjust_character_column_size:
>         Server charset: CP1252
>         Server column_size: 39
>         Client charset: UTF-8
>         Client column_size: 156
> 
> The question remains then how freetds determines the client charset.
> I'll experiment with setting client charset explicitly.
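
To make the two dumps concrete: both packets carry the same eight payload bytes (d8 73 74 6e 6f 72 73 6b), and only the client charset differs. The following Python 3 sketch converts those bytes from the server charset to each of the two client charsets seen in the logs. It only illustrates the charset arithmetic. The `convert` helper is hypothetical, and how FreeTDS itself treats unconvertible characters is not visible in the logs.

```python
# Payload bytes of the 'department' value, copied from the packet dumps.
raw = bytes.fromhex("d8 73 74 6e 6f 72 73 6b")


def convert(data, server_charset, client_charset):
    """Decode data from the server charset and re-encode it for the client.

    Returns None when the text cannot be represented in the client charset.
    """
    text = data.decode(server_charset)
    try:
        return text.encode(client_charset)
    except UnicodeEncodeError:
        return None


for client in ("utf-8", "ascii"):
    print(client, convert(raw, "cp1252", client))
```

With UTF-8 the Ø takes two bytes, so the converted value is longer than the original. That fits the log, where the client column size grows from 39 to 156 (4 x 39) for UTF-8. With US-ASCII there is no representation for Ø at all.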

What's strange is that your freetds.conf entry says:

[JazzForum]
        ; development host uses a ssh tunnel to connect
        host = localhost
        port = 1433
        tds version = 8.0

Yet the log suggests that TDS 7.0 is being used.

Perhaps isql is setting the character set explicitly, while the
FreeTDS driver, when used via mxODBC, picks up a default encoding
from some locale environment variable?
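
If the locale is involved, a quick way to compare the two environments (the shell running isql and the process using mxODBC) is to look at which codeset the POSIX locale variables name. The Python 3 sketch below follows the usual POSIX precedence (LC_ALL, then LC_CTYPE, then LANG). Whether FreeTDS actually consults these variables is an assumption here, and both function names are made up for illustration.

```python
import os


def codeset_from_locale(name):
    """Return the codeset part of a POSIX locale name, or None if absent.

    POSIX locale names look like language[_territory][.codeset][@modifier].
    """
    if not name or "." not in name:
        return None
    codeset = name.split(".", 1)[1]
    return codeset.split("@", 1)[0] or None


def effective_codeset(environ=os.environ):
    """Return the codeset named by the first non-empty locale variable."""
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = environ.get(var)
        if value:
            return codeset_from_locale(value)
    return None


# A result of None means no codeset is named explicitly; in the "C" or
# "POSIX" locale that typically means US-ASCII.
print(effective_codeset())
```

Running this once from the shell used for isql and once from inside the process that loads mxODBC would show whether the two see different locale settings.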

Try adding an explicit line

	client charset = ISO-8859-1

to your freetds.conf file.

-- 
Marc-Andre Lemburg
eGenix.com

