UTF-8 column names and error messages are octet-streams, not strings [rt.cpan.org #120141] #214

Open
mbeijen opened this issue Nov 15, 2017 · 0 comments
Labels
utf8 Unicode and UTF-8 handling

mbeijen commented Nov 15, 2017

Migrated from rt.cpan.org#120141 (status was 'open')

Requestors:

Attachments:

From [email protected] on 2017-02-08 04:07:52:

Hello,

Column names and error messages should be treated as strings, but
they are octet-streams in DBD-mysql-4.041.

The attached code creates a table with a column whose name
contains a non-ASCII character.  After issuing a SELECT statement
and calling fetchrow_hashref, it tries to get a value using the column
name at (1), but the result is undef.  If you use the octet stream for
the column name as a key, you get the value, at (2).
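
The mismatch between (1) and (2) is not specific to Perl. Below is a minimal, language-neutral sketch in Python (it is not DBD::mysql code) of a row whose keys are UTF-8 octets rather than character strings. The column name `名前`, the row contents, and the helper `decode_keys` are all hypothetical and chosen only for illustration.

```python
# Hypothetical illustration only: this is not DBD::mysql code.
# It models a fetched row whose keys are UTF-8 octets, as described in the report.

def decode_keys(row, encoding="utf-8"):
    """Return a copy of row with byte-string keys decoded to character strings."""
    return {
        (key.decode(encoding) if isinstance(key, bytes) else key): value
        for key, value in row.items()
    }

column = "名前"                          # column name as a character string
row = {column.encode("utf-8"): 42}       # keys arrive as octets (assumed)

by_string = row.get(column)                   # (1) character-string key: no match
by_octets = row.get(column.encode("utf-8"))   # (2) octet key: match
fixed = decode_keys(row).get(column)          # lookup succeeds once keys are decoded

print(by_string, by_octets, fixed)
```

Decoding the keys once, at the boundary where the row is produced, is what lets the rest of the program work with character strings only.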

Also, when you use Japanese error messages by adding the line
	lc_messages=ja_JP
to the [mysqld] section of my.ini, the messages are not decoded in
DBD::mysql.  As a result, messages are unreadable at (3) and (4).
We could explicitly decode them as in (5) for a caught message, but
this cannot be applied to (3).  Of course, it can be avoided by
not using automatic encoding for STDERR at (6), but then we need
to manually encode all other strings, which is a nightmare.
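
Why undecoded messages come out unreadable can be sketched as follows. This is a hedged illustration in Python, not DBD::mysql code. It assumes that the server sends the message as UTF-8 octets and that the output layer treats each undecoded octet as one Latin-1 character before encoding it, which is how Perl handles a byte string written to a handle with an encoding layer. The sample message and the function names are hypothetical.

```python
# Hypothetical illustration only: models an error message arriving as UTF-8 octets.

def print_undecoded(octets, out_encoding="utf-8"):
    """Model an encoding output layer applied to octets that were never decoded:
    each octet is taken as one character (Latin-1), then encoded again."""
    return octets.decode("latin-1").encode(out_encoding)

def print_decoded(octets, in_encoding="utf-8", out_encoding="utf-8"):
    """Decode first, as in (5), then let the output layer encode."""
    return octets.decode(in_encoding).encode(out_encoding)

message = "テーブルが存在しません"      # made-up sample text, not a real server message
wire = message.encode("utf-8")          # octets as received from the server (assumed)

garbled = print_undecoded(wire)         # double-encoded: unreadable on a UTF-8 terminal
readable = print_decoded(wire)          # round-trips to the original octets

print(garbled == wire, readable == wire)
```

The second function corresponds to the explicit decode at (5). The report's point is that a message printed by the driver itself, as at (3), never passes through such a step in user code.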

Finally, I noticed that when error messages are in Japanese, make
test of DBD-mysql fails.  It may be difficult to avoid (I do not
know), but a warning message (lc_messages should not be changed)
in make test would help.

DBD::mysql version: 4.041
Strawberry perl 64bit, v5.22.1
MariaDB
   $dbh->{mysql_clientinfo, mysql_clientversion, mysql_serverversion}  
returns:
   5.1.44, 50144, 50505, respectively.
Windows 7 Pro Service Pack 1

Regards,
Tanabe Yoshinori

From [email protected] on 2017-02-08 10:32:43:

Hello, please try the development version 4.041_01 of DBD-mysql. It has fixed UTF-8 support for passing statements and parameters.

From [email protected] on 2017-02-08 11:20:34:

Hello,

I have just installed 4.041_01 ("print $DBD::mysql::VERSION" shows the 
number) and run the script again.  The results are the same as in my
first report.

Thank you.
Tanabe

From [email protected] on 2017-02-12 12:52:30:

Hi! Can you try compiling DBD::mysql (either 4.041_01 or git master) with the two attached patches? They should fix wide Unicode characters in column names and error messages. Note that DBI versions prior to 1.635 themselves have broken Unicode messages (see https://rt.cpan.org/Public/Bug/Display.html?id=102404).

From [email protected] on 2017-02-12 12:54:03:

Trying to attach patches again...

From [email protected] on 2017-02-13 02:34:37:

Hello,

I have confirmed that the problems are gone after applying the
patches (and upgrading DBI to a later version).  Thank you very much for
the quick fix.
One concern is that the fix could break code that is currently running.
Best regards,
Tanabe

From [email protected] on 2017-02-13 08:26:46:

Thank you for testing. I will reuse your script to create tests for this issue.

Unicode support has been broken in DBD::mysql for a long time, and the proper way forward is to fix the current code.

From [email protected] on 2017-07-01 09:12:29:

Reopening: the fix was reverted in 4.043.

dveeden added the utf8 (Unicode and UTF-8 handling) label on Oct 6, 2023