Possible issue regarding strptime() and TZs #52

Open
atcroft opened this issue Jan 23, 2021 · 2 comments


atcroft commented Jan 23, 2021

Tested platforms (3):

  • 5.30.1 under MSWin/x86_64 (Strawberry/BerryBrew)
  • 5.30.3 under MSWin/x86_64 (Cygwin64)
  • 5.30.3 under Linux/x86_64

EXPECTED output:

$ perl ./tp_test.pl
Time is:
Wed, 13 Jan 2021 22:21:49 CST
Time is:
Wed, 13 Jan 2021 22:21:49 CST
$

ACTUAL output (on all three platforms, differing only by the path to Piece.pm, formatted for readability):

$ perl ./tp_test.pl
Time is:
Wed, 13 Jan 2021 22:21:49 CST
Error parsing time at /usr/lib64/perl5/Time/Piece.pm line 597.
 at ./tp_test.pl line 11.
        main::__ANON__("Error parsing time at 
          /usr/lib64/perl5/Time/Piece.pm line 597.\x{a}") called 
          at /usr/lib64/perl5/Time/Piece.pm line 597
        Time::Piece::strptime("Time::Piece", 
          "Wed, 13 Jan 2021 22:21:49 CST", "%a, %d %b %Y %T %Z") 
          called at ./tp_test.pl line 23
$

Test code:

#!/usr/bin/perl
# vim: set expandtab tabstop=4 shiftwidth=4 softtabstop=4:

use strict;
use warnings;

use Carp;
use Time::Piece;

$SIG{__WARN__} = sub { Carp::cluck @_;   };
$SIG{__DIE__}  = sub { Carp::confess @_; };

$| = 1;

my $t = localtime;

my $pattern = "%a, %d %b %Y %T %Z";

my $str = $t->strftime( $pattern );
print "Time is:\n", $str, "\n";
# Format: Wed, 13 Jan 2021 17:22:23 CST

my $u = Time::Piece->strptime( $str, $pattern, );
print "Time is:\n", $u->strftime( $pattern ), "\n";

The behavior also occurs if I change my $t = localtime; to my $t = gmtime; (where the timezone is then 'UTC'). If it were only occurring under Cygwin and Strawberry, my first guess would be that it is MSWin-related, but since I am seeing it on a Linux system as well, I'm not sure where to look for the cause of the issue.

Before I consider it a bug, I want to make sure the issue is not on my side or in my understanding. This is part of the reason my sample code goes from object to string and back to object: to test both the strftime() and strptime() methods.

My C is quite rusty (and I haven't dug around inside the perl source tree before), but it appears the process goes as follows when looking at the perl-5.30.3 source on MetaCPAN (PLEASE CORRECT ME IF YOU SPOT ANY ERRORS BELOW):

  1. When Time::Piece->strptime($time_string, $format_string) (found in cpan/Time-Piece/Piece.pm in the source distribution) is called, the format is first parsed by Time::Piece->_translate_format($format, $strptime_trans_map). $strptime_trans_map appears to contain only the content of $trans_map_common (which defines formats 'c', 'r', and 'X', and NOT the formats 'e', 'D', 'F', 'R', 's', 'T', 'u', 'V', 'x', 'z', or 'Z' from $strftime_trans_map).
  2. _strptime($string, $format, $islocal, $locales) (found in cpan/Time-Piece/Piece.xs) is then called. There are two such functions in Piece.xs:
  • line 345: static char * strptime( pTHX const char *buf, const char *fmt, struct tm *tm, int *gotGMT )
  • line 1025: void _strptime ( string, format, got_GMT, SV* localization )
    Based on the signature, it appears the second (on line 1025) is called. After loading the locale data structure via cpopulate_c_time_locale(aTHX locales ), this version of _strptime calls the version on line 345 as remainder = (char *)strptime(aTHX string, format, &mytm, &gotGMT). If this returns with anything other than a '\0', either an "Error parsing time" or "Garbage at end of string in strptime:" message is returned.
  3. In _strptime( string, format, islocal, locales ) (to use the names from the original calls), when the "Z" character is found it appears to seek the end of the time zone name, then calls my_tzset(aTHX). According to perlguts, aTHX is an (a)rgument (TH)ingy(X).
  4. According to the comments, my_tzset(pTHX) ((p)rototype (TH)ingy(X)) is a wrapper to tzset() designed to make it work better on Win32. It does so by way of two #ifdef blocks that determine if fix_win32_tzenv() is called. As my test failed on a Linux system, I do not believe the #ifdef blocks are involved. tzset() (I believe from the system time.h) is then called.
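
One way to check the trace above empirically, rather than by reading the XS, is to probe which zone abbreviations strptime() accepts for %Z. The following is a minimal sketch: probe_zones is a hypothetical helper (not part of Time::Piece), the zone list and timestamp are arbitrary, and the results depend on the installed Time::Piece version and platform, so no expected output is shown.

```perl
use strict;
use warnings;
use Time::Piece;

# Hypothetical helper (not part of Time::Piece): try to parse one timestamp
# per zone abbreviation and record whether strptime() accepted it.
sub probe_zones {
    my ($invocant, @zones) = @_;
    my %accepted;
    for my $zone (@zones) {
        # strptime() croaks on a parse failure, so trap it with eval.
        my $parsed = eval {
            $invocant->strptime( "2021-01-13 22:21:49 $zone",
                                 '%Y-%m-%d %H:%M:%S %Z' );
        };
        $accepted{$zone} = defined $parsed ? 1 : 0;
    }
    return \%accepted;
}

my @zones = qw(GMT UTC CST);
my $now   = localtime;    # Time::Piece object for the instance-method call

my $class_result = probe_zones( 'Time::Piece', @zones );
my $local_result = probe_zones( $now,          @zones );

for my $zone (@zones) {
    printf "%-4s class call: %s   instance call: %s\n", $zone,
        ( $class_result->{$zone} ? 'parsed' : 'failed' ),
        ( $local_result->{$zone} ? 'parsed' : 'failed' );
}
```

Running it once on each platform gives a compact table to attach to a report like this one, without relying on the stack trace alone.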

Questions:

  1. Are there any (obvious?) flaws in my tracing of the logic above?
  2. If indeed it is a bug (can someone confirm?), is it a problem with perl, or with a library used while compiling perl?

Thank you for your time and attention. Stay safe!

See also:
Question regarding Time::Piece and timezones - PerlMonks

@smith153
Collaborator

Your code trace is right as far as I can tell. As far as it being a bug... perhaps. Equally so, I'd blame "poor documentation". But on the other hand... it could be fixed, so maybe it is a bug after all?? 😁 Similar to #49

The docs show in many places: Time::Piece->strptime but when strptime is called like that, it assumes the timezone is UTC regardless of whether or not '%Z' is there. By default, everything returned from a call like Time::Piece->strptime is a gmtime object.

If you are dealing with local times, doing localtime->strptime is the 'righter' way, though it does waste some CPU as it first creates an object and populates it with the current time only to be overwritten.

For your example, this would work:

my $t = localtime;

my $pattern = "%a, %d %b %Y %T %Z";

my $str = $t->strftime($pattern);
print "Time is:\n", $str, "\n";

# strptime() returns a new object, so capture the result instead of discarding it
my $u = $t->strptime( $str, $pattern );
print "Time is:\n", $u->strftime($pattern), "\n";
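
To make the difference between the two call styles visible without involving %Z at all, here is a minimal sketch that parses the same zone-less timestamp both ways and prints the resulting epoch and offset. The timestamp, format, and variable names are illustrative assumptions; the instance-call values depend on the machine's local time zone, so only the class-call epoch is noted in a comment.

```perl
use strict;
use warnings;
use Time::Piece;

my $format = '%Y-%m-%d %H:%M:%S';
my $stamp  = '2021-01-13 22:21:49';

# Class-method call: the string is interpreted as UTC.
my $as_utc = Time::Piece->strptime( $stamp, $format );

# Instance-method call on a localtime object: the string is interpreted
# as local wall-clock time.
my $now      = localtime;
my $as_local = $now->strptime( $stamp, $format );

# 22:21:49 on 2021-01-13 in UTC is epoch 1610576509.
printf "class call:    epoch %d, tzoffset %d\n",
    $as_utc->epoch, $as_utc->tzoffset;
printf "instance call: epoch %d, tzoffset %d\n",
    $as_local->epoch, $as_local->tzoffset;
```

On a machine whose local zone is UTC the two lines should match; elsewhere the epochs should differ by the local offset.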


x-yuri commented Sep 14, 2022

@smith153

> As far as it being a bug... perhaps. Equally so, I'd blame "poor documentation".

Equally so, I'd blame the interface and the implementation. According to what I've discovered, localtime->strptime("...", "...%Z") does 2 incorrect things, but the result is nevertheless correct, provided you're dealing with the local time zone (not an arbitrary one). So I'm not sure it's a good idea to use it your way. At the very least, it gives the impression that it parses time zone abbreviations, which might make one think it parses arbitrary ones, which it doesn't. But as an escape hatch, it may suffice.

> The docs show in many places: Time::Piece->strptime but when strptime is called like that, it assumes the timezone is UTC regardless of whether or not '%Z' is there.

"...like that, the resulting time zone is UTC...". Because it successfully parses timestamps in arbitrary time zones.

> If you are dealing with local times, doing localtime->strptime is the 'righter' way

Again, with local times, not with arbitrary time zones.

At least that is my understanding.
