JavaScript, Django and PostgreSQL timezone handling

Please have a look at the first part, which covers date, time and time zone basics in detail.

This is a post about timezones issue I’ve been working on at DjaoDjin. Imagine there is a company with multiple teams distributed across the globe. They are all accessing the same DjaoDjin dashboard to see how well the business is doing. The problem lies in reporting the data accurately based on a user location. We could’ve just sticked with UTC, but presenting the data in a local time is a better user experience. So here is the journey of how we approached the issue.

A brief overview of the stack

Our core tech is djaodjin-saas, it is a pluggable Django app. It adds a multitude of API endpoints, among which there are a bunch of endpoints that return transaction data. The data is stored in PostgreSQL and API endpoints are powered by Django Rest Framework. On the client side, this transaction data is used to display graphs. Client side is built on top of Angular.js.

Client side

On page load Angular.js makes a GET request like this:

ends_at parameter is expected to be an ISO 8601 formatted timestamp in UTC. All DjaoDjin APIs are using ISO 8601. Stripe decided to use UNIX timestamp in their APIs instead, but we found it complicated to debug date/time related issues.

This request fetches transactions created during a 12 month period, counting backwards from today’s date, which is 2018-04-18, till 2017-04-01. In this case it is not a full 12 month period, because the current month hasn’t ended yet. However, a user then can pick from previous months to have a complete 12 month period, for example a period from 2018-03-01 till 2017-03-01.

However, this URL format is deprecated, it is now required to pass an additional timezone parameter besides the ends_at parameter. The reason behind this will become clear by the end of this post. Meanwhile, here’s how we get the time zone of the client:

Initially, we construct a Date object (which is then passed as ends_at parameter). This object has only a UTC offset of the browser. However, a browser doesn’t know the exact time zone name. In modern browsers there are APIs which standardize the process of getting the time zone name. We use Moment.js with Moment Timezone, which leverages these new APIs to guess the client’s time zone name:

moment.tz.guess() // "Europe/Kiev"

We then pass this string as a timezone parameter and a constructed Date object as an ends_at parameter when making an API call. When Angular.js makes an actual AJAX request the toISOString method of Date object is called, which produces an ISO 8601 timestamp.

Django Rest Framework side

When DRF receives a request, we parse the timestamp with django.utils.dateparse.parse_datetime function, the return value of which is an aware datetime object. Because ISO 8601 permits only UTC or UTC offset to be specified, the tzinfo attribute of this object is either a UTC (pytz.UTC) or a UTC offset (django.utils.timezone.FixedOffset).

Once we have an end date, we call a saas.managers.metrics.month_periods to generate a list of 12 datetime objects, that will be used to query the transactions later. Here is the original source of the function which handles UTC-only dates:

def month_periods(nb_months=12, from_date=None, step_months=1):
    dates = []
    from_date = datetime_or_now(from_date)
    dates.append(from_date)
    last = datetime(
        day=from_date.day, month=from_date.month, year=from_date.year,
        tzinfo=utc)
    if last.day != 1:
        last = datetime(day=1, month=last.month, year=last.year, tzinfo=utc)
        dates.append(last)
        nb_months = nb_months - 1
    for _ in range(0, nb_months, step_months):
        year = last.year
        month = last.month - step_months
        if month < 1:
            # integer division
            year = last.year + month // 12
            assert isinstance(year, six.integer_types)
            if month % 12 == 0:
                year -= 1
                month = 12
            else:
                month = month % 12
        last = datetime(day=1, month=month, year=year, tzinfo=utc)
        dates.append(last)
    dates.reverse()
    return dates

The problem with the current implementation is that when constructing new datetimeobjects, the time zone is set to UTC:

last = datetime(day=1, month=month, year=year, tzinfo=utc)

This is a bug, because if a HTTP request is made with a timestamp in UTC offset, the first datetime object will have a UTC offset time zone, while the rest of the datetimeobjects will still be in UTC. So, we have a potential 24 hour window for transactions with the wrong date to creep in. A solution to this problem is to use the time zone of the first datetime object when constructing the rest of the datetime objects. In our situation, it doesn’t make sense to convert the first object to UTC too, because when we construct datetime objects in original time zone, we don’t care about time, when constructing subsequent datetime objects, that’s because each period begins at 00:00. However, if we do convert to UTC, the time is no longer 00:00 and the day is potentially also different — this complicates the construction of periods. So, it is much easier to stick with whatever the time zone offset was passed with the request.

orig_tz = from_date.tzinfo
# later in loop
last = datetime(day=1, month=month, year=year, tzinfo=orig_tz)

The other problem is that the user who made the request might be located in a time zone which has DST. If the time zone has DST, its UTC offset will not be constant during the year. When generating datetime objects we use the same UTC offset for each of them, as a consequence for dates that fall onto DST period the actual local time will be off by one hour. To fix this we need the time zone name, which we’ll use to get the DST rules for this specific time zone.

There is no way to get the correct offset for particular date with DST without a time zone name and we can’t get time zone name based on offset, due to its changing nature. That means we need to pass the time zone from the client somehow. Unfortunately, ISO 8601 doesn’t specify a way to include a time zone name with the timestamp, only an offset. As a solution, we’ll pass a separate timezone parameter from the client. In case where a client doesn’t pass a time zone string or passes a wrong string, we’ll have to fallback to using UTC offsets, ignoring DST. We might actually require the timezone parameter in future to prevent potential date and time errors. Let’s modify the month_periodsfunction to accept a time zone string from the view:

def month_periods(nb_months=12, from_date=None, step_months=1, tz=None):
...

OK, now that we have a time zone string, we can solve the last issue with this function. When constructing datetime objects, we need to modify the offset based on the DST, so that the local time is always equal to 00:00. To do this, we will use a pytz package, which is an implementation of IANA database in Python. Let’s create a helper function which parses a time zone string into a tzinfo object:

from pytz import timezone, UnknownTimeZoneError

def parse_tz(tz):
  if tz:
    try:
      return timezone(tz)
    except UnknownTimeZoneError:
      pass

Back to month_periods function, we need to parse the time zone string first. If what we’ve got is an actual time zone, convert the from_date previous tzinfo object to the new time zone object, otherwise just leave it with the original UTC offset object. Here’s the code that converts the first period’s tzinfo object to the newly parsed time zone object:

tz_ob = parse_tz(tz)
if tz_ob:
  from_date = from_date.astimezone(tz_ob)

When dealing with the rest of the periods, we can construct a naive datetime (an object without a tzinfo attribute) first. Then if we have the time zone object, call a tz_ob.localize method which adds a time zone to the date (making it aware) and applies a correct UTC offset based on DST rules. If we don’t have a time zone object, we’ll fallback to using UTC offset (this ignores DST). Let’s wrap this into a helper function too:

def _handle_tz(dt, tz_ob, orig_tz):
  if tz_ob:
    # adding timezone info
    # this also accounts for DST
    loc = tz_ob.localize(dt)
  else:
    # adding UTC offset only
    loc = last.replace(tzinfo=orig_tz)
  return loc

The final month_periods function will look like this:

def month_periods(nb_months=12, from_date=None, step_months=1, tz=None):
    dates = []
    from_date = datetime_or_now(from_date)
    orig_tz = from_date.tzinfo
    tz_ob = parse_tz(tz)
    if tz_ob:
        from_date = from_date.astimezone(tz_ob)
    dates.append(from_date)
    last = datetime(day=from_date.day, month=from_date.month, year=from_date.year)
    last = _handle_tz(last, tz_ob, orig_tz)
    if last.day != 1:
        last = datetime(day=1, month=last.month, year=last.year)
        last = _handle_tz(last, tz_ob, orig_tz)
        dates.append(last)
        nb_months = nb_months - 1
    for _ in range(0, nb_months, step_months):
        year = last.year
        month = last.month - step_months
        if month < 1:
            # integer division
            year = last.year + month // 12
            assert isinstance(year, six.integer_types)
            if month % 12 == 0:
                year -= 1
                month = 12
            else:
                month = month % 12
        last = datetime(day=1, month=month, year=year)
        last = _handle_tz(last, tz_ob, orig_tz)
        dates.append(last)
    dates.reverse()

So, when we call this function we’ll have the following result:

>>> from django.utils.dateparse import parse_datetime
>>> from pprint import pprint # pretty print
>>> from_date = parse_datetime('2018-04-01T00:00:00+03:00')
>>> tz = 'Europe/Kiev'
>>> dates = month_periods(from_date=from_date, tz=tz)
>>> pprint(dates)
[datetime.datetime(2017, 4, 1, 0, 0, tzinfo=&lt;DstTzInfo 'Europe/Kiev' EEST+3:00:00 DST&gt;),
 datetime.datetime(2017, 5, 1, 0, 0, tzinfo=&lt;DstTzInfo 'Europe/Kiev' EEST+3:00:00 DST&gt;),
 datetime.datetime(2017, 6, 1, 0, 0, tzinfo=&lt;DstTzInfo 'Europe/Kiev' EEST+3:00:00 DST&gt;),
 datetime.datetime(2017, 7, 1, 0, 0, tzinfo=&lt;DstTzInfo 'Europe/Kiev' EEST+3:00:00 DST&gt;),
 datetime.datetime(2017, 8, 1, 0, 0, tzinfo=&lt;DstTzInfo 'Europe/Kiev' EEST+3:00:00 DST&gt;),
 datetime.datetime(2017, 9, 1, 0, 0, tzinfo=&lt;DstTzInfo 'Europe/Kiev' EEST+3:00:00 DST&gt;),
 datetime.datetime(2017, 10, 1, 0, 0, tzinfo=&lt;DstTzInfo 'Europe/Kiev' EEST+3:00:00 DST&gt;),
 datetime.datetime(2017, 11, 1, 0, 0, tzinfo=&lt;DstTzInfo 'Europe/Kiev' EET+2:00:00 STD&gt;),
 datetime.datetime(2017, 12, 1, 0, 0, tzinfo=&lt;DstTzInfo 'Europe/Kiev' EET+2:00:00 STD&gt;),
 datetime.datetime(2018, 1, 1, 0, 0, tzinfo=&lt;DstTzInfo 'Europe/Kiev' EET+2:00:00 STD&gt;),
 datetime.datetime(2018, 2, 1, 0, 0, tzinfo=&lt;DstTzInfo 'Europe/Kiev' EET+2:00:00 STD&gt;),
 datetime.datetime(2018, 3, 1, 0, 0, tzinfo=&lt;DstTzInfo 'Europe/Kiev' EET+2:00:00 STD&gt;),
 datetime.datetime(2018, 4, 1, 0, 0, tzinfo=&lt;DstTzInfo 'Europe/Kiev' EEST+3:00:00 DST&gt;)]

Awesome. Have you noticed how the time zone offset changes during the year, while time stays the same? That’s what we were trying to achieve with localize method.

Now that we have the list of correct periods, it is then used to make database queries. Once we have the transactions in our views, they are passed to DRF serializers, which encode them to produce a JSON response. DRF serializers don’t modify datetimeobjects and their time zone info, so we don’t need to worry about that.

Let’s have a look at how Django ORM and PostgreSQL handles time zones in detail.

Django ORM time zone handling

Internally, Django uses pytz package. Django time zone behavior depends on the USE_TZ setting. If your application is serious about time you should always have time zone support enabled and work with aware datetime objects only, otherwise the errors will inevitably creep in.

Django ORM will not modify aware datetime objects when constructing an SQL query, so it is pretty safe to pass them as arguments to the queries. When an ORM compiles a query, it basically calls the __str__() method for each of the datetime fields, which produces an ISO 8601 timestamp. This applies to both the queries produced by ORM and to raw SQL queries crafted by hand. So, the final SQL query that will be sent to PostgreSQL will have ISO 8601 timestamps with whatever UTC offset was in the tzinfoattribute, here’s an example:

SELECT COUNT(DISTINCT(prev.dest_organization_id)),
       SUM(prev.dest_amount)
FROM saas_transaction prev
LEFT OUTER JOIN (
    SELECT distinct(dest_organization_id)
    FROM saas_transaction
    WHERE created_at >= '2017-05-01 00:00:00+03:00'
    AND created_at < '2017-06-01 00:00:00+03:00'
    AND orig_organization_id = '2'
    AND orig_account = 'Receivable'
) curr
ON prev.dest_organization_id = curr.dest_organization_id
WHERE prev.created_at >= '2017-04-01 00:00:00+03:00'
    AND prev.created_at < '2017-05-01 00:00:00+03:00'
    AND prev.orig_organization_id = '2'
    AND prev.orig_account = 'Receivable'
    AND curr.dest_organization_id IS NULL

When Django establishes a connection with PostgreSQL it specifies a timezoneparameter, it is used by PostgreSQL to determine which time zone the timestamps should be returned in. If USE_TZ is set to True, the value of the parameter will be set to UTC by default. This means that ORM will receive the response from PostgreSQL with timestamps in UTC and that the response will be used to build a model with datetimefields in UTC. To construct model datetime fields a parse_datetime method is used, which parses a UTC timestamp from the database and constructs an aware datetimeobject in UTC. So, we don’t have the same aware datetime object that we stored or used to query initially, because datetime data is stored and retrieved in UTC and neither ORM nor PostgreSQL have any knowledge of time zones. Actually, Django converts those objects to a local time zone when they are used in Django Forms or templates, but this is not relevant to our situation.

At this point, we have a list of transaction rows with datetime fields in UTC. We can either convert the dates back to the original time zone first or pass them back to DRF.

Time zones in PostgreSQL

In PostgreSQL all date and time data is stored in UTC internally.

On input PostgreSQL accepts an ISO 8601 timestamp with a UTC offset or even with a time zone name, even though it is not a standard ISO 8601 string. In our case, Django and Python will produce a timestamp with UTC or UTC offset. When no time zone info is specified in a timestamp, the time zone will be taken from the timezone parameter (in our case it is configured by Django upon connection and set to UTC by default). Every timestamp with time zone or UTC offset used in SQL queries is converted to UTC by PostgreSQL internally.

Before outputting the data, PostgreSQL converts the timestamps from UTC to the time zone set by the timezone parameter (in our case it is set to UTC, so no conversion is required). Here’s an example:

djaodjin=> SHOW TimeZone;
 TimeZone
----------
 UTC
(1 row)

djaodjin=> INSERT INTO transactions (created_at) VALUES (TIMESTAMP WITH TIME ZONE '2018-04-19 03:00:00 Europe/Kiev');
INSERT 0 1
djaodjin=> SELECT created_at FROM transactions;
        created_at
------------------------
 2018-04-19 00:00:00+00
(1 row)

djaodjin=> SET TimeZone TO 'Europe/Kiev';
SET
djaodjin=> SELECT created_at FROM transactions;
        created_at
------------------------
 2018-04-19 03:00:00+03
(1 row)

Conclusion

In this post we showed how to present monthly SaaS reports midnight-to-midnight, in local time and through DST. Finally, we described in detail how each piece of software in the stack handles the time zones during this process. Clone djaodjin-saas on GitHub, it is free! If you need help with your SaaS product, I am available for hire.