August 26, 2025

The Zone

A Modern Data Science Platform

For Statistics Canada





Brought to you by The Zone Team ❤️
Statistics Canada | Statistique Canada

The Zone Team

Summer 2025

Featuring Anray Liu and Justin Zhang!

Co-op student developers from uOttawa and Carleton

Developers (IT-02):
  • Wendy Gaultier, Mathis Marcotte, Jose Matsuda, Souheil Yazji
Team Lead (Acting IT-03):
  • Bryan Paget

Our Story

One Platform, Two Zones

  • The Advanced Analytics Workspace (AAW) was the first Zone
  • The Zone is its Protected B counterpart
  • Built on the same foundation: Kubeflow on Kubernetes
Same core. Same capabilities. Higher security.

What Is The Zone?

The Zone is a data science platform based on Kubeflow, designed to orchestrate notebooks, jobs, and machine learning workflows.

Featuring:

  • JupyterLab with support for Python, R, SAS, and Julia
  • Kubeflow, providing scalable infrastructure for data science and automation
The Zone runs on Azure Kubernetes Service (AKS) but can run on any Kubernetes cluster.

Everyone Is Welcome

Which Language? No Barrier.

  • Python, R, Julia, and SAS are all supported
  • Migrate at your pace, in your language
  • The only platform at StatCan where SAS and Python/R coexist


Everyone Is Welcome

Which Organization? No Barrier.

  • Open source by design: cloneable, shareable, federatable
  • We've already done it once: we can do it again
  • Ready to support other teams, departments, and levels of government

Why Was The Zone Created?

To deliver a secure, modern, and independent data platform.

  • Provide a Protected B compliant environment
  • Reduce reliance on proprietary tools
  • Lower long-term costs through reusable, open infrastructure
This is about sovereignty, sustainability, and self-reliance.

What We've Built

  • Over 2,200 onboarded users
  • 130+ daily notebook sessions
  • CronJobs powering real production workflows
  • A stable, proven platform used across divisions

This is not a prototype. This is production grade.

Summer 2025

  • MKL acceleration for faster numerical computing
  • Tesseract OCR to extract text from scanned documents
  • Volume and namespace cleaners for automatic resource cleanup
  • CronJobs for scheduled jobs, laying the groundwork for pipeline-ready infrastructure
  • Readiness probes: traffic reaches a notebook only once it is ready, so no more loading errors
  • Optimized Docker images for faster load times

From CronJobs to Pipelines

Today: CronJobs run in isolation.

Tomorrow: Connected, observable workflows powered by Argo Workflows and Kubeflow Pipelines.

  • Argo Workflows orchestrates complex job sequences on Kubernetes
  • Kubeflow Pipelines enables end-to-end ML workflows with a UI, caching, and versioning
  • Full logging, retry logic, error handling, and triggers
Automation, evolved.
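
To make the contrast concrete, here is a toy sketch of what an orchestrator adds over isolated CronJobs: steps run in dependency order, failures are retried, and every attempt is logged. It uses only the Python standard library and is not the Argo Workflows or Kubeflow Pipelines API; the names run_pipeline, steps, and depends_on are hypothetical and exist only for this illustration.

```python
import logging
from graphlib import TopologicalSorter  # standard library, Python 3.9+

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("toy-pipeline")


def run_pipeline(steps, depends_on, max_retries=2):
    """Run steps in dependency order, retrying each failed step.

    steps: dict mapping a step name to a zero-argument callable.
    depends_on: dict mapping a step name to the set of steps it needs first.
    Returns the step names in the order they completed.
    """
    completed = []
    graph = {name: depends_on.get(name, set()) for name in steps}
    for name in TopologicalSorter(graph).static_order():
        # One initial attempt plus up to max_retries retries.
        for attempt in range(1, max_retries + 2):
            try:
                steps[name]()
                log.info("step %s succeeded on attempt %d", name, attempt)
                completed.append(name)
                break
            except Exception as exc:
                log.warning("step %s failed on attempt %d: %s", name, attempt, exc)
        else:
            # No attempt succeeded: stop so downstream steps never run.
            raise RuntimeError(f"step {name} failed after {max_retries + 1} attempts")
    return completed


# Hypothetical three-step job: transform fails once, then succeeds on retry.
attempts = {"transform": 0}


def extract():
    pass


def transform():
    attempts["transform"] += 1
    if attempts["transform"] == 1:
        raise ValueError("transient failure")


def load():
    pass


order = run_pipeline(
    {"load": load, "extract": extract, "transform": transform},
    {"transform": {"extract"}, "load": {"transform"}},
)
```

With isolated CronJobs, each of these steps would be scheduled separately and a failed transform would go unnoticed by load. In a real deployment the ordering, retries, and logs would come from the workflow engine rather than hand-written code like this.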

Kubernetes Is Built for Orchestration

At its core, Kubernetes is designed to orchestrate workloads: scaling, scheduling, and managing containers across clusters.

  • Powers modern cloud-native apps
  • Handles complex workflows reliably
  • Already runs The Zone's notebooks and jobs
Orchestration isn't the future.
It's already here, under the hood.

Kubeflow Brings Orchestration to Data Scientists

Kubeflow was built for pipelines. It brings Kubernetes' power to data scientists through an intuitive interface.

  • The AAW had Kubeflow Pipelines and Argo Workflows
  • We can re-enable them for end-to-end workflows that are versioned, reproducible, and monitored; no reinvention needed
We're not starting from scratch. We're restoring what works and making it secure, scalable, and standard.

Challenges Remain

  • Database connections (workload intake forms)
  • Legacy filer performance (data migration)
  • Linux and Kubernetes concepts (education)
  • Secrets management (Keycloak)
  • VS Code extension repo (org wide)
We're solving these, with your feedback.

A Federated Future

The Zone is open source and designed to scale. It can be deployed by:

  • Other teams at StatCan
  • Federal departments
  • Provincial and municipal governments
Same foundation. Same security.
Deployed where you need it.

Enter The Zone

You are invited to:

  • Access the platform: https://zone.statcan.ca
  • Attend training and workshops
  • Help shape the future of data science at StatCan
  • Host the next Zone in your department
The Zone is a movement toward openness, sovereignty,
and shared capability.

Title Slide

Who We Are

Our Story / History

What is The Zone?

Inclusivity & SAS Coexistence

Why Created

Platform Strengths

Summer 2025

From Cron to Pipeline

Future: Kubernetes Is Built for Orchestration

Future: Kubeflow Brings Orchestration to Data Science

Data Access

Portability & Federated Future

Call to Action