# Time Series

This guide covers reading, writing, and managing time-series records in HEC-DSS files using **pydsstools**.

Time-series records store sequences of values measured or computed at specific points in time. Common uses include streamflow hydrographs, precipitation measurements, water level observations, and reservoir operations data.

## Key Concepts

| Concept | Description |
|---------|-------------|
| **Regular Time Series** | Values recorded at a fixed interval (e.g., every hour, every day). The E-part of the pathname specifies the interval (e.g., `1HOUR`, `1DAY`). |
| **Irregular Time Series** | Values recorded at variable timestamps. The E-part uses an `IR-` prefix (e.g., `IR-DAY`, `IR-DECADE`). |
| **TimeSeriesContainer** | Write-side container. Holds pathname, interval, values, times, units, and type before writing to DSS. |
| **TimeSeriesStruct** | Read-side structure returned by the Cython layer. Wraps the C `zStructTimeSeries` and exposes values, times, and metadata. |
| **HecTime** | Datetime class for HEC-DSS. Stores time as julian days + seconds since midnight. Used for timestamps in irregular time series. |
| **UNDEFINED** | Sentinel float value representing missing data in DSS time-series cells. |
| **Granularity** | Time precision in seconds. Minute granularity (60) is the default. Second granularity (1) is available but requires care to avoid integer overflow. |
| **Window** | A date range tuple `(start, end)` for reading a subset of a time series. |

### Regular vs Irregular Time Series

```
Regular (1HOUR interval):
  01Jan2025 01:00  ->  10.0
  01Jan2025 02:00  ->  20.0     Fixed spacing, only start_time needed
  01Jan2025 03:00  ->  30.0

Irregular (IR-DAY):
  02Jul2010 12:00  ->  1.0
  05Jan2012 00:00  ->  20.0     Variable spacing, each value has a timestamp
  15Mar2014 02:00  ->  30.0
```

- **Regular** time series only need a `start_time` and `interval`; timestamps are computed automatically.
- **Irregular** time series need an explicit `times` array with one timestamp per value.

---

## Example 1: Read a Regular Time Series

The most common use case. `read_ts()` returns a `TimeSeriesStruct` with values and times.

```python
from pydsstools.heclib.dss import HecDss

dss_file = "sample.dss"
pathname = "/REGULAR/TIMESERIES/FLOW//1HOUR//"

with HecDss.Open(dss_file) as fid:
    ts = fid.read_ts(pathname)

    print("Type:", ts.dtype)            # "Regular TimeSeries"
    print("Count:", ts.count)           # Number of values
    print("Units:", ts.data_units)      # e.g., "cfs"
    print("Data type:", ts.data_type)   # e.g., "INST-VAL"
    print("Start:", ts.start_time)      # HecTime object
    print("End:", ts.end_time)          # HecTime object

    # Values as a NumPy array
    values = ts.values
    print("Values:", values)

    # Times come from a HecTime generator; pair each one with its value
    for t, v in zip(ts.times, values):
        print(t.datetime(), "->", v)
```

**TimeSeriesStruct properties:**

| Property | Type | Description |
|----------|------|-------------|
| `values` | ndarray | NumPy array of float values |
| `times` | generator of HecTime | Yields one HecTime per data point |
| `count` | int | Number of data points |
| `dtype` | str | `"Regular TimeSeries"` or `"Irregular TimeSeries"` |
| `data_units` | str | Units string (e.g., `"cfs"`, `"ft"`) |
| `data_type` | str | Type string (e.g., `"INST-VAL"`, `"PER-AVER"`) |
| `start_time` | HecTime | Start time of the record |
| `end_time` | HecTime | End time of the record |
| `interval` | int | Interval in seconds (positive for regular, negative for irregular) |
| `granularity` | int | Time granularity in seconds |
| `tzid` | str | Timezone identifier |
| `nodata` | ndarray of bool | Boolean mask where True = missing value |
| `empty` | bool | True if all values are missing |

---

## Example 2: Read with a Time Window

Read a subset of a time series by specifying a `(start, end)` window.
```python
from pydsstools.heclib.dss import HecDss

dss_file = "sample.dss"
pathname = "/REGULAR/TIMESERIES/FLOW//1HOUR//"

with HecDss.Open(dss_file) as fid:
    ts = fid.read_ts(
        pathname,
        window=("15JUL2019 2300", "16JUL2019 0100"),
    )

    times = [t.datetime() for t in ts.times]
    values = ts.values.tolist()

    print("Times:", times)
    print("Values:", values)
```

**Window format:**

- A tuple of two date/time strings: `(start_date, end_date)`.
- Accepts any format that `HecTime` can parse (see HecTime Quickstart).
- Common formats: `"15JUL2019 2300"`, `"15Jul2019 23:00"`, `"2019-07-15T23:00:00"`.

---

## Example 3: Read with Trimming

By default, `read_ts()` returns the full date range including leading and trailing missing values. Use `trim_missing=True` to remove them.

```python
from pydsstools.heclib.dss import HecDss

dss_file = "sample.dss"
pathname = "/REGULAR/TIMESERIES/FLOW//1HOUR//"

with HecDss.Open(dss_file) as fid:
    # Without trimming: may have UNDEFINED values at edges
    ts_full = fid.read_ts(pathname)
    print("Full count:", ts_full.count)

    # With trimming: missing values at edges removed
    ts_trimmed = fid.read_ts(pathname, trim_missing=True)
    print("Trimmed count:", ts_trimmed.count)
```

`trim_missing` only applies to **regular** time series. It removes leading and trailing `UNDEFINED` values from the returned array.

---

## Example 4: Read an Irregular Time Series

Irregular time series have variable timestamps. The E-part of the pathname uses an `IR-` prefix (e.g., `IR-DAY`, `IR-DECADE`).
```python
from pydsstools.heclib.dss import HecDss

dss_file = "sample.dss"
pathname = "/IRREGULAR/TIMESERIES/PARAM//IR-Decade//"

with HecDss.Open(dss_file) as fid:
    ts = fid.read_ts(pathname)

    print("Type:", ts.dtype)  # "Irregular TimeSeries"

    times = [t.datetime() for t in ts.times]
    values = ts.values.tolist()

    for t, v in zip(times, values):
        print(f"  {t} -> {v}")
```

**Irregular time series window flags:**

When reading with a time window, the `window_flag` parameter controls how boundary values are handled:

| Flag | Behavior |
|------|----------|
| 0 | Strictly adhere to the time window (default) |
| 1 | Also retrieve one value immediately before the window start |
| 2 | Also retrieve one value immediately after the window end |
| 3 | Retrieve one value before and one value after the window |

```python
from pydsstools.heclib.dss import HecDss

dss_file = "sample.dss"
pathname = "/IRREGULAR/TIMESERIES/PARAM//IR-Decade//"

with HecDss.Open(dss_file) as fid:
    ts = fid.read_ts(
        pathname,
        window=("01JAN2019 0000", "01JAN2020 0000"),
        window_flag=1,  # include one value before start
    )
```

---

## Example 5: Write a Regular Time Series

Build a `TimeSeriesContainer`, populate it, and write to DSS.
```python
from pydsstools.core import TimeSeriesContainer, UNDEFINED
from pydsstools.heclib.dss import HecDss

dss_file = "output.dss"
pathname = "/REGULAR/TIMESERIES/FLOW//1HOUR/EXAMPLE/"

with HecDss.Open(dss_file) as fid:
    count = 4
    interval = 1  # positive = regular time series

    tsc = TimeSeriesContainer(pathname, count, interval)
    tsc.start_time = "01JAN2025 23:00"
    tsc.data_units = "cfs"
    tsc.data_type = "INST"
    tsc.tzid = "UTC"
    tsc.values = [10, 20, UNDEFINED, 40]  # UNDEFINED marks missing data

    fid.put_ts(tsc)
```

**TimeSeriesContainer construction:**

```python
tsc = TimeSeriesContainer(pathname, count, interval)
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `pathname` | str | DSS pathname with valid E-part interval |
| `count` | int | Number of values |
| `interval` | int | Positive for regular, negative for irregular |

**Required properties for regular time series:**

| Property | Type | Description |
|----------|------|-------------|
| `start_time` | str or HecTime | Starting date/time |
| `values` | list, ndarray, or array | Time series values |
| `data_units` | str | Units (e.g., `"cfs"`, `"ft"`) |
| `data_type` | str | Type (e.g., `"INST"`, `"PER-AVER"`, `"PER-CUM"`) |

**Optional properties:**

| Property | Type | Description |
|----------|------|-------------|
| `tzid` | str | Timezone identifier (e.g., `"UTC"`) |

---

## Example 6: Write a Regular Time Series (Keyword Arguments)

Instead of creating a `TimeSeriesContainer`, you can pass keyword arguments directly to `put_ts()` with a pathname.

```python
from pydsstools.heclib.dss import HecDss

dss_file = "output.dss"
pathname = "/REGULAR/TIMESERIES/FLOW//1HOUR/KWARGS/"

with HecDss.Open(dss_file) as fid:
    fid.put_ts(
        pathname,
        values=[10, 20, 30, 40, 50],
        start_time="01JAN2025 15:00",
        data_units="ft",
        data_type="INST",
        tzid="UTC",
    )
```

When `put_ts()` receives a pathname string instead of a `TimeSeriesContainer`, it automatically:

1. Determines the interval from the E-part of the pathname.
2. Creates a `TimeSeriesContainer` internally from the keyword arguments.
3. Writes the data to the DSS file.

---

## Example 7: Write an Irregular Time Series

Irregular time series require explicit timestamps for each value.

```python
from pydsstools.heclib.dss import HecDss

dss_file = "output.dss"
pathname = "/A/B/C//IR-DAY/EXAMPLE/"

with HecDss.Open(dss_file) as fid:
    # One timestamp per value, in ascending order
    times = [
        "02JUL2010 1200",
        "05JAN2012 0000",
        "15MAR2014 0200",
        "25FEB2018 0500",
        "19DEC2024 1200",
    ]
    values = [1, 20, 30, 40, 50]

    fid.put_ts(
        pathname,
        values=values,
        times=times,
        julian_base="01JAN2000",
        data_units="ft",
        data_type="INST",
        tzid="UTC",
    )
```

**Key differences for irregular time series:**

| Regular | Irregular |
|---------|-----------|
| E-part: `1HOUR`, `1DAY`, etc. | E-part: `IR-DAY`, `IR-DECADE`, etc. |
| Needs `start_time` | Needs `times` list |
| `interval` > 0 | `interval` < 0 |
| No `julian_base` | Optional `julian_base` |

---

## Example 8: Write Irregular Time Series with TimeSeriesContainer

For more control, build the `TimeSeriesContainer` explicitly.

```python
from pydsstools.core import TimeSeriesContainer
from pydsstools.heclib.dss import HecDss

dss_file = "output.dss"
pathname = "/IRREGULAR/TIMESERIES/PARAM//IR-DECADE/EXAMPLE/"

with HecDss.Open(dss_file) as fid:
    times = ["01JAN2019 02:01", "01JAN5000 01:02"]
    values = [2019, 5000]

    tsc = TimeSeriesContainer(pathname, len(values), -1)  # negative = irregular
    tsc.times = times  # list of date strings
    tsc.values = values
    tsc.data_units = "ft"
    tsc.data_type = "INST"
    tsc.tzid = "UTC"

    fid.put_ts(tsc)
```

**TimeSeriesContainer `times` setter accepts:**

| Input Type | Description |
|------------|-------------|
| list of str | Date/time strings (parsed via HecTime) |
| list of HecTime | HecTime objects |
| list of datetime | Python datetime objects |
| ndarray of int | Raw integer time values |

Times must be in ascending order.
A `ValueError` is raised if they are not.

---

## Example 9: Round-Trip: Read, Modify, Write Back

Read an existing record, modify it, and write it back.

```python
from pydsstools.core import TimeSeriesContainer
from pydsstools.heclib.dss import HecDss

dss_file = "sample.dss"
pathname_in = "/REGULAR/TIMESERIES/FLOW//1HOUR//"
pathname_out = "/REGULAR/TIMESERIES/FLOW//1HOUR/MODIFIED/"

with HecDss.Open(dss_file) as fid:
    # Read
    ts = fid.read_ts(pathname_in, window=("15JUL2019 2300", "16JUL2019 0100"))

    # Modify: scale valid values by 1.5, leaving UNDEFINED sentinels untouched
    values = ts.values.copy()
    valid = ~ts.nodata
    values[valid] *= 1.5

    # Write back under a new F-part
    count = ts.count
    tsc = TimeSeriesContainer(pathname_out, count, ts.interval)
    tsc.start_time = ts.start_time
    tsc.values = values
    tsc.data_units = ts.data_units
    tsc.data_type = ts.data_type

    fid.put_ts(tsc)
```

---

## Example 10: Configure DSS Logging

Control the verbosity of HEC-DSS library messages.

```python
from pydsstools.heclib.dss.HecDss import Open
from pydsstools.heclib.utils import dss_logging

# Set logging level: "None", "General", "Diagnostic"
dss_logging.config(level="General")

dss_file = "example.dss"
pathname = "/REGULAR/TIMESERIES/FLOW//1HOUR/Ex1/"

with Open(dss_file) as fid:
    ts = fid.read_ts(pathname)
```

**Available logging levels:**

| Level | Description |
|-------|-------------|
| `"None"` | Suppress all DSS messages |
| `"General"` | Standard operational messages |
| `"Diagnostic"` | Verbose debugging output |

---

## Example 11: Check for Missing Data

Use the `nodata` mask and `UNDEFINED` sentinel to handle missing values.
```python
import numpy as np
from pydsstools.core import UNDEFINED
from pydsstools.heclib.dss import HecDss

dss_file = "sample.dss"
pathname = "/REGULAR/TIMESERIES/FLOW//1HOUR//"

with HecDss.Open(dss_file) as fid:
    ts = fid.read_ts(pathname)

    # Boolean mask: True where value is missing
    mask = ts.nodata
    print("Missing values:", mask.sum())

    # Check if entire record is empty
    print("All missing:", ts.empty)

    # Replace missing with NaN for analysis
    values = ts.values.astype(float)
    values[mask] = np.nan
```

---

## Granularity and Overflow

The internal DSS time representation stores timestamps as integer counts of time units since December 31, 1899 (julian day 0). The **granularity** setting controls the unit size:

| Granularity | Unit | Max Date Range |
|-------------|------|----------------|
| 60 (default) | Minutes | ~4,000 years |
| 1 | Seconds | ~68 years |

**Second granularity (1) can overflow** for dates far from the julian base. When writing irregular time series spanning large date ranges with second precision, consider:

1. Using minute granularity (60) when seconds are not needed.
2. Setting a `julian_base` close to your data range.
3. Writing one value at a time with `prevent_overflow=True`.

---

## DSS Pathname Structure for Time Series

```
/A-Part/B-Part/C-Part/D-Part/E-Part/F-Part/
```

| Part | Typical Use | Example |
|------|-------------|---------|
| A | Collection or project | `REGULAR`, `IRREGULAR` |
| B | Location | `TIMESERIES`, `GAUGE-01` |
| C | Parameter | `FLOW`, `PRECIP`, `STAGE` |
| D | Date block start | `01JUL2019` (can be empty) |
| E | Interval | `1HOUR`, `15MIN`, `1DAY`, `IR-DAY`, `IR-DECADE` |
| F | Version or variant | `OBS`, `MODIFIED`, `EXAMPLE` |

**Common E-part interval specifications:**

| Regular | Irregular |
|---------|-----------|
| `1MIN` | `IR-DAY` |
| `15MIN` | `IR-MONTH` |
| `1HOUR` | `IR-YEAR` |
| `1DAY` | `IR-DECADE` |
| `1MON` | `IR-CENTURY` |
| `1YEAR` | |

When the D-part is empty, `read_ts()` retrieves all available data.
When it contains a date, only that block is read unless a `window` is specified.

---

## API Summary

### Reading

| Method | Returns | Description |
|--------|---------|-------------|
| `fid.read_ts(pathname)` | `TimeSeriesStruct` | Read full time series. |
| `fid.read_ts(pathname, window=(...))` | `TimeSeriesStruct` | Read a time window. |
| `fid.read_ts(pathname, trim_missing=True)` | `TimeSeriesStruct` | Read with edge trimming (regular only). |
| `fid.read_ts(pathname, window_flag=N)` | `TimeSeriesStruct` | Read with boundary control (irregular only). |
| `fid.read_ts(pathname, reg=True)` | `TimeSeriesStruct` | Force read as regular time series. |
| `fid.read_ts(pathname, ireg=True)` | `TimeSeriesStruct` | Force read as irregular time series. |

### Writing

| Method | Description |
|--------|-------------|
| `fid.put_ts(tsc)` | Write a `TimeSeriesContainer` object. |
| `fid.put_ts(pathname, values=..., start_time=...)` | Write regular time series from keyword arguments. |
| `fid.put_ts(pathname, values=..., times=...)` | Write irregular time series from keyword arguments. |

### TimeSeriesContainer Construction

```python
from pydsstools.core import TimeSeriesContainer

# Regular time series
tsc = TimeSeriesContainer(pathname, count, interval)
tsc.start_time = "01JAN2025 1500"      # starting date/time
tsc.values = [10, 20, 30]              # list, ndarray, or array
tsc.data_units = "cfs"                 # value units
tsc.data_type = "INST"                 # INST, PER-AVER, PER-CUM
tsc.tzid = "UTC"                       # timezone (optional)

# Irregular time series
tsc = TimeSeriesContainer(pathname, count, -1)
tsc.times = ["01JAN2019 0200", ...]    # one timestamp per value
tsc.values = [100, 200, ...]           # same length as times
tsc.data_units = "ft"
tsc.data_type = "INST"
tsc.julian_base = "01JAN2000"          # optional reference date
```

**Accepted input types for `values`:** list, tuple, `numpy.ndarray`, or `array.array`. Internally converted to float32.

**Accepted input types for `times` (irregular only):** list of strings, list of HecTime, list of datetime, `numpy.ndarray` of int, or `array.array` of int.
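
Raw integer time values are subject to the limits described under Granularity and Overflow. As a rough sketch of where the ~4,000-year and ~68-year figures come from, the plain-Python arithmetic below assumes time offsets are held in a signed 32-bit integer. That width is an assumption inferred from the figures in this guide, not something the guide states, and `max_span_years` and `fits_in_int32` are hypothetical helper names that are not part of pydsstools.

```python
# ASSUMPTION: time offsets are stored in a signed 32-bit integer.
INT32_MAX = 2**31 - 1
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365.25  # average year length, good enough for an estimate


def max_span_years(granularity_seconds: int) -> float:
    """Approximate years representable as a positive offset from the base date."""
    total_seconds = INT32_MAX * granularity_seconds
    return total_seconds / SECONDS_PER_DAY / DAYS_PER_YEAR


def fits_in_int32(offset_seconds: int, granularity_seconds: int) -> bool:
    """Check whether an offset, expressed in time units, stays within int32."""
    units = offset_seconds // granularity_seconds
    return -INT32_MAX - 1 <= units <= INT32_MAX


if __name__ == "__main__":
    for granularity in (60, 1):
        years = max_span_years(granularity)
        print(f"granularity={granularity}s -> about {years:,.0f} years")

    # A timestamp roughly 100 years away from the base date
    century = int(100 * DAYS_PER_YEAR * SECONDS_PER_DAY)
    print("100 years at 1 s granularity fits:", fits_in_int32(century, 1))
    print("100 years at 60 s granularity fits:", fits_in_int32(century, 60))
```

Under that assumption, an offset that fails the check at second granularity is a candidate for minute granularity or for a `julian_base` closer to the data range.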