pydsstools.heclib.dss.HecDss.Open
- class pydsstools.heclib.dss.HecDss.Open(dss_path, version=None, mode='rw')[source]
Bases: Open
Open a DSS file and create a dataset object that supports input/output operations.
This class provides a high-level, user-friendly interface for working with HEC-DSS files. It supports reading and writing time series, paired data, and spatial grid data.
- Parameters:
dss_path (str or Path or PathLike) – Path to the DSS file.
version ({6, 7} or None, optional) – DSS file version. If None, detect automatically. If creating a new file, None creates a version 7 file. Default is None.
mode ({"rw", "r"}, optional) – File open mode. "rw" allows read/write; "r" is read-only. Default is "rw".
- Variables:
mode (str) – The file access mode.
version (int) – The DSS file version (6 or 7).
filename (str) – Path to the DSS file.
Examples
Open a DSS file for reading and writing:
>>> from pydsstools.heclib.dss.HecDss import Open
>>> fid = Open("example.dss", mode="rw")
Open a DSS file as read-only:
>>> fid = Open("example.dss", mode="r")
>>> fid.close()
Use context manager for automatic cleanup:
>>> with Open("example.dss") as fid:
...     ts = fid.read_ts("/A/B/C/01JAN2020/1HOUR/F/")
See also
TimeSeriesContainer: Container for time series data
PairedDataContainer: Container for paired data
SpatialGridStruct: Structure for spatial grid data
Methods
__init__(dss_path[, version, mode])
close(self) – Close the DSS file handle.
copy_path(pathname_in, pathname_out[, dss_out]) – Copy a DSS record from one pathname to another.
del_path(pathname) – Delete DSS record(s) matching the given pathname pattern.
get_status(self) – Get file operation status codes.
path_dict([sub_type]) – Get all pathnames in DSS file organized by data type.
pd_info(pathname) – Get information about a paired data record.
preallocate_pd(pathname, shape, **kwargs) – Preallocate space for paired data record in DSS file.
put_grid(data[, pathname, gridinfo, flipud, ...]) – Write spatial grid to DSS-7 file.
put_grid0(data[, pathname, gridinfo, ...]) – Write spatial grid to DSS-6 file.
put_pd(data, **kwargs) – Write new paired data or edit an existing paired data record in the DSS file.
put_ts(data, **kwargs) – Write time-series data to DSS file.
read_grid(pathname[, metadata_only]) – Read spatial grid data from DSS file.
read_grid2(pathname[, metadata_only]) – Read spatial grid data from DSS file and return as tuple.
read_pd(pathname[, window, dataframe]) – Read paired data from DSS file.
read_pd_labels(pathname) – Read paired data labels from DSS file.
read_ts(pathname[, window, trim_missing, ...]) – Read time-series record from DSS file.
search_path([pathname, sort]) – Search for DSS pathnames matching a pattern.
Attributes
file_status
filename
read_status
version
write_status
- read_ts(pathname, window=None, trim_missing=False, window_flag=0, reg=False, ireg=False)[source]
Read time-series record from DSS file.
- Parameters:
pathname (str or DssPathName) – DSS record pathname.
window (tuple of (start, end) or None, optional) – Time window to read. If None, the date range encoded in the D-part of the pathname is used. Default is None.
trim_missing (bool, optional) – If True, removes missing values at the beginning and end of the data set. Applies to regular time-series only. Default is False.
window_flag ({0, 1, 2, 3}, optional) –
Applies to irregular time series only. Controls how the time window is applied. Default is 0.
Possible values:
0 : Strictly adhere to the time window.
1 : Also retrieve one value immediately before the start of the window.
2 : Also retrieve one value immediately after the end of the window.
3 : Retrieve one value immediately before the start and one immediately after the end of the window.
reg (bool, optional) – If True, treat the data as a regular time series. Default is False.
ireg (bool, optional) –
If True, treat the data as an irregular time series. Default is False.
If both reg and ireg are False, or both are True, the type of time series will be determined from the E-part of pathname.
- Returns:
Time series data structure containing the requested data.
- Return type:
TimeSeriesStruct
- Raises:
ValueError – If pathname does not correspond to a valid time series record or if window_flag is invalid.
Examples
Read time series with a specific time window:
>>> ts = fid.read_ts(pathname, window=('10MAR2006 24:00:00', '09APR2006 24:00:00'))
Read entire time series:
>>> ts = fid.read_ts(pathname)
Read regular time series with trimming:
>>> ts = fid.read_ts(pathname, trim_missing=True, reg=True)
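The four window_flag values are easier to compare side by side on a toy irregular series. The sketch below only mirrors the descriptions given for window_flag. It does not call the library, select_with_window_flag is a hypothetical helper, and it assumes that the time window is inclusive at both ends.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime


def select_with_window_flag(times, values, start, end, window_flag=0):
    """Pick (time, value) pairs from a sorted irregular series.

    Assumption: the window [start, end] is inclusive at both ends.
    """
    if window_flag not in (0, 1, 2, 3):
        raise ValueError(f"invalid window_flag: {window_flag}")

    lo = bisect_left(times, start)   # first index with time >= start
    hi = bisect_right(times, end)    # one past the last index with time <= end

    if window_flag in (1, 3) and lo > 0:
        lo -= 1                      # also take one value just before the window
    if window_flag in (2, 3) and hi < len(times):
        hi += 1                      # also take one value just after the window

    return list(zip(times[lo:hi], values[lo:hi]))


times = [datetime(2020, 1, day) for day in (1, 5, 9, 14, 20)]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
start, end = datetime(2020, 1, 4), datetime(2020, 1, 15)

strict = select_with_window_flag(times, values, start, end, window_flag=0)
padded = select_with_window_flag(times, values, start, end, window_flag=3)
print([v for _, v in strict])
print([v for _, v in padded])
```

Flags 1 to 3 are useful when a value on each side of the window is needed, for example to interpolate at the window edges.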
- put_ts(data, **kwargs)[source]
Write time-series data to DSS file.
- Parameters:
data (str or DssPathName or TimeSeriesContainer) – Either a pathname string or a TimeSeriesContainer object.
**kwargs (Any) –
Keyword arguments for TimeSeriesContainer when data is a pathname.
Required kwargs when data is pathname:
- values : list or array-like
Time series values.
For regular time-series (interval > 0):
- start_time : str
Starting date/time.
For irregular time-series (interval < 0):
- times : list of str
List of date/time strings.
- julian_base : str, optional
Julian base date.
- Return type:
None
- Raises:
TypeError – If data is not of expected type.
ValueError – If required parameters are missing or invalid.
Examples
Write using TimeSeriesContainer:
>>> from pydsstools.heclib.dss.HecDss import Open
>>> from pydsstools.core import TimeSeriesContainer
>>> fid = Open("dss_file.dss", mode="rw")
>>> pathname = r"/A/B/C//1HOUR/F/"
>>> values = [10, 20, 30, 40, 50]
>>> interval = 1
>>> start_time = r"01JAN2025 1500"
>>> data_units = "ft"
>>> data_type = "inst"
>>> timezone = "UTC"
>>> tsc = TimeSeriesContainer(pathname, len(values), interval, values=values,
...                           start_time=start_time, data_units=data_units,
...                           data_type=data_type, tzid=timezone)
>>> fid.put_ts(tsc)
Write irregular time series without using TimeSeriesContainer:
>>> pathname = r"/A/B/C//IR-DAY/F/"
>>> julian_base = "01JAN2000"
>>> times = ["02JUL2010 1200", "05JAN2012 0000", "15MAR2014 0200",
...          "25FEB2018 0500", "19DEC2024 1200"]
>>> values = [1, 20, 30, 40, 50]
>>> fid.put_ts(pathname, values=values, times=times, julian_base=julian_base,
...            data_units=data_units, data_type=data_type, tzid=timezone)
- read_pd(pathname, window=None, dataframe=True)[source]
Read paired data from DSS file.
- Parameters:
pathname (str or DssPathName) – DSS record pathname.
window (tuple of (int, int, int, int) or None, optional) –
Index window to read. If None, all rows and columns are read. Default is None.
Supported forms:
(row_start, row_end, col_start, col_end)
Indexing rules:
Zero-based and inclusive at both ends.
row_start/col_start >= 0 (first row/column is 0).
row_end/col_end <= last valid index.
None for any bound selects the respective first/last index.
Negative indices are allowed (Python-style) and are wrapped.
If an end index overflows the table size, it is clipped.
Any other out-of-range condition raises IndexError.
dataframe (bool, optional) – If True, return a pandas DataFrame. If False, return a PairedDataStruct object. Default is True.
- Returns:
Paired data in the requested format.
- Return type:
pandas.DataFrame or PairedDataStruct
- Raises:
IndexError – If window indices are invalid or out of range.
Examples
Read paired data with a window:
>>> df = fid.read_pd(pathname, window=(2, 5, 0, None))
Read all paired data:
>>> df = fid.read_pd(pathname)
Read as PairedDataStruct:
>>> pds = fid.read_pd(pathname, dataframe=False)
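The indexing rules for window can be pictured with a small pure-Python sketch. It is not the library's implementation: resolve_bound and resolve_window are hypothetical helper names, and treating a reversed range (start greater than end) as an IndexError is an assumption, since the rules above only mention "any other out-of-range condition".

```python
def resolve_bound(index, size, is_end):
    """Resolve one zero-based, inclusive bound against an axis of length size."""
    if index is None:
        return size - 1 if is_end else 0   # None selects the last/first index
    if index < 0:
        index += size                      # Python-style wrap of negative indices
    if is_end and index >= size:
        index = size - 1                   # an overflowing end index is clipped
    if not 0 <= index < size:
        raise IndexError(f"index out of range for axis of length {size}")
    return index


def resolve_window(window, n_rows, n_cols):
    """Return inclusive (row_start, row_end, col_start, col_end)."""
    if window is None:
        window = (None, None, None, None)  # read all rows and columns
    row_start, row_end, col_start, col_end = window
    r0 = resolve_bound(row_start, n_rows, is_end=False)
    r1 = resolve_bound(row_end, n_rows, is_end=True)
    c0 = resolve_bound(col_start, n_cols, is_end=False)
    c1 = resolve_bound(col_end, n_cols, is_end=True)
    if r0 > r1 or c0 > c1:
        # Assumption: a reversed range is treated as out of range.
        raise IndexError("start index is greater than end index")
    return r0, r1, c0, c1


# A table with 10 rows and 3 curves.
print(resolve_window((2, 5, 0, None), 10, 3))
print(resolve_window((-3, -1, 0, 0), 10, 3))
```

Because both ends are inclusive, a window resolved this way corresponds to the half-open Python slices [r0:r1 + 1] and [c0:c1 + 1].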
- read_pd_labels(pathname)[source]
Read paired data labels from DSS file.
- Parameters:
pathname (str or DssPathName) – DSS record pathname.
- Returns:
Dictionary mapping primary column names to label names.
- Return type:
dict of str to str
Examples
>>> labels = fid.read_pd_labels("/A/B/STAGE-FLOW/D/E/F/")
>>> print(labels)
{'y0': 'Stage', 'y1': 'Flow'}
- pd_info(pathname)[source]
Get information about a paired data record.
- Parameters:
pathname (str or DssPathName) – DSS record pathname.
- Returns:
Dictionary containing paired data information with keys:
- 'curve_no' : int
Number of curves (columns).
- 'data_no' : int
Number of data points (rows).
- 'dtype' : int
Data type code.
- 'label_size' : int
Average label size in characters.
- Return type:
dict
Examples
>>> info = fid.pd_info("/A/B/STAGE-FLOW/D/E/F/")
>>> print(f"Curves: {info['curve_no']}, Points: {info['data_no']}")
Curves: 2, Points: 100
- put_pd(data, **kwargs)[source]
Write new paired data or edit an existing paired data record in the DSS file.
- Parameters:
data (PairedDataContainer or str or DssPathName) –
Input data to write. Can be:
A PairedDataContainer object.
A string or DssPathName specifying an existing or new DSS record pathname.
**kwargs (Any) –
Additional keyword arguments or attributes for the PairedDataContainer.
When writing a DataFrame:
- y_data : pandas.DataFrame
DataFrame containing paired data.
- x_units : str
Units for x-axis data.
- x_type : str
Type of x-axis data (e.g., "linear").
- y_units : str
Units for y-axis data.
- y_type : str
Type of y-axis data (e.g., "linear").
When writing a single curve to preallocated record:
- col_index : int
Column index (0-based) to write to.
- y_data : list or array-like
Y-axis values for the curve.
- window : tuple of (int, int), optional
Row range (start, end) for writing.
- y_labels : list of str, optional
Labels for y-axis curves.
- Return type:
None
- Raises:
ValueError – If incompatible parameters are provided or indices are out of range.
IndexError – If data has too many values.
Examples
Write PairedDataContainer:
>>> from pydsstools.core import PairedDataContainer
>>> pathname = "/A/B/STAGE-FLOW/D/E/F/"
>>> curves = 2
>>> rows = 5
>>> pdc = PairedDataContainer(pathname, (rows, curves))
>>> pdc.x_data = [0.1, 0.2, 0.3, 0.4, 0.5]
>>> pdc.y_data = [[10, 20, 30, 40, 50], [1, 2, 3, 4, 5]]
>>> pdc.x_units = "ft"
>>> pdc.x_type = "linear"
>>> pdc.y_units = "cfs"
>>> pdc.y_type = "linear"
>>> fid.put_pd(pdc)
Write DataFrame:
>>> import pandas as pd
>>> pathname = "/A/B/STAGE-FLOW/D/E/F/"
>>> df = pd.DataFrame({"Curve #1": [1, 2], "Curve #2": [3, 4]}, index=[0.5, 0.6])
>>> fid.put_pd(pathname, x_units="ft", x_type="linear", y_data=df,
...            y_units="cfs", y_type="linear")
Write a curve to preallocated paired data record:
>>> pathname = "/A/B/STAGE-FLOW/D/E/PREALLOC/"
>>> fid.put_pd(pathname, col_index=2, y_data=[1, 2, 3, 4], window=(2, 5))
- preallocate_pd(pathname, shape, **kwargs)[source]
Preallocate space for paired data record in DSS file.
This method creates an empty paired data structure in the DSS file that can later be filled with individual curves using put_pd with col_index parameter.
- Parameters:
pathname (str or DssPathName) – DSS record pathname.
shape (list of int or tuple of (int, int)) – Shape of the paired data as (rows, columns).
**kwargs (Any) – Additional keyword arguments for PairedDataContainer initialization, such as x_units, y_units, x_type, y_type, etc.
- Return type:
None
Examples
>>> pathname = "/A/B/STAGE-FLOW/D/E/PREALLOC/"
>>> fid.preallocate_pd(pathname, shape=(100, 5), x_units="ft", y_units="cfs")
- read_grid(pathname, metadata_only=False)[source]
Read spatial grid data from DSS file.
Reads both version 0 (DSS-6 format) and version 100 (latest DSS-7 format) spatial grid data from DSS file. The method automatically detects the grid version and converts older formats to the modern format.
- Parameters:
pathname (str or DssPathName) – DSS record pathname.
metadata_only (bool, optional) – If True, read only metadata without grid data. Default is False.
- Returns:
Spatial grid data structure containing grid data and metadata.
- Return type:
SpatialGridStruct
Examples
Read grid data:
>>> sg = fid.read_grid("/A/B/PRECIP/01JAN2020:0000/01JAN2020:2400/GRIDTYPE/")
Read only metadata:
>>> sg = fid.read_grid(pathname, metadata_only=True)
>>> print(sg.gridinfo.shape)
(100, 200)
Notes
There are slight differences in grid metadata between version-0 and version-100 grids. For example, the RLE-style compression used for precipitation data is supported only in version-0 grids. When a version-0 grid is read using read_grid, this compression method is reported in the returned gridinfo as undefined compression. Consequently, if a version-0 grid needs to be read and written back while preserving its original format, the read_grid2 method should be used instead.
- read_grid2(pathname, metadata_only=False)[source]
Read spatial grid data from DSS file and return as tuple.
Reads both version 0 (DSS-6 format) and version 100 (latest DSS-7 format) spatial grid data. This method provides an alternative return format compared to read_grid.
- Parameters:
pathname (str or DssPathName) – DSS record pathname.
metadata_only (bool, optional) – If True, return only metadata (gridinfo). Default is False.
- Returns:
If metadata_only is False, returns tuple of (numpy.ndarray, gridinfo). If metadata_only is True, returns gridinfo only. Returns None if grid data is invalid.
- Return type:
tuple of (numpy.ndarray, gridinfo), or gridinfo, or None
Examples
Read grid as array and gridinfo:
>>> data, gridinfo = fid.read_grid2(pathname)
>>> print(data.shape, gridinfo.grid_type)
Read only gridinfo:
>>> gridinfo = fid.read_grid2(pathname, metadata_only=True)
- put_grid(data, pathname=None, gridinfo=None, flipud=True, inplace=False, compute_stats=True, transform=None, normalize=True)[source]
Write spatial grid to DSS-7 file.
Writing to DSS-6 file is not allowed. Use put_grid0 for DSS-6 files.
- Parameters:
data (SpatialGridStruct or numpy.ndarray or numpy.ma.MaskedArray) –
Grid data to write.
numpy.ndarray: np.nan, nodata (from gridinfo), and UNDEFINED values are treated as nodata.
numpy.ma.MaskedArray: masked elements are treated as nodata.
SpatialGridStruct: a structured object containing grid and metadata.
pathname (str or DssPathName or None, optional) – Pathname for the DSS record. It can be None for SpatialGridStruct. The dates in parts D and E are automatically reformatted to the correct convention. Part D uses the beginning of the day (e.g., 02JAN2025:0000) while Part E uses the end-of-previous-day convention (e.g., 01JAN2025:2400). Default is None.
gridinfo (GridInfo or subclass or None, optional) –
Metadata describing the grid. Can be one of:
GridInfo, HrapInfo, or AlbersInfo: requires data_type, cell_size, and shape at minimum.
SpecifiedInfo: additionally requires nodata and crs.
Default is None.
flipud (bool, optional) – If True, flips the rows of the data array upside down before writing. This is necessary when the input data is numpy array with origin at top-left (e.g., array representing raster image in rasterio). Default is True.
inplace (bool, optional) – If True, tries to modify the data in place to reduce memory usage. Default is False.
compute_stats (bool or list of float, optional) –
Controls whether and how statistics are computed for the grid data. Default is True.
Possible values:
True: compute min, max, mean, range values, and range counts.
False: do not compute statistics.
list of float: compute “greater than or equal to” counts for the specified values (maximum of 19 thresholds, excluding nodata).
transform (Any or None, optional) – Spatial transform information (e.g., affine transform). If provided, it overrides transform parameters in gridinfo. Default is None.
normalize (bool, optional) – If True, tries to normalize coords_cell0 and lower_left_cell based on min_xy or input transform parameter. Default is True.
- Return type:
None
- Raises:
Exception – If D-part or E-part is not a valid datetime string for time-stamped grids.
Examples
Write grid from array:
>>> import numpy as np
>>> from pydsstools.core.gridinfo import SpecifiedGridInfo
>>> data = np.random.rand(100, 200).astype(np.float32)
>>> pathname = "/A/B/PRECIP/01JAN2020:0000/01JAN2020:2400/SHG/"
>>> gridinfo = SpecifiedGridInfo(data_type="PER-CUM", cell_size=2000.0,
...                              lower_left_x=100000, lower_left_y=200000,
...                              rows=100, cols=200, nodata=-999.0)
>>> fid.put_grid(data, pathname, gridinfo)
Write with custom statistics thresholds:
>>> fid.put_grid(data, pathname, gridinfo, compute_stats=[0, 10, 50, 100])
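The D-part and E-part date convention described for pathname can be illustrated without touching a DSS file. The sketch below is only an illustration of that convention, not the reformatting code used by put_grid; format_grid_time and MONTHS are hypothetical names introduced here.

```python
from datetime import datetime, timedelta

# Month abbreviations are spelled out to avoid locale-dependent strftime output.
MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")


def format_grid_time(moment, end_of_period):
    """Format a datetime as DDMMMYYYY:HHMM for a grid pathname part.

    end_of_period=False gives the D-part style (midnight is 0000 of that day).
    end_of_period=True gives the E-part style (midnight is 2400 of the
    previous day).
    """
    hhmm = f"{moment.hour:02d}{moment.minute:02d}"
    if end_of_period and hhmm == "0000":
        moment -= timedelta(days=1)   # step back to the previous calendar day
        hhmm = "2400"
    return f"{moment.day:02d}{MONTHS[moment.month - 1]}{moment.year:04d}:{hhmm}"


midnight = datetime(2025, 1, 2)
d_part = format_grid_time(midnight, end_of_period=False)
e_part = format_grid_time(midnight, end_of_period=True)
print(d_part, e_part)
```

The same instant (midnight at the start of 2 January 2025) is therefore written as 02JAN2025:0000 when it starts a period and as 01JAN2025:2400 when it ends one.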
- put_grid0(data, pathname=None, gridinfo=None, flipud=True, inplace=False, compute_stats=True, transform=None, normalize=True)[source]
Write spatial grid to DSS-6 file.
Writing to DSS-7 file using this method is experimental and may cause problems. Use put_grid for DSS-7 files instead.
- Parameters:
data (SpatialGridStruct or numpy.ndarray or numpy.ma.MaskedArray) –
Grid data to write.
numpy.ndarray: np.nan, nodata (from gridinfo), and UNDEFINED values are treated as nodata.
numpy.ma.MaskedArray: masked elements are treated as nodata.
SpatialGridStruct: a structured object containing grid and metadata.
pathname (str or DssPathName or None, optional) – Pathname for the DSS record. It can be None for SpatialGridStruct. The dates in parts D and E are automatically reformatted to the correct convention. Part D uses the beginning of the day (e.g., 02JAN2025:0000) while Part E uses the end-of-previous-day convention (e.g., 01JAN2025:2400). Default is None.
gridinfo (GridInfo or GridInfo6 or None, optional) – Metadata describing the grid for version 6 and 7. Default is None.
flipud (bool, optional) – If True, flips the rows of the data array upside down before writing. This is necessary when the input data is numpy array with origin at top-left (e.g., array representing raster image in rasterio). Default is True.
inplace (bool, optional) – If True, tries to modify the data in place to reduce memory usage. Default is False.
compute_stats (bool or list of float, optional) –
Controls whether and how statistics are computed for the grid data. Default is True.
Possible values:
True: compute min, max, mean, range values, and range counts.
False: do not compute statistics.
list of float: compute “greater than or equal to” counts for the specified values (maximum of 19 thresholds, excluding nodata).
transform (Any or None, optional) – Spatial transform information (e.g., affine transform). If provided, it overrides transform parameters in gridinfo. Default is None.
normalize (bool, optional) – If True, tries to normalize coords_cell0 and lower_left_cell based on min_xy or input transform parameter. Default is True.
- Return type:
None
- Raises:
Exception – If D-part or E-part is not a valid datetime string for time-stamped grids.
Notes
This method writes grid data in DSS-6 (version 0) format. It is primarily intended for maintaining compatibility with legacy DSS-6 files.
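The list-of-float form of compute_stats, shared by put_grid and put_grid0, produces "greater than or equal to" counts. A rough pure-Python picture of that idea follows. It is a sketch under stated assumptions, not the library's routine: threshold_counts is a hypothetical helper, NaN and nodata cells are assumed to be left out of every count, and raising ValueError for more than 19 thresholds is an assumption about how the limit might be enforced.

```python
import math


def threshold_counts(grid, thresholds, nodata=None):
    """Count valid cells that are >= each threshold.

    Assumption: NaN and nodata cells are excluded from every count.
    """
    if len(thresholds) > 19:
        # Assumption: the documented 19-threshold limit is enforced by raising.
        raise ValueError("at most 19 thresholds are supported")
    valid = [
        cell
        for row in grid
        for cell in row
        if not math.isnan(cell) and cell != nodata
    ]
    return {t: sum(1 for cell in valid if cell >= t) for t in thresholds}


grid = [
    [0.0, 5.0, 12.0],
    [-999.0, 55.0, 120.0],
    [float("nan"), 10.0, 50.0],
]
counts = threshold_counts(grid, [0, 10, 50, 100], nodata=-999.0)
print(counts)
```

Each count is cumulative from its threshold upward, so the counts never increase as the thresholds grow.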
- copy_path(pathname_in, pathname_out, dss_out=None)[source]
Copy a DSS record from one pathname to another.
Can copy within the same file or to a different DSS file.
- Parameters:
pathname_in (str or DssPathName) – Source pathname to copy from.
pathname_out (str or DssPathName) – Destination pathname to copy to.
dss_out (Open or None, optional) – Destination DSS file object. If None, copies within the same file. Default is None.
- Return type:
None
Examples
Copy within same file:
>>> fid.copy_path("/A/B/C/D/E/F/", "/A/B/C_COPY/D/E/F/")
Copy to different file:
>>> with Open("target.dss", mode="rw") as fid_out:
...     fid.copy_path("/A/B/C/D/E/F/", "/A/B/C/D/E/F/", dss_out=fid_out)
- del_path(pathname)[source]
Delete DSS record(s) matching the given pathname pattern.
- Parameters:
pathname (str or DssPathName) – Pathname or pathname pattern to delete. Supports wildcards (*).
- Return type:
None
Examples
Delete specific record:
>>> fid.del_path("/A/B/C/D/E/F/")
Delete multiple records with wildcards:
>>> fid.del_path("/A/B/*/D/E/F/")
- search_path(pathname='', sort=False)[source]
Search for DSS pathnames matching a pattern.
- Parameters:
pathname (str or DssPathName, optional) – Pathname pattern which can include wildcard (*) for defining search pattern. Empty string returns all pathnames. Default is “”.
sort (bool, optional) – If True, sort the returned pathnames. Default is False.
- Returns:
List of matching pathnames.
- Return type:
list of str
Examples
Get all pathnames:
>>> paths = fid.search_path()
Search with pattern:
>>> paths = fid.search_path("/A/B/*/D/E/F/")
Get sorted results:
>>> paths = fid.search_path("/A/*/*/*/*/F/", sort=True)
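To make the wildcard behaviour concrete, here is a small stand-alone sketch that filters a list of pathnames part by part using the standard-library fnmatch module. It is not how search_path or del_path are implemented. The helpers split_parts and match_pathnames are hypothetical, and two assumptions are made: matching is case-insensitive, and "*" stands for any text within a single pathname part. Note that fnmatch also gives "?" and "[...]" special meaning, which the library may not.

```python
from fnmatch import fnmatchcase


def split_parts(pathname):
    """Split "/A/B/C/D/E/F/" into its upper-cased parts (slashes removed)."""
    return pathname.upper()[1:-1].split("/")


def match_pathnames(pathnames, pattern="", sort=False):
    """Return the pathnames whose parts match the pattern part by part."""
    if not pattern:
        matches = list(pathnames)          # empty pattern returns everything
    else:
        wanted = split_parts(pattern)
        matches = []
        for pathname in pathnames:
            parts = split_parts(pathname)
            if len(parts) == len(wanted) and all(
                fnmatchcase(part, pat) for part, pat in zip(parts, wanted)
            ):
                matches.append(pathname)
    return sorted(matches) if sort else matches


catalog = [
    "/A/X/FLOW/01JAN2020/1HOUR/F/",
    "/A/B/STAGE/01JAN2020/1HOUR/F/",
    "/A/B/FLOW/01JAN2020/1HOUR/F/",
]
same_location = match_pathnames(catalog, "/A/B/*/01JAN2020/1HOUR/F/")
flows_sorted = match_pathnames(catalog, "/a/*/flow/*/*/f/", sort=True)
print(same_location)
print(flows_sorted)
```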
- path_dict(sub_type=False)[source]
Get all pathnames in DSS file organized by data type.
- Parameters:
sub_type (bool, optional) – If True, separate time series into regular and irregular, and grids by type. If False, group all time series together and all grids together. Default is False.
- Returns:
Dictionary mapping data type names to lists of pathnames.
When sub_type is True, keys include:
"ts-reg": Regular time series
"ts-irreg": Irregular time series
"pd": Paired data
"text": Text data
"text-table": Text tables
"grid-undefined": Undefined grid type
"grid-hrap": HRAP grids
"grid-albers": Albers grids
"grid-spec": Specified grids
"tin": TIN data
"location": Location data
"array": Array data
"image": Image data
"generic": Generic data
"undefined": Undefined data types
When sub_type is False, keys include:
"ts": All time series (regular + irregular)
"grid": All grids (undefined + hrap + albers + specified)
Other keys same as above
- Return type:
dict of str to list of str
Examples
Get all paths grouped by general type:
>>> paths = fid.path_dict()
>>> print(f"Time series: {len(paths['ts'])}")
>>> print(f"Paired data: {len(paths['pd'])}")
Get paths with detailed sub-types:
>>> paths = fid.path_dict(sub_type=True)
>>> print(f"Regular TS: {len(paths['ts-reg'])}")
>>> print(f"Irregular TS: {len(paths['ts-irreg'])}")
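The relationship between the two key layouts can be sketched as a plain dictionary transformation. This is an illustration of the grouping described under Returns, not the library's code. The helper collapse_sub_types is hypothetical, and the sample pathnames are made up for the example.

```python
def collapse_sub_types(detailed):
    """Merge a sub_type=True style dict into the sub_type=False layout."""
    grouped = {"ts": [], "grid": []}
    for key, paths in detailed.items():
        if key.startswith("ts-"):
            grouped["ts"].extend(paths)      # ts-reg + ts-irreg
        elif key.startswith("grid-"):
            grouped["grid"].extend(paths)    # all grid sub-types
        else:
            grouped[key] = list(paths)       # other keys are kept as they are
    return grouped


detailed = {
    "ts-reg": ["/A/B/FLOW//1HOUR/F/"],
    "ts-irreg": ["/A/B/FLOW//IR-DAY/F/"],
    "pd": ["/A/B/STAGE-FLOW///F/"],
    "text-table": [],
    "grid-hrap": [],
    "grid-albers": ["/A/B/PRECIP/01JAN2020:0000/01JAN2020:2400/F/"],
}
grouped = collapse_sub_types(detailed)
print(sorted(grouped))
```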