gnss-product-management: Layers & Abstractions
Architecture blueprint for
packages/gnss-product-management/src/gnss_product_management/
Application Boundaries
Boundary |
Direction |
Modules |
Operations |
|---|---|---|---|
Network (FTP/FTPS) |
Outbound |
|
List directories, download files |
Network (HTTP/HTTPS) |
Outbound |
|
List directories (HTML parsing), download files |
Filesystem (bundled configs) |
Read |
|
Load YAML specs at startup |
Filesystem / Cloud (user workspace) |
Read/Write |
|
Search/store product files (local paths or |
User input |
Inbound |
|
Date, product name, constraints |
Layer Definitions
┌──────────────────────────────────────────────────────────────┐
│ Layer 4: INTERFACE │
│ GNSSClient, ProductQuery — user-facing entry point │
│ ProductRegistry, WorkSpace — environment setup │
│ (client/, environments/) │
├──────────────────────────────────────────────────────────────┤
│ Layer 3: ORCHESTRATION │
│ SearchPlanner, WormHole, ConnectionPoolFactory, │
│ LockfileManager │
│ Pipelines: DownloadPipeline, LockfileWriter, │
│ ResolvePipeline │
│ (factories/) │
├──────────────────────────────────────────────────────────────┤
│ Layer 2: CATALOG (Resolution + Registry) │
│ FormatCatalog, ProductCatalog, ResourceCatalog, │
│ SourcePlanner (Protocol) │
│ (specifications/format/, specifications/products/, │
│ specifications/remote/, factories/source_planner.py) │
├──────────────────────────────────────────────────────────────┤
│ Layer 1: SPECIFICATION (Data Models + YAML Loading) │
│ ParameterCatalog, FormatSpec, ProductSpec, SearchTarget, │
│ LocalResourceSpec, DependencySpec, LockProduct │
│ (specifications/, lockfile/models.py) │
├──────────────────────────────────────────────────────────────┤
│ Layer 0: CONFIGURATION (Static Data + Utilities) │
│ Bundled YAML files, │
│ as_path(), hash_file(), decompress_gzip() │
│ (utilities/, gnss-management-specs) │
└──────────────────────────────────────────────────────────────┘
Dependency Rule
Each layer may depend only on layers below it. No upward or lateral dependencies.
Layer 4 → Layer 3 → Layer 2 → Layer 1 → Layer 0
Layer 0: Configuration (Static Data + I/O Adapters)
Responsibility: Provide static bundled data, path constants, protocol-level I/O, and shared utility functions. No business logic.
Modules
Module |
Concern |
Boundary |
|---|---|---|
|
Path constants (META_SPEC_YAML, etc.) and all bundled YAML files |
Filesystem (package resources) |
|
|
— |
|
Computed field registration (DDD, GPSWEEK, etc.) |
— |
|
|
— |
Abstractions
as_path(uri)(utilities/paths.py): Single dispatch point for path construction. Returnscloudpathlib.CloudPathfors3://,gs://,az://URIs andpathlib.Pathfor everything else. All filesystem operations throughout the package flow through this helper.AnyPath:Union[Path, CloudPath]type alias used in signatures throughout layers 2–4.
Key Rule
Layer 0 must not import from any other layer. Utility functions operate on primitive types (strings, paths), not domain models.
Layer 1: Specification (Data Models + YAML Loading)
Responsibility: Define the domain vocabulary as Pydantic models. Load specifications from YAML. No resolution, no query logic, no I/O beyond YAML parsing.
Modules & Key Abstractions
Abstraction |
Module |
Description |
|---|---|---|
|
|
Single metadata field (name, value, pattern, derivation) |
|
|
Registry: parameter name → Parameter definition |
|
|
Field definition within a format version |
|
|
A format version with metadata fields + file templates |
|
|
Top-level format with versions |
|
|
Loaded format specs from YAML |
|
|
Read-only lookup of raw format specs |
|
|
Concrete product: name, parameters, directory/filename templates |
|
|
Template string with |
|
|
Generic: variant name → T |
|
|
Generic: version name → VariantCatalog[T] |
|
|
Abstract binding: product name + format ref + parameter overrides |
|
|
Loaded product specs from YAML |
|
|
Server endpoint (hostname, protocol, auth) |
|
|
Root center spec: servers + product offerings |
|
|
Concrete query target: product + server + directory + filename |
|
|
Resolved SearchTargets per center |
|
|
Group of product specs sharing a directory template |
|
|
Root local storage spec |
|
|
Single product dependency (spec name, required, constraints) |
|
|
Sort preference for a dependency resolution pass |
|
|
Full dependency declaration for a processing task |
|
|
Resolution result (status, URI string, remote URL) |
|
|
Aggregated results for all dependencies in a spec |
|
|
Per-file lock entry (hash, size, source URL, sink URI) |
|
|
Aggregate lockfile for one processing day |
Interfaces
Each spec type exposes a from_yaml(path) -> Self classmethod for loading. All models are Pydantic BaseModel subclasses.
ResolvedDependency.local_path is a str URI — pass it through as_path() to get a filesystem object for I/O.
Key Rule
Layer 1 models are declarative data. They define what exists, not how to build or query it.
Layer 2: Catalog (Resolution + Registry)
Responsibility: Transform abstract specifications into concrete, queryable objects. Maintain registries for lookup. This is where specs become usable products.
Modules & Key Abstractions
Abstraction |
Module |
Input → Output |
|---|---|---|
|
|
Base class enforcing |
|
|
Shared interface: |
|
|
FormatSpecCatalog + ParameterCatalog → resolved Products per format/version/variant |
|
|
ProductSpecCatalog + FormatCatalog → resolved Products per product/version/variant |
|
|
ResourceSpec + ProductCatalog → expanded SearchTarget list |
Resolution Chain
ParameterCatalog ──┐
├──► FormatCatalog ──► ProductCatalog ──► ResourceCatalog
FormatSpecCatalog ─┘ ProductSpecCatalog ─┘ ↑ ResourceSpec[]
ProductRegistry (Layer 4 setup, Layer 2 usage) holds the built ResourceCatalog objects and implements the SourcePlanner Protocol for remote resource lookup. WorkSpace implements SourcePlanner for local resource lookup. SearchPlanner (Layer 3) delegates to both.
Interfaces
FormatCatalog.build(format_spec_catalog, parameter_catalog) -> FormatCatalogProductCatalog.build(product_spec_catalog, format_catalog) -> ProductCatalogResourceCatalog.build(resource_spec, product_catalog) -> ResourceCatalog
ProductRegistry and WorkSpace satisfy the SourcePlanner Protocol used by SearchPlanner (Layer 3):
resource_ids -> List[str]source_product(product, resource_id) -> List[SearchTarget]
Key Rule
Catalogs are immutable after construction. Resolution happens once; the result is cached as data. No network I/O, no filesystem writes.
Layer 3: Orchestration (Query Building + Transport + Resolution)
Responsibility: Combine catalogs with user constraints to build, execute, and resolve queries. Touches external boundaries (network, filesystem, cloud storage).
Modules & Key Abstractions
Abstraction |
Module |
Responsibility |
|---|---|---|
|
|
Date + product + constraints → |
|
|
Directory listing + filename matching + file download |
|
|
fsspec-backed connection pools per host (FTP, FTPS, HTTP, HTTPS, file) |
|
|
Lockfile lifecycle: check, load, save |
|
|
Pipeline: |
|
|
Pipeline: write sidecar + aggregate lockfiles |
|
|
Pipeline: |
Data Flow
User constraints (date, product, parameters...)
│
▼
SearchPlanner
├── Resolve product templates (ProductCatalog)
├── Compute date fields (ParameterCatalog)
├── Narrow parameters (user constraints)
├── Expand combinations (cartesian product)
├── Local targets (WorkSpace.source_product)
└── Remote targets (ProductRegistry.source_product)
│
▼
List[SearchTarget] (directory pattern + filename pattern)
│
▼
WormHole
├── Group by (hostname, directory)
├── List directories in parallel (ConnectionPoolFactory)
├── Match filename patterns (regex)
└── Optionally download (ConnectionPoolFactory.download_file)
│
▼
List[SearchTarget] (filename.value populated)
Cloud / Local Filesystem Transparency
WorkSpace and LockfileManager accept base directories as URI strings, dispatched through as_path():
Local path:
/data/gnss→pathlib.PathS3 URI:
s3://bucket/gnss→cloudpathlib.S3Path
All path operations (.exists(), .iterdir(), .read_text(), .write_text(), .mkdir(), / operator) are identical across backends. The LockfileManager stores aggregate lockfiles at the resource’s base_dir / "dependency_lockfiles", enabling distributed workers to coordinate via shared cloud storage.
Interfaces
SearchPlanner.get(date, product, parameters, ...) -> List[SearchTarget]WormHole.search(List[SearchTarget]) -> List[SearchTarget](one per matched file)WormHole.download_one(query, local_resource_id, local_factory, date) -> AnyPath | NoneResolvePipeline.run(spec, date, sink_id, ...) -> Tuple[DependencyResolution, AnyPath | None]LockfileManager(lockfile_dir: str | Path | CloudPath)— storage-agnostic
Key Rule
Orchestration modules coordinate between catalogs and I/O. They do not define domain models — they consume them. Network/filesystem operations are delegated to ConnectionPoolFactory (via fsspec) or cloudpathlib.
Layer 4: Interface (Entry Point)
Responsibility: Provide user-facing APIs that wire all layers together. Hide internal complexity behind clean, fluent interfaces.
Modules & Key Abstractions
Abstraction |
Module |
Responsibility |
|---|---|---|
|
|
Primary entry point: search, download, resolve dependencies |
|
|
Fluent query builder ( |
|
|
User-facing search result (hostname, filename, parameters, local_path) |
|
|
Loads YAML specs, builds catalog chain, holds registered resource catalogs |
|
|
Registers local/cloud storage directories against |
|
|
Bound spec + |
Interfaces
# Construct from bundled defaults
client = GNSSClient.from_defaults(base_dir="/data/gnss") # local
client = GNSSClient.from_defaults(base_dir="s3://bucket/gnss") # S3
# Fluent query builder
results = (
client.query()
.for_product("ORBIT")
.on(date)
.where(TTT="FIN")
.sources("COD", "WUM")
.prefer(TTT=["FIN", "RAP"])
.search()
)
# Date-range query (searches each day in parallel)
results = (
client.query()
.for_product("ORBIT")
.on_range(start_date, end_date)
.where(TTT="FIN")
.search()
)
# Download
paths = client.download(results, sink_id="local")
# Full dependency resolution
resolution, lockfile_path = client.resolve_dependencies(dep_spec, date, sink_id="local")
For advanced use, construct manually:
registry = ProductRegistry()
registry.add_parameter_spec(META_SPEC_YAML)
registry.add_format_spec(FORMAT_SPEC_YAML)
registry.add_product_spec(PRODUCT_SPEC_YAML)
registry.add_resource_spec(center_yaml)
registry.build()
workspace = WorkSpace()
workspace.add_resource_spec(local_spec_yaml)
workspace.register_spec(base_dir="s3://my-bucket/gnss", spec_ids=["local_config"])
client = GNSSClient(env=registry, workspace=workspace)
Key Rule
All standard user code should interact through GNSSClient or ProductQuery. ProductRegistry and WorkSpace are setup objects; the pipelines and planners are internal implementation details.
Abstraction Inventory by Layer
Layer 0 (Configuration) Layer 1 (Specification)
───────────────────── ───────────────────────
as_path() / AnyPath Parameter
hash_file() ParameterCatalog
decompress_gzip() FormatFieldDef
_ensure_datetime() FormatVersionSpec
register_computed_fields() FormatSpec
gnss-management-specs YAMLs FormatSpecCatalog
FormatRegistry
Product
PathTemplate
VariantCatalog[T]
VersionCatalog[T]
ProductSpec
ProductSpecCatalog
Server
ResourceSpec
SearchTarget
ResourceCatalog
LocalCollection
LocalResourceSpec
Dependency
SearchPreference
DependencySpec
ResolvedDependency
DependencyResolution
LockProduct
DependencyLockFile
Layer 2 (Catalog) Layer 3 (Orchestration)
───────────────── ───────────────────────
Catalog (ABC base) SearchPlanner
SourcePlanner (Protocol) WormHole
FormatCatalog ConnectionPoolFactory
ProductCatalog LockfileManager
ResourceCatalog DownloadPipeline
LockfileWriter
ResolvePipeline
Layer 4 (Interface)
───────────────────
GNSSClient
ProductQuery
FoundResource
ProductRegistry
WorkSpace
RegisteredLocalResource
Summary
Layer |
Purpose |
Depends On |
Boundary |
|---|---|---|---|
0 — Configuration |
Static data, path utilities, bundled YAMLs |
Nothing |
Filesystem |
1 — Specification |
Domain models, YAML loading |
Layer 0 |
Filesystem (YAML reads) |
2 — Catalog |
Resolve specs → concrete objects, registries |
Layer 1 |
None (pure computation) |
3 — Orchestration |
Build queries, execute fetches, resolve deps, manage lockfiles |
Layers 0, 1, 2 |
Network, Filesystem, Cloud |
4 — Interface |
User-facing API, environment wiring |
Layers 1, 2, 3 |
User input |