General Naming Conventions and Typing #3

miili · 2022-11-03T16:52:20Z

Hello all,

First off, thank you for this first draft.

I know, I know, Fortran once limited variable names to 6 characters.
Please consider using clear, self-explanatory variable names:

DASFileVersion  -> version
domain          -> data_unit
t0              -> start_time
dt              -> sampling_period
GL              -> gauge_length
lats            -> latitudes
longs           -> longitudes
elev            -> elevations
meta            -> additional_data
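
A minimal sketch of how the renaming above could be applied to an existing header. The `RENAMES` dict and the `rename_keys` helper are hypothetical names used only for illustration and are not part of the draft reader; the example values are made up.

```python
# Hypothetical mapping from the draft's short names to the proposed names.
RENAMES = {
    "DASFileVersion": "version",
    "domain": "data_unit",
    "t0": "start_time",
    "dt": "sampling_period",
    "GL": "gauge_length",
    "lats": "latitudes",
    "longs": "longitudes",
    "elev": "elevations",
    "meta": "additional_data",
}


def rename_keys(header: dict) -> dict:
    """Return a copy of `header` using the proposed descriptive names.

    Keys that do not appear in RENAMES are passed through unchanged.
    """
    return {RENAMES.get(key, key): value for key, value in header.items()}


# Illustrative values only.
old_header = {"t0": 1667490740.0, "dt": 0.001, "GL": 10.0}
new_header = rename_keys(old_header)
print(new_header)
```

A plain dict keeps the mapping readable in one place, so old files could be converted without touching the rest of a reader.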


miili · 2022-11-03T17:51:53Z

Please use confined dataclasses (https://docs.python.org/3/library/dataclasses.html) for the metadata. Here is a proposal:

from dataclasses import dataclass
from typing import Any, Literal

StrainUnit = Literal["m/m", "cm/m", "nm/m"]


@dataclass
class DASMeta:
    version: int
    data_unit: StrainUnit
    start_time: float
    sampling_period: float
    gauge_length: float
    latitudes: list[float]
    longitudes: list[float]
    elevations: list[float]
    additional_data: dict[str, Any]

    def endtime(self, nsamples) -> float:
        ...
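
One possible way to complete and use the proposal above, as a sketch only. The proposal leaves the body of `endtime` open; here it is assumed to mean the time stamp of the last sample, i.e. `start_time + (nsamples - 1) * sampling_period`. The units in the comments, the empty-dict default for `additional_data`, and all example values are likewise assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Literal

StrainUnit = Literal["m/m", "cm/m", "nm/m"]


@dataclass
class DASMeta:
    version: int
    data_unit: StrainUnit
    start_time: float       # assumed: seconds since the epoch
    sampling_period: float  # assumed: seconds
    gauge_length: float     # assumed: metres
    latitudes: list[float]
    longitudes: list[float]
    elevations: list[float]
    # Assumption: default to an empty dict so callers may omit it.
    additional_data: dict[str, Any] = field(default_factory=dict)

    def endtime(self, nsamples: int) -> float:
        # Assumption: time stamp of the last of `nsamples` samples.
        return self.start_time + (nsamples - 1) * self.sampling_period


# Illustrative values only.
meta = DASMeta(
    version=1,
    data_unit="m/m",
    start_time=100.0,
    sampling_period=0.5,
    gauge_length=10.0,
    latitudes=[60.0, 60.1],
    longitudes=[5.0, 5.1],
    elevations=[0.0, 2.0],
)
print(meta.endtime(nsamples=5))  # 102.0
```

Note that `Literal` is only checked by static type checkers such as mypy; the dataclass itself does not validate `data_unit` at runtime.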

andreas-wuestefeld · 2022-11-03T18:26:18Z

I chose readability over efficiency.
In my experience, your suggestion of classes increases the barrier to entry. For many students this might be their first contact with programming.

I envision this as a reference reader, not an optimum super-duper high-class reader. It should help people understand the data format.

But I am open to arguments against such an approach.

andreas-wuestefeld · 2022-11-03T18:28:22Z

Regarding variable names, I am just a lazy typist :-)
I understand the argument for descriptive names.

Let's see what the community thinks.

jpmorten-asn · 2022-11-04T07:36:58Z

My preference is definitely to write out the variable names in full, using underscores in place of spaces. This can avoid a lot of misunderstandings and makes it possible to discover the structure of the data even when documentation is not available (lost or forgotten). I think one aim of the project was indeed to create a discoverable format.

miili · 2022-11-04T13:37:00Z

> In my experience, your suggestion of classes increases the barrier to entry. For many students this might be their first contact with programming.

@andreas-wuestefeld, if we are looking for a sustainable DAS data format, we need an elaborate concept. Conceptualizing a data format is not a task for students or beginner programmers. We need performant I/O (layout) and efficient storage (compression).

> I envision this as a reference reader, not an optimum super-duper high-class reader.

A sustainable data format should be super-duper efficient and versatile!

> It should help people understand the data format.

A user does not need to understand a data format. All its complexity has to be abstracted away by a reference library. This is why, e.g., ObsPy (libmseed), libjpeg, and libhdf5 are so successful.

The fundamental question is whether we are looking for a serious DAS data format implemented by IRIS which can be used for

performant data analysis,
efficient archiving,
possibly streaming and
querying online repositories (similar to FDSNWS)

or an HDF5 structure for project-internal exchange in February.

andreas-wuestefeld · 2022-11-04T14:05:23Z

@miili I learned yesterday evening, in response to publishing this format, that IRIS is actually working on / considering a format.
It may well be that this format is rather short-lived, although I hope it will prove its worth.

I thus changed the potentially misleading name from IRIS (as part of the IRIS RCN efforts) to the more general miniDAS. The repo name is still the same but will hopefully be fixed over the weekend.

At this point, I feel it is most important to have a common format for the global month, ideal or not.
Your input is very good, and I am happy to hear these comments from someone obviously more familiar with the deep down programming features.

Maybe you can just point out the easiest-to-fix issues here so they can be implemented (space vs. time, for example?). Variable names can obviously also be adjusted.

andreas-wuestefeld · 2022-11-06T14:57:01Z

Implemented.
Comments on the new names are welcome.

dcbowden · 2022-12-02T14:24:15Z

I'm a month late to these discussions. I agree with @andreas-wuestefeld that the formal object-oriented structure is going to be a bit harder for many of us academics to deal with; I also had to wrap my head around how to work with it. That said, I agree with @miili that it could be OK, in that most academics and students don't need to worry about the internals! We just need some very user-friendly demos before February. Maybe Jupyter Notebooks? Not just a README list of headers and function inputs/outputs, but a full step-by-step guide showing how to load some interrogator's raw output (Silixa, Febus, whatever), declare the metadata object, use the from_numpy() function to eventually save the proper output, etc.

miili changed the title ~~General Naming Conventions~~ General Naming Conventions and Typing Nov 3, 2022
