Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strong-type data points, avoid magic strings #166

Open
omidkrad opened this issue Oct 30, 2019 · 3 comments
Open

Strong-type data points, avoid magic strings #166

omidkrad opened this issue Oct 30, 2019 · 3 comments

Comments

@omidkrad
Copy link

omidkrad commented Oct 30, 2019

It is great that this project is being converted to TypeScript. I wanted to suggest to use generics for strong-typing data points in time series, and avoid magic strings as much as possible. I created this commit to show how the interfaces could be updated to do that (it does not include the implementation):

Commit: Add generic type for data points

With that, as an example, avg function could be updated to support this signature:

avg(key: (data: TData) => number, filter?: any): number;

and be used like below:

import { timeSeries } from "pondjs";

interface SensorDataPoint {
    time: number;
    sensor: number;
    status: string;
}

const series = timeSeries<SensorDataPoint>({
    name: "sensor_data",
    columns: ["time", "sensor", "status"], // checked, non-existent keys like "status1" will give error
    points: [
        [1400425947000, 3, "ok"],
        [1400425948000, 8, "high"],
        [1400425949000, 2, "low"],
        [1400425950000, 3, "ok"],
    ]
});

// by field
series.avg(d => d.sensor);   // 4

// calculation
series.avg(d => d.sensor + d.sensor2 * 100);

// field path
series.avg(d => d.sensor.diningRoom);

I believe function signatures that receive magic strings should be deprecated in favor of this new signature.

@pjm17971
Copy link
Contributor

pjm17971 commented Nov 1, 2019

Firstly, thank you for taking the time to think about this and write this up. I like this suggestion as I've been contemplating removing the fieldSpec type references myself, both because of typing issues and also the complexity it scatters though the codebase. I'm in favor of doing something along these lines.

The two options I see to address the typing and complexity issues are:

  • option (a) Implement something like you wrote up where we type the event data payload and see where that takes the implementation, solving what ever problems that brings up
  • option (b) Vastly simplify the code by saying and event is a (timestamp, value) pair, a collection is a list of those pairs, and a timeseries is a map of { column -> Collection }, plus meta data (name, tz, etc).

I've started a branch to explore option (a) as this is a less radical change. When I ported Pond to Typescript originally that was the obvious thought (while I was generically typing the key, it made sense to generically type the data), but there was some obstacles that where too much of a lift at the time in terms of what that meant across the code. One such obstacle was how avg() type operations would work, operations that essentially used get() with a fieldSpec would work, and your suggestion for those is elegant. So let's try that.

There was a couple of other concerns that still would need to be solved or at least consider the trade off:

  • The library was originally built as a set of timeseries operations sitting on top of immutable.js. That approach, having that guarantee, has served as us pretty well over the years. I'm not sure how I feel about the trade-off here.

  • It seems like there's going to be times when we potentially want something from TData that's specific to that type, like ability to serialize and deserialize (which since it's an Immutable.map now, we know how to do that, once it's in user land we don't), and also possibly the ability to clone the data, and possibly other things. So that would either mean more user functions or that TData needs to extend an abstract interface that forces it to provide that information, even for the most simple use case.

  • Another concern that tips my thinking towards option (b) above is that we'd like to port most of this to Golang at some point, and Go doesn't support generics. That's a down the road concern, but one I'm still thinking about.

Anyway, I'm mostly replying to thank you for taking the time to make this suggestion. I'll continue to look into this.

@pjm17971
Copy link
Contributor

pjm17971 commented Nov 2, 2019

Kind of a long way to go, but got tests for this basic functionality working with Events and Collections:

interface Sensor {
    value: number;
    status: string;
}

const c = collection<Time, Sensor>(
    Immutable.List([
        event(time("2015-04-22T03:30:00Z"), { value: 32, status: "ok" }),
        event(time("2015-04-22T02:30:00Z"), { value: 42, status: "fail" })
    ])
);

c.avg(d => d.value);

// 37

c.toJSON(({ value, status }) => ({ value, ok: status === "ok" ? true : false }))

// [
//   { time: 1429673400000, data: { value: 32, ok: true } },
//   { time: 1429669800000, data: { value: 42, ok: false } }
// ]

@omidkrad
Copy link
Author

omidkrad commented Nov 2, 2019

That's great. I'm not very familiar with Pond but when I was reviewing it, I liked the project but thought the API needed that kind of improvement. It's great to see my suggestion was received well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants