Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce "StringStream.split()" method #23

Merged
merged 30 commits into from
Nov 3, 2021
Merged

Conversation

f1ames
Copy link
Contributor

@f1ames f1ames commented Oct 25, 2021

This PR introduces:

  • DataStream.flatMap() method
  • StringStream class
  • StringStream.split() method

It also contains some changes regarding BDD test setup (cucumber.js) since the main idea behind it was to provide BDD test for file processing (read -> split -> filter -> aggregate -> reduce) and so it required quite a lot of changes. At the moment, there is still no aggregate and reduce but I will most probably add it in separate PRs since this one got quite big anyways.

DataStream.flatMap() method have been introduced in dc9ad38 (with some minor corrections added later).

As for StringStream and its .split() method:

I also did some refactoring here and there so there are some changes not directly related to the above.

Copy link
Member

@MichalCz MichalCz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some change proposals

@@ -9,8 +9,9 @@ import { TransformFunction } from "../ifca";
// thus the workaround with BaseStream and BaseStreamCreators used below.

export interface BaseStream<T extends any> {
map<U>(callback: TransformFunction<T, U>): BaseStream<U>;
filter(callback: TransformFunction<T, Boolean>): BaseStream<T>;
map<U, W extends any[]>(callback: TransformFunction<T, U, W>, ...args: W): BaseStream<U>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit of a question here: how will we know that the input is T from the return value?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be the same as IFCA where we convert T -> U so it's an IFCA<T,U>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if I understand what you mean 🤔

We know the callback input (so each chunk) is of type T based on stream generic type (BaseStream<T>). So for methods we need to define output type since input is defined while constructing new stream instance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like:

type F<T,U> = (chunk: T) => U;

interface IFCA<T,U> {
    write(input:T);
    read(output:U);
}

class BaseStream<T,U=T> {
    protected ifca: IFCA<T,U>

    write(chunk: Parameters<typeof this.ifca.write>[0]);
    read(chunk: ReturnType<typeof this.ifca.read>);

    map(mapper: F<U,V>): BaseStream<T,V>
}

const test = 
    new BaseStream<string>()
        .map(x => parseInt(x))
        .map(x => ({x}) as {x: number})

test.write()
test.read()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, we have discussed it F2F and now I understand what's the issue. At the moment:

new BaseStream<string>()
        .map(x => parseInt(x))
        .map(x => ({x}) as {x: number})

returns BaseStream<number> which means it can accept only numbers. The idea is that any transform operation which returns new (or mutated stream) should return a stream which accepts values of the same type as initial stream and yields values of "requested type". In general:

BaseStream.from<InputType, OutputType>()
    .transform<Type1>()
    .transform<Type2>()
=>
BaseStream<InputType,Type2>

And the idea is we either mutate this (that's what we agreed on now) or alternatively return new stream instance (but then all intermediate streams, created while chaining, should be somehow made unusable).

I will cover this is a separate PR since it's quite significant change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jan-warchol I have extracted it to https://app.clickup.com/t/1j4hr63 (no extensive description for now) but we can discus it if you need any clarifications (I guess it would/should be similar in python after typings are added).

src/streams/base-stream.ts Outdated Show resolved Hide resolved
src/streams/data-stream.ts Outdated Show resolved Hide resolved
src/streams/string-stream.ts Outdated Show resolved Hide resolved
src/streams/string-stream.ts Outdated Show resolved Hide resolved
src/streams/string-stream.ts Outdated Show resolved Hide resolved
src/streams/string-stream.ts Outdated Show resolved Hide resolved
src/streams/string-stream.ts Outdated Show resolved Hide resolved
@f1ames f1ames requested a review from MichalCz October 27, 2021 07:40
@f1ames f1ames marked this pull request as ready for review October 27, 2021 07:51
Base automatically changed from task/test-scenarios to main October 27, 2021 07:59
@f1ames
Copy link
Contributor Author

f1ames commented Oct 27, 2021

@jan-warchol @MichalCz ready for another review 👍

Copy link
Contributor

@jan-warchol jan-warchol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤯
tyle powiem.

test/streams/data/creation.spec.ts Show resolved Hide resolved
@@ -9,8 +9,9 @@ import { TransformFunction } from "../ifca";
// thus the workaround with BaseStream and BaseStreamCreators used below.

export interface BaseStream<T extends any> {
map<U>(callback: TransformFunction<T, U>): BaseStream<U>;
filter(callback: TransformFunction<T, Boolean>): BaseStream<T>;
map<U, W extends any[]>(callback: TransformFunction<T, U, W>, ...args: W): BaseStream<U>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔

src/streams/base-stream.ts Outdated Show resolved Hide resolved
src/types.ts Show resolved Hide resolved
@scramjet-bot
Copy link

@scramjet-bot
Copy link

Task linked: CU-1hyrf4q Przemyśleć generyki

This commit introduces '.flatMap()' method implementation which is
consistent with JavaScript 'Array.prototype.flatMap()'.

Since 'flatMap' is internally 'map -> filter (empty results) -> flatten'
I have reused '.map()' and '.filter()` methods and only added the
missing part which is writing flattened chunks to new DataStream
instance which is then returned. This is done through async generator
which ensures newly created data stream will read new chunks only when
it can keeping back pressure intact.
'StringStream.split' needs to be aware when the stream ends to yield
chunks which may be not yielded yet. This typically occurs when there is
no split sequence present (so all chunks glued together are yielded at
once after last chunks is processed).
This commit corrects the way '.from()' method uses generics by providing
'Constructor' type passed as frist argument. This also results in TS
correctly recognize types in most cases so I have removed
explicit generics from most of the tests.
Since DataStream is in fact the most basic class and provides generic
from there is no need for additional complexity.
@f1ames
Copy link
Contributor Author

f1ames commented Nov 3, 2021

Rebased onto latest main.

@f1ames f1ames merged commit 6f55422 into main Nov 3, 2021
@f1ames f1ames deleted the feature/stringstream-split branch November 3, 2021 10:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants