Introduce "StringStream.split()" method #23
Conversation
Some change proposals
@@ -9,8 +9,9 @@ import { TransformFunction } from "../ifca";
 // thus the workaround with BaseStream and BaseStreamCreators used below.

 export interface BaseStream<T extends any> {
-    map<U>(callback: TransformFunction<T, U>): BaseStream<U>;
     filter(callback: TransformFunction<T, Boolean>): BaseStream<T>;
+    map<U, W extends any[]>(callback: TransformFunction<T, U, W>, ...args: W): BaseStream<U>;
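A hypothetical usage sketch of the new variadic-args `map` signature. The `TransformFunction` shape below is an assumption for illustration; the real definition lives in `../ifca` and may differ (e.g. it may also allow async results):

```typescript
// Assumed shape of TransformFunction -- illustration only, the actual type
// comes from ../ifca and may also allow Promise<U> results.
type TransformFunction<T, U, W extends any[] = []> = (chunk: T, ...args: W) => U;

// A transform taking one extra bound argument:
const addPrefix: TransformFunction<string, string, [string]> =
    (chunk, prefix) => prefix + chunk;

// With the proposed map<U, W> signature, the extra arguments passed after the
// callback are type-checked against W:
//   stream.map(addPrefix, ">> ");  // W inferred as [string]
```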
A bit of a question here: how will we know that the input is T from the return value?
It should be the same as IFCA where we convert T -> U
so it's an IFCA<T,U>
Not sure if I understand what you mean 🤔 We know the callback input (so each chunk) is of type T based on the stream generic type (BaseStream<T>). So for methods we only need to define the output type, since the input type is defined while constructing a new stream instance.
Something like:

type F<T, U> = (chunk: T) => U;

interface IFCA<T, U> {
    write(input: T): void;
    read(): U;
}

class BaseStream<T, U = T> {
    protected ifca: IFCA<T, U>;
    write(chunk: Parameters<typeof this.ifca.write>[0]): void;
    read(): ReturnType<typeof this.ifca.read>;
    map<V>(mapper: F<U, V>): BaseStream<T, V>;
}
const test = new BaseStream<string>()
    .map(x => parseInt(x))
    .map(x => ({ x }) as { x: number });

test.write()
test.read()
Ok, we have discussed it F2F and now I understand what the issue is. At the moment:

new BaseStream<string>()
    .map(x => parseInt(x))
    .map(x => ({ x }) as { x: number })

returns BaseStream<number>, which means it can accept only numbers. The idea is that any transform operation which returns a new (or mutated) stream should return a stream which accepts values of the same type as the initial stream and yields values of the "requested" type. In general:

BaseStream.from<InputType, OutputType>()
    .transform<Type1>()
    .transform<Type2>()
=>
BaseStream<InputType, Type2>

And the idea is that we either mutate this (that's what we agreed on for now) or alternatively return a new stream instance (but then all intermediate streams, created while chaining, should somehow be made unusable).
I will cover this in a separate PR since it's quite a significant change.
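The "mutate this" idea could be sketched as follows. This is a minimal assumed BaseStream, not the actual implementation: each transform records its callback and retypes `this`, so the chained stream still accepts values of the initial input type.

```typescript
// Minimal sketch: a stream that accepts In and yields Out (illustrative only).
class BaseStream<In, Out = In> {
    private transforms: Array<(chunk: any) => any> = [];

    // Record the callback and retype `this` -- the input type In is preserved,
    // only the output type changes.
    map<NewOut>(fn: (chunk: Out) => NewOut): BaseStream<In, NewOut> {
        this.transforms.push(fn);
        return this as unknown as BaseStream<In, NewOut>;
    }

    // Toy synchronous write that runs the whole chain on one chunk.
    write(chunk: In): Out {
        return this.transforms.reduce((acc, fn) => fn(acc), chunk as any);
    }
}

const stream = new BaseStream<string>()
    .map(x => parseInt(x, 10))
    .map(x => ({ x }));
// stream: BaseStream<string, { x: number }> -- it still accepts strings.
```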
🤔
@jan-warchol I have extracted it to https://app.clickup.com/t/1j4hr63 (no extensive description for now) but we can discuss it if you need any clarifications (I guess it would/should be similar in Python after typings are added).
@jan-warchol @MichalCz ready for another review 👍
🤯
That's all I'll say.
🤔
Task linked: CU-1hyrf4q Rethink generics
This commit introduces the '.flatMap()' method implementation, which is consistent with JavaScript 'Array.prototype.flatMap()'. Since 'flatMap' is internally 'map -> filter (empty results) -> flatten', I have reused the '.map()' and '.filter()' methods and only added the missing part, which is writing flattened chunks to a new DataStream instance which is then returned. This is done through an async generator, which ensures the newly created data stream will read new chunks only when it can, keeping backpressure intact.
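The "map -> flatten through an async generator" mechanism described above can be sketched roughly as below; `flatten` and its shape are illustrative assumptions, not the actual DataStream internals:

```typescript
// Illustrative flatten step: maps each chunk to an array and yields its
// elements one by one. Because this is a generator, the next source chunk is
// pulled only when the consumer asks for more, which keeps backpressure intact.
async function* flatten<T, U>(
    source: AsyncIterable<T>,
    callback: (chunk: T) => U[] | Promise<U[]>
): AsyncGenerator<U> {
    for await (const chunk of source) {
        for (const item of await callback(chunk)) {
            yield item;
        }
    }
}
```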
'StringStream.split' needs to be aware of when the stream ends in order to yield chunks which may not have been yielded yet. This typically occurs when there is no split sequence present (so all chunks glued together are yielded at once after the last chunk is processed).
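The end-of-stream requirement can be illustrated with a minimal splitter; this is an assumption-laden sketch, not the actual StringStream code:

```typescript
// Illustrative splitter: buffers incoming chunks and yields complete parts.
async function* split(
    source: AsyncIterable<string>,
    separator: string
): AsyncGenerator<string> {
    let buffer = "";
    for await (const chunk of source) {
        buffer += chunk;
        let idx: number;
        while ((idx = buffer.indexOf(separator)) !== -1) {
            yield buffer.slice(0, idx);
            buffer = buffer.slice(idx + separator.length);
        }
    }
    // End of stream: yield whatever is left. If no separator was ever seen,
    // this is the whole input glued together, emitted only now.
    yield buffer;
}
```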
This commit corrects the way the '.from()' method uses generics by providing a 'Constructor' type passed as the first argument. This also results in TS correctly recognizing types in most cases, so I have removed explicit generics from most of the tests.
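The Constructor-generic idea could look roughly like this; the class shapes below are hypothetical, simplified stand-ins, not the actual scramjet code:

```typescript
// Hypothetical Constructor type, as described in the commit message above.
type Constructor<S> = new (items: any[]) => S;

class DataStream<T> {
    constructor(protected items: T[]) {}

    // The concrete subclass constructor is captured through `this`, so calling
    // StringStream.from(...) is typed as StringStream, not the base DataStream.
    static from<S extends DataStream<any>>(this: Constructor<S>, items: any[]): S {
        return new this(items);
    }

    toArray(): T[] {
        return this.items;
    }
}

class StringStream extends DataStream<string> {}

const created = StringStream.from(["a", "b"]); // inferred as StringStream
```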
Since DataStream is in fact the most basic class and provides the generic 'from', there is no need for additional complexity.
force-pushed from 7b167c9 to d653507

Rebased onto latest
This PR introduces:
- DataStream.flatMap() method
- StringStream class
- StringStream.split() method

It also contains some changes regarding the BDD test setup (cucumber.js), since the main idea behind it was to provide a BDD test for file processing (read -> split -> filter -> aggregate -> reduce), and so it required quite a lot of changes. At the moment there is still no aggregate and reduce, but I will most probably add them in separate PRs since this one got quite big anyway.

DataStream.flatMap() method has been introduced in dc9ad38 (with some minor corrections added later).

As for StringStream and its .split() method:

I also did some refactoring here and there, so there are some changes not directly related to the above.