ETL Spreadsheet project
Is TypeScript the right thing to try for the transformation expression definitions?
Pro: compiler API.
Con: ties us to Node on the back end.
Unsure whether JS/TS types map well to DB types.
Say I move data retrieval etc. into services. What methods do they have?
TransformerService
Simplest option: take an
RowData
Simple dynamic type. Similar to DataTable
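A minimal sketch of what "simple dynamic type" could look like in TypeScript. The `CellValue` union and the `columnsOf` helper are assumptions for illustration; the notes do not say which value types a cell may hold.

```typescript
// Hypothetical set of cell value types; not fixed anywhere in these notes.
type CellValue = string | number | boolean | Date | null;

// A row is a loose bag of named cells, much like a DataTable row.
type RowData = Record<string, CellValue>;

// Collect the column names seen across all rows, in first-seen order.
function columnsOf(rows: RowData[]): string[] {
  const seen = new Set<string>();
  for (const row of rows) {
    for (const column of Object.keys(row)) seen.add(column);
  }
  return Array.from(seen);
}

const rows: RowData[] = [
  { ticker: " MSFT ", price: 410.5 },
  { ticker: "AAPL", price: 189.2, currency: "USD" },
];
const columns = columnsOf(rows);
```

Rows need not share the same columns here, which is the main difference from a DataTable with a fixed schema.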
Transformation
Abstract class with a Transform(data: RowData[]): RowData[] function.
ScalarTransformation extends Transformation
Class which performs one-to-one operations on each row, and returns the same number of rows.
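One way the two classes could fit together, as a sketch. The method is spelled `transform` in lower camel case here, and `transformRow` and `UpperCaseColumn` are hypothetical names that do not appear in the notes.

```typescript
type RowData = Record<string, unknown>;

abstract class Transformation {
  abstract transform(data: RowData[]): RowData[];
}

abstract class ScalarTransformation extends Transformation {
  // Subclasses only describe what happens to a single row.
  protected abstract transformRow(row: RowData): RowData;

  // One output row per input row, so the row count never changes.
  transform(data: RowData[]): RowData[] {
    return data.map((row) => this.transformRow(row));
  }
}

// Hypothetical example: adds an upper-cased copy of a column.
class UpperCaseColumn extends ScalarTransformation {
  private readonly source: string;
  private readonly target: string;

  constructor(source: string, target: string) {
    super();
    this.source = source;
    this.target = target;
  }

  protected transformRow(row: RowData): RowData {
    const value = row[this.source];
    const upper = typeof value === "string" ? value.toUpperCase() : value;
    return { ...row, [this.target]: upper };
  }
}

const result = new UpperCaseColumn("ticker", "tickerUpper").transform([
  { ticker: "msft" },
  { ticker: "aapl" },
]);
```

Putting the `map` in the base class is what enforces "same number of rows"; a subclass cannot accidentally filter or duplicate.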
Question: This works well if I'm always creating new columns. Do I want to allow changes to existing columns? Sometimes transforming an existing column really is the right thing to do. But would that be confusing?
Yes, I think I want to let people set formulas for existing columns. Won't be very confusing. By default, columns copied from previous steps have a grey "N/A" in the formula field. Say we want to trim a ticker column: could give it a trim($self) or trim($ticker) formula. Self-references will only be valid for columns generated by previous steps, not new ones.
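A rough sketch of the self-reference rule, assuming formulas are modelled as plain functions rather than parsed strings. `ColumnFormula` and `applyToExistingColumn` are hypothetical names; the `self` argument stands in for `$self` and `row` gives access to named columns such as `$ticker`.

```typescript
type RowData = Record<string, unknown>;

// Hypothetical formula shape: `self` plays the role of $self.
type ColumnFormula = (self: unknown, row: RowData) => unknown;

// Stand-in for the trim() function mentioned in the notes.
const trim = (value: unknown): unknown =>
  typeof value === "string" ? value.trim() : value;

// Overwrites an existing column; refuses columns that no previous step produced.
function applyToExistingColumn(
  rows: RowData[],
  column: string,
  formula: ColumnFormula,
): RowData[] {
  return rows.map((row) => {
    if (!(column in row)) {
      throw new Error(`$self is not valid for new column "${column}"`);
    }
    return { ...row, [column]: formula(row[column], row) };
  });
}

const input: RowData[] = [{ ticker: "  MSFT " }, { ticker: "AAPL\t" }];
// trim($self)
const viaSelf = applyToExistingColumn(input, "ticker", (self) => trim(self));
// trim($ticker)
const viaName = applyToExistingColumn(input, "ticker", (_self, row) => trim(row["ticker"]));
```

Both spellings give the same result for the column being edited, and the input rows are left unmodified.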
Question: are circular formulae a problem?
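If they do turn out to be a problem, one possible check is a depth-first search over formula dependencies. This is only a sketch: the `Dependencies` map is a hypothetical input that assumes column references have already been extracted from each formula, and it assumes a `$self` read of a previous step's column is left out of the map, since that reads the old value rather than the new one.

```typescript
// Hypothetical input: for each new column, the columns its formula reads.
type Dependencies = Record<string, string[]>;

// Returns one cycle as a list of column names, or null if there is none.
function findCycle(deps: Dependencies): string[] | null {
  const done = new Set<string>(); // columns already fully explored
  const path: string[] = [];      // columns on the current search path

  const visit = (column: string): string[] | null => {
    const at = path.indexOf(column);
    if (at !== -1) return [...path.slice(at), column]; // back on our own path
    if (done.has(column)) return null;
    path.push(column);
    for (const next of deps[column] ?? []) {
      const cycle = visit(next);
      if (cycle) return cycle;
    }
    path.pop();
    done.add(column);
    return null;
  };

  for (const column of Object.keys(deps)) {
    const cycle = visit(column);
    if (cycle) return cycle;
  }
  return null;
}

const ok = findCycle({ total: ["price", "qty"], taxed: ["total"] });
const bad = findCycle({ a: ["b"], b: ["c"], c: ["a"] });
```

With a single formula per step the map has one entry, so the check only matters once several formulae share a step.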
Can I start small? Single formula/transformation per step, keep it simple.
TransformerService class takes in a formula, applies it to RowData[], gives result.
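A minimal sketch of that service for the single-formula case. The `Formula` interface (a target column plus an `evaluate` function) is an assumption; the notes do not say how a formula is represented.

```typescript
type RowData = Record<string, unknown>;

// Hypothetical formula shape: which column to write, and how to compute it.
interface Formula {
  column: string;
  evaluate: (row: RowData) => unknown;
}

class TransformerService {
  // Applies one formula to every row and returns new rows.
  apply(formula: Formula, rows: RowData[]): RowData[] {
    return rows.map((row) => ({
      ...row,
      [formula.column]: formula.evaluate(row),
    }));
  }
}

const service = new TransformerService();
const output = service.apply(
  {
    column: "notional",
    evaluate: (row) => Number(row["price"]) * Number(row["qty"]),
  },
  [
    { price: 10, qty: 3 },
    { price: 2.5, qty: 4 },
  ],
);
```

Keeping the service stateless means each step can be re-run on its own when an earlier step changes.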
Transformation types:
transform(row): row | row[] | undefined
row is 1 to 1
row[] is 1 to many
undefined is 1 to 0 (a filter)
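The three return shapes can be handled by one small driver that flattens the results. A sketch, with `RowTransform` and `applyTransform` as hypothetical names:

```typescript
type RowData = Record<string, unknown>;
type RowTransform = (row: RowData) => RowData | RowData[] | undefined;

function applyTransform(rows: RowData[], transform: RowTransform): RowData[] {
  const out: RowData[] = [];
  for (const row of rows) {
    const result = transform(row);
    if (result === undefined) continue; // 1 to 0: row is filtered out
    if (Array.isArray(result)) {
      out.push(...result);              // 1 to many
    } else {
      out.push(result);                 // 1 to 1
    }
  }
  return out;
}

const source: RowData[] = [
  { tickers: "MSFT,AAPL", qty: 5 },
  { tickers: "", qty: 0 },
  { tickers: "IBM", qty: 2 },
];

// Filter: drop rows with no quantity.
const filtered = applyTransform(source, (row) =>
  Number(row["qty"]) > 0 ? row : undefined,
);

// One to many: one output row per ticker in the comma-separated list.
const exploded = applyTransform(filtered, (row) =>
  String(row["tickers"])
    .split(",")
    .map((ticker) => ({ ticker, qty: row["qty"] })),
);
```

An empty array and `undefined` both produce zero rows here, so a filter could return either.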
Column formulae are a special case that can be aggregated and converted to a regular 1-to-1 transformation. How to handle these visually?
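The aggregation could be sketched like this, assuming every column formula reads only the incoming row (never another formula's output from the same step), so evaluation order does not matter. `ColumnFormulae` and `toRowTransform` are hypothetical names.

```typescript
type RowData = Record<string, unknown>;

// Hypothetical shape: one formula per output column.
type ColumnFormulae = Record<string, (row: RowData) => unknown>;

// Fold a set of per-column formulae into a single row -> row function.
function toRowTransform(formulae: ColumnFormulae): (row: RowData) => RowData {
  return (row) => {
    const next: RowData = { ...row }; // columns without a formula are copied over
    for (const [column, formula] of Object.entries(formulae)) {
      next[column] = formula(row);    // always reads the original row
    }
    return next;
  };
}

const step = toRowTransform({
  ticker: (row) => String(row["ticker"]).trim(),
  notional: (row) => Number(row["price"]) * Number(row["qty"]),
});

const stepInput: RowData[] = [{ ticker: " MSFT ", price: 10, qty: 3 }];
const stepOutput = stepInput.map((row) => step(row));
```

The result has the same shape as any other 1-to-1 transformation, so the rest of the pipeline would not need to know it came from several column formulae.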