- Add Promise and Async / Await support
- Add asynchronous line by line processing support
- Built-in TypeScript support
- Output format options
- Async Hooks Support
- Performance Improvement
- Dropped support for Node.js < 4
- The `csv`, `json`, `record_parsed`, and `end_parsed` events were replaced by `.subscribe` and `.then`
- Worker has been removed
- `fromFile` / `fromStream` / `fromString` no longer accept a callback. Use `.then` instead
- `ignoreColumns` and `includeColumns` now accept only a RegExp
- `.transf` is removed
- `.preRawData` uses a Promise instead of a callback
- Removed the `toArrayString` parameter
- Line numbers now start from 0 instead of 1
- Moved the Converter constructor
- The `end` event will not emit if there is no downstream
```js
// Promise
csv()
.fromFile(myCSVFilePath)
.then((jsonArray)=>{
  // jsonArray contains the converted rows
}, errorHandle);

// async / await
const jsonArray = await csv().fromFile(myCSVFilePath);

// Promise chain (assumes a Promise-based HTTP client, e.g. request-promise)
request.get(csvUrl)
.then((csvdata)=>{
  return csv().fromString(csvdata);
})
.then((jsonArray)=>{
  // jsonArray contains the converted rows
});
```
```js
// async process
csv()
.fromFile(csvFilePath)
.subscribe((json, lineNumber)=>{
  return new Promise((resolve, reject)=>{
    // process the json line asynchronously,
    // then call resolve() to continue parsing
  });
}, onError, onComplete);

// sync process
csv()
.fromFile(csvFilePath)
.subscribe((json, lineNumber)=>{
  // process the json line synchronously
}, onError, onComplete);
```
```ts
// type definitions ship with the package (csvtojson/index.d.ts)
import csv from "csvtojson";

// csv data
const csvStr = "a,b,c\n1,2,3";

let result = await csv().fromString(csvStr);
/**
 * result is a json array:
 * [{
 *   a: "1",
 *   b: "2",
 *   c: "3"
 * }]
 */

result = await csv({ output: "csv", noheader: true }).fromString(csvStr);
/**
 * result is an array of csv rows:
 * [
 *   ["a","b","c"],
 *   ["1","2","3"]
 * ]
 */

result = await csv({ output: "line", noheader: true }).fromString(csvStr);
/**
 * result is an array of csv lines as strings
 * (including line breaks inside cells, if any):
 * [
 *   "a,b,c",
 *   "1,2,3"
 * ]
 */
```
```js
csv().fromFile(csvFile)
.preRawData((data)=>{
  // async: return a Promise that resolves with the new data
  return new Promise((resolve, reject)=>{
    // async process, then resolve(newData)
  });
  // or sync: return the new data directly
  // return data.replace("a","b");
});

csv().fromFile(csvFile)
.preFileLine((fileLine, lineNumber)=>{
  // async: return a Promise that resolves with the new line
  return new Promise((resolve, reject)=>{
    // async process, then resolve(newLine)
  });
  // or sync: return the new line directly
  // return fileLine.replace("a","b");
});
```
`.transf` has been replaced by `.subscribe`. See below.
When converting to a json array, v2 is around 8-10 times faster than v1.
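As a rough way to see this for yourself, conversion time can be measured as below. This is a minimal timing sketch, not the project's official benchmark, and the input file path is hypothetical:

```js
// a minimal timing sketch -- not an official benchmark
const csv = require("csvtojson");

const start = Date.now();
csv()
.fromFile("./large-sample.csv") // hypothetical large CSV file
.then((jsonArray)=>{
  console.log(`parsed ${jsonArray.length} rows in ${Date.now() - start} ms`);
});
```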
There are many exciting changes in csvtojson v2. However, as a major release, it introduces some breaking changes.
From v2.0.0, csvtojson only supports Node.js >= 4.0.0.
From v2.0.0, the events above are replaced by the `.subscribe` and `.then` methods. The output format is controlled by the `output` parameter, which can be `json`, `csv`, or `line`.
Below are some examples of the code changes:
```js
// before -- get json object
csv().fromString(myCSV).on("json", function(json){});
csv().fromString(myCSV).on("record_parsed", function(json){});
// now
csv().fromString(myCSV).subscribe(function(json){});

// before -- get csv row
csv().fromString(myCSV).on("csv", function(csvRow){});
// now
csv({output:"csv"}).fromString(myCSV).subscribe(function(csvRow){});

// before -- get final json array
csv().fromString(myCSV).on("end_parsed", function(jsonArray){});
// now
csv().fromString(myCSV).then(function(jsonArray){}); // Promise
const jsonArray = await csv().fromString(myCSV); // async / await
```
The Worker feature made sense for the command line, where it could utilize multiple CPU cores to speed up processing of large csv files. However, it did not quite work as expected, mainly because coordinating the results of multiple processes is very complex. The inter-process communication also adds too much overhead, which minimizes the benefit gained from spawning workers. Thus, in version 2.0.0 I decided to temporarily remove the Worker feature and rethink how to better utilize multiple CPU cores.
Before:

```js
csv().fromFile(myFile, function(err, jsonArr){});
```

After:

```js
// Promise
csv().fromFile(myFile).then(function(jsonArr){}, function(err){});

// async / await
const jsonArr = await csv().fromFile(myFile);
```
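With `async` / `await`, a failed conversion rejects the promise and surfaces as a thrown exception, so it can be handled with `try` / `catch`. A minimal sketch (`loadCsv` is a hypothetical helper):

```js
// a minimal sketch: handling conversion errors with async / await
async function loadCsv(filePath) {
  try {
    return await csv().fromFile(filePath);
  } catch (err) {
    // file read or parse errors reject the promise and land here
    console.error("csv conversion failed:", err);
    return [];
  }
}
```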
Before:

```js
csv({
  ignoreColumns: ["gender","age"]
})
```

Now:

```js
csv({
  ignoreColumns: /gender|age/
})
```
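The same rule applies to `includeColumns`, for example (the column names here are hypothetical):

```js
// includeColumns also takes only a RegExp now (hypothetical column names)
csv({
  includeColumns: /name|email/
})
```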
`.transf` was used purely for result transformation and had very poor performance. It is now recommended to use `.subscribe` instead.
Before:

```js
csv()
.transf((jsonObj)=>{
  jsonObj.myNewKey = 'some value';
}).pipe(downstream);
```

After:

```js
csv()
.subscribe((jsonObj)=>{
  jsonObj.myNewKey = 'some value';
}).pipe(downstream);
```
Before:

```js
csv()
.preRawData((csvRawData, cb)=>{
  var newData = csvRawData.replace('some value','another value');
  cb(newData);
})
```

After:

```js
csv()
.preRawData((csvRawData)=>{
  var newData = csvRawData.replace('some value','another value');
  // synchronously
  return newData;
  // or asynchronously
  // return Promise.resolve(newData);
})
```
The `toArrayString` parameter has been removed; this feature was mostly unused.
The first row of the csv is now always indexed as 0, no matter whether it is a header row or not.
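For example, with the `preFileLine` hook shown earlier, the header row arrives as line 0. A minimal sketch:

```js
// a minimal sketch: lineNumber is zero-based, so the header row is line 0
csv()
.fromString("a,b,c\n1,2,3")
.preFileLine((fileLine, lineNumber)=>{
  // lineNumber === 0 for "a,b,c", 1 for "1,2,3"
  return fileLine;
})
.then((jsonArray)=>{});
```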
The `end` event, by definition, emits when there is no more data to be consumed from the stream. Thus it will not emit if there is no downstream after the parser. To be notified when parsing has finished, use the `done` event instead.
```js
// before
csv().on("end", ()=>{});
// now
csv().on("done", ()=>{});
```
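Conversely, if the parser is piped to a downstream, `end` still emits once that downstream has consumed all the data. A sketch (the output path is hypothetical):

```js
// a sketch: with a downstream attached, "end" still emits
const fs = require("fs");

const parser = csv().fromFile(csvFilePath);
parser.pipe(fs.createWriteStream("./output.json")); // hypothetical output path
parser.on("end", ()=>{
  // all parsed output has been consumed by the downstream
});
```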