gristle_slicer
ken farmer edited this page May 15, 2021
·
6 revisions
gristle_slicer is a hybrid of the Unix cut utility and Python slicing syntax.
The user can provide two types of criteria: inclusion and exclusion. Inclusion is applied first, then exclusion. Both types of criteria can be applied to columns as well as rows.
For example, a typical and simple usage would be to get columns 0-4, 8-9, and 20-25 from the first 1,000 rows, rows 10,001-19,999, and the last row of a file, while skipping rows 100-199. This would be expressed like this:
$ gristle_slicer -i sample.csv -c 0:26 -C "5:8, 10:20" -r ":1000, 10001:20000, -1" -R 100:200
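This expression first selects columns:
- first it includes (-c) a large range of columns (0:26, which covers columns 0 through 25 because the end of a range is exclusive, as in Python slicing)
- then it excludes (-C) two smaller ranges within it (columns 5 through 7 and 10 through 19)
Then it selects rows:
- first it includes (-r) the first 1,000 rows (the left side of the colon is blank, so the range starts at row 0 and runs through row 999), then rows 10,001 through 19,999, and finally the last row (-1)
- then from the above-selected list of rows it excludes (-R) rows 100 through 199 (based on row offsets of the original file).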
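The include-then-exclude logic described above can be sketched in a few lines of Python. This is only an illustration of the semantics, not gristle_slicer's actual implementation: the function names parse_spec and select are hypothetical, and the file sizes (30 columns, 25,000 rows) are assumed for the sake of the example.

```python
def parse_spec(spec, length):
    """Expand a spec such as ":3, 5:7, -1" into a list of offsets.

    Hypothetical helper: each comma-separated item is treated either as a
    Python-style slice (end exclusive) or as a single, possibly negative, index.
    """
    offsets = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if ":" in part:
            # Blank sides of the colon become None, exactly as in Python slicing
            pieces = [int(p) if p.strip() else None for p in part.split(":")]
            offsets.extend(range(length)[slice(*pieces)])
        else:
            # Indexing a range resolves negative offsets such as -1
            offsets.append(range(length)[int(part)])
    return offsets


def select(length, include=":", exclude=""):
    """Apply inclusion first, then exclusion, both against original offsets."""
    excluded = set(parse_spec(exclude, length))
    return [i for i in parse_spec(include, length) if i not in excluded]


# Assumed sizes: a file with 30 columns and 25,000 rows
cols = select(30, "0:26", "5:8, 10:20")
rows = select(25000, ":1000, 10001:20000, -1", "100:200")
print(cols)  # [0, 1, 2, 3, 4, 8, 9, 20, 21, 22, 23, 24, 25]
```

Because exclusion is evaluated against the original offsets rather than against the already-included list, rows 100 through 199 are dropped regardless of where they would fall in the included set.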
Alternatively, one could use the field names from the file header rather than the numeric field positions:
- So, a simple example like: gristle_slicer -i sample.csv -c "0:26, 32, 45"
- Becomes: gristle_slicer -i sample.csv -c "id:home_state, dept, lname"
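One way to picture the name-based form is as a lookup from header names to positions before the spec is evaluated. The sketch below assumes exactly that; the function names_to_positions and the header contents are hypothetical, and how gristle_slicer resolves names internally may well differ.

```python
def names_to_positions(spec, header):
    """Replace field names in a spec with their positions in the header.

    Hypothetical helper: numeric items and blank slice sides pass through unchanged.
    """
    positions = {name: str(i) for i, name in enumerate(header)}
    parts = []
    for part in spec.split(","):
        tokens = [positions.get(tok.strip(), tok.strip()) for tok in part.split(":")]
        parts.append(":".join(tokens))
    return ", ".join(parts)


# Assumed header: 46 fields with the named ones at the positions used above
header = [f"col{i}" for i in range(46)]
header[0], header[26], header[32], header[45] = "id", "home_state", "dept", "lname"

print(names_to_positions("id:home_state, dept, lname", header))  # 0:26, 32, 45
```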
For more examples of this tool in action see:
- this posting
- the project examples