Croptype #71

Open
wants to merge 193 commits into
base: main
Changes from 4 commits
Commits
bf5dda2
added code for multiclass finetuning Presto for croptype task
May 31, 2024
7fd1dde
fixed result collection for sklearn models
May 31, 2024
6090b9f
added hierarchical classifier v.0 to downstream models
Jun 12, 2024
ea4b2b6
added patches for handling valid_date as token; added more updated de…
Jun 19, 2024
7285c38
major change: spatial prediction for croptype + looots of minor changes
Jul 2, 2024
1601dbc
replaced confusing argument name; improved formatting
Aug 5, 2024
586db66
added valid_month parameters to default config
Aug 5, 2024
532ec67
added placeholder for loading finetuned model
Aug 5, 2024
a350280
class name constructed from task_type; cleaned unused pieces
Aug 5, 2024
0cf1e60
added a line to mask latlons for hackey generalizability test
Aug 5, 2024
725e2af
formatting & cleaning
Aug 5, 2024
5f5d296
updated test split files
Aug 5, 2024
6ef5c14
bug fix
Aug 7, 2024
2440203
bug fixes
Aug 7, 2024
9b6088d
bug fixes and default argument updates
Aug 7, 2024
4417a3b
implemented simple balancing for croptype; CAUTION: makes training MU…
Aug 7, 2024
38907b0
switched class balancing from finetune_class to a new balanced_class;…
Aug 7, 2024
406fb16
Merge branch 'main' into croptype
kvantricht Aug 7, 2024
ee1b2c5
black fixes
kvantricht Aug 7, 2024
85e336e
isort fixes
kvantricht Aug 7, 2024
be27740
Black fix
kvantricht Aug 7, 2024
6f25105
Black fix
kvantricht Aug 7, 2024
8795d27
changed default from None to empty string, according to mypy suggesti…
Aug 8, 2024
019f37c
added missing imports
Aug 8, 2024
e682da2
changed conflicting variable name
Aug 8, 2024
f7790a4
fixed expected output types
Aug 8, 2024
e173157
removed duplicated function
Aug 8, 2024
6a9b07f
edited model loading to use updated loading function
Aug 8, 2024
e691a32
bug fix and mypy fixes
Aug 8, 2024
4ff8846
mypy typing fixes
Aug 8, 2024
d1c1687
another round of black fixes
Aug 8, 2024
c53c1b8
flake fixes and additional cleanup
Aug 8, 2024
4d9c39d
minor changes to pass ruff checks
Aug 8, 2024
4f0d8c7
formatting fixes
Aug 12, 2024
c684c2a
added balance as an argument
Aug 13, 2024
297b44d
added other_class to CROPTYPE19
Aug 13, 2024
b511938
added valid_time attribute handling for compatibility of Phase I and …
Aug 13, 2024
0c9e34b
fixed parsing balance as a bool
Aug 13, 2024
2c5e778
bug fixes
Aug 13, 2024
788d444
isort version fixes 🤦‍♀️
Aug 13, 2024
3afa345
another version black fix 😪
Aug 13, 2024
aead839
removed unused import
Aug 13, 2024
f79a904
removed unnecessary assignment
Aug 13, 2024
d5f4206
Merge branch 'main' into croptype
kvantricht Aug 20, 2024
3a27024
Merge branch 'main' into croptype
kvantricht Aug 21, 2024
4bc5995
Merge branch 'main' into croptype
kvantricht Aug 22, 2024
c0c06ee
fixes to pass test_dataset tests
Aug 20, 2024
d9cbc0c
fixes for test_eval tests
Aug 20, 2024
29edb75
fixes to pass test_presto tests
Aug 21, 2024
959269c
fixed reversed mapping of VV and VH 🤦‍♀️
Aug 21, 2024
748c40e
new asserts for new output format
Aug 21, 2024
5ac1448
added croptype eval test
Aug 22, 2024
66751c2
modified lr to be different for cropland/croptype training
Aug 22, 2024
3a68645
changed to cleaner json handling
Aug 22, 2024
f6a5cec
updated test_df to include more croptype samples for croptype tests
Aug 22, 2024
bf35cf4
isort fixes
Aug 22, 2024
ed85037
changing lr for croptype properly
Aug 22, 2024
f82be5a
added spatial inference test for croptype prediction
Aug 22, 2024
57b5099
formatting
Aug 22, 2024
753dc9a
resolving black version formatting
Aug 22, 2024
4f40b1f
resolving black version formatting v.2 😔
Aug 22, 2024
df8dcb8
added hiclass package to requirements
Aug 22, 2024
d72d819
Merge branch 'main' into croptype
kvantricht Aug 23, 2024
db561fa
uncommented model saving 🤦‍♀️ + minor changes
Sep 2, 2024
8a37d6e
substituted ifs with elifs as per Gabis suggestion
Sep 2, 2024
f5df76b
completely ignore catboost_info folder
Sep 2, 2024
751a919
removed catboost info folder
Sep 2, 2024
158012f
removed unnecessary commented lines
Sep 2, 2024
643c129
moved target_crop method into the WorldCerealLabelledDataset class as…
Sep 2, 2024
74525fc
breaking long lines
Sep 2, 2024
9043e8c
isort fixes
Sep 2, 2024
d0ada7a
putting target_crop back into WorldCerealBase class ☹
Sep 2, 2024
19e51b6
formatting fixes
Sep 2, 2024
59d2b3d
fixed computing valid_date_ind so that it's more robust; added fillin…
Sep 2, 2024
181ae8b
introduced MIN_SAMPLES_PER_CLASS parameter so that it can be reused i…
Sep 4, 2024
12ce57c
added additional balancing parameters; optimal values TBD
Sep 4, 2024
80b8313
add nans handling in metrics calculation
Sep 4, 2024
48c17e8
disentangled the device confusion in tests. Thanks Gabi!
Sep 4, 2024
75ff5f6
moved target_crop into WorldCerealLabelledDataset
Sep 4, 2024
47f581c
replaced model_mode parameter with a more transparent one; done some …
Sep 4, 2024
0f6df1d
formatting fixes
Sep 4, 2024
523014f
removed process_parquet function to utils
Sep 10, 2024
5c74855
addeded augment parameter
Sep 10, 2024
cb197a3
added function for timeseries subsetting, so that it is centered arou…
Sep 10, 2024
9e9ac9d
added augment parameter; replaced default link to new parquet file; a…
Sep 10, 2024
b3e0284
major rework of process_parquet function; minimal viable functionality
Sep 10, 2024
1d37536
moved MIN_EDGE_BUFFER parameter from utils to dataset.py
Sep 11, 2024
ae27e25
added logger message about enabled augmentation
Sep 11, 2024
8d7e4c1
removed augment=False parameter from evaluate function, since it is a…
Sep 11, 2024
b2f1aa3
rephrased checking if valid_date is too close to the edge without mes…
Sep 11, 2024
ff25509
bugs and typos fixes
Sep 11, 2024
e338c09
moved NODATA and MIN_EGDE parameters to dataops.py to avoid circular …
Sep 11, 2024
c9ffa01
updated test dataset to use new ong parquet format
Sep 11, 2024
f6e1a9e
updated tests
Sep 11, 2024
407cec9
created separate test file for process_parquet function
Sep 11, 2024
ad05b3d
an attempt to make time_token shift more general than just for months
Sep 11, 2024
91bfd27
merging main to croptype
Sep 20, 2024
e1b90f7
black fix
Sep 20, 2024
2945e9d
merging changes from main
Sep 20, 2024
ef06f94
black fix
Sep 20, 2024
30ab19c
adding test long parquet file
Sep 23, 2024
96dbc0d
fixed test file path
Sep 23, 2024
55dbbbe
isort fix
Sep 23, 2024
a7eedd8
fixed test and commented lines that will not be needed after merge
Sep 23, 2024
6f7646f
Formatting
kvantricht Sep 23, 2024
203c4ac
Formatting
kvantricht Sep 23, 2024
465d65a
making GT values binary crop/nocrop
Sep 23, 2024
54cc2be
Test with 1 epoch finetuning
kvantricht Sep 23, 2024
e44445a
Merge branch 'using-new-parquet-in-train' of github.com:WorldCereal/p…
kvantricht Sep 23, 2024
1a957a2
Bump einops version
kvantricht Sep 23, 2024
c83f7dc
created different py files for ss training and finetuning
Sep 24, 2024
5bf3743
fixed plotting functionality for new patches format
Sep 24, 2024
713a15a
removed unnecessary line
Sep 24, 2024
f8d9f84
fixed masking bug
Sep 25, 2024
c702081
fixed usage of time token during finetuning
Sep 25, 2024
100e606
added milder handling for lower mask_ratios
Sep 25, 2024
eda6d37
added logger messaging about balancing
Sep 25, 2024
eed2463
fixed bug in plotting
Sep 26, 2024
36f75a4
bug fixes and cleanup
Sep 26, 2024
2c0c325
added logging for masking and time token usage
Sep 26, 2024
e205514
bug fixes and cleanup
Sep 26, 2024
b216d3f
fixed SSL
Sep 26, 2024
aba07b0
added basic test to check balancing
Sep 26, 2024
9dab52d
fixed timestep_positions function for ssl
Sep 26, 2024
579da2f
added tests for temporal shift
Sep 26, 2024
1f6cb27
test fixes
Sep 27, 2024
8215473
isort fixes
Sep 27, 2024
14b1abe
formatting fixes
Sep 27, 2024
14cdce0
another version of black fixes 🤦‍♀️
Sep 27, 2024
767b187
Bump version
kvantricht Sep 30, 2024
b69eb3f
dont import matplotlib globally
kvantricht Oct 1, 2024
75cea4b
#108 avoid global import of `CLASS_MAPPINGS`
kvantricht Oct 1, 2024
08313a7
Remove unused import
kvantricht Oct 1, 2024
b25537b
Run tests with less CatBoost iterations
kvantricht Oct 1, 2024
e7147f3
Formatting fix
kvantricht Oct 1, 2024
094fc2b
added handling of corner case when during SSL we only have 12 timeste…
Oct 1, 2024
7eaa8ba
added a slightly better explanation of valid_position variable
Oct 1, 2024
fb255cf
fixed ndvi masking
Oct 1, 2024
bf76d57
formatting fixes
Oct 1, 2024
64c84fc
Allow running inference without valid_date token
kvantricht Oct 4, 2024
188f093
bug fix
Oct 4, 2024
f37314a
added proper NDVI masking to InferenceDataset + test
Oct 4, 2024
c566f37
formatting fixes
Oct 4, 2024
891a7ac
formatting fixes
Oct 4, 2024
6100d33
added corrected patch that starts with first day of month
Oct 4, 2024
ea28bf5
regenerated test features file
Oct 4, 2024
6c8c0a0
#109 pass `augment` argument
kvantricht Oct 5, 2024
c1eadb5
Add location_id and ref_id to processed parquet
kvantricht Oct 5, 2024
f8b0807
Add `ref_id` to test parquet
kvantricht Oct 6, 2024
d47ddfb
avoid if-else
kvantricht Oct 7, 2024
5cb8898
Avoid if else
kvantricht Oct 7, 2024
923340a
Avoid if-else
kvantricht Oct 7, 2024
8c88c56
Formatting
kvantricht Oct 7, 2024
ce3fae1
Run actions on PR to croptype
kvantricht Oct 7, 2024
ed60335
reintroduced ref_id into dataset and made cleaner logger message abou…
Oct 7, 2024
373e872
fixing the number of available_timesteps
Oct 7, 2024
f03649e
fixed available_timesteps computation for corner cases
Oct 7, 2024
c59c066
cleanup
Oct 7, 2024
3ba98a3
formatting
Oct 7, 2024
e806c28
additional check on the available_timesteps + descr
Oct 8, 2024
eee7dc5
isort fix, hopefully the correct version
Oct 8, 2024
f03c1cd
Check nr of timesteps in inference
kvantricht Oct 8, 2024
b051404
Merge branch 'timestep-position-debugging' of github.com:WorldCereal/…
kvantricht Oct 8, 2024
865ab1e
Attempt to auto-format
kvantricht Oct 8, 2024
add376a
Should be f-string
kvantricht Oct 9, 2024
e5e0109
Moved import to top
kvantricht Oct 9, 2024
89d72a4
Merge pull request #114 from WorldCereal/timestep-position-debugging
cbutsko Oct 9, 2024
e163e9f
removed unnecessary lines that double the size of embeddings
Oct 9, 2024
6780a83
added loading of finetuned model
Oct 9, 2024
6b904e8
slightly cleaner handling of valid_month token
Oct 9, 2024
66e0c72
changed strict to True during model loading
Oct 9, 2024
069ede7
enhanced plotting
Oct 10, 2024
8702c7f
updated masking not to take into account existing mask
Oct 10, 2024
f96f78d
turning on augmentation for downstream model
Oct 10, 2024
80bae7b
Bugfix: use `valid_month_as_token` kwarg
kvantricht Oct 11, 2024
2ef8dde
Formatting fixes
kvantricht Oct 11, 2024
ea042e7
Formatting fix bis
kvantricht Oct 11, 2024
205fd76
added tests for both for using valid_month token and not
Oct 11, 2024
8cd4d15
reverting masking changes for now; need to make sure it does not affe…
Oct 11, 2024
e4fdac4
changed default value of valid_month_as_token to False when loading m…
Oct 11, 2024
059bd6b
added valid_month related tests
Oct 11, 2024
641c8c5
commented lines that create ref feature files
Oct 11, 2024
e97fc70
added new reference feature files for with and without valid_month
Oct 11, 2024
2d889ee
removed unnecessary prints
Oct 11, 2024
6c43ceb
formatting
Oct 11, 2024
024ffc3
fixed test for valid_month token
Oct 11, 2024
6f05047
create ref feature files
Oct 11, 2024
34a026b
fixed tests
Oct 11, 2024
6b09d83
a very brave attempt to mess with encoder compile 🙈
Oct 11, 2024
e9cbfa8
removing redundant creation of valid_month token when the flag is False
Oct 12, 2024
080e20c
removed obsolete TODOs
Oct 12, 2024
a75adbf
Merge pull request #115 from WorldCereal/valid_month-and-mask-debugging
cbutsko Oct 12, 2024
aa2f74d
Bump version number to 0.1.6
kvantricht Oct 12, 2024
4 changes: 3 additions & 1 deletion .gitignore
@@ -17,4 +17,6 @@ gha-creds-*.json
 .idea
 scrap
 output/*
-imgs/*
+imgs/*
+# don't track catboost training info
+*/catboost_info
Collaborator Author

catboost_info looks like it is in git - should that folder be removed from git?

Contributor

It should indeed not be in git.
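
For context on the exchange above: an ignore rule only affects untracked files, so a folder that is already committed stays tracked until it is removed from the index. Also, under standard gitignore matching, a pattern with a slash in the middle such as `*/catboost_info` is anchored to the directory of the `.gitignore` file and matches the folder exactly one level down, not a top-level `catboost_info/`. The following is a minimal sketch of one way to untrack such a folder. It runs in a throwaway repository; the temporary directory, the user identity, the commit messages, and the broader `catboost_info/` pattern are assumptions for illustration, not the commands the authors actually ran.

```shell
set -eu

# Throwaway repository so the sketch does not touch a real checkout (assumption).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

# Reproduce the situation from the review: the folder was committed by accident.
mkdir catboost_info
echo '{}' > catboost_info/catboost_training.json
git add catboost_info
git commit -q -m "accidentally track catboost_info"

# Remove the folder from the index only; the files stay on disk.
git rm -r -q --cached catboost_info

# Ignore the folder at any depth (a pattern without a leading or middle slash).
printf 'catboost_info/\n' >> .gitignore
git add .gitignore
git commit -q -m "stop tracking catboost_info"

# List what git still tracks under the folder; this should be empty.
tracked="$(git ls-files catboost_info)"
echo "tracked files under catboost_info: ${tracked:-none}"
```

In a real checkout only the `git rm -r --cached`, the `.gitignore` edit, and the commit would be needed. The commit list of this PR later includes "completely ignore catboost_info folder" and "removed catboost info folder", which suggests the issue was addressed along similar lines.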

565 changes: 565 additions & 0 deletions catboost_info/catboost_training.json

Large diffs are not rendered by default.

Binary file added catboost_info/learn/events.out.tfevents
Binary file not shown.