Releases: open-mmlab/mmpretrain
MMPreTrain Release v1.0.1
Fix some bugs and enhance the codebase.
What's Changed
- [Fix] Fix wrong-parameter bug of RandomCrop by @Ezra-Yu in #1706
- [Refactor] BEiT refactor by @fanqiNO1 in #1705
- [Refactor] Fix spelling by @fanqiNO1 in #1689
- [Fix] Freezing of cls_token in VisionTransformer by @fabien-merceron in #1693
- [Fix] Typo fix of 'target' in vis_cam.py by @bryanbocao in #1655
- [Feature] Support LoRA by @fanqiNO1 in #1687
- [Fix] Fix the issue #1711 "GaussianBlur doesn't work" by @liyunlong10 in #1722
- [Enhance] Add GPU acceleration for Apple silicon Macs by @NripeshN in #1699
- [Enhance] Adapt test cases on Ascend NPU. by @Ginray in #1728
- [Enhance] Nested predict by @marouaneamz in #1716
- [Enhance] Set 'is_init' in some multimodal methods by @fangyixiao18 in #1718
- [Enhance] Add init_cfg with type='pretrained' to downstream tasks by @fangyixiao18 in #1717
- [Fix] Fix dict update in minigpt4 by @fangyixiao18 in #1709
- Bump version to 1.0.1 by @fangyixiao18 in #1731
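Among the changes above, #1687 adds LoRA support. As background, the core LoRA idea is to keep the pretrained weight frozen and learn only a low-rank additive update. A minimal, framework-free sketch (the class name `LoRALinear` and all values are illustrative, not MMPreTrain's actual API):

```python
# LoRA sketch: y = W x + (alpha / r) * B (A x)
# W stays frozen; only the low-rank factors A (r x in) and B (out x r)
# would be trained.
def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

class LoRALinear:
    def __init__(self, W, A, B, alpha=1.0):
        self.W = W                   # frozen full-rank weight (out x in)
        self.A = A                   # trainable down-projection (r x in)
        self.B = B                   # trainable up-projection (out x r)
        self.scale = alpha / len(A)  # alpha / rank

    def forward(self, x):
        base = matvec(self.W, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

# 2x2 identity W plus a rank-1 update.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 0.0]]    # r=1, in=2
B = [[0.0], [1.0]]  # out=2, r=1
layer = LoRALinear(W, A, B, alpha=1.0)
print(layer.forward([2.0, 3.0]))  # -> [2.0, 5.0]
```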
New Contributors
- @fabien-merceron made their first contribution in #1693
- @bryanbocao made their first contribution in #1655
- @liyunlong10 made their first contribution in #1722
- @NripeshN made their first contribution in #1699
Full Changelog: v1.0.0...v1.0.1
MMPreTrain Release v1.0.0: Backbones, Self-Supervised Learning and Multi-Modality
Support more multi-modal algorithms and datasets
We are excited to announce that several advanced multi-modal methods are now supported! We integrated huggingface/transformers with vision backbones in MMPreTrain to run inference, and training support is in development.
Methods | Datasets |
---|---|
BLIP (arxiv'2022) | COCO (caption, retrieval, vqa) |
BLIP-2 (arxiv'2023) | Flickr30k (caption, retrieval) |
OFA (CoRR'2022) | GQA |
Flamingo (NeurIPS'2022) | NLVR2 |
Chinese CLIP (arxiv'2022) | NoCaps |
MiniGPT-4 (arxiv'2023) | OCR VQA |
LLaVA (arxiv'2023) | Text VQA |
Otter (arxiv'2023) | VG VQA |
 | VisualGenomeQA |
 | VizWiz |
 | VSR |
Add iTPN and SparK self-supervised learning algorithms
Provide examples of New Config and DeepSpeed/FSDP
We tested DeepSpeed and FSDP with MMEngine. The figures below show memory usage (left) and training time (right) for ViT-large, ViT-huge, and an 8B-parameter multi-modal model.
Test environment: 8*A100 (80G) PyTorch 2.0.0
Remark: both FSDP and DeepSpeed were tested with default configurations and not tuned; manually tuning the FSDP wrap policy can further reduce training time and memory usage.
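As a rough sketch, a DeepSpeed run with MMEngine's FlexibleRunner might be configured as below. The field names follow MMEngine's DeepSpeedStrategy, but all values are illustrative placeholders rather than the tuned configs shipped with MMPreTrain:

```python
# Illustrative config fragment: training via MMEngine's FlexibleRunner
# with a DeepSpeed ZeRO-3 strategy. Values are placeholders, not
# MMPreTrain's shipped settings.
strategy = dict(
    type='DeepSpeedStrategy',
    zero_optimization=dict(
        stage=3,                                # shard params, grads, optimizer states
        offload_optimizer=dict(device='cpu'),   # optionally offload to CPU memory
    ),
)
runner = dict(type='FlexibleRunner', strategy=strategy)
```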
New Features
- Transfer shape-bias tool from mmselfsup (#1658)
- Download dataset by using MIM&OpenDataLab (#1630)
- Support New Configs (#1639, #1647, #1665)
- Support Flickr30k Retrieval dataset (#1625)
- Support SparK (#1531)
- Support LLaVA (#1652)
- Support Otter (#1651)
- Support MiniGPT-4 (#1642)
- Add support for VizWiz dataset (#1636)
- Add support for vsr dataset (#1634)
- Add InternImage Classification project (#1569)
- Support OCR-VQA dataset (#1621)
- Support OK-VQA dataset (#1615)
- Support TextVQA dataset (#1569)
- Support iTPN and HiViT (#1584)
- Add retrieval mAP metric (#1552)
- Support NoCap dataset based on BLIP. (#1582)
- Add GQA dataset (#1585)
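The retrieval mAP metric added in #1552 follows the standard definition: average precision per query over the ranked gallery, then the mean across queries. A pure-Python illustration of that computation (not MMPreTrain's actual metric class):

```python
# Sketch of retrieval mAP: for each query, average the precision at
# each rank where a relevant gallery item appears, then mean over queries.
def average_precision(ranked_relevance):
    """ranked_relevance: list of 0/1 flags, gallery sorted by score."""
    hits, precisions = 0, []
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / max(hits, 1)  # 0.0 if no relevant items

def retrieval_map(all_queries):
    return sum(average_precision(q) for q in all_queries) / len(all_queries)

# Query 1 has relevant items at ranks 1 and 3; query 2 at rank 2.
print(retrieval_map([[1, 0, 1], [0, 1, 0]]))
```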
Improvements
- Update fsdp vit-huge and vit-large config (#1675)
- Support deepspeed with flexible runner (#1673)
- Update Otter and LLaVA docs and config. (#1653)
- Add image_only param of ScienceQA (#1613)
- Support using "split" to specify the training/validation set (#1535)
Bug Fixes
- Refactor _prepare_pos_embed in ViT (#1656, #1679)
- Freeze pre norm in vision transformer (#1672)
- Fix bug loading IN1k dataset (#1641)
- Fix sam bug (#1633)
- Fixed circular import error for new transform (#1609)
- Update torchvision transform wrapper (#1595)
- Set default out_type in CAM visualization (#1586)
Docs Update
New Contributors
- @alexwangxiang made their first contribution in #1555
- @InvincibleWyq made their first contribution in #1615
- @yyk-wew made their first contribution in #1634
- @fanqiNO1 made their first contribution in #1673
- @Ben-Louis made their first contribution in #1679
- @Lamply made their first contribution in #1671
- @minato-ellie made their first contribution in #1644
- @liweiwp made their first contribution in #1629
MMPreTrain Release v1.0.0rc8: Multi-Modality Support
Highlights
- Support multiple multi-modal algorithms and inferencers. You can explore these features via the Gradio demo!
- Add EVA-02, DINOv2, ViT-SAM and GLIP backbones.
- Register torchvision transforms into MMPretrain, so you can now easily use torchvision's data augmentations in MMPretrain.
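Once torchvision transforms are registered, a dataset pipeline can mix them with MMPreTrain's own transforms via the `torchvision/` type prefix. A sketch (transform arguments here are placeholders):

```python
# Illustrative pipeline fragment: torchvision transforms are addressed
# with a 'torchvision/' prefix alongside MMPreTrain's native transforms.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='torchvision/RandomResizedCrop', size=176),
    dict(type='torchvision/RandomHorizontalFlip', p=0.5),
    dict(type='PackInputs'),
]
```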
New Features
- Support Chinese CLIP. (#1576)
- Add ScienceQA Metrics (#1577)
- Support multiple multi-modal algorithms and inferencers. (#1561)
- add eva02 backbone (#1450)
- Support dinov2 backbone (#1522)
- Support some downstream classification datasets. (#1467)
- Support GLIP (#1308)
- Register torchvision transforms into mmpretrain (#1265)
- Add ViT of SAM (#1476)
Improvements
- [Refactor] Support to freeze channel reduction and add layer decay function (#1490)
- [Refactor] Support resizing pos_embed while loading ckpt and format output (#1488)
Bug Fixes
- Fix scienceqa (#1581)
- Fix config of beit (#1528)
- Incorrect stage freeze on RIFormer Model (#1573)
- Fix DDP bugs caused by `out_type`. (#1570)
- Fix a potential bug in the multi-task head loss. (#1530)
- Support bce loss without batch augmentations (#1525)
- Fix clip generator init bug (#1518)
- Fix the bug in binary cross entropy loss (#1499)
Docs Update
- Update PoolFormer citation to CVPR version (#1505)
- Refine Inference Doc (#1489)
- Add doc for usage of confusion matrix (#1513)
- Update MMagic link (#1517)
- Fix example_project README (#1575)
- Add NPU support page (#1481)
- train cfg: Removed old description (#1473)
- Fix typo in MultiLabelDataset docstring (#1483)
Contributors
A total of 12 developers contributed to this release.
@XiudingCai @Ezra-Yu @KeiChiTse @mzr1996 @bobo0810 @wangbo-zhao @yuweihao @fangyixiao18 @YuanLiuuuuuu @MGAMZ @okotaku @zzc98
MMPreTrain Release v1.0.0rc7: Providing powerful backbones with various pre-training strategies
MMPreTrain v1.0.0rc7 Release Notes
- Highlights
- New Features
- Improvements
- Bug Fixes
- Docs Update
Highlights
We are excited to announce that MMClassification and MMSelfSup have been merged into ONE codebase, named MMPreTrain, which has the following highlights:
- Integrated self-supervised learning algorithms from MMSelfSup, such as MAE, BEiT, etc. You can find them in the `mmpretrain/models` directory, where a new `selfsup` folder supports 18 recent self-supervised learning algorithms.
Contrastive learning | Masked image modeling |
---|---|
MoCo series | BEiT series |
SimCLR | MAE |
BYOL | SimMIM |
SwAV | MaskFeat |
DenseCL | CAE |
SimSiam | MILAN |
BarlowTwins | EVA |
 | MixMIM |
- Support RIFormer, which is a way to keep a vision backbone effective while removing token mixers in its basic building blocks. Equipped with our proposed optimization strategy, we are able to build an extremely simple vision backbone with encouraging performance, while enjoying high efficiency during inference.
- Support LeViT, XCiT, ViG, and ConvNeXt-V2 backbones; MMPreTrain now supports 68 backbones or algorithms and 472 checkpoints.
- Add t-SNE visualization, so users can visualize t-SNE embeddings to analyze the representation ability of a backbone. An example of visualization: the left is from `MoCoV2_ResNet50` and the right is from `MAE_ViT-base`:
- Refactor dataset pipeline visualization; now we can also visualize the pipeline of masked image modeling, such as BEiT:
New Features
- Support RIFormer. (#1453)
- Support XCiT Backbone. (#1305)
- Support calculating the confusion matrix and plotting it. (#1287)
- Support RetrieverRecall metric & Add ArcFace config (#1316)
- Add `ImageClassificationInferencer`. (#1261)
- Support InShop Dataset (Image Retrieval). (#1019)
- Support LeViT backbone. (#1238)
- Support VIG Backbone. (#1304)
- Support ConvNeXt-V2 backbone. (#1294)
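The confusion-matrix support added in #1287 boils down to counting (ground-truth, prediction) pairs in a class-by-class grid. A pure-Python illustration of the computation (not MMPreTrain's actual `ConfusionMatrix` metric):

```python
# Sketch of a confusion matrix: rows index ground-truth classes,
# columns index predicted classes.
def confusion_matrix(gt, pred, num_classes):
    mat = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        mat[g][p] += 1
    return mat

# Four samples, one of class 0 misclassified as class 1.
print(confusion_matrix([0, 0, 1, 2], [0, 1, 1, 2], num_classes=3))
# -> [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
```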
Improvements
- Use PyTorch official `scaled_dot_product_attention` to accelerate `MultiheadAttention`. (#1434)
- Add LayerNorm to the ViT `avg_featmap` output. (#1447)
- Update analysis tools and documentations. (#1359)
- Unify `--out` and `--dump` in `tools/test.py`. (#1307)
- Add an option to toggle whether GeM pooling is trainable. (#1246)
- Update registries of mmcls. (#1306)
- Add metafile fill and validation tools. (#1297)
- Remove useless EfficientnetV2 config files. (#1300)
Bug Fixes
- Fix precise bn hook (#1466)
- Fix retrieval multi gpu bug (#1319)
- Fix error repvgg-deploy base config path. (#1357)
- Fix bug in test tools. (#1309)
Docs Update
Contributors
A total of 13 developers contributed to this release.
Thanks to @techmonsterwang, @qingtian5, @mzr1996, @okotaku, @zzc98, @aso538, @szwlh-c, @fangyixiao18, @yukkyo, @Ezra-Yu, @csatsurnh, @2546025323, @GhaSiKey.
New Contributors
- @csatsurnh made their first contribution in #1309
- @szwlh-c made their first contribution in #1304
- @aso538 made their first contribution in #1238
- @GhaSiKey made their first contribution in #1313
- @yukkyo made their first contribution in #1246
- @2546025323 made their first contribution in #1321
Full Changelog: v1.0.0rc5...v1.0.0rc7
MMClassification Release V1.0.0rc5
Highlights
- Support EVA, RevViT, EfficientnetV2, CLIP, TinyViT and MixMIM backbones.
- Reproduce the training accuracy of ConvNeXt and RepVGG.
- Support multi-task training and testing.
- Support Test-time Augmentation.
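Test-time augmentation amounts to running the model on several augmented views of each input and averaging the class scores. A minimal sketch of that idea (the `model` lambda and flip augmentation are stand-ins, not MMClassification's actual TTA model):

```python
# TTA sketch: average per-class scores across augmented views.
def tta_predict(model, views):
    scores = [model(v) for v in views]
    num_classes = len(scores[0])
    return [sum(s[c] for s in scores) / len(scores) for c in range(num_classes)]

# Toy "model": returns different scores for the original and flipped view.
model = lambda x: [0.8, 0.2] if x == 'orig' else [0.6, 0.4]
print(tta_predict(model, ['orig', 'flipped']))
```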
New Features
- [Feature] Add EfficientnetV2 Backbone. (#1253)
- [Feature] Support TTA and add `--tta` in `tools/test.py`. (#1161)
- [Feature] Support Multi-task. (#1229)
- [Feature] Add clip backbone. (#1258)
- [Feature] Add mixmim backbone with checkpoints. (#1224)
- [Feature] Add TinyViT for dev-1.x. (#1042)
- [Feature] Add some scripts for development. (#1257)
- [Feature] Support EVA. (#1239)
- [Feature] Implementation of RevViT. (#1127)
Improvements
- [Reproduce] Reproduce RepVGG Training Accuracy. (#1264)
- [Enhance] Support ConvNeXt More Weights. (#1240)
- [Reproduce] Update ConvNeXt config files. (#1256)
- [CI] Update CI to test PyTorch 1.13.0. (#1260)
- [Project] Add ACCV workshop 1st Solution. (#1245)
- [Project] Add Example project. (#1254)
Bug Fixes
- [Fix] Fix imports in transforms. (#1255)
- [Fix] Fix CAM visualization. (#1248)
- [Fix] Fix the requirements and lazy register mmcls models. (#1275)
Contributors
A total of 12 developers contributed to this release.
@marouaneamz @piercus @Ezra-Yu @mzr1996 @bobo0810 @suibe-qingtian @Scarecrow0 @tonysy @WINDSKY45 @wangbo-zhao @Francis777 @okotaku
MMClassification Release V1.0.0rc4
Highlights
- New API to get pre-defined models of MMClassification. See #1236 for more details.
- Refactor BEiT backbone and support v1/v2 inference. See #1144.
New Features
- Support getting models from the name defined in the model-index file. (#1236)
Improvements
- Support evaluation on both EMA and non-EMA models. (#1204)
- Refactor BEiT backbone and support v1/v2 inference. (#1144)
Bug Fixes
Docs Update
- Update install tutorial. (#1223)
- Update MobileNetv2 & MobileNetv3 readme. (#1222)
- Add version selection in the banner. (#1217)
Contributors
A total of 4 developers contributed to this release.
MMClassification Release V0.25.0
Highlights
- Support MLU backend.
- Add `dist_train_arm.sh` for ARM devices.
New Features
Improvements
- Add `dist_train_arm.sh` for ARM devices and update NPU results. (#1218)
Bug Fixes
- Fix a bug that caused `MMClsWandbHook` to get stuck. (#1242)
- Fix the redundant `device_ids` in `tools/test.py`. (#1215)
Docs Update
- Add version banner and version warning in master docs. (#1216)
- Update NPU support doc. (#1198)
- Fixed typo in `pytorch2torchscript.md`. (#1173)
- Fix typo in `miscellaneous.md`. (#1137)
- Add further detail to the doc for `ClassBalancedDataset`. (#901)
Contributors
A total of 7 developers contributed to this release.
@nijkah @xiaoyuan0203 @mzr1996 @Qiza-lyhm @ganghe74 @unseenme @wangjiangben-hw
MMClassification Release V1.0.0rc3
Highlights
- Add Switch Recipe Hook. Now we can modify the training pipeline, mixup and loss settings during training, see #1101.
- Add TIMM and HuggingFace wrappers. Now you can train/use models in TIMM/HuggingFace directly, see #1102.
- Support retrieval tasks, see #1055.
- Reproduce MobileOne training accuracy. See #1191.
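The TIMM wrapper (#1102) lets a config build a classifier straight from a timm model name. A sketch of such a config fragment (field values are placeholders; check the wrapper's documentation for the exact signature):

```python
# Illustrative config fragment using the TIMM wrapper: extra keyword
# arguments are forwarded to timm's model builder. Values are placeholders.
model = dict(
    type='TimmClassifier',
    model_name='resnet50',       # any model name known to timm
    pretrained=True,             # load timm's pretrained weights
    loss=dict(type='CrossEntropyLoss'),
)
```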
New Features
- Add checkpoints from EfficientNets NoisyStudent & L2. (#1122)
- Migrate CSRA head to 1.x. (#1177)
- Support RepLKnet backbone. (#1129)
- Add Switch Recipe Hook. (#1101)
- Add adan optimizer. (#1180)
- Support DaViT. (#1105)
- Support Activation Checkpointing for ConvNeXt. (#1153)
- Add TIMM and HuggingFace wrappers to build classifiers from them directly. (#1102)
- Add reduction for neck (#978)
- Support HorNet Backbone for dev1.x. (#1094)
- Add arcface head. (#926)
- Add Base Retriever and Image2Image Retriever for retrieval tasks. (#1055)
- Support MobileViT backbone. (#1068)
Improvements
- [Enhance] Enhance ArcFaceClsHead. (#1181)
- [Refactor] Refactor to use new fileio API in MMEngine. (#1176)
- [Enhance] Reproduce mobileone training accuracy. (#1191)
- [Enhance] Add info about deleting params in SwinV2. (#1142)
- [Enhance] Add more mobilenetv3 pretrains. (#1154)
- [Enhancement] RepVGG for YOLOX-PAI for dev-1.x. (#1126)
- [Improve] Speed up data preprocessor. (#1064)
Bug Fixes
- Fix the torchserve. (#1143)
- Fix configs due to the API refactor of `num_classes`. (#1184)
- Update mmcls2torchserve. (#1189)
- Fix `inference_model` failing to get class information from the checkpoint. (#1093)
Docs Update
- Add not-found page extension. (#1207)
- Update the visualization doc. (#1160)
- Support sort and search the Model Summary table. (#1100)
- Improve the ResNet model page. (#1118)
- Update the ConvNeXt README. (#1156)
- Fix the installation docs link in README. (#1164)
- Improve ViT and MobileViT model pages. (#1155)
- Improve Swin doc and add Tabs extension. (#1145)
- Add MMEval projects link in README. (#1162)
- Add runtime configuration docs. (#1128)
- Add custom evaluation docs (#1130)
- Add custom pipeline docs. (#1124)
- Add MMYOLO projects link in MMCLS1.x. (#1117)
Contributors
A total of 14 developers contributed to this release.
@austinmw @Ezra-Yu @nijkah @yingfhu @techmonsterwang @mzr1996 @sanbuphy @tonysy @XingyuXie @gaoyang07 @kitecats @marouaneamz @okotaku @zzc98
MMClassification Release V0.24.1
New Features
- [Feature] Support mmcls with NPU backend. (#1072)
Bug Fixes
- [Fix] Fix performance issue in convnext DDP train. (#1098)
Contributors
A total of 3 developers contributed to this release.
@wangjiangben-hw @790475019 @mzr1996