[gemini] gemini support tensor parallelism. #4942

Merged (47 commits, Nov 10, 2023)

Commits on Nov 9, 2023

  1. [colossalai] fix typo

    flybird11111 committed Nov 9, 2023
    Commit dc0dc0b
  2. [inference] Add smmoothquant for llama (hpcaitech#4904)

    * [inference] add int8 rotary embedding kernel for smoothquant (hpcaitech#4843)
    
    * [inference] add smoothquant llama attention (hpcaitech#4850)
    
    * add smoothquant llama attention
    
    * remove useless code
    
    * remove useless code
    
    * fix import error
    
    * rename file name
    
    * [inference] add silu linear fusion for smoothquant llama mlp  (hpcaitech#4853)
    
    * add silu linear
    
    * update skip condition
    
    * catch smoothquant cuda lib exception
    
    * process exceptions for tests
    
    * [inference] add llama mlp for smoothquant (hpcaitech#4854)
    
    * add llama mlp for smoothquant
    
    * fix down out scale
    
    * remove duplicate lines
    
    * add llama mlp check
    
    * delete useless code
    
    * [inference] add smoothquant llama (hpcaitech#4861)
    
    * add smoothquant llama
    
    * fix attention accuracy
    
    * fix accuracy
    
    * add kv cache and save pretrained
    
    * refactor example
    
    * delete smooth
    
    * refactor code
    
    * [inference] add smooth function and delete useless code for smoothquant (hpcaitech#4895)
    
    * add smooth function and delete useless code
    
    * update datasets
    
    * remove duplicate import
    
    * delete useless file
    
    * refactor codes (hpcaitech#4902)
    
    * refactor code
    
    * add license
    
    * add torch-int and smoothquant license
    Xu-Kai authored and flybird11111 committed Nov 9, 2023
    Commit dd59ca2
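The smoothquant commits above apply SmoothQuant-style quantization to llama. As a framework-free sketch of the core idea (toy names and list-based math, not the repository's code): per-channel scales migrate activation outliers into the weights, leaving the matmul result mathematically unchanged.

```python
# Conceptual SmoothQuant scale migration (illustrative, not the repo's code).
# For y = x @ W, rewrite as y = (x / s) @ (diag(s) @ W): per-channel scales s
# move activation outliers into the weights so both quantize more easily.

def smooth_scales(act_absmax, weight_absmax, alpha=0.5):
    """Per-input-channel scales s_j = act_max_j^alpha / weight_max_j^(1-alpha)."""
    return [a ** alpha / max(w, 1e-8) ** (1 - alpha)
            for a, w in zip(act_absmax, weight_absmax)]

def apply_smoothing(x, W, s):
    """Divide activations and scale the matching weight rows; output unchanged."""
    x_s = [xi / si for xi, si in zip(x, s)]
    W_s = [[si * wij for wij in row] for si, row in zip(s, W)]
    return x_s, W_s

def matmul_vec(x, W):
    """Plain vector-matrix product, y_j = sum_i x_i * W_ij."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]
```

With alpha = 0.5 the activation and weight ranges end up balanced; the smoothed activations have a much smaller peak while the product is bit-for-bit the same computation.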
  3. Update flash_attention_patch.py

    Make the patch compatible with a recent change in the Transformers library, which added a new 'padding_mask' argument to the attention layer's forward function.
    huggingface/transformers#25598
    Orion-Zheng authored and flybird11111 committed Nov 9, 2023
    Commit 52707c6
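The compatibility fix above comes down to a signature pattern. A hypothetical sketch (names are illustrative, not the actual patch): accepting the new keyword keeps both older and newer Transformers callers working.

```python
# Backward/forward-compatible signature sketch (hypothetical names, not the
# actual patch): accept the new keyword so newer Transformers versions that
# pass `padding_mask` do not raise TypeError, while older callers still work.

def attention_forward(hidden_states, attention_mask=None, padding_mask=None, **kwargs):
    # Older Transformers releases never pass `padding_mask`; newer ones do.
    # Fall back to the attention mask when no padding mask is supplied.
    mask = padding_mask if padding_mask is not None else attention_mask
    # ...real attention math would go here; we just report what was resolved.
    return {"mask_used": mask, "extra_kwargs": sorted(kwargs)}
```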
  4. [kernel] support pure fp16 for cpu adam and update gemini optim tests (hpcaitech#4921)
    
    * [kernel] support pure fp16 for cpu adam (hpcaitech#4896)
    
    * [kernel] fix cpu adam kernel for pure fp16 and update tests (hpcaitech#4919)
    
    * [kernel] fix cpu adam
    
    * [test] update gemini optim test
    ver217 authored and flybird11111 committed Nov 9, 2023
    Commit 61ec9f7
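For reference, the update rule a CPU Adam kernel implements, shown here in plain Python on scalars (the real kernel does this elementwise in C++; "pure fp16" means it runs directly on fp16 tensors rather than fp32 master copies):

```python
# One Adam step on a single scalar parameter (illustrative only).
import math

def adam_step(p, g, m, v, step, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * g           # first-moment EMA
    v = beta2 * v + (1 - beta2) * g * g       # second-moment EMA
    m_hat = m / (1 - beta1 ** step)           # bias correction
    v_hat = v / (1 - beta2 ** step)
    p = p - lr * m_hat / (math.sqrt(v_hat) + eps)
    return p, m, v
```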
  5. [format] applied code formatting on changed files in pull request 4908 (hpcaitech#4918)
    
    Co-authored-by: github-actions <[email protected]>
    2 people authored and flybird11111 committed Nov 9, 2023
    Commit 561553b
  6. [gemini] support gradient accumulation (hpcaitech#4869)

    * add test
    
    * fix no_sync bug in low level zero plugin
    
    * fix test
    
    * add argument for grad accum
    
    * add grad accum in backward hook for gemini
    
    * finish implementation, rewrite tests
    
    * fix test
    
    * skip stuck model in low level zero test
    
    * update doc
    
    * optimize communication & fix gradient checkpoint
    
    * modify doc
    
    * cleaning codes
    
    * update cpu adam fp16 case
    Fridge003 authored and flybird11111 committed Nov 9, 2023
    Commit 8d42002
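The gradient accumulation added above can be sketched framework-free (this commit wires it into Gemini's backward hook; the sketch only shows the arithmetic): micro-batch gradients are averaged into a buffer, and the optimizer steps once per accumulation window, matching a single large-batch step.

```python
# Framework-free gradient accumulation on a scalar parameter: sum/average
# gradients over accum_steps micro-batches, then apply one optimizer step.

def train_with_accumulation(micro_grads, accum_steps, lr=0.1, param=0.0):
    buffer, steps = 0.0, []
    for i, g in enumerate(micro_grads, start=1):
        buffer += g / accum_steps          # average over the accumulation window
        if i % accum_steps == 0:           # step only at window boundaries
            param -= lr * buffer
            steps.append(param)
            buffer = 0.0                   # zero the gradient buffer after stepping
    return param, steps
```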
  7. [hotfix] fix torch 2.0 compatibility (hpcaitech#4936)

    * [hotfix] fix launch
    
    * [test] fix test gemini optim
    
    * [shardformer] fix vit
    ver217 authored and flybird11111 committed Nov 9, 2023
    Commit da55732
  8. Commit 775ea1b
  9. [format] applied code formatting on changed files in pull request 4820 (hpcaitech#4886)
    
    Co-authored-by: github-actions <[email protected]>
    2 people authored and flybird11111 committed Nov 9, 2023
    Commit 0074178
  10. Commit 907aa98
  11. [Refactor] Integrated some lightllm kernels into token-attention (hpcaitech#4946)
    
    * add some req for inference
    
    * clean codes
    
    * add codes
    
    * add some lightllm deps
    
    * clean codes
    
    * hello
    
    * delete rms files
    
    * add some comments
    
    * add comments
    
    * add doc
    
    * add lightllm deps
    
    * add lightllm chatglm2 kernels
    
    * add lightllm chatglm2 kernels
    
    * replace rotary embedding with lightllm kernel
    
    * add some comments
    
    * add some comments
    
    * add some comments
    
    * add
    
    * replace fwd kernel att1
    
    * fix an arg
    
    * add
    
    * add
    
    * fix token attention
    
    * add some comments
    
    * clean codes
    
    * modify comments
    
    * fix readme
    
    * fix bug
    
    * fix bug
    
    ---------
    
    Co-authored-by: cuiqing.li <[email protected]>
    Co-authored-by: CjhHa1 <[email protected]>
    3 people authored and flybird11111 committed Nov 9, 2023
    Commit 31fddbc
  12. [test] merge old components to test to model zoo (hpcaitech#4945)

    * [test] add custom models in model zoo
    
    * [test] update legacy test
    
    * [test] update model zoo
    
    * [test] update gemini test
    
    * [test] remove components to test
    ver217 authored and flybird11111 committed Nov 9, 2023
    Commit 8633a87
  13. [inference] add reference and fix some bugs (hpcaitech#4937)

    * add reference and fix some bugs
    
    * update gptq init
    
    ---------
    
    Co-authored-by: Xu Kai <[email protected]>
    2 people authored and flybird11111 committed Nov 9, 2023
    Commit 9d543af
  14. [Inference] Add Bench Chatglm2 script (hpcaitech#4963)

    * add bench chatglm
    
    * fix bug and make utils
    
    ---------
    
    Co-authored-by: CjhHa1 <cjh18671720497outlook.com>
    CjhHa1 authored and flybird11111 committed Nov 9, 2023
    Commit fe79560
  15. [Pipeline inference] Combine kvcache with pipeline inference (hpcaitech#4938)
    
    * merge kvcache with pipeline inference and refactor the code structure
    
    * support ppsize > 2
    
    * refactor pipeline code
    
    * do pre-commit
    
    * modify benchmark
    
    * fix benchmark
    
    * polish code
    
    * add docstring and update readme
    
    * refactor the code
    
    * fix some logic bug of ppinfer
    
    * polish readme
    
    * fix typo
    
    * skip infer test
    FoolPlayer authored and flybird11111 committed Nov 9, 2023
    Commit a610046
  16. Commit 3b8137d
  17. [Inference] Dynamic Batching Inference, online and offline (hpcaitech#4953)
    
    * [inference] Dynamic Batching for Single and Multiple GPUs (hpcaitech#4831)
    
    * finish batch manager
    
    * 1
    
    * first
    
    * fix
    
    * fix dynamic batching
    
    * llama infer
    
    * finish test
    
    * support different lengths generating
    
    * del prints
    
    * del prints
    
    * fix
    
    * fix bug
    
    ---------
    
    Co-authored-by: CjhHa1 <cjh18671720497outlook.com>
    
    * [inference] Async dynamic batching  (hpcaitech#4894)
    
    * finish input and output logic
    
    * add generate
    
    * test forward
    
    * 1
    
    * [inference]Re push async dynamic batching (hpcaitech#4901)
    
    * adapt to ray server
    
    * finish async
    
    * finish test
    
    * del test
    
    ---------
    
    Co-authored-by: yuehuayingxueluo <[email protected]>
    
    * Revert "[inference]Re push async dynamic batching (hpcaitech#4901)" (hpcaitech#4905)
    
    This reverts commit fbf3c09.
    
    * Revert "[inference] Async dynamic batching  (hpcaitech#4894)"
    
    This reverts commit fced140.
    
    * Revert "[inference] Async dynamic batching  (hpcaitech#4894)" (hpcaitech#4909)
    
    This reverts commit fced140.
    
    * Add Ray Distributed Environment Init Scripts
    
    * support DynamicBatchManager base function
    
    * revert _set_tokenizer version
    
    * add driver async generate
    
    * add async test
    
    * fix bugs in test_ray_dist.py
    
    * add get_tokenizer.py
    
    * fix code style
    
    * fix bugs about No module named 'pydantic' in ci test
    
    * fix bugs in ci test
    
    * fix bugs in ci test
    
    * fix bugs in ci test
    
    * [infer]Add Ray Distributed Environment Init Scripts (hpcaitech#4911)
    
    * Revert "[inference] Async dynamic batching  (hpcaitech#4894)"
    
    This reverts commit fced140.
    
    * Add Ray Distributed Environment Init Scripts
    
    * support DynamicBatchManager base function
    
    * revert _set_tokenizer version
    
    * add driver async generate
    
    * add async test
    
    * fix bugs in test_ray_dist.py
    
    * add get_tokenizer.py
    
    * fix code style
    
    * fix bugs about No module named 'pydantic' in ci test
    
    * fix bugs in ci test
    
    * fix bugs in ci test
    
    * fix bugs in ci test
    
    * support dynamic batch for bloom model and is_running function
    
    * [Inference]Test for new Async engine (hpcaitech#4935)
    
    * infer engine
    
    * infer engine
    
    * test engine
    
    * test engine
    
    * new manager
    
    * change step
    
    * add
    
    * test
    
    * fix
    
    * fix
    
    * finish test
    
    * finish test
    
    * finish test
    
    * finish test
    
    * add license
    
    ---------
    
    Co-authored-by: yuehuayingxueluo <[email protected]>
    
    * add assertion for config (hpcaitech#4947)
    
    * [Inference] Finish dynamic batching offline test (hpcaitech#4948)
    
    * test
    
    * fix test
    
    * fix quant
    
    * add default
    
    * fix
    
    * fix some bugs
    
    * fix some bugs
    
    * fix
    
    * fix bug
    
    * fix bugs
    
    * reset param
    
    ---------
    
    Co-authored-by: yuehuayingxueluo <[email protected]>
    Co-authored-by: Cuiqing Li <[email protected]>
    Co-authored-by: CjhHa1 <cjh18671720497outlook.com>
    3 people authored and flybird11111 committed Nov 9, 2023
    Commit 9fce43b
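A toy sketch of the dynamic batching idea (illustrative only, not the engine's API): waiting requests are greedily admitted into a batch under a token budget, and capacity freed by finished sequences admits new requests on the next scheduling pass.

```python
# Toy dynamic-batching scheduler: requests are (id, token_length) pairs in a
# FIFO queue; each pass packs as many as fit under the token budget.
from collections import deque

def schedule(waiting, token_budget):
    """Greedily admit queued requests whose lengths fit within the budget."""
    batch, used = [], 0
    while waiting and used + waiting[0][1] <= token_budget:
        req = waiting.popleft()
        batch.append(req)
        used += req[1]
    return batch, used
```

Real dynamic batching engines re-run this admission step every iteration, so sequences of different lengths can enter and leave the running batch independently.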
  18. [Kernels] Updated Triton kernels to 2.1.0 and added flash-decoding for llama token attention (hpcaitech#4965)
    
    * adding flash-decoding
    
    * clean
    
    * adding kernel
    
    * adding flash-decoding
    
    * add integration
    
    * add
    
    * adding kernel
    
    * adding kernel
    
    * adding triton 2.1.0 features for inference
    
    * update bloom triton kernel
    
    * remove useless vllm kernels
    
    * clean codes
    
    * fix
    
    * adding files
    
    * fix readme
    
    * update llama flash-decoding
    
    ---------
    
    Co-authored-by: cuiqing.li <[email protected]>
    2 people authored and flybird11111 committed Nov 9, 2023
    Commit 62eb99f
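Flash-decoding, added above for llama token attention, splits the KV cache into chunks, attends to each chunk in parallel, and merges the partial softmax results with a log-sum-exp rescale. A scalar toy version (not the Triton kernel; real kernels work on vectors and run chunks concurrently):

```python
# Toy flash-decoding for one scalar query: per-chunk partial attention
# (running max, sum of exps, unnormalized output), then an LSE-style merge.
import math

def attend_chunk(q, keys, values):
    scores = [q * k for k in keys]                  # toy "dot products"
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    denom = sum(exps)
    out = sum(e * v for e, v in zip(exps, values))  # unnormalized output
    return m, denom, out

def merge(parts):
    """Combine per-chunk partials; rescale each by exp(m_chunk - m_global)."""
    m = max(p[0] for p in parts)
    denom = sum(p[1] * math.exp(p[0] - m) for p in parts)
    out = sum(p[2] * math.exp(p[0] - m) for p in parts)
    return out / denom                              # final attention output
```

The merge is exact: splitting the cache into any number of chunks gives the same result as one softmax over the whole cache, which is what lets decoding parallelize over KV length.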
  19. fix ColossalEval (hpcaitech#4992)

    Co-authored-by: Xu Yuanchen <[email protected]>
    2 people authored and flybird11111 committed Nov 9, 2023
    Commit fa1cbd3
  20. [doc] Update doc for colossal-inference (hpcaitech#4989)

    * update doc
    
    * Update README.md
    
    ---------
    
    Co-authored-by: cuiqing.li <[email protected]>
    2 people authored and flybird11111 committed Nov 9, 2023
    Commit 3209431
  21. [hotfix] Fix the bug where process groups were not being properly released. (hpcaitech#4940)
    
    * Fix the bug where process groups were not being properly released.
    
    * test
    
    * Revert "test"
    
    This reverts commit 479900c.
    littsk authored and flybird11111 committed Nov 9, 2023
    Commit f0482f4
  22. Commit cd8ad65
  23. Commit 5266946
  24. [Pipeline Inference] Merge pp with tp (hpcaitech#4993)

    * refactor pipeline into new CaiInferEngine
    
    * update llama modeling forward
    
    * merge tp with pp
    
    * update docstring
    
    * optimize test workflow and example
    
    * fix typo
    
    * add assert and todo
    FoolPlayer authored and flybird11111 committed Nov 9, 2023
    Commit ab8468c
  25. [release] update version (hpcaitech#4995)

    * [release] update version
    
    * [hotfix] fix ci
    ver217 authored and flybird11111 committed Nov 9, 2023
    Commit f9c1920
  26. [gemini] gemini support tp

    flybird11111 committed Nov 9, 2023
    Commit 2043b9d
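Tensor parallelism, which this commit brings to Gemini, shards a layer's parameters across ranks so each rank computes only a slice of the output. A toy column-parallel sketch (plain Python lists standing in for tensors and collectives, not Gemini's implementation):

```python
# Toy column-parallel linear layer: weight columns are sharded across ranks,
# each rank computes its output slice, and an all-gather rebuilds the full
# output (here a simple concatenation stands in for the collective).

def shard_columns(W, world_size):
    """Split the output columns of W into world_size contiguous shards."""
    per = len(W[0]) // world_size
    return [[row[r * per:(r + 1) * per] for row in W] for r in range(world_size)]

def local_forward(x, W_shard):
    """One rank's slice of y = x @ W."""
    return [sum(x[i] * W_shard[i][j] for i in range(len(x)))
            for j in range(len(W_shard[0]))]

def all_gather(outputs):
    """Concatenate per-rank output slices back into the full output."""
    return [v for out in outputs for v in out]
```

Because every rank sees the full input and only part of the weight, the forward pass needs a single gather of the output slices, which is the communication pattern the later "modify tp gather method" commits tune.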
  27. fix

    flybird11111 committed Nov 9, 2023
    Commit da1915d
  28. update checkpointIO

    flybird11111 committed Nov 9, 2023
    Commit 9fd9e69
  29. support fused layernorm

    flybird11111 committed Nov 9, 2023
    Commit a89f2fd
  30. update fusedlayernorm

    flybird11111 committed Nov 9, 2023
    Commit 2406cb0
  31. add sequence parallel to gemini

    flybird11111 committed Nov 9, 2023
    Commit a0509a6
  32. fix

    flybird11111 committed Nov 9, 2023
    Commit 12cd780
  33. fix comments

    flybird11111 committed Nov 9, 2023
    Commit 0110902
  34. fix

    flybird11111 committed Nov 9, 2023
    Commit 86a5eca
  35. fix t5

    flybird11111 committed Nov 9, 2023
    Commit 6f13876
  36. clear cache

    flybird11111 committed Nov 9, 2023
    Commit 5f16e4f
  37. fix

    flybird11111 committed Nov 9, 2023
    Commit adead50
  38. activate ci

    flybird11111 committed Nov 9, 2023
    Commit ed825dc
  39. activate ci

    flybird11111 committed Nov 9, 2023
    Commit 37494c3
  40. fix

    flybird11111 committed Nov 9, 2023
    Commit 73da4ca
  41. fix

    flybird11111 committed Nov 9, 2023
    Commit cf2bc63
  42. fix

    flybird11111 committed Nov 9, 2023
    Commit 6c85a9e
  43. fix

    flybird11111 committed Nov 9, 2023
    Commit 8dd4b41
  44. revert

    flybird11111 committed Nov 9, 2023
    Commit 3d8319e
  45. modify tp gather method

    flybird11111 committed Nov 9, 2023
    Commit 66ffed5
  46. fix test

    flybird11111 committed Nov 9, 2023
    Commit c40c459
  47. Commit bc575a2