adding default inductor config settings #423

Merged: 13 commits, Jun 25, 2024

Commits on Jun 25, 2024

  1. adding default inductor config settings

    Summary:
    
    Make the autoquant and quantize APIs call a new
    recommended_inductor_config_setter util to set the recommended inductor configs.
    
    Also update groupsize -> group_size in generate.py.
    
    Test Plan:
    
    sh benchmarks.sh
    
    comparison of different config combinations for matmul precision,
    mixed_mm and coordinate_descent
    
    tok/s=  9.14, mem/s=  60.55 GB/s, peak_mem= 8.33 GB, model_size= 6.62 GB quant: int8dq, mod: Llama-2-7b-chat-hf,
    tok/s=147.02, mem/s= 973.53 GB/s, peak_mem= 8.95 GB, model_size= 6.62 GB quant: int8wo, mod: Llama-2-7b-chat-hf,
    tok/s=  9.23, mem/s=  61.11 GB/s, peak_mem= 8.33 GB, model_size= 6.62 GB quant: int8dq, mod: Llama-2-7b-chat-hf,
    tok/s=139.59, mem/s= 924.33 GB/s, peak_mem= 8.95 GB, model_size= 6.62 GB quant: int8wo, mod: Llama-2-7b-chat-hf,
    tok/s=  9.10, mem/s=  60.26 GB/s, peak_mem= 8.33 GB, model_size= 6.62 GB quant: int8dq, mod: Llama-2-7b-chat-hf,
    tok/s=146.98, mem/s= 973.23 GB/s, peak_mem= 8.95 GB, model_size= 6.62 GB quant: int8wo, mod: Llama-2-7b-chat-hf,
    tok/s=  9.28, mem/s=  61.48 GB/s, peak_mem= 8.33 GB, model_size= 6.62 GB quant: int8dq, mod: Llama-2-7b-chat-hf,
    tok/s=146.90, mem/s= 972.73 GB/s, peak_mem= 8.95 GB, model_size= 6.62 GB quant: int8wo, mod: Llama-2-7b-chat-hf,
    tok/s=  9.08, mem/s=  60.09 GB/s, peak_mem= 8.33 GB, model_size= 6.62 GB quant: int8dq, mod: Llama-2-7b-chat-hf,
    tok/s=137.58, mem/s= 911.00 GB/s, peak_mem= 8.95 GB, model_size= 6.62 GB quant: int8wo, mod: Llama-2-7b-chat-hf,
    tok/s=  9.19, mem/s=  60.87 GB/s, peak_mem= 8.61 GB, model_size= 6.62 GB quant: int8dq, mod: Llama-2-7b-chat-hf,
    tok/s=166.02, mem/s=1099.30 GB/s, peak_mem= 8.97 GB, model_size= 6.62 GB quant: int8wo, mod: Llama-2-7b-chat-hf,
    
    HDCharles committed Jun 25, 2024
    0e5fc3e
  2. fixing tests

    HDCharles committed Jun 25, 2024
    1421de8
  3. fix weight only failures

    HDCharles committed Jun 25, 2024
    c9932ce
  4. fixing new broken test

    HDCharles committed Jun 25, 2024
    18cd1aa
  5. fixing autoquant test

    HDCharles committed Jun 25, 2024
    ee5183f
  6. testing if inductor config is the issue

    HDCharles committed Jun 25, 2024
    d0e822b
  7. are inductor configs somehow being set?

    HDCharles committed Jun 25, 2024
    a5654f3
  8. when is coordinate descent tuning being enabled?

    HDCharles committed Jun 25, 2024
    26d9110
  9. reset inductor config for tests

    HDCharles committed Jun 25, 2024
    762ef41
  10. more test fixes

    HDCharles committed Jun 25, 2024
    b0c4e23
  11. adding warning

    HDCharles committed Jun 25, 2024
    8fadc39
  12. handling of errors

    HDCharles committed Jun 25, 2024
    3c2825e
  13. option to suppress autoquant errors

    HDCharles committed Jun 25, 2024
    d105072
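
Commit 1 describes a recommended_inductor_config_setter util, and commit 9 resets the inductor config for tests. The sketch below shows one way such a setter could be paired with a restore step so that tests do not leak settings into each other. It is not the code from this PR: the helper names (apply_recommended, recommended_settings) are hypothetical, the attribute names coordinate_descent_tuning and use_mixed_mm are assumptions based on the "mixed_mm" and "coordinate_descent" mentioned in the test plan, the values are placeholders, and a SimpleNamespace stands in for the real inductor config object so the example runs without PyTorch.

```python
from contextlib import contextmanager
from types import SimpleNamespace

# Hypothetical defaults; the values actually chosen in this PR are not shown here.
RECOMMENDED = {
    "coordinate_descent_tuning": True,
    "use_mixed_mm": True,
}


def apply_recommended(config, recommended=RECOMMENDED):
    """Set each recommended attribute on `config` and return the previous values."""
    previous = {}
    for name, value in recommended.items():
        previous[name] = getattr(config, name)
        setattr(config, name, value)
    return previous


@contextmanager
def recommended_settings(config, recommended=RECOMMENDED):
    """Apply the recommended settings, then restore the old values on exit."""
    previous = apply_recommended(config, recommended)
    try:
        yield config
    finally:
        for name, value in previous.items():
            setattr(config, name, value)


# Stand-in for the real config object (assumption: attributes are plain settable fields).
fake_config = SimpleNamespace(coordinate_descent_tuning=False, use_mixed_mm=False)

with recommended_settings(fake_config):
    inside = (fake_config.coordinate_descent_tuning, fake_config.use_mixed_mm)
after = (fake_config.coordinate_descent_tuning, fake_config.use_mixed_mm)
```

Wrapping the setter in a context manager is one design option. A test suite could equally call a reset helper in its setup or teardown hooks, which is closer to what the title of commit 9 suggests.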