Pull requests: sgl-project/sglang
Pull requests list
[Performance, Triton] Optimize over mask compute to tl.load in fused_moe_kernel
#1980
opened Nov 10, 2024 by
HaiShaw
Offline LLM Engine Benchmark Throughput
await-response
#1968
opened Nov 9, 2024 by
zolinthecow
3 tasks done
[rust] cache-aware DP - approx tree
high priority
#1934
opened Nov 6, 2024 by
ByronHsu
3 tasks done
[Draft] Add Tensor Parallel to torch_native_llama
await-response
#1876
opened Nov 2, 2024 by
kwen2501
3 tasks
Support kv cache int8/int4 for triton backend
high priority
#1644
opened Oct 12, 2024 by
yuguo-Jack