Actions: huggingface/text-generation-inference

Server Tests

2,428 workflow runs


fix(server): llama v2 GPTQ
Server Tests #708: Pull request #648 synchronize by fxmarty
July 19, 2023 16:35 13m 11s fxmarty:fix-llama-gptq
fix(server): llama v2 GPTQ
Server Tests #707: Pull request #648 opened by fxmarty
July 19, 2023 16:33 2m 14s fxmarty:fix-llama-gptq
Add exllama GPTQ CUDA kernel support
Server Tests #705: Pull request #553 synchronize by fxmarty
July 19, 2023 14:59 13m 47s fxmarty:gptq-cuda-kernels
feat(router): ngrok edge
Server Tests #704: Pull request #642 opened by OlivierDehaene
July 19, 2023 09:59 23m 50s feat/ngrok_tunnel
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #703: Pull request #630 synchronize by OlivierDehaene
July 19, 2023 00:06 13m 25s feat/automatic_max
Directly load GPTBigCode to specified device
Server Tests #702: Pull request #618 reopened by Atry
July 18, 2023 23:51 10m 4s Atry:patch-8
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #701: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 23:50 9m 29s feat/automatic_max
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #700: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 23:26 14m 1s feat/automatic_max
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #699: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 23:16 10m 49s feat/automatic_max
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #698: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 23:12 3m 36s feat/automatic_max
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #697: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 22:42 9m 44s feat/automatic_max
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #696: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 22:18 10m 12s feat/automatic_max
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #695: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 17:43 11m 15s feat/automatic_max
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #694: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 17:29 12m 3s feat/automatic_max
fix(server): fix llamav2 config
Server Tests #693: Pull request #635 opened by OlivierDehaene
July 18, 2023 16:47 3m 15s hotfix/llamav2_conf
v0.9.3
Server Tests #692: Pull request #634 opened by OlivierDehaene
July 18, 2023 16:11 2m 57s v0.9.3
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #691: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 16:06 12m 59s feat/automatic_max
feat(server): add support for llamav2
Server Tests #690: Pull request #633 opened by Narsil
July 18, 2023 16:04 23m 14s llamav2_post
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #689: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 15:03 9m 39s feat/automatic_max
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #688: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 14:38 10m 45s feat/automatic_max
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #687: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 14:19 11m 31s feat/automatic_max
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #686: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 14:11 7m 56s feat/automatic_max
feat(server): flash attention v2
Server Tests #685: Pull request #624 synchronize by OlivierDehaene
July 18, 2023 13:29 13m 11s feat/flash_v2
feat(server): auto max_batch_total_tokens for flash att models
Server Tests #684: Pull request #630 synchronize by OlivierDehaene
July 18, 2023 10:46 10m 21s feat/automatic_max