πŸ› [Bug]: β‰ˆ500ms Performance penalty using compression on and 'br' in 'Accept-Encoding' #2722

Closed
the-hotmann opened this issue Nov 14, 2023 · 25 comments

@the-hotmann
Contributor

Bug Description

I usually set the compression level to 2 by default, but today I noticed a problem on Fiber v2.51.0.

This is the config I use:

app.Use(
	compress.New(compress.Config{
		Level: 2,
	}),
)

I tested a bit and came to this conclusion:

  1. direct call/access (about 5ms)
  2. Nginx Proxy call (β‰ˆ400-600ms)

A direct call is always fast, but as soon as I access a Fiber v2.51.0 page through the proxy I see TTFBs of at least 400-600ms.

This penalty only appears on the HTML part, not on any static assets.
Turning off compression solves the problem entirely.

So I had multiple thoughts:

  1. The user the Docker container runs as might not have permission to write the compressed file, making it slow - but I discarded this thought, as that would also have to affect the direct call, which it does not.
  2. I thought I might have a wrong ProxyHeader setting and did some tests there, as behaviour might differ behind NPM (Nginx Proxy Manager), but none of these tests led anywhere.

How to Reproduce

Already described above.

Expected Behavior

New features can come with a performance penalty, but 400-600ms does not seem acceptable.

Fiber Version

v2.51.0

Code Snippet (optional)

No response

Checklist:

  • I agree to follow Fiber's Code of Conduct.
  • I have checked for existing issues that describe my problem prior to opening this one.
  • I understand that improperly formatted bug reports may be closed without explanation.
@ReneWerner87
Member

There is actually nothing in the last release that could trigger this.
Can you go through the last few versions in your setup?

@the-hotmann
Contributor Author

I will test it with the previous 3 versions and report back.

@ReneWerner87
Member

Ok thx

@nickajacks1
Member

To diagnose where the issue is, you can add app.Use(logger.New(logger.ConfigDefault)) to see how long Fiber is taking.

package main

import (
	"github.com/gofiber/fiber/v2"
	"github.com/gofiber/fiber/v2/middleware/compress"
	"github.com/gofiber/fiber/v2/middleware/logger"
)

func main() {
	app := fiber.New()
	app.Use("/", logger.New(logger.ConfigDefault))
	app.Use("/", compress.New(compress.Config{Level: 2}))
	app.Static("/", "")
	app.Listen(":3030")
}

You will see logs that show approximately how long fiber took to process the request.

17:49:42 | 200 |    3.869307ms |       127.0.0.1 | GET     | /big.html       
17:49:45 | 200 |       75.42Β΅s |       127.0.0.1 | GET     | /big.html       
17:49:47 | 200 |      53.825Β΅s |       127.0.0.1 | GET     | /big.html       
17:49:47 | 200 |      42.185Β΅s |       127.0.0.1 | GET     | /big.html       

So if it's significantly different between versions of fiber, there's a problem

@the-hotmann
Contributor Author

I always use the logger middleware, but the delay was not showing up in its log - I can state that directly. Of course there could be a discrepancy between Nginx and Fiber, but I will test first and then provide all the info.

@the-hotmann
Contributor Author

Ok, I just performed some tests:

  • https://fibercompression-v2-49-0.hotmann.de
  • https://fibercompression-v2-49-1.hotmann.de
  • https://fibercompression-v2-49-2.hotmann.de
  • https://fibercompression-v2-50-0.hotmann.de
  • https://fibercompression-v2-51-0.hotmann.de

and could not reproduce the bug. But I just noticed that I also added a custom function to the template engine to render HTML unescaped - maybe this was causing the problem in combination with compression.

Here is the custom function I added to the engine:

engine := html.New("../templates", ".html")
engine.AddFunc(
	"unescape", func(s string) template.HTML {
		return template.HTML(s)
	},
)
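
For reference, the helper can also be exercised outside Fiber with html/template alone - a rough, self-contained sketch; the template string and the Body field are made up for illustration:

package main

import (
	"html/template"
	"os"
)

func main() {
	// Same helper as above: returning template.HTML tells html/template
	// not to escape the value when it is inserted.
	funcs := template.FuncMap{
		"unescape": func(s string) template.HTML {
			return template.HTML(s)
		},
	}

	tmpl := template.Must(template.New("demo").Funcs(funcs).Parse(`{{ unescape .Body }}`))

	// Prints the raw markup instead of the escaped &lt;b&gt; form.
	_ = tmpl.Execute(os.Stdout, map[string]string{"Body": "<b>raw html</b>"})
}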

I will try some more tests and update the Docker images afterwards, so the links above will then reflect the updated state.

@ReneWerner87
Member

@the-hotmann So it cannot be reproduced? Can we then close this issue until it is reproducible, and reopen it if necessary?

@the-hotmann
Contributor Author

I would like to keep it open for a few more hours, as I will retry to reproduce it.
I will come back with feedback, and if I am not able to reproduce it, I will close it.

Thanks

@ReneWerner87
Member

ok

@nickajacks1
Member

Actually, I found similar behavior with the snippet I shared above, but it occurs as far back as v2.40.0.
When I get the html page through a browser, there is a several hundred millisecond delay before the first byte:
[screenshot: chrome_4uqFbwpJeb]
If I refresh right away, there is no delay, but if I wait a few seconds and refresh, the delay occurs again.
The delay does not occur if I use curl.
Using Express, the delay occurs on the first connection, but never again.
The connection is happening from windows to WSL (fiber) in my case.
I don't know enough about what a browser does on an HTTP GET to know whether this is the same thing you were experiencing, but indeed the Initial Connection is very short when I connect to your example links.

@the-hotmann
Contributor Author

the-hotmann commented Nov 15, 2023

I did some more tests and could reproduce the error again.

Feel free to download all 5 versions/tags from this docker-image: https://hub.docker.com/r/h0tmann/go_fiber_test
and use this docker-compose:

version: '3'
services:

##########################
####### FIBER TEST #######
##########################

  go_fiber_2490:
    container_name: go_fiber_2490
    image: h0tmann/go_fiber_test:v2.49.0
    hostname: go_fiber_2490
    ports:
      - 8003:8003
    deploy:
      resources:
        limits:
          memory: 300M
    restart: unless-stopped

  go_fiber_2491:
    container_name: go_fiber_2491
    image: h0tmann/go_fiber_test:v2.49.1
    hostname: go_fiber_2491
    ports:
      - 8004:8003
    deploy:
      resources:
        limits:
          memory: 300M
    restart: unless-stopped

  go_fiber_2492:
    container_name: go_fiber_2492
    image: h0tmann/go_fiber_test:v2.49.2
    hostname: go_fiber_2492
    ports:
      - 8005:8003
    deploy:
      resources:
        limits:
          memory: 300M
    restart: unless-stopped

  go_fiber_2500:
    container_name: go_fiber_2500
    image: h0tmann/go_fiber_test:v2.50.0
    hostname: go_fiber_2500
    ports:
      - 8006:8003
    deploy:
      resources:
        limits:
          memory: 300M
    restart: unless-stopped

  go_fiber_2510:
    container_name: go_fiber_2510
    image: h0tmann/go_fiber_test:v2.51.0
    hostname: go_fiber_2510
    ports:
      - 8007:8003
    deploy:
      resources:
        limits:
          memory: 300M
    restart: unless-stopped

Now use Nginx/NPM to point a domain to each of the containers.

You will end up with two possible URLs to call the very same container:

  1. http://domain.tld
  2. 123.123.123.123:8003 (8003-8007)

DO NOT TEST BEHIND CLOUDFLARE!
Cloudflare (my public tests are proxied through Cloudflare) somehow makes the problem not show up.

My public Tests are hosted like this:

Client => Cloudflare => Nginx => Docker-Container

My internal Tests are tested like this:

Client => Nginx => Docker-Container (Here the Problem shows up!)

I am more than happy to provide you with the new URLs, but they point to IPs which I don't want to share publicly. So feel free to contact me privately or set up your own test case according to the instructions above.

I also tested on these versions:

  • v2.49.0
  • v2.49.1
  • v2.49.2
  • v2.50.0
  • v2.51.0

It happens on ALL of them.
BUT: Fiber itself reports times like 141.991Β΅s while the TTFB is somewhere around 0.5s.

@nickajacks1 thanks for looking into it. I personally have some guesses, but I will just dump the request headers on every request and print them into the HTML, then we can see what the difference is. I will rewrite my test cases and rebuild.
Some questions for you:

  1. If you proxy through Cloudflare, does the problem also go away for you?
  2. If you call the Golang/Fiber application directly (not through Nginx) does the problem also go away for you?

@ReneWerner87
Member

Fiber itself reports times like 141.991Β΅s while the TTFB is somewhere around 0.5s.

Can you rule out that it's the infrastructure you built around Fiber?

@ReneWerner87
Member

can you find out the version where it first appeared?

@the-hotmann
Contributor Author

Indeed I cannot - I have no proof that it is Fiber. But it completely disappears when turning off:

  1. either compression in Fiber
  2. or the custom function for the template

It could still be Nginx, but again, behind Nginx AND Cloudflare it does not appear. Currently it's a black box.

can you find out the version where it first appeared?

I will try to find out.

@the-hotmann
Contributor Author

the-hotmann commented Nov 15, 2023

Some additional Info:

For the first time (I even recorded it) the app (Fiber) crashed after many reloads:

13:24:46 | 200 |    3.560504ms |     127.10.0.10 | GET     | /
13:24:46 | 200 |      11.044Β΅s |     127.10.0.10 | GET     | /app.min.css
13:24:47 | 200 |    4.345086ms |     127.10.0.10 | GET     | /
13:24:47 | 200 |      10.703Β΅s |     127.10.0.10 | GET     | /app.min.css
13:24:47 | 200 |    4.368002ms |     127.10.0.10 | GET     | /
13:24:47 | 200 |      11.858Β΅s |     127.10.0.10 | GET     | /app.min.css
13:24:47 | 200 |    4.005441ms |     127.10.0.10 | GET     | /
13:24:47 | 200 |      14.102Β΅s |     127.10.0.10 | GET     | /app.min.css
13:24:48 | 200 |    4.434645ms |     127.10.0.10 | GET     | /
13:24:48 | 200 |      12.454Β΅s |     127.10.0.10 | GET     | /app.min.css
13:24:48 | 200 |    6.291476ms |     127.10.0.10 | GET     | /
13:24:48 | 200 |       16.13Β΅s |     127.10.0.10 | GET     | /app.min.css
13:24:49 | 200 |    5.200579ms |     127.10.0.10 | GET     | /
13:24:50 | 200 |      69.583Β΅s |     127.10.0.10 | GET     | /app.min.css
13:24:51 | 200 |    5.683539ms |     127.10.0.10 | GET     | /

DEBUG:
Host:          scratch
Kernel:        6.5.10-1
GoLang:        1.21.4
Fiber:         v2.51.0
Build date:    2023.11.15


 β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
 β”‚                 Fiber Test v1.0.3                 β”‚
 β”‚                   Fiber v2.51.0                   β”‚
 β”‚                 http://[::]:8003                  β”‚
 β”‚                                                   β”‚
 β”‚ Handlers ............. 5  Processes ........... 1 β”‚
 β”‚ Prefork ....... Disabled  PID ................. 1 β”‚
 β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

13:24:52 | 200 |    4.962311ms |     127.10.0.10 | GET     | /
13:24:53 | 200 |     962.536Β΅s |     127.10.0.10 | GET     | /app.min.css
13:24:54 | 200 |    4.472951ms |     127.10.0.10 | GET     | /
13:24:54 | 200 |      30.002Β΅s |     127.10.0.10 | GET     | /app.min.css
13:24:55 | 200 |    4.500065ms |     127.10.0.10 | GET     | /
13:24:56 | 200 |       7.734Β΅s |     127.10.0.10 | GET     | /app.min.css
13:24:56 | 200 |    4.019466ms |     127.10.0.10 | GET     | /
13:24:57 | 200 |       9.426Β΅s |     127.10.0.10 | GET     | /app.min.css
13:24:57 | 200 |    6.787522ms |     127.10.0.10 | GET     | /
13:24:58 | 200 |       9.033Β΅s |     127.10.0.10 | GET     | /app.min.css
13:24:58 | 200 |    5.446333ms |     127.10.0.10 | GET     | /
13:24:59 | 200 |      14.491Β΅s |     127.10.0.10 | GET     | /app.min.css
13:24:59 | 200 |    6.466976ms |     127.10.0.10 | GET     | /
13:25:00 | 200 |       9.368Β΅s |     127.10.0.10 | GET     | /app.min.css
13:25:01 | 200 |    3.629219ms |     127.10.0.10 | GET     | /
13:25:02 | 200 |       7.669Β΅s |     127.10.0.10 | GET     | /app.min.css

Here is the video:

fiber_compression_test.MP4

P.S.: it crashed because it ran OOM. When the problem occurs (the high TTFBs), every reload consumes about 30MB of RAM, which will not be cleared by the GC for a long time (if ever).
Since I set the RAM limit in my docker-compose to 300MB - which is more than plenty - the app crashed after some reloads.
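
If anyone wants to watch that heap growth directly, a hypothetical diagnostic route (not part of the test image, just a sketch) could expose Go's runtime memory stats:

package main

import (
	"runtime"

	"github.com/gofiber/fiber/v2"
)

func main() {
	app := fiber.New()

	// Hypothetical helper route: returns current heap usage so reloads of "/"
	// can be correlated with memory growth between GC cycles.
	app.Get("/memstats", func(c *fiber.Ctx) error {
		var m runtime.MemStats
		runtime.ReadMemStats(&m)
		return c.JSON(fiber.Map{
			"heap_alloc_mb": m.HeapAlloc / 1024 / 1024,
			"sys_mb":        m.Sys / 1024 / 1024,
			"num_gc":        m.NumGC,
		})
	})

	app.Listen(":8003")
}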

@the-hotmann
Contributor Author

the-hotmann commented Nov 15, 2023

Tested down to v2.0.0 - it even happened there.
I had to remove a lot of the config, but the error was reproducible.

Maybe it is not the app itself but the template package? I will try to downgrade that.
I can't use "github.com/gofiber/template/html" anymore; it throws this error:

github.com/gofiber/template/html: module github.com/gofiber/template@latest found (v1.8.2), but does not contain package github.com/gofiber/template/html

@the-hotmann
Contributor Author

the-hotmann commented Nov 15, 2023

I added the functionality to print out the request headers.
Here are the two sets of headers:

Fast/Direct:

Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Accept-Encoding: gzip, deflate
Accept-Language: de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7
Cache-Control: max-age=0
Connection: keep-alive
Host: ###CENSORED###:8007
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36

Slow/Nginx:

Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Accept-Encoding: gzip, deflate, br
Accept-Language: de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7
Cache-Control: max-age=0
Connection: close
Cookie: __hstc=25561058.5ae6dd05059e40a9b3017761e5eb15e1.1699973432021.1699973432021.1699973432021.1; hubspotutk=5ae6dd05059e40a9b3017761e5eb15e1; _gcl_au=1.1.965372761.1699973432
Host: fibercompression-v2-51-0.###CENSORED###
Sec-Ch-Ua: "Google Chrome";v="119", "Chromium";v="119", "Not?A_Brand";v="24"
Sec-Ch-Ua-Mobile: ?0
Sec-Ch-Ua-Platform: "Windows"
Sec-Fetch-Dest: document
Sec-Fetch-Mode: navigate
Sec-Fetch-Site: none
Sec-Fetch-User: ?1
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36
X-Forwarded-For: ###CENSORED###
X-Forwarded-Proto: https
X-Forwarded-Scheme: https
X-Real-Ip: ###CENSORED###

Maybe it has something to do with the discrepancy in the Accept-Encoding header?
If Fiber sends back a brotli-encoded response, does Nginx have to re-encode it to gzip, since my browser somehow does not accept br on the direct call?

Also the

Connection: close

header is strange - it should actually keep the connection open.

This is the current full Fiber config:

	engine := html.New("../templates", ".html")
	engine.AddFunc(
		"unescape", func(s string) template.HTML {
			return template.HTML(s)
		},
	)

	app := fiber.New(fiber.Config{
		Views:                engine,
		CompressedFileSuffix: ".gz",
		//AppName:              "Fiber Test v1.0.3",
		//JSONEncoder:          json.Marshal,
		//JSONDecoder:          json.Unmarshal,
		//Network: "tcp",
	})

	app.Use(
		// Enable compression
		compress.New(compress.Config{
			Level: 2,
		}),

		// Enable logger
		logger.New(),
	)

	// Serve static files under /
	app.Static("/", "../static", fiber.Static{
		Compress: true,
	})

Some parts of the fiber.Config are commented out for backwards compatibility.

@the-hotmann
Contributor Author

the-hotmann commented Nov 15, 2023

With Cloudflare the problem does not occur, and the request headers through Cloudflare look like this:

Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Accept-Encoding: gzip
Accept-Language: de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7
Cdn-Loop: cloudflare
Cf-Connecting-Ip: ###CENSORED###
Cf-Ipcountry: DE
Cf-Ray: 826852535a24bb62-FRA
Cf-Visitor: {"scheme":"https"}
Connection: close
Host: fibercompression-v2-51-0.hotmann.de
Priority: u=0, i
Sec-Ch-Ua: "Google Chrome";v="119", "Chromium";v="119", "Not?A_Brand";v="24"
Sec-Ch-Ua-Mobile: ?0
Sec-Ch-Ua-Platform: "Windows"
Sec-Fetch-Dest: document
Sec-Fetch-Mode: navigate
Sec-Fetch-Site: none
Sec-Fetch-User: ?1
True-Client-Ip: ###CENSORED###
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36
X-Forwarded-For: ###CENSORED###, ###CENSORED###
X-Forwarded-Proto: https
X-Forwarded-Scheme: https
X-Real-Ip: ###CENSORED###

Here again the Accept-Encoding header contains no br:

Accept-Encoding: gzip

@ReneWerner87
Member

Can you try the same or something similar without Fiber? Have you also activated compression in Nginx for these routes? If so, I would deactivate one of the two.

Please try to narrow it down further

@the-hotmann
Contributor Author

the-hotmann commented Nov 15, 2023

I have to disagree.
I did some more testing with NOTHING but Fiber, and the situation is unambiguous & reproducible - it is Fiber.
Here is how to reproduce it:

Set up Fiber with:

	app.Use(
		// Enable compression
		compress.New(compress.Config{
			Level: 2,
		}),
	)
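
(For completeness, a minimal self-contained server for this repro could look like the sketch below - the large synthetic payload is an assumption on my part; any response of a few hundred KB will do.)

package main

import (
	"strings"

	"github.com/gofiber/fiber/v2"
	"github.com/gofiber/fiber/v2/middleware/compress"
)

func main() {
	app := fiber.New()

	app.Use(compress.New(compress.Config{
		Level: 2, // compress.LevelBestCompression
	}))

	// A few hundred KB of text, so the compression cost is clearly visible.
	body := strings.Repeat("some moderately compressible text ", 20000)

	app.Get("/", func(c *fiber.Ctx) error {
		c.Set(fiber.HeaderContentType, fiber.MIMETextHTMLCharsetUTF8)
		return c.SendString(body)
	})

	app.Listen(":8003")
}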

Start the app.

Now run these two curl requests directly and locally against your application:

1.)

curl -o /dev/null -s -w "TTFB: %{time_starttransfer}\n" http://127.0.0.1:8003 -H "Accept-Encoding: gzip, deflate"

2.) (Note: with br)

curl -o /dev/null -s -w "TTFB: %{time_starttransfer}\n" http://127.0.0.1:8003 -H "Accept-Encoding: gzip, deflate, br"

For me the results are:

1.)

$ curl -o /dev/null -s -w "TTFB: %{time_starttransfer}\n" http://127.0.0.1:8003 -H "Accept-Encoding: gzip, deflate"
TTFB: 0.006761

2.) (Note: with br)

$ curl -o /dev/null -s -w "TTFB: %{time_starttransfer}\n" http://127.0.0.1:8003 -H "Accept-Encoding: gzip, deflate, br"
TTFB: 0.545109

I have run this multiple times and it gives me the same result every time.
For me this is now proof that the problem is with Fiber and not with anything else in the infrastructure, since this is a direct request.
I also tested this on multiple servers - but there, too, the request is internal (127.0.0.1:8003).
This discovery aligns with the request headers shown above.

My analysis:
Fiber has problems when br is part of the Accept-Encoding request header and compression is turned on.

@the-hotmann the-hotmann changed the title πŸ› [Bug]: β‰ˆ500ms Performance penalty using compression level 2 on v2.51.0 πŸ› [Bug]: β‰ˆ500ms Performance penalty using compression on and 'br' in 'Accept-Encoding' Nov 15, 2023
@ReneWerner87
Member

Ok that's interesting

Maybe brotli just needs more time

By the way, we use a fasthttp middleware inside our compress middleware for handling the compression, which internally uses an additional lib for it.

So we need to find out which of these components is affected
Can you join Discord? It might be easier to work on this problem together there.
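
(Side note on the fasthttp layer mentioned above: the raw cost of the two codecs can be compared outside Fiber with a rough sketch like the following, using fasthttp's byte-level compression helpers; the payload is synthetic and the numbers will vary by machine.)

package main

import (
	"fmt"
	"strings"
	"time"

	"github.com/valyala/fasthttp"
)

func main() {
	payload := []byte(strings.Repeat("some moderately compressible text ", 20000))

	// Brotli at its best-compression setting (what high compress levels lean on).
	start := time.Now()
	br := fasthttp.AppendBrotliBytesLevel(nil, payload, fasthttp.CompressBrotliBestCompression)
	fmt.Printf("brotli best: %d -> %d bytes in %v\n", len(payload), len(br), time.Since(start))

	// Gzip at its best-compression setting, for comparison.
	start = time.Now()
	gz := fasthttp.AppendGzipBytesLevel(nil, payload, fasthttp.CompressBestCompression)
	fmt.Printf("gzip best:   %d -> %d bytes in %v\n", len(payload), len(gz), time.Since(start))
}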

@nickajacks1
Member

When I use br encoding on a 963KB file with level 2 compression on a bare fiber server on localhost, I get these results:

TTFB: 0.031775

real	0m2.076s
user	0m0.000s
sys	0m0.023s

The whole request takes forever, but TTFB isn't as bad. Deflate and gzip are also slowish but still much faster (~150ms real time)

But if I use level 1 compression with brotli, it's a lot faster

TTFB: 0.007094

real	0m0.041s
user	0m0.013s
sys	0m0.007s

@the-hotmann
Contributor Author

the-hotmann commented Nov 15, 2023

Can you join Discord, it might be easier to work on this problem together there

I am there :)

When I use br encoding on a 963KB file with level 2 compression on a bare fiber server on localhost, I get these results:

The fastest response from the Fiber server I get when I use compress as the Accept-Encoding:

curl -o /dev/null -s -w "TTFB: %{time_starttransfer}\n" http://127.0.0.1:8003 -H "Accept-Encoding: compress"

Here are my results with time:

br

$ time curl -o /dev/null -s -w "TTFB: %{time_starttransfer}\n" http://127.0.0.1:8003 -H "Accept-Encoding: br"
TTFB: 0.493420

real    0m0,521s
user    0m0,015s
sys     0m0,015s

compress

$ time curl -o /dev/null -s -w "TTFB: %{time_starttransfer}\n" http://127.0.0.1:8003 -H "Accept-Encoding: compress"
TTFB: 0.003681

real    0m0,022s
user    0m0,009s
sys     0m0,011s

@the-hotmann
Contributor Author

the-hotmann commented Nov 16, 2023

Via Discord the problem could be reproduced locally, but it seems that compression level 2 (brotli lvl9 LINK) is just very CPU-intensive and takes much longer than other compression techniques at their highest level.

So even if a single request takes about 0.5s, this is to be expected when using brotli at its highest level combined with some 100 KB of text.

@ReneWerner87 thank you for the clarification!
@nickajacks1 thanks for quickly reproducing the error and for participating.

P.S.: just don't use compression level 2 in the compress.Config ;)
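
(A minimal sketch of what that conclusion implies for the config - the choice of LevelBestSpeed is an assumption, LevelDefault would also avoid the penalty:)

package main

import (
	"github.com/gofiber/fiber/v2"
	"github.com/gofiber/fiber/v2/middleware/compress"
)

func main() {
	app := fiber.New()

	// Avoid compress.LevelBestCompression (2) for dynamic HTML: per the thread
	// above it maps to brotli's most expensive setting. LevelBestSpeed (1) or
	// LevelDefault (0) keep the TTFB low.
	app.Use(compress.New(compress.Config{
		Level: compress.LevelBestSpeed,
	}))

	app.Get("/", func(c *fiber.Ctx) error {
		return c.SendString("hello")
	})

	app.Listen(":8003")
}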
