Lighter session timeout configuration. #1042

Closed
mshkalim opened this issue Jun 2, 2024 · 4 comments

Comments

@mshkalim

mshkalim commented Jun 2, 2024

Hi,
We are using Z2JH with sparkmagic kernel and Lighter (0.1.1) to create Spark sessions.
I ran into an issue when configuring Lighter to automatically kill a session after X time while the session is idle (not processing data).
I found two environment variables that control this, but the behavior was not what I expected. I think the docs are missing some information, or maybe there is a bug in Lighter.

  1. LIGHTER_SESSION_TIMEOUT_INTERVAL - according to the docs, this configuration represents the session lifetime from the last statement creation.
    I configured it to 2m for testing purposes and found that the session was killed well over 2 minutes (about 5 minutes) after the last statement finished processing, so I'm not sure what to expect here or how to measure it.

  2. LIGHTER_SESSION_TIMEOUT_ACTIVE - as I understand it, this configuration covers the case where a statement is waiting (pending?) in a queue while another statement is being processed. When I set the timeout interval to 2m (again, for testing purposes), Lighter killed the session regardless of whether timeout active was "true" or "false".

Can someone please explain a bit more about these configurations?
For example:

  • Are these configurations tied to each other or not?
  • Does timeout active enable the timeout interval feature?
  • The timeout interval docs say "representing session lifetime (from last statement creation)", but the "session lifetime" already includes the "last statement creation time", so what does this mean?

Thank you

@pdambrauskas
Collaborator

Hey, yes, these configurations are related

LIGHTER_SESSION_TIMEOUT_INTERVAL is used for killing "forgotten" sessions: Lighter checks when the last statement was created, and if that was longer ago than the configured amount of time, Lighter kills the session.

LIGHTER_SESSION_TIMEOUT_ACTIVE - prevents killing sessions if there are some uncompleted statements.

LIGHTER_SESSION_TIMEOUT_ACTIVE has no effect if LIGHTER_SESSION_TIMEOUT_INTERVAL is set to zero.

Regarding your 2m configuration: Lighter runs the process that kills timed-out sessions every 10 minutes. That's why your session got killed later than configured.
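
To illustrate the timing, here is a rough sketch (not Lighter's actual code) of when a periodic check would first notice a timed-out session. The function name, the sweep start time, and the boundary handling are all assumptions made for illustration; only the 10-minute period and the "time since last statement was created" rule come from the explanation above.

```python
from datetime import datetime, timedelta


def first_killing_sweep(last_statement_at, timeout, sweep_start, sweep_period):
    """Return the time of the first periodic sweep at or after the timeout deadline.

    Hypothetical helper: assumes sweeps fire at sweep_start, sweep_start + period, ...
    and that a session is eligible once `timeout` has passed since its last
    statement was created. Exact boundary behavior is an assumption.
    """
    deadline = last_statement_at + timeout
    if deadline <= sweep_start:
        return sweep_start
    elapsed = deadline - sweep_start
    # Ceiling division: number of whole sweep periods needed to reach the deadline.
    ticks = -(-elapsed // sweep_period)
    return sweep_start + ticks * sweep_period


# Example: sweeps at 12:00, 12:10, ...; last statement created at 12:01; timeout 2m.
last_statement = datetime(2024, 6, 2, 12, 1)
killed_at = first_killing_sweep(
    last_statement,
    timedelta(minutes=2),
    datetime(2024, 6, 2, 12, 0),
    timedelta(minutes=10),
)
print(killed_at - last_statement)  # 0:09:00
```

Under these assumptions, a timeout shorter than the sweep period can be delayed by up to almost one full period, depending on where the last statement falls relative to the sweep schedule.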

@mshkalim
Author

mshkalim commented Jun 3, 2024

Hi,
Thank you for your reply.

So if I understand how things work now, LIGHTER_SESSION_TIMEOUT_INTERVAL kills 'forgotten' sessions when their lifetime exceeds the configured timeout interval. A separate process that checks whether sessions have reached the timeout interval runs every 10m, which is why the timeout did not take effect after 2m. In addition, setting LIGHTER_SESSION_TIMEOUT_ACTIVE to true changes the behavior of the session killer process: it will not kill a session that has running statements, even if the session has reached the timeout interval.

If so, I can tell you that Lighter killed my session while one statement (a Spark job) was still running, regardless of whether LIGHTER_SESSION_TIMEOUT_ACTIVE was set to true or false, with LIGHTER_SESSION_TIMEOUT_INTERVAL set to 2m.

What could be wrong there?

BTW 1, I would like to know what you mean by 'forgotten' sessions: are they forgotten by us, or by Lighter?
BTW 2, I didn't find anything in the docs about the process that runs every 10m. I think it would be better to mention it.

@pdambrauskas
Collaborator

> If so, I can tell you that Lighter killed my session while one statement (a Spark job) was still running, regardless of whether LIGHTER_SESSION_TIMEOUT_ACTIVE was set to true or false, with LIGHTER_SESSION_TIMEOUT_INTERVAL set to 2m.

I've double-checked the code, it works as follows:

  • If LIGHTER_SESSION_TIMEOUT_INTERVAL=0 - it does not kill sessions at all.
  • If LIGHTER_SESSION_TIMEOUT_INTERVAL=10m or any other positive value - it kills a session 10m after the session's last statement was created, but only if that last statement has completed.
  • If LIGHTER_SESSION_TIMEOUT_INTERVAL=10m and LIGHTER_SESSION_TIMEOUT_ACTIVE=true - it kills the session 10m after the session's last statement was submitted, even if that last statement is still running.
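
The three rules above can be written as a small decision function. This is only a sketch of the described behavior, not the actual Lighter code; the function and parameter names are made up for illustration, and it does not cover sessions that have no statements at all.

```python
from datetime import datetime, timedelta


def should_kill(now, last_statement_created_at, last_statement_completed,
                timeout_interval, timeout_active):
    """Hypothetical model of the timeout rules listed above."""
    # Rule 1: an interval of zero disables killing entirely.
    if timeout_interval <= timedelta(0):
        return False
    # The session is only a candidate once the interval has passed
    # since its last statement was created.
    if now - last_statement_created_at <= timeout_interval:
        return False
    # Rule 2: kill if the last statement has completed.
    # Rule 3: with timeout_active=True, kill even if it is still running.
    return last_statement_completed or timeout_active


now = datetime(2024, 6, 3, 12, 0)
created = now - timedelta(minutes=15)
ten_min = timedelta(minutes=10)

print(should_kill(now, created, True, ten_min, False))   # True
print(should_kill(now, created, False, ten_min, False))  # False
print(should_kill(now, created, False, ten_min, True))   # True
print(should_kill(now, created, True, timedelta(0), True))  # False
```

Note that this check would only be evaluated when the periodic process runs, so the actual kill can happen later than the configured interval.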

Do you suspect it works differently for you? Can you see the "Killing because of timeout" log line in the Lighter logs?

> BTW 1, I would like to know what you mean by 'forgotten' sessions: are they forgotten by us, or by Lighter?

I mean forgotten by the user. Lighter should not forget about your sessions.

> BTW 2, I didn't find anything in the docs about the process that runs every 10m. I think it would be better to mention it.

Yes, we'll update it.

@mshkalim
Author

mshkalim commented Jul 1, 2024

After setting LIGHTER_SESSION_TIMEOUT_INTERVAL to a value greater than 10m and LIGHTER_SESSION_TIMEOUT_ACTIVE to false, I got it to work.

But just to let you know, when LIGHTER_SESSION_TIMEOUT_INTERVAL is set to a value less than 10m, it does not work as expected.

Thank you :)

mshkalim closed this as completed Jul 1, 2024