Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add jit_stack_size option #782

Closed
wants to merge 42 commits into from
Closed

Conversation

alubbe
Copy link
Contributor

@alubbe alubbe commented May 31, 2016

This is a quick suggestion on how a jit_stack_size option could be implemented. I lack the actual C skills to pull this off, and I would first like to discuss the direction of this approach.

Allocating and freeing jit_stacks for each regex is too costly according to http://www.pcre.org/original/doc/html/pcrejit.html#SEC9, and it seems safe to reuse the same stack if the code runs single-threaded according to that same FAQ.

My idea would thus be to have a "global" jit_stack, initially set to 32K (the current default), and to have a lua-land function akin to this

function ngx.re.opt(option, value)
  if option == "jit_stack_size" then
    ngx_http_lua_ffi_set_jit_stack_size(value)
  end
end
void 
ngx_http_lua_ffi_set_jit_stack_size(int size)
{
  pcre_jit_stack_free(jit_stack);
  int min_size = MIN(32 * 1024, size);
  jit_stack = pcre_jit_stack_alloc(min_size, size);
}

Every time a regex is compiled in ngx_http_lua_regex.c, we just add

pcre_assign_jit_stack(sd, NULL, jit_stack);

What remains to be discussed is where jit_stack lives. It could be a mutable C global, it could be part of the Lua state or we could instantiate it on every compile, which would be more expensive, but assuming that most users combine "j" with "o", it might not make a dent after all.

Feedback is most welcome.

@agentzh
Copy link
Member

agentzh commented May 31, 2016

@alubbe The direction is correct, I think.

The global jit_stack should live in the ngx_http_lua_main_conf_t struct. The C globals are not safe enough for HUP reload.

@alubbe
Copy link
Contributor Author

alubbe commented May 31, 2016

Updated this PR and added openresty/lua-resty-core#44.
Where should jit_stack be instantiated intially? In the C code or from lua?

@agentzh
Copy link
Member

agentzh commented May 31, 2016

@alubbe It should be initialized in the C function ngx_http_lua_create_main_conf, I guess?

@alubbe
Copy link
Contributor Author

alubbe commented Jun 1, 2016

Added. I feel like that 32 * 1024 magic number should live somewhere, something like MIN_JIT_STACK_SIZE. Where would its place be?

Also, aside from adding tests, together with openresty/lua-resty-core#44 is there anything missing?

@@ -798,6 +798,7 @@ ngx_http_lua_create_main_conf(ngx_conf_t *cf)
#if (NGX_PCRE)
lmcf->regex_cache_max_entries = NGX_CONF_UNSET;
lmcf->regex_match_limit = NGX_CONF_UNSET;
lmcf->jit_stack = pcre_jit_stack_alloc(32*1024, 32*1024);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also check that this is not NULL. What's the idiomatic way of throwing errors here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alubbe Just return NULL here when the allocation fails.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alubbe Also, we should guard this piece with macros when PCRE is built without JIT or an older version of PCRE is used.

@agentzh
Copy link
Member

agentzh commented Jun 1, 2016

@alubbe So the new API won't work if lua-resty-core is not used?

@agentzh
Copy link
Member

agentzh commented Jun 1, 2016

@alubbe Your PR is currently failing Travis CI checks BTW. It does not even compile.

@alubbe
Copy link
Contributor Author

alubbe commented Jun 2, 2016

I fixed the build, added some guards and extracted the magic number.
Is there a way to share the #define parts between the files, in particular LUA_HAVE_PCRE_JIT and NGX_LUA_RE_MIN_JIT_STACK_SIZE?

@alubbe
Copy link
Contributor Author

alubbe commented Jun 2, 2016

Tbh, this PR is as much a discussion as it is an implementation attempt. This feature is probably simple enough to be brought into the non-resty-core world via the lua5.1-c api, but I have exactly zero experience with that.

@alubbe alubbe changed the title WIP: Add jit_stack_size option Add jit_stack_size option Jun 2, 2016
@alubbe
Copy link
Contributor Author

alubbe commented Jun 3, 2016

For tests, we could use something like this (thanks to JuliaLang/julia#8278)

ngx.re.opt("jit_stack_size", 128 * 1024) -- the match below will fail without this
local subject = [[71.163.72.113 - - [30/Jul/2014:16:40:55 -0700] "GET example.com/thevacantwall/wp-content/uploads/2013/02/DSC_006421.jpg HTTP/1.1" 200 492513 "http://images.search.yahoo.com/images/view;_ylt=AwrB8py9gdlTGEwADcSjzbkF;_ylu=X3oDMTI2cGZrZTA5BHNlYwNmcC1leHAEc2xrA2V4cARvaWQDNTA3NTRiMzYzY2E5OTEwNjBiMjc2YWJhMjkxMTEzY2MEZ3BvcwM0BGl0A2Jpbmc-?back=http%3A%2F%2Fus.yhs4.search.yahoo.com%2Fyhs%2Fsearch%3Fei%3DUTF-8%26p%3Dapartheid%2Bwall%2Bin%2Bpalestine%26type%3Dgrvydef%26param1%3D1%26param2%3Dsid%253Db01676f9c26355f014f8a9db87545d61%2526b%253DChrome%2526ip%253D71.163.72.113%2526p%253Dgroovorio%2526x%253DAC811262A746D3CD%2526dt%253DS940%2526f%253D7%2526a%253Dgrv_tuto1_14_30%26hsimp%3Dyhs-fullyhosted_003%26hspart%3Dironsource&w=588&h=387&imgurl=occupiedpalestine.files.wordpress.com%2F2012%2F08%2F5-peeking-through-the-wall.jpg%3Fw%3D588%26h%3D387&rurl=http%3A%2F%2Fwww.stopdebezetting.com%2Fwereldpers%2Fcompare-the-berlin-wall-vs-israel-s-apartheid-wall-in-palestine.html&size=49.0KB&name=...+%3Cb%3EApartheid+wall+in+Palestine%3C%2Fb%3E...+%7C+Or+you+go+peeking+through+the+%3Cb%3Ewall%3C%2Fb%3E&p=apartheid+wall+in+palestine&oid=50754b363ca991060b276aba291113cc&fr2=&fr=&tt=...+%3Cb%3EApartheid+wall+in+Palestine%3C%2Fb%3E...+%7C+Or+you+go+peeking+through+the+%3Cb%3Ewall%3C%2Fb%3E&b=0&ni=21&no=4&ts=&tab=organic&sigr=13evdtqdq&sigb=19k7nsjvb&sigi=12o2la1db&sigt=12lia2m0j&sign=12lia2m0j&.crumb=.yUtKgFI6DE&hsimp=yhs-fullyhosted_003&hspart=ironsource" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36]]
local regex = [[([\d\.]+) ([\w.-]+) ([\w.-]+) (\[.+\]) "([^"\r\n]*|[^"\r\n\[]*\[.+\][^"]+|[^"\r\n]+.[^"]+)" (\d{3}) (\d+|-) ("(?:[^"]|\")+)"? ("(?:[^"]|\")+)"?]]
local matches, err = ngx.re.match(subject, regex, "j")
return ngx.say(matches and matches[0] or err)

As of right now, it segfaults, but my C knowledge is so extremely limited that I don't even know how retrieve a meaningful error message. But I feel that we are very close.

@alubbe
Copy link
Contributor Author

alubbe commented Jun 10, 2016

I'd love to finish this PR, but I'll need a little bit of help on why it segfaults and how to best write the test that I put together.

@alubbe
Copy link
Contributor Author

alubbe commented Jun 11, 2016

Okay so one large part of the reason it segfaulted was that I was not using the new jit_stack in the ffi functions, but the larger problem is https://github.com/openresty/lua-nginx-module/pull/782/files#diff-5a62a11d0a2e8e55e4c5a7feaac0edd1R2502
This call to free the previous stack fails - I cannot figure out why. If I comment it out (and accept leaking 32kb of memory, assuming you only call the new helper once during worker init or similar) everything works great and the test that would be failing (#782 (comment)) passes. You can also reduce the jit_stack_size down to 32kb again and make the test case fail again.

Hence, we need to figure out why pcre_jit_stack_free(lmcf->jit_stack); fails, wrap my tests into the official suite and maybe add a non-jit helper function to set a new jit_stack. I definitely need help with this ;)

@alubbe
Copy link
Contributor Author

alubbe commented Jun 11, 2016

Through some trial and error, we now have a working ffi and non-ffi implementation as well as new and passing tests - with one big caveat: this line segfaults everything https://github.com/openresty/lua-nginx-module/pull/782/files#diff-5a62a11d0a2e8e55e4c5a7feaac0edd1R1940

If you could point me in the direction of how to figure this one out, I can remove that final roadblock to finishing up the PR.

@alubbe
Copy link
Contributor Author

alubbe commented Jun 11, 2016

Just a thought: depending on what you prefer, we could also merge the two test cases into one, thereby reusing the subject and regex variables.

# define LUA_HAVE_PCRE_JIT 0
#endif

#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These macros are already defined in common.h, so why the duplication here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I have absolutely no idea how C works ;) Will remove them!

@agentzh
Copy link
Member

agentzh commented Jun 14, 2016

@alubbe I prefer separate smaller test cases which are easier to debug. I don't really care about duplication in the tests :)

Thanks for your hard work on this PR. Will you look into my latest comments? Thanks!

@alubbe
Copy link
Contributor Author

alubbe commented Apr 14, 2017

Test failure seems to be an unrelated flaky test. Could someone trigger a re-run?

@alubbe
Copy link
Contributor Author

alubbe commented Apr 16, 2017

Perfect, both PRs are rebased and green. Any remaining open points @agentzh ?

@agentzh
Copy link
Member

agentzh commented Apr 19, 2017

@alubbe I think it's mostly fine now. Will get to this as soon as I can manage. Sorry, have more urgent things to take care of atm.

@agentzh
Copy link
Member

agentzh commented May 5, 2017

@alubbe I'm terribly sorry for the long delay on my side! Merged with the following extra patch:

diff --git a/src/ngx_http_lua_module.c b/src/ngx_http_lua_module.c
index 72f934d0..7b17ac41 100644
--- a/src/ngx_http_lua_module.c
+++ b/src/ngx_http_lua_module.c
@@ -814,6 +814,7 @@ ngx_http_lua_create_main_conf(ngx_conf_t *cf)
      *      lmcf->running_timers = 0;
      *      lmcf->watcher = NULL;
      *      lmcf->regex_cache_entries = 0;
+     *      lmcf->jit_stack = NULL;
      *      lmcf->shm_zones = NULL;
      *      lmcf->init_handler = NULL;
      *      lmcf->init_src = { 0, NULL };
diff --git a/src/ngx_http_lua_regex.c b/src/ngx_http_lua_regex.c
index 20a45cdc..80519ecd 100644
--- a/src/ngx_http_lua_regex.c
+++ b/src/ngx_http_lua_regex.c
@@ -1926,7 +1926,7 @@ error:

 ngx_int_t
 ngx_http_lua_ffi_set_jit_stack_size(int size, u_char *errstr,
-                                    size_t *errstr_size)
+    size_t *errstr_size)
 {
 #if LUA_HAVE_PCRE_JIT

diff --git a/src/ngx_http_lua_regex.h b/src/ngx_http_lua_regex.h
index 9752827d..03dffb80 100644
--- a/src/ngx_http_lua_regex.h
+++ b/src/ngx_http_lua_regex.h
@@ -15,7 +15,7 @@
 #if (NGX_PCRE)
 void ngx_http_lua_inject_regex_api(lua_State *L);
 ngx_int_t ngx_http_lua_ffi_set_jit_stack_size(int size, u_char *errstr,
-                                              size_t *errstr_size);
+    size_t *errstr_size);
 #endif

@agentzh agentzh closed this May 5, 2017
@alubbe
Copy link
Contributor Author

alubbe commented May 5, 2017

Perfect, it's been a long journey :D
Thanks again for all of your input and patience with me!

@alubbe alubbe deleted the pcre_jit_stack_size branch May 5, 2017 09:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants