(A) Search drop-down [incl 10 suggested article titles] (1) never appears, or (2) appears many seconds later, or (3) appears nearly instantly if you type BACKSPACE — or if you hit ENTER, ERR_CONNECTION_REFUSED is very common (B) SEARCH FAILS when search query is a Greek letter like "pi", "alpha", "beta" ETC #769

Closed
holta opened this issue May 10, 2022 · 23 comments

@holta

holta commented May 10, 2022

This irritating problem appears to affect all schools and everyone using very large ZIM files (on Raspberry Pis especially?)

Specifically, this severe UX glitch occurs very often with very large ZIM files like https://download.kiwix.org/zim/wikipedia/wikipedia_en_all_maxi_2021-12.zim

What happens is that users browse to type in a search query into the top-right (e.g. of http://box.lan/kiwix/wikipedia_en_all_maxi_2021-12/) but the search drop-down menu very often never appears.

Or the drop-down appears Many Seconds Later, for no obvious reason (zero CPU load, and many gigabytes of RAM available, even when no other users are using kiwix-serve).

The problem occurs with any version of kiwix-tools (kiwix-serve) from recent years (including the very latest 3.2.0-4).

The problem occurs regardless of whether Internet-in-a-Box (https://internet-in-a-box.org) is proxying kiwix-serve or kiwix-serve is being accessed directly over port 3000.

Very Strange Workarounds:

  • Typing BACKSPACE sometimes helps, to force the drop-down to finally appear, e.g. by shortening your search query from "AIDS" to "AID"
  • Typing "S" to restore your original search query ("AIDS") often then works, to force the drop-down to finally appear

While very useful, not every teacher and student can handle the above quirky workarounds :/

Does anybody have any idea what the root cause of this very common usability glitch might be? Apologies, I'm not sure exactly what the pattern is! The problem is intermittent and seemingly random, yet it occurs very consistently and commonly: it seems to affect all schools using very large ZIM files under these common conditions (on Raspberry Pi servers; very strangely, the delay does not seem to be affected by how recent or how old the Pi is).

BACKGROUND: This longstanding problem has been ongoing in recent years. I told schools that faster CPU's, more RAM and newer OS's would probably speed things up, such that this irritating delay (or the drop-down never appearing at all!) might effectively cease to be a problem.

I Was 100% Wrong: this confusing problem remains just as severe today in May 2022, even with massive amounts of RAM, fast SSD's instead of microSD cards, the fastest Raspberry Pi — and no matter whether using 32-bit or 64-bit Raspberry Pi OS Lite.

SIDE EFFECT: Many/most users end up doing Full-Text Search by accident, as they don't realize there's any other option: users often hit "Enter" after typing in a search query because the "10 suggestions" drop-down never appeared.

@mgautierfr
Member

mgautierfr commented May 11, 2022

The problem is probably I/O speed, not CPU or RAM size.

On wikipedia_en_all_maxi_2020-08.zim the title Xapian database is 2692521984 bytes, so about 2.5 GB.
From the Raspberry Pi 4 benchmarks, the USB storage throughput (with an external SSD) is about 364 MB/s, so you need at least 7 seconds to load it into RAM. This I/O is not visible (it does not show up as CPU load).
And if you use a Pi 3, the speed is about 38 MB/s, so about 67 s to load everything.
Raspberry Pi is not cheap for nothing.
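
As a rough check of the figures above, the following sketch divides the database size by the quoted throughput. It assumes the throughput figures are in MiB/s (2^20 bytes per second), since that is what reproduces the estimates of roughly 7 s and 67 s; the function name is made up for illustration, and the result is only a lower bound for a full sequential read.

```python
TITLE_DB_BYTES = 2_692_521_984  # title Xapian database size quoted above
MIB = 1024 * 1024               # assumption: throughput is quoted in MiB/s


def min_load_seconds(size_bytes, throughput_mib_per_s):
    """Lower bound on the time needed to read size_bytes from storage."""
    return size_bytes / (throughput_mib_per_s * MIB)


for label, speed in [("Pi 4 + USB SSD", 364), ("Pi 3", 38)]:
    print(f"{label}: about {min_load_seconds(TITLE_DB_BYTES, speed):.1f} s")
```

Real searches read only part of the database, so actual delays should be shorter than these bounds, but still dominated by storage I/O rather than CPU.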

Xapian doesn't load the whole database into memory. It loads a first part, starts interpreting it, loads another part, and so on. So the total time is probably less than 7 seconds, but you need several seconds just for the I/O. During this time, the frontend is waiting for an answer and does not show the dropdown. It appears many seconds later, when the search completes and the frontend has results to show.

Typing BACKSPACE sometimes helps, to force the drop-down to finally appear, e.g. by shortening your search query from "AIDS" to "AID"

Xapian somehow has to load the data for "AID" first in order to search (and load data) for "AIDS". So when you remove the S, the request for "AID" is faster, as the "AID" data is already loaded.
But the request for "AIDS" is not stopped on the server. Its result is just discarded by the frontend.

Typing "S" to restore your original search query ("AIDS") often then works, to force the drop-down to finally appear

You make another request for "AIDS". All the data is already loaded, so the request is fast.


What can we do?

Recent work on the libzim API and libkiwix caches the searchers and searches. This helps keep the parsed data in memory, but loading the "raw" data into RAM is not really affected. Even if we create a new searcher, we can assume that the kernel knows which pages are already loaded in RAM and doesn't load them again. And if memory is full (or for any other reason), the kernel will discard pages even if a searcher is cached on our side.

Stop using cheap hardware :) (Yes, I know this is not the answer you want to hear, but it is an answer anyway)

We can stop using the Xapian database for suggestions. We introduced the Xapian database here to get better results. But if users never see the results, they are not better.
But we would then have to use the title-sorted listing: only titles starting with the searched term, and case-sensitive (unless we add another, case-insensitive listing to ZIM files alongside listing/titleOrdered, see https://wiki.openzim.org/wiki/Search_indexes).
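
To illustrate what a lookup in a title-sorted listing implies, here is a minimal Python sketch of a prefix search by binary search over a sorted list. It is not libzim code: the function name and the sample titles are hypothetical, and plain code-point ordering is assumed.

```python
from bisect import bisect_left


def prefix_suggestions(sorted_titles, prefix, limit=10):
    """Return up to `limit` titles starting with `prefix` (case-sensitive)."""
    start = bisect_left(sorted_titles, prefix)  # first title >= prefix
    matches = []
    for title in sorted_titles[start:start + limit]:
        if not title.startswith(prefix):
            break  # titles sharing a prefix are contiguous in sorted order
        matches.append(title)
    return matches


titles = sorted(["Aid", "AIDS", "AID", "Aida", "pi", "Pi"])
print(prefix_suggestions(titles, "AID"))
print(prefix_suggestions(titles, "aid"))
```

The lookup touches only a small slice of the listing, which is why it would be cheap on slow storage. The cost is the one named above: "aid" finds nothing here because matching is case-sensitive and limited to prefixes.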

@holta
Author

holta commented May 11, 2022

On wikipedia_en_all_maxi_2020-08.zim the title xapian database is 2692521984 bytes so 2.5 GB.

@mgautierfr that's a really great explanation (series of explanations!)

Thank you.

Xapian's entire ~2.5 GB doesn't need to be loaded into memory as you say — but if indeed Xapian requires large subsets of that to be moved across the computer's internal bus(es) that would indeed explain a lot 🤔

(And on the bright side, the better the pattern is understood, the more schools can try to adapt to such realities.)

But we would have to use the title sorted listing. So only title starting by searched term. And case-sensitive (except if add another case-insensitive listing in zim files along with listing/titleOrdered https://wiki.openzim.org/wiki/Search_indexes)

Certainly the list of 10 suggested article titles will never be perfect — however this evolves year-by-year, as we know from things like kiwix/kiwix-tools#513 — and we will live with that, whatever's decided!

👍

@holta
Author

holta commented May 12, 2022

@mgautierfr this might be unrelated — but schools perceive this to be very much related:

Would you happen to know why about 30% of full-text searches (on Raspberry Pi, as described above, in essentially every school that uses them) fail completely, with the following message if using Chrome browser:

This site can’t be reached

box refused to connect.

Try:

Checking the connection
ERR_CONNECTION_REFUSED

Reloading the page a couple times often works, i.e. forcing a 2nd attempt, or a 3rd attempt.

This happens when there is no load on the Raspberry Pi 4 server, which has many GB of RAM available, and no other users during testing (which has reconfirmed this).

In any case the failures appear to occur very consistently about 30% of the time — at completely random intervals.

FYI this has been reconfirmed using unproxied URL's like the following example:

NOTE: during testing it's important to use a new search query (search string, a.k.a. search pattern) every time, to avoid caching of prior search results.

@kelson42
Collaborator

I propose to wait for the libkiwix 10.2.0 release and test again with it. If it still fails, then we will investigate to get a clear reproduction case.

@mgautierfr
Member

I don't know why the connection is refused, but there is another cause of slowdown.

By default, kiwix-serve uses only 4 threads to answer requests. If the thread pool is full, connections are accepted by the httpd library but immediately put in a "wait state" until a thread is freed.
If you start searches with "A", "AI", "AID", "AIDS", you have filled your thread pool and no request can be handled until a search completes.

I've just tested with several full-text searches (changing the pagination) on a ZIM file on a USB drive (for long I/O), and out of the 6 requests with only 2 threads available, I "succeeded" in getting one 500 error (to be investigated).
Maybe the 500 error is reported as ERR_CONNECTION_REFUSED by the proxy?
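
The thread-pool starvation described above can be imitated with a small, self-contained Python simulation. This is not kiwix-serve code: the 0.2 s "search" duration, the fifth query, and the function name are assumptions chosen only to make the queuing visible.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def simulate(queries, workers=4, search_seconds=0.2):
    """Return {query: seconds spent waiting for a free worker thread}."""
    start = time.monotonic()

    def handle(query):
        waited = time.monotonic() - start  # time queued before a worker was free
        time.sleep(search_seconds)         # stand-in for a slow, I/O-bound search
        return query, waited

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(handle, queries))


waits = simulate(["A", "AI", "AID", "AIDS", "AIDS in Africa"])
for query, waited in waits.items():
    print(f"{query!r} waited {waited:.2f} s for a thread")
```

The first four requests start almost immediately, while the fifth cannot start until one of them finishes. With real searches taking seconds instead of 0.2 s, every keystroke after the fourth would be delayed in the same way.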

@kelson42
Collaborator

@mgautierfr We should implement #395 to prevent one thread from being occupied for too long and to ensure a better distribution of thread usage.

@holta
Author

holta commented May 12, 2022

Thank you for the explanations & suggestions!

Quick questions below:

If you start searches with "A", "AI", "AID", "AIDS", you have filled your thread pool and no request can be handle until a search is completed.

Do you know if that applies when the user types in their search string very slowly — does this effectively launch 4 Xapian Title Searches — i.e. using up all 4 threads?

Search strings longer than 4 letters might use up all threads if so, causing serious resource starvation, if slow typing really does exacerbate these problems?

(e.g. Is it important not to pause between each letter while typing in a search string?)

I propose to wait the libkiwix 10.2.0 release and test again with it.

Good to know. Roughly when is that (likely) expected?

@kelson42
Collaborator

I propose to wait the libkiwix 10.2.0 release and test again with it.

Good to know. Roughly when is that (likely) expected?

I expect within a week.

@holta
Author

holta commented May 12, 2022

Just FYI all testing mentioned above was reconfirmed with kiwix-tools 3.2.0-4:

root@box:/opt/iiab/kiwix/bin# ./kiwix-serve --version
kiwix-tools 3.2.0

libkiwix 10.1.1
+ libzim 7.2.1
+ libxapian 1.4.18
+ libcurl 7.67.0
+ libmicrohttpd 0.9.72
+ libz 1.2.12
+ libicu 58.2.0
+ libpugixml 0.12.0

libzim 7.2.1
+ libzstd 1.5.2
+ liblzma 5.2.4
+ libxapian 1.4.18
+ libicu 58.2.0

@holta changed the title on May 12, 2022
Old title: Search drop-down [incl 10 suggested article titles] (1) never appears, or (2) appears many seconds later, or (3) appears nearly instantly if you type BACKSPACE
New title: Search drop-down [incl 10 suggested article titles] (1) never appears, or (2) appears many seconds later, or (3) appears nearly instantly if you type BACKSPACE — or if you hit ENTER, ERR_CONNECTION_REFUSED is very common
@kelson42
Collaborator

kelson42 commented Jun 26, 2022

@holta Would you please be able to provide an update with kiwix-tools 3.3.0? Does it work better?

@kelson42 kelson42 self-assigned this Jun 26, 2022
@holta
Author

holta commented Jun 26, 2022

For the moment I cannot reproduce the search drop-down's severe slowness on Raspberry Pi 4 with these 2 different versions of kiwix-tools. The first was kiwix-tools 3.2.0-1 from 2022-02-02...

# kiwix-serve --version

kiwix-tools 3.2.0

libkiwix 10.0.1
+ libzim 7.2.0
+ libxapian 1.4.18
+ libcurl 7.67.0
+ libmicrohttpd 0.9.72
+ libz 1.2.8
+ libicu 58.2.0
+ libpugixml 0.12.0

libzim 7.2.0
+ libzstd 1.5.1
+ liblzma 5.2.4
+ libxapian 1.4.18
+ libicu 58.2.0

And kiwix-tools 3.3.0 from 2022-06-15...

# kiwix-serve --version

kiwix-tools 3.3.0

libkiwix 11.0.0
+ libzim 7.2.2
+ libxapian 1.4.18
+ libcurl 7.67.0
+ libmicrohttpd 0.9.72
+ libz 1.2.12
+ libicu 58.2.0
+ libpugixml 0.12.0

libzim 7.2.2
+ libzstd 1.5.2
+ liblzma 5.2.4
+ libxapian 1.4.18
+ libicu 58.2.0

On the one hand this appears to be good news. On the other hand, I'd like to understand why I (and others) had so much trouble with kiwix-serve 3.2.0-4 back in early May. I'll try to do more tests in the coming days to see if this can be better understood.

FYI both above tests used http://box/kiwix/wikipedia_en_all_maxi_2021-12/

@kelson42
Collaborator

@holta Do you use the same hardware (rpi+sd) as a few months ago?

@holta
Author

holta commented Jun 26, 2022

@holta Do you use the same hardware (rpi+sd) as a few months ago?

Yes.

Strangely I also cannot reproduce the slowness when using kiwix-tools 3.2.0-4 just as in early May.

But I am using a different OS today: for the moment anyway I'm using the 32-bit Raspberry Pi OS on Raspberry Pi 4, whereas in early May I was using the 64-bit version of Raspberry Pi OS on Raspberry Pi 4.

So hypothetically the slowness flaw might be arising from using armhf builds of kiwix-tools on 64-bit Raspberry Pi OS??
(I'll investigate more in coming days.)

@kelson42
Collaborator

@holta Quite impatient to know more about your investigations :)

@holta
Author

holta commented Jun 27, 2022

@holta Quite impatient to know more about your investigations :)

  • Every kiwix-tools release after 3.2.0-1 has the problem i.e. all 5 releases {3.2.0-2, 3.2.0-3, 3.2.0-4, 3.2.0-5, 3.3.0}. In short, quickly typing in the search query "aids" fails to display the search dropdown about 80% of the time.
    • Whereas I could NOT reproduce the problem with kiwix-tools 3.2.0-1 from 2022-02-02, despite trying many times, with many reboots to verify. FYI 3.2.0-2 was released 2022-03-28 (I don't know how to further bisect, lacking the nightly builds from February and March).
  • The problem affects kids and schools severely because it occurs when connecting to the Raspberry Pi 4 over WiFi, no matter which WiFi driver/firmware (I tried 3 different recent WiFi drivers/firmwares from Raspberry Pi OS to be 100% sure).
    • The problem does NOT occur when connecting to the Raspberry Pi 4 over Ethernet (very eye-opening, but unfortunately very few schools use Ethernet, so this does not help them).
  • Sometimes the problem is even more severe, e.g. even search query "aid" will not generate a search dropdown.
    • Workaround: Type BACKSPACE one more time (to use search query "ai") which will almost always generate a dropdown.
    • Then type (restore) "d" to generate a dropdown for "aid".
    • Finally, type (restore) "s" to generate a dropdown for the original search query "aids".

The above symptoms are essentially/exactly as described in early May 2022, further up on this ticket (#769).

Here is an ADDITIONAL (RELATED?) ISSUE... that appears extremely similar: (but might have a different root cause?)

  • The search dropdown Almost Never Appears (no matter if WiFi or Ethernet connection to Raspberry Pi 4) if you type out Greek letters e.g. "alpha" "beta" "omicron" etc (the letter "mu" is the only exception, among all 24 Greek letters). To be clear: the search drop-down did not appear when I tried each and every one of the other 23 Greek letters.
    • EXAMPLE: Type out (spell out) the Greek letter "pi" (nothing will appear, no matter how long you wait), then add the letter "t" and the dropdown will usually appear very quickly (offering search options relating to "pit").
    • If it does not, add another letter or two (e.g. "pithy") to force the dropdown to appear.
    • Then back off, one letter at a time. Search query "pit" will now generate a dropdown.
    • BUT: search query "pi" will never generate a dropdown (or so it appears!)
  • This new (related?) issue appears 100% deterministic and should be much easier to diagnose as it occurs even over Ethernet, and is confirmed to also be a problem with the x86_64 version of kiwix-tools 3.3.0

RECAP / CLARIFICATIONS:

  • 32-bit Raspberry Pi OS is equivalent to 64-bit Raspberry Pi OS (with regard to both problems) so my suspicion there yesterday was wrong.
  • Opening a new tab is sufficient to test for both problem(s). (In other words: the kiwix-serve / kiwix-search machine does not need to be rebooted, nor does your browser cache need to be cleared.)
  • I tested everything with port 3000 (http://box:3000/kiwix/wikipedia_en_all_maxi_2021-12/) so as to guarantee 100% that Internet-in-a-Box's NGINX proxy was not involved in any way.

@kelson42
Collaborator

@mgautierfr Do you have all the material you need to try a reproduction case?

@kelson42
Collaborator

@holta The latest libzim/libkiwix/kiwix-tools are still not released, but we will have to tackle this in the coming months. Do you know at least whether this still appears with the latest nightly of kiwix-serve?

@holta
Author

holta commented Oct 24, 2022

quickly typing in the search query "aids" fails to display the search dropdown about 80% of the time

Quick Tests: I can't reproduce the above with kiwix-tools nightly build 2022-10-24, with the latest Raspberry Pi OS:

The drop-down for search query "AIDS" is slow to appear, but it appeared every time within about 5-10 seconds.

type out Greek letters e.g. "alpha" "beta" "omicron" etc (the letter "mu" is the only exception, among all 24 Greek letters). To be clear: the search drop-down did not appear

The above failure however DOES occur every time — EXAMPLE:

  • The search query "pi" will never work (as it's a Greek letter, try it!)
  • Adding or removing a letter (e.g. "p" or "pit") will however work.

@kelson42
Collaborator

@mgautierfr Might it be that this ticket is a duplicate of kiwix/kiwix-tools#573? For the latest "pi" issue, I strongly suspect a stopword or stemming step that fails. What do you think?

@holta changed the title on Oct 24, 2022
Old title: Search drop-down [incl 10 suggested article titles] (1) never appears, or (2) appears many seconds later, or (3) appears nearly instantly if you type BACKSPACE — or if you hit ENTER, ERR_CONNECTION_REFUSED is very common
New title: (1) Search drop-down [incl 10 suggested article titles] (1) never appears, or (2) appears many seconds later, or (3) appears nearly instantly if you type BACKSPACE — or if you hit ENTER, ERR_CONNECTION_REFUSED is very common (2) SEARCH FAILS when search query is a Greek letter like "pi", "alpha", beta" ETC
@holta changed the title on Oct 24, 2022
Old title: (1) Search drop-down [incl 10 suggested article titles] (1) never appears, or (2) appears many seconds later, or (3) appears nearly instantly if you type BACKSPACE — or if you hit ENTER, ERR_CONNECTION_REFUSED is very common (2) SEARCH FAILS when search query is a Greek letter like "pi", "alpha", beta" ETC
New title: (A) Search drop-down [incl 10 suggested article titles] (1) never appears, or (2) appears many seconds later, or (3) appears nearly instantly if you type BACKSPACE — or if you hit ENTER, ERR_CONNECTION_REFUSED is very common (B) SEARCH FAILS when search query is a Greek letter like "pi", "alpha", beta" ETC
@kelson42 modified the milestones: 12.1.0, 12.0.0 on Oct 29, 2022
@holta changed the title on Oct 30, 2022
Old title: (A) Search drop-down [incl 10 suggested article titles] (1) never appears, or (2) appears many seconds later, or (3) appears nearly instantly if you type BACKSPACE — or if you hit ENTER, ERR_CONNECTION_REFUSED is very common (B) SEARCH FAILS when search query is a Greek letter like "pi", "alpha", beta" ETC
New title: (A) Search drop-down [incl 10 suggested article titles] (1) never appears, or (2) appears many seconds later, or (3) appears nearly instantly if you type BACKSPACE — or if you hit ENTER, ERR_CONNECTION_REFUSED is very common (B) SEARCH FAILS when search query is a Greek letter like "pi", "alpha", "beta" ETC
@kelson42
Collaborator

kelson42 commented Nov 1, 2022

@mgautierfr Any feedback?

@kelson42
Collaborator

kelson42 commented Nov 3, 2022

I had a look at the problem with "pi": it is caused by incorrect JSON escaping. See this:

$ curl -s "http://127.0.0.1:8080/suggest?content=wikipedia_en_all_nopic_2022-01&term=pi" | cat -n
     1	[
     2	  {
     3	    "value" : "PI",
     4	    "label" : "<b>PI</b>",
     5	    "kind" : "path"
     6	      , "path" : "A/PI"
     7	  },
     8	  {
     9	    "value" : "Pi",
    10	    "label" : "<b>Pi</b>",
    11	    "kind" : "path"
    12	      , "path" : "A/Pi"
    13	  },
    14	  {
    15	    "value" : "Pi.",
    16	    "label" : "<b>Pi</b>.",
    17	    "kind" : "path"
    18	      , "path" : "A/Pi."
    19	  },
    20	  {
    21	    "value" : "Pí",
    22	    "label" : "Pí",
    23	    "kind" : "path"
    24	      , "path" : "A/Pí"
    25	  },
    26	  {
    27	    "value" : "\pi",
    28	    "label" : "\<b>pi</b>",
    29	    "kind" : "path"
    30	      , "path" : "A/\pi"
    31	  },
    32	  {
    33	    "value" : "E^pi-pi",
    34	    "label" : "E^<b>pi</b>-<b>pi</b>",
    35	    "kind" : "path"
    36	      , "path" : "A/E^pi-pi"
    37	  },
    38	  {
    39	    "value" : "PI 88788",
    40	    "label" : "<b>PI</b> 88788",
    41	    "kind" : "path"
    42	      , "path" : "A/PI_88788"
    43	  },
    44	  {
    45	    "value" : "PI-21858",
    46	    "label" : "<b>PI</b>-21858",
    47	    "kind" : "path"
    48	      , "path" : "A/PI-21858"
    49	  },
    50	  {
    51	    "value" : "PI-3K",
    52	    "label" : "<b>PI</b>-3K",
    53	    "kind" : "path"
    54	      , "path" : "A/PI-3K"
    55	  },
    56	  {
    57	    "value" : "Pi (1998)",
    58	    "label" : "<b>Pi</b> (1998)",
    59	    "kind" : "path"
    60	      , "path" : "A/Pi_(1998)"
    61	  },
    62	  {
    63	    "value" : "pi ",
    64	    "label" : "containing 'pi'...",
    65	    "kind" : "pattern"
    66	    
    67	  }
    68	]

and here with a JSON integrity check:

$ curl "http://127.0.0.1:8080/suggest?content=wikipedia_en_all_nopic_2022-01&term=pi" | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1317  100  1317    0     0  24687      0 --:--:-- --:--:-- --:--:-- 24849
parse error: Invalid escape at line 27, column 19

@veloman-yunkan Would you be able to quickly fix that (and adapt the test)? In general, I am surprised that we have this kind of bug: don't we use an external primitive to do the JSON escaping?!
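
The parse error can be reproduced without a server. The Python sketch below feeds one entry shaped like the `\pi` suggestion above to a strict JSON parser, then shows that a JSON library escapes the backslash when serializing the same title. It only illustrates the escaping rule; it is not the libkiwix fix, and the helper name is hypothetical.

```python
import json

# One entry shaped like the /suggest output above: a lone backslash before
# "p" is not a valid JSON escape sequence.
broken = r'{"value" : "\pi", "kind" : "path"}'


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Serializing the same title with a JSON library doubles the backslash.
fixed = json.dumps({"value": "\\pi", "kind": "path"})

print(is_valid_json(broken))
print(is_valid_json(fixed))
print(fixed)
```

A single invalid entry makes the whole response unparseable, which would explain why the drop-down never appears for "pi" while it works for "p" or "pit", provided those result lists contain no title with a backslash.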

@kelson42 kelson42 assigned veloman-yunkan and unassigned kelson42 Nov 3, 2022
@kelson42
Collaborator

kelson42 commented Nov 9, 2022

@veloman-yunkan Any feedback? This seems to me to be a blocker for the 12.0.0 release.

@kelson42
Collaborator

kelson42 commented Nov 17, 2022

Everything should work fine now.


4 participants