Querying intermittently takes a long time (18 sec to 3 min)
I am using pymagnitude in one of my projects to load and use GoogleNews-vectors-negative300.bin.
I converted GoogleNews-vectors-negative300.bin to a .magnitude file and load that file using Magnitude(). I use pymagnitude to generate word embeddings and then train an ANN model on those embeddings.
On my local machine (details below) I face no issue, but the testing environment is intermittently slow.
Environments:
(Local):
Mac, 32 GB RAM, Docker with CentOS: very fast, a fraction of a second per query.
(Testing environment):
CentOS, 16 GB RAM: intermittently slow, taking 18 sec to 3 min to query some words, and the process times out.
Note: I am using a mount to keep my mmap files, and I have confirmed that it is not being wiped.
Here are the findings for a few words on the testing environment and on local:
Word, Time on testing environment
li��n, 0.82 min
ph���m, 0.4 min
al, 1.3
On local, the same keys take far less time, under a second each.
On further investigation and profiling of execution time, we observed that most of the time is spent when an OOV (out-of-vocabulary) token is found and the _db_query_similar_keys_vector function is invoked.
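For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of a per-word timing harness. The helper names `time_queries` and `slowest` are hypothetical, not part of pymagnitude. It takes the query function and the vocabulary check as plain callables, so it does not depend on how the vectors are loaded.

```python
import time


def time_queries(words, query_fn, in_vocab_fn):
    """Time query_fn(word) for each word and flag out-of-vocabulary words.

    query_fn and in_vocab_fn are supplied by the caller; this helper makes
    no assumption about the embedding library behind them.
    """
    results = []
    for word in words:
        start = time.perf_counter()
        query_fn(word)
        elapsed = time.perf_counter() - start
        results.append(
            {"word": word, "oov": not in_vocab_fn(word), "seconds": elapsed}
        )
    return results


def slowest(results, n=10):
    """Return the n slowest entries, slowest first."""
    return sorted(results, key=lambda r: r["seconds"], reverse=True)[:n]


# Possible usage with pymagnitude (the path is a placeholder):
#   from pymagnitude import Magnitude
#   vectors = Magnitude("/path/to/vectors.magnitude")
#   report = time_queries(words, vectors.query, lambda w: w in vectors)
#   for row in slowest(report):
#       print(row)
```

If the slow entries all have `oov` set to True, that would be consistent with the observation above that the subword lookup path dominates the time.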
Sample queries that take a long time:
SELECT
  magnitude.*
FROM
  magnitude_subword,
  magnitude
WHERE
  char_ngrams MATCH "\uf000al" OR "al" OR "l" OR "\uf000"
  AND magnitude.rowid = magnitude_subword.rowid
ORDER BY
  (
    (
      LENGTH(offsets(magnitude_subword)) - LENGTH(
        REPLACE(offsets(magnitude_subword), ' ', '')
      )
    ) + 1
  ) DESC,
  magnitude.key LIKE 'a%'
  AND LENGTH(magnitude.key) <= 4 DESC,
  magnitude.key LIKE '%';
-- Took 3.8 min to execute
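As a side note on what the first ORDER BY term in the query above computes: it takes the length of the `offsets()` string minus its length with spaces removed, plus one, which is the number of space-separated integers in that string. The sketch below mirrors that arithmetic in Python. It assumes `magnitude_subword` is an SQLite FTS3/FTS4 table, where `offsets()` emits four integers per term match; the example string is made up for illustration and is not taken from the real database.

```python
def count_offset_integers(offsets: str) -> int:
    """Mirror LENGTH(o) - LENGTH(REPLACE(o, ' ', '')) + 1 from the ORDER BY."""
    return len(offsets) - len(offsets.replace(" ", "")) + 1


# Hypothetical offsets() value for a row with two term matches. Under the
# FTS3/FTS4 assumption, each match contributes four integers:
# column number, term number, byte offset, byte size.
example = "0 1 0 2 0 2 1 1"

integers = count_offset_integers(example)
matches = integers // 4
print(integers, matches)  # 8 2
```

So rows are ranked by how many n-gram matches they contain, which means the expression has to be evaluated for every row that matches any of the n-grams before the sort can happen.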
SELECT
  magnitude.*
FROM
  magnitude_subword,
  magnitude
WHERE
  char_ngrams MATCH "\uf000ch" OR "ch" OR "h" OR "n" OR "ng" OR "ng\uf000"
  AND magnitude.rowid = magnitude_subword.rowid
ORDER BY
  (
    (
      LENGTH(offsets(magnitude_subword)) - LENGTH(
        REPLACE(offsets(magnitude_subword), ' ', '')
      )
    ) + 1
  ) DESC,
  magnitude.key LIKE 'a%'
  AND LENGTH(magnitude.key) <= 4 DESC,
  magnitude.key LIKE '%';
-- Took 2 min to execute
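To compare the two environments without going through pymagnitude, queries like the ones above could be timed directly with Python's built-in `sqlite3` module. This is only a sketch: it assumes the .magnitude file can be opened as an ordinary SQLite database and that the local SQLite build includes the FTS extension the queries rely on. The helper names `time_sql` and `query_plan` are hypothetical.

```python
import sqlite3
import time


def time_sql(conn, sql, params=()):
    """Run sql, fetch every row, and return (row_count, elapsed_seconds)."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    return len(rows), time.perf_counter() - start


def query_plan(conn, sql, params=()):
    """Return the rows SQLite reports for EXPLAIN QUERY PLAN on sql."""
    return conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()


# Possible usage (the path is a placeholder; mode=ro opens read-only):
#   conn = sqlite3.connect("file:/path/to/vectors.magnitude?mode=ro", uri=True)
#   count, seconds = time_sql(conn, slow_query_text)
#   print(count, seconds)
#   for step in query_plan(conn, slow_query_text):
#       print(step)
```

Running the same query text on both machines would show whether the slowness is in SQLite itself (for example, disk reads through the mount) or somewhere above it.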