
Queries not finishing when partition count on db increased. #88

akshaysyaduvanshi opened this issue May 14, 2024 · 1 comment

akshaysyaduvanshi commented May 14, 2024

Earlier setup:
- 32 partitions on the DB
- User pool resource limits: CPU limit 60, queue depth 80
- Connector version: 4.1.6 (AWS Marketplace SingleStore connector)

I am using AWS Glue with 32 executors. With the above setup I could fetch 1 billion records within 30 minutes.

The one change we have made is increasing the partition count on the database to 150. Now what I see is that some queries run, some get queued, and the running queries never finish.

Any idea what could be causing this? Do all 150 queries need to execute in parallel?

@AdalbertMemSQL (Collaborator)

If you are using the ReadFromAggregators parallel read feature, then yes: all reading tasks must start at the same time.
In the latest version, the connector tries to estimate how many resources the Spark cluster has and, if needed, runs several reading tasks inside a single Spark task. But generally, it is recommended to have a sufficiently large Spark cluster.

If you don't want to depend on the number of database partitions in this way, you can use the ReadFromAggregatorsMaterialized feature (it will use more memory on the database side) or disable parallel read entirely.
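A minimal sketch of how this might look from PySpark on Glue. The option names follow the SingleStore Spark connector documentation, but the endpoint, credentials, and table name are placeholders — verify the exact option spellings against the connector version you are running:

```python
# Hedged sketch: switching from ReadFromAggregators to the
# materialized variant so read tasks do not all have to start
# simultaneously (at the cost of extra memory on the database side).
options = {
    "ddlEndpoint": "svchost:3306",   # placeholder endpoint
    "user": "admin",                 # placeholder credentials
    "password": "secret",
    "database": "mydb",
    # Prefer the materialized parallel read feature:
    "parallelRead.Features": "ReadFromAggregatorsMaterialized",
    # Alternatively, disable parallel read altogether:
    # "enableParallelRead": "disabled",
}

# With a live SparkSession and a reachable cluster, the read
# would then be (commented out since it needs a real database):
# df = (spark.read
#       .format("singlestore")
#       .options(**options)
#       .load("mydb.my_table"))
```

With `ReadFromAggregatorsMaterialized`, the result set is materialized on the database side, so Spark tasks can consume their partitions one at a time instead of requiring all 150 readers to be running concurrently.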
