Use cache backend to store tqdm process information #58

Closed

@jedie jedie commented Jan 3, 2022

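Remove the `TaskProgressModel` model that was used to store process information. That model had some
disadvantages:

  • Every `process_info.update()` call created one `TaskProgressModel` instance, and the table was never cleaned up. If many items were processed, the table could become full (see #46).
  • Many small `process_info.update()` calls caused a high database load.

Refactor this completely by using the Django cache to store progress information.
Once a task has finished, the information is transferred from the cache to the database.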

  • fix progress information in admin
  • Update all tests
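
The approach described above (progress kept in a cache while the task runs, then written to the database once at the end) could look roughly like the sketch below. This is not the code from this PR: `DictCache` is a hypothetical in-memory stand-in with a `get`/`set`/`delete` shape similar to a cache backend, `ProgressTracker` is a made-up helper name, and a plain dict plays the role of the database table.

```python
import time


class DictCache:
    """Hypothetical in-memory stand-in for a cache backend (get/set/delete)."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


class ProgressTracker:
    """Hypothetical helper: keeps per-task progress in the cache while running."""

    def __init__(self, cache, database):
        self.cache = cache
        self.database = database  # plain dict standing in for a database table

    @staticmethod
    def _key(task_id):
        return f"progress:{task_id}"

    def update(self, task_id, n=1):
        # Per-call update: only the cache is touched, no database write.
        info = self.cache.get(self._key(task_id), {"count": 0, "updated": None})
        info = {"count": info["count"] + n, "updated": time.time()}
        self.cache.set(self._key(task_id), info)
        return info["count"]

    def finish(self, task_id):
        # On completion: a single database write, then drop the cache entry.
        info = self.cache.get(self._key(task_id), {"count": 0, "updated": None})
        self.database[task_id] = info["count"]
        self.cache.delete(self._key(task_id))
        return info["count"]


cache, database = DictCache(), {}
tracker = ProgressTracker(cache, database)

for _ in range(1000):
    tracker.update("task-1", n=2)

written_while_running = dict(database)  # snapshot before the task finishes
total = tracker.finish("task-1")
```

Note that the read-modify-write in `update()` is not atomic in this sketch; with several workers updating the same key, a real cache backend would need an atomic increment or some locking.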

_huey_signals.SIGNAL_EXPIRED,
_huey_signals.SIGNAL_REVOKED,
_huey_signals.SIGNAL_INTERRUPTED,
)

I am not sure if this information is completely correct.

jedie pushed a commit that referenced this pull request Jan 7, 2022
Use the code from #44 but without the big README change, because `cumulate2parents` is already deprecated when it is introduced, see:
* #60
* #58
Remove the `TaskProgressModel` model that was used to store process information. That model had some
disadvantages:

* Every `process_info.update()` call created one `TaskProgressModel` instance, and the table was
never cleaned up. If many items were processed, the table could become full. This PR therefore fixes #46.
* Many small `process_info.update()` calls caused a high database load.

Refactor this completely by using the Django cache to store progress information.
Once a task has finished, the information is transferred from the cache to the database.

Skrattoune commented Jan 27, 2022

Hi Jedie,

any news on this?

I just realized that you closed issue #46 although my solution is not implemented,
and your potential fix #58 is not implemented yet either.

I haven't had this problem since August 22, 2021, thanks to the fix I implemented locally (see #46).
However, I suddenly realized that with version 0.4.4 (5 months later), a new ProgressInfo object is still created each time the ProgressInfo.update() method is called, which overloads the database and greatly slows down the system.

The code that has been working without any issue in prod since August 22, 2021:

    # Requires: from django.utils import timezone
    def update(self, n=1):
        """
        Get or create one TaskProgressModel instance per task
        (main task and, if cumulate2parents is set, the parent task)
        and add n to its stored progress count.
        """
        self.total_progress += n

        now = timezone.now()
        ids = [self.task.id]
        objects = [
            TaskProgressModel.objects.get_or_create(task_id=self.task.id)[0]
        ]

        if self.parent_task_id:
            # Store information for main task, too:
            ids.append(self.parent_task_id)

            if self.cumulate2parents:
                objects.append(
                    TaskProgressModel.objects.get_or_create(
                        task_id=self.parent_task_id
                    )[0]
                )

        for obj in objects:
            # A freshly created row may have an empty progress count
            if obj.progress_count:
                obj.progress_count += n
            else:
                obj.progress_count = n
            obj.save()

        # Update the last change date times:
        TaskModel.objects.filter(task_id__in=ids).update(
            update_dt=now
        )

I'll create a PR to speed up the process.
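
To illustrate why one row per task matters, here is a small, self-contained sketch using only the standard library's `sqlite3` module. It is not huey-monitor code: the table names `per_call` and `per_task` and both helper functions are made up for this comparison. One helper inserts a new row on every update, the other keeps a single row per task and increments it in place, similar in spirit to the `get_or_create` approach above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE per_call (task_id TEXT, progress_count INTEGER)")
conn.execute(
    "CREATE TABLE per_task (task_id TEXT PRIMARY KEY, progress_count INTEGER)"
)


def update_per_call(task_id, n=1):
    # One new row for every update call: the table grows without bound.
    conn.execute("INSERT INTO per_call VALUES (?, ?)", (task_id, n))


def update_per_task(task_id, n=1):
    # Make sure exactly one row exists for the task, then increment it in place.
    conn.execute("INSERT OR IGNORE INTO per_task VALUES (?, 0)", (task_id,))
    conn.execute(
        "UPDATE per_task SET progress_count = progress_count + ? WHERE task_id = ?",
        (n, task_id),
    )


for _ in range(500):
    update_per_call("task-1")
    update_per_task("task-1")

rows_per_call = conn.execute("SELECT COUNT(*) FROM per_call").fetchone()[0]
rows_per_task = conn.execute("SELECT COUNT(*) FROM per_task").fetchone()[0]
total_per_call = conn.execute(
    "SELECT SUM(progress_count) FROM per_call"
).fetchone()[0]
total_per_task = conn.execute(
    "SELECT progress_count FROM per_task WHERE task_id = ?", ("task-1",)
).fetchone()[0]
```

Both variants record the same total progress, but the per-call table ends up with one row per update while the per-task table stays at a single row. The per-task variant still issues database writes on every update, which is the load that a cache-based approach would avoid.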

@Skrattoune

I submitted PR #67, corresponding to #46, to use while we wait for #58 to be available.

jedie commented Jan 28, 2022

@Skrattoune Your idea is used here: #68

Any comments on the code changes here?

@jedie jedie closed this Feb 3, 2022
Successfully merging this pull request may close these issues.

django.db.utils.OperationalError: (1114, "The table 'huey_monitor_taskprogressmodel' is full")