Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chat download fails for one channel #437

Closed
NothingTV opened this issue May 28, 2024 · 14 comments · Fixed by #439
Closed

chat download fails for one channel #437

NothingTV opened this issue May 28, 2024 · 14 comments · Fixed by #439

Comments

@NothingTV
Copy link

Hi there,

I monitor few channels and one of them constantly fails to finish the chat download task, I can't see any relevant errors, besides of that, that seems to tell nothing about the cause itself:

ganymede-api  | {"level":"debug","time":"2024-05-28T16:41:32+02:00","message":"finished downloading live video for 42390743976"}
ganymede-api  | {"level":"debug","time":"2024-05-28T16:41:32+02:00","message":"getting video duration"}
ganymede-api  | {"level":"debug","time":"2024-05-28T16:41:32+02:00","message":"video duration: 9047"}
ganymede-api  | {"level":"debug","time":"2024-05-28T16:41:32+02:00","message":"getting twitch videos for user: 138907996 with type archive and cursor eyJiIjpudWxsLCJhIjp7Ik9mZnNldCI6NTJ9fQ"}
ganymede-api  | {"level":"info","time":"2024-05-28T16:41:32+02:00","message":"found vod id 2157350903 for livestream 42390743976, updating database"}
ganymede-api  | {"level":"debug","time":"2024-05-28T16:41:32+02:00","message":"stopping heartbeat download-livevideo-42390743976"}
ganymede-api  | {"level":"debug","service":"worker","ActivityID":"11","ActivityType":"KillTwitchLiveChatDownload","Attempt":1,"Namespace":"default","RunID":"d5abeedf-3a10-4601-ac0c-ad767b61d17c","TaskQueue":"archive","WorkerID":"7928c1a39b9b","WorkflowID":"8585e116-182d-4ecd-8e9b-6ba451ac7b46_5","WorkflowType":"DownloadTwitchLiveVideoWorkflow","time":"2024-05-28T16:41:32+02:00","message":"ExecuteActivity"}
ganymede-api  | {"level":"info","time":"2024-05-28T16:41:32+02:00","message":"killing chat downloader for channel silverline"}
ganymede-api  | {"level":"debug","time":"2024-05-28T16:41:32+02:00","message":"found pid 224\n245\n for chat_downloader"}
ganymede-api  | {"level":"error","service":"worker","ActivityType":"KillTwitchLiveChatDownload","Attempt":1,"Error":"exit status 1","Namespace":"default","RunID":"d5abeedf-3a10-4601-ac0c-ad767b61d17c","TaskQueue":"archive","WorkerID":"7928c1a39b9b","WorkflowID":"8585e116-182d-4ecd-8e9b-6ba451ac7b46_5","time":"2024-05-28T16:41:32+02:00","message":"Activity error."}
ganymede-api  | {"level":"debug","time":"2024-05-28T16:41:32+02:00","message":"Error notification is disabled"}

In the workflows, you can also see that it fails (none of the tasks have a valid time tho):

Terminated
	[87256fac-bf53-4acf-a0e7-0705e6c2a56b_34](https://XXXX/workflows/87256fac-bf53-4acf-a0e7-0705e6c2a56b_34/f70d19d2-b6f2-4911-99dc-7f4e58458f4c)	[f70d19d2-b6f2-4911-99dc-7f4e58458f4c](https://XXXX/workflows/87256fac-bf53-4acf-a0e7-0705e6c2a56b_34/f70d19d2-b6f2-4911-99dc-7f4e58458f4c)	[ArchiveTwitchLiveChatWorkflow](https://XXXX/workflows/87256fac-bf53-4acf-a0e7-0705e6c2a56b_34/f70d19d2-b6f2-4911-99dc-7f4e58458f4c)	Invalid Date
Terminated
	[87256fac-bf53-4acf-a0e7-0705e6c2a56b_39](https://XXXX/workflows/87256fac-bf53-4acf-a0e7-0705e6c2a56b_39/50ac117d-f2d6-4317-8a10-f185806338ac)	[50ac117d-f2d6-4317-8a10-f185806338ac](https://XXXX/workflows/87256fac-bf53-4acf-a0e7-0705e6c2a56b_39/50ac117d-f2d6-4317-8a10-f185806338ac)	[DownloadTwitchLiveChatWorkflow](https://XXXX/workflows/87256fac-bf53-4acf-a0e7-0705e6c2a56b_39/50ac117d-f2d6-4317-8a10-f185806338ac)	Invalid Date
Failed
	[4b4e4993-1ceb-11ef-9764-0242ac170007](https://XXXX/workflows/4b4e4993-1ceb-11ef-9764-0242ac170007/87256fac-bf53-4acf-a0e7-0705e6c2a56b)	[87256fac-bf53-4acf-a0e7-0705e6c2a56b](https://XXXX/workflows/4b4e4993-1ceb-11ef-9764-0242ac170007/87256fac-bf53-4acf-a0e7-0705e6c2a56b)	[ArchiveLiveVideoWorkflow](https://XXXX/workflows/4b4e4993-1ceb-11ef-9764-0242ac170007/87256fac-bf53-4acf-a0e7-0705e6c2a56b)	Invalid Date
Failed
	[87256fac-bf53-4acf-a0e7-0705e6c2a56b_44](https://XXX/workflows/87256fac-bf53-4acf-a0e7-0705e6c2a56b_44/8585e116-182d-4ecd-8e9b-6ba451ac7b46)	[8585e116-182d-4ecd-8e9b-6ba451ac7b46](https://XXXX/workflows/87256fac-bf53-4acf-a0e7-0705e6c2a56b_44/8585e116-182d-4ecd-8e9b-6ba451ac7b46)	[ArchiveTwitchLiveVideoWorkflow](https://XXXX/workflows/87256fac-bf53-4acf-a0e7-0705e6c2a56b_44/8585e116-182d-4ecd-8e9b-6ba451ac7b46)	Invalid Date
Failed
	[8585e116-182d-4ecd-8e9b-6ba451ac7b46_5](https://XXXX/workflows/8585e116-182d-4ecd-8e9b-6ba451ac7b46_5/d5abeedf-3a10-4601-ac0c-ad767b61d17c)	[d5abeedf-3a10-4601-ac0c-ad767b61d17c](https://XXXX/workflows/8585e116-182d-4ecd-8e9b-6ba451ac7b46_5/d5abeedf-3a10-4601-ac0c-ad767b61d17c)	[DownloadTwitchLiveVideoWorkflow](https://XXXX/workflows/8585e116-182d-4ecd-8e9b-6ba451ac7b46_5/d5abeedf-3a10-4601-ac0c-ad767b61d17c)	Invalid Date

It's only for this one channel.

@Zibbp
Copy link
Owner

Zibbp commented May 28, 2024

Looks like it's failing because there are two processes running downloading that chat. A couple of questions:

  1. Has this recently started happening?
  2. If you restart the containers (when nothing is archiving) does the next archive of the channel succeed? If not, can you post the logs for that failure?

I'll update the code to handle multiple processes if it gets stuck in a bad state like this again.

@NothingTV
Copy link
Author

NothingTV commented May 31, 2024

  1. I can't answer you that, since I recently revived the whole instance.

  2. Just restarted, no new errors, also same result (still pending)

  3. Okay, so it looks like it worked after the api restart, two vods could get recorded without issues, now the streamer had a network issue, because of that she had to restart the stream entirely and it's stuck again on the chat_downloader Task, but this time, it also doesn't start another task for the stream after her disconnect.
    Errors like this flood my api container:

ganymede-api  | {"level":"warn","service":"worker","ActivityID":"9","ActivityType":"DownloadTwitchLiveChat","Attempt":1,"Error":"context canceled","Namespace":"default","RunID":"3ed0720a-0654-487c-a1fc-30447ab7d32e","TaskQueue":"archive","WorkerID":"7928c1a39b9b","WorkflowID":"c354a72c-8b5b-475a-be67-822256320a31_39","WorkflowType":"DownloadTwitchLiveChatWorkflow","time":"2024-05-31T18:04:49+02:00","message":"RecordActivityHeartbeat with error"}

I think these are the only logs for this stream:

ganymede-api  | {"level":"debug","time":"2024-05-31T17:06:22+02:00","message":"finished downloading live video for 42403585224"}
ganymede-api  | {"level":"debug","time":"2024-05-31T17:06:22+02:00","message":"getting video duration"}
ganymede-api  | {"level":"debug","time":"2024-05-31T17:06:23+02:00","message":"video duration: 2154"}
ganymede-api  | {"level":"debug","time":"2024-05-31T17:06:23+02:00","message":"getting twitch videos for user: 138907996 with type archive and cursor eyJiIjpudWxsLCJhIjp7Ik9mZnNldCI6NTJ9fQ"}
ganymede-api  | {"level":"info","time":"2024-05-31T17:06:23+02:00","message":"found vod id 2160007110 for livestream 42403585224, updating database"}
ganymede-api  | {"level":"debug","time":"2024-05-31T17:06:23+02:00","message":"stopping heartbeat download-livevideo-42403585224"}
ganymede-api  | {"level":"debug","service":"worker","ActivityID":"11","ActivityType":"KillTwitchLiveChatDownload","Attempt":1,"Namespace":"default","RunID":"e375084a-4f1e-4702-83b3-b6ebc867f531","TaskQueue":"archive","WorkerID":"7928c1a39b9b","WorkflowID":"f8333481-f9c5-4bcb-874b-0ccf30e78407_5","WorkflowType":"DownloadTwitchLiveVideoWorkflow","time":"2024-05-31T17:06:23+02:00","message":"ExecuteActivity"}
ganymede-api  | {"level":"info","time":"2024-05-31T17:06:23+02:00","message":"killing chat downloader for channel silverline"}
ganymede-api  | {"level":"debug","time":"2024-05-31T17:06:23+02:00","message":"found pid 172\n264\n for chat_downloader"}
ganymede-api  | {"level":"error","service":"worker","ActivityType":"KillTwitchLiveChatDownload","Attempt":1,"Error":"exit status 1","Namespace":"default","RunID":"e375084a-4f1e-4702-83b3-b6ebc867f531","TaskQueue":"archive","WorkerID":"7928c1a39b9b","WorkflowID":"f8333481-f9c5-4bcb-874b-0ccf30e78407_5","time":"2024-05-31T17:06:23+02:00","message":"Activity error."}
ganymede-api  | {"level":"debug","time":"2024-05-31T17:06:23+02:00","message":"Error notification is disabled"}
ganymede-api  | {"level":"warn","service":"worker","ActivityID":"9","ActivityType":"DownloadTwitchLiveChat","Attempt":1,"Error":"workflow execution already completed","Namespace":"default","RunID":"561516b2-f742-4e89-bd2c-87567b3f481f","TaskQueue":"archive","WorkerID":"7928c1a39b9b","WorkflowID":"4b1cba14-5e02-4cee-bcb7-f70c46290cea_39","WorkflowType":"DownloadTwitchLiveChatWorkflow","time":"2024-05-31T17:06:41+02:00","message":"RecordActivityHeartbeat with error"}

I also still have Invalid Date in every temporal task.

@Zibbp Zibbp reopened this May 31, 2024
@Zibbp
Copy link
Owner

Zibbp commented May 31, 2024

I made some changes in #439 that should resolve the double process ID error that you are having. Can you use the :main tag for both the API and frontend container and try again?
Before you switch the tags, please read #441 first as you might encounter this issue when you go to pull the images.

@NothingTV
Copy link
Author

Just switched to the :main-Tag, I will keep you posted.

I can't manually start tasks now tho, I get Request Failed Not Found after clicking the symbols.

@Zibbp
Copy link
Owner

Zibbp commented Jun 2, 2024

Whoops, new :main image tag should resolve the requests failing.

@NothingTV
Copy link
Author

So I tested it for few weeks/days with the latest update and I sadly still get errors during the conert-chat task. :(
But this time it seems that there is something useful in the logs:

ganymede-api  | {"level":"error","service":"worker","ActivityType":"ConvertTwitchLiveChat","Attempt":3,"Namespace":"default","PanicError":"runtime error: slice bounds out of range [67:63]","PanicStack":"activity for archive [panic]:\ngithub.com/zibbp/ganymede/internal/utils.ConvertTwitchLiveChatToTDLChat({0xc0041ae050, 0x44}, {0xc0041220a0, 0xe}, {0xc000aae870, 0x24}, {0xc004122050, 0xb}, 0x40f89b73, {0x353d6500, ...}, ...)\n\t/app/internal/utils/tdl.go:254 +0x121b\ngithub.com/zibbp/ganymede/internal/activities.ConvertTwitchLiveChat({0x1697ca8, 0xc0006de240}, {{0xc004122030, 0xb}, {0xc004122018, 0x4}, {0xc004122040, 0x6}, {0xc004122048, 0x4}, ...})\n\t/app/internal/activities/video.go:824 +0x10a6\nreflect.Value.call({0x1292280?, 0x1512e60?, 0x30?}, {0x148a653, 0x4}, {0xc0006de270, 0x2, 0xc0006de270?})\n\t/usr/local/go/src/reflect/value.go:596 +0xca6\nreflect.Value.Call({0x1292280?, 0x1512e60?, 0x2a?}, {0xc0006de270?, 0x30?, 0x7d3a3e99ca38?})\n\t/usr/local/go/src/reflect/value.go:380 +0xb9\ngo.temporal.io/sdk/internal.executeFunction({0x1292280, 0x1512e60}, {0xc0006e4040, 0x2, 0x1294ec0?})\n\t/go/pkg/mod/go.temporal.io/[email protected]/internal/internal_worker.go:1858 +0x26b\ngo.temporal.io/sdk/internal.executeFunctionWithContext({0x1697ca8, 0xc0006de240}, {0x1292280, 0x1512e60}, {0xc0001c2940, 0x1, 0x1})\n\t/go/pkg/mod/go.temporal.io/[email protected]/internal/internal_worker.go:1842 +0x1ae\ngo.temporal.io/sdk/internal.(*activityEnvironmentInterceptor).ExecuteActivity(0xc0006e2000, {0x1697d50?, 0xc000440070?}, 0xc003ffe000)\n\t/go/pkg/mod/go.temporal.io/[email protected]/internal/internal_activity.go:366 +0x67\ngo.temporal.io/sdk/internal.(*activityExecutor).ExecuteWithActualArgs(0xc000c81ad0, {0x1697d50, 0xc000440070}, {0xc0001c2940, 0x1, 0x1})\n\t/go/pkg/mod/go.temporal.io/[email protected]/internal/internal_worker.go:859 +0x148\ngo.temporal.io/sdk/internal.(*activityExecutor).Execute(0xc000c81ad0, {0x1697d50, 0xc000440070}, 0xc004196100)\n\t/go/pkg/mod/go.temporal.io/[email protected]/internal/internal_worker.go:845 +0x139\ngo.temporal.io/sdk/internal.(*activityTaskHandlerImpl).Execute(0xc000c180e0, {0x148deeb, 0x7}, 0xc000a18ea0)\n\t/go/pkg/mod/go.temporal.io/[email protected]/internal/internal_task_handlers.go:2117 +0x924\ngo.temporal.io/sdk/internal.(*activityTaskPoller).ProcessTask(0xc000444370, {0x12137a0, 0xc00062e028})\n\t/go/pkg/mod/go.temporal.io/[email protected]/internal/internal_task_pollers.go:1001 +0x283\ngo.temporal.io/sdk/internal.(*baseWorker).processTask(0xc000304c60, {0x1213860, 0xc0008b8580})\n\t/go/pkg/mod/go.temporal.io/[email protected]/internal/internal_worker_base.go:518 +0x153\ngo.temporal.io/sdk/internal.(*baseWorker).processTaskAsync.func1()\n\t/go/pkg/mod/go.temporal.io/[email protected]/internal/internal_worker_base.go:369 +0x45\ncreated by go.temporal.io/sdk/internal.(*baseWorker).processTaskAsync in goroutine 107\n\t/go/pkg/mod/go.temporal.io/[email protected]/internal/internal_worker_base.go:365 +0xa5","RunID":"580a5948-50a5-4087-83ad-c79560affee4","TaskQueue":"archive","WorkerID":"561ceb8a5f03","WorkflowID":"2ae47e86-8149-40fa-b4c3-13eeff09549d_9","time":"2024-07-11T21:57:29+02:00","message":"Activity panic."}

@Zibbp
Copy link
Owner

Zibbp commented Jul 13, 2024

@NothingTV can you upload the *-live-chat.json file from the archive?

@Zibbp
Copy link
Owner

Zibbp commented Jul 14, 2024

Tested with the change in #467 and the chat converts fine. If you have the /tmp directory mounted on your host filesystem then you can pull the latest :main API image and try rerunning the convert task. If you don't have the /tmp directory mounted you can copy out the files and copy them back in after a restart.

The current queue system has been a major pain point for users so I'm completely reworking it in #468 which will provide more flexibility when it comes to re-running certain tasks.

@NothingTV
Copy link
Author

NothingTV commented Jul 20, 2024

So I tested it for few days and most converts worked without issues, I think the main issue is gone, but this night I got a new chat convert failed task, I can't see anything in the logs about this (even tried to grep it), I'll attach the chat maybe you can reproduce it.
https://fps.wtf/42625316792_c1f6ed3f-45e8-11ef-8833-0242ac170007-live-chat.json

// EDIT: Again, with this error:

{"level":"error","service":"worker","ActivityType":"ConvertTwitchLiveChat","Attempt":3,"Namespace":"default","PanicError":"runtime error: slice bounds out of range [168:166]","PanicStack":"activity for archive [panic]:\ngithub.com/zibbp/ganymede/internal/utils.ConvertTwitchLiveChatToTDLChat({0xc00200c0f0, 0x44}, {0xc0019065c0, 0x9}, {0xc001f26150, 0x24}, {0xc001906580, 0xb}, 0x19a9b358, {0x2d276af0, ...}, ...)\n\t/app/internal/utils/tdl.go:254 +0x121b\ngithub.com/zibbp/ganymede/internal/activities.ConvertTwitchLiveChat({0x1697ca8, 0xc0020139e0}, {{0xc001906560, 0xb}, {0xc001906548, 0x4}, {0xc001906570, 0x6}, {0xc001906578, 0x4}, ...})\n\t/app/internal/activities/video.go:824 +0x10a6\nreflect.Value.call({0x1292280?, 0x1512e60?, 0x30?}, {0x148a653, 0x4}, {0xc002013a10, 0x2, 0xc002013a10?})\n\t/usr/local/go/src/reflect/value.go:596 +0xca6\nreflect.Value.Call({0x1292280?, 0x1512e60?, 0x2a?}, {0xc002013a10?, 0x30?, 0x1294ec0?})\n\t/usr/local/go/src/reflect/value.go:380 +0xb9\ngo.temporal.io/sdk/internal.executeFunction({0x1292280, 0x1512e60}, {0xc001902680, 0x2, 0x1294ec0?})\n\t/go/pkg/mod/go.temporal.io/[email protected]/internal/internal_worker.go:1858 +0x26b\ngo.temporal.io/sdk/internal.executeFunctionWithContext({0x1697ca8, 0xc0020139e0}, {0x1292280, 0x1512e60}, {0xc000ded1a0, 0x1, 0x1})\n\t/go/pkg/mod/go.temporal.io/[email protected]/internal/internal_worker.go:1842 +0x1ae\ngo.temporal.io/sdk/internal.(*activityEnvironmentInterceptor).ExecuteActivity(0xc0040224c0, {0x1697d50?, 0xc000318690?}, 0xc000197698)\n\t/go/pkg/mod/go.temporal.io/[email protected]/internal/internal_activity.go:366 +0x67\ngo.temporal.io/sdk/internal.(*activityExecutor).ExecuteWithActualArgs(0xc0004a6060, {0x1697d50, 0xc000318690}, {0xc000ded1a0, 0x1, 0x1})\n\t/go/pkg/mod/go.temporal.io/[email protected]/internal/internal_worker.go:859 +0x148\ngo.temporal.io/sdk/internal.(*activityExecutor).Execute(0xc0004a6060, {0x1697d50, 0xc000318690}, 0xc000b02080)\n\t/go/pkg/mod/go.temporal.io/[email protected]/internal/internal_worker.go:845 +0x139\ngo.temporal.io/sdk/internal.(*activityTaskHandlerImpl).Execute(0xc000284b60, {0x148deeb, 0x7}, 0xc0003d2000)\n\t/go/pkg/mod/go.temporal.io/[email protected]/internal/internal_task_handlers.go:2117 +0x924\ngo.temporal.io/sdk/internal.(*activityTaskPoller).ProcessTask(0xc00070ea50, {0x12137a0, 0xc0002183f8})\n\t/go/pkg/mod/go.temporal.io/[email protected]/internal/internal_task_pollers.go:1001 +0x283\ngo.temporal.io/sdk/internal.(*baseWorker).processTask(0xc000734160, {0x1213860, 0xc001980020})\n\t/go/pkg/mod/go.temporal.io/[email protected]/internal/internal_worker_base.go:518 +0x153\ngo.temporal.io/sdk/internal.(*baseWorker).processTaskAsync.func1()\n\t/go/pkg/mod/go.temporal.io/[email protected]/internal/internal_worker_base.go:369 +0x45\ncreated by go.temporal.io/sdk/internal.(*baseWorker).processTaskAsync in goroutine 157\n\t/go/pkg/mod/go.temporal.io/[email protected]/internal/internal_worker_base.go:365 +0xa5","RunID":"80d5ce0b-b3b0-4f06-8b5b-bd0bdddf0ba6","TaskQueue":"archive","WorkerID":"0f7757bb0faf","WorkflowID":"0ca636a9-2d44-4aa6-aa61-601806d2a369_9","time":"2024-07-20T15:54:08+02:00","message":"Activity panic."}

Chat

@Zibbp
Copy link
Owner

Zibbp commented Jul 23, 2024

I can't reproduce this on my production instance running :main or my development instance running the v3 branch. v3 is nearing the testing phase. If you're able to hold out a few more days until later this week or weekend I'll post a full write up for people who want to test. I've got the troublesome channel from your linked chats as a watched channel in my user acceptance testing (v3 branch) instance for myself to keep an eye out on.

@NothingTV
Copy link
Author

I'm always happy to test things, so if I can, I will :)

@Zibbp
Copy link
Owner

Zibbp commented Jul 28, 2024

v3 is available for testing. Changelog, breaking changes, and upgrade instructions are available in #474.

@Zibbp
Copy link
Owner

Zibbp commented Aug 12, 2024

I'm closing this out as v3.0.0 has made numerous changes to the archival flow. If you're able to reproduce this on v3 please open a new issue.

@Zibbp Zibbp closed this as completed Aug 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants