Failed jobs #605

Open
cavis opened this issue Apr 14, 2022 · 0 comments

cavis commented Apr 14, 2022

Saw some failing jobs yesterday that caused a bunch of "SQS messages too old" alarms. Because the workers were dying, the messages were hidden for ~1 hour at a time, 20 times in a row.
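For a sense of scale, here is a back-of-the-envelope calculation of how old a message gets under that pattern. The two inputs are the rough figures quoted above, not values read from the queue configuration, and `message_age_hours` is a hypothetical helper.

```ruby
# Approximate age (in hours) of a message that is received, hidden, and
# returned to the queue repeatedly without ever being deleted.
def message_age_hours(hidden_seconds_per_receive, receives)
  hidden_seconds_per_receive * receives / 3600.0
end

# "~1 hour" per receive, "20 times in a row" (both approximate).
puts message_age_hours(60 * 60, 20)
```

Under those assumptions a single poison message sits in the queue for roughly 20 hours, which would explain why the age alarms kept firing.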

I think these all came from a story that got deleted before the image callback and search indexer/deindexer jobs came in. I thought we were handling those GID failures everywhere, but maybe not. We should look into it, because those alarms are noisy!
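A minimal sketch of the behavior this is asking for: when the record behind a GID is gone, the job is discarded instead of crashing the worker and leaving the message to be retried. Everything here (`RecordGone`, `locate`, `handle`, the Hash used as a store) is a hypothetical stand-in, not the project's code. If the jobs are ActiveJob classes, `discard_on ActiveJob::DeserializationError` is the built-in way to get similar behavior for GlobalID arguments.

```ruby
# Hypothetical error for "the record behind this GID no longer exists".
class RecordGone < StandardError; end

# Look up a GID string in a store (a Hash here, standing in for the database).
def locate(gid, store)
  store.fetch(gid) { raise RecordGone, "no record for #{gid}" }
end

# Run the block against the record. A missing record is logged and discarded;
# any other error still propagates, so the message would be retried.
def handle(gid, store)
  record = locate(gid, store)
  yield record
  :done
rescue RecordGone => e
  warn "discarding job: #{e.message}"
  :discarded
end

store = { "gid://prx/Story/1" => "a story that still exists" }
handle("gid://prx/Story/1", store) { |story| story.upcase }      # => :done
handle("gid://prx/Story/416310", store) { |story| story.upcase } # => :discarded
```

Rescuing only the "record is gone" case keeps genuine failures visible while letting jobs for deleted records drop quietly.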

From the image-callback queue:

{
    "Time": "2022-04-13T18:05:26.848Z",
    "Timestamp": 1649873126.848,
    "JobResult":
    {
        "Job":
        {
            "Id": "gid://prx/StoryImage/784584"
        },
        "Execution":
        {
            "Id": "arn:aws:states:us-east-1:561178107736:execution:StateMachine-xeT5hO7gtTy9:2d85a4fd-8a5b-4301-ba5d-a3b38dd9e1c3"
        },
        "FailedTasks":
        [],
        "State": "DONE",
        "TaskResults":
        [
            {
                "Task": "Copy",
                "Mode": "AWS/S3",
                "BucketName": "production.mediajoint.prx.org",
                "ObjectKey": "public/piece_images/784584/LWoS_season02_Cover_v4_031522.png",
                "Time": "2022-04-13T18:05:23.726Z",
                "Timestamp": 1649873123.726
            },
            {
                "Task": "Inspect",
                "Inspection":
                {
                    "Size": 12330765,
                    "Extension": "png",
                    "MIME": "image/png",
                    "Image":
                    {
                        "Width": 2000,
                        "Height": 2000,
                        "Format": "png"
                    }
                }
            },
            {
                "Task": "Image",
                "BucketName": "production.mediajoint.prx.org",
                "ObjectKey": "public/piece_images/784584/LWoS_season02_Cover_v4_031522_square.png",
                "Time": "2022-04-13T18:05:25.885Z",
                "Timestamp": 1649873125.885
            },
            {
                "Task": "Image",
                "BucketName": "production.mediajoint.prx.org",
                "ObjectKey": "public/piece_images/784584/LWoS_season02_Cover_v4_031522_small.png",
                "Time": "2022-04-13T18:05:25.967Z",
                "Timestamp": 1649873125.967
            },
            {
                "Task": "Image",
                "BucketName": "production.mediajoint.prx.org",
                "ObjectKey": "public/piece_images/784584/LWoS_season02_Cover_v4_031522_medium.png",
                "Time": "2022-04-13T18:05:26.392Z",
                "Timestamp": 1649873126.392
            }
        ]
    }
}

And 2 in the search indexer queue:

{
    "job_class": "SearchIndexerJob",
    "job_id": "c104191c-8439-42b2-8d91-e8fe639c56e6",
    "queue_name": "dc51b3fd_prod_cms_search_indexer",
    "arguments":
    [
        {
            "_aj_globalid": "gid://prx/Story/416310"
        }
    ],
    "locale": "en"
}
{
    "job_class": "SearchDeindexerJob",
    "job_id": "1016a8dc-5337-4baf-82c0-bdc8cd79a478",
    "queue_name": "dc51b3fd_prod_cms_search_indexer",
    "arguments":
    [
        "Story",
        416310
    ],
    "locale": "en"
}
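Note that the two search payloads reference the story differently: the indexer job carries a serialized GlobalID (`_aj_globalid`), while the deindexer job carries a class name and a numeric ID. That may mean they fail in different places once the story is deleted; this is an assumption drawn from the payload shapes, not something confirmed here. The sketch below, using a hypothetical `global_ids` helper and trimmed copies of the payloads, shows the difference.

```ruby
require "json"

# Return the GlobalID strings found in a job payload's "arguments" array.
def global_ids(payload)
  payload.fetch("arguments", [])
         .select { |arg| arg.is_a?(Hash) && arg.key?("_aj_globalid") }
         .map { |arg| arg["_aj_globalid"] }
end

# Trimmed copies of the two messages above (only the relevant keys).
indexer = JSON.parse(
  '{"job_class":"SearchIndexerJob","arguments":[{"_aj_globalid":"gid://prx/Story/416310"}]}'
)
deindexer = JSON.parse(
  '{"job_class":"SearchDeindexerJob","arguments":["Story",416310]}'
)

p global_ids(indexer)   # ["gid://prx/Story/416310"]
p global_ids(deindexer) # []
```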
@cavis cavis added the medium label Apr 18, 2022