Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add aggregator events #377

Closed
wants to merge 1 commit into from

Conversation

andrey-canon
Copy link
Contributor

@andrey-canon andrey-canon commented Dec 6, 2023

Description:
This implements multiple transformers for events that I'm proposing in this PR

Progress events, those will be emitted when the block has not reached the 100% of completion

  • edx.completion_aggregator.progress.course
  • edx.completion_aggregator.progress.chapter
  • edx.completion_aggregator.progress.sequential
  • edx.completion_aggregator.progress.vertical

Completion events, those will be emitted when the block has reached the 100% of completion

  • edx.completion_aggregator.completion.course
  • edx.completion_aggregator.completion.chapter
  • edx.completion_aggregator.completion.sequential
  • edx.completion_aggregator.completion.vertical

And basically I'm proposing this since the current "completion" implementation doesn't cover completion events and that has the same limitation as the edx-completion plugin, that only works with unit components

Note

Docs part is missing since I'm not sure if this can be included because those events doesn't belong to the standard openedx libraries, so if this pr is viable and will update that

Possible Reviewers:

Merge checklist:

  • All reviewers approved
  • CI build is green
  • Version bumped
  • Changelog record added
  • Documentation updated (not only docstrings)
  • Commits are squashed

Post merge:

  • Create a tag
  • Check new version is pushed to PyPI after tag-triggered build is
    finished.

Author concerns: List any concerns about this PR - inelegant
solutions, hacks, quick-and-dirty implementations, concerns about
migrations, etc.

@openedx-webhooks
Copy link

Thanks for the pull request, @andrey-canon! Please note that it may take us up to several weeks or months to complete a review and merge your PR.

Feel free to add as much of the following information to the ticket as you can:

  • supporting documentation
  • Open edX discussion forum threads
  • timeline information ("this must be merged by XX date", and why that is)
  • partner information ("this is a course on edx.org")
  • any other information that can help Product understand the context for the PR

All technical communication about the code itself will be done via the GitHub pull request interface. As a reminder, our process documentation is here.

Please let us know once your PR is ready for our review and all tests are green.

@openedx-webhooks openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Dec 6, 2023
@bmtcril
Copy link
Contributor

bmtcril commented Dec 7, 2023

We had talked about using the completion aggregator for this in the past, but had decided not to due to some performance issues. I don't know what the current state of performance is on the package, but I think @ormsbee had raised the concern. I do know that even without the aggregations Completions is pretty database intensive.

That said, I don't think the aggregator does anything that we can't do in ClickHouse with the data we already have (course structure and block completion state), and faster than we would get from the aggregator running in async mode.

Overall my preference would be to invest in generating the aggregations on the data processing side and not burden the LMS with generating something it doesn't use. If you really want to go down this road instead I think you can include these transformers in the aggregator project itself (using the same @XApiTransformersRegistry.register decorator and apps.ready() methodology if ERB is installed). Then the transformer code can live with the data model producing the events and version together. The only potential issue I see is making sure that the aggregator falls after ERB in INSTALLED_APPS.

@andrey-canon
Copy link
Contributor Author

@bmtcril thanks for your answer, I wasn't aware the completion aggregator library has or had performance issues, indeed
you made me check the celery tasks in our production instance and thankfully, the task times haven't changed since we install that library.

Regarding the other part of your answer, personally I disagree, I mean your comments are totally valid but they are conditioned to the aspects integration, you mentioned Clickhouse, data side processing and that could be ok when you have access to the whole integration however when someone is facing an external service where they don't even know which database that uses, that answer is not valid indeed the progress events, that I'm proposing here, are not compatible with aspects since aspects tries to convert all the progress events and then store that in the completion table and that's not possible with the current structure.
Finally this makes me wonder what is the scope of this project, a readme part says It provides a backend that can take events and re-transmit them to external services however now I feel that this is limited to aspects and in my opinion this library, at least, should emit standard(I know there is no standard) or common events in learning platform since data processing is just valid in a limited context.

@bmtcril
Copy link
Contributor

bmtcril commented Dec 7, 2023

@andrey-canon my apologies! The project should definitely be more general use, I just haven't had any conversations about it outside of Aspects in the last year so that's where my brain went. :) I'd love to have a chat with you about your use cases so we can better understand what the needs are, and to make sure we're not making decisions that limit them or introduce surprise breaking changes.

I was also completely wrong about having the ability to replicate what the aggregator does, as it handles things like cohorts and enrollment tracks that no sane downstream system should be trying to model. So I think the contribution is a valuable one, and likely to increase adoption of the aggregator in the long term.

I think the comment still stands that we don't want to be putting transformers in here for external plugins. Eventually I think it would make sense for the all of the transformers to live closer to the places where their tracking logs are emitted so they can version together. As much as possible this package shouldn't need to version in step with changes to other ones, which may have conflicts or different release schedules. I'm curious what @ziafazal @Ian2012 and @pomegranited think about all of that.

FWIW there are some callouts to potential performance issues in the openedx-completion-aggregator readme, specifically around the frequency of the asyc jobs, but again I'm not sure what the current state of all that is, or at what scale it could be come a problem. When this came up last March I wanted to bring the functionality into the completions project proper, but this was cited as a blocking issue for that at the time.

@ziafazal
Copy link
Contributor

ziafazal commented Dec 8, 2023

@andrey-canon @bmtcril This library is not limited to Aspects instead it is limited to edx-platform standard events. Standard events means those events emitted by official open edX installation. Supporting non standard events can pollute the code base of this library with stuff which might not be useful for most of its consumers.

@pomegranited
Copy link
Contributor

I think the comment still stands that we don't want to be putting transformers in here for external plugins. Eventually I think it would make sense for the all of the transformers to live closer to the places where their tracking logs are emitted so they can version together.

I disagree.. I don't think it makes sense to put xapi tracking log transformations with the tracking log emissions -- What does the completion library care about xAPI?

If we were going to do this, I'd say it's better to remove the transformation layer entirely and emit the tracking events directly as xAPI (or Caliper, as configured).

As much as possible this package shouldn't need to version in step with changes to other ones, which may have conflicts or different release schedules.

I agree with this, but we can do that by handling all known versions of events that might be sent, and strictly avoid breaking changes that might tie us to a release version.

Standard events means those events emitted by official open edX installation. Supporting non standard events can pollute the code base of this library with stuff which might not be useful for most of its consumers.

Agreed, but we should enshrine this in an ADR, specifying:

@pomegranited
Copy link
Contributor

I disagree.. I don't think it makes sense to put xapi tracking log transformations with the tracking log emissions -- What does the completion library care about xAPI?

I've been discussing this with @bmtcril , and I now see his point now about keeping the transforms for non-standard events near the event emitters. Specifically this:

I think you can include these transformers in the aggregator project itself (using the same @XApiTransformersRegistry.register decorator and apps.ready() methodology if ERB is installed). Then the transformer code can live with the data model producing the events and version together. The only potential issue I see is making sure that the aggregator falls after ERB in INSTALLED_APPS.

This capability means this library is "pluggable", so the transformers could live in https://github.com/open-craft/openedx-completion-aggregator, or a separate repo if that's not acceptable for some reason.

@Agrendalath what do you think about making event-routing-backends a dependency of the completion aggregator, and pulling these transforms (and tests) over there?

And this would be a great example case to use when writing up the ADR.

@Agrendalath
Copy link
Member

Agrendalath commented Dec 13, 2023

@bmtcril,

FWIW there are some callouts to potential performance issues in the openedx-completion-aggregator readme, specifically around the frequency of the asyc jobs, but again I'm not sure what the current state of all that is, or at what scale it could be come a problem. When this came up last March I wanted to bring the functionality into the completions project proper, but this was cited as a blocking issue for that at the time.

We've been using it in some Production environments for a few years. In one particular instance, we are running it every 30 minutes and didn't notice any side effects, even during shorter exams with around 1-2k users. We didn't test it on a bigger scale, though.

@pomegranited,

what do you think about making event-routing-backends a dependency of the completion aggregator, and pulling these transforms (and tests) over there?

This might be a basic question, but I have zero context on the event-routing-backends repo.
Currently, this library is not installed as a dependency for the edx-platform. Would there be any side effects to installing it along with the completion aggregator? The docs mention that the caliper backend is disabled by default, so does it mean that the xapi backend will suddenly start processing all events included in the event-routing-backends if we don't explicitly disable it? This would be a bit confusing from the perspective of somebody who only wants to install the completion aggregator.
If my assumptions are correct, perhaps we could define it like this to make this library optional for the completion aggregator.

extras_require={
    'event-routing-backends':  load_requirements('requirements/event-routing-backends.in'),
}

Also, what does it mean for backward compatibility? Is this library compatible with Palm?

@pomegranited
Copy link
Contributor

@Agrendalath My apologies, I should have given more context here.

Currently, this library is not installed as a dependency for the edx-platform.

Yes, this library isn't part of standard edx-platform; it's installed by the Aspects analytics Tutor plugin.

Would there be any side effects to installing it along with the completion aggregator? The docs mention that the caliper backend is disabled by default, so does it mean that the xapi backend will suddenly start processing all events included in the event-routing-backends if we don't explicitly disable it?

Ah yep, that would happen. So we need to disable XAPI_EVENTS_ENABLED by default, and enable it in the Aspects plugin. CC @bmtcril

If my assumptions are correct, perhaps we could define it like this to make this library optional for the completion aggregator.

Unfortunately that won't work with what I'm suggesting here. We'd need this library installed in order to have the Transforms proposed in this PR live the aggregator repo.

Also, what does it mean for backward compatibility? Is this library compatible with Palm?

Good point -- @bmtcril do we need to add this repo and tutor-contrib-aspects to the list that get tagged for named releases?

But here's what I know:

@Ian2012
Copy link
Contributor

Ian2012 commented Dec 14, 2023

@pomegranited We are running Olive, but I think this is compatible since Maple (but not 100% supported)

@pomegranited
Copy link
Contributor

@andrey-canon eduNEXT have an example of how they implemented non-openedx event transforms: https://github.com/eduNEXT/xblock-filesmanager/pull/35/files#diff-db6941ed877b308035e22d050146b2cf84a1c7b00101217bdbd9041326932034

@Ian2012 How do you feel about this implementation? Is this the model we should follow for other non-openedx events?

@Ian2012
Copy link
Contributor

Ian2012 commented Jan 10, 2024

@pomegranited @andrey-canon Yes, this is the pattern we should follow as this facilitates managing any changes on the source package. I don't think there is much to add besides what @bmtcril has said.

In case you are building a custom xAPI transformer and that's not accepted in the source package, you can create an external library that register your custom events.

@andrey-canon
Copy link
Contributor Author

First thank you all for your answers, I think we can extract a lot of topics to discuss, however personally I don't like the direction this pr is heading there are a lot topics that are related but don't cover the main issue that , in my opinion, is the lack of common events and its transformers. In this case this pr is related with progress and completion events, if the completion-aggregator library is not a viable option how we can provide those events to a different LRS

  • should we add those events to the edx-completion library ?
  • should we propose openedx-completion-aggregator as a standard library ? and move to the open-edx organization, opencratf is opened to that option ?
  • or how can we improve the current completion event, this is a small critique, but the current implementation is useless in a production scenario that generates statements of type progress for every unit component and the result always have the key completion equal to True, why is progress and not completed?
  • regarding the previous item, how you can ensure that the events and transformers that you are allowing are valuable ? sorry for been so annoying with this but I couldn't find any implementation for that completion event

I would like to know if you have another alternatives to improve the standard behavior obviously the extension is always an option and indeed that could be considered trivial and easier but that doesn't contribute, on the other hand, I would like to complement some previous answers since I was not able to follow the discussion at the right moment

eduNEXT are running Aspects in production -- @Ian2012 what release version are you running

The biggest instance with aspects that edunext supports is running in Maple, the issues that I faced were solved by @Ian2012 some months ago.

FWIW there are some callouts to potential performance issues in the openedx-completion-aggregator readme, specifically around the frequency of the asyc jobs, but again I'm not sure what the current state of all that is, or at what scale it could be come a problem. When this came up last March I wanted to bring the functionality into the completions project proper, but this was cited as a blocking issue for that at the time.

This has been running for a month in prod and I haven't seen any performance issues, sentry has report some errors but the amount is not relevant compared with the traffic

eduNEXT have an example of how they implemented non-openedx event transforms: https://github.com/eduNEXT/xblock-filesmanager/pull/35/files#diff-db6941ed877b308035e22d050146b2cf84a1c7b00101217bdbd9041326932034

Thanks for that, indeed that implementation is inspired by code I shared with the author of that commit, and that makes me leave another comment, the transformer implementation could be trivial however event-routing-backends library makes an explicit import from the edx-platform that makes harder to test plugins that uses this library, so why you didn't consider the wrapper approach ? I think that would be very helpful for future extensions.

Finally as I mentioned in the PR description I didn't finish this because I wasn't sure this was going to be accepted and most of you have confirmed it that's why I close this PR.

Thanks again :)

@openedx-webhooks
Copy link

@andrey-canon Even though your pull request wasn’t merged, please take a moment to answer a two question survey so we can improve your experience in the future.

@pomegranited
Copy link
Contributor

Thank you for raising this PR and starting the discussions here @andrey-canon .

should we propose openedx-completion-aggregator as a standard library ? and move to the open-edx organization, opencratf is opened to that option ?

I'm sure we'd be fine with that -- could add a waffle flag to disable aggregation by default (to banish the spectre of performance issues that may or may not be resolved), and then your new events could be transformed here just like the other openedx events.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
open-source-contribution PR author is not from Axim or 2U
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

7 participants