Modularize ingestion distributed compute engine support #444
Comments
Agreed with this problem. I think we should define the exact extension points here very clearly, though. It can be hard to talk about in the abstract, so the questions I see are:

Beyond that, I can see the introduction of this layer adding a lot of overhead and complexity in the short term, even though it will pay dividends if teams are starting to fork the code base now (which may already be the case with 0.3). I would want to make sure we get alignment on the future direction of Feast, so that we can stabilize the architecture before solidifying these modularization points.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I'm extremely wary of this type of complexity within the service. I'm pretty biased, but I would prefer something more along the lines of these options:

I don't know what the state of the art is these days with OSGi or other service-platform frameworks, but I don't think the complexity they bring is worth it.
Closing this issue since it is now stale. The Job Service manages jobs, and we have different launcher implementations available; currently we use Spark exclusively.
This is a companion to #402 and the larger topic of storage engine modularization, which was realized in #529 and the subsequent PRs that implemented the new interfaces.

Just as adding support for new storage engines tends to cause a dependency explosion for Feast `ingestion` & `serving`, the same is true for the Beam Runner / job management adapter glue in `core` (this could all move to `serving` under future plans, but that won't change the fundamental problem this issue is about).

So for both storage and compute engines, I feel that some modularity strategy is needed for loose binding at build time, configurable for runtime. The goal would be to keep heavyweight engine dependencies out of the core artifacts: `hadoop-common` or `hadoop-client` alone leave you with close to 200MB of jars, and `beam-runners-spark` and `beam-sdks-java-io-hcatalog` among others have these deps [as provided scope, but the point stands I believe].

Possibilities might be OSGi or `java.util.ServiceLoader` (and Spring integration or alternatives thereof). Open to other ideas!

Relates to #362