HDFS with one NameNode and no JournalNodes (no HA) for Integration Tests #266

Open
fhennig opened this issue Nov 7, 2022 · 9 comments

@fhennig
Contributor

fhennig commented Nov 7, 2022

Druid uses HDFS for deep storage in its unit tests. A new HDFS instance has to be spun up every time a test is run. HDFS is a major contributor to the overall test duration, and its long startup time occasionally causes tests to fail.

As I learned today (thanks to Lars!), HDFS only needs a second NameNode and the JournalNodes for high availability (HA) and can also run without them. I think it would be great to have this. For an instance that only lives a few minutes and might not even see data written to it, the most bare-bones setup should be used.

@fhennig changed the title from "HDFS with one NameNode and No JournalNodes (no HA) for Integration Tests" to "HDFS with one NameNode and no JournalNodes (no HA) for Integration Tests" on Nov 7, 2022
@lfrancke
Member

lfrancke commented Nov 7, 2022

It is a great idea but I'm not sure if our HDFS even supports a non-HA setup.
So we'd probably have at least one JN but only a single NN.

@maltesander
Member

As far as I remember:

  • When creating a NameNode, we check whether there is already an active NameNode
  • When creating DataNodes, we check that all NameNodes are ready or in standby mode
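
The two checks recalled above could be sketched roughly as follows. This is only an illustration of the described behaviour, not the operator's actual code: the function names and the state strings `"active"` and `"standby"` are assumptions, and "ready" is interpreted here as "active".

```python
from typing import Iterable

# Assumption: a NameNode counts as usable once it is active or standby.
READY_STATES = {"active", "standby"}


def has_active_namenode(states: Iterable[str]) -> bool:
    """Return True if any already-running NameNode reports itself as active."""
    return any(state == "active" for state in states)


def datanodes_may_start(states: Iterable[str]) -> bool:
    """Return True only if there is at least one NameNode and all are active or standby."""
    states = list(states)
    return bool(states) and all(state in READY_STATES for state in states)
```

With a single NameNode the list of states has one entry, so checks of this shape would still pass as long as that NameNode reports itself as active.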

I'll run a quick test today.

@maltesander
Member

Running with nightly:
ERROR hdfs_controller: stackable_operator::logging::controller: Failed to reconcile object controller.name="hdfsclusters.hdfs.stackable.tech" error=reconciler for object HdfsCluster.v1alpha1.hdfs.stackable.tech/simple-hdfs.default failed error.sources=[Missing node role journalnode]

I removed the check here


But the namenode is not coming up. So this would require more investigation.

@lfrancke
Member

lfrancke commented Nov 8, 2022

Yes, it requires a different configuration if running without HA.
I'm not sure we even want to support that, though...
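
To illustrate what "a different configuration" might mean, here is a minimal sketch that builds the relevant Hadoop properties for both modes. The property names are standard Hadoop ones; the function name, the host names, the nameservice id `hdfs`, the NameNode ids and the ports (8020 for NameNode RPC, 8485 for JournalNodes) are assumptions for illustration and not what the operator generates.

```python
def build_hdfs_config(namenodes, journalnodes, nameservice="hdfs"):
    """Return {"core-site.xml": {...}, "hdfs-site.xml": {...}} for a HA or non-HA setup."""
    if not namenodes:
        raise ValueError("at least one NameNode is required")

    if not journalnodes:
        # Non-HA: exactly one NameNode, clients address it directly by host and port.
        if len(namenodes) != 1:
            raise ValueError("more than one NameNode requires JournalNodes in this sketch")
        host = namenodes[0]
        return {
            "core-site.xml": {"fs.defaultFS": f"hdfs://{host}:8020"},
            "hdfs-site.xml": {"dfs.namenode.rpc-address": f"{host}:8020"},
        }

    # HA: clients address a logical nameservice, edits are shared via the JournalNodes.
    ids = [f"nn{i}" for i in range(len(namenodes))]
    quorum = ";".join(f"{jn}:8485" for jn in journalnodes)
    hdfs_site = {
        "dfs.nameservices": nameservice,
        f"dfs.ha.namenodes.{nameservice}": ",".join(ids),
        "dfs.namenode.shared.edits.dir": f"qjournal://{quorum}/{nameservice}",
        f"dfs.client.failover.proxy.provider.{nameservice}":
            "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
    }
    for nn_id, host in zip(ids, namenodes):
        hdfs_site[f"dfs.namenode.rpc-address.{nameservice}.{nn_id}"] = f"{host}:8020"
    return {
        "core-site.xml": {"fs.defaultFS": f"hdfs://{nameservice}"},
        "hdfs-site.xml": hdfs_site,
    }
```

The non-HA branch drops the nameservice, the shared edits directory and the failover proxy provider entirely, which is why simply removing the JournalNode role without changing the generated configuration would not be expected to work.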

@maltesander
Member

Yeah, but it could significantly lower the test duration and resource usage if we can strip it down. Or switch to S3?

@lfrancke
Member

lfrancke commented Nov 8, 2022

We should either support it or, if we don't support it, we shouldn't allow invalid configurations...
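
As a rough sketch of the "reject invalid configurations" option: the validation could return errors for role combinations that are not supported. The function name, the `allow_non_ha` flag and the replica-count dictionary are hypothetical; only the message "Missing node role journalnode" is taken from the log earlier in this thread.

```python
def validate_roles(replicas, allow_non_ha=False):
    """Return a list of error messages; an empty list means the role setup is accepted.

    `replicas` maps a role name ("namenode", "datanode", "journalnode") to its replica count.
    """
    errors = []
    # NameNodes and DataNodes are needed in every setup.
    for role in ("namenode", "datanode"):
        if replicas.get(role, 0) < 1:
            errors.append(f"Missing node role {role}")

    if replicas.get("journalnode", 0) < 1:
        if not allow_non_ha:
            # Non-HA is not supported: refuse instead of starting a broken cluster.
            errors.append("Missing node role journalnode")
        elif replicas.get("namenode", 0) > 1:
            # Non-HA is supported, but only with a single NameNode.
            errors.append("Running without journalnode requires exactly one namenode")
    return errors
```

For example, `validate_roles({"namenode": 1, "datanode": 1}, allow_non_ha=True)` returns an empty list, while the same input with the default `allow_non_ha=False` reports the missing JournalNode role.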

@maltesander
Member

Yeah, I was talking more about optimizing the tests. I think we could switch to MinIO for deep storage in most cases?
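
For reference, pointing Druid's deep storage at an S3-compatible store such as MinIO mostly comes down to a handful of runtime properties. The sketch below uses the property names of Druid's S3 extension (`druid-s3-extensions`, which would also have to be in the extension load list); the helper names, endpoint, bucket and base key are placeholders, and credentials are left out on purpose.

```python
def s3_deep_storage_properties(endpoint, bucket, base_key="druid/segments"):
    """Return Druid runtime properties for S3-compatible deep storage (e.g. MinIO)."""
    return {
        "druid.storage.type": "s3",
        "druid.storage.bucket": bucket,
        "druid.storage.baseKey": base_key,
        "druid.s3.endpoint.url": endpoint,
        # Path-style access is typically needed for MinIO, which has no per-bucket hostnames.
        "druid.s3.enablePathStyleAccess": "true",
    }


def to_properties(props):
    """Render a dict as sorted `key=value` lines, as used in runtime.properties."""
    return "\n".join(f"{key}={value}" for key, value in sorted(props.items()))
```

Calling `to_properties(s3_deep_storage_properties("http://minio:9000", "druid"))` yields the lines to put into the Druid configuration; the endpoint and bucket here are made-up values.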

@fhennig
Contributor Author

fhennig commented Nov 8, 2022

At least for druid we could, yes.

@maltesander
Member

> At least for druid we could, yes.

OK, I did not see that this is the hdfs-operator repo... the deep storage stuff pointed me to Druid. :D
