Deprecate and remove Multiple Data Paths #71205

Open
2 of 5 tasks
rjernst opened this issue Apr 1, 2021 · 53 comments

Assignees
Labels
:Core/Infra/Core Core issues without another label Meta Team:Core/Infra Meta label for core/infra team

Comments

@rjernst
Member

rjernst commented Apr 1, 2021

Multiple Data Paths (MDP) is a pseudo-software-RAID-0 feature within Elasticsearch that allows multiple paths to be specified in the path.data setting (usually pointing to different disks). Although it has been used in the past as a simple way to run multi-disk setups, it has long been a source of user complaints due to confusing or unintuitive behavior. Additionally, the implementation is complex, and neither well tested nor well maintained, with practically no benefit over spanning the data path filesystem across multiple drives and/or running one node per data path.
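
For reference, this is roughly what the deprecated setting looks like in elasticsearch.yml versus the recommended single path; the mount points below are illustrative assumptions, not taken from any real deployment:

  # Deprecated: multiple data paths (each shard lives on one of the listed disks)
  path.data: ["/mnt/disk1/es", "/mnt/disk2/es", "/mnt/disk3/es"]

  # Preferred: a single path, backed by one (possibly multi-disk) filesystem,
  # or one node per disk
  # path.data: /var/lib/elasticsearch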

We have long advised against using MDP, and are now ready to deprecate and remove it. This is a meta-issue to track that work.

@rjernst rjernst added :Core/Infra/Core Core issues without another label Meta labels Apr 1, 2021
@rjernst rjernst self-assigned this Apr 1, 2021
@elasticmachine elasticmachine added the Team:Core/Infra Meta label for core/infra team label Apr 1, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

rjernst added a commit to rjernst/elasticsearch that referenced this issue Apr 1, 2021
This commit adds a node level deprecation log message when multiple
data paths are specified.

relates elastic#71205
@willemdh

willemdh commented Apr 2, 2021

@rjernst
What? This is imho a serious regression, which makes me kind of depressed, as this would imply we are going to have to redeploy all our nodes using mdp..... Why remove this handy feature?

rjernst added a commit that referenced this issue Apr 2, 2021
This commit adds a node level deprecation log message when multiple
data paths are specified.

relates #71205
rjernst added a commit to rjernst/elasticsearch that referenced this issue Apr 2, 2021
This commit adds a node level deprecation log message when multiple
data paths are specified.

relates elastic#71205
rjernst added a commit that referenced this issue Apr 2, 2021
This commit adds a node level deprecation log message when multiple
data paths are specified.

relates #71205
rjernst added a commit to rjernst/elasticsearch that referenced this issue Apr 5, 2021
This commit adds a deprecation note to the multiple data paths doc. It also removes mention of multiple paths support in the setup settings table.

relates elastic#71205
rjernst added a commit that referenced this issue Apr 5, 2021
This commit adds a deprecation note to the multiple data paths doc. It also removes mention of multiple paths support in the setup settings table.

relates #71205
rjernst added a commit that referenced this issue Apr 5, 2021
This commit adds a deprecation note to the multiple data paths doc. It also removes mention of multiple paths support in the setup settings table.

relates #71205
@rjernst
Member Author

rjernst commented Apr 6, 2021

@willemdh When removing features, we carefully weigh the benefit and ubiquitousness of a feature versus the cost of continuing to support that feature. In this case, multiple-data-paths is a feature that has a very high cost (numerous bugs and deficiencies in its design), relatively few users, and most importantly, better alternatives that are standard outside of Elasticsearch. While we understand it can be frustrating to need to change deployment methodologies, we believe that effort will ultimately improve these deployments of Elasticsearch by relying on industry standards, thereby allowing developers to focus time on other improvements like frozen tier, index lifecycle management, or ARM hardware support (these are just a few random improvements we recently worked on).

Please do note that MDP is only being deprecated here, and won't be removed until 8.0, so there is still plenty of time to work on a planned migration away from this legacy feature.

@willemdh

willemdh commented Apr 7, 2021

Thanks for taking the time to answer my concern @rjernst , but we are still not happy about this.

As a 5+ year Elastic user, this decision (and a few others I'm not going into) is making me wonder what Elastic's other plans for the future are. What other core functionalities will be deprecated...? If you had said deprecated by Elastic 9, I'd understand, but Elastic 8 is not that far away imho. It's already barely doable to keep up with the large number of breaking changes that come with every major update, and not updating is not an option, considering almost every major release has included security bug fixes that were not always backported, as well as new security bugs.

Also, if I remember right, last year an Elastic support engineer advised us to use mdp on our nvme disks (please don't ask me to look that up...; our older nodes do not have mdp, but at a certain point we switched from SSD to NVMe and raised the question somewhere, as our physical raid controllers did not support NVMe). These physical nodes are supposed to run for like 5 years, and now we'll have to redeploy in less than a year, imho a waste of time, as mdp worked fine here afaik. These nodes have up to 10 TB of data each, and redeploying takes careful planning and timing.

Why not keep it deprecated during Elastic 8 and log a deprecation warning during the full Elastic 8 lifecycle? Even now I cannot find anything in https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#path-settings about mdp being deprecated, or any advice not to use it. The warning about rebalancing does not count imho, as when you use ILM on all data, shards are moved to other nodes anyway after x days.

Please, please, Elastic, can you think just a little more about the impact on your (paying) customers when you decide to change / break things that have worked for years... Just so you know, I am not the only one who thinks that one of the greatest disadvantages of Elastic is the rate at which things are changing / breaking.

@bytebilly
Contributor

Hi @willemdh, thanks for sharing your thoughts on this topic. We work in the open and we expect our users to be an active part of the discussion.

As @rjernst mentioned, maintaining this feature and managing the confusion it causes for our users has a pretty high cost for the team, and we performed a serious analysis of how our users (and paying customers) will be affected by this change.
I understand that your position is the very opposite of what we know to be the position of the vast majority of other users, and we are sorry about that.

I also want to make sure you are aware that we do provide security bugfixes for "the latest minor of the previous major" (6.8 today, 7.latest once 8.0 is out), so running Elasticsearch 7.x after 8.0 is released will not expose you to security risks. We suggest planning the upgrade as soon as possible, but we don't want to put our users at risk if they need more time to move to the new version. If you feel that there is a security bugfix that has not been backported, feel free to open an issue and we'll go through it case by case.

We'll monitor this issue, and if we see unexpectedly strong pushback from a relevant number of users, there is still time to reconsider our decision before 8.0.

rjernst added a commit to rjernst/elasticsearch that referenced this issue Apr 24, 2021
This commit converts the deprecation messages for multiple data paths
into errors. It effectively removes support for multiple data paths.

relates elastic#71205
rjernst added a commit that referenced this issue Apr 24, 2021
This commit converts the deprecation messages for multiple data paths
into errors. It effectively removes support for multiple data paths.

relates #71205
elasticsearchmachine pushed a commit that referenced this issue Oct 19, 2021
* Adjust Multiple data paths deprecation to warning (#79463)

Since Multiple Data Paths support has been added back to 8.0, the
deprecation is no longer critical, as the feature will not be
immediately removed after 7.16. This commit adjusts the deprecation
level to indicate it is a warning, instead of a critical deprecation that
would otherwise need to be addressed before upgrading.

relates #71205

* fix message
@bg256

bg256 commented Dec 30, 2021

Leaving a comment to echo what other people have said on this issue. We use MDP in our clusters and need it to handle our current load. Removing this feature is definitely something that is going to cause us issues going forward.

@elasticforme

This is 50% great news. Thank you, Elastic, for reconsidering this. I can safely upgrade my clusters to 7.16.2 with multiple data paths. Thank you again.

@st4r-fish

If I understand this correctly, I won't be able to attach more than one disk per Elasticsearch node in the future. If that's right, it'll cause serious problems for me, since in GCP I have limited (and costly) options. To have 30GB of spare memory for the Java heap, I need a fairly big instance. Then, to keep my VM costs down, I'm attaching multiple 2TB disks to house my data, since creating a hot-warm-cold cluster would skyrocket my expenses. I need to cap the disks at 2TB because beyond that I/O performance drops. So with this change, I either suffer I/O rate deterioration or have to build a much more expensive cluster.
Let me know if I didn't understand something right. Thanks!

@heipei

heipei commented Feb 10, 2022

Given that MDP is still mentioned in the 8.0 docs (https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#path-settings), albeit as "deprecated" and slated for removal, does this feature still work in 8.0? If so, when is it slated for removal, or have there been any more recent discussions about its fate?

I have always been a huge fan and vocal proponent of Elasticsearch, but it feels to me like this one-sided decision to remove the feature is creating a lot of stress and confusion for many users, which is very sad to see.

@bytebilly
Contributor

Yes, MDP is available (as a deprecated feature) in Elasticsearch 8.0.
We don't have a date for removal yet, but we still suggest considering alternative options where possible.

@st4r-fish

Yes, MDP is available (as a deprecated feature) in Elasticsearch 8.0. We don't have a date for removal yet, but we still suggest considering alternative options where possible.

@bytebilly two weeks ago I posted asking whether I understand this correctly, and I'm still not sure my interpretation is right. Could you please elaborate on alternative options for someone who needs to house hundreds of TBs of data but can't attach more than one disk to an instance?
Thank you!

@bytebilly
Contributor

@st4r-fish you will not be able to specify multiple paths to handle data.
Whether you can use multiple disks or not depends on the system configuration of your environment and the tradeoffs of what your platform provides. For example, GCE supports LVM under some circumstances, but I cannot know whether this will be a viable solution in your specific setup.
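
For anyone weighing the LVM route, a rough sketch of what it can look like on a Linux host follows; the device names, volume-group name, and mount point are assumptions, and note that a striped volume shares RAID-0's failure mode (losing one disk loses the whole filesystem), which later comments discuss:

  # Sketch only: combine two data disks into one striped logical volume and
  # mount it at the Elasticsearch data directory. Device names are assumed.
  sudo pvcreate /dev/sdb /dev/sdc                  # register the disks with LVM
  sudo vgcreate es_data /dev/sdb /dev/sdc          # one volume group across both
  sudo lvcreate -n data -l 100%FREE -i 2 es_data   # -i 2 = stripe across 2 disks
  sudo mkfs.ext4 /dev/es_data/data
  sudo mount /dev/es_data/data /var/lib/elasticsearch
  sudo chown elasticsearch:elasticsearch /var/lib/elasticsearch
  # Add a matching /etc/fstab entry so the mount survives reboots.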

If you want to discuss your scenario and possible solutions further, I suggest you engage on https://discuss.elastic.co/ (or open a support case if you have a support account), since those are the best places to get technical support from both our engineers and our community.

@P-EB

P-EB commented Feb 21, 2022

@st4r-fish you will not be able to specify multiple paths to handle data. Whether you can use multiple disks or not depends on the system configuration of your environment and the tradeoffs of what your platform provides. For example, GCE supports LVM under some circumstances, but I cannot know whether this will be a viable solution in your specific setup.

If you want to discuss your scenario and possible solutions further, I suggest you engage on https://discuss.elastic.co/ (or open a support case if you have a support account), since those are the best places to get technical support from both our engineers and our community.

Come on. You're phrasing it like it won't induce any significant change.

It's totally false.

LVM the way you'd suggest it is like a gigantic RAID 0. With 3 disks this could be considered, but on machines with 24x4TB disks it's plain suicide. And for 24x4TB disks, any RAID 5/6/X method is just a disaster waiting to happen (hello, URE). So you'll need a lot of splitting and parity, and you'll lose quite a bit of raw space.

MDP as a logical JBOD is unique, as it's not possible to achieve its feature set with any """"industry standard"""" methods you're trying to present as an alternative.

MDP is the only setup where losing a disk means losing data if one doesn't have replication, but not losing all data, and without having to pay for one overly expensive license per disk. Of course, if you aim at selling one dockerish license per disk, this deprecation and removal of the feature is quite "smart".

Still waiting for real technical solutions, and, in the meantime, pondering whether we should move towards an alternative to elastic.

@ceeeekay

ceeeekay commented Feb 22, 2022

I suggest you engage on https://discuss.elastic.co/ (or open a support case if you have a support account), since those are the best places to get technical support from both our engineers and our community.

I'd have thought the GitHub issue that's literally tracking the removal of MDP would be the best place to discuss this.

It's not a "Technical support" issue. It's a "Why are you doing this to us?" issue.

@willemdh

"So why are you doing this to us?"

I see no reason why this feature 'has' to be removed. Imho they could easily add a note to the docs that MDP is less well supported, instead of deprecating it completely.

@elasticforme

I have 10 data nodes, each with 4x2TB SSDs. If I RAID-0 them, losing one disk means the whole node is lost.
If I use RAID 1, 5, 6, or 10, I lose space plus speed compared to individual disks, as it has to write parity etc.

But if I use individual disks, losing one disk is a negligible event.
I don't understand why Elastic does not get that. Or they might have a bigger concern that they can't share.

@st4r-fish

Hi @bytebilly ,

Unfortunately, the community isn't eager to discuss this scenario:
https://discuss.elastic.co/t/deprecate-and-remove-multiple-data-paths/297911
What are your suggestions?

Thank you!

@phyrexian

I have 10 data nodes, each with 4x2TB SSDs. If I RAID-0 them, losing one disk means the whole node is lost.
If I use RAID 1, 5, 6, or 10, I lose space plus speed compared to individual disks, as it has to write parity etc.

Then don't use RAID, use a Union Filesystem.

Elastic has historically handled this internally by using the idea that "one node == one filesystem"; as far as I can tell, Multiple Data Paths is an abstraction that basically presented multiple nodes (one per disk) to the problem.

Shards in Elasticsearch have always been bound to a single "datapath" (go ahead and create an index with 1 shard and 0 replicas on a standalone node with 4 disks: only 1 will ever get used!)
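
(For anyone who wants to run that experiment themselves, a request along these lines creates such an index; the index name is arbitrary and the node is assumed to listen on localhost:9200:

  curl -X PUT "localhost:9200/single-shard-test" \
    -H 'Content-Type: application/json' \
    -d '{"settings": {"number_of_shards": 1, "number_of_replicas": 0}}'

The shard's files will then show up under exactly one of the node's configured data paths.)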

Nothing is stopping you from re-using the old logic: you can start 4 copies of Elasticsearch on one node, each with a single data path, then define an index that shards across all of those disks. Obviously this has complexity and "concerns" (which is what I think Elastic is looking to rid themselves of here!).
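
A rough sketch of that one-instance-per-disk layout, assuming a tarball install; the paths, ports, and node names are illustrative, and a real deployment would also need cluster/discovery settings and per-instance heap limits:

  # Sketch: run one Elasticsearch instance per disk on the same host.
  for i in 1 2 3 4; do
    ./bin/elasticsearch -d -p /var/run/es-disk$i.pid \
      -E node.name="$(hostname)-disk$i" \
      -E path.data=/mnt/disk$i/es \
      -E path.logs=/var/log/es-disk$i \
      -E http.port=920$i \
      -E transport.port=930$i
  done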

A proper union filesystem would let you work around the MDP deprecation AND allow you to span a single shard across multiple disks, by creating a "spanned volume" that places files (at configurable grouping levels) on specific disks. This is configurable at the FS level, based on how the routing rules work.

Using (for example) mergerfs, the branches feature could be used to keep a "shard" on a single disk, effectively replicating the prior, deprecated behaviour. This approach LACKS the ability to define which disk the shard is allocated to, which is the advantage of running multiple instances of ES (you can then set up shard allocation strategies that limit what data lives where, etc.)
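
For the curious, a minimal mergerfs sketch; the branch paths are assumptions, and the create policy shown ("epmfs": existing path, most free space) is only one way to keep a shard's files together on a single branch, loosely mimicking the old MDP layout:

  # Sketch: pool several disks under one mount point with mergerfs.
  sudo mergerfs -o allow_other,category.create=epmfs \
    /mnt/disk1:/mnt/disk2:/mnt/disk3:/mnt/disk4 /var/lib/elasticsearch
  # Equivalent /etc/fstab entry:
  # /mnt/disk1:/mnt/disk2:/mnt/disk3:/mnt/disk4  /var/lib/elasticsearch  fuse.mergerfs  allow_other,category.create=epmfs  0 0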

IMO: MDP deprecation is a good thing. It removes complexity from a block that can be implemented better by other, more focused, specialized products, and allows the project to move forward with new features that are being held back by supporting "broad" features like this.

IMO: what would be even better is if an alternative replacement were well documented, so that the community had a "common replacement path" for a widely used feature, but I get that doing so would effectively be a "hot potato" topic that few would want to take on or maintain over multiple years.

@willemdh

"you can start 4 copies of Elasticsearch on one node"

?? What kind of a workaround is that lol. Yes let's pay 4 times the amount we pay now, just because we want to remove some complexity for a feature which has been running perfectly fine for years......

@P-EB

P-EB commented Mar 19, 2022

I have 10 data nodes, each with 4x2TB SSDs. If I RAID-0 them, losing one disk means the whole node is lost.
If I use RAID 1, 5, 6, or 10, I lose space plus speed compared to individual disks, as it has to write parity etc.

Then don't use RAID, use a Union Filesystem.

Maybe you should give this a second thought. First, apart from overlayfs, mergerfs, and aufs, there are not many maintained union filesystems, and overlayfs isn't suited to presenting a JBOD as one filesystem unless you set it up like a RAID 0, which means losing all data when you lose one disk.

Elastic has historically handled this internally by using the idea that "one node == one filesystem"; as far as I can tell, Multiple Data Paths is an abstraction that basically presented multiple nodes (one per disk) to the problem.

Nope. MDP is just "oh let's use all these folders instead of just one".

Shards in Elasticsearch have always been bound to a single "datapath" (go ahead and create an index with 1 shard and 0 replicas on a standalone node with 4 disks: only 1 will ever get used!)

Indeed, but as you should know, generally any index has multiple shards, and a cluster isn't made of just one index.

Nothing is stopping you from re-using the old logic: you can start 4 copies of Elasticsearch on one node, each with a single data path, then define an index that shards across all of those disks. Obviously this has complexity and "concerns" (which is what I think Elastic is looking to rid themselves of here!).

I guess you'd be happy to pay for the three additional licenses?

A proper union filesystem would let you work around the MDP deprecation AND allow you to span a single shard across multiple disks, by creating a "spanned volume" that places files (at configurable grouping levels) on specific disks. This is configurable at the FS level, based on how the routing rules work.

Either it's an overly complex thing that will be a mess, or it's basically a RAID0. Both are quite a huge waste of time.

Using (for example) mergerfs, the branches feature could be used to keep a "shard" on a single disk, effectively replicating the prior, deprecated behaviour. This approach LACKS the ability to define which disk the shard is allocated to, which is the advantage of running multiple instances of ES (you can then set up shard allocation strategies that limit what data lives where, etc.)

Rebuilding a mergerfs pool when one disk fails is quite tedious (you have to extract the data from all remaining disks, recreate the mergerfs with the dead disk replaced, and put the data back on it) - and don't get me started on SnapRAID - compared to replacing one disk and letting Elasticsearch rebalance the data, which Elasticsearch will still have to do if a disk dies in a mergerfs pool.

IMO: MDP deprecation is a good thing. It removes complexity from a block that can be implemented better by other, more focused, specialized products, and allows the project to move forward with new features that are being held back by supporting "broad" features like this.

Still waiting for a relevant product here. And still waiting for an explanation of how this is not a huge waste of time for the people setting up their clusters, with a lot of maintenance cost added into the bargain.

@st4r-fish

I still don't see any suggested solution that wouldn't multiply the costs or leave I/O performance in the dust (large disks), and on the Elastic forum the answer seems to be that "multiple data paths are not supported".

@elasticforme

We already raised this issue, but as a company Elastic has already decided to go that route.
I don't think any of us can do anything about it; it is just a fact.

@bg256

bg256 commented Apr 4, 2022

In talking to our ES sales rep, I learned that they license nodes based on memory size, each node being worth 64GB of RAM (you should definitely double-check with your sales rep). If you currently have a single server running one instance of ES with 30 GB of heap, you could split it out into a number of instances and divide the heap based on the number of drives attached to that server. This is what we're planning on doing.
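
If you go that route, the per-instance heap can be set with the standard ES_JAVA_OPTS override; a rough sketch, assuming a tarball install, with illustrative (not advisory) values and paths:

  # Sketch: two co-located instances, each with its own smaller heap and its own disk.
  ES_JAVA_OPTS="-Xms15g -Xmx15g" ./bin/elasticsearch -d \
    -E node.name=node-a -E path.data=/mnt/disk1/es -E path.logs=/var/log/es-a \
    -E http.port=9201 -E transport.port=9301
  ES_JAVA_OPTS="-Xms15g -Xmx15g" ./bin/elasticsearch -d \
    -E node.name=node-b -E path.data=/mnt/disk2/es -E path.logs=/var/log/es-b \
    -E http.port=9202 -E transport.port=9302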

@st4r-fish

In talking to our ES sales rep, I learned that they license nodes based on memory size, each node being worth 64GB of RAM (you should definitely double-check with your sales rep). If you currently have a single server running one instance of ES with 30 GB of heap, you could split it out into a number of instances and divide the heap based on the number of drives attached to that server. This is what we're planning on doing.

That's nice. However, I read suggestions like this multiple times:

The Elasticsearch process is very memory intensive. Elasticsearch uses a JVM (Java Virtual Machine), and close to 50% of the memory available on a node should be allocated to JVM. The JVM machine uses memory because the Lucene process needs to know where to look for index values on disk. The other 50% is required for the file system cache which keeps data that is regularly accessed in memory.
(source)

I have 10*2TB disks per ES instance and the CPU is already running at around 80% (indexing rate ~140K/s), which means I can't do anything about that, since I can't customize the VM resources ($$$). As I said earlier, I'd either have to attach larger disks (which would cause a significant drop in I/O performance due to throttling) or spin up multiple smaller VMs, resulting in wasting money on a problem I didn't have in the past 5 years. I can't even go to hot-warm-cold, since I need the data being indexed and there are 320 (*0.8) CPUs already doing that. That's why I'd like to see a solution that wouldn't cause performance issues (I love how fast Elasticsearch currently is) and wouldn't require paying 3-10 times more than I'm paying right now for analytics (e.g., managed cloud is waaay too expensive for me).
The reason I'm here, waiting for something I can work with, is that I really like how easily I can manage my cluster and how fast it is. I'm capable of managing my whole cluster (~30 nodes) without professional support (occasionally visiting the forum). So paying extra for something I can do for free and with low effort isn't something I'd like to do. I used MDP for over 5 years with tens of terabytes of data, hundreds of indices, and thousands of shards and never saw any issue with it.

@rpasche

rpasche commented Apr 5, 2022

This message is again shown as critical in version 7.17.1:

[screenshots: Upgrade Assistant showing the multiple data paths deprecation as critical]

@elasticforme

The reason I'm here, waiting for something I can work with, is that I really like how easily I can manage my cluster and how fast it is. I'm capable of managing my whole cluster (~30 nodes) without professional support (occasionally visiting the forum). So paying extra for something I can do for free and with low effort isn't something I'd like to do. I used MDP for over 5 years with tens of terabytes of data, hundreds of indices, and thousands of shards and never saw any issue with it.

I have a huge cluster like that, and now I can't upgrade to 8.x. Someone up the food chain just told me that if we can't upgrade, we should look for alternatives, so we are already evaluating a few other technologies. I can't say anything more than that.

@rpasche

rpasche commented Apr 5, 2022

@elasticforme it looks like it is still possible to use MDP in version 8.1.2. I just installed a test cluster, configured multiple data paths, and the server starts...

See the log from a test cluster with two 20GB volumes configured:

 [data0] using [2] data paths, mounts [[/data (/dev/xvdf), /data2 (/dev/xvdg)]], net usable_space [36.9gb], net total_space [39.1gb], types [ext4]

@willemdh

willemdh commented Apr 6, 2022

Seems that the critical alert in the upgrade assistant was a bug and should have been a warning...

#85695

At least that gives us some extra time to move away from it..

@st4r-fish

@bytebilly

If you want to discuss your scenario and possible solutions further, I suggest you engage on https://discuss.elastic.co/ (or open a support case if you have a support account), since those are the best places to get technical support from both our engineers and our community.

There isn't much going on there, nor here. I'm fine with y'all saying that we have to solve it ourselves, but I need to know whether Elastic is going to help our case or not. As previously discussed, most of us would need to invest thousands of dollars to be able to scale with this change, which is a no-go. Moving to another solution needs planning and testing (time & money too), but it needs to be set in motion now to be done before EOL. Not sure if you have ever moved hundreds of terabytes of data, but it isn't a walk in the park.

@P-EB

P-EB commented Jun 20, 2022

One thing is for sure, elastic will lose some clients if they decide to keep putting their heads in the sand singing "lalalalala industrial standards lalalalala".

@mtovmassian

mtovmassian commented Aug 31, 2022

It's a shame.

I was hoping that the Multiple Data Paths feature would have solved my low disk space issue in my cluster.
From my point of view it was a flexible and cheap solution:

  • Basically I need to double the disk size of my nodes: 100 GB -> 200 GB
  • By attaching a 100 GB volume as a second Elasticsearch partition I'm only billed 4€ extra per node.
  • Without the MDP feature, the next solution is to change the instance type to get a 200 GB disk: that costs me 43€ extra per node...

But in my case I was wondering if this other solution is possible:

  • Attach a 200 GB volume and create a partition on it: sudo fdisk /dev/vdb (resulting in /dev/vdb1)
  • Mount the /dev/vdb1 partition with the Elasticsearch data dir /var/lib/elasticsearch as the target: sudo mount /dev/vdb1 /var/lib/elasticsearch (a rough sketch of the full procedure is below)
    Is this possible, given that the Elasticsearch data dir initially lives on the main partition /dev/sda1?
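
In case it helps, a rough sketch of that migration under those assumptions (device and path names as above; stop Elasticsearch and take a snapshot/backup first):

  # Sketch: move the existing data directory onto the new, larger volume.
  sudo systemctl stop elasticsearch
  sudo parted -s /dev/vdb mklabel gpt mkpart primary ext4 0% 100%   # creates /dev/vdb1
  sudo mkfs.ext4 /dev/vdb1
  sudo mkdir -p /mnt/newdata
  sudo mount /dev/vdb1 /mnt/newdata
  sudo rsync -a /var/lib/elasticsearch/ /mnt/newdata/               # copy existing data
  sudo umount /mnt/newdata
  sudo mount /dev/vdb1 /var/lib/elasticsearch                       # remount over the data dir
  sudo chown -R elasticsearch:elasticsearch /var/lib/elasticsearch
  sudo systemctl start elasticsearch
  # Add /dev/vdb1 to /etc/fstab so the mount persists across reboots.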

@BlackMetalz

BlackMetalz commented May 25, 2023

@rjernst How is it going? after community feedback, it won't be removed, right?

MDP gives better performance while optimizing costs for cloud infrastructure

Update: I just read #78525.
I guess this issue should be closed.

@P-EB

P-EB commented Jul 17, 2023

I would not bet that the matter is closed on the customers' side. I think some day elastic will again push to drop the feature, despite the lack of any equivalent feature in the "industry".
