[RFC] Expanding DLS/FLS using Views #13137
Labels
enhancement
Enhancement or improvement to existing feature or request
Search
Search query, autocomplete ...etc
untriaged
Preempting questions
#6181
https://opensearch.org/docs/latest/security/access-control/document-level-security/
Basically, DLS let you filter out documents while FLS lets you filter out fields from docs.
While the Security plugin is home to DLS/FLS, Views lives in OpenSearch core and further development would likely involve a combination of changes in both repos. Further, I want to get lots of people's input and core sees the most foot-traffic.
I think it is easiest to make a change when you have some existing framework in place. In this case, we already have Views which was initially discussed as a potential replacement for Dls/Fls amongst other things.
The Amazon-affiliated maintainers of the Security plugin are currently looking for some new directions to drive our development. This seems like a prime opportunity to address some of main areas we hear user complaints about. If we are looking to establish potential new directions I think this could be a really good one.
Before investing a sizeable chunk of time and effort into making this shift in model, I want to gauge whether the community as a whole would be interested in something like that and what types of other considerations they may want addressed. While there are some issues that suggest the current DLS/FLS system is lacking, maybe this is selection bias and it is not as big of an issue as I think. Likewise, I want to provide the opportunity for everyone to voice their opinion on using Views as the replacement.
Background
This issue is meant to be a request for comments around a shift in how OpenSearch implements DLS/FLS.
Recently, @peternied created #11957 which abstracts indices from searches. I cannot do as well of a job explaining all the intricacies of Views as Peter himself, so I encourage everyone to visit #6181 to learn more.
This RFC is then meant to be the next step towards addressing one of the common pain-points OpenSearch Security users share: DLS/FLS can be hard to use, limited in its application, and non-intuitive in advanced cases.
As noted on the original RFC for future work around Views (#12424), the Security plugin frequently gets new issues focused around extending DLS/FLS nested fields.
That being said, I propose OpenSearch (and the Security plugin) pivot to remodel the DLS/FLS system to use views as opposed to the current system.
What is the current system for DLS/FLS
Currently, DLS/FLS takes place inside the Security plugin using a dual-pronged approach.
DLS
There are two forms of DLS which occur depending on the type of request being made.
The main place interception DLS takes place is inside the Security plugin with the DlsFlsRequestValve interface and its respective DlsFlsValveImpl. In this case, the onIndexModule method inside the OpenSearchSecurityPlugin modifies the query sent by the user. The apply method of the SecurityFilter class then verifies the action is allowed (there are a couple actions such as Updates which are not allowed while using DLS/FLS).
Finally, using the instances of the DlsFlsFilterLeafReader, OpenSearch is able to modify the behavior of Lucene and customize its segment merge policy. This lets OpenSearch mark documents as deleted on the fly. An example of how this is used is the Configuration index which is only accessible via the Admin TLS certificate. Anytime a non-admin user tries to access the index, the wrapped DirectoryReader uses the EmptyFilterLeafReader and the SecurityIndexSearcherWrapper.apply method to mark all docs in the index as deleted. This same process happens for all other protected or system indices. Behind the scenes, the SecurityIndexSearcherWrapper calls on the EmptyFilterLeafReader to return it a bit set showing all docs as deleted.
FLS
FLS is performed in a similar manner to the on-the-fly deletion of sensistive documents. FLS is based on the FieldFilterLeafReader class in Lucene. Using the DlsFlsFilterLeafReader in the Security plugin, when you execute a query, the LeafReader wrapper modifies the underlying Lucene bit set to hold only field info for the allowed fields. Within the category of FLS there are two additional considerations: source field handling and anonymization/field masking.
Source Field Handling
When using FLS we are not only dealing with the stored fields but also the source fields (an OpenSearch concept). Because you are able to include source fields with your search results, we need to account for these fields when performing FLS. The FlsStoredFieldVisitor removes filtered fields from the source fields if they are included in the search result.
Anonymization
Anonymization/field masking is based on the StoredFieldVisitor which utilizes the Visitor pattern (a Lucene construct for looking at stored fields) and adds hashing to the target fields. The security plugin handles this in the HashingStoredFieldVisitor class. There the overridden stringField method and the binaryField method allow for the anonymization of any target fields.
Reworking with Views
As discussed there are two types of DLS (interception and deletion) as well as FLS which need to be replaced for Views to cover the existing DLS/FLS functionality. There are a few different patterns which look promising.
The options for these patterns are:
Perform DLS by returning a View with only the allowed docs
For this operation flow, DLS would be peformed by whole-scale removing the existing DLS logic and dropping a customized ViewService into the Security plugin. In this scenario, the DlsFlsValveImpl would be swapped out for a siimilar class (DlsFlsViewImpl?) which could make use of Views for resolving Dls requests instead of EvaluatedDlsFlsConfigs. Rather than reading the EvaluatedDlsFlsConfigs, a request could be associated with a View (resolved from the roles similar to Dls/Fls but with a structured object) and then the specified indices could be returned before the search request is performed against only that subset. The View framework for this already exists with the SearchViewAction and GetViewAction.
This option has an added benefit over the existing implementation since it would not be necessary to use the on-the-fly-deletion form of DLS anymore. Because the search request would be restricted internally to the View(s) associated with the user's role there should not be the need to dynamically filter out protected or system indices.
Fls in this scenario would likely need to be handled as part of the view preprocessing or in a similar way to how it is handled now
Pros: Removes the second type of DLS processing, makes results more predictable since Views could be given a graphical component for visualization etc.
Cons: Pretty complicated changes, different user experience, not backwards compatible at all
Perform DLS by replacing the internal filtering system with reroutes to Views
For this option, the idea would be to keep the general Dls/Fls system the same but replace the internal components of the DlsFlsValveImpl with calls to a ViewService implementation which would be added to the Security plugin. Instead of making use of the current implementation, the Security plugin could handle conversion of Dls/Fls specs into Views. This would let a Dls/Fls configuration be behind-the-scenes converted into a View which would then be searched similar to the previous option. By converting the configurations into Views, operations like comparisons and combinations could be done in a memory efficient way. The results could be cached and memoized.
The problem with this option is that it does not directly address many of the common complaints with Dls/Fls. It is not really any more simple since all the conversion would be behind the scenes. Further, it does not remove the need for the dual form of Dls since it does not direclty address the chance that a search request would show protected indices.
Pros: Easier implementation, stored objects may boost repeated Dls/Fls performance
Cons: Does not directly address common concerns, still requires dual Dls system
Expand DLS by offering View support
A third option to consider is an effort to simply offer DLS through Views without directly replacing the entire system. For this approach, we could avoid a complete overhaul of the DLS system and simply connect the new concept of Views to the existing DLS framework. In this case, we would likely want to add an additional class to the Security plugin which again extended the ViewService from core but was called as part of the DLS process. Leaving FLS alone for now, this approach could see improvements to the usability of DLS by offering similar support to the first option (above) but not replacing all the existing code. Inside the DlsFlsValveImpl we could add conditional checking against the type of request and resolve a DLS filtered request to a View if one existed. This would likely require a new way to convert between a DLS pattern and a View, but would let users replace the somewhat cumbersone DLS definitons with the more straighforward View representation. For instance, while DLS currently requires a JSON formatted definition for each pattern associated with a role, we could offer View-based definitions as an alternative. As static references, we could then perform resolvable operations on View-based DLS definitions when they are defined.
For example, given two Views (1 and 2), we could provide simple arithmetic definitions for DLS. Because View 1 and View 2 are each objects inside OpenSearch, we could let users reference DLS patterns by saying "in View 1 or View 2" etc. We would also be able to provide administrators the ability to preview a DLS result since we can resolve the Views to their assigned indices at any given point. This would help address some of the more confusing scenarios of DLS where the interaction between multiple DLS-enforced roles is not clear.
Another benefit of this approach is that it would potentially lower the amount of instances where we are forced to interact on Lucene-level objects during DLS. Because this approach has some similarities with the first option, we would again be able to avoid the on-the-fly deletion work we currently perform. If the protected indices were not in any of the allowed Views you would not need to dynamically delete them.
Pros: Helps with user experience, less code changes
Cons: Interacts with the complicated current DLS system, does not address FLS
How would views need to change?
The most liekly requirements for any of the options for reworking DLS/FLS with Views is create new constructs which let OpenSearch track the indices associated with a View more effectively. Looking at the current View implementation, it does not seem there is a great way for accessing the indices inside the View as a batch. Adding an abstraction for ViewableIndex or something similar which was able to have its field names checked against from the ViewService may be helpful.
What is next?
Having overviewed a couple ideas for incorporating Views into an improved DLS/FLS system, I would love to hear community feedback around interest (should we bother?), feature requests around this area (is there something you want addressed I did not mention?), and pitfalls (did I overlook something or am I completely wrong about something?).
Outstanding questions for the community
Here are some questions I still have as the author of this RFC and things I hope the community may be able to help answer.
The current implementation is heavily intertwined with Lucene, is this needed?
Can we get to point where DLS is performed in SQL-style syntax, and is this desired?
Can Views help us resolve permissions before execution of an action and then tell whether a request will pass before it is sent?
Can we add negated logic to DLS if we use Views? (Can we append 'must not match View 1', etc.)
The text was updated successfully, but these errors were encountered: