Feature: Integrate with Apache Ranger #1054

Closed
acelyc111 opened this issue Jul 18, 2022 · 9 comments

@acelyc111
Member

Apache Ranger™ [1] is a framework to enable, monitor and manage comprehensive data security across the Hadoop platform. Many big data components already integrate with Ranger, such as HDFS, HBase, Hive, YARN, Kafka and Kudu.

Pegasus currently supports Kerberos and a built-in ACL, but the ACL is somewhat difficult to manage. Making Pegasus interact with Ranger would make access management easier.

  1. https://ranger.apache.org/
@acelyc111
Member Author

This is also tracked in the Ranger community: https://issues.apache.org/jira/browse/RANGER-3831

@kirbyzhou

If I understand correctly, the current ACL model is very simple. There are two access_controller classes.

  • meta_access_controller:

The super user is allowed to do anything.
All other users are allowed to call only the RPCs listed in FLAGS_meta_acl_rpc_allow_list. There are no per-user settings.
The default meta_acl_rpc_allow_list is:

RPC_CM_LIST_APPS
RPC_CM_LIST_NODES
RPC_CM_CLUSTER_INFO
RPC_CM_QUERY_PARTITION_CONFIG_BY_INDEX

  • replica_access_controller:

The super user is allowed to do anything.
Users in the allowed-users list are allowed to do anything. The list is set from the 'replica_access_controller.allowed_users' variable in replica::update_ac_allowed_users.
There seem to be no per-user settings here either.

It seems we first have to enhance the access controller mechanism and add per-user and per-table support, for example:
{ Table1: { user1: read, user2: read+write }, Table2: { user3: read, user4: read+write } }
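
A minimal C++ sketch of the per-table, per-user structure suggested above. It only illustrates the shape of the data: `PermissionMask`, `TableAcl`, `is_allowed` and `make_example_acl` are hypothetical names that do not come from the Pegasus code base, and representing permissions as a bitmask is an assumption.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical permission bits (an assumption, not a Pegasus type).
using PermissionMask = unsigned;
constexpr PermissionMask kRead = 1u << 0;
constexpr PermissionMask kWrite = 1u << 1;

// table name -> (user name -> permission bitmask)
using TableAcl =
    std::unordered_map<std::string, std::unordered_map<std::string, PermissionMask>>;

// Returns true only if `user` holds every bit of `wanted` on `table`.
inline bool is_allowed(const TableAcl &acl,
                       const std::string &table,
                       const std::string &user,
                       PermissionMask wanted)
{
    const auto table_it = acl.find(table);
    if (table_it == acl.end()) {
        return false; // unknown table: deny by default
    }
    const auto user_it = table_it->second.find(user);
    if (user_it == table_it->second.end()) {
        return false; // user is not listed for this table
    }
    return (user_it->second & wanted) == wanted;
}

// The example from the comment above, expressed with this structure.
inline TableAcl make_example_acl()
{
    return {
        {"Table1", {{"user1", kRead}, {"user2", kRead | kWrite}}},
        {"Table2", {{"user3", kRead}, {"user4", kRead | kWrite}}},
    };
}
```

With this layout, `is_allowed(make_example_acl(), "Table1", "user1", kWrite)` is false because user1 only holds the read bit on Table1.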

@kirbyzhou

Add a draft of Ranger Service definition
here

@acelyc111
Member Author

acelyc111 commented Aug 15, 2022

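The current ACL is described in #170 and in the Pegasus security authentication document (Pegasus 安全认证).

It can be summarized as follows:

| operation \ user | super user | table owner | other users |
|---|---|---|---|
| query cluster basic info | ✓ | ✓ | ✓ |
| table read and write | ✓ | ✓ | × |
| cluster control | ✓ | × | × |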
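
The current model, as summarized above and in the earlier description of the two access controllers, can be sketched as a single check. This is an illustration only: `Operation`, `LegacyAcl` and `legacy_allowed` are hypothetical names, the allowed-users list is kept per table in practice, and the sketch assumes that every cell not marked × is allowed.

```cpp
#include <string>
#include <unordered_set>

// The three operation classes from the summary table (hypothetical enum).
enum class Operation { kQueryClusterInfo, kTableReadWrite, kClusterControl };

// Simplified view of the legacy ACL state for one table.
struct LegacyAcl {
    std::unordered_set<std::string> super_users;
    std::unordered_set<std::string> table_allowed_users; // "table owner" column
};

inline bool legacy_allowed(const LegacyAcl &acl, const std::string &user, Operation op)
{
    if (acl.super_users.count(user) > 0) {
        return true; // the super user may do anything
    }
    switch (op) {
    case Operation::kQueryClusterInfo:
        return true; // basic cluster queries are open to every user
    case Operation::kTableReadWrite:
        return acl.table_allowed_users.count(user) > 0;
    case Operation::kClusterControl:
        return false; // reserved for the super user
    }
    return false;
}
```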

The extended ACL is based on the former design and is detailed as follows. Each entry gives the operation, its ACL symbol, the RPC codes it covers and, where given, the resource and an example.

Global level

  • query cluster, server (ACL symbol: metadata)
    cluster:
      RPC_CM_LIST_NODES
      RPC_CM_CLUSTER_INFO
      RPC_CM_LIST_APPS (can query all tables)
    server:
      RPC_QUERY_DISK_INFO

  • control cluster, server, multi tables (ACL symbol: control)
    cluster+server:
      RPC_HTTP_SERVICE (HTTP request, has no principal currently)
    set LB level:
      RPC_CM_CONTROL_META
    recover meta server through replica servers:
      RPC_CM_START_RECOVERY
    on replica server, migrate replica between disks:
      RPC_REPLICA_DISK_MIGRATE
    on replica server, add new disks:
      RPC_ADD_NEW_DISK
    on replica server, detect hot key:
      RPC_DETECT_HOTKEY
    cluster+server, remote command (includes many operations):
      RPC_CLI_CLI_CALL
    (multi-tables) add or modify backup policy (may be removed later):
      RPC_CM_ADD_BACKUP_POLICY
      RPC_CM_MODIFY_BACKUP_POLICY

Database level

  • query cluster, server (ACL symbol: list)
    cluster:
      RPC_CM_LIST_APPS (can only query tables in the database)

  • create table (ACL symbol: create)
      RPC_CM_CREATE_APP
    resource: db1
    example: can create tables prefixed by "db1_"

  • drop/recall table (ACL symbol: drop)
      RPC_CM_DROP_APP
      RPC_CM_RECALL_APP

  • manage table, query (ACL symbol: metadata)
    (multi-tables) query backup policy (may be removed later):
      RPC_CM_QUERY_BACKUP_POLICY
    (single-table) query backup policy:
      RPC_CM_QUERY_BACKUP_STATUS
    (single-table) query backup from policy status:
      RPC_CM_QUERY_RESTORE_STATUS
    (single-table) query duplication:
      RPC_CM_QUERY_DUPLICATION
    (single-table) query partition split:
      RPC_CM_QUERY_PARTITION_SPLIT
    (single-table) query bulk load:
      RPC_CM_QUERY_BULK_LOAD_STATUS
    (single-table) query manual compact:
      RPC_CM_QUERY_MANUAL_COMPACT_STATUS
    (single-table) query RF of a table:
      RPC_CM_GET_MAX_REPLICA_COUNT

  • manage table, control (ACL symbol: control)
    (single-table) start backup:
      RPC_CM_START_BACKUP_APP
    (single-table) control restore from backup:
      RPC_CM_START_RESTORE
    (single-table) migrate one replica:
      RPC_CM_PROPOSE_BALANCER
    (single-table) add or modify duplication:
      RPC_CM_ADD_DUPLICATION
      RPC_CM_MODIFY_DUPLICATION
    (single-table) update table's envs:
      RPC_CM_UPDATE_APP_ENV
    (single-table) DDD diagnose for tables:
      RPC_CM_DDD_DIAGNOSE
    (single-table) start and control partition split:
      RPC_CM_START_PARTITION_SPLIT
      RPC_CM_CONTROL_PARTITION_SPLIT
    (single-table) start, clear up and control bulk load:
      RPC_CM_START_BULK_LOAD
      RPC_CM_CONTROL_BULK_LOAD
      RPC_CM_CLEAR_BULK_LOAD
    (single-table) start manual compact:
      RPC_CM_START_MANUAL_COMPACT
    (single-table) update table's RF:
      RPC_CM_SET_MAX_REPLICA_COUNT

Database/Table level

  • read data (ACL symbol: read)
    meta server, route info:
      RPC_CM_QUERY_PARTITION_CONFIG_BY_INDEX
    meta server, list:
      RPC_CM_LIST_APPS (can only query privileged tables)
    replica server, server level:
      replica_stub::on_client_read
    replica server, replica level:
      replica::on_client_read
    resource: db1/table1
    example: can read table db1.table1

  • write data (ACL symbol: write)
    replica server, server level:
      replica_stub::on_client_write
    replica server, replica level:
      replica::on_client_write
    resource: db1/*
    example: can write tables prefixed by 'db1'

Server internal (not in ACL, ACL symbol: N/A)

      RPC_CM_CONFIG_SYNC
      RPC_CM_UPDATE_PARTITION_CONFIGURATION
      RPC_CM_REPORT_RESTORE_STATUS
      RPC_CM_DUPLICATION_SYNC
      RPC_CM_REGISTER_CHILD_REPLICA
      RPC_CM_NOTIFY_STOP_SPLIT
      RPC_CM_QUERY_CHILD_STATE
      RPC_NEGOTIATION
      RPC_CALL_RAW_MESSAGE
      RPC_CALL_RAW_SESSION_DISCONNECT
      RPC_NFS_GET_FILE_SIZE
      RPC_NFS_COPY
      RPC_FD_FAILURE_DETECTOR_PING
      RPC_CALL_RAW_MESSAGE
      RPC_CALL_RAW_SESSION_DISCONNECT
      RPC_CONFIG_PROPOSAL
      RPC_GROUP_CHECK
      RPC_QUERY_PN_DECREE
      RPC_QUERY_REPLICA_INFO
      RPC_QUERY_LAST_CHECKPOINT_INFO
      RPC_PREPARE
      RPC_GROUP_CHECK
      RPC_QUERY_APP_INFO
      RPC_LEARN
      RPC_LEARN_COMPLETION_NOTIFY
      RPC_LEARN_ADD_LEARNER
      RPC_REMOVE_REPLICA
      RPC_COLD_BACKUP
      RPC_CLEAR_COLD_BACKUP
      RPC_SPLIT_NOTIFY_CATCH_UP
      RPC_SPLIT_UPDATE_CHILD_PARTITION_COUNT
      RPC_BULK_LOAD
      RPC_GROUP_BULK_LOAD
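
One way to read the table above is as a lookup from an RPC code to the level and ACL symbol a caller needs. The sketch below covers only a handful of rows and is not how Pegasus stores this mapping: `Level`, `AclSymbol`, `Requirement` and `find_requirement` are hypothetical names. RPC_CM_LIST_APPS is left out on purpose because it appears at more than one level in the table.

```cpp
#include <string>
#include <unordered_map>

enum class Level { kGlobal, kDatabase, kDatabaseTable, kInternal };
enum class AclSymbol { kMetadata, kControl, kCreate, kDrop, kRead, kNotApplicable };

// What a caller must hold to issue a given RPC.
struct Requirement {
    Level level;
    AclSymbol symbol;
};

// A few rows copied from the table above; the real table is much longer.
inline const std::unordered_map<std::string, Requirement> &rpc_requirements()
{
    static const std::unordered_map<std::string, Requirement> kRequirements = {
        {"RPC_CM_LIST_NODES", {Level::kGlobal, AclSymbol::kMetadata}},
        {"RPC_CM_CONTROL_META", {Level::kGlobal, AclSymbol::kControl}},
        {"RPC_CM_CREATE_APP", {Level::kDatabase, AclSymbol::kCreate}},
        {"RPC_CM_DROP_APP", {Level::kDatabase, AclSymbol::kDrop}},
        {"RPC_CM_START_BACKUP_APP", {Level::kDatabase, AclSymbol::kControl}},
        {"RPC_CM_QUERY_PARTITION_CONFIG_BY_INDEX", {Level::kDatabaseTable, AclSymbol::kRead}},
        {"RPC_CM_CONFIG_SYNC", {Level::kInternal, AclSymbol::kNotApplicable}},
    };
    return kRequirements;
}

// Returns nullptr for RPC codes that are not in the sketch's table.
inline const Requirement *find_requirement(const std::string &rpc_code)
{
    const auto &table = rpc_requirements();
    const auto it = table.find(rpc_code);
    return it == table.end() ? nullptr : &it->second;
}
```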

@acelyc111
Member Author

acelyc111 commented Aug 15, 2022

  1. Stricter control of common query requests.
    a. As shown in the table above, users must be granted 'metadata' before they can query cluster info.
  2. The MetaServer has to support table-level ACL.
    a. For example, querying the route info of a table (i.e. RPC_CM_QUERY_PARTITION_CONFIG_BY_INDEX) must be subject to ACL.
  3. To support 'database' level ACL:
    a. On the MetaServer:
    i. Handle the requests listed in the table above under 'Database level' (list/create/drop/metadata/control).
    ii. Add an unordered_map<'table_prefix', unordered_set<user_name>> structure.
    iii. Parse the table name from the request message when handling a request.
    b. On the ReplicaServer:
    i. Add an unordered_map<'table_prefix', unordered_set<user_name>> structure too.
    ii. Because the "table prefix" string doesn't belong to any table or replica, we have to add server-level envs for ACL in addition to the replica envs.
    iii. We have to maintain the relationship between table id and table name carefully.
    iv. Parse the table id from the request message and map it to the table name when handling a request.
  4. Implementation:
    a. The leader meta_server periodically requests ACL details from Apache Ranger through HTTP.
    b. Parse the JSON-formatted response into the required internal structure.
    c. Set the structure on the MetaServer and store it in remote ZooKeeper.
    d. Set it on each table.
    e. Send the envs info to the ReplicaServers through the config sync RPC.
  5. The relationship between table name and database name:
    a. Use '.' to split the table name; the part before '.' is taken as the database name.
    b. At bootstrap, tables that were already created and do not match the new naming rule are considered to be in the "default" database.
    c. When ACL is enabled, creating a table whose name does not match the rule is not allowed.
    d. For the rename operation, modifying the prefix is not allowed.
@acelyc111
Member Author

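  1. Stricter control of ordinary query requests:
    a. As in the table above, the "metadata" permission under "Global level" also requires ACL
  2. Add table-level ACL on the meta server:
    a. For example, fetching a table's route info (RPC_CM_QUERY_PARTITION_CONFIG_BY_INDEX) also needs an ACL check
  3. To support control at database (prefix) granularity:
    a. On the meta server:
    i. This applies to the "Database level" requests in the table above that are handled on the meta server
    ii. An unordered_map<table prefix, unordered_set<user name>> structure needs to be added
    iii. In each request handler, check the ACL after parsing the "table name" from the request
    b. On the replica server:
    i. A data structure similar to the one on the meta server needs to be added (unordered_map<table prefix, unordered_set<user name>>)
    ii. Because a "table prefix" does not belong to any table, server-level envs need to be added for authorization in addition to the table-level envs
    iii. Because the replica server does not maintain "table names", only "table ids", a mapping from table id to table name also needs to be added
    iv. In each request handler, map the "table id" parsed from the request to a table name, then check the ACL
  4. Implementation:
    a. The leader meta_server periodically fetches the Pegasus ACL from Ranger through HTTP requests
    b. Parse the JSON-formatted ACL into the required data structures
    c. Set it in the meta server's own ACL structure and also store it in remote storage (i.e. ZooKeeper)
    d. Set it in the envs of each table
    e. The envs are then delivered to each replica server through the meta server → replica server sync
  5. Mapping between table name and database name:
    a. The database name is delimited by the "." symbol; the part before "." is the database name
    b. At startup, an already created table that does not follow this naming scheme is placed in the "default" database
    c. After ACL is enabled, creating a table that does not follow the naming rule returns an error
    d. For the rename operation, modifying the prefix is not allowed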
@kirbyzhou

We have finished the service definition in ranger according to this draft.
See https://issues.apache.org/jira/browse/RANGER-3831

@WHBANG
Contributor

WHBANG commented Feb 22, 2023

An introduction to the implementation and how to use it:

  1. The class diagram
    (image: class diagram)

First, add the ACL-related configurations. The client configuration has not changed; the server has new configuration items:

enable_ranger_acl: whether to use Ranger for ACL
ranger_service_url: the Ranger server URL
ranger_service_name: the name of the Ranger policy to use
mandatory_enable_acl: makes the Ranger policy mandatory; currently used for testing

The details are as follows:

server:
[security]
  update_ranger_policy_interval_sec
[ranger]
  ranger_service_url
  ranger_service_name
  ranger_legacy_table_database_mapping_rule
  mandatory_enable_acl
[security]
  enable_auth = true
  krb5_keytab = /root/apache/pegasus.keytab
  krb5_config = /etc/krb5.conf
  krb5_principal = XXXXX
  sasl_plugin_path = /root/apache/incubator-pegasus/thirdparty/output/lib/sasl2
  service_fqdn = XXXXX
  service_name = XXXXX
  mandatory_auth = true
  enable_acl = true
  super_users =
  meta_acl_rpc_allow_list =
  enable_ranger_acl = true

java client:
meta_servers = 127.0.0.1:34601,127.0.0.1:34602,127.0.0.1:34603
operation_timeout = 5000
async_workers = 4
enable_perf_counter = false
perf_counter_tags = cluster=onebox,app=unit_test
push_counter_interval_secs = 10
meta_query_timeout = 5000
auth_protocol = kerberos
kerberos_service_name = XXXXX
kerberos_service_fqdn = XXXXX
kerberos_keytab = /root/apache/pegasus.keytab
kerberos_principal = XXXXX

shell:
[security]
  enable_auth = true
  krb5_keytab = /root/apache/pegasus.keytab
  krb5_config = /etc/krb5.conf
  krb5_principal = XXXXX
  sasl_plugin_path = /root/apache/incubator-pegasus/thirdparty/output/lib/sasl2
  service_fqdn = XXXXX
  service_name = XXXXX

Second, compatibility: the old ACL mode is retained.

  1. Use the old ACL:
enable_acl = true
enable_ranger_acl = false
  2. Use Ranger for ACL:
enable_acl = true
enable_ranger_acl = true

Third, define the Ranger policy:

  • Pegasus resources are divided into multiple types, and the operations on each resource type are divided as well. Each operation type corresponds to one ACL symbol.

image

  • The ACLs on each type of resource correspond to specific rpc_code values.

image

image

image

Pegasus + Ranger:

After completing the integration of Ranger with Pegasus, you can set permissions on the Ranger web page according to your needs.
image
image

@kirbyzhou

kirbyzhou commented Feb 22, 2023 via email

empiredan pushed a commit that referenced this issue Mar 9, 2023
#1054

This patch is compatible with both the old and the new ACL.

- Rename some methods and parameters to make them more accurate.
- Define the configuration parameters that the new ACL needs.
- Provide two new 'allowed()' methods for meta_server and replica_server.
- Comment out some incompatible methods (allowed & pre_check); they will be removed later.
empiredan pushed a commit that referenced this issue Mar 10, 2023
#1054

This patch prepares for parsing and dumping policies:
- Add 'DEFINE_JSON_SERIALIZATION' for the data structures.
- Prepare for JSON parsing.
- Add a unit test for 'parse_policies_from_json'.
acelyc111 pushed a commit that referenced this issue Mar 17, 2023
…resources policies (#1388)

#1054

This patch implements how to pull policies from the Ranger Service
and dump policies to remote storage.

- Pull policies in JSON format from the Ranger service and parse
  policies from the JSON-formatted string.
- Create the path to save policies in remote storage, and update
  using resources policies.
- Dump policies to remote storage.
- Sync policies to app envs.
- Update the cached global/database resources policies.
acelyc111 pushed a commit that referenced this issue Apr 3, 2023
#1054

This patch implements the meta access controller using Ranger for ACL.
1. Re-implement the access control logic of the RPCs registered in `meta_service.h`,
    adapting it to both the old and the new ACL.
2. Register some internal RPCs in the `_allowed_rpc_code_list`.
3. Periodically update the resource policies from the Ranger service.
4. Change some unit tests.
empiredan pushed a commit that referenced this issue Apr 16, 2023
#1054

This patch implements replica access controller using Ranger for ACL.

1. The Ranger policy info of the table is written to the app_envs of the table.
2. Support using the policy in the app_envs for ACL when the replica server
     reads and writes.
3. Modify some unit tests, and be compatible with old and new ACL.
acelyc111 pushed a commit that referenced this issue Apr 20, 2023
…rn (#1445)

#1054

This patch adds ACL to the learn action of a replica.

1. Specifically, learn is regarded as a write action, and the Ranger
    policy determines whether learning between master and slave replicas is allowed.
acelyc111 pushed a commit that referenced this issue Apr 21, 2023
#1452)

#1054

This patch adds ACL to the NFS copy of a replica.

1. Added `gpid` info to the data structure defined in `nfs.thrift`
2. Perform ACL through the Ranger policy matched by `gpid`
3. The registration of nfs is moved to `replica_stub.cpp`, and the original registration 
    information is retained, which is convenient for testing
empiredan pushed a commit that referenced this issue Jun 1, 2023
)

#1054

Access control for the RPC RPC_CM_LIST_APPS is removed from
the global-level resource; it is now managed by the database resource.
empiredan pushed a commit that referenced this issue Jun 8, 2023
)

#1054

This patch adds a new conf item `legacy_table_database_mapping_policy_name`.
Legacy tables (tables created before Ranger ACL was enabled) will be
matched to the database named `legacy_table_database_mapping_policy_name`
for ACL.

"*" can match any table, including legacy tables and tables named by new rules.
empiredan pushed a commit that referenced this issue Jun 15, 2023
#1054

This patch fixes the decision logic used when matching Ranger policies:

1. Traverse all resource policies.
   i. If the current policy matches deny_condition:
      a. If it does not match any deny_exclude, return kDenied and end the traversal.
      b. If a deny_exclude is matched, return kPending and continue with the next policy.
   ii. If no policy is matched or the result is kPending, go to step 2.
2. Traverse all resource policies again.
   i. If the current policy matches allow_condition:
      a. If it does not match any allow_exclude, return kAllowed and end the traversal.
      b. If an allow_exclude is matched, return kPending and continue with the next policy.
   ii. If the result is kPending, return kDenied.
3. If no policy is matched, return kDenied.
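
The two-pass evaluation described in this commit message can be sketched in C++ as follows. This is a simplified illustration, not the patch itself. Each condition is reduced to a set of user names, whereas real Ranger policy items also carry access types. `ResourcePolicy`, `match_deny`, `match_allow` and `ranger_allows` are hypothetical names, and `kNotMatched` is added for the sketch alongside the `kAllowed`, `kDenied` and `kPending` values named above.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

enum class MatchResult { kNotMatched, kAllowed, kDenied, kPending };

using UserSet = std::unordered_set<std::string>;

// Simplified policy: each condition is just the set of users it matches.
struct ResourcePolicy {
    UserSet allow;
    UserSet allow_exclude;
    UserSet deny;
    UserSet deny_exclude;
};

inline bool contains(const UserSet &users, const std::string &user)
{
    return users.count(user) > 0;
}

// Step 1.i: a deny match is final unless a deny_exclude also matches.
inline MatchResult match_deny(const ResourcePolicy &policy, const std::string &user)
{
    if (!contains(policy.deny, user)) {
        return MatchResult::kNotMatched;
    }
    return contains(policy.deny_exclude, user) ? MatchResult::kPending : MatchResult::kDenied;
}

// Step 2.i: an allow match is final unless an allow_exclude also matches.
inline MatchResult match_allow(const ResourcePolicy &policy, const std::string &user)
{
    if (!contains(policy.allow, user)) {
        return MatchResult::kNotMatched;
    }
    return contains(policy.allow_exclude, user) ? MatchResult::kPending : MatchResult::kAllowed;
}

inline bool ranger_allows(const std::vector<ResourcePolicy> &policies, const std::string &user)
{
    // Pass 1: any effective deny ends the evaluation.
    for (const auto &policy : policies) {
        if (match_deny(policy, user) == MatchResult::kDenied) {
            return false;
        }
    }
    // Pass 2: the first effective allow grants access.
    for (const auto &policy : policies) {
        if (match_allow(policy, user) == MatchResult::kAllowed) {
            return true;
        }
    }
    // Nothing matched, or only pending results were seen: deny.
    return false;
}
```

Running the deny pass to completion before the allow pass means that a deny in any policy overrides an allow in another, regardless of policy order.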