Specify label behaviour in documentation (project-oak#687)

Ref project-oak#307
rbehjati · Mar 11, 2020 · a39e166 · a39e166
1 parent 3da7825
commit a39e166
Show file tree

Hide file tree

Showing 2 changed files with 175 additions and 29 deletions.
diff --git a/docs/concepts.md b/docs/concepts.md
@@ -4,12 +4,12 @@
 - [Oak Node](#oak-node)
   - [WebAssembly](#webassembly)
 - [Channels](#channels)
+- [Labels](#labels)
 - [Pseudo-Nodes](#pseudo-nodes)
 - [Oak Application](#oak-application)
 - [Oak Runner](#oak-runner)
   - [Workflow](#workflow)
 - [Remote Attestation](#remote-attestation)
-- [Oak Runtime Updates](#oak-runtime-updates)
 - [Time](#time)
 
 ## Oak Runtime
@@ -22,7 +22,7 @@ Each Oak Runtime instance lives in its own dedicated enclave and is isolated
 from both the host as well as other enclaves and Oak Runtime instances on the
 same machine.
 
-## Oak Node
+## Nodes
 
 The unit of execution in Oak is an **Oak Node**. The code for an Oak Node is a
 self-contained [WebAssembly module](https://webassembly.org/docs/modules/) that
@@ -57,19 +57,163 @@ others, for example Go.
 
 ## Channels
 
-Communication from the Oak Node to the Oak Runtime and to other Nodes is
-implemented via **channels**. A channel represents a uni-directional stream of
+Communication from an Oak Node to the Oak Runtime and to other Nodes is
+implemented via **Channels**. A Channel represents a uni-directional stream of
 messages, with a receive half and a send half that an Oak Node can read from or
-write to respectively. Each half of a channel is identified by a **handle**,
+write to respectively. Each half of a Channel is identified by a **handle**,
 which is used as a parameter to the corresponding
 [host function](abi.md#host-functions) calls. Channel handles are integer values
 that are specific to a particular Node (like per-process UNIX file descriptors),
-so handle value 3 for Node A identify a different channel than handle value 3
+so handle value 3 for Node A identify a different Channel than handle value 3
 for Node B (unless there happens to be a coincidence of numbering).
 
+## Labels
+
+### Overview
+
+- Nodes and Channels have **Labels**, fixed at creation time
+- The Runtime keeps track of Labels, and enforces flow of data between Nodes and
+  Channels based on them
+- Each Label has two **components**: **secrecy** and **integrity**
+- Each component is an unordered set of **tags**
+- Each tag is a structured representation of a **security principal**
+- A security principal is either a **user** or a **computation**
+
+### Details
+
+Labels are used to enforce information flow control (IFC) between Nodes and
+Channels in an Oak Application and the outside world.
+
+#### Runtime
+
+The Oak Runtime keeps track of the Label associated with each Node and each
+Channel. When a Node creates a new Node or Channel, the creator Node specifies
+the Label to associate with the newly created Node or Channel. The Label does
+not change any more after that (apart from explicit declassification operations,
+which are not currently supported). Labels are always considered public, and a
+Node may inspect its own Label or that of any Channel it has access to, and (if
+the Oak Runtime is implemented correctly) this is guaranteed to not allow the
+creation of side channels based on Labels.
+
+#### Components
+
+A Label has two components: **secrecy** and **integrity**.
+
+Each component is represented as an (unordered) set of tags, and each tag refers
+to an individual security principal in the system.
+
+The following types of security principals are supported in Oak:
+
+- **user**: specified by a bearer token, public key, or some other assertion of
+  a user identity
+- **computation**: specified by a measurement (hash) of the WebAssembly module
+  running within an Oak Node
+
+Tags and Labels are represented as serialized
+[protobuf messages](/oak/proto/policy.proto).
+
+Any security principal may be used (as a tag) as part of secrecy or integrity
+components, depending on the required use case.
+
+Intuitively, if the secrecy component contains tag `t`, the Node or Channel is
+allowed to see secrets owned by principal `t`. Similarly, if the integrity
+component contains tag `t`, that means the Node or Channel is trusted by `t`.
+
+#### Information Flow
+
+Labels are compared to decide whether or not data movement is allowed, for
+example when a Node sends a value over a Channel. Given two labels `L_a` and
+`L_b`, `L_a ⊑ L_b` (pronounced "flows to") if a value from a source labeled
+`L_a` can be sent to a destination labeled `L_b`. The "flows to" operator
+compares both secrecy and integrity:
+
+- the **secrecy** check ensures that the destination is permitted to view any
+  secrets that may have influenced the value
+- the **integrity** check ensures that the value sent is trusted by the
+  destination
+
+More concretely, if `L_a = (S_a, I_a)` (where `S_a` and `I_a` are the secrecy
+and integrity components respectively), and `L_b = (S_b, I_b)` then `L_a ⊑ L_b`
+iff `S_a ⊆ S_b` and `I_a ⊇ I_b`. Notably, the integrity part of the check is
+"flipped" by convention (see "Integrity Considerations for Secure Computer
+Systems" below). More fundamental than adhering to convention, by flipping the
+integrity order in this way, we can retain the intuitive meaning behind
+integrity tags as representing trust by a principal (trust by a user, trust on a
+subject, etc.).
+
+Intuitively, data can only flow from `a` to `b` if:
+
+- `b` is **at least as secret** as `a`
+- `a` is **at least as trusted** as `b`
+
+The least privileged label is usually referred to as "public trusted" (and
+represented as `⊥`, pronounced "bottom"), which corresponds to a Node or Channel
+which has only observed public data, and its inputs are not endorsed with any
+level of integrity; in this label, both secrecy and integrity components are
+empty sets.
+
+As an example, if the `a`'s secrecy is `{s_0, s_1}`, and `a`'s integrity is
+`{i_0, i_1}`, then:
+
+- information can flow from `a` to `b` if `b`'s secrecy is
+  `{s_0, s_1, s_2} ⊇ {s_0, s_1}`, since `b` is "at least as secret" than `a`
+- information cannot flow from `a` to `b` if `b`'s secrecy is
+  `{s_0} ⊉ {s_0, s_1}`, since `b` is "less secret" than `a` (in particular, `b`
+  is not allowed to see secrets owned by `s_1`)
+- information cannot flow from `a` to `b` if `b`'s integrity is
+  `{i_0, i_1, i_2} ⊈ {i_0, i_1}`, since `a` is "less trusted" than `b` (in
+  particular, `a` is not trusted by `i_2`)
+- information can flow from `a` to `b` if `b`'s integrity is
+  `{i_0} ⊆ {i_0, i_1}`, since `a` is "at least as trusted" than `b`
+
+In particular:
+
+- A Node with label `L_w` may write to a Channel with Label `L_c` iff
+  `L_w ⊑ L_c`.
+- A Node with label `L_r` may read from a Channel with Label `L_c` iff
+  `L_c ⊑ L_r`.
+
+If a Node tries to write to or read from a Channel that it is not allowed to,
+based on the `⊑` relation, the operation fails immediately (i.e. the read or
+write ABI call returns an error to the caller), and no side effects are
+performed.
+
+It follows that bi-directional communication between Nodes `a` and `b` is only
+allowed (via two uni-directional Channels) if
+`(L_a ⊑ L_b) ∧ (L_b ⊑ L_a) ⇒ L_a = L_b`, i.e. if `a` and `b` have identical
+secrecy and integrity.
+
+More details on Information Flow Control may be found in the following
+references:
+
+- [Information Flow Control for Standard OS Abstractions](https://pdos.csail.mit.edu/papers/flume-sosp07.pdf)
+- [Flow-Limited Authorization](https://www.cs.cornell.edu/andru/papers/flam/flam-csf15.pdf)
+- [Integrity Considerations for Secure Computer Systems](http://seclab.cs.ucdavis.edu/projects/history/papers/biba75.pdf)
+
+### gRPC and user labels
+
+As [described below](#pseudo-nodes), each incoming gRPC invocation is
+represented by a message containing two Channels handles:
+
+- a "request receiver" read handle to a Channel whose Label has:
+
+  - a secrecy component explicitly specified by the caller as gRPC request
+    metadata; this represents the secrecy guarantees that the caller wants to
+    impose on the request message.
+  - an integrity component implicitly set by the gRPC pseudo-node based on some
+    trusted authentication mechanism that is performed as part of the gRPC
+    connection itself; this represents the actual authority of the caller.
+
+- a "response sender" write handle to a Channel, whose Label has:
+
+  - a secrecy component implicitly set by the gRPC pseudo-node based on some
+    trusted authentication mechanism that is performed as part of the gRPC
+    connection itself; this represents the actual authority of the caller.
+  - an empty integrity component.
+
 ## Pseudo-Nodes
 
-An Oak node is limited to exchanging messages with other Nodes via channels; to
+An Oak Node is limited to exchanging messages with other Nodes via Channels; to
 allow interactions with the outside world, the Oak system also provides a number
 of **pseudo-Nodes**.
 
@@ -79,7 +223,7 @@ Application:
 - Pseudo-Node instances are created with `node_create()` as for Wasm Nodes (with
   the exception of the gRPC pseudo-Node, which is automatically created at
   Application start-of-day).
-- Nodes exchange messages with the pseudo-Nodes over channels.
+- Nodes exchange messages with the pseudo-Nodes over Channels.
 
 However, the pseudo-Nodes are implemented as part of the Oak Runtime (executing
 as native C++ code, rather than Wasm code) so that they can interact with the

diff --git a/oak/server/rust/oak_abi/src/label.rs b/oak/server/rust/oak_abi/src/label.rs
@@ -29,6 +29,8 @@ use prost::Message;
 use hashbrown::HashSet;
 
 /// A trait representing a label as part of a lattice.
+///
+/// See https://github.com/project-oak/oak/blob/master/docs/concepts.md#labels
 pub trait Label: Sized {
     /// Convert the label to bytes.
     fn serialize(&self) -> Vec<u8>;
@@ -37,7 +39,7 @@ pub trait Label: Sized {
     fn deserialize(bytes: &[u8]) -> Option<Self>;
 
     /// Compare two labels according to the lattice structure: L_0 ⊑ L_1.
-    fn can_flow_to(&self, other: &Self) -> bool;
+    fn flows_to(&self, other: &Self) -> bool;
 }
 
 impl Label for crate::proto::policy::Label {
@@ -51,7 +53,7 @@ impl Label for crate::proto::policy::Label {
         Self::decode(bytes).ok()
     }
 
-    fn can_flow_to(&self, other: &Self) -> bool {
+    fn flows_to(&self, other: &Self) -> bool {
         #![allow(clippy::mutable_key_type)]
 
         let self_secrecy_tags: HashSet<_> = self.secrecy_tags.iter().collect();
@@ -177,37 +179,37 @@ mod tests {
         //          public_trusted
 
         // Data with any label can flow to the same label.
-        assert_eq!(true, public_trusted.can_flow_to(&public_trusted));
-        assert_eq!(true, label_0.can_flow_to(&label_0));
-        assert_eq!(true, label_1.can_flow_to(&label_1));
-        assert_eq!(true, label_0_1.can_flow_to(&label_0_1));
-        assert_eq!(true, label_1_0.can_flow_to(&label_1_0));
+        assert_eq!(true, public_trusted.flows_to(&public_trusted));
+        assert_eq!(true, label_0.flows_to(&label_0));
+        assert_eq!(true, label_1.flows_to(&label_1));
+        assert_eq!(true, label_0_1.flows_to(&label_0_1));
+        assert_eq!(true, label_1_0.flows_to(&label_1_0));
 
         // label_0_1 and label_1_0 are effectively the same label, since the order of tags does not
         // matter.
-        assert_eq!(true, label_0_1.can_flow_to(&label_1_0));
-        assert_eq!(true, label_1_0.can_flow_to(&label_0_1));
+        assert_eq!(true, label_0_1.flows_to(&label_1_0));
+        assert_eq!(true, label_1_0.flows_to(&label_0_1));
 
         // public_trusted data can flow to more private data;
-        assert_eq!(true, public_trusted.can_flow_to(&label_0));
-        assert_eq!(true, public_trusted.can_flow_to(&label_1));
-        assert_eq!(true, public_trusted.can_flow_to(&label_0_1));
+        assert_eq!(true, public_trusted.flows_to(&label_0));
+        assert_eq!(true, public_trusted.flows_to(&label_1));
+        assert_eq!(true, public_trusted.flows_to(&label_0_1));
 
         // Private data cannot flow to public_trusted.
-        assert_eq!(false, label_0.can_flow_to(&public_trusted));
-        assert_eq!(false, label_1.can_flow_to(&public_trusted));
-        assert_eq!(false, label_0_1.can_flow_to(&public_trusted));
+        assert_eq!(false, label_0.flows_to(&public_trusted));
+        assert_eq!(false, label_1.flows_to(&public_trusted));
+        assert_eq!(false, label_0_1.flows_to(&public_trusted));
 
         // Private data with non-comparable labels cannot flow to each other.
-        assert_eq!(false, label_0.can_flow_to(&label_1));
-        assert_eq!(false, label_1.can_flow_to(&label_0));
+        assert_eq!(false, label_0.flows_to(&label_1));
+        assert_eq!(false, label_1.flows_to(&label_0));
 
         // Private data can flow to even more private data.
-        assert_eq!(true, label_0.can_flow_to(&label_0_1));
-        assert_eq!(true, label_1.can_flow_to(&label_0_1));
+        assert_eq!(true, label_0.flows_to(&label_0_1));
+        assert_eq!(true, label_1.flows_to(&label_0_1));
 
         // And vice versa.
-        assert_eq!(false, label_0_1.can_flow_to(&label_0));
-        assert_eq!(false, label_0_1.can_flow_to(&label_1));
+        assert_eq!(false, label_0_1.flows_to(&label_0));
+        assert_eq!(false, label_0_1.flows_to(&label_1));
     }
 }