[RFC] Host and Hostname fields - Stage 0 #1512

hadadata59 · 2021-07-13T12:07:58Z

Have you signed the contributor license agreement? yes
Have you followed the contributor guidelines? yes
For proposing substantial changes or additions to the schema, have you reviewed the RFC process? yes
If submitting code/script changes, have you verified all tests pass locally using make test?
If submitting schema/fields updates, have you generated new artifacts by running make and committed those changes?
Is your pull request against master? Unless there is a good reason otherwise, we prefer pull requests against master and will backport as needed.
Have you added an entry to the CHANGELOG.next.md?

cla-checker-service · 2021-07-13T12:08:01Z

💚 CLA has been signed

ebeahan

Thanks for submitting this stage 0 RFC, @hadadata59!

I left some initial feedback.

ebeahan · 2021-07-20T17:08:00Z

rfcs/text/0000/agent.yml

@@ -0,0 +1,7 @@
+- name: agent
+  fields:
+    - name: hostname


The agent.* fields are meant to describe the software entity collecting events on a host or observer. As a software entity, the agent.hostname field has been left out intentionally since the hostname is instead an attribute of a host.* or an observer.*.

We have seen evidence of records (observer.*) that report on a host (host.*) and on the agent (agent.*), where the hostnames of each (observer, host, and agent) are distinct.

Do you have an example you'd be willing to share for this discussion?

@ebeahan
Here is an anonymized representation of the event output from the agent (represented with 123) reporting on the HOST and the other agent deployed on the host (321).

Field	Value
application.reporting_agent.assetid	GUID123_HOSTGUID
application.reporting_agent.endpoint.log.level	INFO
application.reporting_agent.endpoint.product.version	2.7
host.broker_guid	BROKERGUID
host.domain	HOSTNAME.DOMAIN.COM
host.resident_agent.name	HOSTNAME123
host.hostname	HOSTNAME
host.id	HOSTGUID
host.ip	HOSTIP
reporting_agent.server	HOSTNAME.DOMAIN.COM
reporting_agentguid	GUID123
event.resident_agent.version	10.4
host.resident_agent_server.guid	GUID321
host.resident_agent_server.name	"HOSTNAMESVR321"
reporting_agent.server.name	"HOSTNAMESVR123.DOMAIN.COM"

Hi @hadadata59! Just catching up here and want to verify I understand correctly.

From your example, you have agent 123 and agent 321 and both run on the same host that they are monitoring?

@kgeller yes. two agents, one host.

Awesome.

So in this scenario, we would say we are receiving logs from both agent 123 and agent 321 about host 1? If so, could we not just populate the host.name field in both of those sets of logs from the agent?

I don't quite follow how, in this scenario, we'd need additional hostname fields.

What are thoughts on leveraging the existing agent.name to capture the hostname?

For example, Beats does this as the default, unless overridden in the configuration: elastic/beats#18000
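To make the suggestion concrete, here is a minimal Python sketch of the defaulting behaviour described above: agent.name falls back to the machine's hostname unless a name is configured. This is not Beats code; the function name `resolve_agent_name` and its parameter are hypothetical and only illustrate the idea.

```python
import socket
from typing import Optional


def resolve_agent_name(configured_name: Optional[str] = None) -> str:
    """Return the value to store in agent.name.

    Hypothetical helper: uses the configured name when one is given,
    otherwise falls back to the local hostname.
    """
    return configured_name or socket.gethostname()


# An explicit override wins; with no override the local hostname is used.
event = {"agent.name": resolve_agent_name("edge-collector")}
```

With this approach the hostname of the reporting agent is available in agent.name without a new agent.hostname field, provided the name is not overridden.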

This is a single log record (not sets of logs) where the box (host.hostname), the non-reporting 'resident' agent (?) and the reporting agent (agent.hostname) all provide unique hostnames in the record.

ebeahan · 2021-07-20T17:20:00Z

rfcs/text/0000/destination.yml

@@ -0,0 +1,7 @@
+- name: destination
+  fields:
+    - name: hostname


With the source/destination/client/server field sets, the address value should populate the .address field and be duplicated to the appropriate field based on the value: .ip for IP addresses, .domain for FQDNs or hostnames.

I believe the .domain field serves the same function you're proposing here. Or do you have different motivations for proposing this addition?
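As an illustration of the convention described above, the following Python sketch copies an address into `.address` and duplicates it into `.ip` or `.domain` depending on whether it parses as an IP address. The function name `populate_address_fields` is hypothetical, and the flat dotted keys are only one possible way to represent the fields.

```python
import ipaddress


def populate_address_fields(field_set: str, address: str) -> dict:
    """Fill <field_set>.address and duplicate the value to .ip or .domain.

    Hypothetical helper; field_set is e.g. "source" or "destination".
    """
    fields = {f"{field_set}.address": address}
    try:
        # ip_address() raises ValueError for anything that is not IPv4/IPv6.
        ipaddress.ip_address(address)
    except ValueError:
        # Not an IP address: treat it as an FQDN or hostname.
        fields[f"{field_set}.domain"] = address
    else:
        fields[f"{field_set}.ip"] = address
    return fields
```

Under this convention a bare hostname and an FQDN both land in `.domain`, which is the overlap the proposed `.hostname` field would need to be weighed against.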

Using the example of somehost.example.com as a fully qualified domain name:
where somehost is the hostname
and example.com is the domain name
See https://docs.microsoft.com/en-us/windows/win32/dns/naming-conventions
There is general confusion because the FQDN can populate both .domain and host.name.
The point here is to isolate the hostname (i.e., somehost) as well as the domain (.domain), for a more accurate and reliable representation of the data and for ease of user search.
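The split being asked for can be sketched in a few lines of Python. This is only an illustration of the somehost.example.com example; the helper name `split_fqdn` is hypothetical, and it assumes the input is a DNS name rather than an IP address.

```python
from typing import Tuple


def split_fqdn(fqdn: str) -> Tuple[str, str]:
    """Split an FQDN into (hostname, domain) at the first dot.

    Hypothetical helper; a name without dots yields an empty domain.
    """
    # Drop a trailing root dot, then split on the first label boundary.
    hostname, _, domain = fqdn.rstrip(".").partition(".")
    return hostname, domain


hostname, domain = split_fqdn("somehost.example.com")
```

Here `hostname` would feed the proposed `.hostname` field and `domain` the domain portion, so a search for somehost would not depend on knowing the full FQDN.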

I think this addition can make sense here.

Very early drafts of ECS did include source.hostname and destination.hostname fields, but the project later removed the fields. The discussion was that having both source.hostname and source.domain caused confusion, and arguably using hostname vs. host in a network-centric context was incorrect.

Here are some of the past conversations, if anyone's curious: #175 #84

Sometimes revisiting past decisions is valuable, though, of course! However, there would be a good bit of work to reassign the [source|destination|client|server].domain field's intent; this would be a significant breaking change for ECS.

ebeahan · 2021-07-20T17:28:27Z

rfcs/text/0000/host.yml

@@ -0,0 +1,25 @@
+- name: host
+  fields:
+    - name: model


Capturing these types of inventory attributes has come up in past ECS discussions. One pitfall to avoid would be limiting them to certain field sets that wouldn't allow them to describe a broader range of assets someone might have in their inventory or CMDB.

Examples could be power supplies, generators, or server racks. These items would still have model, manufacturer, or serial_number attributes to capture but wouldn't necessarily still be considered hosts in the ECS sense of a host.

In past brainstorming, the idea of creating an inventory.* or asset.* field set has been suggested, but I think that idea would best be discussed as its own RFC.

I am not sure I understand the distinction you're trying to make here. We have other lower-level objects that are dissimilar (.name). I'm not sure why using .model for host would preclude using .model, with a different description, for another object. The inventory.* or asset.* concept is still talking about an entity (a host), so it would be nice for context, but it would also make my search problematic (e.g., if I needed to see a SW inventory on a host). In that case, the fact that the scan came from Tenable or my EDR host module would be the indicator that it was a point-in-time inventory rather than an event with an associated host.

I think what @ebeahan was trying to say was that in ECS, we describe host broadly as a 'general computing instance', meaning it can be anything from hardware to a virtual machine to a Docker container, etc. The intention with an inventory.* or asset.* field set would be specifically a physical item we want to keep track of.

Understood. 1.12 has container fields, so I think we don't need to worry about that issue. As for a VM, host.model would be hard to populate, as it is not a current VMware field that I am aware of. But serial_number and vendor are capturable and exist.
I think it's fine to move toward a new object level. Knowing that the host is a Dell ModelX Serial#Y, in the context of the OS the host is running, etc., is valuable, and you might lose some of that context in the record. Otherwise, if the inventory or asset tag is the way forward, let's choose a path.

I personally prefer asset, but I think the RFC process could certainly guide us towards a name. Is this something you are interested in leading?

As I revisited and am rethinking my suggestion, I'm starting to think adding these fields under host.* as proposed may be the better option.

Like @hadadata59 mentioned, we already store asset details about a host, like architecture, OS details, and geolocation data, underneath host.* fields. We also explicitly list hardware as a host type without specifying that a piece of hardware must be compute hardware.

As mentioned in #1512 (comment), I like the symmetry of using product over model to match the existing observer.product field naming.

rfcs/text/0000-host-and-hostname-fields.md

jamiehynds · 2021-10-05T12:03:42Z

@melissaburpo this RFC may be relevant for mapping osquery host data. Is there someone on your team that could provide feedback as to whether the fields could be leveraged by osquery?

melissaburpo · 2021-10-05T18:27:54Z

Hi @jamiehynds - I'll take a look, but @aleksmaus or @james-elastic from our team may have some input as well. Thanks for the ping!

melissaburpo · 2021-10-05T19:27:41Z

I think these three proposed fields in particular map well to values that are retrievable via osquery from a query to the system_info table:

Proposed ECS field	Osquery possible mapping	Example value
host.model	system_info.hardware_model	MacBookPro16,2
host.manufacturer	system_info.hardware_vendor	Apple Inc.
host.serial_number	system_info.hardware_serial	C02F3...
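The mapping in the table above could be expressed roughly as follows. This is a sketch only: it uses the proposed field names from the table, the column names of the osquery system_info table as listed there, and a hypothetical helper `map_system_info` that takes one result row as a dict.

```python
# osquery system_info column -> proposed ECS field (names from the table above).
OSQUERY_TO_ECS = {
    "hardware_model": "host.model",
    "hardware_vendor": "host.manufacturer",
    "hardware_serial": "host.serial_number",
}


def map_system_info(row: dict) -> dict:
    """Map one system_info result row to the proposed host.* fields.

    Hypothetical helper; columns that are missing or empty are skipped.
    """
    return {
        ecs_field: row[column]
        for column, ecs_field in OSQUERY_TO_ECS.items()
        if row.get(column)
    }


# Example row; the serial value is a placeholder, not real data.
row = {
    "hardware_model": "MacBookPro16,2",
    "hardware_vendor": "Apple Inc.",
    "hardware_serial": "SERIAL-PLACEHOLDER",
    "hostname": "ignored-by-this-mapping",
}
host_fields = map_system_info(row)
```

If the field names change during the RFC, only the dictionary needs updating.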

Regarding hostname: we currently populate host.hostname for osquery results, so I'm not sure if the proposed hostname fields -- agent.hostname, destination.hostname, source.hostname -- would be specifically needed for mapping osquery host data, but it is possible these could be used for mapping values from other host fields; it depends on the use case. There are around 12 osquery tables that include hostname/host fields (per a quick search through the schema).

jamiehynds · 2021-10-20T13:13:04Z

host.os.codename and host.containerized were suggested in this issue. Worth adding to this RFC?

ebeahan · 2021-10-27T14:08:58Z

rfcs/text/0000/host.yml

+      description: >
+        The model associated with the host.
+
+    - name: manufacturer


Thoughts about using vendor over manufacturer?

vendor provides symmetry with observer.vendor. Also, if someone had a specific use case to capture a host's ODM along with the vendor, I could see possible confusion over which one to place in the manufacturer field.

We have been using vendor for software and manufacturer for hardware. However, I do not see a reason to object.

ebeahan · 2021-10-27T14:13:44Z

@melissaburpo @jamiehynds If possible, I'd propose we keep symmetry with the existing observer.* field names:

Existing `observer.*` fields	Potential new `host.*` fields
`observer.product`	`host.product`
`observer.vendor`	`host.vendor`
`observer.serial_number`	`host.serial_number`

ebeahan · 2021-10-29T21:17:40Z

I've added comments in-line to the open conversation threads, but I wanted to summarize the discussion in a single place.

Additional `host.*` fields

Regardless of specific field naming and placement, there seems to be agreement around adding fields to capture a host's serial, manufacturer/vendor, and model/product.

We can continue to discuss naming and placement in the PR for stage one.

Adding `.hostname` fields

I've shared past conversations and my own hesitation about adding (or re-adding) .hostname fields across other applicable fieldsets (source, destination, client, server, agent).

However, we can note this under the Concerns section and make sure we revisit in stage 1.

Next steps

These are the steps I've identified before merging:

Verify there's agreement around adding fields to capture a host's serial, manufacturer/vendor, and model/product.
If the .hostname additions remain in the proposal, note past discussion in the Concerns section of the document.

djptek · 2021-11-16T08:43:16Z

Perhaps we also need to update the docs with an elaboration; see the mention of host.hostname:

https://github.com/elastic/ecs/blob/main/docs/using-guidelines.asciidoc#guidelines-for-field-names

ebeahan · 2021-12-06T18:33:48Z

Hi @hadadata59, did you see the summary of the next steps in #1512 (comment) to move this proposal forward?

github-actions · 2022-02-24T00:18:36Z

This PR is stale because it has been open for 60 days with no activity.

Stage 0 initial RFC

6127321

ebeahan added the RFC label Jul 13, 2021

ebeahan reviewed Jul 20, 2021


hadadata59 added 3 commits July 26, 2021 15:39

Merge branch 'master' into host-and-hostname-fields

b97705c

Update 0000-host-and-hostname-fields.md

9cb92f4

Merge branch 'elastic:master' into host-and-hostname-fields

60f3117

ebeahan reviewed Oct 27, 2021


Update rfcs/text/0000-host-and-hostname-fields.md

d6ec5b2

djptek mentioned this pull request Nov 16, 2021

Does vendor device info get mapped to the agent.* fields? #227

Closed

github-actions bot added the stale Stale issues and pull requests label Feb 24, 2022

kgeller mentioned this pull request Aug 15, 2022

Host Fields #2025

Closed

kfirpeled mentioned this pull request Jan 26, 2023

[Cloud Security] Fix ECS import method elastic/integrations#5106

Merged




