This is kind of a grab bag of stuff we're finding out using ECS in our mission. This issue is more about providing feedback than asking for anything. And starting conversations! If you have better ideas, please tell me! 😄
We have lots of kinds of feeds:
source and destination top levels are working out so far. These are mostly about network as we don't have a lot of application logs yet. And most of the application logs are going into a service container.
We're still doing source.geoip and source.geoip.asn in some places but have switched to source.geo and source.asn (same for destination), which is working well.
We are leaving the *.ip field in both geo and asn so that if code wants to work on them, they can just take that part of the json and run with it. I don't want to make a developer stitch together information from different parts of the structure. To be clear, we have all of these:
[ client.ip, client.geo.ip, client.asn.ip ]
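To make the duplication concrete, here is a minimal sketch of what such an object could look like. The helper name `enrich_endpoint`, the sample IP, and the keys inside `geo` and `asn` are illustrative assumptions, not our actual enrichment code.

```python
def enrich_endpoint(ip, geo, asn):
    """Build an endpoint object (e.g. client) whose geo and asn parts each carry the IP."""
    return {
        "ip": ip,
        "geo": {"ip": ip, **geo},   # copy of the IP travels with the geo data
        "asn": {"ip": ip, **asn},   # and with the ASN data
    }


# Hypothetical values, for illustration only.
client = enrich_endpoint(
    "203.0.113.7",
    geo={"country_iso_code": "US"},
    asn={"number": 64500, "organization": "Example Net"},
)

# A consumer can lift out just one part and still know which IP it describes.
geo_only = client["geo"]
```

The cost is a few redundant bytes per event; the benefit is that `geo_only` is usable without reaching back into the parent object.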
network has grown:
network.interface holds the eth0 type name.
network.status has values like OK
Making field names more self-documenting:
Some field names are synonyms to the uninitiated, or even to a well trained analyst at 3am!
Example: [ agent, device, source, host ]
We had a case where an analyst could not figure out which IP address in an event was the one where the event actually happened. There was nothing about device.ip that looked wrong. When you have hundreds of subscriber networks, the IP address itself isn't a great clue.
Obviously, some of this is down to training, but in the heat of battle (we have a cyber mission) I think we should try to make it obvious. (Don't push the red button, push the cherry button! It's right there next to the rose button!)
I'm thinking of moving device to event.received_by or something like that. We could make a list of objects that relayed the event, but I know Kibana doesn't like those very much.
No, seriously, close your eyes, spin around three times and read both of these and try to pick out which is agent and which is device:
The {field_name_here} fields contain the data about the {field_name_here}/client/shipper that created the event.
{field_name_here} fields are used to provide additional information about the {field_name_here} that is the source of the information. This could be a firewall, network device, etc.
Putting things in other things: I've got this kind of paradigm developing where if you have data from a foo service, I put service.type: foo and then I create a service.foo: {...} which is a container for whatever was reported in that foo structure, which could be the entire original event. That seems to be doing a good job of isolating key names from each other, especially when I have no control over what foo says or might say going forward.
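A rough sketch of that wrapping step, assuming the payload has already been parsed into a dict. The function name `wrap_service_event` and the sample payload keys are made up for illustration.

```python
def wrap_service_event(service_type, payload):
    """Nest a service's own payload under service.<type> so its keys cannot collide."""
    if service_type == "type":
        # service.type is already taken by the label itself.
        raise ValueError("service type 'type' would overwrite service.type")
    return {"service": {"type": service_type, service_type: payload}}


# Hypothetical payload: 'host' and 'status' are names that could easily
# clash with top-level fields if they were merged into the event root.
foo_payload = {"host": "db01", "status": "degraded"}
event = wrap_service_event("foo", foo_payload)
```

Whatever foo decides to report later lands under `service.foo` and stays out of the way of everything else in the event.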
Timestamps: This is working out well:
If message has a timestamp-looking substring in it, put that in event.timestamp but do not reformat it. This will end up being a string type like "19/Nov/2018:10:35:27 -0500". Leave it like that.
Use Logstash's date filter to turn event.timestamp into @timestamp, so that's a date time type now.
Use an ingest pipeline in Elasticsearch to add an event.indextime as a date time type, like so:
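A minimal sketch of such a pipeline, written as a Python dict that serializes to the JSON body you would send to `PUT _ingest/pipeline/<name>`. The description text is an assumption; the `set` processor and the `{{_ingest.timestamp}}` template are standard ingest features. Whether the field ends up as a date type depends on your index mapping or date detection settings.

```python
import json

# Pipeline body: one 'set' processor that stamps the ingest time onto the event.
index_time_pipeline = {
    "description": "Stamp each document with the time it was ingested",  # assumed wording
    "processors": [
        {
            "set": {
                "field": "event.indextime",
                "value": "{{_ingest.timestamp}}",
            }
        }
    ],
}

# Print the JSON request body.
print(json.dumps(index_time_pipeline, indent=2))
```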
We also use an event.starttime and event.endtime along with event.duration. In cases like network flows, the event.timestamp is some time after the event.endtime. Take your pick as to which you turn into @timestamp I guess, depending on mission.
We use timelion to plot counts for @timestamp and event.indextime in the same chart as a feed health indicator. If the counts generally match, you're keeping up.
The difference between the two is how long it takes data to get into your system.
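To illustrate that difference, here is a small sketch that parses the raw event.timestamp string from the example above and subtracts it from an index time. The function name `ingest_lag_seconds` and the sample index time are assumptions; in practice the parsing happens in Logstash and the index time comes from Elasticsearch.

```python
from datetime import datetime, timezone

# Format of the raw string kept in event.timestamp, e.g. "19/Nov/2018:10:35:27 -0500".
# Note: %b is locale-dependent; this assumes English month abbreviations.
RAW_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def ingest_lag_seconds(event_timestamp, indextime):
    """Seconds between when the event happened and when it was indexed."""
    occurred = datetime.strptime(event_timestamp, RAW_FORMAT)
    return (indextime - occurred).total_seconds()


# Hypothetical index time, 30 seconds after the event, expressed in UTC.
indexed_at = datetime(2018, 11, 19, 15, 35, 57, tzinfo=timezone.utc)
lag = ingest_lag_seconds("19/Nov/2018:10:35:27 -0500", indexed_at)
```

Because both values are timezone-aware, the subtraction is correct even though one is in -0500 and the other in UTC.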
This is amazing, thanks for putting this together :-)
We'll be doing another big push to add some missing things and clarify others ;-) We'll be taking a good look at your feedback. This is really helpful.