
Protective Monitoring: Replace Firehose logs with s3 #7607

Closed · 7 tasks done
SimonPPledger opened this issue Jul 30, 2024 · 11 comments


SimonPPledger commented Jul 30, 2024

User Story

We currently provide several sets of logs to the XSIAM protective monitoring tool. However, the XSIAM team are having issues with the data arriving via Firehose from the following sources:

  • VPC Flow Logs
  • Network Firewall logs
  • Route 53 Resolver logs

This ticket is to deliver the same log information into new S3 buckets, and then to liaise with the security development team to confirm that they can ingest those logs correctly.

Value / Purpose

This will improve the security of the Modernisation Platform and enable the SOC to alert us to any issues.

Useful Contacts

leonardo.marini@justice.gov.uk

Additional Information

This was originally done using Firehose; the original ticket is #6163.

Definition of Done

  • Discuss required log files with Mike
  • New module created, including new S3 bucket(s)
  • Implement the module to replace Firehose
  • Security confirms the info is ingested by the monitoring tool
  • Old Firehose module removed
  • User docs updated
  • source/runbooks/integration-with-protective-monitoring.html.md.erb updated as required
@SimonPPledger SimonPPledger changed the title Protective Monitoring: VPC logs to S3 buckets Protective Monitoring: Replcac Firehose logs with s3 Jul 31, 2024
@SimonPPledger SimonPPledger changed the title Protective Monitoring: Replcac Firehose logs with s3 Protective Monitoring: Replace Firehose logs with s3 Jul 31, 2024
@dms1981 dms1981 self-assigned this Aug 13, 2024
@dms1981 dms1981 moved this from To Do to In Progress in Modernisation Platform Aug 13, 2024

dms1981 commented Aug 13, 2024

From discussion with Leonardo, I think we actually have almost everything in place. We already collate the required logs in S3 (VPC Flow Logs, Route 53 Resolver logs, Firewall logs), so all that ought to be required is the SQS setup, S3 notifications to SQS, and an appropriate user and policy for XSIAM.
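For reference, the S3-to-SQS wiring for a pull-based consumer like XSIAM generally looks something like this. A minimal Terraform sketch, assuming illustrative bucket and queue names rather than our real identifiers:

```hcl
# Illustrative names throughout; the real identifiers live in the
# modernisation-platform repository.
resource "aws_s3_bucket" "logs" {
  bucket = "core-logging-example-logs" # assumed name
}

resource "aws_sqs_queue" "xsiam_ingest" {
  name = "xsiam-log-ingest" # assumed name
}

# Allow S3 to publish object-created events to the queue.
resource "aws_sqs_queue_policy" "xsiam_ingest" {
  queue_url = aws_sqs_queue.xsiam_ingest.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "s3.amazonaws.com" }
      Action    = "sqs:SendMessage"
      Resource  = aws_sqs_queue.xsiam_ingest.arn
      Condition = { ArnEquals = { "aws:SourceArn" = aws_s3_bucket.logs.arn } }
    }]
  })
}

# Notify the queue whenever a new log object lands in the bucket;
# XSIAM then polls the queue and fetches the objects it names.
resource "aws_s3_bucket_notification" "logs" {
  bucket = aws_s3_bucket.logs.id
  queue {
    queue_arn = aws_sqs_queue.xsiam_ingest.arn
    events    = ["s3:ObjectCreated:*"]
  }
}
```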

@mikereiddigital

Hi @dms1981. When I looked into this I could not see where the VPC Flow Logs output to S3 is set up, so my impression was that we would need to add additional buckets and new flow logs to output to them. I was also considering the merit of keeping all of these buckets in core-logging (the same cross-account configuration that CloudTrail uses in its module) and having them accessible to the single IAM user account already set up.
How this would translate into actionable Terraform:

  • A local module in core-logging for the S3 buckets, KMS, IAM resources etc., with these resources created first, assuming one bucket per CloudWatch log group
  • A new remote module for the subscription filters, SQS queue etc. in the accounts where the CloudWatch log resources are created (core-vpc, core-logging, core-shared-services etc.)

There may be questions around the one IAM account being aligned with security guidance. A rough sketch of the core-logging side is below.
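A minimal sketch of what that local core-logging module might contain, assuming the standard AWS log-delivery service principal and placeholder bucket names (not the module as actually written):

```hcl
# One bucket in core-logging, plus a policy letting the AWS log-delivery
# service write cross-account. Names are placeholders.
resource "aws_s3_bucket" "flow_logs" {
  bucket = "core-logging-vpc-flow-logs" # illustrative
}

resource "aws_s3_bucket_policy" "flow_logs" {
  bucket = aws_s3_bucket.flow_logs.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "AWSLogDeliveryWrite"
        Effect    = "Allow"
        Principal = { Service = "delivery.logs.amazonaws.com" }
        Action    = "s3:PutObject"
        Resource  = "${aws_s3_bucket.flow_logs.arn}/*"
        Condition = {
          StringEquals = { "s3:x-amz-acl" = "bucket-owner-full-control" }
        }
      },
      {
        Sid       = "AWSLogDeliveryAclCheck"
        Effect    = "Allow"
        Principal = { Service = "delivery.logs.amazonaws.com" }
        Action    = "s3:GetBucketAcl"
        Resource  = aws_s3_bucket.flow_logs.arn
      }
    ]
  })
}
```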

Anyhow let me know what you think. Talk to you tomorrow.


dms1981 commented Aug 14, 2024

I've done some reading and agree with the approach of exporting the logs into our core-logging account. I think we'll want a new bucket (with an attendant bucket policy and SQS queue), and some expansion of the existing user's permissions to access that bucket.
From there, using AWS Data Firehose to stream logs into the new bucket feels like the best approach. It's referenced in the AWS docs in these places:

I did originally think that we were putting our logs into S3, but on checking that's not the case: the logs in question are sent to CloudWatch. Reconfiguring our logging to send to S3 instead of CloudWatch doesn't feel like the right approach, but streaming the logs feels architecturally correct (and is supported by the documentation I reference above).
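As a rough illustration of that streaming shape (a sketch only; the role ARNs, names, and log group are assumed inputs, with trust policies as described in the AWS docs):

```hcl
# Assumed inputs: IAM roles with the documented trust policies
# (CloudWatch Logs assumes cwl_to_firehose_role_arn; Firehose assumes
# firehose_role_arn to write into the destination bucket).
variable "firehose_role_arn" { type = string }
variable "cwl_to_firehose_role_arn" { type = string }
variable "destination_bucket_arn" { type = string }
variable "log_group_name" { type = string }

# Firehose delivery stream writing into the core-logging bucket.
resource "aws_kinesis_firehose_delivery_stream" "to_s3" {
  name        = "logs-to-core-logging" # illustrative
  destination = "extended_s3"

  extended_s3_configuration {
    role_arn   = var.firehose_role_arn
    bucket_arn = var.destination_bucket_arn
  }
}

# Subscription filter streaming every event in the log group to Firehose.
resource "aws_cloudwatch_log_subscription_filter" "to_firehose" {
  name            = "stream-to-s3"
  log_group_name  = var.log_group_name
  filter_pattern  = "" # empty pattern forwards all events
  destination_arn = aws_kinesis_firehose_delivery_stream.to_s3.arn
  role_arn        = var.cwl_to_firehose_role_arn
}
```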


dms1981 commented Aug 16, 2024

I propose to take the following approach to solving this story:

  1. Create new resources (bucket, queue) in core-logging, plus a Terraform module to stream logs
  2. Test the module and delivery of logs from core-vpc-sandbox
  3. Use the new module to stream logs across to core-logging as required by the SOC team (a hypothetical call site is sketched below)
  4. Clean up any old resources that are no longer used
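For step 3, consumption from a member account might look roughly like this; the module source path and input names are hypothetical, pending the module actually being written:

```hcl
# Hypothetical call site in core-vpc-sandbox; the source path and
# input names are placeholders, not the published module interface.
module "stream_logs_to_core_logging" {
  source = "../../modules/log-streaming" # assumed path

  log_group_name         = "core-vpc-sandbox-flow-logs"              # assumed
  destination_bucket_arn = "arn:aws:s3:::core-logging-vpc-flow-logs" # assumed
}
```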


dms1981 commented Aug 29, 2024

This has been implemented in line with existing examples. There are some questions around behaviours seen with the SQS queues and the Cortex application, but those will be handled separately from this issue.

@dms1981 dms1981 closed this as completed Aug 29, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in Modernisation Platform Aug 29, 2024
@dms1981 dms1981 moved this from Done to For Review in Modernisation Platform Aug 29, 2024

dms1981 commented Oct 8, 2024

Logs are successfully being ingested directly from S3 for Route 53 Resolver Query Logs and VPC Flow Logs. Network Firewall Logs are being delivered directly from CloudWatch via AWS Data Firehose.


dms1981 commented Oct 10, 2024

To review this, you can look at the relevant IAM role used by Cortex in our core-logging account. You should be able to see that it has been used recently to retrieve logs.

You can see logs being delivered for collection in the relevant buckets: core-logging-r53-resolver-logs, core-logging-vpc-flow-logs, modernisation-platform-logs-cloudtrail.

You can check the SQS queues to ensure there hasn't been a build-up of messages.

You can also check the firehose-errors bucket in core-network-services to confirm that the AWS Data Firehose stream that transmits firewall logs is not encountering errors.

Finally, the documentation is in our user-guide here.

@dms1981 dms1981 moved this from In Progress to For Review in Modernisation Platform Oct 10, 2024

richgreen-moj commented Oct 11, 2024

I guess this needs removing based on the IAM role being used now?

Also, I like the points you've noted above about potential issues that could arise (SQS build-up / firehose-errors). Do we want to raise some tickets for monitoring that sort of thing in future (proactive vs reactive)?
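On the proactive side, one option would be a CloudWatch alarm on queue depth, something like this sketch (queue name and threshold are illustrative):

```hcl
# Alarm when visible messages sit above a threshold for 15 minutes,
# suggesting XSIAM has stopped consuming from the queue.
resource "aws_cloudwatch_metric_alarm" "sqs_backlog" {
  alarm_name          = "xsiam-queue-backlog" # assumed name
  namespace           = "AWS/SQS"
  metric_name         = "ApproximateNumberOfMessagesVisible"
  dimensions          = { QueueName = "xsiam-log-ingest" } # assumed queue
  statistic           = "Maximum"
  period              = 300
  evaluation_periods  = 3
  threshold           = 1000
  comparison_operator = "GreaterThanThreshold"
  alarm_description   = "Messages are accumulating; XSIAM may not be consuming."
}
```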


dms1981 commented Oct 11, 2024

Updated the docs and also raised a ticket around monitoring to inform the SOC PM team that they may need to take corrective action.

@ewastempel

I went through the list in this post when reviewing the ticket. I have also verified that the documentation and ticket-raising in this post were addressed.

This ticket is now done.

@github-project-automation github-project-automation bot moved this from For Review to Done in Modernisation Platform Oct 11, 2024