Refactor audit trail code #516
base: main
Great stuff, thank you for doing the thinking to simplify this part of the codebase 👍
I've added a few comments but nothing blocking, up to you if you're interested to tweak things a little more.
val accessTypeAttrName = "j_accessType"
val isExternalAttrName = "j_external"

private type DbAttr = (String, AttributeValue)
Would it simplify / clarify things to make this a case class instead of a type alias? Alternatively, is this bringing much value?
In the latter case, this type alias isn't used anywhere except right here (the next few lines), so maybe inlining the real type would simplify the AuditLogDbEntryAttrs definition.
Considering the approach to instead use a case class, I notice that the tests for this code all look something like this: attrs.partitionKey._2.s(). Making this a case class with named fields would let us do something more descriptive than _2, something which I had to dig into while reviewing that part of this change.
I think either approach to changing this would be fine. If you want to keep it as-is that's ok too, this isn't a big deal. It jumped out to me because I generally try to keep quite a high bar for things to qualify for type aliases. Type aliases bring the abstraction cost of having a new concept for a thing without the extra safety and ergonomics of a real typed wrapper with named fields.
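A minimal sketch of the case-class alternative, for comparison with the tuple alias. FakeAttributeValue is a stand-in for the SDK's AttributeValue so the snippet runs without dependencies. The field names name and value, and the reduced shape of AuditLogDbEntryAttrs, are assumptions for illustration rather than anything taken from the PR.

```scala
object DbAttrSketch {
  // Stand-in for the AWS SDK's AttributeValue (assumption: keeps the sketch dependency-free).
  final case class FakeAttributeValue(str: String) {
    def s(): String = str
  }

  // Hypothetical named wrapper replacing `type DbAttr = (String, AttributeValue)`.
  final case class DbAttr(name: String, value: FakeAttributeValue)

  // Hypothetical, reduced version of the attrs container (only one field shown).
  final case class AuditLogDbEntryAttrs(partitionKey: DbAttr)

  val attrs: AuditLogDbEntryAttrs = AuditLogDbEntryAttrs(
    partitionKey = DbAttr("j_account", FakeAttributeValue("example-account"))
  )

  // A test can now read `.value.s()` instead of `._2.s()`.
  val accountName: String = attrs.partitionKey.name
  val accountValue: String = attrs.partitionKey.value.s()
}
```

With named fields, a test assertion reads as attrs.partitionKey.value.s(), which says what is being checked without the reader having to recall the tuple layout.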
@@ -108,11 +59,21 @@ object AuditTrailDB {
queryResult(dynamoDB, request)
}

private def attrCondition(
I'm not 100% convinced this helper is worthwhile, but if it is, perhaps we could call it something like attrEqualCondition, to better convey that this is hard-coded to do an EQ check?
There are only two uses of this helper, and its definition isn't loads shorter than its invocation. Would it be simpler to write the builder out each time, especially since this would be consistent with the "BETWEEN" condition used just below for date range?
If IntelliJ did that underlined "Duplicate code fragment" thing then no worries, that's a clear sign wiser minds than I say this is definitely worthwhile 😁
@@ -239,7 +238,7 @@ class Janus(
janusData.access
)
} yield {
- AuditTrailDB.insert(table, auditLog)
+ AuditTrailDB.insert(auditLog)
I realise this isn't part of your change, but while we're here anyway and tidying up the DB usage, can we pull this up into the for-comprehension steps? I don't like that we have an important side-effecting step in the yield block here (my apologies for the current state of things).
Something like _ = AuditTrailDB.insert(auditLog) as the last step would highlight that this is an important operation, separate from building our return value.
I'd do the same for the log line, but I don't feel as strongly we should do that now.
Equally, I'm very happy to tackle this along with a wider review of the codebase in a separate PR - I only mention it here because our diff has hit it.
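To illustrate the shape being suggested, here is a small self-contained sketch that uses Option instead of the real types in Janus. The names findAccess, insertAuditLog and grant are hypothetical, and the in-memory buffer stands in for the audit table.

```scala
import scala.collection.mutable.ListBuffer

object ForComprehensionSketch {
  // Stand-in for the audit table: records each inserted entry (assumption, not the real DB call).
  val inserted: ListBuffer[String] = ListBuffer.empty

  def insertAuditLog(entry: String): Unit = {
    inserted += entry
    ()
  }

  // Hypothetical lookup that can fail.
  def findAccess(user: String): Option[String] =
    if (user == "known-user") Some("read-access") else None

  def grant(user: String): Option[String] =
    for {
      access <- findAccess(user)
      auditLog = s"$user:$access"
      // The side effect is an explicit step, reached only if the earlier steps succeeded.
      _ = insertAuditLog(auditLog)
    } yield access // the yield block now only builds the return value
}
```

The behaviour is the same as performing the insert inside yield, since both run only when every earlier step has produced a value. The difference is that the write now reads as a step in its own right.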
dynamoDB.putItem(request)
}

private def toAttribValue(value: Any): AttributeValue = {
Delighted to see this go away, great change 👍
val secondaryIndexName = "AuditTrailByUser"

val partitionKeyName = "j_account"
val sortKeyName = "j_timestamp"
There are a few places in the codebase after this change where we lose a bit of the connection to what's really going on when we use this name.
From a DynamoDB implementation point of view, the DB's sort key is what we're using to implement date sorting/filtering. It makes sense to call this sortKeyName to highlight this important connection to the implementation detail. An example of this is https://github.com/guardian/janus-app/pull/516/files#diff-3fcccabe5ec8afa651c9696d905657fabb5b0ba56779b459c6150def64b600fbR76, which makes great sense as-is.
However, it's also the audit log timestamp, and there are other use cases where it would make sense for this name to be connected to our business domain. An example is the date range condition's implementation above. In this case it's probably more helpful to know that we're using the timestamp here.
I don't have an answer for this, I think we're being pulled in both directions! One observation is that we refer to both domains in this block, which I think helps with comprehension. The code right here tells us that this is the DB sort field and that it relates to the access time.
Maybe there's a name for this member that could convey that this is the sort key and that it is the timestamp?
val tableName = "AuditTrail"
val secondaryIndexName = "AuditTrailByUser"

val partitionKeyName = "j_account"
This consolidation might also give us the opportunity to add a comment explaining the j_ prefix, which is to avoid clashing with DynamoDB reserved keywords. Up to you if you think this is relevant / useful?
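As a rough illustration of where such a comment could live, here is a sketch using the attribute names visible in this diff. The wrapping object name AuditTrailAttrNames is hypothetical, and the comment wording is only a suggestion.

```scala
object AuditTrailAttrNames {
  // Attribute names carry a "j_" prefix to avoid clashing with
  // DynamoDB reserved keywords.
  val partitionKeyName = "j_account"
  val sortKeyName = "j_timestamp"
  val accessTypeAttrName = "j_accessType"
  val isExternalAttrName = "j_external"

  // Collected only so the convention can be checked in one place.
  val all: List[String] =
    List(partitionKeyName, sortKeyName, accessTypeAttrName, isExternalAttrName)
}
```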
This change optimises and simplifies some of the code that manages interactions with the audit trail DB table.
Changes
The name of the table and its key schema are already hardcoded throughout the code so it seems redundant to occasionally look these details up as well.
As mentioned above, the table name is already hardcoded so there is no need to pass it around as a parameter.
To make it clear where the same field is being used and what it is, and to avoid typos etc.
To test
These tests have been done locally and need to be repeated in production: