[ALSP] Synchronization Engine BatchRequest spam detection (Permissionless-related engine level spam detection) #4704

Merged Oct 12, 2023 (33 commits)
Changes from 25 commits
5f82eb9
load test WIP
gomisha Sep 12, 2023
16befb0
Merge branch 'master' into misha/6812-alsp-sync-engine-batch-request-…
gomisha Sep 20, 2023
db67457
godocs update
gomisha Sep 20, 2023
43a1c59
load test 1 implemented - 0 block IDs
gomisha Sep 20, 2023
ade2b0b
load test 2 - unknown blocks
gomisha Sep 20, 2023
8ad805a
lint fix
gomisha Sep 20, 2023
7a5b40d
Merge branch 'master' into misha/6812-alsp-sync-engine-batch-request-…
gomisha Sep 21, 2023
9b4f529
core implementation, load test
gomisha Sep 21, 2023
5296f6e
added remaining load tests
gomisha Sep 21, 2023
2a33007
Merge branch 'master' into misha/6812-alsp-sync-engine-batch-request-…
gomisha Oct 3, 2023
71aec44
validateBatchRequestForALSP() godoc update
gomisha Oct 3, 2023
6999004
godoc updates - validateRangeRequestForALSP, validateSyncRequestForALSP,
gomisha Oct 3, 2023
b9932df
Merge branch 'master' into misha/6812-alsp-sync-engine-batch-request-…
gomisha Oct 4, 2023
8728fc1
SpamDetectionConfig godoc update
gomisha Oct 4, 2023
52ad0aa
spamProbabilityMultiplier reduced to 1000
gomisha Oct 4, 2023
5cd7137
removed extra line from assignment, error handling
gomisha Oct 5, 2023
0ee522e
removed dropping message on misbehavior
gomisha Oct 5, 2023
eeec1a1
validateBatchRequestForALSP only returns error, logs load misbehaviors
gomisha Oct 5, 2023
ee502fd
validateBatchRequestForALSP godoc update
gomisha Oct 5, 2023
856e5b8
validateRangeRequestForALSP only returns error, logs load misbehaviors
gomisha Oct 5, 2023
c8d0d4a
godocs update
gomisha Oct 5, 2023
7829354
validateSyncRequestForALSP only returns error, logs misbehaviors
gomisha Oct 5, 2023
fc47924
godocs update
gomisha Oct 5, 2023
eff71d5
sync request test fix
gomisha Oct 5, 2023
7fa258b
throw irrecoverable error from process() for any ALSP validation error
gomisha Oct 5, 2023
a341f96
Merge branch 'master' into misha/6812-alsp-sync-engine-batch-request-…
gomisha Oct 6, 2023
d1bd216
validate*ResponseForALSP only return error
gomisha Oct 6, 2023
362ad7d
Merge branch 'master' into misha/6812-alsp-sync-engine-batch-request-…
gomisha Oct 7, 2023
cdfb39f
Merge branch 'misha/6812-alsp-sync-engine-batch-request-spam' of http…
gomisha Oct 7, 2023
5cc2f1c
log.Debug() for logging.KeyLoad
gomisha Oct 7, 2023
13b5713
delete line break between error return and error handling
gomisha Oct 7, 2023
7d2678f
Merge branch 'master' into misha/6812-alsp-sync-engine-batch-request-…
gomisha Oct 11, 2023
0d1fda2
golangci-lint version update (CI fix)
gomisha Oct 12, 2023
19 changes: 15 additions & 4 deletions engine/common/synchronization/config.go
@@ -37,19 +37,30 @@ func WithScanInterval(interval time.Duration) OptionFunc {
}
}

// spamProbabilityMultiplier is used to convert probability factor to an integer as well as a maximum value - 1
// spamProbabilityMultiplier is used to convert probability factor to an integer as well as a maximum value for the
// random number that can be generated by the random number generator.
const spamProbabilityMultiplier = 1001
const spamProbabilityMultiplier = 1000

// SpamDetectionConfig contains configuration parameters for spam detection for different message types.
// The probability of creating a misbehavior report for a message of a given type is calculated differently for different
// message types.
// MisbehaviourReports are generated for two reasons:
// 1. A malformed message will always produce a MisbehaviourReport, to notify ALSP of *unambiguous* spam.
// 2. A correctly formed message may produce a MisbehaviourReport probabilistically, to notify ALSP of *ambiguous* spam.
// This effectively tracks the load associated with a particular sender, for this engine, and, on average,
// reports message load proportionally as misbehaviour to ALSP.
type SpamDetectionConfig struct {

// syncRequestProb is the probability of creating a misbehavior report for a SyncRequest message.
// batchRequestBaseProb is the base probability in [0,1] that's used in creating the final probability of creating a
// misbehavior report for a BatchRequest message. This is why the word "base" is used in the name of this field,
// since it's not the final probability and there are other factors that determine the final probability.
// The reason for this is that we want to increase the probability of creating a misbehavior report for a large batch.
batchRequestBaseProb float32

// syncRequestProb is the probability in [0,1] of creating a misbehavior report for a SyncRequest message.
syncRequestProb float32

// rangeRequestBaseProb is the base probability that's used in creating the final probability of creating a
// rangeRequestBaseProb is the base probability in [0,1] that's used in creating the final probability of creating a
// misbehavior report for a RangeRequest message. This is why the word "base" is used in the name of this field,
// since it's not the final probability and there are other factors that determine the final probability.
// The reason for this is that we want to increase the probability of creating a misbehavior report for a large range.
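The sampling that `spamProbabilityMultiplier` supports can be sketched in isolation. This is a minimal sketch, not the engine code: `shouldReport` is a hypothetical helper introduced for illustration, and the standard library's `math/rand` stands in for the project's own `rand.Uint32n`, which returns an error as well. A random integer `n` in `[0, spamProbabilityMultiplier)` is drawn, and a report is created when `n` falls below the probability scaled by the multiplier.

```go
package main

import (
	"fmt"
	"math/rand"
)

// spamProbabilityMultiplier mirrors the constant in config.go after this change.
const spamProbabilityMultiplier = 1000

// shouldReport is a hypothetical helper (not part of the PR). It returns true
// when the random draw n, taken from [0, spamProbabilityMultiplier), falls
// below the probability prob scaled to the same integer range.
func shouldReport(n uint32, prob float32) bool {
	return float32(n) < prob*spamProbabilityMultiplier
}

func main() {
	// Assumption: math/rand replaces the project's rand package here.
	r := rand.New(rand.NewSource(1))

	const prob = 0.01 // illustrative probability in [0,1]
	const trials = 100000

	reports := 0
	for i := 0; i < trials; i++ {
		n := uint32(r.Intn(spamProbabilityMultiplier))
		if shouldReport(n, prob) {
			reports++
		}
	}
	// The observed fraction should land near prob.
	fmt.Printf("reported %d of %d messages\n", reports, trials)
}
```

With a multiplier of 1000, probabilities are effectively resolved in steps of 0.001, so a probability below that step would never trigger a report in this sketch.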
189 changes: 117 additions & 72 deletions engine/common/synchronization/engine.go
@@ -208,59 +208,29 @@ func (e *Engine) Process(channel channels.Channel, originID flow.Identifier, eve
func (e *Engine) process(channel channels.Channel, originID flow.Identifier, event interface{}) error {
switch message := event.(type) {
case *messages.BatchRequest:
report, valid, err := e.validateBatchRequestForALSP(channel, originID, message)
err := e.validateBatchRequestForALSP(originID, message)
if err != nil {
return fmt.Errorf("failed to validate batch request from %x: %w", originID[:], err)
}
if !valid {
e.con.ReportMisbehavior(report) // report misbehavior to ALSP
e.log.
Warn().
Hex("origin_id", logging.ID(originID)).
Str(logging.KeySuspicious, "true").
Msgf("received invalid batch request from %x: %v", originID[:], valid)
e.metrics.InboundMessageDropped(metrics.EngineSynchronization, metrics.MessageBatchRequest)
return nil
irrecoverable.Throw(context.TODO(), fmt.Errorf("failed to validate batch request from %x: %w", originID[:], err))
}
return e.requestHandler.Process(channel, originID, event)
case *messages.RangeRequest:
report, valid, err := e.validateRangeRequestForALSP(originID, message)
err := e.validateRangeRequestForALSP(originID, message)
if err != nil {
return fmt.Errorf("failed to validate range request from %x: %w", originID[:], err)
}
if !valid {
e.con.ReportMisbehavior(report) // report misbehavior to ALSP
e.log.
Warn().
Hex("origin_id", logging.ID(originID)).
Str(logging.KeySuspicious, "true").
Msgf("received invalid range request from %x: %v", originID[:], valid)
e.metrics.InboundMessageDropped(metrics.EngineSynchronization, metrics.MessageRangeRequest)
return nil
irrecoverable.Throw(context.TODO(), fmt.Errorf("failed to validate range request from %x: %w", originID[:], err))
}
return e.requestHandler.Process(channel, originID, event)

case *messages.SyncRequest:
report, valid, err := e.validateSyncRequestForALSP(originID)
err := e.validateSyncRequestForALSP(originID)
if err != nil {
return fmt.Errorf("failed to validate sync request from %x: %w", originID[:], err)
}
if !valid {
e.con.ReportMisbehavior(report) // report misbehavior to ALSP
e.log.
Warn().
Hex("origin_id", logging.ID(originID)).
Str(logging.KeySuspicious, "true").
Msgf("received invalid sync request from %x: %v", originID[:], valid)
e.metrics.InboundMessageDropped(metrics.EngineSynchronization, metrics.MessageSyncRequest)
return nil
irrecoverable.Throw(context.TODO(), fmt.Errorf("failed to validate sync request from %x: %w", originID[:], err))
}
return e.requestHandler.Process(channel, originID, event)

case *messages.BlockResponse:
report, valid, err := e.validateBlockResponseForALSP(channel, originID, message)
if err != nil {
return fmt.Errorf("failed to validate block response from %x: %w", originID[:], err)
irrecoverable.Throw(context.TODO(), fmt.Errorf("failed to validate block response from %x: %w", originID[:], err))
}
if !valid {
e.con.ReportMisbehavior(report) // report misbehavior to ALSP
@@ -269,15 +239,13 @@ func (e *Engine) process(channel channels.Channel, originID flow.Identifier, eve
Hex("origin_id", logging.ID(originID)).
Str(logging.KeySuspicious, "true").
Msgf("received invalid block response from %x: %v", originID[:], valid)
e.metrics.InboundMessageDropped(metrics.EngineSynchronization, metrics.MessageBlockResponse)
return nil
}
return e.responseMessageHandler.Process(originID, event)

case *messages.SyncResponse:
report, valid, err := e.validateSyncResponseForALSP(channel, originID, message)
if err != nil {
return fmt.Errorf("failed to validate sync response from %x: %w", originID[:], err)
irrecoverable.Throw(context.TODO(), fmt.Errorf("failed to validate sync response from %x: %w", originID[:], err))
}
if !valid {
e.con.ReportMisbehavior(report) // report misbehavior to ALSP
@@ -286,8 +254,6 @@ func (e *Engine) process(channel channels.Channel, originID flow.Identifier, eve
Hex("origin_id", logging.ID(originID)).
Str(logging.KeySuspicious, "true").
Msgf("received invalid sync response from %x: %v", originID[:], valid)
e.metrics.InboundMessageDropped(metrics.EngineSynchronization, metrics.MessageSyncResponse)
return nil
}
return e.responseMessageHandler.Process(originID, event)
default:
@@ -509,27 +475,98 @@ func (e *Engine) sendRequests(participants flow.IdentifierList, ranges []chainsy
}
}

// TODO: implement spam reporting similar to validateSyncRequestForALSP
func (e *Engine) validateBatchRequestForALSP(channel channels.Channel, id flow.Identifier, batchRequest *messages.BatchRequest) (*alsp.MisbehaviorReport, bool, error) {
return nil, true, nil
// validateBatchRequestForALSP checks if a batch request should be reported as a misbehavior and sends misbehavior report to ALSP.
// The misbehavior is due to either:
// 1. unambiguous malicious or incorrect behavior (0 block IDs) OR
// 2. large number of block IDs in batch request. This is more ambiguous to detect as malicious behavior because there is no way to know for sure
// if the sender is sending a large batch request maliciously or not, so we use a probabilistic approach to report the misbehavior.
//
// Args:
// - originID: the sender of the batch request
// - batchRequest: the batch request to validate
// Returns:
// - error: If an error is encountered while validating the batch request. Error is assumed to be irrecoverable because of internal processes that didn't allow validation to complete.
func (e *Engine) validateBatchRequestForALSP(originID flow.Identifier, batchRequest *messages.BatchRequest) error {
// Generate a random integer between 0 and spamProbabilityMultiplier (exclusive)
n, err := rand.Uint32n(spamProbabilityMultiplier)
if err != nil {
return fmt.Errorf("failed to generate random number from %x: %w", originID[:], err)
}

// validity check: if no block IDs, always report as misbehavior
if len(batchRequest.BlockIDs) == 0 {
e.log.Warn().
Hex("origin_id", logging.ID(originID)).
Str(logging.KeySuspicious, "true").
Str("reason", alsp.InvalidMessage.String()).
Msg("received invalid batch request with 0 block IDs, creating ALSP report")
report, err := alsp.NewMisbehaviorReport(originID, alsp.InvalidMessage)
if err != nil {
// failing to create the misbehavior report is unlikely. If an error is encountered while
// creating the misbehavior report it indicates a bug and processing can not proceed.
return fmt.Errorf("failed to create misbehavior report (invalid batch request, no block IDs) from %x: %w", originID[:], err)
}
// failed unambiguous validation check and should be reported as misbehavior
e.con.ReportMisbehavior(report)
return nil
}

// to avoid creating a misbehavior report for every batch request received, use a probabilistic approach.
// The larger the batch request and base probability, the higher the probability of creating a misbehavior report.

// batchRequestProb is calculated as follows:
// batchRequestBaseProb * (len(batchRequest.BlockIDs) + 1) / synccore.DefaultConfig().MaxSize
// Example 1 (small batch of block IDs) if the batch request is for 10 block IDs and batchRequestBaseProb is 0.01, then the probability of
// creating a misbehavior report is:
// batchRequestBaseProb * (10+1) / synccore.DefaultConfig().MaxSize
// = 0.01 * 11 / 64 = 0.00171875 = 0.171875%
// Example 2 (large batch of block IDs) if the batch request is for 1000 block IDs and batchRequestBaseProb is 0.01, then the probability of
// creating a misbehavior report is:
// batchRequestBaseProb * (1000+1) / synccore.DefaultConfig().MaxSize
// = 0.01 * 1001 / 64 = 0.15640625 = 15.640625%
batchRequestProb := e.spamDetectionConfig.batchRequestBaseProb * (float32(len(batchRequest.BlockIDs)) + 1) / float32(synccore.DefaultConfig().MaxSize)
if float32(n) < batchRequestProb*spamProbabilityMultiplier {
// create a misbehavior report
e.log.Warn().
Hex("origin_id", logging.ID(originID)).
Str(logging.KeyLoad, "true").
Str("reason", alsp.ResourceIntensiveRequest.String()).
Msgf("for %d block IDs, creating probabilistic ALSP report", len(batchRequest.BlockIDs))
report, err := alsp.NewMisbehaviorReport(originID, alsp.ResourceIntensiveRequest)
if err != nil {
// failing to create the misbehavior report is unlikely. If an error is encountered while
// creating the misbehavior report it indicates a bug and processing can not proceed.
return fmt.Errorf("failed to create misbehavior report from %x: %w", originID[:], err)
}
// failed probabilistic (load) validation check and should be reported as misbehavior
e.con.ReportMisbehavior(report)
return nil
}
return nil
}

// TODO: implement spam reporting similar to validateSyncRequestForALSP
func (e *Engine) validateBlockResponseForALSP(channel channels.Channel, id flow.Identifier, blockResponse *messages.BlockResponse) (*alsp.MisbehaviorReport, bool, error) {
return nil, true, nil
}

// validateRangeRequestForALSP checks if a range request should be reported as a misbehavior.
// It returns a misbehavior report and a boolean indicating whether validation passed, as well as an error.
// Returns an error that is assumed to be irrecoverable because of internal processes that didn't allow validation to complete.
// Returns true if the range request is valid and should not be reported as misbehavior.
// Returns false if either a) the range request is invalid or b) the range request is valid but should be reported as misbehavior anyway (due to probabilities) or c) an error is encountered.
func (e *Engine) validateRangeRequestForALSP(originID flow.Identifier, rangeRequest *messages.RangeRequest) (*alsp.MisbehaviorReport, bool, error) {
// Generate a random integer between 1 and spamProbabilityMultiplier (exclusive)
// validateRangeRequestForALSP checks if a range request should be reported as a misbehavior and sends misbehavior report to ALSP.
// The misbehavior is due to either:
// 1. unambiguous malicious or incorrect behavior (toHeight < fromHeight) OR
// 2. large height in range request. This is more ambiguous to detect as malicious behavior because there is no way to know for sure
// if the sender is sending a large range request height maliciously or not, so we use a probabilistic approach to report the misbehavior.
//
// Args:
// - originID: the sender of the range request
// - rangeRequest: the range request to validate
// Returns:
// - error: If an error is encountered while validating the range request. Error is assumed to be irrecoverable because of internal processes that didn't allow validation to complete.
func (e *Engine) validateRangeRequestForALSP(originID flow.Identifier, rangeRequest *messages.RangeRequest) error {
// Generate a random integer between 0 and spamProbabilityMultiplier (exclusive)
n, err := rand.Uint32n(spamProbabilityMultiplier)

if err != nil {
return nil, false, fmt.Errorf("failed to generate random number from %x: %w", originID[:], err)
return fmt.Errorf("failed to generate random number from %x: %w", originID[:], err)
}

// check if range request is valid
@@ -544,10 +581,11 @@ func (e *Engine) validateRangeRequestForALSP(originID flow.Identifier, rangeRequ
if err != nil {
// failing to create the misbehavior report is unlikely. If an error is encountered while
// creating the misbehavior report it indicates a bug and processing can not proceed.
return nil, false, fmt.Errorf("failed to create misbehavior report (invalid range request) from %x: %w", originID[:], err)
return fmt.Errorf("failed to create misbehavior report (invalid range request) from %x: %w", originID[:], err)
}
// failed validation check and should be reported as misbehavior
return report, false, nil
// failed unambiguous validation check and should be reported as misbehavior
e.con.ReportMisbehavior(report)
return nil
}

// to avoid creating a misbehavior report for every range request received, use a probabilistic approach.
@@ -568,35 +606,41 @@ func (e *Engine) validateRangeRequestForALSP(originID flow.Identifier, rangeRequ
// create a misbehavior report
e.log.Warn().
Hex("origin_id", logging.ID(originID)).
Str(logging.KeySuspicious, "true").
Str(logging.KeyLoad, "true").
Str("reason", alsp.ResourceIntensiveRequest.String()).
Msgf("from height %d to height %d, creating probabilistic ALSP report", rangeRequest.FromHeight, rangeRequest.ToHeight)
report, err := alsp.NewMisbehaviorReport(originID, alsp.ResourceIntensiveRequest)

if err != nil {
// failing to create the misbehavior report is unlikely. If an error is encountered while
// creating the misbehavior report it indicates a bug and processing can not proceed.
return nil, false, fmt.Errorf("failed to create misbehavior report from %x: %w", originID[:], err)
return fmt.Errorf("failed to create misbehavior report from %x: %w", originID[:], err)
}
// failed validation check and should be reported as misbehavior
return report, false, nil

// failed probabilistic (load) validation check and should be reported as misbehavior
e.con.ReportMisbehavior(report)
return nil
}

// passed all validation checks with no misbehavior detected
return nil, true, nil
return nil
}

// validateSyncRequestForALSP checks if a sync request should be reported as a misbehavior.
// It returns a misbehavior report and a boolean indicating whether validation passed, as well as an error.
// Returns an error that is assumed to be irrecoverable because of internal processes that didn't allow validation to complete.
// Returns true if passed validation.
// Returns false if either a) failed validation (due to probabilities) or b) an error is encountered.
func (e *Engine) validateSyncRequestForALSP(originID flow.Identifier) (*alsp.MisbehaviorReport, bool, error) {
// Generate a random integer between 1 and spamProbabilityMultiplier (exclusive)
// validateSyncRequestForALSP checks if a sync request should be reported as a misbehavior and sends misbehavior report to ALSP.
// The misbehavior is ambiguous to detect as malicious behavior because there is no way to know for sure if the sender is sending
// a sync request maliciously or not, so we use a probabilistic approach to report the misbehavior.
//
// Args:
// - originID: the sender of the sync request
// Returns:
// - error: If an error is encountered while validating the sync request. Error is assumed to be irrecoverable because of internal processes that didn't allow validation to complete.
func (e *Engine) validateSyncRequestForALSP(originID flow.Identifier) error {
// Generate a random integer between 0 and spamProbabilityMultiplier (exclusive)
n, err := rand.Uint32n(spamProbabilityMultiplier)

if err != nil {
return nil, false, fmt.Errorf("failed to generate random number from %x: %w", originID[:], err)
return fmt.Errorf("failed to generate random number from %x: %w", originID[:], err)
}

// to avoid creating a misbehavior report for every sync request received, use a probabilistic approach.
@@ -606,7 +650,7 @@ func (e *Engine) validateSyncRequestForALSP(originID flow.Identifier) (*alsp.Mis
// create misbehavior report
e.log.Warn().
Hex("origin_id", logging.ID(originID)).
Str(logging.KeySuspicious, "true").
Str(logging.KeyLoad, "true").
Str("reason", alsp.ResourceIntensiveRequest.String()).
Msg("creating probabilistic ALSP report")

@@ -615,13 +659,14 @@ func (e *Engine) validateSyncRequestForALSP(originID flow.Identifier) (*alsp.Mis
if err != nil {
// failing to create the misbehavior report is unlikely. If an error is encountered while
// creating the misbehavior report it indicates a bug and processing can not proceed.
return nil, false, fmt.Errorf("failed to create misbehavior report from %x: %w", originID[:], err)
return fmt.Errorf("failed to create misbehavior report from %x: %w", originID[:], err)
}
return report, false, nil
e.con.ReportMisbehavior(report)
return nil
}

// passed all validation checks with no misbehavior detected
return nil, true, nil
return nil
}

// TODO: implement spam reporting similar to validateSyncRequestForALSP
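The `batchRequestProb` arithmetic from the `validateBatchRequestForALSP` godoc can be checked standalone. This is a rough sketch under stated assumptions: `maxSize` is hard-coded to 64, the value the godoc examples use for `synccore.DefaultConfig().MaxSize`, and `batchRequestProb` as a function is a hypothetical wrapper around the expression in the diff.

```go
package main

import "fmt"

// maxSize is assumed to be 64, the value the godoc examples use for
// synccore.DefaultConfig().MaxSize.
const maxSize = 64

// batchRequestProb is a hypothetical wrapper around the expression in the
// diff: the base probability is scaled by the number of requested block IDs
// (plus one) relative to maxSize.
func batchRequestProb(baseProb float32, numBlockIDs int) float32 {
	return baseProb * (float32(numBlockIDs) + 1) / float32(maxSize)
}

func main() {
	const baseProb = 0.01 // same base probability as the godoc examples

	// Batch sizes taken from the two godoc examples.
	for _, numBlockIDs := range []int{10, 1000} {
		p := batchRequestProb(baseProb, numBlockIDs)
		fmt.Printf("%4d block IDs: probability %.6f\n", numBlockIDs, p)
	}
}
```

The expression has no upper clamp, so for a sufficiently large batch the result exceeds 1 and the comparison against the random draw would always succeed, meaning every such request is reported.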