feat: Add Fuzzer and Fuzzer CLI #3769

Open
wants to merge 24 commits into
base: master
Conversation

rlaau

@rlaau rlaau commented Feb 18, 2025

Background

Go ships with a native fuzzer, but Gno does not yet have an equivalent.
I thought it would be useful to have a simple fuzzer that Gno files can import and use through the testing library.

To that end, I have added fuzzing functionality to stdlibs/testing and stdlibs/testing/fuzzing.
Users can now import this Gno-native fuzzer from the testing library and use it much like Go’s native fuzzer.


Goal

  1. Provide a Gno fuzzer with the same interface as Go’s fuzzer for ease of use.
  2. Ensure that the Gno fuzzer runs within the GnoVM execution logic, allowing it to detect errors specific to Gno programs.
  3. Maintain a certain level of performance for fuzzing.

Changes

  • Fix: Enabled calling the fuzzer via the CLI by updating:

    • gnovm/cmd/gno/test.go
    • gnovm/pkg/test/test.go
    • gnovm/stdlibs/testing/testing.gno
  • Add: gnovm/stdlibs/testing/fuzz.gno – Manages the interface for the Gno fuzzer.

  • Add: gnovm/stdlibs/testing/fuzzing/fuzz_hasher.gno – Assigns and manages IDs for test coverage using FuzzHasher.

  • Add: gnovm/stdlibs/testing/fuzzing/fuzz_logger.gno – Manages error inputs and logs via FuzzLogger.

  • Add: gnovm/stdlibs/testing/fuzzing/fuzz_manager.gno – Efficiently manages seed inputs and the priorities of coverage IDs (hash numbers) via FuzzManager.

  • Add: gnovm/stdlibs/testing/fuzzing/get_coverage.gno – Executes input values to collect coverage data.

  • Add: gnovm/stdlibs/testing/fuzzing/mutator.gno – Manages the seed values to be mutated and generates the next input from those seeds.

  • Add: gnovm/stdlibs/testing/fuzzing/parser.gno, gnovm/stdlibs/testing/fuzzing/parser_for_not_sb.gno – Analyzes input arguments to help the mutator perform more effective transformations.

  • Add: gnovm/stdlibs/testing/fuzzing/random.gno – Generates PCG random numbers for fuzzing.
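
For readers unfamiliar with PCG, here is a minimal sketch of the widely published PCG32 (XSH-RR) generator in plain Go. It is only an illustration of the technique: the names `pcg32`, `newPCG32`, `next`, and `intn` are hypothetical and are not taken from random.gno, whose actual implementation may differ.

```go
package main

import "fmt"

// pcg32 is a hypothetical, minimal PCG32 (XSH-RR) generator.
// It is NOT the type used in random.gno.
type pcg32 struct {
	state uint64
	inc   uint64 // stream selector; always odd
}

// pcgMultiplier is the 64-bit LCG multiplier used by the reference PCG32.
const pcgMultiplier = 6364136223846793005

// newPCG32 seeds a generator the same way the reference implementation does.
func newPCG32(seed, seq uint64) *pcg32 {
	p := &pcg32{state: 0, inc: (seq << 1) | 1}
	p.next()
	p.state += seed
	p.next()
	return p
}

// next advances the LCG state and returns 32 permuted output bits.
func (p *pcg32) next() uint32 {
	old := p.state
	p.state = old*pcgMultiplier + p.inc
	xorshifted := uint32(((old >> 18) ^ old) >> 27)
	rot := uint32(old >> 59)
	return (xorshifted >> rot) | (xorshifted << ((-rot) & 31))
}

// intn returns a value in [0, n) using rejection sampling to avoid modulo bias.
// n must be greater than zero.
func (p *pcg32) intn(n uint32) uint32 {
	threshold := -n % n // (2^32 - n) mod n
	for {
		if r := p.next(); r >= threshold {
			return r % n
		}
	}
}

func main() {
	rng := newPCG32(42, 54)
	for i := 0; i < 3; i++ {
		fmt.Printf("%#08x %d\n", rng.next(), rng.intn(10))
	}
}
```

The same seed and stream always yield the same sequence, which is what makes a failing fuzz run reproducible.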


Algorithm

  1. The user provides initial seeds via f.Add. All initial seeds are executed, and based on the results:

    • They are coordinated by f.manager.
    • f.mutator stores the seeds along with analyzed execution results.
  2. When f.Fuzz receives the target program to test, the following loop begins:

    1. f.manager selects the highest-priority coverage (the least executed one) and extracts a seed from its queue, passing it to f.mutator.
    2. f.mutator analyzes the seed and selects an appropriate mutation strategy to modify it.
    3. The fuzzer and f.hasher execute the mutated seed and classify the execution results (assigning a hash number to coverage).
    4. The results are processed:
      • f.manager coordinates the seed based on execution results.
      • f.mutator updates mutation strategies according to coverage and seed data.
    5. If a failure occurs, the fuzzer stops execution, and f.logger reports the error.
    6. Otherwise, the fuzzer terminates once the predefined number of iterations has completed.



Usage

  1. Coverage measurement for arbitrary functions is not implemented yet. The following example demonstrates coverage measurement for a specific function where coverage tracking is possible.

  2. The CLI command allows usage similar to Go’s fuzzer:

    gno test [file name] -fuzz=[function prefix] -v -i=[iteration count]
  3. Other functionality follows the usage described in Go’s fuzz documentation.

  4. However, there are a few differences:

    • The anonymous function passed to f.Fuzz must have the signature func(t *testing.T, args ...interface{}).
    • The fuzzing termination condition is iteration-based rather than time-based.
  5. Otherwise, usage is identical.

package testing_test

import (
	"testing"
	"testing/fuzzing"
	"unicode/utf8"
)

// TODO: Once coverage is fully implemented, test whether the fuzzer can detect
// issues such as HTTP request failures or compilation errors.

// FuzzEdgeCase tests whether the fuzzer can effectively generate edge cases.
// It evaluates the ability to produce edge inputs in a structured manner.
func FuzzEdgeCase(f *testing.F) {
	f.Add("apple hello", int(42131231230))
	f.Add("rainy day", int(98401132231331))
	f.Add("winter comes", int(12349123123))
	f.Fuzz(func(t *testing.T, orig ...interface{}) {
		v, ok := orig[0].(string)
		if !ok {
			panic("argument 0 is not a string")
		}
		i, ok2 := orig[1].(int)
		if !ok2 {
			panic("argument 1 is not an int")
		}
		rev := fuzzing.Reverse(v)
		doubleRev := fuzzing.Reverse(rev)
		if v != doubleRev && i > 300 && i < 500 {
			t.Errorf("Before: %q, after: %q", v, doubleRev)
		}
		if utf8.ValidString(v) && !utf8.ValidString(rev) && i > 300 && i < 1000 {
			t.Errorf("Reverse produced invalid UTF-8 string %q", rev)
		}
	})
}

// FuzzSymbolicPath tests whether the fuzzer can explore symbolic execution paths.
// It assesses the ability to reach deeply nested branches effectively.
func FuzzSymbolicPath(f *testing.F) {
	f.Add("")
	f.Fuzz(func(t *testing.T, orig ...interface{}) {
		s, ok := orig[0].(string)
		if !ok {
			panic("argument 0 is not a string")
		}
		if len(s) > 0 && s[0] == 'b' {
			if len(s) > 1 && s[1] == 'a' {
				if len(s) > 2 && s[2] == 'd' {
					if len(s) > 3 && s[3] == '!' {
						panic("crash triggered")
					}
				}
			}
		}
	})
}



Notes

  • Since coverage tracking has not yet been fully implemented, this fuzzer cannot be used on all code immediately.
  • The fuzzer was designed with future extensibility in mind and requires ongoing improvements, such as:
    -- Refining mutation strategies
    -- Improving management logic
    -- Optimizing priority selection
    -- Enhancing sampling techniques
  • However, to prevent the PR from becoming too large, I have submitted only the minimal functionality for now.

Future Enhancements

  • TODOs:
    -- Additional input minimization
    -- Refining mutation and strategy update methods
    -- Pattern-based analysis
    -- Implement full coverage tracking

  • Apply sampling techniques for seed selection (e.g., Thompson sampling)

  • Set an execution limit to stop exploration beyond a threshold and reset strategies

  • Improve transformation and priority determination using Sonar-like distance metrics

  • Extract execution logic separately and process it in parallel using multiprocessing, assigning each process its own GnoVM instance for testing

@github-actions github-actions bot added the 📦 🤖 gnovm Issues or PRs gnovm related label Feb 18, 2025
@Gno2D2 Gno2D2 requested a review from a team February 18, 2025 03:08
@Gno2D2 Gno2D2 added the review/triage-pending PRs opened by external contributors that are waiting for the 1st review label Feb 18, 2025
@Gno2D2
Collaborator

Gno2D2 commented Feb 18, 2025

🛠 PR Checks Summary

🔴 Pending initial approval by a review team member, or review from tech-staff

Manual Checks (for Reviewers):
  • IGNORE the bot requirements for this PR (force green CI check)
  • The pull request description provides enough details

🤖 This bot helps streamline PR reviews by verifying automated checks and providing guidance for contributors and reviewers.

✅ Automated Checks (for Contributors):

🟢 Maintainers must be able to edit this pull request (more info)
🔴 Pending initial approval by a review team member, or review from tech-staff

☑️ Contributor Actions:
  1. Fix any issues flagged by automated checks.
  2. Follow the Contributor Checklist to ensure your PR is ready for review.
    • Add new tests, or document why they are unnecessary.
    • Provide clear examples/screenshots, if necessary.
    • Update documentation, if required.
    • Ensure no breaking changes, or include BREAKING CHANGE notes.
    • Link related issues/PRs, where applicable.
☑️ Reviewer Actions:
  1. Complete manual checks for the PR, including the guidelines and additional checks if applicable.
📚 Resources:
Debug
Automated Checks
Maintainers must be able to edit this pull request (more info)

If

🟢 Condition met
└── 🟢 And
    ├── 🟢 The base branch matches this pattern: ^master$
    └── 🟢 The pull request was created from a fork (head branch repo: rlaau/gno)

Then

🟢 Requirement satisfied
└── 🟢 Maintainer can modify this pull request

Pending initial approval by a review team member, or review from tech-staff

If

🟢 Condition met
└── 🟢 And
    ├── 🟢 The base branch matches this pattern: ^master$
    └── 🟢 Not (🔴 Pull request author is a member of the team: tech-staff)

Then

🔴 Requirement not satisfied
└── 🔴 If
    ├── 🔴 Condition
    │   └── 🔴 Or
    │       ├── 🔴 At least 1 user(s) of the organization reviewed the pull request (with state "APPROVED")
    │       ├── 🔴 At least 1 user(s) of the team tech-staff reviewed pull request
    │       └── 🔴 This pull request is a draft
    └── 🔴 Else
        └── 🔴 And
            ├── 🟢 This label is applied to pull request: review/triage-pending
            └── 🔴 On no pull request

Manual Checks
**IGNORE** the bot requirements for this PR (force green CI check)

If

🟢 Condition met
└── 🟢 On every pull request

Can be checked by

  • Any user with comment edit permission
The pull request description provides enough details

If

🟢 Condition met
└── 🟢 And
    ├── 🟢 Not (🔴 Pull request author is a member of the team: core-contributors)
    └── 🟢 Not (🔴 Pull request author is user: dependabot[bot])

Can be checked by

  • team core-contributors

@rlaau rlaau changed the title feat: Add Fuzz and Fuzz CLI feat: Add Fuzzer and Fuzzer CLI Feb 18, 2025
@kristovatlas kristovatlas self-assigned this Feb 18, 2025
Member

@notJoon notJoon left a comment

I've left some comments. Also, please translate all comments to English.

}

default:
panic("logical Error. Type not implemented")
Member

Consider changing the panic message. It would be clearer to simply state that the particular type has not been implemented yet.

Suggested change
- panic("logical Error. Type not implemented")
+ panic(ufmt.Sprintf("Type (%d) not implemented", input.(type)))

result, usedStrategyMap := MutateNotSbEnt2interface(ms.NotSbEntsDict[argIdx], strength)
boolResult, ok := result.(bool)
if !ok {
panic("logical Error. If it occred, Needs to fix")
Member

Modify panic messages like this one as well, so that they give the user clear information.


// parseKeyVals identifies key-delimiter-value structures in the token stream.
// It detects and extracts key-value pairs, including optional whitespace
// surrounding delimiters like ":", "=", and ":=".
Member

@notJoon notJoon Feb 19, 2025

Situations like this, where multi-token constructs need to be parsed, are handled efficiently by combining recursive descent with limited lookahead.

Note: You don't need to apply this comment right now, but please keep it in mind, as it would make future maintenance easier.
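
To make the suggestion concrete, here is a minimal sketch in plain Go, written in the recursive-descent style (one function per grammar rule) with bounded lookahead. It assumes tokenization has already happened, and the `token`, `pair`, and `parser` types and their methods are hypothetical; they are not the types used in parser.gno.

```go
package main

import "fmt"

type kind int

const (
	word kind = iota
	space
	delim // ":", "=", or ":="
)

type token struct {
	kind kind
	text string
}

type pair struct{ key, delim, value string }

type parser struct {
	toks []token
	pos  int
}

// peek looks ahead n tokens without consuming them; ok is false past the end.
func (p *parser) peek(n int) (token, bool) {
	if p.pos+n < len(p.toks) {
		return p.toks[p.pos+n], true
	}
	return token{}, false
}

// skipSpace returns the lookahead offset after an optional space token at offset n.
func (p *parser) skipSpace(n int) int {
	if t, ok := p.peek(n); ok && t.kind == space {
		return n + 1
	}
	return n
}

// parsePair implements the rule: word [space] delim [space] word.
// It only consumes tokens when the whole rule matches.
func (p *parser) parsePair() (pair, bool) {
	k, ok := p.peek(0)
	if !ok || k.kind != word {
		return pair{}, false
	}
	n := p.skipSpace(1)
	d, ok := p.peek(n)
	if !ok || d.kind != delim {
		return pair{}, false
	}
	n = p.skipSpace(n + 1)
	v, ok := p.peek(n)
	if !ok || v.kind != word {
		return pair{}, false
	}
	p.pos += n + 1
	return pair{k.text, d.text, v.text}, true
}

// parseAll scans the stream, collecting pairs and skipping tokens that do not start one.
func (p *parser) parseAll() []pair {
	var out []pair
	for p.pos < len(p.toks) {
		if kv, ok := p.parsePair(); ok {
			out = append(out, kv)
		} else {
			p.pos++
		}
	}
	return out
}

func main() {
	// Tokens for the input: name := gno  port=8080
	p := &parser{toks: []token{
		{word, "name"}, {space, " "}, {delim, ":="}, {space, " "}, {word, "gno"},
		{space, "  "}, {word, "port"}, {delim, "="}, {word, "8080"},
	}}
	for _, kv := range p.parseAll() {
		fmt.Printf("%s %s %s\n", kv.key, kv.delim, kv.value)
	}
}
```

Because `parsePair` never looks more than five tokens ahead and consumes nothing on failure, new rules can be added as further functions without backtracking logic.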

@thehowl
Member

thehowl commented Feb 19, 2025

Hey @rlaau, thank you for your contribution.

We are currently in a feature freeze ahead of the mainnet beta launch. This is a large feature that will require extensive time to review and merge, so it will be deferred until after the mainnet launch.

In the meantime, Joon's comments are a good start, but keep in mind that review by the core team will be deferred.


var shift int8
if exponent <= 1 {
shift = int8(1 + int(mantissa%2))
Member

Changing mantissa%2 to mantissa&1 is more efficient because it only checks the LSB (least significant bit) instead of performing an actual division.

You can apply the same change in randomFloat64.
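
A small sketch of why the substitution is safe, assuming mantissa is an unsigned integer (here `uint32`, which is an assumption about the surrounding code). The helper names `shiftMod` and `shiftAnd` are hypothetical.

```go
package main

import "fmt"

// shiftMod mirrors the expression under review.
func shiftMod(mantissa uint32) int8 { return int8(1 + int(mantissa%2)) }

// shiftAnd is the suggested variant that tests only the least significant bit.
func shiftAnd(mantissa uint32) int8 { return int8(1 + int(mantissa&1)) }

func main() {
	for _, m := range []uint32{0, 1, 2, 3, 0x7FFFFF} {
		fmt.Println(m, shiftMod(m), shiftAnd(m))
	}

	// For signed integers the two forms are NOT interchangeable:
	// in Go the remainder takes the sign of the dividend.
	x := -3
	fmt.Println(x%2, x&1) // prints: -1 1
}
```

The two forms agree for every unsigned value, so the change is purely a performance tweak as long as mantissa stays unsigned.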

func randomFloat32(a float32) float32 {
bits := math.Float32bits(a)

exponent := (bits >> 23) & 0xFF
Member

You can avoid unnecessary type conversions by declaring exponent as int32. In that case, shift should also be changed to int32.
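
One way this could look, as a sketch only: the helper `float32Parts` is hypothetical, and the rest of randomFloat32 is not reproduced here. The conversion happens once, when the field is extracted, so later arithmetic on exponent and shift needs no further casts.

```go
package main

import (
	"fmt"
	"math"
)

// float32Parts splits an IEEE 754 single-precision value into its
// biased exponent (8 bits) and mantissa (23 bits).
// Hypothetical helper for illustration; not part of the PR.
func float32Parts(a float32) (exponent int32, mantissa uint32) {
	bits := math.Float32bits(a)
	exponent = int32((bits >> 23) & 0xFF) // convert once, at extraction
	mantissa = bits & 0x7FFFFF
	return exponent, mantissa
}

func main() {
	exponent, mantissa := float32Parts(1.5)

	// With exponent and shift both int32, no further conversions are needed.
	var shift int32
	if exponent <= 1 {
		shift = 1 + int32(mantissa&1)
	}
	fmt.Println(exponent, mantissa, shift)
}
```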

Member

It would be good to add a comment explaining that the pcg from stdlib couldn't be used due to an import cycle, which is why it was separately added to the testing package.

Labels
📦 🤖 gnovm Issues or PRs gnovm related review/triage-pending PRs opened by external contributors that are waiting for the 1st review
Projects
Status: Triage

5 participants