add Address abstraction #986

Merged
merged 110 commits into from
Jun 21, 2022

Conversation

williballenthin
Collaborator

@williballenthin williballenthin commented Apr 8, 2022

This PR introduces a dedicated Address representation. capa can now handle VA vs RVA vs file offset vs .NET token vs .NET token+offset. This ended up touching a lot of code, so this PR also introduces a codification of the freeze and json formats via pydantic.

closes #981
closes #983
closes #756

In summary: add an Address abstract base class (ABC) and some implementors for things like VA, file offset, .NET tokens. Whenever we'd previously emit an int to represent the VA, we now emit an Address instance with appropriate details.
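A minimal sketch of what such a hierarchy could look like, assuming int-backed classes for VAs and file offsets and a plain class for .NET tokens. `AbsoluteVirtualAddress` is named later in this thread; `FileOffsetAddress`, `DNTokenAddress` and all method bodies here are illustrative assumptions, not necessarily the code in this PR:

```python
import abc


class Address(abc.ABC):
    """Abstract location within a sample; subclasses say how to interpret it."""

    @abc.abstractmethod
    def __lt__(self, other):
        ...


class AbsoluteVirtualAddress(int, Address):
    """A virtual address such as 0x401000; still usable wherever an int is expected."""

    def __repr__(self):
        return f"absolute(0x{int(self):x})"


class FileOffsetAddress(int, Address):
    """An offset into the raw file (illustrative name)."""

    def __repr__(self):
        return f"file(0x{int(self):x})"


class DNTokenAddress(Address):
    """A .NET metadata token (illustrative name); deliberately not an int."""

    def __init__(self, token: int):
        self.token = token

    def __eq__(self, other):
        return isinstance(other, DNTokenAddress) and self.token == other.token

    def __lt__(self, other):
        return self.token < other.token

    def __hash__(self):
        return hash(("dn token", self.token))

    def __repr__(self):
        return f"token(0x{self.token:x})"


va = AbsoluteVirtualAddress(0x401000)
print(repr(va), int(va) == 0x401000)
```

The int-backed classes inherit comparison and hashing from `int`, so they satisfy the abstract `__lt__` without extra code, while the token class has to spell out its own ordering and equality.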

Because we can't use int to identify locations anymore, as extractors enumerate functions/bb/insn, they now return a unified *Handle that contains:

  1. the address instance, and
  2. the inner, extractor-specific data
  3. (FunctionHandles also have a general purpose context dict for caching)

Here's an example of how the handles are used in an extractor: dnfile.extract_insn_api_features
Using the handles should be pretty easy; if it isn't, something is wrong.

Things like render and the matching engine know what to do with Addresses, so these can be used consistently throughout the codebase. And, each extractor knows what to do with inner to pull out features. Previously, we required that location-like things supported int(...) with varying levels of success. This design is both cleaner and more explicit.
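To make the handle idea concrete, here is a rough sketch of the shape described above. The field names `address`, `inner` and `ctx`, the `FakeInsn` type and both functions are hypothetical stand-ins chosen for illustration; a real extractor would wrap its own backend objects:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List, Tuple


@dataclass
class FunctionHandle:
    address: Any  # an Address instance in the PR; Any keeps this sketch standalone
    inner: Any  # extractor-specific function object
    ctx: Dict[str, Any] = field(default_factory=dict)  # general purpose cache


@dataclass
class InsnHandle:
    address: Any
    inner: Any  # extractor-specific instruction object


@dataclass
class FakeInsn:
    """Hypothetical backend instruction, standing in for a real disassembler type."""

    offset: int
    mnemonic: str


def get_instructions(fh: FunctionHandle) -> Iterator[InsnHandle]:
    # The extractor wraps each backend object in a handle that carries its address.
    for insn in fh.inner:
        yield InsnHandle(address=insn.offset, inner=insn)


def extract_insn_mnemonic_features(
    fh: FunctionHandle, ih: InsnHandle
) -> Iterator[Tuple[Tuple[str, str], Any]]:
    # Only the extractor looks inside `inner`; callers just see (feature, address).
    yield ("mnemonic", ih.inner.mnemonic), ih.address


insns: List[FakeInsn] = [FakeInsn(0x401000, "push"), FakeInsn(0x401001, "mov")]
fh = FunctionHandle(address=0x401000, inner=insns)
for ih in get_instructions(fh):
    for feature, address in extract_insn_mnemonic_features(fh, ih):
        print(feature, hex(address))
```

The point of the split is that generic code only ever touches `address`, and only the extractor that created a handle needs to understand `inner`.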

Making Addresses complex also involves breaking changes to the freeze and result document formats, because an Address can no longer be used as the key in a JSON dictionary. In most places, a Dict[Address, Foo] becomes a List[Tuple[Address, Foo]] within the serialized format.
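The key restriction is easy to reproduce with the standard `json` module. In this sketch a structured address is modeled as a hypothetical `(kind, value)` tuple, and the helper names are made up for illustration; the actual freeze and result document layouts may differ:

```python
import json
from typing import Dict, List, Tuple

# Hypothetical structured address: (kind, value), e.g. ("absolute", 0x401000).
Address = Tuple[str, int]


def to_serializable(matches: Dict[Address, str]) -> List[Tuple[Address, str]]:
    # JSON object keys must be scalars, so emit a list of (address, value) pairs.
    return sorted(matches.items())


def from_serializable(pairs) -> Dict[Address, str]:
    # JSON turns tuples into lists; convert each address back so it is hashable.
    return {(kind, value): match for (kind, value), match in pairs}


matches = {
    ("absolute", 0x401000): "create file",
    ("file", 0x1000): "embedded PE",
}

try:
    json.dumps(matches)
except TypeError as e:
    print("dict with structured keys is rejected:", e)

doc = json.dumps(to_serializable(matches))
print(doc)
print(from_serializable(json.loads(doc)) == matches)
```

The cost of this layout is that consumers lose direct lookup by address in the serialized form and have to rebuild a dictionary after loading.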

As I made changes to the freeze and result document formats, I became quite worried that I was leaving things broken, as we didn't have any type checking around the structures (also, these formats weren't documented outside of the code :-( ). So, I made some (fairly big) changes to use pydantic to declare the serialization of these types. These changes weren't really in scope for the original issue, but now seemed like a prudent time for them. Sorry :-/

render examples:

virtual addresses: like before, such as 0x401000
file offsets: file+0x1000
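As a rough illustration of the two renderings listed above, a formatter could dispatch on the kind of address. The `format_address` function and the `"absolute"`/`"file"` kind strings are assumptions made for this sketch, not capa's render API, and the hex digit case is an arbitrary choice:

```python
def format_address(kind: str, value: int) -> str:
    """Render an address the way the examples above show it."""
    if kind == "absolute":
        # virtual addresses look like they did before
        return f"0x{value:X}"
    if kind == "file":
        # file offsets get a prefix so they cannot be mistaken for a VA
        return f"file+0x{value:X}"
    raise ValueError(f"unsupported address kind: {kind}")


print(format_address("absolute", 0x401000))
print(format_address("file", 0x1000))
```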

TODO

  • pefile handles
  • elf handles
  • SMDA handles
  • IDA handles
  • .NET handles
  • render Addresses
  • encode NO_ADDRESS to JSON
  • encode DN token to JSON
  • freeze Addresses
  • fix any remaining tests
  • IDA plugin
  • tests for .NET
    • freeze
    • render
    • json

Checklist

  • No CHANGELOG update needed
  • No new tests needed
  • No documentation update needed

mr-tz and others added 8 commits April 6, 2022 11:33
* feat: start dotnet detection

* Apply suggestions from code review

Co-authored-by: Willi Ballenthin <willi.ballenthin@gmail.com>

* refactor: dn instead of dotnet

* refactor: format branches, extractor reorg

* refactor: format selection and dotnet detect

* feat: get format, arch, os

* refactor: log errors and exceptions

* ci: also test and build for dotnet-main dev

* fix: import path

* fix: circular dep

* fix: remove buf argument
feat: get runtime meta data

* fix: log unsupported runtime error

* fix: type ignore

Co-authored-by: Willi Ballenthin <willi.ballenthin@gmail.com>
* Sync capa rules submodule

* Sync capa-testfiles submodule

* Sync capa rules submodule

* changelog

* *: remove /x32 and /x64 flavors from number and offset features

* *: remove more references to /x32 and /x64

* linter: accept instruction scope

* rules: fix max operand index (4)

* API: better support A/W functions

* vverbose: show lib rule matches

* main: accept multiple paths to rules

* main: fix removal of default rules path

* lint: fix rules path

* changelog

* capa_as_library: fix rules path is list now

* main: better handle multiple rules paths

* main: bail if python 3.6 or below

closes #964

* ida: readme: remove python 3.6 support

* capa2yara: fix rules paths

* render: meta: display rule paths on separate lines

closes #971

* render: verbose: add doc

* verbose: make rule path multiline more concise

* vverbose: don't show examples in output

closes #970

* vverbose: render subscope name, like "basic block:"

closes #963

* build(deps-dev): bump pytest from 7.0.1 to 7.1.1

Bumps [pytest](https://github.com/pytest-dev/pytest) from 7.0.1 to 7.1.1.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](pytest-dev/pytest@7.0.1...7.1.1)

---
updated-dependencies:
- dependency-name: pytest
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* ci: build: update pip and setuptools

* ci: build: bump pyinstall to v4.10

* Sync capa rules submodule

* Dotnet mixed mode detect (#969)

* feat: start dotnet detection (#955)

* feat: start dotnet detection

* Apply suggestions from code review

Co-authored-by: Willi Ballenthin <willi.ballenthin@gmail.com>

* refactor: dn instead of dotnet

* refactor: format branches, extractor reorg

* refactor: format selection and dotnet detect

* feat: get format, arch, os

* refactor: log errors and exceptions

* ci: also test and build for dotnet-main dev

* fix: import path

* fix: circular dep

* fix: remove buf argument
feat: get runtime meta data

* fix: log unsupported runtime error

* fix: type ignore

Co-authored-by: Willi Ballenthin <willi.ballenthin@gmail.com>

* fix: imports and add tests

* feat: detect mixed mode and tests

* feat: start dotnet detection (#955)

* feat: start dotnet detection

* Apply suggestions from code review

Co-authored-by: Willi Ballenthin <willi.ballenthin@gmail.com>

* refactor: dn instead of dotnet

* refactor: format branches, extractor reorg

* refactor: format selection and dotnet detect

* feat: get format, arch, os

* refactor: log errors and exceptions

* ci: also test and build for dotnet-main dev

* fix: import path

* fix: circular dep

* fix: remove buf argument
feat: get runtime meta data

* fix: log unsupported runtime error

* fix: type ignore

Co-authored-by: Willi Ballenthin <willi.ballenthin@gmail.com>

* fix: imports and add tests

Co-authored-by: Willi Ballenthin <willi.ballenthin@gmail.com>

* test: checkout submodules recursively

Co-authored-by: Capa Bot <capa-dev@mandiant.com>
Co-authored-by: Willi Ballenthin <willi.ballenthin@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
@williballenthin williballenthin added enhancement New feature or request dotnet labels Apr 8, 2022
@williballenthin williballenthin added this to the 4.0.0 milestone Apr 8, 2022
@github-actions github-actions bot left a comment

Please add bug fixes, new features, breaking changes and anything else you think is worthwhile mentioning to the master (unreleased) section of CHANGELOG.md. If no CHANGELOG update is needed add the following to the PR description: [x] No CHANGELOG update needed

@williballenthin

This comment was marked as outdated.

@github-actions github-actions bot dismissed their stale review April 8, 2022 18:20

CHANGELOG updated or no update needed, thanks! 😄

@williballenthin

This comment was marked as outdated.

Base automatically changed from dotnet-main to master April 8, 2022 20:55
@williballenthin williballenthin marked this pull request as ready for review June 14, 2022 23:03
@williballenthin
Collaborator Author

I've updated the IDA plugin so this PR is now ready for final review and merge. I'm not aware of any issues (aside from how big it is!).

Comment on lines +223 to +224
assert isinstance(location, AbsoluteVirtualAddress)
ea = int(location)
Collaborator Author

note that we assume that IDA only works with absolute addresses. that is, it doesn't work with .NET tokens, etc. as of today, this is true. if we change the behavior of the plugin in the future, we may have to update these assumptions. that's why i used an assert here.

Comment on lines -175 to +176
class CapaExplorerRulgenPreview(QtWidgets.QTextEdit):
class CapaExplorerRulegenPreview(QtWidgets.QTextEdit):
Collaborator Author

fixed typo

@williballenthin
Collaborator Author

williballenthin commented Jun 14, 2022

  • FAILED tests/test_main.py::test_main_dotnet - AssertionError: assert -14 == 0
  • FAILED tests/test_main.py::test_main_dotnet2 - AssertionError: assert -14 == 0
  • FAILED tests/test_render.py::test_render_meta_attack - AttributeError: module...
  • FAILED tests/test_render.py::test_render_meta_mbc - AttributeError: module 'c...
  • FAILED tests/test_scripts.py::test_scripts[show-capabilities-by-function.py-args3]
  • FAILED tests/test_scripts.py::test_bulk_process - AssertionError: assert 1 == 0

Collaborator

@mr-tz mr-tz left a comment

LGTM! Thanks a lot!

@williballenthin
Collaborator Author

The dotnet tests were failing due to the presence of the .NET limitation rule; this is fixed by mandiant/capa-rules#563.

@williballenthin williballenthin merged commit fb99ef5 into master Jun 21, 2022
@williballenthin williballenthin deleted the feature-981 branch June 21, 2022 15:45
Labels
dotnet enhancement New feature or request
Projects
None yet
3 participants