Parallel processing with BlockSci #84

Closed
mosessoh opened this issue Apr 9, 2018 · 1 comment · Fixed by #103

mosessoh commented Apr 9, 2018

I have noticed that the Address object in BlockSci cannot be pickled.

As a result, users cannot use the multiprocessing libraries, which rely on pickling objects.

BlockSci exposes the map_blocks, mapreduce_txes and mapreduce_blocks functions, which help us use multiple cores to process the blockchain.

I'm wondering if I am missing something similar for Address objects. I currently work with lists of Address objects (e.g. to chart the historical balances of an entity's addresses), and not being able to process them in parallel has slowed our research considerably.

This is somewhat related to #83. Thank you very much for your time and work on BlockSci.

@hkalodner
Collaborator

This is an issue in the current release. Native pickle functionality doesn't work on BlockSci objects since they are really pointers into a backing database. This will be fixed in the next release by using a custom Pickler object which takes a reference to the Blockchain that the pickled objects came from. This is already partially implemented in the development branch.
https://github.com/citp/BlockSci/blob/v0.5/Notebooks/blocksci/pickler.py
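
For readers unfamiliar with the general technique, here is a minimal sketch of how a pickler can serialise objects that are only pointers into an external store, using the standard library's `persistent_id` / `persistent_load` hooks. It is not BlockSci's implementation: `FakeChain`, `FakeAddress`, `ChainPickler`, `ChainUnpickler`, `dumps` and `loads` are hypothetical names invented for illustration, and in this sketch the chain reference is supplied on the unpickling side.

```python
import io
import pickle


class FakeChain:
    """Hypothetical stand-in for a blockchain handle (not the BlockSci API)."""

    def __init__(self, balances):
        self._balances = balances  # maps address index -> balance

    def address(self, index):
        return FakeAddress(self, index)


class FakeAddress:
    """Hypothetical stand-in for an object that only points into the chain."""

    def __init__(self, chain, index):
        self.chain = chain
        self.index = index

    def balance(self):
        return self.chain._balances[self.index]

    def __reduce__(self):
        # Mimic the reported behaviour: plain pickle cannot serialise it.
        raise TypeError("FakeAddress cannot be pickled without its chain")


class ChainPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Store only a small identifier instead of the object itself.
        if isinstance(obj, FakeAddress):
            return ("address", obj.index)
        return None  # everything else is pickled normally


class ChainUnpickler(pickle.Unpickler):
    def __init__(self, file, chain):
        super().__init__(file)
        self.chain = chain  # the chain that restored objects will point into

    def persistent_load(self, pid):
        kind, index = pid
        if kind == "address":
            return self.chain.address(index)
        raise pickle.UnpicklingError(f"unknown persistent id: {pid!r}")


def dumps(obj):
    buf = io.BytesIO()
    ChainPickler(buf).dump(obj)
    return buf.getvalue()


def loads(data, chain):
    return ChainUnpickler(io.BytesIO(data), chain).load()
```

With these stand-ins, `loads(dumps(addresses), chain)` rebuilds a list of address objects against whichever chain handle the receiving side passes in, so only the small identifiers travel between processes.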

@hkalodner hkalodner added the bug label May 3, 2018
hkalodner added a commit that referenced this issue May 3, 2018
Update BlockSci to v0.5.0

Version 0.5.0 focuses mainly on improvements and cleanups in the python interface. The largest new feature is the introduction of vectorized operations which return NumPy arrays, enabling much more rapid usage of BlockSci's python interface. You can read more details about the release in the [release notes](https://citp.github.io/BlockSci/changelog.html#version-0-5-0).

Fixes #72, fixes #76, fixes #84, and fixes #98