Data corruption when multiple parsers are run simultaneously on the same chain #211

Closed
hkalodner opened this issue Nov 21, 2018 · 4 comments
Labels
bug fixed-in-v0.6 This issue has been resolved in the development version (available on the v0.6 branch) parser Issue related to the parser

@hkalodner
Collaborator

Please provide a clear and concise description of what the problem is.

Expected Result

blocksci_parser should detect if another process is already operating on the same directory

Actual Result

There's no safeguard in place, so two processes can corrupt the data.

Reproduction Steps

Run the parser twice simultaneously

System Information

N/A

@hkalodner hkalodner added bug parser Issue related to the parser labels Nov 21, 2018
@hkalodner hkalodner added this to the v0.6 milestone Nov 21, 2018
maltemoeser added a commit to maltemoeser/BlockSci that referenced this issue Apr 1, 2019
Create a PID file in the data directory to prevent multiple parser instances from running simultaneously

Resolves citp#211
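
For readers unfamiliar with the PID-file approach the commit message describes, here is a minimal sketch of the general technique. It is not BlockSci's actual implementation: the file name `parser.pid` and the helper names are hypothetical, and it assumes the PID is stored as plain decimal text.

```python
import os
import tempfile

PID_FILE_NAME = "parser.pid"  # hypothetical name, not taken from BlockSci


def acquire_pid_file(data_dir):
    """Create the PID file atomically; raise FileExistsError if it already exists."""
    path = os.path.join(data_dir, PID_FILE_NAME)
    # O_CREAT | O_EXCL makes creation fail if the file is already present,
    # so two processes cannot both believe they created it.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))
    return path


def release_pid_file(path):
    """Remove the PID file after a clean shutdown."""
    os.remove(path)


if __name__ == "__main__":
    data_dir = tempfile.mkdtemp()
    pid_path = acquire_pid_file(data_dir)
    try:
        acquire_pid_file(data_dir)  # simulates a second instance starting
    except FileExistsError:
        print("another parser instance appears to be running")
    release_pid_file(pid_path)
```

A side effect of this design is that a process killed before it reaches `release_pid_file` leaves the file behind, so the next start is refused until someone intervenes.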
@maltemoeser
Member

@hkalodner this seems like it could be an issue for the v0.5 AMI. If the node updates much faster than the parser, another cronjob could be started while the previous parser process is still running, potentially leading to data corruption.

@maltemoeser
Member

Looks like currently on the AMI a second parser process will stop due to a lock put in place by the database.

terminate called after throwing an instance of 'std::runtime_error'
  what():  Could not open hash index with error: While lock file: /home/ubuntu/bitcoin/hashIndex/LOCK: Resource temporarily unavailable
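
"Resource temporarily unavailable" is the standard message for `EAGAIN`, which is what a non-blocking lock attempt returns when another holder already has the lock. The sketch below illustrates that general pattern with `flock` on a POSIX system. It is only an illustration: the database may use a different locking call, and `try_lock` is a hypothetical helper.

```python
import fcntl
import os
import tempfile


def try_lock(lock_path):
    """Return an open file holding an exclusive lock, or None if it is already taken."""
    handle = open(lock_path, "w")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting for the lock.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # EAGAIN / EWOULDBLOCK: someone else holds the lock.
        handle.close()
        return None
    return handle


if __name__ == "__main__":
    lock_path = os.path.join(tempfile.mkdtemp(), "LOCK")
    first = try_lock(lock_path)
    # A second open() creates a separate open file description, so it conflicts
    # with the first lock just as a second process would.
    second = try_lock(lock_path)
    print("first acquired:", first is not None)
    print("second refused:", second is None)
    first.close()  # closing the file releases the lock
```

Unlike a PID file, a lock of this kind is released by the kernel when the holding process exits, so a crash does not leave a stale lock behind.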

@maltemoeser maltemoeser added the fixed-in-v0.6 This issue has been resolved in the development version (available on the v0.6 branch) label May 10, 2019
@jiagengliu

It might sound silly, but I accidentally stopped the parser process and now the parser complains that a PID file already exists and aborts. Is there a way to clean up and rerun? Thank you.

@maltemoeser
Member

@jiagengliu If the parser was interrupted, I'd strongly recommend reparsing the whole chain, since there may be data corruption (#33). You could simply delete the PID file in the data directory to continue parsing, but I don't recommend it.
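
Before touching a leftover PID file, it may help to confirm that no parser is actually still running. The following is a rough sketch for POSIX systems, assuming the file contains the PID as decimal text (the issue does not state the format). Both function names are hypothetical. A stale file only shows that the process is gone; it says nothing about whether the parsed data is intact.

```python
import os


def pid_is_running(pid):
    """Best-effort check whether a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check without sending a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def pid_file_is_stale(pid_file_path):
    """Return True if the PID recorded in the file no longer refers to a live process."""
    with open(pid_file_path) as handle:
        pid = int(handle.read().strip())  # assumed format: decimal PID as text
    return not pid_is_running(pid)
```

Because the operating system can reuse PIDs, a "running" result may occasionally refer to an unrelated process, so treat the check as a hint rather than proof.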
