Change default value for dfs.ha.nn.not-become-active-in-safemode to false #458

Merged
merged 3 commits into from
Jan 19, 2024

soenkeliebau
Member

@soenkeliebau soenkeliebau commented Jan 19, 2024

Description

Change default value for dfs.ha.nn.not-become-active-in-safemode from true to false to be in line with the Hadoop default value.

With the previous default of true, a user hit a deadlock during a cold restart of an HDFS cluster: both namenodes refused to become active.

The relevant check in HDFS is shown below. It always fires when notBecomeActiveInSafemode is true and the namenode is in safemode, so it can delay startup for a long time or even indefinitely.

    if (notBecomeActiveInSafemode && isInSafeMode()) {
      throw new HealthCheckFailedException("The NameNode is configured to " +
          "report UNHEALTHY to ZKFC in Safemode.");
    }
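
To illustrate why this check can block a cold restart, here is a minimal, self-contained sketch. It is not HDFS or operator code: the NameNode class, its fields, and electableCount are hypothetical stand-ins, and the ZKFC election itself is not modeled. It only assumes that a namenode whose health check throws is not considered for becoming active.

```java
import java.util.List;

public class SafemodeDeadlockSketch {

    /** Hypothetical stand-in for the HDFS HealthCheckFailedException. */
    static class HealthCheckFailedException extends Exception {
        HealthCheckFailedException(String message) {
            super(message);
        }
    }

    /** Hypothetical, heavily simplified model of a namenode. */
    static class NameNode {
        final boolean notBecomeActiveInSafemode;
        final boolean inSafeMode;

        NameNode(boolean notBecomeActiveInSafemode, boolean inSafeMode) {
            this.notBecomeActiveInSafemode = notBecomeActiveInSafemode;
            this.inSafeMode = inSafeMode;
        }

        /** Mirrors the check quoted in the PR description. */
        void monitorHealth() throws HealthCheckFailedException {
            if (notBecomeActiveInSafemode && inSafeMode) {
                throw new HealthCheckFailedException(
                    "The NameNode is configured to report UNHEALTHY to ZKFC in Safemode.");
            }
        }
    }

    /** Counts the namenodes that pass the health check (assumed to be the only election candidates). */
    static long electableCount(List<NameNode> nameNodes) {
        long count = 0;
        for (NameNode nn : nameNodes) {
            try {
                nn.monitorHealth();
                count++;
            } catch (HealthCheckFailedException e) {
                // Reported as unhealthy, so this namenode is not a candidate.
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Cold restart: both namenodes come up in safemode.
        List<NameNode> oldDefault = List.of(new NameNode(true, true), new NameNode(true, true));
        List<NameNode> newDefault = List.of(new NameNode(false, true), new NameNode(false, true));

        System.out.println("candidates with true:  " + electableCount(oldDefault));
        System.out.println("candidates with false: " + electableCount(newDefault));
    }
}
```

In this model the old default leaves 0 candidates while both namenodes are in safemode, and the new default leaves 2, which matches the deadlock described above.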

fixes #264

@soenkeliebau soenkeliebau added this pull request to the merge queue Jan 19, 2024
Merged via the queue into main with commit f05d68c Jan 19, 2024
30 checks passed
@soenkeliebau soenkeliebau deleted the fix/#264 branch January 19, 2024 11:25
Development

Successfully merging this pull request may close these issues.

Namenodes deadlock in safemode when all namenodes have been down at the same time
2 participants