Which GitHub account should/can we use for migration? #4
Comments
It seems the ASF has a few organization accounts. I think this account may fit the job perfectly; I'll ask infra if we can use it as the author account for migrated issues. cc @uschindler |
This is a two-pass migration.
I think we can ask infra to use an official ASF bot account (e.g. https://github.com/asfgit) for the first pass. This means the author of all issues/comments will be the bot.
For the second pass, I would like to do this part ourselves since it is the most time-consuming [1] and risky [2] part of the migration process. This means my account will be noted in each migrated issue/comment header, like "edited by mocobeta". I would rather not use my personal account for the migration, but it is more important to avoid accidents and, if any do happen, to react to them as quickly as possible; it's a trade-off for me. [1] It will take > 24h. |
If we must use an ASF bot account throughout the whole process, I'd break up the second pass into three sub-steps.
It'll make the total migration time longer (maybe from 2-3 days to 4-5 days or more) since it involves additional scripts and communication with infra; but it'll make the updating step a bit safer than the current draft plan in #7, and allow us to ask infra to run the second pass with an infra account. |
If there are no comments/objections, I'll decide how we proceed with that. |
Regarding rate limits, would it be possible to reach out to GitHub directly or through ASF infra to get those temporarily raised? |
I'm not familiar with the relationship between the ASF and GitHub, but the ASF organization accounts may already be counted as Enterprise accounts (with a higher limit of 15,000 requests per hour). I think the difficulty here is that we developers cannot test our migration script with a real ASF account, while infra would expect us to provide "tested" scripts. If we misestimate the throttling interval, the script will fall into an unstable state in the middle of processing (maybe one or two hours after starting), and there is no way to roll back. Anyway, we'll need infra's help on that. I will open an INFRA issue to ask for advice/information after summing up our draft plan(s). |
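To make the throttling concern concrete, here is a minimal sketch of how a request interval could be derived from an hourly quota. The function names and the `safety_factor` parameter are hypothetical and not taken from the migration scripts; the 15,000 requests/hour figure is the one quoted above and would need to be confirmed for the real account.

```python
def min_interval_seconds(requests_per_hour: int, safety_factor: float = 1.5) -> float:
    """Seconds to wait between API calls so the hourly quota is not exceeded.

    safety_factor > 1 leaves headroom in case the quota estimate is wrong
    (hypothetical knob, not part of the actual scripts).
    """
    if requests_per_hour <= 0:
        raise ValueError("requests_per_hour must be positive")
    return 3600.0 / requests_per_hour * safety_factor


def estimated_hours(num_requests: int, requests_per_hour: int,
                    safety_factor: float = 1.5) -> float:
    """Rough wall-clock time for num_requests sequential, throttled calls."""
    interval = min_interval_seconds(requests_per_hour, safety_factor)
    return num_requests * interval / 3600.0


if __name__ == "__main__":
    # Assumed quota from the discussion; the real limit must be verified.
    print(min_interval_seconds(15000))
    print(estimated_hours(100000, 15000))
```

A larger safety factor trades total run time for a lower chance of hitting the limit partway through a run that cannot be rolled back.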
There have been no additional comments/requests. |
Hmm why is the script NOT idempotent? It remaps a |
@mikemccand it was not idempotent at that time (duplicate cross-issue links were created if we applied the update script multiple times), but I made it idempotent with this change #16, in exchange for additional steps/time. |
It's difficult (or at least cumbersome) to correctly determine if there is already a remapped cross-issue link so it "should not do anything", since a cross-issue link is just a string ( |
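The following is a minimal sketch of why string-based link remapping is easy to get non-idempotent, and one way around it: always render the updated text from the stored original rather than from the already-updated text. This is an illustration only and is not necessarily what #16 does; `remap_links`, the `LUCENE-NNN (#NNN)` output format, and `issue_map` are hypothetical.

```python
import re
from typing import Dict

# Matches Jira keys such as LUCENE-1234 (project key assumed for illustration).
JIRA_KEY = re.compile(r"\bLUCENE-(\d+)\b")


def remap_links(original_text: str, issue_map: Dict[str, int]) -> str:
    """Append the GitHub issue number after each known Jira key."""
    def replace(match: "re.Match[str]") -> str:
        key = match.group(0)
        number = issue_map.get(key)
        if number is None:
            return key  # unknown key: leave untouched
        return f"{key} (#{number})"
    return JIRA_KEY.sub(replace, original_text)


if __name__ == "__main__":
    original = "Related to LUCENE-10 and LUCENE-99."
    issue_map = {"LUCENE-10": 5}

    # Rendering from the original is repeatable: same input, same output.
    print(remap_links(original, issue_map))

    # Re-applying to already-updated text duplicates the link.
    print(remap_links(remap_links(original, issue_map), issue_map))
```

Setting the body to `remap_links(original, issue_map)` on every run is idempotent, at the cost of keeping the original text around, which is one form of the "additional steps/time" trade-off.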
@mocobeta, can you please clarify why you have decided to do a two-pass migration? Why don't you just import the issues with the proper cross-references in the first place? In other words:
Here's an example of an issue imported in a single go: https://github.com/vlsi/tmp-jmeter-issues/issues/1188#issue-1327437303 |
We could predict the issue numbers as you pointed out, but we must prioritize safety. Importing takes 24 hours, and if there are short GitHub outages (which are not uncommon), the actual numbers will be inconsistent with the predicted issue numbers; that would be a disaster for us. (We can't stop importing for a few errors since the importing is done by another team.) In addition, we decided to make GitHub issues available before the migration is finished, so as not to interrupt our issue system, while preventing people from opening new Jira issues; this makes it almost impossible to determine the imported issue numbers. |
There will also be pull requests (we already use them heavily). @vlsi I'm just curious if there is a way to correctly determine the issue numbers while new issues/PRs arrive during the import. We won't be able to stop new issues/PRs. |
Just in case: I'm doing the migration for Apache JMeter (see https://github.com/vlsi/bugzilla2github).
I do not think so. As far as I know, the issues are allocated sequentially, and the latest assigned number can be fetched via https://api.github.com/repos/apache/lucene/issues?per_page=1 API (see With JMeter, the flow of issues and PRs is not really high, so we would just make everything read-only during the migration.
I assume you use the bulk issue import API (https://gist.github.com/jonmagic/5282384165e0f86ef105), and if you wait for the import to complete, then it does return the assigned issue_number. However, the key reason for pre-assigning the numbers is that you could generate comments that reference "issues that will be created later". If GitHub breaks, you could just continue from where you stopped. |
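The pre-assignment idea can be sketched as follows, assuming numbers are allocated sequentially and nothing else (new issues or PRs) consumes a number during the import. `predict_issue_numbers` and `find_mismatches` are hypothetical helpers for illustration, not part of either migration tool.

```python
from typing import Dict, List, Tuple


def predict_issue_numbers(jira_keys: List[str], last_assigned: int) -> Dict[str, int]:
    """Map each Jira key to the number it should get if imported in order.

    Assumes sequential allocation with no concurrent issues or PRs.
    """
    return {key: last_assigned + offset
            for offset, key in enumerate(jira_keys, start=1)}


def find_mismatches(predicted: Dict[str, int],
                    actual: Dict[str, int]) -> Dict[str, Tuple[int, int]]:
    """Return {key: (predicted, actual)} for every imported key that drifted."""
    return {key: (predicted[key], actual[key])
            for key in predicted
            if key in actual and predicted[key] != actual[key]}


if __name__ == "__main__":
    predicted = predict_issue_numbers(["LUCENE-1", "LUCENE-2", "LUCENE-3"], 100)
    print(predicted)

    # If a PR takes number 102 during the import, later issues shift by one.
    actual = {"LUCENE-1": 101, "LUCENE-2": 103, "LUCENE-3": 104}
    print(find_mismatches(predicted, actual))
```

The second helper shows the failure mode raised earlier in the thread: a single unexpected issue or PR shifts every later number, so any cross-reference generated from the prediction would point at the wrong issue.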
Do you know that GitHub has "Temporary interaction limits"? I think they should be able to prevent people from creating issues and PRs during the import. By the way, I wonder if you have tried contacting GitHub support. |
Thanks for your suggestions.
Personally, I wish we could temporarily stop all new issues/PRs during migration, but our community is unlikely to accept it. |
Actually I think that would be fine. We could VOTE on it, but such down time is warranted if it de-risks the migration. |
The current two-pass migration is carefully considered and safe, and there is no risk; though I admit it's a bit complicated. I don't think we should change the two-pass migration plan just to shorten the total time a bit. I don't mean we shouldn't introduce downtime and make it just one pass. But the scripts are already ready and well tested, so I think it'd be riskier for us to change the plan now. |
Yeah, I'm not proposing we change this approach now. I am proposing we mark both GitHub and Jira read-only during the migration. I think the community would agree, since/if it can de-risk the migration. A two-day downtime every 10 years or so seems fine ;) |
Maybe I should have argued strongly that we should allow some downtime (no new issues, PRs and comments for one or two days) to make the migration plan simpler. But if we follow the two-pass migration plan written in #7, we do not need any downtime. |
Does the second pass generate GitHub notifications for the users mentioned in the issues? |
We confirmed that the second pass does not cause any notifications. It seems GitHub does not trigger any notifications when issues/comments are updated through the API. |
Have we confirmed with INFRA that we can quickly make Jira read-only for just our project (Lucene)? |
OK, as long as we feel the risks are all contained (because you had planned on keeping both issue trackers accessible during the migration), then let's stick with that plan. I just want to point out that asking the community to freeze one or both issue trackers is fine IMO. |
We need to update all Jira issues at the last step. If we want Jira to be read-only during migration, we have to ask infra for extra work: make Jira read-only, make Jira writable after the migration, and lastly make Jira read-only again after adding comments to each issue. Personally, I don't think we should pursue this approach. Making a project read-only is not an easy configuration change; I think a workflow (or a database record?) needs to be changed. |
Just do this live with the Infra team on Slack. They are very cooperative and fix stuff in real time. Just make an appointment with them and you can work together with them. We did this several times with migrations on Jenkins. They are there to help! |
Yes, but - making a Jira project read-only, writable, and read-only again would not be a quick fix I think? I'm missing something maybe? |
I'll reach out to INFRA, but if possible, I would like to proceed with the whole process asynchronously, without real-time conversation on Slack, since there is a time difference between me and the US/Europe people. It's especially critical if we do the migration on business days... (I also have a daytime job.) |
Don't get me wrong - I appreciate suggestions from all of you. However, there are many things you can easily control but I can't due to several differences (timezone, language fluency, etc.) I have to proceed with this project with my very limited resources and ability if there is no person who is willing to take over this work. Please feel free to pick any tasks if you see I'm doing them badly. |
Hi mocobeta, Regarding read-only: don't misunderstand me. I just wanted to say: let's make Jira and GitHub read-only for outsiders. Switching this is easy by changing the permission scheme on the project; that's two mouse clicks. We can't do it ourselves, but I'd really want to enforce this. I don't think it is a problem to prevent people from opening issues! We just put a message there like "We have some maintenance, you can't open issues at the moment. Please send a message to dev@lucene.apache.org, we will take care to create a new issue once all systems are back up and running." |
And doing something like changing permission scheme can be done with communication on Slack. That's all. I did this several times. |
Sorry I don't know - I'm not familiar with our ASF infrastructure, but I think the infra team selects a proper machine for the job. Anyway we can't run the import script ourselves.
Thanks, it'd be great if we make them completely non-writable. I think it'd be fine on the Jira side; maybe people wouldn't care much about it. We could make Jira/GitHub non-writable only for external contributors with fine-grained access control, but that would be confusing. I think we should make sure to prevent everyone, including committers, from opening issues/PRs or adding comments on both Jira and GitHub if we are going to introduce downtime.
Of course, I can use ASF Slack for asynchronous communication. I'll reach out to the infra team and consult on how to arrange the whole process considering the time difference. |
I recognize the strong suggestions to make both Jira and GitHub non-writable during migration. Generally, I agree with them: stopping all activities for two or three days would not be a big deal (I hope there are no objections, since I've seen at least one request not to stop our issue system). Allowing downtime, the migration plan will be:
For details, I'll reach out to infra via Jira or Slack. |
I sent an email on the dev@ list to share the change in the migration (~72 hours downtime for issues/PRs). I'd be glad if you give further suggestions there. |
Are you going to include the link to GitHub issue? |
Yes, this is the main purpose of leaving a comment on each Jira issue. The message will be something like
We plan to completely silence JIRA notifications. If this can't be done ourselves, we'd need to ask infra. |
Could we make it look something like this: This issue was moved to GitHub issue #517. I.e. embed the link in the Jira comment? |
It's just a hyperlink, we can use any anchor string. In Jira syntax it would be:
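As an illustration, Jira wiki markup writes a link as `[anchor text|URL]`. A minimal sketch that builds such a comment is below; the function name, the `apache/lucene` default, and the exact wording are assumptions for the example and not the final message used by the migration scripts.

```python
def jira_migration_comment(issue_number: int, repo: str = "apache/lucene") -> str:
    """Build a Jira comment whose anchor text links to the GitHub issue.

    Uses Jira wiki markup for links: [anchor text|URL].
    The repo default and wording are illustrative assumptions.
    """
    url = f"https://github.com/{repo}/issues/{issue_number}"
    return f"This issue was moved to [GitHub issue #{issue_number}|{url}]."


if __name__ == "__main__":
    print(jira_migration_comment(517))
```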
|
Refined the comment message for Jira. See: commit: 3ad68da |
Hi, can you please watch the issue and give comments if needed? I tried to explain our intricate requests, but I am not so confident that I'm doing it well. |
Thanks @mocobeta! |
Infra created this account for issue migration purposes. https://github.com/asfimport. |
I tested it and it works. I'm closing this. Thank you to everyone who gave comments on this. |
We cannot preserve the original Jira issue/comment authors since the GitHub API assumes the caller's account is the author and does not allow callers to change it by any means.
To import/create issues with the GitHub API, you need admin access to the repo, and we developers are not allowed to have it.
Actual migration will be done by infra; it seems a personal account was used for the import job when Lucene.NET project migrated their issues to GitHub. See .NETify the public API where appropriate lucenenet#280.
For example, Spring uses an organization account that is not tied to a person (spring-projects/spring-framework#22178). Can we do the same? What organization account is available to us?