migrate VMs from ORC #13

Open
5 tasks done
mbjones opened this issue Oct 8, 2021 · 16 comments
Labels: maintenance (Security and other maintenance patching)

@mbjones
Member

mbjones commented Oct 8, 2021

VMs that need to be migrated from ORC to UCSB:

  • cn-orc-1 (production, 3.6 TB disk/1.3 TB in use, 70 GB memory, 10 CPUs): Anacapa
  • search-orc-1 (production, ~1 TB disk/6 GB in use, 4 GB memory, 2 CPUs): Anacapa
  • mn-orc-1 (production, 2.2 TB disk/1.6 TB in use, 8 GB memory, 3 CPUs): Anacapa

Good to migrate if possible, but not as critical:

  • cn-sandbox-orc-1 (test, 1.2 TB disk/1.1 TB in use, 39 GB memory, 5 CPUs): NHDC
  • cn-stage-orc-1 (test, 3.7 TB disk/1.2 TB in use, 39 GB memory, 5 CPUs): NHDC

@nickatnceas let's discuss placement of these. If possible, I'd like to move some of these to Anacapa, and others to NHDC (as indicated above). @taojing2002 and @datadavev can coordinate the moves on the DataONE side.
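To help with the placement discussion, here is a rough sketch that totals the allocated resources per target site from the two lists above. The numbers are copied from this comment; the data layout and the `totals_by_site` helper are just for illustration, and the totals ignore any hypervisor overhead or headroom a real host would need.

```python
from collections import defaultdict

# (name, target site, allocated disk in TB, memory in GB, CPUs), as listed above.
# search-orc-1 is listed as "~1 TB", so 1.0 is an approximation.
VMS = [
    ("cn-orc-1", "Anacapa", 3.6, 70, 10),
    ("search-orc-1", "Anacapa", 1.0, 4, 2),
    ("mn-orc-1", "Anacapa", 2.2, 8, 3),
    ("cn-sandbox-orc-1", "NHDC", 1.2, 39, 5),
    ("cn-stage-orc-1", "NHDC", 3.7, 39, 5),
]


def totals_by_site(vms):
    """Sum allocated disk, memory, and CPUs for each target site."""
    totals = defaultdict(lambda: {"disk_tb": 0.0, "memory_gb": 0, "cpus": 0})
    for _name, site, disk_tb, memory_gb, cpus in vms:
        entry = totals[site]
        # Round after each addition to keep float noise out of the disk total.
        entry["disk_tb"] = round(entry["disk_tb"] + disk_tb, 1)
        entry["memory_gb"] += memory_gb
        entry["cpus"] += cpus
    return dict(totals)


if __name__ == "__main__":
    for site, total in totals_by_site(VMS).items():
        print(site, total)
```

With the figures above, the Anacapa group needs roughly 6.8 TB of disk, 82 GB of memory, and 15 CPUs, and the NHDC group roughly 4.9 TB, 78 GB, and 10 CPUs.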

mbjones added the "maintenance" (Security and other maintenance patching) label Oct 8, 2021
@nickatnceas

nickatnceas commented Oct 8, 2021

@mbjones as of right now we don't have any VM hosts at Anacapa with the required specs, but we can move an older host or two from the NHDC.

The two VM hosts we are running at Anacapa (Pluto and Io) are 4-core R330/R340 1U servers with limited upgrade potential.

One note I have is that since Anacapa will be routed through campus, most of the planned campus network outages (like the few we had recently related to campus WiFi) will take down both the NHDC and Anacapa at the same time.

mbjones changed the title from "migrate services from ORC" to "migrate VMs from ORC" Oct 8, 2021
@taojing2002

We have some configurations (particularly on the CNs) that use the domain names. It would be great if we could keep those domain names.

@mbjones
Member Author

mbjones commented Oct 9, 2021

@taojing2002 Yes, the domain names can stay the same.

We plan to migrate the 3 production VMs to Poseidon at NHDC, and then move that host to Anacapa once the transfer is complete. @nickatnceas will coordinate with @taojing2002 and @datadavev on shutting down services at needed times.

The other two non-production VMs will go to NHDC.

@nickatnceas

I will need to reconfigure the hardware RAID configuration on Poseidon, which will involve reinstalling the OS. This might take a day or two before we can start transferring data.

@nickatnceas

I have server progress tracking at https://github.nceas.ucsb.edu/NCEAS/Computing/issues/106

I should be able to start initial (online) rsyncs tomorrow, and then offline migrations Thursday or Friday.

@nickatnceas

Initial rsyncs are running for cn-orc-1 and mn-orc-1. They're running pretty slow, at 20-30 MB/sec (each), but should finish tomorrow before noon PT if they keep that speed.

@nickatnceas

cn-orc-1 and search-orc-1 are ready for final migrations. mn-orc-1 is still running its initial rsync.

@nickatnceas

@taojing2002 @datadavev the initial rsyncs are done for cn-orc-1, mn-orc-1, and search-orc-1. Do you want to plan on doing the final migration tomorrow (Friday 10/15)?

I need to do the migrations one VM at a time. It will look something like:

  1. Jing/Dave: Stop as many services as possible on the ORC VM (e.g., PG, Apache, Tomcat)
  2. Nick: Run a final rsync to UCSB
  3. Nick: Update networking/grub/fstab/etc on the UCSB VM
  4. Nick: Boot the UCSB VM
  5. Nick: Change DNS
  6. Jing/Dave: Check and fix DataONE services
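
The steps above can be sketched as a per-VM checklist. This is only an illustration of the ordering (all six steps finish for one VM before the next VM starts); the `Step` class and `checklist` function are hypothetical names, not a tool we actually use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    owner: str
    action: str


# Steps copied from the numbered list above, in order.
CUTOVER_STEPS = [
    Step("Jing/Dave", "Stop as many services as possible on the ORC VM"),
    Step("Nick", "Run a final rsync to UCSB"),
    Step("Nick", "Update networking/grub/fstab/etc on the UCSB VM"),
    Step("Nick", "Boot the UCSB VM"),
    Step("Nick", "Change DNS"),
    Step("Jing/Dave", "Check and fix DataONE services"),
]


def checklist(vms):
    """Return checklist lines, completing every step for one VM before the next."""
    lines = []
    for vm in vms:
        for number, step in enumerate(CUTOVER_STEPS, start=1):
            lines.append(f"[ ] {vm} step {number} ({step.owner}): {step.action}")
    return lines


if __name__ == "__main__":
    for line in checklist(["cn-orc-1", "mn-orc-1", "search-orc-1"]):
        print(line)
```

The VM order in the usage example is arbitrary; the actual order is whatever we agree on for the day.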

I have started on cn-stage-orc-1 and cn-sandbox-orc-1 but they are not ready yet.

@taojing2002

I just talked with Nick. We will first complete the sync for cn-sandbox/stage-orc-1 for testing, then do the final sync for production servers.

@nickatnceas

The initial rsyncs for cn-stage-orc-1 and cn-sandbox-orc-1 are now running. They are transferring at between 15 and 30 MB/sec, and have 1.2 TB and 1.1 TB to transfer, respectively, which will take up to about 22 hours at the slower 15 MB/sec speed.
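
For reference, the estimate works out as below. This is a back-of-the-envelope sketch assuming decimal units (1 TB = 1,000,000 MB) and a constant transfer rate; `transfer_hours` is a hypothetical helper name.

```python
def transfer_hours(size_tb, rate_mb_per_s):
    """Hours needed to move size_tb terabytes at a constant rate_mb_per_s.

    Assumes decimal units: 1 TB = 1,000,000 MB.
    """
    seconds = size_tb * 1_000_000 / rate_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    for size_tb in (1.1, 1.2):
        for rate in (15, 30):
            hours = transfer_hours(size_tb, rate)
            print(f"{size_tb} TB at {rate} MB/sec: {hours:.1f} hours")
```

At 15 MB/sec the 1.2 TB transfer takes about 22.2 hours; at 30 MB/sec it takes about half that.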

We can cut the transfer time in half by deleting /var/postgres-bak on both VMs, but since the transfer is running over the weekend, I don't think it's going to matter. Once the initial transfer is done, subsequent rsyncs won't be affected much by those backup files.

I'm planning to be out on Monday Oct 18, but will be back Tuesday the 19th to do the final migrations.

@taojing2002

taojing2002 commented Oct 15, 2021 via email

@nickatnceas

cn-sandbox-orc-1 is done. DNS has been changed from 160.36.13.152 to 128.111.85.161.

@nickatnceas

cn-stage-orc-1 is done. DNS has been changed from 160.36.13.151 to 128.111.85.167.

@nickatnceas

Two more done:

search-orc-1 DNS changed from 160.36.13.162 to 128.111.85.187
mn-orc-1 DNS changed from 160.36.13.148 to 128.111.85.183
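
A quick way to confirm the DNS changes have propagated is sketched below. The old and new addresses are the ones reported in this thread; the domain suffix is not stated here, so it is passed in as a parameter and `example.org` in the usage comment is only a placeholder. `cutover_status` is a hypothetical helper, and the resolver is injectable so the logic can be checked without network access.

```python
import socket

# Old (ORC) and new (UCSB) addresses as reported in this thread.
CUTOVERS = {
    "cn-sandbox-orc-1": ("160.36.13.152", "128.111.85.161"),
    "cn-stage-orc-1": ("160.36.13.151", "128.111.85.167"),
    "search-orc-1": ("160.36.13.162", "128.111.85.187"),
    "mn-orc-1": ("160.36.13.148", "128.111.85.183"),
}


def cutover_status(host, domain, resolve=socket.gethostbyname):
    """Classify what the host name currently resolves to.

    Returns "migrated" (new UCSB address), "stale" (old ORC address),
    "unexpected" (some other address), or "unresolved" (lookup failed).
    """
    old_ip, new_ip = CUTOVERS[host]
    try:
        current = resolve(f"{host}.{domain}")
    except OSError:  # socket.gaierror is a subclass of OSError
        return "unresolved"
    if current == new_ip:
        return "migrated"
    if current == old_ip:
        return "stale"
    return "unexpected"


# Usage (placeholder domain; substitute the real one):
#   cutover_status("mn-orc-1", "example.org")
```

A "stale" result after the change most likely means a resolver is still serving a cached record.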

@nickatnceas

cn-orc-1 was migrated last night. All VMs from ORC are now running in the UCSB NHDC.

Poseidon, hosting the three production VMs, is planned to move to Anacapa. If it moves before the UCSB VPN is installed, ports will need to be opened in the NAT. We have a single public IP address in our NAT config, so we may need to use non-standard ports (e.g., port 22 for SSH is already in use, so the ORC VMs may need to use ports 2222, 2223, 2224, etc.).
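
As a sketch of what the port plan could look like behind a single public IP: each VM gets its own external port forwarded to its internal port 22. The VM-to-port assignment and the `plan_ssh_forwards` helper are hypothetical; only the example ports (2222, 2223, 2224) and the fact that port 22 is taken come from the note above.

```python
def plan_ssh_forwards(vms, first_port=2222, reserved=(22,)):
    """Assign each VM a distinct external port to forward to its internal port 22.

    Ports listed in `reserved` are already in use on the public IP and are skipped.
    """
    plan = {}
    port = first_port
    for vm in vms:
        while port in reserved:
            port += 1
        plan[vm] = port
        port += 1
    return plan


if __name__ == "__main__":
    for vm, port in plan_ssh_forwards(["cn-orc-1", "mn-orc-1", "search-orc-1"]).items():
        print(f"public:{port} -> {vm}:22")
```

The same approach would apply to any other public-facing service port that more than one VM needs.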

If Poseidon is moved after the VPN is installed, the VMs will get public IPs in the 128.111.196.0/23 subnet, and all traffic will be routed through the UCSB campus.
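
For anyone checking addresses later, a /23 covers 512 addresses, here 128.111.196.0 through 128.111.197.255. A minimal check with Python's standard `ipaddress` module (the constant and function names are illustrative):

```python
import ipaddress

# Subnet quoted above for the VMs once the VPN is installed.
ANACAPA_PUBLIC = ipaddress.ip_network("128.111.196.0/23")


def in_anacapa_public(ip):
    """Return True if the address falls inside 128.111.196.0/23."""
    return ipaddress.ip_address(ip) in ANACAPA_PUBLIC


if __name__ == "__main__":
    print(ANACAPA_PUBLIC.num_addresses)
    print(in_anacapa_public("128.111.197.10"))
    print(in_anacapa_public("128.111.85.183"))
```

The current NHDC addresses in 128.111.85.x fall outside this range, so the VMs will need new DNS records again if they move.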

@nickatnceas

I moved the three production ORC VMs from Poseidon to Aurora-HW on November 17th.

Aurora-HW is still housed in the NHDC, and is configured to run these VMs off local storage with file-based VM disk images. It has more memory and disk resources and faster CPUs than Poseidon, and should not have any trouble hosting the VMs.

I moved Poseidon to Anacapa and it has been reconfigured and is now capable of running the VMs whenever we want.

Anacapa is still NAT'd, and any public-facing ports will need to be port-forwarded through a single public IP address. We expect the NAT to be removed in the spring of 2022, at which point all traffic from Anacapa will be routed through the UCSB campus.
