No connections reported after upgrading to version v0.23.0 from version v0.22.3 #2182

Open
2 of 4 tasks
simonlock opened this issue Oct 9, 2024 · 14 comments
Labels
bug Something isn't working

Comments

@simonlock

Is this a support request?

  • This is not a support request

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

OS: Debian GNU/Linux 12 (bookworm) x86_64

After the upgrade, all Tailscale nodes show as offline (Connected: offline) when running

sudo headscale nodes list

and

https://headscale.domainname.com/windows

is also inaccessible (I am using tls_letsencrypt_challenge_type: TLS-ALPN-01)

The service runs on a restart

sudo systemctl restart headscale.service

I have migrated my config file to align with your new example config.
I have migrated my acl.yml policy file to the new huJSON format (acl.hujson).
I have tried disabling the use of ACLs by setting path: "" under the policy.
I have tried preventing Headscale from managing DNS by setting all fields under dns to empty values.
I’ve also tried disabling UFW.
I am using the latest version of Tailscale on all of my nodes.

However, all my attempts have failed. When I roll back to v0.22.3, everything works.

Are there any known issues with using v0.23.0 on Debian 12?
Please could you suggest where I might be going wrong?

Thanks in advance.

Expected Behavior

To continue working once upgraded to v0.23.0

Steps To Reproduce

Install version v0.23.0 on Debian 12

Environment

- OS: Debian GNU/Linux 12 (bookworm) x86_64
- Headscale version: v0.23.0
- Tailscale version: 1.74.1

Runtime environment

  • Headscale is behind a (reverse) proxy
  • Headscale runs in a container

Anything else?

No response

@simonlock simonlock added the bug Something isn't working label Oct 9, 2024
@nblock
Collaborator

nblock commented Oct 9, 2024

The service runs on a restart
sudo systemctl restart headscale.service

Can you paste the output of sudo systemctl status headscale.service and sudo journalctl -u headscale.service -f, please?

Headscale is probably not running or not listening. Can you verify with sudo ss -tlen, please?
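
For anyone who prefers to script the listening check, here is a minimal Python sketch (not part of headscale). It only tests whether something accepts TCP connections on a given port, which is a rough stand-in for reading the ss output. The function name `is_listening` and the ports in the example are assumptions; use the port from your own listen_addr.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Example ports only (assumption); replace with the port from listen_addr.
    for port in (443, 8080):
        state = "open" if is_listening("127.0.0.1", port) else "closed"
        print(f"127.0.0.1:{port} is {state}")
```

A closed port here only says that nothing accepted the connection; the journal is still needed to see why.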

Are there any known issues with using v0.23.0 on Debian 12?

No, at least I'm not aware of it.

@matsstralbergiis

I did just upgrade from 0.22.3 to 0.23.0 and have the same problem.

The service is dead, and the log looks like this:
Oct 09 07:53:50 maja headscale[624]: 2024-10-09T07:53:50Z ERR Failed to fetch machine from the database with node key: nodekey:abc... handler=NoisePollNetMap
Oct 09 07:53:50 maja headscale[624]: 2024-10-09T07:53:50Z ERR error getting routes error="sql: database is closed"
Oct 09 07:53:50 maja headscale[624]: 2024-10-09T07:53:50Z ERR Error listing users error="sql: database is closed"
Oct 09 07:53:50 maja headscale[624]: 2024-10-09T07:53:50Z ERR Error listing users error="sql: database is closed"

and

Oct 09 07:55:18 maja systemd[1]: headscale.service: State 'stop-sigterm' timed out. Killing.
Oct 09 07:55:18 maja systemd[1]: headscale.service: Killing process 624 (headscale) with signal SIGKILL.
Oct 09 07:55:18 maja systemd[1]: headscale.service: Failed to kill control group /system.slice/headscale.service, ignoring: Invalid argument
Oct 09 07:55:18 maja systemd[1]: headscale.service: Main process exited, code=killed, status=9/KILL
Oct 09 07:55:18 maja systemd[1]: headscale.service: Failed with result 'timeout'.
Oct 09 07:55:18 maja systemd[1]: Stopped headscale.service - headscale coordination server for Tailscale.
Oct 09 07:55:18 maja systemd[1]: headscale.service: Consumed 2h 39min 9.793s CPU time, 57.5M memory peak, 0B memory swap peak.

I kept my existing config file during the upgrade, as that was the default option.

@nblock
Collaborator

nblock commented Oct 9, 2024

I did just upgrade from 0.22.3 to 0.23.0 and have the same problem.

What happens if you restart headscale?

@matsstralbergiis

matsstralbergiis commented Oct 9, 2024

root@maja:/home/sysman# systemctl restart headscale
root@maja:/home/sysman# systemctl status headscale
headscale.service - headscale coordination server for Tailscale
Loaded: loaded (/usr/lib/systemd/system/headscale.service; disabled; preset: enabled)
Active: activating (auto-restart) (Result: exit-code) since Wed 2024-10-09 09:39:11 UTC; 2s ago
Process: 891 ExecStart=/usr/bin/headscale serve (code=exited, status=1/FAILURE)
Main PID: 891 (code=exited, status=1/FAILURE)
CPU: 31ms

Oct 09 09:39:11 maja systemd[1]: headscale.service: Main process exited, code=exited, status=1/FAILURE
Oct 09 09:39:11 maja systemd[1]: headscale.service: Failed with result 'exit-code'.

@nblock
Collaborator

nblock commented Oct 9, 2024

headscale.service: Failed with result 'exit-code'.

and the corresponding logs from the journal?

@matsstralbergiis

I guess this is what you are asking for:

Oct 09 09:53:59 maja systemd[1]: headscale.service: Main process exited, code=exited, status=1/FAILURE
Oct 09 09:53:59 maja systemd[1]: headscale.service: Failed with result 'exit-code'.
Oct 09 09:54:04 maja systemd[1]: headscale.service: Scheduled restart job, restart counter is at 177.
Oct 09 09:54:04 maja systemd[1]: Started headscale.service - headscale coordination server for Tailscale.
Oct 09 09:54:04 maja headscale[2221]: 2024-10-09T09:54:04Z FTL
Oct 09 09:54:04 maja headscale[2221]: WARN: The "dns_config.override_local_dns" configuration key is deprecated and has been removed. Please see the changelog for more details.
Oct 09 09:54:04 maja headscale[2221]: WARN: The "dns_config.magic_dns" configuration key is deprecated. Please use "dns.magic_dns" instead. "dns_config.magic_dns" has been removed.
Oct 09 09:54:04 maja headscale[2221]: WARN: The "dns_config.base_domain" configuration key is deprecated. Please use "dns.base_domain" instead. "dns_config.base_domain" has been removed.
Oct 09 09:54:04 maja headscale[2221]: WARN: The "dns_config.nameservers" configuration key is deprecated. Please use "dns.nameservers.global" instead. "dns_config.nameservers" has been removed.
Oct 09 09:54:04 maja headscale[2221]: WARN: The "dns_config.domains" configuration key is deprecated. Please use "dns.search_domains" instead. "dns_config.domains" has been removed.
Oct 09 09:54:04 maja headscale[2221]: FATAL: The "acl_policy_path" configuration key is deprecated. Please use "policy.path" instead. "acl_policy_path" has been removed.
Oct 09 09:54:04 maja systemd[1]: headscale.service: Main process exited, code=exited, status=1/FAILURE
Oct 09 09:54:04 maja systemd[1]: headscale.service: Failed with result 'exit-code'.

@matsstralbergiis

I just read the changelog. I will probably solve this by myself. I will comment here how it goes.

@nblock
Collaborator

nblock commented Oct 9, 2024

Yes. It seems the configuration needs to be adjusted for 0.23:

Oct 09 09:54:04 maja headscale[2221]: FATAL: The "acl_policy_path" configuration key is deprecated. Please use "policy.path" instead. "acl_policy_path" has been removed.
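
The deprecation messages in the journal follow a regular pattern, so they can be pulled out mechanically. The following is a rough Python sketch, not an official tool. The regular expression is derived only from the log lines quoted in this thread, and the name `find_deprecations` is made up for illustration; messages worded differently will not match.

```python
import re

# Pattern inferred from the messages quoted in this thread (assumption), e.g.:
#   WARN: The "dns_config.magic_dns" configuration key is deprecated. Please use "dns.magic_dns" instead.
#   FATAL: The "acl_policy_path" configuration key is deprecated. Please use "policy.path" instead.
DEPRECATION = re.compile(
    r'(?P<level>WARN|FATAL): The "(?P<old>[^"]+)" configuration key is deprecated'
    r'(?:\. Please use "(?P<new>[^"]+)" instead)?'
)


def find_deprecations(lines):
    """Return (level, old_key, new_key_or_None) for each deprecation message."""
    found = []
    for line in lines:
        match = DEPRECATION.search(line)
        if match:
            # "new" is None when the message names no replacement key.
            found.append((match["level"], match["old"], match["new"]))
    return found


if __name__ == "__main__":
    # Two lines copied from the journal output earlier in this thread.
    sample = [
        'headscale[2221]: WARN: The "dns_config.domains" configuration key is '
        'deprecated. Please use "dns.search_domains" instead.',
        'headscale[2221]: FATAL: The "acl_policy_path" configuration key is '
        'deprecated. Please use "policy.path" instead.',
    ]
    for level, old, new in find_deprecations(sample):
        print(f"{level}: {old} -> {new or '(removed, see changelog)'}")
```

In the journal quoted in this thread, the FATAL entry is the last message before the process exits, so it is the one to fix first.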

@matsstralbergiis

I took the sample config and applied the changes from my old one.

Now it works perfectly.

Sorry to bother you.

Thanks for an excellent product!

@simonlock
Author

Hi @nblock

This is the output of sudo journalctl -u headscale.service -f

Oct 09 20:40:21 headscale.mydomain.com headscale[32694]: WARN: The "dns.use_username_in_magic_dns" configuration key is deprecated and has been removed. Please see the changelog for more details.
Oct 09 20:40:21 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:21+01:00 WRN Warning: when using tls_letsencrypt_hostname with TLS-ALPN-01 as challenge type, headscale must be reachable on port 443, i.e. listen_addr should probably end in :443
Oct 09 20:40:21 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:21+01:00 INF Opening database database=sqlite3 path=/var/lib/headscale/db.sqlite
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 WRN No IPs found with the alias user
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 WRN No IPs found with the alias user
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 INF Setting up a DERPMap update worker frequency=86400000
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 INF Enabling remote gRPC at 0.0.0.0:50443
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 INF listening and serving gRPC on: 0.0.0.0:50443
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 INF listening and serving HTTP on: 0.0.0.0:8080
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 INF listening and serving debug and metrics on: 0.0.0.0:9090

This line appears to have been the issue:
Oct 09 20:40:21 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:21+01:00 WRN Warning: when using tls_letsencrypt_hostname with TLS-ALPN-01 as challenge type, headscale must be reachable on port 443, i.e. listen_addr should probably end in :443

So in the configuration file /etc/headscale/config.yml

changing

# For production:
# listen_addr: 0.0.0.0:8080
listen_addr: 0.0.0.0:8080

to

# For production:
# listen_addr: 0.0.0.0:8080
listen_addr: 0.0.0.0:443

Solved the connection error and now all nodes are connected.
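
The warning quoted above boils down to a simple rule, sketched here in Python for illustration. This is not headscale's own validation code. The function name `alpn_port_mismatch` is hypothetical, and the check only mirrors the wording of the warning (TLS-ALPN-01 selected and listen_addr not ending in :443).

```python
def alpn_port_mismatch(listen_addr: str, challenge_type: str) -> bool:
    """Return True if TLS-ALPN-01 is selected but listen_addr is not on port 443."""
    # Take everything after the last colon as the port.
    port = listen_addr.rsplit(":", 1)[-1]
    return challenge_type == "TLS-ALPN-01" and port != "443"


if __name__ == "__main__":
    # The two listen_addr values from the config snippets above.
    for addr in ("0.0.0.0:8080", "0.0.0.0:443"):
        flagged = alpn_port_mismatch(addr, "TLS-ALPN-01")
        print(f"{addr}: {'mismatch' if flagged else 'ok'}")
```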

In version 0.22.3 listen_addr: 0.0.0.0:8080 worked without issue and I also received valid tls certs. Do you know if this is new expected behavior?

@nblock
Collaborator

nblock commented Oct 10, 2024

Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 WRN No IPs found with the alias user
Oct 09 20:40:22 headscale.mydomain.com headscale[32694]: 2024-10-09T20:40:22+01:00 WRN No IPs found with the alias user

There seems to be an issue with your ACL, too.

In version 0.22.3 listen_addr: 0.0.0.0:8080 worked without issue and I also received valid tls certs. Do you know if this is new expected behavior?

I don't know. @kradalby, what do you think? As per #2164 (comment), it is strongly recommended to use HTTPS on 443.

@simonlock
Author

Thanks, @nblock, for pointing out the ACL issue. After searching online, I cannot find any other reference to users setting headscale to listen on 0.0.0.0:443. Could this be related to the use of tls_letsencrypt_challenge_type: TLS-ALPN-01?

@devz3r0

devz3r0 commented Nov 3, 2024

I ran into the same issue:

WARN: The "dns_config.override_local_dns" configuration key is deprecated and has been removed.

That is of course fine, as it is stated in the changelog, and I fixed it.
However, if the service will refuse to start, please log it as an error or critical message instead of a warning, as that would speed up troubleshooting.

So it would be nice if it stated:

ERR: The "dns_config.override_local_dns" configuration key is deprecated and has been removed.

or

CRIT: The "dns_config.override_local_dns" configuration key is deprecated and has been removed.

A warning suggests that you should take a look, not that you must take a look.

@dgrr

dgrr commented Nov 25, 2024

I also get the error No IPs found with the alias. The ACL is the same.
It seems to happen with a rule like { "action": "accept", "src": ["user"], "dst": ["user:*"] }, which allows users to access their own devices.
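
To narrow down which alias triggers the message, one option is to list the aliases a rule refers to and compare them with the user names Headscale knows about. The Python sketch below does only that. It assumes the warning means an alias did not resolve to any node, which this thread does not confirm. The helper names and the `known_users` set are hypothetical, and groups, tags, wildcards and IP addresses are not treated specially.

```python
import json


def rule_aliases(rule: dict) -> set:
    """Collect the aliases named in a rule's src and dst entries."""
    aliases = set(rule.get("src", []))
    for dest in rule.get("dst", []):
        # dst entries look like "alias:ports"; drop the part after the last colon.
        aliases.add(dest.rsplit(":", 1)[0])
    return aliases


def unknown_aliases(rule: dict, known_users: set) -> set:
    """Return aliases in the rule that are not in known_users."""
    return rule_aliases(rule) - known_users


if __name__ == "__main__":
    # The rule quoted above; "alice" is a made-up user name for illustration.
    rule = json.loads('{ "action": "accept", "src": ["user"], "dst": ["user:*"] }')
    print(sorted(unknown_aliases(rule, {"alice"})))
```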
