
Services are accessible even if their namespace doesn’t exist locally #605

Closed
skitt opened this issue Aug 20, 2021 · 12 comments · Fixed by #1289
Labels: bug (Something isn't working), priority:medium

skitt (Member) commented Aug 20, 2021

What happened:

In a two-cluster setup running Lighthouse (devel), I set up a namespace and a service in cluster2, and exported the service. The namespace didn't exist in cluster1, but even so, the exported service was accessible from cluster1.

What you expected to happen:

Given that the namespace containing the service doesn’t exist in cluster1, the service shouldn’t be imported there.

How to reproduce it (as minimally and precisely as possible):

  1. make deploy from Lighthouse or the operator
  2. create a namespace in one cluster only
  3. create a service there
  4. export it
  5. check that it can be resolved and accessed from the other cluster (see the command sketch below)
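
For illustration, a hedged command-level sketch of these steps, assuming two kubeconfig contexts named cluster1 and cluster2 and a hypothetical ns1 namespace with an nginx service (none of these names come from the issue):

```sh
# 2. Create the namespace in cluster2 only.
kubectl --context cluster2 create namespace ns1

# 3. Create a service there.
kubectl --context cluster2 -n ns1 create deployment nginx --image=nginx
kubectl --context cluster2 -n ns1 expose deployment nginx --port=80

# 4. Export it (run with the kubeconfig pointing at cluster2).
subctl export service nginx --namespace ns1

# 5. From cluster1, which has no ns1, the name should NOT resolve once this
#    issue is fixed; before the fix, it did.
kubectl --context cluster1 run tmp --rm -it --restart=Never --image=busybox \
  -- nslookup nginx.ns1.svc.clusterset.local
```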
skitt added the bug (Something isn't working) label Aug 20, 2021
danibachar added commits to danibachar/lighthouse that referenced this issue (Oct 6–8, 2021)
danibachar (Contributor) commented Oct 8, 2021

@skitt - what would the expected behavior be in the case where the DNS query specifies the destination cluster (the one hosting the relevant namespace)?
For example:
Service s1 is in cluster cluster1, in namespace ns1.
Service s2 is on cluster cluster2, which does not have namespace ns1 deployed.
It issues the following DNS query:
`cluster1.s1.ns1.svc.clusterset.local`
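
As a hedged illustration of such a cluster-qualified lookup from a pod in cluster2 (the pod image and all names are illustrative, not from the issue):

```sh
# Query the cluster-qualified MCS name from cluster2, which lacks ns1.
kubectl --context cluster2 run tmp --rm -it --restart=Never --image=busybox \
  -- nslookup cluster1.s1.ns1.svc.clusterset.local
```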

I'm not quite sure why we would even block access. Do you think it is a security risk?

danibachar added commits to danibachar/lighthouse that referenced this issue (Oct 8, 2021)
skitt (Member, Author) commented Oct 11, 2021

> Service s1 is in cluster cluster1, in namespace ns1.
> Service s2 is on cluster cluster2, which does not have namespace ns1 deployed.
> It issues the following DNS query: `cluster1.s1.ns1.svc.clusterset.local`

See the MCS API:

> A multi-cluster service will be imported only by clusters in which the service's namespace exists.

So cluster2 shouldn’t import s1, and its DNS server shouldn’t be aware of it, so looking up `cluster1.s1.ns1.svc.clusterset.local` should fail.

> I'm not quite sure why we would even block access. Do you think it is a security risk?

I’m not sure it’s a security risk, but it could surprise the administrator: in cluster2, if the administrator hasn’t created ns1, one wouldn’t expect services in ns1 to be accessible.

vthapar (Contributor) commented Nov 24, 2021

The ideal fix for this issue would come with #214. Namely, we would rely on the aggregated import for DNS resolution, but we wouldn't aggregate the import if the namespace is not present locally.

donaldh commented Feb 10, 2022

@vthapar I think #214 is independent of this issue. #214 is concerned with merging multiple ServiceImports, which the spec allows; this issue is concerned with not allowing a ServiceImport when the relevant namespace is not present in the candidate importing cluster.

A related issue: the KEP strongly implies that the ServiceImport object should be created in the namespace the Service was exported from, and Submariner currently diverges from this; the ServiceImports are in the submariner-operator namespace. If we were to create ServiceImports in the correct namespace, this issue could be resolved as part of that work. I am not familiar enough with the code to know whether syncing to multiple namespaces would end up creating sync or permission problems, though.
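
As a hedged illustration of that divergence (the ns1 namespace is hypothetical, reused from the example above):

```sh
# Where Submariner placed ServiceImports at the time of this discussion:
kubectl get serviceimports -n submariner-operator

# Where the KEP implies they should live: the exported service's own namespace.
kubectl get serviceimports -n ns1
```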

Another detail from the KEP:

> An implementation may or may not decide to create missing namespaces automatically, that behavior is out of scope of this spec.

My guess is that it could be a policy / broker configuration option to auto-create namespaces. Enabling it would mean that namespace existence could not be used as a way of controlling which clusters services get imported to.

skitt (Member, Author) commented Feb 10, 2022

> A related issue: the KEP strongly implies that the ServiceImport object should be created in the namespace the Service was exported from, and Submariner currently diverges from this; the ServiceImports are in the submariner-operator namespace. If we were to create ServiceImports in the correct namespace, this issue could be resolved as part of that work. I am not familiar enough with the code to know whether syncing to multiple namespaces would end up creating sync or permission problems, though.

Yes, creating the ServiceImport objects in the right namespace would fix this, and it would also simplify non-MCS scenarios where services are imported into namespaces other than the service's own.

donaldh commented Feb 10, 2022

I have a potential patch to move ServiceImports into the appropriate namespace. I'll test to see whether any more work is needed.

vthapar (Contributor) commented Feb 14, 2022

@donaldh You're correct that it is a different issue from #214, but our intention was to implement namespace checks when we aggregate service imports. Namespace checks will be done implicitly as part of creating the aggregated ServiceImport in the service namespace. Whether we decide to auto-create the namespace or set the ServiceImport status to an error, either can be done when aggregating ServiceImports. ServiceImports in the broker namespace are more a result of the way we distribute ServiceImports through the broker. But yes, we can have ServiceImports created in the appropriate namespace without aggregation, and if you can bring in a patch that would be great.
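
To make the implicit check concrete, a hedged sketch of the rule being described, with hypothetical names (ns1, s1, cluster2):

```sh
# Creating the aggregated ServiceImport for s1 inside ns1 means a cluster can
# only hold it if ns1 exists there. A manual equivalent of the check:
if kubectl --context cluster2 get namespace ns1 >/dev/null 2>&1; then
  echo "ns1 exists locally: cluster2 may import s1"
else
  echo "ns1 absent locally: cluster2 should not import s1"
fi
```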

Note that one of the reasons we deviate from the spec is that we don't implement a ClusterSetIP as the spec requires.

donaldh commented Feb 14, 2022

@vthapar Is your intention that the ServiceImports would be synchronised to the submariner-operator namespace and then merged into a single ServiceImport in the target namespace?

vthapar (Contributor) commented Feb 15, 2022

@donaldh Yes. The other way would be to aggregate at the source in the correct namespace, keep only one copy on the broker, and sync to the origin namespace on the remotes. Both have pros and cons.

  1. Aggregate at source: easy to catch conflicts and update status on the ServiceExport, and fewer ServiceImports are synced across clusters. But it might lead to too many conflicts when multiple clusters write to the same object at the same time. That is not an issue during normal operation, but it will be during restarts/upgrades or when deploying on multiple clusters using automation. It mostly means it will take longer to sync everything on startup/upgrade.
  2. Aggregate at destination: all aggregation is done locally, so there are no conflicts from multiple clusters writing to the same object. But a large number of ServiceImports will be distributed across clusters, and updating the relevant ServiceExport on a conflict will require going through the list of ServiceExports.

I do think 1 will scale better in the long run, with less data being synced across clusters.

stale bot commented Aug 1, 2022

This issue has been automatically marked as stale because it has not had activity for 60 days. It will be closed if no further activity occurs. Please make a comment if this issue/pr is still valid. Thank you for your contributions.

stale bot added the wontfix (This will not be worked on) label Aug 1, 2022
skitt removed the wontfix (This will not be worked on) label Aug 1, 2022
stale bot commented Dec 21, 2022

This issue has been automatically marked as stale because it has not had activity for 60 days. It will be closed if no further activity occurs. Please make a comment if this issue/pr is still valid. Thank you for your contributions.

stale bot added the wontfix (This will not be worked on) label Dec 21, 2022
stale bot closed this as completed Dec 31, 2022
skitt reopened this Jan 2, 2023
stale bot removed the wontfix (This will not be worked on) label Jan 2, 2023
tpantelis (Contributor) commented
The aggregated ServiceImports are now created in the service namespace, which inherently fixes this issue. However, we should verify the exact behavior in this case to ensure there's no detrimental effect. An E2E test would be ideal.
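
A hedged sketch of a manual verification along those lines (contexts and names are hypothetical, reusing the ns1/nginx example above):

```sh
# After the fix, a cluster without the service's namespace should hold no
# aggregated ServiceImport for it, and the clusterset name should not resolve.
kubectl --context cluster1 get serviceimports -n ns1   # fails: namespace absent
kubectl --context cluster1 run tmp --rm -it --restart=Never --image=busybox \
  -- nslookup nginx.ns1.svc.clusterset.local           # expected: NXDOMAIN
```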

tpantelis self-assigned this Apr 28, 2023
dfarrell07 moved this to Todo in Submariner 0.16 May 9, 2023
tpantelis added a commit to tpantelis/lighthouse that referenced this issue Jun 20, 2023 (the commit message is preserved in the merged commit below)
tpantelis moved this from Todo to In Review in Submariner 0.16 Jun 20, 2023
tpantelis added further commits to tpantelis/lighthouse that referenced this issue (Jun 20 and Jun 22, 2023)
dfarrell07 pushed a commit that referenced this issue Jul 5, 2023
...to verify an imported service is not accessible if the
service namespace doesn't exist locally.

Since the service is created in the framework's generated namespace,
the test deletes this namespace from the source cluster and creates
another namespace to contain the netshoot deployment.

The test was placed in a new folder named "internal" so it isn't run
with 'subctl verify'. We should only need to verify this functionality
in CI plus it has the overhead of an additional namespace.

Fixes #605

Signed-off-by: Tom Pantelis <tompantelis@gmail.com>
github-project-automation bot moved this from In Review to Done in Submariner 0.16 Jul 5, 2023