ECF connection lost between two bundle. #170

Open
bicikbico opened this issue Feb 6, 2025 · 4 comments

@bicikbico

I'm using the timeService.consumer and timeService.host examples from your repository, but no matter how I run it on any computer, exactly after one hour the connection between the two services gets disconnected, and the host stops providing service. What could be the reason for this?

@scottslewis
Contributor

Hi @bicikbico

> What could be the reason for this?

I don't immediately know, but I can tell you that there is nothing intentional in the remote services implementation that would account for this behavior. My first guess is either your network's policies or anti-virus software. I am going to ask you some questions about your environment and what you are seeing so that I can try to reproduce it myself and get to the bottom of it (and fix it if it's a bug in ECF).

Questions about your environment

  1. What are your host and consumer processes running on? (version of java, version of Eclipse if running in Eclipse, operating system)

  2. Which example app are you running and how are you starting it? What system or service props are you setting? I'm assuming you are using the default ECF generic provider, but please verify that. As I remember, this example app only makes a single remote call...and doesn't repeat. Are you starting/stopping the consumer multiple times, or using the service in some other app of your own?

  3. What happens after 1 hour? By that I mean...do remote calls fail, or does a new consumer just fail to connect? Do you have the logs/output of both the host and consumer? That output would be very helpful.

Let's go from there. Thanks.

@bicikbico
Author

First of all, thank you for your response. My answers to your questions are as follows:

  1. I have started using your timeservice.async, timeservice.consumer.ds, and timeservice.host examples with very minimal modifications. As you mentioned, I am making calls not just once but at a rate of one per second. I am using OSGi Declarative Services on both the host and consumer sides. My host service runs as a standalone desktop application, while the consumer side runs on Eclipse. My Eclipse version is 2021-09 (4.21.0), my Java version is 1.8.0_192, and both the host and consumer are running on the same computer.

  2. Here are my properties for both sides:
    For the host side:
    @Component(immediate = true,
    property = {
    "service.exported.interfaces=*",
    "service.exported.configs=ecf.generic.server",
    "ecf.generic.server.port=55012",
    "ecf.generic.server.hostname=localhost" })
    For the client side:
    @Component(immediate = true,
    property = {
    "service.exported.interfaces=*",
    "service.exported.configs=ecf.generic.server",
    "ecf.generic.server.port=55011",
    "ecf.generic.server.hostname=localhost" })

  3. The issue that occurs after one hour, based on the logs from the consumer side, is that the consumer's unbind method is called for the service provided by the host. Interestingly, if I don't set the existing service reference to null in the unbind method, communication continues without any issues.
    The problem is that if I stop the consumer and restart it, it can never reconnect to the host. This happens precisely one hour after starting. The same behavior occurs on both my computer and my colleague's computer.

We also implemented RemoteServiceEventListener to analyze what happens exactly at the moment of disconnection. At the exact time when the service is unbound, we receive the messages "import_unregistration" and "export_unregistration".

Any help would be appreciated, thank you.

@bicikbico
Author

The class diagram is shown below:

Image

@scottslewis
Contributor

> First of all, thank you for your response. My answers to your questions are as follows:
>
> 1. I have started using your timeservice.async, timeservice.consumer.ds, and timeservice.host examples with very minimal modifications. As you mentioned, I am making calls not just once but at a rate of one per second. I am using OSGi Declarative Services on both the host and consumer sides. My host service runs as a standalone desktop application, while the consumer side runs on Eclipse. My Eclipse version is 2021-09 (4.21.0), my Java version is 1.8.0_192, and both the host and consumer are running on the same computer.

Is the operating system Windows 64-bit or something else?

> 2. Here are my properties for both sides:
>    **For the host side:**
>    @Component(immediate = true,
>    property = {
>    "service.exported.interfaces=*",
>    "service.exported.configs=ecf.generic.server",
>    "ecf.generic.server.port=55012",
>    "ecf.generic.server.hostname=localhost" })
>    **For the client side:**
>    @Component(immediate = true,
>    property = {
>    "service.exported.interfaces=*",
>    "service.exported.configs=ecf.generic.server",
>    "ecf.generic.server.port=55011",
>    "ecf.generic.server.hostname=localhost" })

One thing about the above: you should not need any service properties for the client side...that is, you only need these service properties

"service.exported.interfaces=*",
"service.exported.configs=ecf.generic.server",
"ecf.generic.server.port=55012",
"ecf.generic.server.hostname=localhost"

on the host (the remote service exporter). The client typically discovers the remote service (via a data structure called the EndpointDescription) through some network-based or XML-file-based discovery mechanism. I believe that for those examples the discovery mechanism was set to jmdns discovery (which implements the zeroconf protocol on a LAN). Could you try removing those service properties on the client and give it a try?
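
To make the host-only configuration concrete, here is a minimal sketch that builds the same four export properties as a plain `Dictionary`. `HostExportProperties` is a hypothetical helper name, not something from the timeservice examples; the keys and values are the ones quoted above. With programmatic registration, a dictionary like this would be handed to `BundleContext.registerService(...)` on the host only, and the consumer would set none of them.

```java
import java.util.Dictionary;
import java.util.Hashtable;

/** Hypothetical helper: export properties that belong on the host side only. */
final class HostExportProperties {
    private HostExportProperties() {}

    /** Returns the four properties quoted in this thread, as String values. */
    static Dictionary<String, Object> create() {
        Dictionary<String, Object> props = new Hashtable<>();
        // Export every interface the service object implements.
        props.put("service.exported.interfaces", "*");
        // Use the ECF generic provider as the export config.
        props.put("service.exported.configs", "ecf.generic.server");
        // Port and hostname as given for the host in this thread.
        props.put("ecf.generic.server.port", "55012");
        props.put("ecf.generic.server.hostname", "localhost");
        // Not shown: passing props to BundleContext.registerService(...) in the host bundle.
        return props;
    }
}
```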

Would it be possible for me to run/test your code? If you have a public repo available I can get that...or if you like you can just export the relevant projects to a zip file and send it to me at scottslewis at gmail.com. When we get things working I may ask you if I can use it to enhance the timeservice example, but if you don't agree I won't share it at all.

> 3. The issue that occurs after one hour, based on the logs from the consumer side, is that the service provided by the host enters the unbind method on the consumer side. However, the interesting part is that if I don't set the existing service to null in the unbind method, communication continues without any issues.

This is interesting, and makes me think it might be something about the client-side service properties. The reason is that your client-side service properties would, under normal conditions, export a second instance of the TimeService, so it might be that you are running two exported instances of the TimeService and that they are somehow getting confused.
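
If two instances really are in play, one thing worth ruling out on the consumer is an unbind for one instance clearing a reference to another. Below is a minimal sketch of a bind/unbind holder that only clears the field when the unbound object is the one currently held. `TimeService` and `TimeServiceHolder` are hypothetical stand-ins written for illustration (the real example interface may differ), and this does not explain the one-hour unregistration itself.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for the example's time service interface. */
interface TimeService {
    long getCurrentTime();
}

/** Hypothetical consumer-side holder for the bound service reference. */
final class TimeServiceHolder {
    private final AtomicReference<TimeService> current = new AtomicReference<>();

    /** Called when a service instance becomes available. */
    void bind(TimeService service) {
        current.set(service);
    }

    /** Clears the reference only if the unbound instance is the one currently held. */
    void unbind(TimeService service) {
        current.compareAndSet(service, null);
    }

    /** Returns the currently bound service, or null if none is bound. */
    TimeService get() {
        return current.get();
    }
}
```

In a Declarative Services component, `bind` and `unbind` would be the methods referenced by the consumer's service reference; logging the instance identity in both would also show whether one or two instances are being bound.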

> The problem is that if I stop the consumer service and restart it, it can never reconnect to the host again. This exact issue happens precisely one hour after starting. The same behavior occurs on both my computer and my colleague's computer.

When you do this, is there any error message to the log on the failed connect?
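
If nothing shows up in the log, a quick way to separate "the host stopped listening" from "the consumer cannot find the service" is to check whether the host port still accepts TCP connections after the hour has passed. A minimal sketch using only the JDK follows; `PortProbe` is a hypothetical helper, and the port to check is whatever the host was configured with (55012 in the properties above).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical diagnostic helper: checks whether a TCP port accepts connections. */
final class PortProbe {
    private PortProbe() {}

    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean isAccepting(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable.
            return false;
        }
    }
}
```

For example, calling `PortProbe.isAccepting("localhost", 55012, 1000)` before and after the one-hour mark would tell us whether the generic server socket on the host is still open.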

Actually, it would be helpful to have both of the console logs (client and host). Or if you are able to provide me with the code I'll generate it myself.

> We also implemented RemoteServiceEventListener to analyze what happens exactly at the moment of disconnection. At the exact time when the service gets unbind, we receive the messages "import_unregistration" and "export_unregistration".

Are these import and export unregistration messages both on the client side? Or import on the client side and export on the host side? Or something else?

> any help would be appreciated, thank you.

Sure. If you are able to share your code (on public repo or via zip) I think that would be easiest, but if you aren't able to do that I understand. Just let me know. FWIW, I give you my word that I will keep everything confidential.
