jep with pytorch happens errors #399

Closed
pyNpy opened this issue May 11, 2022 · 2 comments

Comments

@pyNpy

pyNpy commented May 11, 2022

Describe the problem

Hello,
In my project I use jep to call Python 3 deep learning code that classifies input text. I run the code below 1000 times in a loop.

In the normal case, the console prints a progress line like:
"Evaluating: 100%|████████████████████████████| 1/1 [00:00<00:00, 1.41it/s]"

But after several iterations (we cannot pin down the exact number; it may be 4, 5, or some other count), the program runs into a problem and the console prints a line like this: "Evaluating: 0%| | 0/1 [00:00<?, ?it/s] "

I traced the Python script down to torch.nn.Module.eval, and I confirmed that execution reaches it by printing the flag string >>>>>>>>>torch.nn.module.eval, as shown in the following picture:
image

About the code

The loop in which Java calls Python 1000 times:

image

Other Java code:

image

**Questions**

  1. Have you ever seen a problem like this?

  2. I suspect the problem may be in PyTorch, but I have no further idea how to trace it,
    because torch.nn.Module.eval is the last Python call I can confirm is reached. Maybe I need to look at the code around the progress bar.

  3. Is it possible that the problem is in jep? I have read somewhere that Python global (static) variables can be affected by multiple threads. However, the Java code that calls the Python code runs in a single process on a single thread, so I do not see how that would apply here.

Last words
The questions above are all I can come up with, and I have not seen this error before. Can you give me some ideas? Thanks a lot.

Environment (please complete the following information):

  • OS Platform, Distribution, and Version: centos 7 Linux localhost.localdomain 3.10.0-1062.el7.x86_64 #1 SMP Wed Aug 7 18:08:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Python Distribution and Version: python3 3.7.9
  • Java Distribution and Version: 1.8
  • Jep Version: 4.0.0
  • Python packages used (e.g. numpy, pandas, tensorflow):
    torch , tensorflow , numpy ,pandas
@bsteffensmeier
Member

Unfortunately your problem does not point to any specific issue we are aware of. I have no solution, but I have a few suggestions for things to try to narrow down the problem.

  1. Try running the same scenario in Python without jep, including looping 1000 times and doing the same operation. If the problem is specific to PyTorch and completely unrelated to jep, this would fail and clearly indicate that jep is not the source of the problem.
  2. Try creating only one shared interpreter and looping 1000 times within that single interpreter, doing the same task. If this is successful, it would indicate there may be a problem related to the way jep cleans up state when an interpreter closes or creates new state when a new interpreter is opened.
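
For the first suggestion, a pure-Python repeat harness could look roughly like the sketch below. It is only an illustration: `run_repeatedly` and `classify_once` are hypothetical names that do not come from the project, and `classify_once` is a placeholder that should be replaced with the real classification call that the Java code invokes through jep.

```python
import time
import traceback


def run_repeatedly(task, iterations=1000):
    """Call task() `iterations` times and collect any exceptions.

    Returns a list of (iteration_index, formatted_traceback) tuples.
    """
    failures = []
    for i in range(iterations):
        # Print before the call and flush, so that if a call stalls
        # the last line on the console shows which iteration it was.
        print(f"iteration {i}: start", flush=True)
        start = time.perf_counter()
        try:
            task()
        except Exception:
            failures.append((i, traceback.format_exc()))
        elapsed = time.perf_counter() - start
        print(f"iteration {i}: done in {elapsed:.3f}s", flush=True)
    return failures


def classify_once():
    # Hypothetical stand-in (assumption): replace the body with the same
    # model loading and evaluation code that is run through jep.
    return "label"


if __name__ == "__main__":
    failures = run_repeatedly(classify_once, iterations=1000)
    print(f"{len(failures)} failed iterations")
    for index, tb in failures:
        print(f"--- iteration {index} ---")
        print(tb)
```

If this loop shows the same stuck progress bar after a few iterations, the cause is likely on the Python side; if it completes all 1000 iterations cleanly, the interaction with jep becomes the more likely place to look.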

@pyNpy
Author

pyNpy commented May 12, 2022

Yes, I will try the ideas you suggested and run more tests. Thanks.

@pyNpy pyNpy closed this as completed May 12, 2022