Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Just find out the time cost of sandbox.eval() method varies pretty hard. #108

Open
liguojing929 opened this issue Jan 5, 2021 · 21 comments

Comments

@liguojing929
Copy link

Sometimes the eval method took only about 1-2 seconds, sometimes took 5-10 seconds, the maximum time I have encountered was 30 seconds.
ps: I used the same script.

@liguojing929 liguojing929 changed the title Just find out the sandbox.eval() method time complexity varies pretty hard. Just find out the time cost of sandbox.eval() method varies pretty hard. Jan 5, 2021
@liguojing929
Copy link
Author

liguojing929 commented Jan 5, 2021

The javascript file is only about 300-400 lines, and the business logic inside is pretty simple, just do the model mapping and transformation. The code is shown below:

public ScriptAdapter create(String script) throws ScriptException {
log.info("initialize sandbox at {}", new Date().toString());
NashornSandbox sandbox = NashornSandboxes.create(script);
sandbox.setMaxCPUTime(MAX_CPU_TIME); // MAX_CPU_TIME = 200
sandbox.setMaxMemory(MAX_MEMORY); // MAX_MEMORY = 1 * 1024 * 1024
sandbox.allowNoBraces(false);
sandbox.setExecutor(Executors.newFixedThreadPool(MAX_THREAD_COUNT)); // MAX_THREAD_COUNT = 10
sandbox.eval(script); // here took quite a long time.
JSAdapter adapter = new JSAdapter(sandbox);
log.info("finish creating sandbox at {}", new Date().toString());
return adapter;
}

Is it possible to reduce the eval time?

@mxro
Copy link
Collaborator

mxro commented Jan 6, 2021

Thank you for reporting this issue!

I assume the same variation doesn't happen when the script is executed without the sandbox?

Is this code executed concurrently?

@liguojing929
Copy link
Author

Thank you for reporting this issue!

I assume the same variation doesn't happen when the script is executed without the sandbox?

Is this code executed concurrently?

Without sandbox and run directly is ok though. The script is quite simple and straightforward. And the code is concurrently btw.

@liguojing929
Copy link
Author

By the way, I just tried one more time today. To run about 45 different cases (not concurrently) and found out there could be accidentally more than 15 minutes just for the eval() method.

@mxro
Copy link
Collaborator

mxro commented Jan 7, 2021

Oh 15 minutes is pretty high variation then! Hmm, very strange.

Maybe one thing to try may be to shutdown the created thread pools? Also since the ThreadPool is only used once, may it be sufficient to keep the max thread count at 1?

Interesting is too that the Max CPU time is not violated? Maybe you could try to remove this or set to a different level?

@liguojing929
Copy link
Author

liguojing929 commented Jan 7, 2021

The code shown above is a prototype. So currently my goal is to handle about 2000 tps concurrency. I then set the values as below:

Max_CPU_Time = 1000 (1 s)
Max_Memory = 1024 * 1024 * 1024 * 5L (5 GB)
Max_Prepared_Statement = 30

And I have a customized Executor:
Core_pool_size = 10 threads
Max_pool_size = 20 threads
Queue_size = 1000

I am not sure if the version of sandbox matters, but when I changed from 0.1.26 back to 0.1.21, it kinda feels slightly better.

Btw I also change the tomcat thread size:
Min-spare as 15.

But it still sometimes suffered the time-spent bump.

@liguojing929
Copy link
Author

liguojing929 commented Jan 7, 2021

Does the version even matter? Cuz I found out that applying 0.1.26 causes more of the problems, while 0.1.21 causes less and 0.1.19 haven't shown this issue by far.

@mxro
Copy link
Collaborator

mxro commented Jan 9, 2021

Hmm very interesting! I had a look for the commits going in for these versions and couldn't really find anything that should impact performance that much. Is the issue only there when running in Tomcat or also when running in unit tests?

Could you see after how many tps it becomes problematic? 2000 tps sounds like it would need a very beefy machine, due to Nashorn being quite heavy.

@liguojing929
Copy link
Author

For version 0.1.21 and above, the very long-time responses did not happen frequently, and not everytime either. But there just exists potential threats. The problems only happen when running under Tomcat, unit tests were fine. I can't see for certain about how many tps the problem would be triggered, this issue shown up randomly. Can't find pattern of it actually.

@asilism
Copy link

asilism commented Jan 11, 2021

I'm in similar situation. (Version is 0.1.26)
And I did change the version of sandbox to 0.1.21, then it didn't happen.

I try to analyze eval method by using java application monitoring tool named Scouter.
I found heavy method point. -> JsSanitizer#injectInterruptionCalls() [5,479ms]

Looking at the Thread-dump, I can see that the repetitive regular expression method is being called.

Because there were not many changes from version 0.1.21 to 0.1.26,
I could traced back the changes related to this method.
So I guess the change "PoisonPil" class had this side effect.

Slowing down occurs especially a lot when there are many comments, ( comments over 50 lines)
Also, the longer the script is, the more it is affected.

It was not reproduced on my personal laptop, but on several servers (linux).

Finally, the most critical is that the CPU or memory is not monitored because it is still before start EvaluateOperation().

@liguojing929
Copy link
Author

I'm in similar situation. (Version is 0.1.26)
And I did change the version of sandbox to 0.1.21, then it didn't happen.

I try to analyze eval method by using java application monitoring tool named Scouter.
I found heavy method point. -> JsSanitizer#injectInterruptionCalls() [5,479ms]

Looking at the Thread-dump, I can see that the repetitive regular expression method is being called.

Because there were not many changes from version 0.1.21 to 0.1.26,
I could traced back the changes related to this method.
So I guess the change "PoisonPil" class had this side effect.

Slowing down occurs especially a lot when there are many comments, ( comments over 50 lines)
Also, the longer the script is, the more it is affected.

It was not reproduced on my personal laptop, but on several servers (linux).

Finally, the most critical is that the CPU or memory is not monitored because it is still before start EvaluateOperation().

Did you run concurrently?

@liguojing929
Copy link
Author

I'm in similar situation. (Version is 0.1.26)
And I did change the version of sandbox to 0.1.21, then it didn't happen.

I try to analyze eval method by using java application monitoring tool named Scouter.
I found heavy method point. -> JsSanitizer#injectInterruptionCalls() [5,479ms]

Looking at the Thread-dump, I can see that the repetitive regular expression method is being called.

Because there were not many changes from version 0.1.21 to 0.1.26,
I could traced back the changes related to this method.
So I guess the change "PoisonPil" class had this side effect.

Slowing down occurs especially a lot when there are many comments, ( comments over 50 lines)
Also, the longer the script is, the more it is affected.

It was not reproduced on my personal laptop, but on several servers (linux).

Finally, the most critical is that the CPU or memory is not monitored because it is still before start EvaluateOperation().

Are you saying that the PoisonPil class ran in different efficiencies based on two versions? But for what I have observed, the codes of PoisonPil class for versions 19 21 26 are the same.

@asilism
Copy link

asilism commented Jan 11, 2021

Oh, sorry for the confusion.
I mean, POISON_PILLS static variable was changed.

I didn't have time to deeply analyze this logic.
but I just check that it took a lot of time in that method.

@asilism
Copy link

asilism commented Jan 11, 2021

Anyway, as liguojing929 said,
it didn't happen when I lowered the version and I decided to use the version down to 0.1.21 on my production servers.

@liguojing929
Copy link
Author

Anyway, as liguojing929 said,
it didn't happen when I lowered the version and I decided to use the version down to 0.1.21 on my production servers.

oh ok, thanks for the clarification. As for my situation, when running concurrently for which the maximum required tps is 2000, I have to choose 0.1.19 to avoid the sandbox eval time issue. And I compared the implementations among version 0.1.19 and the above (21, 26, etc), still couldn't find the evident code block that causes the issue.

@liguojing929
Copy link
Author

@mxro @asilism So the number of lines of comments in script does affect the time of sandbox initialization? If so, how come sometimes it ran fast and sometimes very slowly ? Is it because the regex running time could vary to a large scale in Java?

@mxro
Copy link
Collaborator

mxro commented Jan 30, 2021

I think many regex can run quite long depending on the input, since they can get themselves into extensive backtracking!

@liguojing929
Copy link
Author

I think many regex can run quite long depending on the input, since they can get themselves into extensive backtracking!

Btw, does the sandbox support class in javascript?

@mxro
Copy link
Collaborator

mxro commented Mar 14, 2021

@liguojing929 Not sure but it may not. According to the Oracle docs

The Nashorn engine is an implementation of the ECMAScript Edition 5.1 Language Specification. It also implements many new features introduced in ECMAScript 6 including template strings; let, const, and block scope; iterators and for..of loops; Map, Set, WeakMap, and WeakSet data types; symbols; and binary and octal literals.

https://docs.oracle.com/javase/10/nashorn/introduction.htm#JSNUG140

And I think 5.1 didn't have classes in it.

@liguojing929
Copy link
Author

Is it possible to add log in js so that samdbox can show up some info I want to see? I've tried console.log but it won't recognize this code.

@mxro
Copy link
Collaborator

mxro commented Jun 3, 2021

@liguojing929 Have you tried print('Hello World');?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants