Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[js][java] Performance lost #8040

Closed
flashultra opened this issue Mar 24, 2019 · 23 comments · Fixed by #10687
Closed

[js][java] Performance lost #8040

flashultra opened this issue Mar 24, 2019 · 23 comments · Fixed by #10687
Assignees
Labels
platform-java Everything related to Java platform-javascript Everything related to JS / JavaScript
Milestone

Comments

@flashultra
Copy link
Contributor

I created two identical files - Haxe and java source - and after that compile Haxe to Java target and run via JVM . There is significant difference between Haxe generated java classes and those write on Java . At example :

  1. Haxe - Java takes 1.01 seconds
  2. Pure Java takes 0.40 seconds

Here is all sources and test vectors:

  1. Haxe source : https://gist.github.com/flashultra/8b33d48aca7c48a9a4d4e7d36b0a3d66
  2. Java source : https://gist.github.com/flashultra/498963ec508e9fb099e3e7110fee7f69
@nadako
Copy link
Member

nadako commented Mar 24, 2019

I'm 80% sure it's all the redundant casting going on, but of course we need to look into the generated code and the profiler output. Thanks for the benchmark, we'll look into it as Java target is getting some love for Haxe 4.1

@flashultra
Copy link
Contributor Author

flashultra commented Mar 24, 2019

Here is more results for Haxe targets :

  • Hashlink ( native ) - 0.40 seconds
  • HXCPP ( native ) - 0.54 seconds
  • Javascript - 0.68 seconds
  • Java - 1.02 seconds
  • Hashlink ( hl ) - 1.96 seconds
  • Eval - 18.35 seconds
  • PHP - 29.67 seconds

What is strange for me is Javascript. According this library https://www.npmjs.com/package/bcrypt , JS should produce 2-3 hashes/sec ( with rounds=12 ) , but it fails with only 1.5 hashes / sec. I think JS should be same as Java i.e about 0.40 seconds ( for pure java code)

So what can be wrong ?

  • Not optimized bcrypt algorithm
  • Haxe not optimized JS code
  • Different environment ( PC, NodeJS , V8 etc )

One solution will be to test the current Haxe js implementation VS npm bcrypt . The other will be to convert haxe code to JS manually without using Haxe compiler

@flashultra
Copy link
Contributor Author

According these tests ( https://github.com/dcodeIO/bcrypt.js/wiki/Benchmark ) bcryptjs should run for 0.40 seconds ( similiar to hashlink , java ) . Here is test for 12 rounds ( they compare native C++ with JS )

  • bcrypt sync: 319.607ms
  • bcrypt.js sync: 404.098ms

@Simn
Copy link
Member

Simn commented Mar 25, 2019

If you want to achieve anything with this issue you'll have to profile the generated code to figure out why it's slow. It's Java so I assume there's good tooling for that.

@flashultra
Copy link
Contributor Author

You are right . Here are profile images for Haxe and Java . Hope that helps.

Java profiler images
java1
java2
java3

Haxe profiler images:
haxe1
haxe2
haxe3

@nadako
Copy link
Member

nadako commented Mar 25, 2019

Yeah, looks like it's the int boxing being a bottleneck...

@Simn
Copy link
Member

Simn commented Mar 25, 2019

Looks like nadako was right.

@Simn
Copy link
Member

Simn commented Mar 25, 2019

By the way, I think we've been here before, but you should really use Vector instead of Array for this kind of stuff. That should actually make a huge difference on Java because it can use the non-boxing native array types.

@Simn
Copy link
Member

Simn commented Mar 25, 2019

Which would mean you'd not only lose the value boxing but also the entire Array layer.

@flashultra
Copy link
Contributor Author

Ok, so I change all Array to Vector and now Haxe java source is at the same speed as pure java code . Still some strange things ( see screenshoot below ) . On thing, if Vector is much better than Array , why array not have way to set fixed length ( if this is the only difference between array and vector) to gain same performance ?
Here are screenshoots . Runtime.toInt take 27%
haxe1-1
haxe1-2
haxe-1-3

@flashultra
Copy link
Contributor Author

flashultra commented Mar 26, 2019

I will close this issue. My conclusion at the moment for replacing Array with Vector:

  • Vector has better performance on Java target
  • Vector has worst performance on Javascript target ( 1.02 seconds VS 0.43 seconds )

So each target should be tested separately using Array or Vector and in the code should be use many compiler flags ( #if java , #if javascript , etc ) to reach good performance.

@Simn
Copy link
Member

Simn commented Mar 26, 2019

Vector has worst performance on Javascript target

Uhm, that's something we should check. How does the generated code differ? Given that Vector<T> is an all-inline abstract over VectorData<T> and that is a typedef to Array<T> on JS, performance should be equal.

@Simn Simn reopened this Mar 26, 2019
@flashultra
Copy link
Contributor Author

I replace all Array with Vector and use ```Vector.fromArrayCopy`` for init the Vector .
I also change pBox and sBox to be Vector
About generated code , Vector is represent as Array , but obviously that doesn't help.
Here is :

@Simn
Copy link
Member

Simn commented Mar 26, 2019

About generated code , Vector is represent as Array , but obviously that doesn't help.

That's what I don't understand. If it's represented as Array then how can it be slower than the Array version?

Could you diff the generated code?

@flashultra
Copy link
Contributor Author

Vector generation created Vectro function . Here is both files ( left is with vector , right is with array)
https://www.diffchecker.com/qYtHwfX2

@flashultra
Copy link
Contributor Author

Maybe the problem is with blit(). Vector use self created function haxe_ds__$Vector_Vector_$Impl_$.blit , but Array use slice().
Suggestion: Replace blit() with slice() for JS target

@Simn
Copy link
Member

Simn commented Mar 26, 2019

Yes I think js can join the #if eval branch there which already does that.

@markknol markknol added the platform-java Everything related to Java label Apr 3, 2019
@Simn Simn added this to the Release 4.0 milestone May 22, 2019
@Simn Simn self-assigned this May 22, 2019
@RealyUniqueName
Copy link
Member

RealyUniqueName commented May 26, 2019

Current implementation of haxe.ds.Vector on js is not efficient for Int items, because it creates native array like this: new Array(length) and JS VM doesn't know items type and creates unoptimized array.
Here is a simple test to illustrate:

import haxe.Timer;
import haxe.ds.Vector;

class Main {
	static var tmp:Int;
	static public function main() {
		var it = new Vector(0xFFFFFF);
		// var it = [for(i in 0...0xFFFFFF) i];
		Timer.measure(() -> {
			for (item in it) {
				tmp = item;
			}
		});
	}
}

With array created via new Vector(length):

$ haxe js.hxml && node test.js
src/Main.hx:8: 0.19099998474121094s

With array explicitly populated with integers:

$ haxe js.hxml && node test.js
src/Main.hx:9: 0.019999980926513672s

10x speedup on array access for identical generated js (except array init):

//Vector init
var it = new Array(16777215);
haxe_Timer.measure(function() {
	var _g = 0;
	while(_g < it.length) Main.tmp = it[_g++];
	return;
}
//Array init
var _g = [];
var _g1 = 0;
while(_g1 < 16777215) _g.push(_g1++);
var it = _g;
haxe_Timer.measure(function() {
	var _g2 = 0;
	while(_g2 < it.length) Main.tmp = it[_g2++];
	return;
}

@RealyUniqueName
Copy link
Member

Changing vector constructor to this = [for(_ in 0...length) null] improves it, but still ~1,5 times slower, than with integers:

src/Main.hx:12: 0.029999971389770508s

@RealyUniqueName RealyUniqueName added the platform-javascript Everything related to JS / JavaScript label May 26, 2019
@RealyUniqueName RealyUniqueName changed the title [java] Performance lost [js][java] Performance lost May 26, 2019
@Simn Simn modified the milestones: Release 4.0, Release 4.1 May 28, 2019
@Simn
Copy link
Member

Simn commented May 28, 2019

Moved this to 4.1 as it doesn't seem urgent.

@RblSb
Copy link
Member

RblSb commented Jun 29, 2019

1,5x slower is still better that 10x. I think the only way to archive same array access performance is changing api to Vector.new(length:Int, ?defautValue:T) and do this = [for(_ in 0...length) defautValue]; in constructor. That slowdown because of array nullability.

@RealyUniqueName
Copy link
Member

the only way to archive same array access performance is changing api to Vector.new(length:Int, ?defautValue:T)

I like the idea. And I think we can do it for 4.0

@Simn
Copy link
Member

Simn commented Jun 29, 2019

That's crazy talk... and let's keep this at 4.1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
platform-java Everything related to Java platform-javascript Everything related to JS / JavaScript
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants