Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed-up Klopfenstein taper and ABCD matrix calculations #3

Merged
merged 1 commit into from
Sep 29, 2015

Conversation

in3otd
Copy link

@in3otd in3otd commented Sep 28, 2015

Here is an update to greatly speed-up the tapers calculations, especially the Klopfenstein.
The two main points are:

  • the overall ABCD matrix calculation is now done with a fixed number of sections (could be made dependent on the impedance step, did not try to check if it is worth)
  • the Klopfenstein taper impedance is computed using a nice recursive formula

Enjoy! and thanks again for all the work!

Implemented the simple recursive formula from Grossberg, Proc. IEEE
1968, pp.1629-1630 to compute the phi(x, A) function needed for
the Klopfenstein taper calculations.
Changed the ABCD matrix to use a constant number of sections, instead
of using constant fraction of wavelength sections, since the number
of sections needed depends mainly on the impedance change across
the section, not on its length.
Corrected the ABCD matrix to Y-parameters conversion (had a missing
negative sign). Now using the a Qucs function to set the S-matrix.
andresmmera added a commit that referenced this pull request Sep 29, 2015
Speed-up Klopfenstein taper and ABCD matrix calculations
@andresmmera andresmmera merged commit c4bd268 into andresmmera:TaperedLines Sep 29, 2015
@andresmmera
Copy link
Owner

Wow!! Your Klopfenstein taper implementation is miracle-working :-) I've just compared your code with the paper you've cited and it is correct. Thank you!!
The idea of using a fixed length step to analyse the taper impedance increases the computation a lot. That's wonderful. (I'll try to make sure that it's 100% correct) :-)

On the other hand, I've removed the line 75 (previously commented) because it is no longer useful. I've also remembered that float divisions take more CPU cycles than float multiplications so I have also converted x/2 expressions into .5*x. That's not a significant improvement, but every little helps...
http://stackoverflow.com/questions/4125033/floating-point-division-vs-floating-point-multiplication

Besides this, I've already added the GPL license + my name. I would like to add yours there. May I do it?

@in3otd
Copy link
Author

in3otd commented Sep 29, 2015

yep, the recursive phi() formula works a treat, I had thought to attempt integrating the series expansion but I doubt I will have ever been able to come up with the recursion. Nice to see that others did well before us, and published the results.

As said in a previous comments, the ABCD matrix will be exact with 1 section only if the actual line was uniform, i.e. with a constant impedance, the need to divide it in section comes from the impedance variations, not from the line delay, which is modeled already by the section ABCD matrix.
I saw in the literature there are alternative methods to model a lines cascade, no need to change now, but might be interesting to experiment further, one day.
Another feature that could be added in the future is the ability to model arbitrary tapers; make the component read the line impedances from a small text file with (x, Z) data and use these for computing the overall ABCD matrix.

ok, for line 75, it was a leftover from previous mods.

I usually do not pay much attention to the simple math operations, I expect the compiler to understand that x/2 is the same as 0.5*x and choose automatically the best implementation. If you have time, try to see if you can practically see a difference, either in timing or in the generated assembly code (if you are familiar with that, I'm not), I guess they will be identical.

Thanks for asking to add my name to the code authors, I won't say no 😁, it was an interesting work but took quite some time here. Name and contact details you can fine e.g. here

Thanks again for all the work!

@andresmmera
Copy link
Owner

Ok, thank you for your explanation. Since the sampling rate of the impedance profile does not depend on the frequency, now it is possible to precalculate it in initSP() :-)

Concerning the product/division issue, I did a quick test and I can confirm that g++ does not optimize that kind of operations. Indeed, if you take a look to the assembly code you can see that the division is performed by using a 'divsd' instruction whereas the product uses 'mulsd'. Unfortunately, I couldn't find further information in the instruction set of Intel. However, the test above evidences that 'divsd' requires more time than 'mulsd' (roughly about a 14% more). Anyway, this is absolutely insignificant because we were using a just few x/2...

@in3otd
Copy link
Author

in3otd commented Sep 29, 2015

thanks for doing some actual testing on the math operations; honestly I'm surprised the compiler didn't choose the same (faster) method for both cases. Did you complie with optimizaton enabled (-O2) ? Without optimization I understood the compiler won't alter the operations, to ease debugging.

@andresmmera
Copy link
Owner

Yes, you win.
x/2 and x*.5 produce exactly the same assembly code when using -O2 or -Ofast flags

@in3otd
Copy link
Author

in3otd commented Sep 29, 2015

ok, this agrees with some experiments I did lot of time ago, my conclusions were that for simple operation is better to let the compiler decide what to do and prefer to optimize code readability.
Note that for Qucs (qucsator), if you use the --maintainer-mode flag (which every developer need to use, at present), the optimization is fixed at -O0 (no optimization), mainly for the reasons above (debugging). If you optimize, sometimes following what's going on is a bit messy.
If you are used to use Qucs configured with the --maintainer-mode flag, try without, you wil llikely see a difference in speed, for non-trivial circuits.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants