Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

m68k.md from January 9th produces crashing WCS #181

Closed
githubaf opened this issue May 28, 2022 · 20 comments
Closed

m68k.md from January 9th produces crashing WCS #181

githubaf opened this issue May 28, 2022 · 20 comments

Comments

@githubaf
Copy link

WCS crashes as soon as image calulation is started with "HALT3" (WinUAE, 68040)
Guilty is
projects/gcc/gcc/config/m68k/m68k.md commit 8c02308 (9.Jan22) "implement direct MathIeeeSingTransBase/MathIeeeDoubTransBase calls"

I went back in time step by step from recent toolchain until WCS executable works again. When I do
git checkout fd0a62e m68k.md
i.e. back to commit from January the 7th, the crash is gone.

I compile with
m68k-amigaos-gcc -DFORCE_MUIMASTER_VMIN=19 -I"/home/developer/Desktop/SelcoGit/3DNature/Amiga" -O2 -Wall -c -fmessage-length=0 -funsigned-char -MMD -MP -MF"file.d" -MT"file.o" -o "file.o" "../file.c" -g -noixemul -m68040 -fomit-frame-pointer -fbaserel -DSTATIC_FCN=static -DSTATIC_VAR=static -ffast-math -mregparm -Winline -DSWMEM_FAST_INLINE -flto

and link with
m68k-amigaos-gcc -o "WCS" ./file1.o ./file2.o ./file3.o -lmui -lm -Wl,-Map=wcs.map,--trace -ldebug -g -noixemul -m68040 -fomit-frame-pointer -fbaserel -DSTATIC_FCN=static -DSTATIC_VAR=static -ffast-math -mregparm -Winline -DSWMEM_FAST_INLINE -flto

@bebbo bebbo transferred this issue from bebbo/amiga-gcc Jun 9, 2022
@bebbo
Copy link
Owner

bebbo commented Jun 9, 2022

I suspect the statement to load the base register:

bas = gen_rtx_MEM (SImode, gen_rtx_PLUS(SImode, gen_rtx_REG (Pmode, PIC_REG), doubbas));

@bebbo
Copy link
Owner

bebbo commented Jun 9, 2022

must rather look like:

	// create the pic_ref expression
	rtx s = gen_rtx_UNSPEC (Pmode, gen_rtvec (2, symbol, GEN_INT (0)), UNSPEC_RELOC16);
	s = gen_rtx_CONST (Pmode, s);
	s = gen_rtx_PLUS (Pmode, picreg, s);
	s = gen_rtx_CONST (Pmode, s);

notice the UNSPEC construct, which emit asm code that relocates base relative...

@bebbo
Copy link
Owner

bebbo commented Jun 9, 2022

giving this a try:

        bas = gen_rtx_MEM (SImode, gen_rtx_CONST (Pmode, 
        			gen_rtx_PLUS (Pmode, gen_rtx_REG (Pmode, PIC_REG), gen_rtx_CONST (Pmode,
        				gen_rtx_UNSPEC (Pmode, gen_rtvec (2, doubbas, GEN_INT (0)), UNSPEC_RELOC16)	  
        	))));        

@bebbo
Copy link
Owner

bebbo commented Jun 9, 2022

float foo(float f) {
  return acos(f);
}

m68k-amigaos-gcc -m68040 -ffast-math -S -O2 test.c -fbaserel
before:

       move.l (_MathIeeeSingTransBase,a4),a6

now:

       move.l (_MathIeeeSingTransBase:W,a4),a6

@githubaf
Copy link
Author

Thanks for the update!
WCS does no longer crash, however the result is still wrong. First there is no landscape, only sky. Then the cloud rendering produces a complete plain white image.

@bebbo
Copy link
Owner

bebbo commented Jun 10, 2022

can you compare it to a result using -fno-fast-math?

@githubaf
Copy link
Author

githubaf commented Jun 10, 2022

I will try with -fno-fast-math

Right now I tried compiling/linking without -fbaserel. We ge the landscape then. The clouds are calculated but look very different. The mirroring of the sky in the water is totally wrong.

@bebbo
Copy link
Owner

bebbo commented Jun 10, 2022

Could you bisect it down to one file (or even one function) where it breaks?

@githubaf
Copy link
Author

can you compare it to a result using -fno-fast-math?
Without -fno-fast-math (but with -fbaserel and -mregparm) the calculation is correct!

Could you bisect it down to one file (or even one function) where it breaks?
What do you mean? Can I compile some files with ffast-math and others without ffast-math? I thought I have to give ffast-math to compiler and linker? Is it enough to give it only to the compiler?

If so I will of course try until I find the problematic file or even the problematic function(s).

@bebbo
Copy link
Owner

bebbo commented Jun 10, 2022

That's an option per file. And inside each file you can do things like:

#pragma GCC push_options
#pragma GCC optimize("no-fast-math")
float foo(float f) {
  return acos(f);
}
#pragma GCC pop_options

@githubaf
Copy link
Author

githubaf commented Jun 10, 2022

I found a problematic function:
Fractal.c Function WaveAmp_Compute()

I compiled the project without -ffast-math.
Then I added -ffast-math to that single function.

The calculated pictured lost all water-waves. The river looks like a perfect mirror now.

I put that into compiler explorer.

http://franke.ms/cex/z/fGc34r

@bebbo
Copy link
Owner

bebbo commented Jun 11, 2022

I found a problematic function: Fractal.c Function WaveAmp_Compute()

I compiled the project without -ffast-math. Then I added -ffast-math to that single function.

The calculated pictured lost all water-waves. The river looks like a perfect mirror now.

I put that into compiler explorer.

http://franke.ms/cex/z/fGc34r

hm, the generated asm codes look identical to me... /shrug

@githubaf
Copy link
Author

githubaf commented Jun 11, 2022

Really identical? It looks here very similar but not identical. Different registers used, different size of colored blocks. I just checked the link on a different pc.

For instance the stack seems to be 12 bytes bigger if compiled with -ffast-math. Do function arguments have different size or alignment when compiled with -ffast-math?

Edit:
Ah, with -ffast-math there is one more floating register pushed on the stack. So I guess 80 Bit = 10 Bytes, +2 Bytes Alignement = 12 Bytes difference?

Line 64:
(*MakeWater) ++;
without -ffast-math
move.l (100,sp),a0
addq.w #1,(a0)

with -ffast-math
move.l (112,sp),a0
addq.w #1,(a0)

@bebbo
Copy link
Owner

bebbo commented Jun 12, 2022

yes, there are some differences, but it's rather the same. The differences result in taking diffferent paths during the various optimizations.
Are both files linked the same way? Some CFLAGS affect the selection of the libraries:

  • cpu / fpu
  • baserel / resident

both variants do not use the mentioned code from m68k.md since -mregparm is used. But the libraries might differ...

@githubaf
Copy link
Author

I am preparing an example. Should be ready in the evening.

@githubaf
Copy link
Author

Here is the example. I fed the function with the original values from the first call and print several values. I run the executables on WinUAE with an m68040. The diff shows severeal differences.

m68k-amigaos-gcc -O2 -Wall -c -fmessage-length=0 -o "main.o"  "main.c"  -g -noixemul -m68040 -fomit-frame-pointer -fbaserel -mregparm -Winline -flto
m68k-amigaos-gcc -O2 -Wall -c -fmessage-length=0 -o "test2.o" "test2.c" -g -noixemul -m68040 -fomit-frame-pointer -fbaserel -mregparm -Winline -flto
m68k-amigaos-gcc  -o "WaveAmp_Compute_ok"  ./main.o ./test2.o   -lm -Wl,-Map=wcs.map,--trace -ldebug -g -noixemul -m68040 -fomit-frame-pointer -fbaserel  -mregparm -flto

rm *.o

m68k-amigaos-gcc -O2 -Wall -c -fmessage-length=0 -o "main.o"  "main.c"  -g -noixemul -m68040 -fomit-frame-pointer -fbaserel -mregparm -Winline -flto
m68k-amigaos-gcc -O2 -Wall -c -fmessage-length=0 -o "test2.o" "test2.c" -g -noixemul -m68040 -fomit-frame-pointer -fbaserel -mregparm -Winline -flto -DFASTMATH
m68k-amigaos-gcc  -o "WaveAmp_Compute_bad"  ./main.o ./test2.o   -lm -Wl,-Map=wcs.map,--trace -ldebug -g -noixemul -m68040 -fomit-frame-pointer -fbaserel  -mregparm -flto

colordiff --side-by-side ~/Desktop/fast_math_ok  ~/Desktop/fast_math_bad
WaveAmp_Compute NO-FAST-MATH Elev=1.587902 Alt=5.018000 PolyE | WaveAmp_Compute FAST-MATH Elev=1.587902 Alt=5.018000 PolyEl=1
WV->Pt.Lat=36.089000                                            WV->Pt.Lat=36.089000
WV->Pt.Lon=112.011000                                           WV->Pt.Lon=112.011000
WD->LatOff=0.000000                                             WD->LatOff=0.000000
WD->LonOff=0.000000                                             WD->LonOff=0.000000
WV->Amp=0.250000                                                WV->Amp=0.250000
WV->Length=0.100000                                             WV->Length=0.100000
WV->Velocity=200.000000                                         WV->Velocity=200.000000
avglat=36.094657                                                avglat=36.094657
lonscale=89.733193                                            | lonscale=0.000000
LATSCALE=111.049771                                             LATSCALE=111.049771
cos(avglat)=-0.033653                                         | cos(avglat)=0.000000
cos(avglat) *  PiOver180 = -0.000587                          | cos(avglat) *  PiOver180 = 0.000000
dist=8.796182 WaveAmp=0.000000                                | dist=1.256306 WaveAmp=0.000000
Elev=1200.000000                                                Elev=1200.000000
------------                                                    ------------

af_fastmath.zip

@bebbo
Copy link
Owner

bebbo commented Jun 13, 2022

heh, you killed the compiler again ^^
thanks for the example!

@bebbo
Copy link
Owner

bebbo commented Jun 13, 2022

the emitted code was malformed and the loop compiler considered the results in d0/d1 as const and moved it to the top of the loop.

@githubaf
Copy link
Author

Now it looks much better! The river is no longer a mirror ;-) WCS is happy again in this respect.
Thank you very much for fixing it!

@bebbo
Copy link
Owner

bebbo commented Jun 14, 2022

thank you for providing such a good test case

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants