AvCodec Error -22 #2142

Closed
totaam opened this issue Feb 10, 2019 · 54 comments

@totaam
Collaborator

totaam commented Feb 10, 2019

Issue migrated from trac ticket # 2142

component: server | priority: major | resolution: invalid | keywords: avcodec,h264

2019-02-10 04:20:59: DevynCJohnson created the issue


I have built Xpra from the Subversion repository source (tag v2.4.x), since the packages in the default Ubuntu repository and the WinSwitch repository ( https://winswitch.org/downloads/debian-repository.html?dist_select=cosmic ) both appear to have a "tjcompress2 error -2" during the initial start-up of the server.

In the attached file, I include the error message, the build commands I used, and the output of xpra showconfig.

When running an Xpra server using the below command, an avcodec error -22 message is given when I try to launch the Xpra GUI-client. Prior to launching the client, the HTML5 interface would never load (I just get "Unable to connect" in Firefox). I need to use the HTML5 interface as described at https://www.xpra.org/trac/wiki/Clients/HTML5. I have installed all of the dependencies available in Ubuntu for both building and running Xpra.

xpra start :37 --start=gnome-mines --html=on --systemd-run=no --start-via-proxy=no --tcp-auth=file:filename=/home/collier/xpra_pswd.txt --bind-tcp=0.0.0.0:13700 --encoding=x264 --csc-modules=swscale --compress=1 --uid=1000 --mmap-group=auto

I have also used the below environment variables as suggested in other bug reports relating to the h264 codec.

export XPRA_B_FRAMES=0
export XPRA_X264_THREADS=4
export XPRA_X264_SLICED_THREADS=0
@totaam
Collaborator Author

totaam commented Feb 10, 2019

2019-02-10 04:21:38: DevynCJohnson uploaded file Xpra_Error_Log.txt (19.1 KiB)

Xpra Error Log and Info

@totaam
Collaborator Author

totaam commented Feb 10, 2019

FYI:

  • --html=on the html5 client should be enabled by default if everything is installed correctly
  • systemd-run=no should default to no on versions of Ubuntu that are known to be broken
  • start-via-proxy=no already defaults to no
  • --encoding=x264 - don't do that
  • csc-modules=swscale - since you're not building other csc modules, this doesn't do anything
  • compress=1 - should already be the default, strangely enough your xpra showconfig shows a different value..
  • uid=1000 unless you are running as root, this doesn't do anything
  • mmap-group=auto - should already be the default

So AFAICT, your command line should just be:

xpra start :37 --start=gnome-mines \
   --tcp-auth=file:filename=/home/collier/xpra_pswd.txt --bind-tcp=0.0.0.0:13700

There are separate issues in this ticket.
(from the log, you seem to be running Ubuntu 18.10)

tjcompress2 error -2 during the initial start-up of the server.
Please post the full error message.
This comes from the turbo jpeg encoder. It is not normal, and it does not occur on a standard installation of Ubuntu 18.10.
Maybe start again with a clean installation?

avcodec error -22 message is given when I try to launch the Xpra GUI client
What command? Just xpra?
Please post the output of ./xpra/codecs/loader.py -v.
Please post: dpkg --list | egrep -i "ffmpeg|xpra|dummy".

Prior to launching the client, the HTML5 interface would never load
I'm not sure I understand what that means: does launching the client somehow fix the HTML5 client?
Make sure your firewall isn't blocking that port. Make sure websockify is installed.
If that doesn't help, run your server with -d websockify,http and post the log file.
Or you may want to try the beta channel, which has big improvements to the websocket layer, which also makes it easier to deploy (#2121).

I have also used the below environment variables as suggested in other bug reports relating to the h264 codec.
(..)
You should not be fiddling with those settings, if the html5 client is not connecting, they won't make any difference at all.

@totaam
Collaborator Author

totaam commented Feb 11, 2019

2019-02-11 01:11:50: DevynCJohnson commented


I attached two log files. One of them contains the output of the requested commands appended to the bottom of the initial file that I uploaded.

You are correct, I am running Ubuntu 18.10 (Cosmic).

The purpose of running the command "xpra" is to try to connect to the running session via the GUI since the web-interface did not appear to be working. This was a way of checking if it is just the web-interface that was down or all of the Xpra server. As for the results, the GUI client would report "No sessions found".

The error that appears when installing from the repositories is seen below.

Error: failed to compress jpeg image, code -2:
tjCompress2(): Invalid argument
 width=32, stride=128, height=32
 format=BGRA, quality=0
Error: failed to compress jpeg image, code -2:
tjCompress2(): Invalid argument
 width=32, stride=128, height=32
 format=BGRA, quality=50
Error: failed to compress jpeg image, code -2:
tjCompress2(): Invalid argument
 width=32, stride=128, height=32
 format=BGRA, quality=100
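
One thing that can be checked from these messages alone is whether the reported geometry is self-consistent. BGRA uses 4 bytes per pixel, so a tightly packed row of 32 pixels needs 32 × 4 = 128 bytes, which matches the reported stride. A minimal sketch of that arithmetic follows; `BYTES_PER_PIXEL`, `minimum_stride` and `stride_is_plausible` are hypothetical names used for illustration, not part of xpra or libjpeg-turbo.

```python
# Bytes per pixel for a few packed pixel formats.
# Illustrative table only, not taken from xpra or libjpeg-turbo.
BYTES_PER_PIXEL = {"BGRA": 4, "RGBA": 4, "BGRX": 4, "RGB": 3, "BGR": 3}


def minimum_stride(width: int, pixel_format: str) -> int:
    """Smallest row stride (in bytes) that can hold `width` pixels."""
    return width * BYTES_PER_PIXEL[pixel_format]


def stride_is_plausible(width: int, stride: int, pixel_format: str) -> bool:
    """A stride may include padding, but it can never be shorter than one row."""
    return stride >= minimum_stride(width, pixel_format)
```

With the values from the log, `stride_is_plausible(32, 128, "BGRA")` holds, so this particular check does not explain the "Invalid argument" error.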

I also got the version information concerning libTurbo.

collier@Nacho-Computer:~$ dpkg --list | egrep -i 'turbo'
ii  libjpeg-turbo-progs                        2.0.0-0ubuntu2                                amd64        Programs for manipulating JPEG files
ii  libjpeg-turbo8:amd64                       2.0.0-0ubuntu2                                amd64        IJG JPEG compliant runtime library.
ii  libjpeg-turbo8-dev:amd64                   2.0.0-0ubuntu2                                amd64        Development files for the IJG JPEG library
ii  libturbojpeg:amd64                         2.0.0-0ubuntu2                                amd64        IJG JPEG compliant runtime library.
ii  libturbojpeg0-dev:amd64                    2.0.0-0ubuntu2                                amd64        Development files for the TurboJPEG library

As for your comment regarding the environment variables. True, you have a good point. The only reason I tried that was due to trying as many possibilities since with many bug reports I have filed in the past (with other projects), the developers would often times have me try solutions to slightly off-topic issues. Since I have never filed a report with Xpra, I wanted to be sure and get any off-topic solutions out of the way (yes, it sounds illogical, but I have often times dealt with illogical people).

After looking at the output of running Xpra with -d websockify,http, I noticed the line 2019-02-10 18:39:02,042 init_html_proxy(..) options: tcp_proxy=, html='yes' with the empty tcp_proxy parameter. Does the Xpra HTML5 interface require a proxy?

As for trying the beta channel, how stable is the beta version of Xpra? I am needing something stable and reliable (although, an unstable version of Xpra would be better than a non-working version of Xpra, right? 😉)

@totaam
Collaborator Author

totaam commented Feb 11, 2019

2019-02-11 01:12:31: DevynCJohnson uploaded file Xpra_Error_Log.2.txt (32.4 KiB)

Xpra Error Log 2

@totaam
Collaborator Author

totaam commented Feb 11, 2019

2019-02-11 01:12:57: DevynCJohnson uploaded file xpra.log (354.4 KiB)

Xpra X11 Log

@totaam
Collaborator Author

totaam commented Feb 11, 2019

2019-02-11 02:31:54: antoine commented


TILs:

xpra start :37 -d websockify,http --start=gnome-mines --tcp-auth=file:filename=/home/collier/xpra_pswd.txt --bind-tcp=0.0.0.0:13700
(..)	serving html content from: /usr/local/share/xpra/www

All you need to do at this point is to run: xdg-open http://localhost:13700/connect.html.
Keep an eye on the server log, just in case.
If the browser does not connect, you have a firewall problem.
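
To separate "the port is unreachable" from "the browser cannot render the page", a plain TCP connection test can help. This is a minimal sketch using only the standard library; `can_connect` is a hypothetical helper name, and the host and port in the usage note are the ones from the command line above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established.

    A False result does not say why: the server may not be listening,
    or a firewall may be dropping or rejecting the packets.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `can_connect("localhost", 13700)` run on the server itself checks that xpra is listening, and the same call from the client machine (with the server's address) checks the path through any firewall.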

As for the codec issues, this does not happen with a clean installation of xpra on a standard Ubuntu system, so either re-install both or at least remove the xpra package before installing from source.

As for trying the beta channel, how stable is the beta version of Xpra?

It should be stable, but being a beta channel, things do break occasionally.

@totaam
Collaborator Author

totaam commented Feb 15, 2019

2019-02-15 20:06:55: DevynCJohnson commented


I tried running Xpra on a fresh installation of the codecs and Xpra itself (all from the default Ubuntu repos). I also explicitly allowed port 13700 for both TCP and UDP. The HTML5 interface still fails to work. If I run xpra list to see the list of running sessions, the session is listed as "UNKNOWN" and is cleaned-up.

I attached the logs.

@totaam
Collaborator Author

totaam commented Feb 15, 2019

2019-02-15 20:07:22: DevynCJohnson uploaded file Xpra_Log_Fresh_Install.txt (13.4 KiB)

Fresh Installation Log

@totaam
Collaborator Author

totaam commented Feb 15, 2019

Your session must be taking forever to launch - could be caused by #2091. Is this an underpowered CPU?
You ran "xpra list" before the server had finished starting up, so "xpra list" ended up cleaning up the sockets.
Try to run the server with "--no-daemon" and wait until the server output prints "xpra is ready".
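
If the server is started from a script, the same "wait for the ready message" rule can be automated. The sketch below is only an illustration: `wait_for_marker` and `start_and_wait` are hypothetical helpers, and the marker string is the "xpra is ready" message mentioned above.

```python
import subprocess
from typing import Iterable, List

READY_MARKER = "xpra is ready"  # message quoted in the comment above


def wait_for_marker(lines: Iterable[str], marker: str = READY_MARKER) -> bool:
    """Consume lines until one contains `marker`; False if the input ends first."""
    for line in lines:
        if marker in line:
            return True
    return False


def start_and_wait(command: List[str]) -> subprocess.Popen:
    """Start a server in the foreground and block until it reports it is ready.

    The caller should keep reading proc.stdout afterwards, otherwise the
    server may eventually block on a full pipe.
    """
    proc = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # capture log output written to stderr too
        text=True,
    )
    if not wait_for_marker(proc.stdout):
        raise RuntimeError("server exited before printing %r" % READY_MARKER)
    return proc
```

A call such as `start_and_wait(["xpra", "start", ":35", "--no-daemon", "--start=gnome-mines"])` would return only once it is safe to run "xpra list", and would raise if the server died first.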

@totaam
Collaborator Author

totaam commented Feb 15, 2019

Also note that there are no codec errors in the logs.
You can check that manually by running ./xpra/codecs/loader.py -v.

@totaam
Collaborator Author

totaam commented Feb 16, 2019

2019-02-16 15:03:18: DevynCJohnson commented


The CPU is not under-powered. My system has an 8700K Intel processor (Coffeelake) with six physical cores + six virtual cores and an Nvidia 1080 GPU. My system has 32GB of RAM.

By using the --no-daemon parameter, I was able to see that Xpra is having a segmentation fault on a clean installation from the default Ubuntu repositories.

collier@Nacho-Computer:~$ xpra start :35 --no-daemon --start=gnome-mines --tcp-auth=file:filename=/home/collier/xpra_pswd.txt --bind-tcp=0.0.0.0:13700 --html=on
2019-02-16 09:00:55,915 cannot use uinput for virtual devices:
2019-02-16 09:00:55,915  [Errno 13] Failed to open the uinput device: Permission denied
[mi] Extension "Composite" is not recognized
[mi] Only the following extensions can be run-time enabled:
[mi]    Generic Event Extension
[mi]    MIT-SHM
[mi]    XTEST
[mi]    SECURITY
[mi]    XINERAMA
[mi]    XFIXES
[mi]    RENDER
[mi]    RANDR
[mi]    COMPOSITE
[mi]    DAMAGE
[mi]    MIT-SCREEN-SAVER
[mi]    DOUBLE-BUFFER
[mi]    RECORD
[mi]    DPMS
[mi]    X-Resource
[mi]    XVideo
[mi]    XVideo-MotionCompensation
[mi]    SELinux
[mi]    GLX
2019-02-16 09:00:56,022 created unix domain socket: /run/user/1000/xpra/Nacho-Computer-35
2019-02-16 09:00:56,022 created unix domain socket: /run/xpra/Nacho-Computer-35
2019-02-16 09:00:56,120 pointer device emulation using XTest
2019-02-16 09:00:56,993  OpenGL is supported on this display
WARNING: no 'numpy' module, HyBi protocol will be slower
2019-02-16 09:00:57,030 serving html content from: /usr/local/share/xpra/www
2019-02-16 09:00:57,111 D-Bus notification forwarding is available
2019-02-16 09:00:57,246 found 1 virtual video device for webcam forwarding
2019-02-16 09:00:57,256 pulseaudio server started with pid 10644
2019-02-16 09:00:57,256  private server socket path:
2019-02-16 09:00:57,256  '/run/user/1000/xpra/pulse-35/pulse/native'
2019-02-16 09:00:58,147 GStreamer version 1.14.4 for Python 2.7.15 64-bit
Segmentation fault

@totaam
Collaborator Author

totaam commented Feb 16, 2019

By using the --no-daemon parameter, I was able to see that Xpra is having a segmentation fault on a clean installation from the default Ubuntu repositories.

I find that a little bit hard to believe seeing that I did a clean install test as part of testing for comment:3.
To diagnose those types of crashes:

  • run the server with -d all
  • run xpra in gdb to get a backtrace

@totaam
Collaborator Author

totaam commented Feb 22, 2019

2019-02-22 02:15:18: DevynCJohnson commented


I ran the suggested commands. I attached a screenshot and Xpra's output.

How do you recommend that I run Xpra with GDB? I ran it, but the xpra command is a Python script.

@totaam
Collaborator Author

totaam commented Feb 22, 2019

2019-02-22 02:15:55: DevynCJohnson uploaded file Xpra.png (274.4 KiB)

Segmentation Error
Xpra.png

@totaam
Collaborator Author

totaam commented Feb 22, 2019

2019-02-22 02:16:22: DevynCJohnson uploaded file xpra_full.log (153.6 KiB)

Full Log -d all

@totaam
Collaborator Author

totaam commented Feb 22, 2019

Looks like the crash is in the vpx decoder, you can try running xpra with:

xpra start --video-encoders=x264 ...

This should avoid the crash.
What is your libvpx version?

$ gdb --args xpra start :34 --no-daemon ...

You were pretty close, try:
gdb --args /usr/bin/python2 /usr/bin/xpra start ...

(as per Debugging)

@totaam
Collaborator Author

totaam commented Feb 22, 2019

2019-02-22 14:32:02: DevynCJohnson commented


I have libvpx version 1.7.0

I ran GDB as suggested while using the --video-encoders=x264 with xpra. However, it appears that Xpra is still calling VPX.

I attached a screenshot and the output of GDB.

@totaam
Collaborator Author

totaam commented Feb 22, 2019

2019-02-22 14:32:30: DevynCJohnson uploaded file libvpx.png (100.7 KiB)

libvpx version
libvpx.png

@totaam
Collaborator Author

totaam commented Feb 22, 2019

2019-02-22 14:33:00: DevynCJohnson uploaded file Xpra_GDB.txt (4.2 KiB)

GDB Output

@totaam
Collaborator Author

totaam commented Feb 22, 2019

Please grab a backtrace from gdb by typing bt at the gdb crash prompt.

@totaam
Collaborator Author

totaam commented Feb 22, 2019

2019-02-22 14:46:10: DevynCJohnson uploaded file Xpra_GDB_BT.txt (24.6 KiB)

GDB with bt

@totaam
Collaborator Author

totaam commented Feb 22, 2019

2019-02-22 14:46:37: DevynCJohnson commented


Okay, I attached the backtrace

@totaam
Collaborator Author

totaam commented Feb 22, 2019

Ah, ubuntu's gdb doesn't give you very useful backtraces, does it have a py-bt command?

That said, some things immediately stand out from this stacktrace:

$ gdb --args /usr/bin/python2 /usr/local/bin/xpra start :29 ...

/usr/local/bin/xpra is not a standard location for one of our packages.
You must have installed this yourself by building from source? Why is that? Did you remove the package before mixing with your source installation?

You are seeing crashes and errors with multiple codecs, those errors look like library version issues, and I am not seeing any of this on a fresh install. All of this makes me think that you are either using the wrong package (wrong repository configured?), building it wrong yourself, or have the wrong libraries installed.

However, it appears that Xpra is still calling VPX.
Yes, the codec still gets loaded, it isn't fully initialized but that's still enough to trigger the crash.

@totaam
Collaborator Author

totaam commented Feb 22, 2019

2019-02-22 16:23:37: DevynCJohnson commented


I had installed Xpra from the default repos. Initially (the first post), I had installed from source due to wanting to compile the code specifically to my needs (e.g. compile without webcam support, etc.) and specifically for the processor (e.g. CFLAGS= -march=skylake -mavx -O3 etc.) to achieve greater performance.

For this next run and backtrace, I installed Xpra from the Winswitch repo/ppa and I used PIP to upgrade most of my Python2 packages. I also saw that you provided an update for Xdummy (I got that update). Obviously, I did all this after I uninstalled the existing Xpra and after searching the whole file hierarchy for any remnants of Xpra. I also rebooted after applying these changes. However, I am still getting a segfault. I am also getting the "tjCompress2" compress errors.

@totaam
Collaborator Author

totaam commented Feb 22, 2019

2019-02-22 16:24:19: DevynCJohnson uploaded file Xpra_Bug_Notes.txt (26.0 KiB)

Xpra BT Winswitch PPA

@totaam
Collaborator Author

totaam commented Feb 22, 2019

2019-02-22 16:27:39: DevynCJohnson commented


No, Ubuntu does not appear to have py-bt.

@totaam
Collaborator Author

totaam commented Feb 22, 2019

I had installed Xpra from the default repos.
Initially (the first post), I had installed from source due to wanting to compile the code specifically to my needs (e.g. compile without webcam support, etc.)

That's mostly superfluous: features that are disabled aren't loaded into memory (see #1861 and #1838 for details), just disable them instead.

and specifically for the processor (e.g. CFLAGS= -march=skylake -mavx -O3 etc.) to achieve greater performance.

AFAIK, the benefits of this are very limited.
The only part of the process that can really benefit from CPU optimizations is the picture encoding stage. For that, you should rebuild x264 and / or libvpx, not xpra itself. And even then, those libraries include various hand crafted CPU optimizations already (turbojpeg does too), so gcc probably won't be able to improve on that.

For this next run and backtrace, I installed Xpra from the Winswitch repo/ppa and I used PIP to upgrade most of my Python2 packages.

That's usually a bad idea. Don't mix distribution packages with pip installed packages.

However, I am still getting a segfault. I am also getting the "tjCompress2" compress errors.

You could always nuke the problematic codecs from the filesystem, ie: rm -fr /the/path/to/xpra/codecs/vpx
But this won't resolve the broken state of your system, which is likely to cause you more problems down the line.

@totaam
Collaborator Author

totaam commented Feb 28, 2019

2019-02-28 23:54:55: DevynCJohnson commented


Even after running rm -fr /the/path/to/xpra/codecs/vpx on a fresh installation of Ubuntu, it still does not work. Could it have anything to do with using the proprietary Nvidia driver which would be (to the best of my knowledge) the only difference between my fresh installation and yours?

Is there a way to prevent all codecs from loading except for the one that I need to use? Also, from a fresh installation of everything, why would I be getting the "tjCompress2" compress errors?

@totaam
Collaborator Author

totaam commented Mar 1, 2019

Even after running rm -fr /the/path/to/xpra/codecs/vpx on a fresh installation of Ubuntu, it still does not work.

How so?

Could it have anything to do with using the proprietary Nvidia driver which would be (to the best of my knowledge) the only difference between my fresh installation and yours?

The nvidia driver would allow for the nvenc codec to be used. (awesome performance - highly recommended)
You can try nuking that one too: rm -fr /the/path/to/xpra/codecs/nvenc.

Is there a way to prevent all codecs from loading except for the one that I need to use?

Yes, use the 2.5 beta builds.

Also, from a fresh installation of everything, why would I be getting the "tjCompress2" compress errors?

My guess is that there is something wrong with your installation.
The fact that you're getting so many different codec errors tells me that none of the codec shared libraries match what xpra is expecting: you're either using the wrong repository / package or you're mixing with a source installation.

@totaam
Collaborator Author

totaam commented Mar 3, 2019

2019-03-03 00:18:34: DevynCJohnson commented


Today, I started from a fresh install and I used the Beta (v2.5) WinSwitch repository for Cosmic. I installed the package for Python3. This time, I actually managed to get to the screen that shows the progress of the loading web-sockets. However, it stopped very close to the end of loading. I attached the GDB log.

@totaam
Collaborator Author

totaam commented Mar 3, 2019

2019-03-03 00:19:02: DevynCJohnson uploaded file Xpra_Beta_Log.txt (10.9 KiB)

Xpra v2.5 Beta Logs

@totaam
Collaborator Author

totaam commented Mar 3, 2019

2019-03-03 00:20:54: DevynCJohnson commented


In the beta version, it crashed due to the XOR codec. I tried deleting that directory (/usr/lib/python3/dist-packages/xpra/codecs/xor/), but then Xpra would just say that the codec was not found and then stall.

@totaam
Collaborator Author

totaam commented Mar 3, 2019

Xpra v2.5 Beta Logs

You're not including the backtrace from gdb, you need to run py-bt or bt from the gdb prompt.

I installed the package for Python3.

Just in case, also try the python2 version.

Can you connect with the regular client instead of the html5 client?
Is the x264 codec enabled then?

I tried deleting that directory (/usr/lib/python3/dist-packages/xpra/codecs/xor/), but then Xpra would just say that the codec was not found and then stall.

This module is generally required. It is also used by the new websockets code: #2121.

The only crash we've ever had in xor was related to unaligned 64-bit access on older CPUs: #1749. Which is why we now use 32-bit access everywhere and added extra code to align addresses in the target buffer.
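
The idea of "aligning addresses in the target buffer" can be illustrated with a little address arithmetic: handle a few leading bytes one at a time until the address reaches a 4-byte boundary, then use whole 32-bit words, then finish with the trailing bytes. This is a sketch of the general technique only, not xpra's actual cyxor code; `split_for_alignment` is a hypothetical name.

```python
from typing import Tuple


def split_for_alignment(address: int, length: int, word: int = 4) -> Tuple[int, int, int]:
    """Split `length` bytes starting at `address` into
    (head_bytes, whole_words, tail_bytes) so that every word-sized
    access starts on a `word`-byte boundary."""
    # number of bytes needed to reach the next boundary (0 if already aligned)
    head = min((-address) % word, length)
    remaining = length - head
    return head, remaining // word, remaining % word
```

For a 10-byte buffer starting at address 0x1001, this gives 3 head bytes (up to 0x1004), one aligned 32-bit word, and 3 tail bytes.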

@totaam
Collaborator Author

totaam commented Mar 3, 2019

2019-03-03 03:45:57: DevynCJohnson commented


Whoops, my bad. Here are the back-traces.

I will try using the Python2 version sometime next week as well as using the regular client.

@totaam
Collaborator Author

totaam commented Mar 3, 2019

2019-03-03 03:46:24: DevynCJohnson uploaded file Xpra_Beta_Backtrace.txt (28.9 KiB)

Xpra v2.5 Backtrace

@totaam
Collaborator Author

totaam commented Mar 3, 2019

TILs:

Thread 91 "python3.6" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffb3fff700 (LWP 16413)]
0x00007ffff1323fbf in ?? ()
   from /usr/lib/python3/dist-packages/xpra/codecs/xor/cyxor.cpython-36m-x86_64-linux-gnu.so
(gdb) bt
#0  0x00007ffff1323fbf in  ()
    at /usr/lib/python3/dist-packages/xpra/codecs/xor/cyxor.cpython-36m-x86_64-linux-gnu.so
#1  0x00007ffff1324e92 in  ()
    at /usr/lib/python3/dist-packages/xpra/codecs/xor/cyxor.cpython-36m-x86_64-linux-gnu.so
#2  0x000000000050c4f5 in _PyCFunction_FastCallDict
    (kwargs=<optimized out>, nargs=<optimized out>, args=<optimized out>, func_obj=<built-in function hybi_unmask>) at ../Objects/methodobject.c:231
#3  0x000000000050c4f5 in _PyCFunction_FastCallKeywords
    (kwnames=<optimized out>, nargs=<optimized out>, stack=<optimized out>, func=<optimized out>)
    at ../Objects/methodobject.c:294
#4  0x000000000050c4f5 in call_function
    (pp_stack=0x7fffb3ffdee0, oparg=<optimized out>, kwnames=<optimized out>) at ../Python/ceval.c:4837
#5  0x000000000050dd99 in _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>)
    at ../Python/ceval.c:3335
#6  0x000000000050b638 in PyEval_EvalFrameEx
    (throwflag=0, f=Frame 0x7fffbc002ec8, for file /usr/lib/python3/dist-packages/xpra/net/websockets/header.py, line 65, in decode_hybi (buf=b'\x82\xfe\x17...

So it is the new hybi_unmask function called from decode_hybi.
The changelog for this code is here: [/log/xpra/trunk/src/xpra/codecs/xor/cyxor.pyx].
In particular: #1926 (superseded by #2121) and r21393 for 32-bit accesses.
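
For context, frames sent by a WebSocket client (RFC 6455, "HyBi") are masked: each payload byte is XORed with one byte of a 4-byte masking key, cycling through the key. The pure-Python sketch below only illustrates what an unmasking function has to compute; it is not xpra's Cython implementation, and `unmask_payload` is a hypothetical name.

```python
def unmask_payload(payload: bytes, mask: bytes) -> bytes:
    """XOR every payload byte with the matching byte of the 4-byte mask."""
    if len(mask) != 4:
        raise ValueError("a WebSocket masking key is exactly 4 bytes long")
    return bytes(b ^ mask[i % 4] for i, b in enumerate(payload))
```

Because XOR is its own inverse, the same function both masks and unmasks. An optimized version does the same work several bytes at a time, which is where the width and alignment of memory accesses become relevant.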

Please post the /proc/cpuinfo of the machine that is having this problem.

@totaam
Collaborator Author

totaam commented Mar 3, 2019

2019-03-03 12:48:44: DevynCJohnson commented


I attached the requested information and I provided additional information in the file as well.

Also, I have Xpra v2.5-r21899.

@totaam
Collaborator Author

totaam commented Mar 3, 2019

2019-03-03 12:49:18: DevynCJohnson uploaded file System_Info.txt (17.8 KiB)

System Info

@totaam
Collaborator Author

totaam commented Mar 3, 2019

Right, at this point I am pretty sure that the problem is with your system.
The new cyxor code has torture tests that I've run on similar but older CPUs, and on Ubuntu virtual machines. This is just another symptom of the more general problem you have.

Unless you can provide me with steps to reproduce the problem reliably (ie: dockerfile, virtual machine image, etc) then I will have to close this ticket as invalid.

@totaam
Collaborator Author

totaam commented Mar 4, 2019

2019-03-04 05:07:10: DevynCJohnson commented


What do you mean by "the problem is with your system" and "another symptom of the more general problem you have"? I have tried Xpra on a fresh install of both Ubuntu and Xubuntu. The hardware works perfectly fine for all other uses. I have not had any issues with any other software for the whole year that I have had this hardware. What is this general problem?

I installed the Python2 version of Xpra v2.5 (Beta) from the Winswitch PPA. Again, a fresh Linux install. The Python2 version works with the non-HTML5 client, but the HTML5 interface does not work in the web-browser (I have tried Firefox and Chrome; neither works). With the Python2 version of Xpra v2.5 (Beta), it repeatedly loops "invalid packet format, character 0xa0, not an xpra client?" and "server does not support h264 encoding and has switched to auto" in the command-line where I ran gdb --args /usr/bin/python2.7 /usr/bin/xpra start :34 --no-daemon --video-encoders=x264 --start=xmahjongg --auth=allow --bind-tcp=0.0.0.0:17300.

To reproduce the issue, install Ubuntu or Xubuntu on a system with an i7-8700K Intel processor, a 1080 Nvidia graphics card, and a 16-inch 4K screen. Add the WinSwitch Beta repo (deb http://winswitch.org/beta/ cosmic main) and the Graphics repo (deb http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu cosmic main) for the Nvidia driver. Install Paramiko, Websockify, and all other listed dependencies for Xpra.

I disassembled the cyxor.cpython-36m-x86_64-linux-gnu.so file and I noticed that, out of the whole file, there were only two SSE assembly instructions (both of which are unaligned moves).

AT&T Syntax:

    437a:	movdqu -0x10(%r9,%rax,1),%xmm0
    4381:	movups %xmm0,(%r9,%rax,1)

movdqu is an unaligned double quadword move
movups is an unaligned packed single-precision floating-point move

If the pointer is not 16-byte aligned, this will cause a segmentation fault. Also, if the data happens to be on the stack instead of the memory, then this could cause alignment issues that are seen in some systems and not others.

This reminds me of #1749#comment:10

This issue with Cython and Python ( https://stackoverflow.com/questions/51187592/using-c-union-with-sse-intrinsics-in-cython-results-in-sigsegv ) is similar to our issue at the assembly level.

Also, I noticed other developers that get segmentation faults originating from Cython code tend to add nogil or some other methods of manipulating the Python garbage collector to help with similar issues.

Here are some helpful links on SSE and alignment

https://stackoverflow.com/questions/47510783/why-does-unaligned-access-to-mmaped-memory-sometimes-segfault-on-amd64

https://stackoverflow.com/questions/841433/are-stack-variables-aligned-by-the-gcc-attribute-alignedx

@totaam
Collaborator Author

totaam commented Mar 4, 2019

2019-03-04 06:19:44: antoine uploaded file hybi-log-addresses.patch (1.0 KiB)

print addresses used with 32-bit accesses

@totaam
Collaborator Author

totaam commented Mar 4, 2019

What do you mean by "the problem is with your system" and "another symptom of the more general problem you have"?

I mean that no-one else has reported any problems with the codecs and that every time I have seen errors like these reported it ended up being a mistake during system setup. (wrong arch, wrong distro version, mixed source installation, etc)

To reproduce the issue, install Ubuntu or Xubuntu on a system ...

As per comment:8, I had already done a test install in a VM 3 weeks ago.

Add the WinSwitch Beta repo (deb http://winswitch.org/beta/ cosmic main)

As per the installation instructions, the beta repo should not be installed without also installing the stable repo.
(it may or may not work correctly without)

Install Paramiko, Websockify, and all other listed dependencies for Xpra.

The dependencies should be installed automatically when you install xpra, and websockify is no longer used in 2.5 (as per #2121)

I disassembled .. floating-point move

Are you sure that those instructions are actually used?
We don't do any floating point operations in that whole module.

If the pointer is not 16-byte aligned, this will cause a segmentation fault.

Unless gcc does something really weird with vectorization, all accesses are 32-bit only and 4-byte aligned.
Run the python ./unittests/unit/net/cyxor_hybi_test.py with the patch above applied to verify the value of the pointers.
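
Independently of that patch, the alignment of a Python-level buffer can be inspected with ctypes. This is a sketch with hypothetical helper names (`buffer_address`, `misalignment`); it shows the address of a bytearray's storage and says nothing about the pointers the Cython code actually ends up using.

```python
import ctypes


def buffer_address(buf: bytearray) -> int:
    """Memory address of the first byte of a (non-empty) bytearray."""
    view = (ctypes.c_char * len(buf)).from_buffer(buf)
    return ctypes.addressof(view)


def misalignment(address: int, boundary: int) -> int:
    """Distance in bytes past the previous `boundary`-byte boundary (0 = aligned)."""
    return address % boundary
```

For example, `misalignment(buffer_address(bytearray(64)), 4)` reports whether that buffer starts on a 4-byte boundary, and adding an offset to the address shows how slicing into the middle of a buffer changes the alignment.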

Also, I noticed other developers that get segmentation faults originating from Cython code tend to add nogil or some other methods of manipulating the Python garbage collector to help with similar issues.

We don't release the GIL in this particular module. Adding nogil would definitely not help, if anything it can cause more problems.

So, like I said: if I can't reproduce the problem, I can't fix it.

@totaam
Collaborator Author

totaam commented Mar 9, 2019

2019-03-09 18:41:16: DevynCJohnson commented


Just to keep you updated, I am currently in the process of obtaining an online cloud-computing account which I will use to try Xpra v2.5-beta (both WinSwitch repos) since it appears to have issues running directly on a fresh Ubuntu system with the i7-8700K Intel processor. True, I could try a virtual machine, but I want to try Xpra on another freshly installed system that will not be using nor involved with the i7-8700K Intel processor. Also, considering that you have never had such issues running Xpra within Docker, that may be the best option.

Once I complete this testing, I will report back.

Either way, I will soon be working on optimizing and enhancing the code for Xpra (like we mentioned on our video-call).

@totaam
Collaborator Author

totaam commented Mar 10, 2019

I want to try Xpra on another freshly installed system

Please keep a log of every terminal command you run to get it installed, so this can be reproduced somewhere else if need be.

Either way, I will soon be working on optimizing and enhancing the code for Xpra

Please create a separate ticket for that, for more information see Performance (out of date) and #620: you need to use the profiling tools to identify the locations that may need optimizing then the automated tests (#2112) to validate changes.

@totaam
Collaborator Author

totaam commented Mar 16, 2019

2019-03-16 01:47:43: DevynCJohnson commented


I tried Xpra in Docker on AWS and it still has a segmentation fault (both Python2 and Python3). I attached the needed files.

@totaam
Collaborator Author

totaam commented Mar 16, 2019

2019-03-16 01:48:14: DevynCJohnson uploaded file Dockerfile (3.5 KiB)

Dockerfile

@totaam
Collaborator Author

totaam commented Mar 16, 2019

2019-03-16 01:48:40: DevynCJohnson uploaded file Docker_Log.txt (4.3 KiB)

Log from AWS

@totaam
Collaborator Author

totaam commented Mar 16, 2019

2019-03-16 01:48:59: DevynCJohnson uploaded file init_xpra.sh (2.0 KiB)

Entrypoint

@totaam
Collaborator Author

totaam commented Mar 16, 2019

2019-03-16 01:49:16: DevynCJohnson uploaded file gpg.asc (9.1 KiB)

gpg.asc

@totaam
Collaborator Author

totaam commented Mar 16, 2019

As I suspected all along, there is something fundamental and non-standard that you're modifying on your system.
Setting PYTHONOPTIMIZE=2 breaks all sorts of things: Cython extensions, Pillow (issue 3232: Pillow cannot be loaded in python optimize (2) mode) and is not going to optimise anything useful.

More information here: What does Python optimization (-O or PYTHONOPTIMIZE) do?

As of r22087 we will now print a big warning when the flag is set.
If you really want to optimise things, use the profiling tools (ie: #620) and work from there.
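
For anyone wanting to check their own environment: the interpreter exposes the active optimize level as sys.flags.optimize. Level 1 (-O) strips assert statements, and level 2 (-OO or PYTHONOPTIMIZE=2) additionally discards docstrings. The sketch below shows one way to detect this; it is a hypothetical helper for illustration, not the warning code added to xpra.

```python
import os
import sys
from typing import Optional


def optimize_warning(level: int, env_value: Optional[str]) -> Optional[str]:
    """Return a warning message for a non-zero optimize level, else None."""
    if level <= 0:
        return None
    effects = ["assert statements are stripped"]
    if level >= 2:
        effects.append("docstrings are discarded")
    if env_value:
        source = "PYTHONOPTIMIZE=%s" % env_value
    else:
        source = "a -O command line flag"
    return "Warning: Python optimize level %i (%s): %s" % (
        level, source, ", ".join(effects))


def current_optimize_warning() -> Optional[str]:
    """Check the running interpreter and its environment."""
    return optimize_warning(sys.flags.optimize, os.environ.get("PYTHONOPTIMIZE"))
```

Calling `current_optimize_warning()` at the top of an entrypoint script, or inside a container, reveals whether an ENV line or exported variable has switched the interpreter into this mode.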

@totaam
Collaborator Author

totaam commented Mar 17, 2019

2019-03-17 09:43:51: DevynCJohnson commented


I am completely confused. On a fresh install of Ubuntu with the Winswitch repo added and all other steps mentioned in comment 27 ( https://www.xpra.org/trac/ticket/2142#comment:27 ), how does the PYTHONOPTIMIZE variable get set? In all these fresh installs and attempts to get Xpra working that was the first time I set the variable. Also, when removing that line from the Docker image and script, Xpra still fails to work.

@totaam
Collaborator Author

totaam commented Mar 17, 2019

Xpra works here, as soon as I remove it from the dockerfile.
If it still doesn't work for you, maybe there's something else that is changed.

@totaam
Collaborator Author

totaam commented Mar 17, 2019

2019-03-17 10:41:31: DevynCJohnson commented


I had removed the PYTHONOPTIMIZE line as well, but Xpra still did not work. Did you manage to get Xmahjongg to work in the web-browser? If so, what does the Dockerfile that you are using look like? A better question may be, how are you running Xpra to get it to work?

@totaam
Collaborator Author

totaam commented Mar 17, 2019

I had removed the PYTHONOPTIMIZE line as well, but Xpra still did not work.

There are 2 of them, not just one. Remove all of them.

Did you manage to get Xmahjongg to work in the web-browser?

Not with your dockerfile and script, because xmahjongg is not on the $PATH.
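
A quick way to catch this kind of problem before starting the server is to check that every command passed to --start can be resolved on the $PATH. A minimal sketch using the standard library; `missing_commands` is a hypothetical helper name.

```python
import shutil
from typing import Iterable, List


def missing_commands(commands: Iterable[str]) -> List[str]:
    """Return the commands that cannot be found on the current $PATH."""
    return [command for command in commands if shutil.which(command) is None]
```

Running `missing_commands(["xpra", "xmahjongg"])` inside the container, as the same user and with the same environment as the entrypoint script, lists whatever would fail to launch.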

@totaam totaam closed this as completed Mar 17, 2019
@totaam totaam added the v2.4.x label Jan 22, 2021