stylizeVideo_deepflow.sh produces video which is not stylized #7

Open
agilebean opened this issue Apr 3, 2018 · 24 comments
Labels: framework-issue, help wanted

Comments

@agilebean

stylizeVideo_deepflow.sh runs without any error message and produces the out*.png files and the .mp4 file.
However, the resulting video is not stylized: it is mainly a single color (here red), whereas the original video shows people. What is wrong? Is it due to DeepFlow?

Here are the first three frames for an impression:
[Images: out-00001, out-00002, out-00003]

Here's the console output:

bash stylizeVideo_deepflow.sh input/dance1.mov ./models/checkpoint-mosaic-video.t7

In case of multiple GPUs, enter the zero-indexed ID of the GPU to use here, or enter -1 for CPU mode (slow!). [0]
 >

Which backend do you want to use?   For Nvidia GPUs it is recommended to use cudnn if installed. If not, use nn.   For non-Nvidia GPU, use opencl (not tested). Note: You have to have the given backend installed in order to use it. [cudnn]
 >

Please enter a resolution at which the video should be processed, in the format w:h, or leave blank to use the original resolution. If you run out of memory, reduce the resolution.
 > 77:128

Please enter a downsampling factor (on a log scale, integer) for the matching algorithm used by DeepFlow. If you run out of main memory or optical flow estimation is too slow, slightly increase this value, otherwise the default value will be fine. [2]
 >
ffmpeg version 3.4.2-1+b1 Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 7 (Debian 7.3.0-4)
  configuration: --prefix=/usr --extra-version=1+b1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
  libavutil      55. 78.100 / 55. 78.100
  libavcodec     57.107.100 / 57.107.100
  libavformat    57. 83.100 / 57. 83.100
  libavdevice    57. 10.100 / 57. 10.100
  libavfilter     6.107.100 /  6.107.100
  libavresample   3.  7.  0 /  3.  7.  0
  libswscale      4.  8.100 /  4.  8.100
  libswresample   2.  9.100 /  2.  9.100
  libpostproc    54.  7.100 / 54.  7.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input/dance1.mov':
  Metadata:
    major_brand     : qt
    minor_version   : 0
    compatible_brands: qt
    creation_time   : 2016-02-14T23:45:54.000000Z
    com.apple.quicktime.location.ISO6709: +38.7090-009.1443+005.006/
    com.apple.quicktime.make: Apple
    com.apple.quicktime.model: iPhone 6 Plus
    com.apple.quicktime.software: 9.3
    com.apple.quicktime.creationdate: 2016-02-14T23:45:53+0000
  Duration: 00:00:11.59, start: 0.000000, bitrate: 10879 kb/s
    Stream #0:0(und): Video: h264 (Baseline) (avc1 / 0x31637661), yuv420p(tv, bt709), 1280x720, 10752 kb/s, 30.03 fps, 30 tbr, 600 tbn, 1200 tbc (default)
    Metadata:
      rotate          : 90
      creation_time   : 2016-02-14T23:45:54.000000Z
      handler_name    : Core Media Data Handler
      encoder         : H.264
    Side data:
      displaymatrix: rotation of -90.00 degrees
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 85 kb/s (default)
    Metadata:
      creation_time   : 2016-02-14T23:45:54.000000Z
      handler_name    : Core Media Data Handler
    Stream #0:2(und): Data: none (mebx / 0x7862656D), 32 kb/s (default)
    Metadata:
      creation_time   : 2016-02-14T23:45:54.000000Z
      handler_name    : Core Media Data Handler
    Stream #0:3(und): Data: none (mebx / 0x7862656D), 0 kb/s (default)
    Metadata:
      creation_time   : 2016-02-14T23:45:54.000000Z
      handler_name    : Core Media Data Handler
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> ppm (native))
Press [q] to stop, [?] for help
Output #0, image2, to 'dance1/frame_%05d.ppm':
  Metadata:
    major_brand     : qt
    minor_version   : 0
    compatible_brands: qt
    com.apple.quicktime.creationdate: 2016-02-14T23:45:53+0000
    com.apple.quicktime.location.ISO6709: +38.7090-009.1443+005.006/
    com.apple.quicktime.make: Apple
    com.apple.quicktime.model: iPhone 6 Plus
    com.apple.quicktime.software: 9.3
    encoder         : Lavf57.83.100
    Stream #0:0(und): Video: ppm, rgb24, 77x128, q=2-31, 200 kb/s, 30 fps, 30 tbn, 30 tbc (default)
    Metadata:
      encoder         : Lavc57.107.100 ppm
      creation_time   : 2016-02-14T23:45:54.000000Z
      handler_name    : Core Media Data Handler
    Side data:
      displaymatrix: rotation of -0.00 degrees
frame=  348 fps= 81 q=-0.0 Lsize=N/A time=00:00:11.60 bitrate=N/A speed= 2.7x
video:10053kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown

Starting optical flow computation as a background task...
Starting video stylization...
Model loaded.
Elapsed time for stylizing frame independently:0.333889
Writing output image to dance1/out-00001.png
Waiting for file "dance1/flow_77:128/reliable_2_1.pgm"
Elapsed time for stylizing frame:0.023393
Writing output image to dance1/out-00002.png
Waiting for file "dance1/flow_77:128/reliable_3_2.pgm"
Elapsed time for stylizing frame:0.019732
Writing output image to dance1/out-00003.png
Waiting for file "dance1/flow_77:128/reliable_4_3.pgm"
Elapsed time for stylizing frame:0.023397
Writing output image to dance1/out-00004.png
Waiting for file "dance1/flow_77:128/reliable_5_4.pgm"
Elapsed time for stylizing frame:0.023525999999999
Writing output image to dance1/out-00005.png
Waiting for file "dance1/flow_77:128/reliable_6_5.pgm"
Elapsed time for stylizing frame:0.015667
Writing output image to dance1/out-00006.png
Waiting for file "dance1/flow_77:128/reliable_7_6.pgm"
Elapsed time for stylizing frame:0.026106
Writing output image to dance1/out-00007.png
Waiting for file "dance1/flow_77:128/reliable_8_7.pgm"
Elapsed time for stylizing frame:0.013540000000001
Writing output image to dance1/out-00008.png
Waiting for file "dance1/flow_77:128/reliable_9_8.pgm"
Elapsed time for stylizing frame:0.023158
Writing output image to dance1/out-00009.png
Waiting for file "dance1/flow_77:128/reliable_10_9.pgm"
Elapsed time for stylizing frame:0.024933
Writing output image to dance1/out-00010.png
Waiting for file "dance1/flow_77:128/reliable_11_10.pgm"
Elapsed time for stylizing frame:0.0214
Writing output image to dance1/out-00011.png
Waiting for file "dance1/flow_77:128/reliable_12_11.pgm"
Elapsed time for stylizing frame:0.023577
Writing output image to dance1/out-00012.png
Waiting for file "dance1/flow_77:128/reliable_13_12.pgm"
Elapsed time for stylizing frame:0.025696
Writing output image to dance1/out-00013.png
...
@manuelruder
Owner

In the past I often had issues with videos that have a special color format (10 bit etc.). This software and its libraries only work with normal, consumer-ready videos. You could try different videos from different sources.
It clearly has nothing to do with DeepFlow.
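
One way to rule out an unusual color format before running the script is to query the input's pixel format with ffprobe. The following is a minimal sketch, assuming ffprobe is installed alongside ffmpeg. The helper names `is_8bit_pix_fmt` and `probe_pix_fmt` and the list of accepted formats are illustrative assumptions, not part of this repository. For comparison, the ffmpeg log in the first post reports yuv420p for the input stream.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only for common 8-bit pixel formats.
# The accepted list is an assumption and is not exhaustive.
is_8bit_pix_fmt() {
  case "$1" in
    yuv420p|yuvj420p|yuv422p|yuvj422p|yuv444p|yuvj444p|rgb24|bgr24|gray)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Print the pixel format of the first video stream of a file (needs ffprobe).
probe_pix_fmt() {
  ffprobe -v error -select_streams v:0 \
    -show_entries stream=pix_fmt \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Usage (requires ffprobe and an input file):
#   fmt=$(probe_pix_fmt input/dance1.mov)
#   is_8bit_pix_fmt "$fmt" || echo "unusual pixel format: $fmt"
```

A format such as yuv420p10le would be flagged by this check, while plain yuv420p would pass.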

@agilebean
Author

agilebean commented Apr 4, 2018

Unfortunately, I tried with a different video format (from an animation movie) but still get the same single-colored output files:

[Image: out-00003]

About DeepFlow, are you really sure? I did get one error message from the DeepFlow static binary:

deepmatching-static: conv.cpp:710: void fastconv(float_image*, float_layers*, int, int, int, float, int, res_scale*): Assertion res->res_map.pixels || !"error: ran out of memory before sgemm"'`
and at the end:

run-deepflow.sh: line 13: 12432 Aborted                 (core dumped) ./deepmatching-static $1 $2 -nt 0 -downscale $4
     12433 Killed                  | ./deepflow2-static $1 $2 $3 -match

@manuelruder
Copy link
Owner

If this issue was caused by DeepFlow, at least the first frame would have been stylized correctly. The first frame is generated without dependency on a previous frame or optical flow. In fact, for the first frame the algorithm is equal to fast-neural-style.

@michaeltandecki

For what it's worth, we've had the same issue as @agilebean.

We tried many different scenarios:

  • ffmpeg vs. avconv
  • deepflow vs. flownet
  • native resolution vs. reduced
  • different input videos (generated by quicktime on mac, and camera from Android)
  • gpu vs. cpu

@manuelruder, it would be great if you could include a simple video in this repository that should work, as a sanity check.

@SeirVed

SeirVed commented May 16, 2018

Having the same problem.
I tried multiple different containers, even sitting through the 800+ seconds per frame of CPU rendering.
Seconding the request for a tried-and-true sample to test on.

@AndrewGibb

I too am seeing these kinds of results. I'm using ffmpeg, DeepFlow, and half resolution. I would also like a known-good input video, along with the parameters to run it with, to test that everything is behaving.

@manuelruder
Owner

To analyse this further, you could run fast-neural-style on the extracted video frames. If there is a case where fast-neural-style produces a correct stylization but mine fails, let me know.

As an example, you could take the five video frames from here. Then run stylizeVideo_*.sh example/marple8_%02d.ppm <path_to_video_model> [<path_to_image_model>]. (This works because path_to_video can also be an already extracted sequence, or anything else that can be used as input to ffmpeg.)

@AndrewGibb

@manuelruder Thanks for the example frames. These do not work on my installation. I see results which look very similar to the frames in the first post. Can you suggest any steps to work out what's wrong?

@manuelruder
Owner

Yes, see my post above...

@manuelruder
Owner

manuelruder commented May 24, 2018

P.S. I've seen that a lot of people are reporting similar issues for fast-neural-style, see for example here. There it was suggested that a recent update to torch (or to one of its packages) caused this issue. Unfortunately, there are no official releases or even a simple changelog. Instead, if you install torch, you get whatever the current master is at that time. Therefore I have no idea what I would need to change in order to fix this. (I'm not actively using torch anymore; like probably most other people, I switched to a more recent framework.)

@manuelruder added the framework-issue and help wanted labels on May 24, 2018
@kboruff

kboruff commented Jun 11, 2018

What framework are you using instead? What would it take to port the torch elements to the new framework?

@manuelruder
Owner

I'm currently using PyTorch. It's more actively developed, although there are also breaking changes from time to time. But at least it has proper versioning. Code for fast-neural-style exists in PyTorch; one could use it as a base.

@AndrewGibb

I seem to have got this working. I used a clean install of Ubuntu 14.04 and CUDA 7.5. Aside from following the steps in README.md, I did the following:

  • Restart after installation
  • Ensure cudnn is version 5.0 (although the error messages make this clear)
  • After all torch installation is complete, run the update.sh script which appears in the torch directory.

I'm using the static binaries of DeepFlow and deepmatching.

For reference, my previous attempt on a more powerful machine used Ubuntu 16.04, CUDA 9.2, cudnn 5.0. I had run torch/update.sh on this machine, and I still got poor results, similar to those in the first post on this issue.

I tested this code on the few frames suggested by Manuel just above my last post. I get properly stylized results.

@anon19831

Has anyone tried to set this up on AWS? I have been trying for a few days and can't get an instance up and working. I get the same problems and results as @agilebean. I have gone through all the troubleshooting other people have done here and on https://github.com/jcjohnson/fast-neural-style/issues. I am using Ubuntu 14.04, CUDA 7.5, and cudnn 5.0, and ran bash update.sh in the torch directory like @AndrewGibb said, and that still gave me erroneous results. If you could share a working AMI, that would also be appreciated.

@positlabs

I am also seeing this issue. Attempted building on ubuntu 16 with various versions of cuda. Downgraded to cuda 7.5, which forced me into ubuntu 14. In the end, this may have been the wrong path because I initially got the exact same results regardless of lib versions.

In running through the debugging steps mentioned above, I found that I could get it to work with some elbow grease. The issue appears to be related to how ffmpeg is handling the video > ppm conversion. I manually split the frames into pngs, then tested using an input like %05d.png and it produces stylized frames (although the output is a single png). After sending the frames back through ffmpeg (png > mp4), I get something that works:

[Image: stylized]

This is a little odd because the png > ppm conversion works, but not the mp4 > ppm. I wonder if there's some missing build flag in my ffmpeg version.
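
The manual workaround described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the exact commands used: the `FFMPEG` variable, the function names, the directory names, the frame rate, and the encoder options are all hypothetical, and `-pix_fmt rgb24` is an addition that forces 8-bit RGB frames.

```shell
#!/usr/bin/env bash
# Sketch: video -> PNG frames -> stylize -> stylized PNG frames -> mp4.
# FFMPEG is an overridable variable introduced here for illustration.
FFMPEG=${FFMPEG:-ffmpeg}

# Split a video into numbered PNG frames.
# -pix_fmt rgb24 (an assumption, not from the thread) forces 8-bit RGB output.
extract_frames() {  # usage: extract_frames <video> <frame_dir>
  mkdir -p "$2"
  $FFMPEG -i "$1" -pix_fmt rgb24 "$2/%05d.png"
}

# Reassemble stylized frames named out-00001.png, out-00002.png, ... into an mp4.
# Note: libx264 with yuv420p requires an even width and height.
assemble_video() {  # usage: assemble_video <stylized_dir> <fps> <output.mp4>
  $FFMPEG -framerate "$2" -i "$1/out-%05d.png" -c:v libx264 -pix_fmt yuv420p "$3"
}

# Usage (requires ffmpeg and a working installation of this repository):
#   extract_frames input/dance1.mov frames
#   bash stylizeVideo_deepflow.sh frames/%05d.png <path_to_video_model>
#   assemble_video <dir_with_out_frames> 30 stylized.mp4
```

Keeping the two ffmpeg steps in separate functions makes it easier to test each conversion on its own when tracking down which step produces the broken frames.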

For reference, I'm using the following lib versions:

  • flownet2 docker modded for ubuntu 14 (FROM nvidia/cuda:7.5-cudnn5-devel-ubuntu14.04)
  • ubuntu 14.04
  • cudnn 5
  • cuda 7.5
  • latest torch
  • ffmpeg from ppa:mc3man/trusty-media

Here's the ffmpeg build info, in case it helps track down the issue

ffmpeg version 3.4 Copyright (c) 2000-2017 the FFmpeg developers
  built with gcc 4.8 (Ubuntu 4.8.4-2ubuntu1~14.04.4)
  configuration: --extra-libs=-ldl --prefix=/opt/ffmpeg --mandir=/usr/share/man --enable-avresample --disable-debug --enable-nonfree --enable-gpl --enable-version3 --enable-libopencore-amrnb --enable-libopencore-amrwb --disable-decoder=amrnb --disable-decoder=amrwb --enable-libpulse --enable-libfreetype --enable-gnutls --disable-ffserver --enable-libx264 --enable-libx265 --enable-libfdk-aac --enable-libvorbis --enable-libtheora --enable-libmp3lame --enable-libopus --enable-libvpx --enable-libspeex --enable-libass --enable-avisynth --enable-libsoxr --enable-libxvid --enable-libvidstab --enable-libwavpack --enable-nvenc --enable-libzimg
  libavutil      55. 78.100 / 55. 78.100
  libavcodec     57.107.100 / 57.107.100
  libavformat    57. 83.100 / 57. 83.100
  libavdevice    57. 10.100 / 57. 10.100
  libavfilter     6.107.100 /  6.107.100
  libavresample   3.  7.  0 /  3.  7.  0
  libswscale      4.  8.100 /  4.  8.100
  libswresample   2.  9.100 /  2.  9.100
  libpostproc    54.  7.100 / 54.  7.100

I have a dockerfile that I can publish once I have time to clean it up a bit.

@manuelruder
Owner

manuelruder commented Nov 29, 2018

This is what I observed with some videos having a higher bit depth. (see my first post)

Converting to png and then to ppm could reduce the bit depth to 8 bit and this could be the reason why it worked for you.

However, people also reported this issue with fast-neural-style, where they found that instance norm was not compatible with a specific cuda, cudnn or torch version, and they didn't use ffmpeg. Also note that AndrewGibb reported that the example images I provided didn't work for him.

I think we have multiple distinct issues here.

@bafonso

bafonso commented Dec 4, 2018

> I have a dockerfile that I can publish once I have time to clean it up a bit.

I'd love to see that dockerfile... In my initial attempt I could not build flownet using FROM nvidia/cuda:7.5-cudnn5-devel-ubuntu14.04

@positlabs

It's a work in progress, but here it is: https://github.com/positlabs/fast-artistic-videos-docker

I was able to get stylized videos from it last Friday, but I tried again today and it failed. I'll keep working on it.

@bafonso

bafonso commented Dec 5, 2018

Even though I'd like to use the docker approach, I can report that I was previously getting garbled images like the OP but can now get useful output using CUDA 9.2 and cudnn 7 (at least for the first 100 frames; somehow it stopped producing output images after that). What I did was set CUDNN_PATH="/usr/local/cuda/lib64/libcudnn.so.7" and TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__" and run torch's update.sh. Then I installed the cudnn.torch bindings (from GitHub) from branch R7:

git clone https://github.com/soumith/cudnn.torch -b R7
cd cudnn.torch
luarocks make cudnn-scm-1.rockspec

@positlabs

My docker build is working now. The trick was to run torch's update.sh script AFTER all of the other dependencies were installed.

@arunasank

I see this issue with flownet, but not with deepflow.

@teidenzero

I have found that by re-exporting my videos in .mov format with PNG codec and 8-bit depth, I completely got rid of the issue.

@teidenzero

# Re-encode the input as a .mov file (reported above to give 8-bit depth)
$FFMPEG -i "$1" -vf "scale=$resolution" -crf 0 -c:v libx264 -preset veryslow "${filename}/${filename}.mov"

# Extract the ppm frames from the re-encoded video
$FFMPEG -i "${filename}/${filename}.mov" -vf "scale=$resolution" "${filename}/frame_%04d.ppm"

@ryletko

ryletko commented May 12, 2020

> Even though I'd like to use the docker approach I can report that I was previously getting garbled images like the OP but can now get useful output (at least the first 100 frames, somehow it stopped getting output images after...) using CUDA 9.2 and cudnn 7. What I did was to set CUDNN_PATH="/usr/local/cuda/lib64/libcudnn.so.7" and TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__" and update.sh torch. Then I used cudnn.torch bindings (from github) with branch R7:
>
> git clone https://github.com/soumith/cudnn.torch -b R7
> cd cudnn.torch
> luarocks make cudnn-scm-1.rockspec

I can confirm that this solution works! I had the same issue with CUDA 8, cuDNN 7.1 and Ubuntu 16. Then I set up everything from scratch with CUDA 9.2 and cuDNN 7.6 and still had the same poor results until I updated torch as bafonso advised. I also had to install 'cuDNN Library for Linux' in addition to 'cuDNN Runtime Library for Ubuntu16.04 (Deb)' and 'cuDNN Developer Library for Ubuntu16.04 (Deb)', because otherwise I couldn't find /usr/local/cuda/lib64/libcudnn.so.7. Now I get the proper results.
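
Since a missing libcudnn.so.7 was the stumbling block here, a quick check before setting CUDNN_PATH and running update.sh may save a rebuild. This is a minimal sketch: the helper name `check_cudnn` is hypothetical, and the default path is simply the one quoted above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether the cuDNN shared library exists
# at the given path (default: the path quoted in this thread).
check_cudnn() {
  local path="${1:-/usr/local/cuda/lib64/libcudnn.so.7}"
  if [ -e "$path" ]; then
    echo "found: $path"
  else
    echo "missing: $path" >&2
    return 1
  fi
}

# Usage:
#   check_cudnn && export CUDNN_PATH="/usr/local/cuda/lib64/libcudnn.so.7"
```

If the check fails, installing the 'cuDNN Library for Linux' package, as described above, is what resolved it in this case.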
