Add CUDA & OpenCL support #227

crazyks · 2017-07-18T02:49:45Z

Guetzli is an awesome JPEG encoder; however, it is a little slow. To speed it up, we have added CUDA and OpenCL support to Guetzli. We also optimized some procedures and added full JPEG format support.

This work was done by strongtu, ianhuang, tongzhan and me.

We tested it on our GPU server, and here are the test statistics for one of the sample pictures.

We hope that our code can be merged into the guetzli/master branch :)

jdluzen · 2017-07-23T22:53:08Z

clguetzli/clguetzli.cl.h

+#ifdef __cplusplus
+#ifndef __CUDACC__
+#include "CL/cl.h"
+#include "cuda.h"


I'm having trouble building OpenCL-only with the Intel SDK; this seems to be part of it.
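
One way to keep an OpenCL-only build from needing the CUDA headers is to put each backend's include behind its own compile-time switch. The following is only a sketch of that idea: the macro names `SKETCH_ENABLE_OPENCL` and `SKETCH_ENABLE_CUDA` and the helpers `CompiledBackends` and `SelfCheck` are hypothetical and are not taken from this PR.

```cpp
#include <cassert>
#include <string>

// Hypothetical switches (not the PR's names); enable with
// -DSKETCH_ENABLE_OPENCL and/or -DSKETCH_ENABLE_CUDA.
#ifdef SKETCH_ENABLE_OPENCL
#include "CL/cl.h"  // pulled in only when the OpenCL backend is built
#endif
#ifdef SKETCH_ENABLE_CUDA
#include "cuda.h"   // pulled in only when the CUDA backend is built
#endif

// Reports which backends were compiled into this translation unit.
std::string CompiledBackends() {
  std::string out = "cpu";
#ifdef SKETCH_ENABLE_OPENCL
  out += "+opencl";
#endif
#ifdef SKETCH_ENABLE_CUDA
  out += "+cuda";
#endif
  return out;
}

// Sanity check: the CPU path is always present, whatever the switches are.
void SelfCheck() { assert(CompiledBackends().rfind("cpu", 0) == 0); }
```

Compiled with neither switch defined, the file does not reference `cuda.h` or `CL/cl.h` at all, so a machine with only one SDK installed can still build the backend it has.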

jdluzen · 2017-07-24T00:32:31Z

clguetzli/clbutter_comparator.h

+        size_t len, size_t offset,
+        const float* __restrict__ multipliers,
+        const float* __restrict__ inp,
+        double border_ratio,


`border_ratio` should be `float` here; otherwise I get a linking error.

ianhuang-777 · 2017-07-25

Try setting the OpenCL version to 2.0 or later in the OpenCL Code Builder settings.

Bukashk0zzz · 2017-07-25T06:29:50Z

Any news on this?

robryk · 2017-07-28T15:21:07Z

Thanks for this PR; it's quite large, not very trivial, and I don't speak Chinese, so it'll take me some more time to look through it.

@crazyks @pornel Please pay no attention to clabot complaints.

swrobel · 2017-08-31T21:09:40Z

README.md

+```
+You can pass a `--c` parameter to enable the procedure optimization or `--cuda` parameter to use the CUDA acceleration or `--opencl` to use the OpenCL acceleration.
+
+If you have any question about CUDA/OpenCL support, please contact strongtu@tencent.com, ianhuang@tencent.com or chriskzhou@tencent.com.


Maybe create a dropbox email like guetzli-support@tencent.com that would go to all 3 of you (and could be adjusted on your end to add/remove people as necessary without having to update these docs)
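
For readers unfamiliar with the three flags quoted from the README diff (`--c`, `--cuda`, `--opencl`), here is a minimal sketch of how such flags could be mapped to a processing mode. The enum `SketchMode` and the function `ParseMode` are hypothetical names for illustration; the PR's actual parsing code may differ, including in how it handles several flags at once.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical enum; the PR's real type and names may differ.
enum class SketchMode { kDefault, kProcedureOpt, kCuda, kOpenCL };

// Scans argv and returns the mode chosen by the last recognized flag.
// Unrecognized arguments (file names, other options) are ignored.
SketchMode ParseMode(int argc, const char* const* argv) {
  assert(argc == 0 || argv != nullptr);
  SketchMode mode = SketchMode::kDefault;
  for (int i = 1; i < argc; ++i) {  // argv[0] is the program name
    if (std::strcmp(argv[i], "--c") == 0) {
      mode = SketchMode::kProcedureOpt;
    } else if (std::strcmp(argv[i], "--cuda") == 0) {
      mode = SketchMode::kCuda;
    } else if (std::strcmp(argv[i], "--opencl") == 0) {
      mode = SketchMode::kOpenCL;
    }
  }
  return mode;
}
```

In this sketch, an invocation with no acceleration flag falls back to `SketchMode::kDefault`, which stands in for the original CPU path.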

FireEmerald · 2017-09-01T21:15:21Z

Could one of you provide a working Windows binary with CUDA support? That would be awesome.

mikhailnov · 2017-09-25T11:08:20Z

Will this require non-free CUDA (cuda.h) libraries for compilation? If yes, probably a compilation flag is needed to disable them...

leafjungle · 2017-09-30T03:47:35Z

Where is `cu_mem` defined? I can't find it in either CUDA or OpenCL.

ianhuang-777 · 2017-10-09T03:21:45Z

@leafjungle It's defined in clguetzli.cl.h, not an original CUDA definition.

joyjoker2017 · 2017-10-17T07:17:41Z

Which GPU was used in your environment?

mikhailnov · 2017-10-31T20:20:52Z

Waiting impatiently for this to be merged ))

rogierlommers · 2017-11-02T06:25:15Z

Yes, we all do!

ianhuang-777 · 2017-11-02T08:52:56Z

@joyjoker2017 Tesla M40

fvm · 2017-11-04T00:07:53Z

🙏 👍 Fingers crossed 🤞 😄

fvm · 2017-12-08T07:43:12Z

Bump...

DanielBiegler · 2017-12-12T22:52:32Z

😴

tina-junold · 2018-04-12T06:36:14Z

Any news on this?

alexblhr · 2018-06-19T13:36:27Z

Is this ever going to be merged ? :)

jonathas · 2018-12-12T17:24:00Z

Any update on this? Please :)

magicdoublem · 2019-03-22T22:01:10Z

Did anyone ever succeed in building those binaries and want to share them? ;-)

EwoutH · 2019-12-12T22:06:01Z

This looks incredible! @crazyks could you resolve the conflicts, update the dependencies and rebase? I will check with someone from Google if we can merge into master. If not, we can create a fork.

doterax · 2021-07-31T22:06:21Z

Hi there. I have built Guetzli with CUDA for Windows.

You can download binaries from here.

twitnic · 2023-04-25T14:32:30Z

@crazyks could you resolve the conflict?

doterax · 2023-04-26T08:43:29Z

> @crazyks could you resolve the conflict?

I continue to support Guetzli with CUDA and OpenCL here, where you can also download Windows binaries.

strongtu added 30 commits June 2, 2017 17:29

Finish the code flow, but the computed results still need correcting

e5efe98

cuDiffmapOpsinDynamicsImage cuComputeBlockZeroingOrder cuMask

Fix the cuMask computation result

9a6a17c

Restructure the cu code

3345026

Adjust code

5d49f24

Simplify code

b4d0ffe

Adjust cu compilation

18f9672

Add a macro switch for CUDA compilation

601e367

Optimize the clSetKernelArg code

c0bab47

Trim code

39bcbd1

Switch cu compilation back to ahead-of-time compilation with nvcc

1cb6e52

Continue simplifying code

Change how the mode is handled

cd2e614

Copy memory asynchronously

598603b

Finish CUDA parallel optimization; computed results are correct

8c29f1f

Currently slightly slower than OpenCL; to be analyzed and optimized

Fix the command-line message; Max Thread Per MP and SP are different concepts

d13a9ba

Tune parameters to see how performance changes

f9ba50e

Fix the macro that detects 64-bit vs. 32-bit

cce5bc3

Optimize

3237a50

cuEdgeDetectorMapEx cuEdgeDetectorLowFreqEx cuRemoveBorderEx cuAddBorderEx

Restore support for factor=2; the performance difference is small, but compile time is longer

9f8597d

Improve the build and Test scripts

61fde3c

Reduce some redundant data copies in the kernels

8fe8454

Merge branch 'master' of https://github.com/ianhuang-777/guetzli

c90b88a

Optimize clDiffmapOpsinDynamicsImageEx

1e4b4f4

Add some debugging information

3995006

Use float instead of double in kernel computations to save compute time

1aa86d5

Fix array length

8ed0ce3

I don't know why either, but deleting this blank line makes the computed result correct

7aff164

There must be a bug in OpenCL's code generation!

Fix the build configuration

f795ad1

Fix warnings

13abc16

Try a different set of compiler flags

0c85b8f

Merge branch 'googleMaster'

45300b4

Conflicts: guetzli/butteraugli_comparator.cc third_party/butteraugli/butteraugli/butteraugli.h

jdluzen reviewed Jul 23, 2017

jdluzen reviewed Jul 24, 2017

crazyks mentioned this pull request Jul 24, 2017

Add CUDA & OpenCL support #228

Closed

crazyks closed this Jul 25, 2017

crazyks reopened this Jul 25, 2017

kornelski mentioned this pull request Aug 23, 2017

CUDA and/or OpenCL usage for DCT and other functions? #44

Open

swrobel reviewed Aug 31, 2017

swrobel mentioned this pull request Aug 31, 2017

Automatic translation isaacs/github#1009

Open

DanielBiegler mentioned this pull request Nov 21, 2018

Extremely slow performance #50

Open
