Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initial support for Intel FPGA SDK for OpenCL (AOCL) #1474

Merged
merged 27 commits into from
Jul 31, 2018
Merged

Add initial support for Intel FPGA SDK for OpenCL (AOCL) #1474

merged 27 commits into from
Jul 31, 2018

Conversation

oss-dev-somewhere
Copy link
Contributor

This PR adds initial support for Intel FPGA SDK for OpenCL (AOCL).
Currently it only works in CPU emulation mode.
To try this patch, you can use AOCL 17.1 without paid license.

@liangfu
Copy link
Member

liangfu commented Jul 23, 2018

Nice effort.
Can we support targeting opencl for both gpu and accelerator at the same time?

@oss-dev-somewhere
Copy link
Contributor Author

Hi @liangfu , as far as this PR goes, no, we can't.
If we support targeting OpenCL for both GPU and FPGA at the same time,
what is the best way to choose device types (GPU and FPGA) from python script?

@liangfu
Copy link
Member

liangfu commented Jul 23, 2018

i'm not sure. perhaps by defining a different context? usually, ctx=tvm.context('opencl',0)

#
# Possible values:
# - OFF: disbale AOCL
# - board_name: use specific board name for offline compilation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the board name should be passed with target options (e.g. tvm.context("opencl -device=[board name]")).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to use target name option like tgt="aocl -device=de5net_a7".

Init("opencl", "gpu");
#else
Init("opencl", "accelerator");
#endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implement AOCLWorkspace as a subclass of OpenCLWorkspace so that it can coexist with other OpenCL platforms.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introduced AOCLWorkspace.

if (device_type == "accelerator") dtype = CL_DEVICE_TYPE_ACCELERATOR;
#else
if (device_type == "accelerator") dtype = CL_DEVICE_TYPE_DEFAULT;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you want to use CL_DEVICE_TYPE_DEFAULT here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted this code and now works with device_type == "accelerator".

const char* s = data_.c_str();
size_t len = data_.length();
cl_int err;
program_ = clCreateProgramWithSource(w->context, 1, &s, &len, &err);
OPENCL_CHECK_ERROR(err);
#else
OfflineCompile(w, t);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be moved to codegen since compiling OpenCL codes for FPGAs takes very long time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Offline compilation was moved to codegen. See also BuildAOCL().

@oss-dev-somewhere
Copy link
Contributor Author

Hi @kazum thank you for reviewing. I'll fix them.

@oss-dev-somewhere
Copy link
Contributor Author

@kazum I implemented AOCLWorkspace and moved offline compile process to codegen. Please review.

- Install AOCL 17.1 on Ubuntu 16.04.4 LTS.
- Install FPGA device driver.
- Make ICD file. (/etc/OpenCL/vendors/Altera.icd)
- Make FCD file. (/opt/Intel/OpenCL/Boards/de5net.fcd)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add more explanation about what kinds of files we should make.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed document.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I can't install FPGA PCIe driver on Ubuntu 16.04 LTS,

@@ -91,6 +91,9 @@ Target CreateTarget(const std::string& target_name,
} else if (target_name == "sdaccel") {
t->device_type = kDLOpenCL;
t->keys_array.push_back(ir::StringImm::make("sdaccel"));
} else if (target_name == "aocl") {
t->device_type = kDLOpenCL;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this should be kDLAOCL.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

// Compile the .cl file.
Target target = Target::create(target_str);
std::string cmd = "aoc aocl.cl -march=emulator -board=";
cmd += target->device_name;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this doesn't work if we don't specify the '-device' option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added logic to check device name.

}

void AOCLWorkspace::Init() {
OpenCLWorkspace::Init("aocl", "accelerator", "Intel");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "Intel" would match the Intel OpenCL platform for CPU/GPU. Should be "Intel(R) FPGA"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed "Intel" to the exact platform name.

@oss-dev-somewhere
Copy link
Contributor Author

@kazum I made some changes to meet your comments. Will you review again?

import tvm

tgt_host="llvm"
tgt="aocl -device=de5net_a7 -mattr=emulator"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest "-device=s5_ref" for the tutorial. It is the default device of aoc and available without installing additional BSP.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed de5net_a7 to s5_ref.

std::string cmd = "aoc aocl.cl";
if (target_str.find("-mattr=emulator") != std::string::npos) {
cmd += " -march=emulator";
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use target->options() to get target parameters.

  for (std::string option : target->options()) {
    if (option == "-mattr=emulator") {
      cmd += " -march=emulator";
    }
  }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@kazum
Copy link
Contributor

kazum commented Jul 26, 2018

@Ktabata In addition, please add a testcase to test the aocl backend.

@oss-dev-somewhere
Copy link
Contributor Author

@kazum I added testcases.

Copy link
Contributor

@kazum kazum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks.

@oss-dev-somewhere
Copy link
Contributor Author

@kazum Thank you for reviewing. I tested this code on not only emulator but also physical FPGA device. It works.

@tqchen
Copy link
Member

tqchen commented Jul 30, 2018

cc @vegaluisjose @tmoreau89 @comaniac can you also please take a quick look?

@comaniac
Copy link
Contributor

As I saw from the PR, this feature leverages the existing OpenCL code generator for Intel FPGA kernel. It seems to me that this may become annoying for the future improvement because the high-performance OpenCL kernel for Intel FPGA and NVidia GPU is very different. From my personal perspective, separating intel_opencl and nvidia_opencl is necessary, but I'm open if you have a better plan.

@tmoreau89
Copy link
Contributor

tmoreau89 commented Jul 30, 2018

@Ktabata @kazum since we are adding support to both SDAccel and AOCL OpenCL backends, how much common infrastructure should we be using for FPGA-specific code generation? I understand that the OpenCL specs are similar between the two vendors, and that there exist analogous code pragmas between the two specs which can be generated depending on the target we're in.

My opinion is to merge common infrastructure as much as possible early on rather than later. I recommend if you haven't already to coordinate together to reuse common code generation infrastructure as much as possible.

Otherwise, this seems like a good start at supporting Intel FPGAs, I look forward to seeing more ways in which to use TVM to leverage FPGAs.

@kazum
Copy link
Contributor

kazum commented Jul 30, 2018

@comaniac +1 for separating AOCL code generation from the existing codes. Adding CodeGenAlteraOpenCL which inherits from CodeGenOpenCL looks good to me.

@tmoreau89 I think it's a good idea to add a key like hls to both SDAccel and AOCL targets and have common schedulers for high-level synthesis backends. Then both backends can share the same logic for code generation, and vendor-specific code generators would only be in src/codegen/. I'll give it a try in a few days.

At the current stage, this PR only adds support for compiling with AOCL and, IMHO, it looks good to be merged before implementing common code generation infrastructure.

@tmoreau89
Copy link
Contributor

@kazum - great! As long as we plan on converging the back-ends.

It would indeed be good to have an hls key if we plan on encompassing both OpenCL and C-like HLS languages. If we want a common target to only AOCL and SDAccel, then we should make the key more specific, like fpga_opencl to make it clear that it would differ from HLS like languages.
I think it makes more sense to have an all-encompassing key (hls) since scheduling could most likely be reused across HLS-C and OpenCL for FPGAs.

Feel free to approve!

@comaniac
Copy link
Contributor

Minor comment: My experience was the Xilinx HLS C and Intel OpenCL have very different programming model for generating the same architecture. For example, Intel OpenCL uses global variables to represent FIFOs between modules and a module is generated from a kernel function with __kernel OpenCL identifier. On the other hand, Xilinx HLS C uses their built-in data structure to explicitly specify FIFOs. As a result, even both code generators use the same lowered IR, the implementation of each generator may still be very different.

@tmoreau89
Copy link
Contributor

@comaniac that's a good point regarding hardware constructs such as FIFOs.

As a result the conclusion of this discussion is to re-use IR passes for schedule lowering purposes between the different vendors, and have specialized code generators for each vendor to convert lowered TVM IR into OpenCL code.

Let me know if you agree with this approach.

@tmoreau89 tmoreau89 merged commit f711853 into apache:master Jul 31, 2018
sergei-mironov pushed a commit to sergei-mironov/tvm that referenced this pull request Aug 8, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants