Add initial support for Intel FPGA SDK for OpenCL (AOCL) #1474

Merged Jul 31, 2018 (27 commits). Showing changes from 17 commits.
8a5ffdb
Support OpenCL offline compilation
oss-dev-somewhere Jul 13, 2018
8b0eef8
Merge branch 'aocl' of github.com:ktabata/tvm into aocl
oss-dev-somewhere Jul 20, 2018
120979c
AOCL emulation runs.
oss-dev-somewhere Jul 23, 2018
ca137cb
Support OpenCL offline compilation
oss-dev-somewhere Jul 13, 2018
f2257d0
AOCL emulation runs.
oss-dev-somewhere Jul 23, 2018
edd0437
tab to white spaces.
oss-dev-somewhere Jul 23, 2018
8c22538
Resolved conflicts.
oss-dev-somewhere Jul 23, 2018
aca96a5
Fixed macro error.
oss-dev-somewhere Jul 23, 2018
5016f73
Fixed submodule.
oss-dev-somewhere Jul 23, 2018
ed47e4a
Implemented AOCLWorkspace.
oss-dev-somewhere Jul 25, 2018
d51f19e
Fixed document.
oss-dev-somewhere Jul 25, 2018
2c18ee0
Fixed document.
oss-dev-somewhere Jul 25, 2018
fd3d4fa
Deleted macro.
oss-dev-somewhere Jul 25, 2018
5b578e2
Fixed file header.
oss-dev-somewhere Jul 25, 2018
66a46ea
Fixed file header.
oss-dev-somewhere Jul 25, 2018
fb80366
Deleted includes.
oss-dev-somewhere Jul 25, 2018
0030c81
Fixed OpenCL.cmake
oss-dev-somewhere Jul 25, 2018
65ad10d
Fixed platform name for AOCL.
oss-dev-somewhere Jul 25, 2018
d3ab18a
Fixed device type.
oss-dev-somewhere Jul 25, 2018
3a49eee
Fixed document.
oss-dev-somewhere Jul 25, 2018
93d8d9f
Added -mattr=emulator option.
oss-dev-somewhere Jul 25, 2018
b174098
Fixed documentation.
oss-dev-somewhere Jul 25, 2018
68569d9
Fixed documentation.
oss-dev-somewhere Jul 25, 2018
d4d435f
Fixed documentation.
oss-dev-somewhere Jul 25, 2018
9d45ead
Fixed documentation.
oss-dev-somewhere Jul 25, 2018
8ff0272
Use s5_ref for target device.
oss-dev-somewhere Jul 27, 2018
f7317b0
Added testcases.
oss-dev-somewhere Jul 28, 2018
3 changes: 3 additions & 0 deletions cmake/config.cmake
@@ -42,6 +42,9 @@ set(USE_ROCM OFF)
# Whether enable SDAccel runtime
set(USE_SDACCEL OFF)

# Whether enable Intel FPGA SDK for OpenCL (AOCL) runtime
set(USE_AOCL OFF)

# Whether enable OpenCL runtime
set(USE_OPENCL OFF)

12 changes: 12 additions & 0 deletions cmake/modules/OpenCL.cmake
@@ -19,6 +19,18 @@ else()
  list(APPEND COMPILER_SRCS src/codegen/opt/build_sdaccel_off.cc)
endif(USE_SDACCEL)

if(USE_AOCL)
  message(STATUS "Build with Intel FPGA SDK for OpenCL support")
  file(GLOB RUNTIME_AOCL_SRCS src/runtime/opencl/aocl/*.cc)
  list(APPEND RUNTIME_SRCS ${RUNTIME_AOCL_SRCS})
  if(NOT USE_OPENCL)
    message(STATUS "Enable OpenCL support required for Intel FPGA SDK for OpenCL")
    set(USE_OPENCL ON)
  endif()
else()
  list(APPEND COMPILER_SRCS src/codegen/opt/build_aocl_off.cc)
endif(USE_AOCL)

if(USE_OPENCL)
  find_package(OpenCL REQUIRED)
  message(STATUS "Build with OpenCL support")
74 changes: 74 additions & 0 deletions docs/deploy/aocl_fpga.md
@@ -0,0 +1,74 @@
AOCL Backend Example
====================

TVM supports the Intel FPGA SDK for OpenCL, also known as AOCL. This tutorial shows how to use TVM with AOCL.

***Note***: This feature is still experimental. For now, AOCL cannot be used to deploy end-to-end neural networks, and only AOCL's emulation mode is supported.

We use two Python scripts for this tutorial.

- build.py - a script to synthesize the FPGA bitstream.
```python
import tvm

tgt_host="llvm"
tgt="aocl -device=de5net_a7"

# Element-wise vector addition over a symbolic length n.
n = tvm.var("n")
A = tvm.placeholder((n,), name='A')
B = tvm.placeholder((n,), name='B')
C = tvm.compute(A.shape, lambda i: A[i] + B[i], name="C")

s = tvm.create_schedule(C.op)
px, x = s[C].split(C.op.axis[0], nparts=1)

s[C].bind(px, tvm.thread_axis("pipeline"))

fadd = tvm.build(s, [A, B, C], tgt, target_host=tgt_host, name="myadd")

# Save the host object and the device binary, then link the host library.
fadd.save("myadd.o")
fadd.imported_modules[0].save("myadd.aocx")

tvm.contrib.cc.create_shared("myadd.so", ["myadd.o"])
```

- run.py - a script to use FPGA as an accelerator.
```python
import tvm
import numpy as np
import os

tgt="aocl -device=de5net_a7"

fadd = tvm.module.load("myadd.so")
fadd_dev = tvm.module.load("myadd.aocx")
fadd.import_module(fadd_dev)

ctx = tvm.context(tgt, 0)

n = 1024
a = tvm.nd.array(np.random.uniform(size=n).astype("float32"), ctx)
b = tvm.nd.array(np.random.uniform(size=n).astype("float32"), ctx)
c = tvm.nd.array(np.zeros(n, dtype="float32"), ctx)

fadd(a, b, c)
np.testing.assert_allclose(c.asnumpy(), a.asnumpy() + b.asnumpy())
```

Setup
-----

- Install AOCL 17.1 on Ubuntu 16.04.4 LTS.
- Install FPGA device driver.
- Make ICD file. (/etc/OpenCL/vendors/Altera.icd)
- Make FCD file. (/opt/Intel/OpenCL/Boards/de5net.fcd)
[Review comment] Contributor: Add more explanation about what kinds of files we should make.
[Reply] Contributor Author: Fixed document.
[Reply] But I can't install FPGA PCIe driver on Ubuntu 16.04 LTS,

- Set up TVM with AOCL and OpenCL enabled.

Emulation
---------

- Run software emulation
```bash
export CL_CONTEXT_EMULATOR_DEVICE_INTELFPGA=1

python build.py
python run.py
```
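
The emulation steps above could also be driven from a single Python launcher. The following is a minimal sketch, assuming `build.py` and `run.py` sit in the current directory. The helper names `emulation_env` and `run_emulation` are hypothetical and not part of TVM or AOCL; only the environment variable name comes from the tutorial. As far as we understand the Intel documentation, the value is the number of emulated devices, which is treated as an assumption here.

```python
import os
import subprocess
import sys

# Environment variable named in the tutorial; assumed to hold the
# number of emulated FPGA devices.
EMULATOR_VAR = "CL_CONTEXT_EMULATOR_DEVICE_INTELFPGA"


def emulation_env(base=None, num_devices=1):
    """Return a copy of the environment with the emulator variable set."""
    env = dict(os.environ if base is None else base)
    env[EMULATOR_VAR] = str(num_devices)
    return env


def run_emulation(scripts=("build.py", "run.py")):
    """Run each script in order under the emulator environment (hypothetical helper)."""
    env = emulation_env()
    for script in scripts:
        # check=True raises CalledProcessError at the first failing script.
        subprocess.run([sys.executable, script], env=env, check=True)
```

Calling `run_emulation()` has the same effect as the three shell commands, without leaving the variable exported in the calling shell.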
1 change: 1 addition & 0 deletions include/tvm/runtime/c_runtime_api.h
@@ -60,6 +60,7 @@ typedef int64_t tvm_index_t;

/*! \brief Extension device types in TVM */
typedef enum {
  kDLAOCL = 5,
  kDLSDAccel = 6,
  kDLVulkan = 7,
  kOpenGL = 11,
2 changes: 2 additions & 0 deletions python/tvm/_ffi/runtime_ctypes.py
@@ -96,6 +96,7 @@ class TVMContext(ctypes.Structure):
        1 : 'cpu',
        2 : 'gpu',
        4 : 'opencl',
        5 : 'aocl',
        6 : 'sdaccel',
        7 : 'vulkan',
        8 : 'metal',
@@ -113,6 +114,7 @@ class TVMContext(ctypes.Structure):
        'nvptx': 2,
        'cl': 4,
        'opencl': 4,
        'aocl' : 5,
        'sdaccel': 6,
        'vulkan': 7,
        'metal': 8,
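
The two tables patched above map device codes to names and back, so a target string such as `aocl -device=de5net_a7` can be resolved to code 5. Below is a rough, standalone sketch of that lookup. The dictionaries contain only the entries visible in the hunks, and `device_type_of` is a hypothetical helper, not the actual `TVMContext` code.

```python
# Entries copied from the hunks above; the real tables contain more devices.
MASK2STR = {1: "cpu", 2: "gpu", 4: "opencl", 5: "aocl",
            6: "sdaccel", 7: "vulkan", 8: "metal"}
STR2MASK = {"nvptx": 2, "cl": 4, "opencl": 4, "aocl": 5,
            "sdaccel": 6, "vulkan": 7, "metal": 8}


def device_type_of(target):
    """Resolve the first word of a target string to its device code."""
    parts = target.split()
    if not parts or parts[0] not in STR2MASK:
        raise ValueError("Unknown device type %r" % target)
    return STR2MASK[parts[0]]
```

Only the first word of the target takes part in the lookup, so options such as `-device=...` do not affect the resulting device code.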
3 changes: 3 additions & 0 deletions src/codegen/build_module.cc
@@ -91,6 +91,9 @@ Target CreateTarget(const std::string& target_name,
  } else if (target_name == "sdaccel") {
    t->device_type = kDLOpenCL;
    t->keys_array.push_back(ir::StringImm::make("sdaccel"));
  } else if (target_name == "aocl") {
    t->device_type = kDLOpenCL;
[Review comment] Contributor: I guess this should be kDLAOCL.
[Reply] Contributor Author: Fixed.
    t->keys_array.push_back(ir::StringImm::make("aocl"));
  } else if (target_name == "opengl") {
    t->device_type = kOpenGL;
    t->keys_array.push_back(ir::StringImm::make("opengl"));
54 changes: 54 additions & 0 deletions src/codegen/codegen_aocl.cc
@@ -0,0 +1,54 @@
/*!
 * Copyright (c) 2018 by Contributors
 * \file codegen_aocl.cc
 */
#include <tvm/build_module.h>
#include <vector>
#include <string>
#include "./codegen_opencl.h"
#include "./build_common.h"
#include "../runtime/opencl/aocl/aocl_module.h"
#include "../runtime/file_util.h"

namespace tvm {
namespace codegen {

runtime::Module BuildAOCL(Array<LoweredFunc> funcs, std::string target_str) {
  // Get code.
  using tvm::runtime::Registry;
  bool output_ssa = false;
  CodeGenOpenCL cg;
  cg.Init(output_ssa);
  for (LoweredFunc f : funcs) {
    cg.AddFunction(f);
  }
  std::string code = cg.Finish();
  if (const auto* f = Registry::Get("tvm_callback_opencl_postproc")) {
    code = (*f)(code).operator std::string();
  }

  // Write a .cl file.
  runtime::SaveBinaryToFile("aocl.cl", code.c_str());

  // Compile the .cl file.
  Target target = Target::create(target_str);
  std::string cmd = "aoc aocl.cl -march=emulator -board=";
  cmd += target->device_name;
[Review comment] Contributor: I think this doesn't work if we don't specify the '-device' option.
[Reply] Contributor Author: Added logic to check device name.
  if (system(cmd.c_str()) != 0) {
    LOG(FATAL) << "OpenCL offline compilation error.";
  }

  // Read .aocx file
  std::string aocxbin;
  runtime::LoadBinaryFromFile("aocl.aocx", &aocxbin);

  return AOCLModuleCreate(aocxbin, "aocx", ExtractFuncInfo(funcs), code);
}

TVM_REGISTER_API("codegen.build_aocl")
.set_body([](TVMArgs args, TVMRetValue* rv) {
    *rv = BuildAOCL(args[0], args[1]);
  });

}  // namespace codegen
}  // namespace tvm
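
The review thread in this file points out that the `aoc` command is malformed when the target carries no `-device` option, because `-board=` ends up with an empty value. The sketch below illustrates one possible shape of such a check in Python. It is not the C++ fix that landed in the PR (which is not shown in this diff), and `build_aoc_command` is a hypothetical name. The `aoc` flags are the ones that appear in `BuildAOCL` above.

```python
def build_aoc_command(device_name, source="aocl.cl", emulator=True):
    """Assemble the aoc argument list, rejecting an empty board name."""
    if not device_name:
        # Without a board name the command would end in a bare "-board=".
        raise ValueError("AOCL target requires a board name via -device=<board>")
    cmd = ["aoc", source]
    if emulator:
        cmd.append("-march=emulator")
    cmd.append("-board=" + device_name)
    return cmd
```

Returning a list rather than a single string lets a caller pass it to `subprocess.run` without going through a shell, which sidesteps quoting problems in board names.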
21 changes: 21 additions & 0 deletions src/codegen/opt/build_aocl_off.cc
@@ -0,0 +1,21 @@
/*!
 * Copyright (c) 2018 by Contributors
 * Optional module when build aocl is switched to off
 */
#include "../codegen_source_base.h"
#include "../../runtime/opencl/opencl_module.h"

namespace tvm {
namespace runtime {

Module AOCLModuleCreate(
    std::string data,
    std::string fmt,
    std::unordered_map<std::string, FunctionInfo> fmap,
    std::string source) {
  LOG(WARNING) << "AOCL runtime not enabled, return a source module...";
  return codegen::DeviceSourceModuleCreate(data, fmt, fmap, "aocl");
}

}  // namespace runtime
}  // namespace tvm
2 changes: 1 addition & 1 deletion src/pass/verify_memory.cc
@@ -145,7 +145,7 @@ class MemoryAccessVerifier final : protected IRVisitor {
  }
  /// Check if a given DLDeviceType/TVMDeviceExtType value denotes FPGA device.
  static bool IsFPGADevice(int dev_type) {
-    return kDLSDAccel == dev_type;
+    return kDLSDAccel == dev_type || kDLAOCL == dev_type;
  }

 private:
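
The change to `IsFPGADevice` simply widens the set of device codes treated as FPGAs. A small Python rendering of the same predicate is sketched below, using the numeric codes from the enum and table hunks earlier in this diff. The constant and function names are illustrative only.

```python
# Device codes as they appear elsewhere in this diff.
KDL_OPENCL = 4
KDL_AOCL = 5
KDL_SDACCEL = 6

# After this change both SDAccel and AOCL count as FPGA devices.
FPGA_DEVICE_TYPES = frozenset({KDL_SDACCEL, KDL_AOCL})


def is_fpga_device(dev_type):
    """Return True when the device code denotes an FPGA backend."""
    return dev_type in FPGA_DEVICE_TYPES
```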
1 change: 1 addition & 0 deletions src/runtime/c_runtime_api.cc
@@ -32,6 +32,7 @@ inline std::string DeviceName(int type) {
    case kDLGPU: return "gpu";
    case kDLOpenCL: return "opencl";
    case kDLSDAccel: return "sdaccel";
    case kDLAOCL: return "aocl";
    case kDLVulkan: return "vulkan";
    case kDLMetal: return "metal";
    case kDLVPI: return "vpi";
42 changes: 42 additions & 0 deletions src/runtime/opencl/aocl/aocl_common.h
@@ -0,0 +1,42 @@
/*!
 * Copyright (c) 2018 by Contributors
 * \file aocl_common.h
 * \brief AOCL common header
 */
#ifndef TVM_RUNTIME_OPENCL_AOCL_AOCL_COMMON_H_
#define TVM_RUNTIME_OPENCL_AOCL_AOCL_COMMON_H_

#include "../opencl_common.h"

namespace tvm {
namespace runtime {
namespace cl {

/*!
 * \brief Process global AOCL workspace.
 */
class AOCLWorkspace final : public OpenCLWorkspace {
 public:
  // override OpenCL device API
  void Init() final;
  bool IsOpenCLDevice(TVMContext ctx) final;
  OpenCLThreadEntry* GetThreadEntry() final;
  // get the global workspace
  static const std::shared_ptr<OpenCLWorkspace>& Global();
};


/*! \brief Thread local workspace for AOCL */
class AOCLThreadEntry : public OpenCLThreadEntry {
 public:
  // constructor
  AOCLThreadEntry()
      : OpenCLThreadEntry(static_cast<DLDeviceType>(kDLAOCL), AOCLWorkspace::Global()) {}

  // get the global workspace
  static AOCLThreadEntry* ThreadLocal();
};
}  // namespace cl
}  // namespace runtime
}  // namespace tvm
#endif  // TVM_RUNTIME_OPENCL_AOCL_AOCL_COMMON_H_
44 changes: 44 additions & 0 deletions src/runtime/opencl/aocl/aocl_device_api.cc
@@ -0,0 +1,44 @@
/*!
 * Copyright (c) 2018 by Contributors
 * \file aocl_device_api.cc
 */
#include <tvm/runtime/registry.h>
#include <dmlc/thread_local.h>
#include "./aocl_common.h"

namespace tvm {
namespace runtime {
namespace cl {

OpenCLThreadEntry* AOCLWorkspace::GetThreadEntry() {
  return AOCLThreadEntry::ThreadLocal();
}

const std::shared_ptr<OpenCLWorkspace>& AOCLWorkspace::Global() {
  static std::shared_ptr<OpenCLWorkspace> inst = std::make_shared<AOCLWorkspace>();
  return inst;
}

void AOCLWorkspace::Init() {
  OpenCLWorkspace::Init("aocl", "accelerator", "Intel");
[Review comment] Contributor: I think "Intel" would match the Intel OpenCL platform for CPU/GPU. Should be "Intel(R) FPGA"?
[Reply] Contributor Author: Changed "Intel" to the exact platform name.
}

bool AOCLWorkspace::IsOpenCLDevice(TVMContext ctx) {
  return ctx.device_type == static_cast<DLDeviceType>(kDLAOCL);
}

typedef dmlc::ThreadLocalStore<AOCLThreadEntry> AOCLThreadStore;

AOCLThreadEntry* AOCLThreadEntry::ThreadLocal() {
  return AOCLThreadStore::Get();
}

TVM_REGISTER_GLOBAL("device_api.aocl")
.set_body([](TVMArgs args, TVMRetValue* rv) {
    DeviceAPI* ptr = AOCLWorkspace::Global().get();
    *rv = static_cast<void*>(ptr);
  });

}  // namespace cl
}  // namespace runtime
}  // namespace tvm
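
The concern raised in the review thread for this file is easiest to see with a concrete lookup. The sketch below assumes the workspace selects a platform by substring match on its name, as the comment implies. The platform strings are illustrative and depend on the installed OpenCL drivers, and `matching_platforms` is a hypothetical helper, not TVM code.

```python
def matching_platforms(platform_names, key):
    """Return every platform whose name contains key (assumed substring match)."""
    return [name for name in platform_names if key in name]


# Illustrative names for a machine with both an Intel CPU/GPU runtime
# and the FPGA SDK installed; real strings may differ.
PLATFORMS = ["Intel(R) OpenCL", "Intel(R) FPGA SDK for OpenCL(TM)"]

ambiguous = matching_platforms(PLATFORMS, "Intel")       # matches both platforms
specific = matching_platforms(PLATFORMS, "Intel(R) FPGA")  # matches only the FPGA SDK
```

With the short key, whichever platform is enumerated first would be picked, which is why the reviewer asked for a more specific name.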
58 changes: 58 additions & 0 deletions src/runtime/opencl/aocl/aocl_module.cc
@@ -0,0 +1,58 @@
/*!
 * Copyright (c) 2018 by Contributors
 * \file aocl_module.cc
 */
#include <dmlc/memory_io.h>
#include <tvm/runtime/registry.h>
#include <vector>
#include <string>
#include <unordered_map>
#include "./aocl_common.h"
#include "./aocl_module.h"

namespace tvm {
namespace runtime {

class AOCLModuleNode : public OpenCLModuleNode {
 public:
  explicit AOCLModuleNode(std::string data,
                          std::string fmt,
                          std::unordered_map<std::string, FunctionInfo> fmap,
                          std::string source)
      : OpenCLModuleNode(data, fmt, fmap, source) {}
  const std::shared_ptr<cl::OpenCLWorkspace>& GetGlobalWorkspace() final;
};

const std::shared_ptr<cl::OpenCLWorkspace>& AOCLModuleNode::GetGlobalWorkspace() {
  return cl::AOCLWorkspace::Global();
}

Module AOCLModuleCreate(
    std::string data,
    std::string fmt,
    std::unordered_map<std::string, FunctionInfo> fmap,
    std::string source) {
  std::shared_ptr<AOCLModuleNode> n =
      std::make_shared<AOCLModuleNode>(data, fmt, fmap, source);
  n->Init();
  return Module(n);
}

Module AOCLModuleLoadFile(const std::string& file_name,
                          const std::string& format) {
  std::string data;
  std::unordered_map<std::string, FunctionInfo> fmap;
  std::string fmt = GetFileFormat(file_name, format);
  std::string meta_file = GetMetaFilePath(file_name);
  LoadBinaryFromFile(file_name, &data);
  LoadMetaDataFromFile(meta_file, &fmap);
  return AOCLModuleCreate(data, fmt, fmap, std::string());
}

TVM_REGISTER_GLOBAL("module.loadfile_aocx")
.set_body([](TVMArgs args, TVMRetValue* rv) {
    *rv = AOCLModuleLoadFile(args[0], args[1]);
  });

}  // namespace runtime
}  // namespace tvm
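
Registering `module.loadfile_aocx` is what allows `tvm.module.load("myadd.aocx")` in the tutorial to reach `AOCLModuleLoadFile`. The sketch below models that dispatch in plain Python, under the assumption that the runtime looks up a global named `module.loadfile_<format>`, with the format taken from the file extension when none is given. The registry, decorator and return value here are stand-ins, not TVM APIs.

```python
import os

_REGISTRY = {}  # stand-in for the runtime's global function registry


def register_global(name):
    """Decorator that records a loader under the given global name."""
    def decorator(func):
        _REGISTRY[name] = func
        return func
    return decorator


@register_global("module.loadfile_aocx")
def load_aocx(file_name, fmt):
    # A real loader would read the binary and its metadata file.
    return ("aocl-module", file_name, fmt)


def load_module(file_name, fmt=""):
    """Pick a loader by explicit format or, failing that, by file extension."""
    if not fmt:
        fmt = os.path.splitext(file_name)[1].lstrip(".")
    loader = _REGISTRY.get("module.loadfile_" + fmt)
    if loader is None:
        raise ValueError("No loader registered for format %r" % fmt)
    return loader(file_name, fmt)
```

Under this model, adding a new binary format only needs one more registered name, and the generic loading entry point stays unchanged.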