<!DOCTYPE html>
<html lang="en-US">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width,initial-scale=1">
<title>DeepBurning</title>
<meta name="generator" content="VuePress 1.7.1">
<meta name="description" content="Automatic generation of FPGA-based learning accelerators for the neural network family">
<link rel="preload" href="/assets/css/0.styles.992cb7aa.css" as="style"><link rel="preload" href="/assets/js/app.6b2b11ce.js" as="script"><link rel="preload" href="/assets/js/2.eaadd307.js" as="script"><link rel="preload" href="/assets/js/3.8e12f1ca.js" as="script"><link rel="prefetch" href="/assets/js/10.80f94882.js"><link rel="prefetch" href="/assets/js/4.d8feef26.js"><link rel="prefetch" href="/assets/js/5.91fdae5d.js"><link rel="prefetch" href="/assets/js/6.984e7e31.js"><link rel="prefetch" href="/assets/js/7.2cf0207d.js"><link rel="prefetch" href="/assets/js/8.f5bac4d8.js"><link rel="prefetch" href="/assets/js/9.cf2602e5.js">
<link rel="stylesheet" href="/assets/css/0.styles.992cb7aa.css">
</head>
<body>
<div id="app" data-server-rendered="true"><div class="theme-container no-sidebar"><header class="navbar"><div class="sidebar-button"><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" role="img" viewBox="0 0 448 512" class="icon"><path fill="currentColor" d="M436 124H12c-6.627 0-12-5.373-12-12V80c0-6.627 5.373-12 12-12h424c6.627 0 12 5.373 12 12v32c0 6.627-5.373 12-12 12zm0 160H12c-6.627 0-12-5.373-12-12v-32c0-6.627 5.373-12 12-12h424c6.627 0 12 5.373 12 12v32c0 6.627-5.373 12-12 12zm0 160H12c-6.627 0-12-5.373-12-12v-32c0-6.627 5.373-12 12-12h424c6.627 0 12 5.373 12 12v32c0 6.627-5.373 12-12 12z"></path></svg></div> <a href="/" aria-current="page" class="home-link router-link-exact-active router-link-active"><!----> <span class="site-name">DeepBurning</span></a> <div class="links"><div class="search-box"><input aria-label="Search" autocomplete="off" spellcheck="false" value=""> <!----></div> <nav class="nav-links can-hide"><div class="nav-item"><a href="/documentation/" class="nav-link">
Documentation
</a></div><div class="nav-item"><a href="/publications/" class="nav-link">
Publications
</a></div><div class="nav-item"><a href="/about/" class="nav-link">
About
</a></div><div class="nav-item"><a href="https://github.com/labfor/DeepBurning" target="_blank" rel="noopener noreferrer" class="nav-link external">
Download
<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></div> <!----></nav></div></header> <div class="sidebar-mask"></div> <aside class="sidebar"><nav class="nav-links"><div class="nav-item"><a href="/documentation/" class="nav-link">
Documentation
</a></div><div class="nav-item"><a href="/publications/" class="nav-link">
Publications
</a></div><div class="nav-item"><a href="/about/" class="nav-link">
About
</a></div><div class="nav-item"><a href="https://github.com/labfor/DeepBurning" target="_blank" rel="noopener noreferrer" class="nav-link external">
Download
<span><svg xmlns="http://www.w3.org/2000/svg" aria-hidden="true" focusable="false" x="0px" y="0px" viewBox="0 0 100 100" width="15" height="15" class="icon outbound"><path fill="currentColor" d="M18.8,85.1h56l0,0c2.2,0,4-1.8,4-4v-32h-8v28h-48v-48h28v-8h-32l0,0c-2.2,0-4,1.8-4,4v56C14.8,83.3,16.6,85.1,18.8,85.1z"></path> <polygon fill="currentColor" points="45.7,48.7 51.3,54.3 77.2,28.5 77.2,37.2 85.2,37.2 85.2,14.9 62.8,14.9 62.8,22.9 71.5,22.9"></polygon></svg> <span class="sr-only">(opens new window)</span></span></a></div> <!----></nav> <!----> </aside> <main class="home"><header class="hero"><img src="/logo.png" alt="hero"> <!----> <p class="description">
Automatic generation of FPGA-based learning accelerators for the neural network family
</p> <!----></header> <!----> <div class="theme-default-content custom content__default"><br> <h2 id="introduction"><a href="#introduction" class="header-anchor">#</a> Introduction</h2> <p style="text-align:justify;">DeepBurning [1] is an end-to-end neural network acceleration design tool that generates both a customized neural network model and a neural processing unit (NPU) for a specialized learning task on FPGAs. An overview of DeepBurning is shown in Figure 1. It requires only the dataset of the target application and high-level design constraints, such as the total resource budget, to produce a unified, optimized acceleration solution targeting a typical heterogeneous CPU+FPGA architecture that can be deployed immediately, so application developers can focus on the application itself without dealing with complex neural network model design or low-level accelerator parameter tuning. In particular, we propose an efficient co-designed AutoML search framework named YOSO [2] that optimizes the neural network architecture and the NPU parameters at the same time (a conceptual sketch of this joint search is given at the end of this section). Note that DeepBurning relies on a pre-built NPU template that allows flexible configuration and customization; the template is expected to be developed by skilled hardware designers to ensure an efficient hardware implementation.</p> <div align="center"><img src="/assets/img/deepburning.3b138199.svg" width="100%" height="100%"> <br> <div style="display:inline-block;color:#999;padding:2px;"> Figure 1 DeepBurning Overview</div></div> <p style="text-align:justify;">DeepBurning is under active development. The major components, including YOSO and NPU compilation, are already in use, while automatic NPU generation from the pre-built template still requires some manual adjustment and will be released once it is ready. Currently, users can only compile neural network models for a specific NPU configuration.</p> <br>
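<p style="text-align:justify;">To make the idea of the co-designed search concrete, the following is a minimal, self-contained Python sketch of a joint (network, NPU) search loop under a resource budget. It is only an illustration of the concept: the candidate space, the cost model, and the score function are hypothetical placeholders, not the actual YOSO implementation or the DeepBurning API.</p>
<pre><code class="language-python"># Conceptual sketch only: NOT the actual YOSO/DeepBurning code.
# It illustrates sampling a neural network architecture together with an NPU
# configuration, rejecting candidates that exceed the FPGA resource budget,
# and keeping the best-scoring feasible pair.
import random

BUDGET = {"dsp": 2000, "bram_kb": 3000}   # illustrative budget, not real device figures

def sample_candidate():
    """Randomly pick one point in the joint (network, NPU) design space."""
    network = {"depth": random.choice([8, 12, 16]),      # conv blocks
               "width": random.choice([32, 64, 128])}    # channels per block
    npu = {"pe_rows": random.choice([8, 16, 32]),        # 2D PE array height
           "pe_cols": random.choice([8, 16, 32]),        # 2D PE array width
           "buffer_kb": random.choice([64, 128, 256])}   # on-chip I/O buffer
    return network, npu

def resources(npu):
    """Toy resource model standing in for a real NPU cost estimator."""
    return {"dsp": npu["pe_rows"] * npu["pe_cols"],
            "bram_kb": 4 * npu["buffer_kb"]}

def score(network, npu):
    """Toy reward standing in for measured accuracy and throughput."""
    return network["depth"] * network["width"] + npu["pe_rows"] * npu["pe_cols"]

def joint_search(steps=200):
    best, best_score = None, float("-inf")
    for _ in range(steps):
        net, npu = sample_candidate()
        est = resources(npu)
        if any(est[k] > BUDGET[k] for k in BUDGET):
            continue                                     # violates the budget
        s = score(net, npu)
        if s > best_score:
            best, best_score = (net, npu), s
    return best

print(joint_search())
</code></pre>
<p style="text-align:justify;">A real co-search would of course train or fine-tune each candidate network and estimate NPU latency and resources with calibrated models, but the control flow above captures the key point: the network architecture and the NPU parameters are explored as a single design space rather than tuned one after the other.</p>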
<h2 id="key-features"><a href="#key-features" class="header-anchor">#</a> Key features</h2> <p>Supported</p> <ul><li><p style="text-align:justify;">Given high-level design constraints, YOSO can search for an optimized neural network architecture and NPU configuration.</p></li> <li><p style="text-align:justify;">Neural network models described in Prototxt can be compiled to instructions and then deployed on the pre-built NPU. Currently, we provide several pre-compiled neural networks; a free online compiler will be offered later.</p></li> <li><p style="text-align:justify;">A typical NPU with a 2D-array computing architecture is provided as a netlist. Its architecture is shown in Figure 2. It includes a 128 KB I/O buffer that can be allocated to input and output dynamically and supports data prefetch to hide the external memory access overhead (a conceptual sketch of this double-buffering scheme is given after Table 1). It covers a large number of operations used in common neural networks and related image-processing tasks, and thus supports more than 30 neural networks. The supported operations and neural network models are listed in Table 1.</p></li> <li><p style="text-align:justify;">The generated accelerators and drivers can be used on Xilinx Zynq 7000 devices. In particular, the design has been verified on the ZC706 and MZ7100 boards. The corresponding Linux kernel and root file system are also provided.</p></li></ul> <div align="center"><img src="/assets/img/npu.eaf61c51.svg" width="60%" height="60%"> <br> <div style="display:inline-block;color:#999;padding:2px;">Figure 2 NPU Architecture</div></div> <br> <br> <div align="center"><div style="display:inline-block;color:#999;padding:2px;"> Table 1 Supported NPU operations and neural network models</div></div> <table><thead><tr><th style="text-align:center;">Neural network operations</th> <th style="text-align:center;">General computing operations</th> <th style="text-align:center;">Neural network models</th></tr></thead> <tbody><tr><td style="text-align:center;">Convolution, deconvolution, 3D convolution, grouped convolution, fully connected, Softmax, <br>Elementwise, Concat, Reorganization, Batch normalization, Pooling (average, max), <br> Activation functions (ReLU, PReLU, Leaky ReLU, Tanh, Sigmoid, …)</td> <td style="text-align:center;">Matrix-matrix multiplication, Matrix-vector multiplication, Dot product, Cosine distance, Feature scaling</td> <td style="text-align:center;">GoogLeNet, DenseNet, VGG, ResNet, MobileNet, SqueezeNet, DCGAN, LSTM, MTCNN, Hourglass, …</td></tr></tbody></table> <br>
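<p style="text-align:justify;">The double-buffering scheme mentioned above can be illustrated with a short, self-contained Python sketch: while the current tile is being processed, the next tile is fetched from external memory so that memory latency overlaps with computation. This is a software analogy of the hardware behavior, with stand-in functions for the DMA transfer and the PE-array computation; it is not the NPU RTL.</p>
<pre><code class="language-python"># Conceptual sketch of double-buffered ("ping-pong") prefetch, as used to hide
# external memory latency behind computation. Software illustration only.
from concurrent.futures import ThreadPoolExecutor

def load_tile(external_memory, idx):
    """Stand-in for a DMA transfer of one tile into the on-chip I/O buffer."""
    return external_memory[idx]

def compute_tile(tile):
    """Stand-in for the 2D PE array processing one tile."""
    return sum(tile)

def run(external_memory):
    results = []
    with ThreadPoolExecutor(max_workers=1) as dma:            # one "DMA engine"
        pending = dma.submit(load_tile, external_memory, 0)   # prefetch tile 0
        for idx in range(len(external_memory)):
            tile = pending.result()                      # current tile is now on chip
            if idx + 1 < len(external_memory):
                # start fetching the next tile while this one is computed
                pending = dma.submit(load_tile, external_memory, idx + 1)
            results.append(compute_tile(tile))           # overlaps with the prefetch
    return results

print(run([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
</code></pre>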
<h2 id="performance-evaluation"><a href="#performance-evaluation" class="header-anchor">#</a> Performance evaluation</h2> <p style="text-align:justify;">
We measure the performance and the FPGA resource consumption on the MZ7100 board, which includes a Zynq 7100 FPGA chip. The NPU kernel runs at 100 MHz and can be optimized to run at up to 200 MHz. The measured frame rates on ImageNet are shown in Table 2, and the total FPGA resource overhead is presented in Table 3.
</p> <div align="center"><div style="display:inline-block;color:#999;padding:8px;"> Table 2 Performance evaluation</div> <br> <table style="display:inline;"><tr><th>Neural network model</th> <th>FPS (100 MHz)</th> <th>FPS (200 MHz)</th></tr> <tr><td>ResNet18</td> <td>5</td> <td>10</td></tr> <tr><td>YOLO v2</td> <td>2.5</td> <td>4.5</td></tr> <tr><td>MTCNN+FaceNet</td> <td>2</td> <td>4.2</td></tr></table></div> <br> <br> <div align="center"><div style="display:inline-block;color:#999;padding:8px;"> Table 3 NPU resource consumption on MZ7100</div> <br> <table style="display:inline;"><tr><th>LUT</th> <th>BRAM</th> <th>DSP</th> <th>FF</th> <th>LUTRAM</th></tr> <tr><td>67%</td> <td>39%</td> <td>40%</td> <td>16%</td> <td>1%</td></tr></table></div> <br> <h2 id="demo-video"><a href="#demo-video" class="header-anchor">#</a> Demo video</h2> <p>We also present two application videos in which DeepBurning is used to generate the acceleration solution on the MZ7100 board.</p> <ul><li><p>Object detection: The input images are captured by a camera and processed by the NPU deployed on the FPGA, while the images, selected from ImageNet, are displayed on the screen by another computer.
</p><div align="center"><video src="/assets/media/object.b68137b4.mp4" controls="controls" width="75%" height="75%"></video></div><p></p></li> <li><p>DCGAN based face generation: The faces are generated with DCGAN which is a typical generative neural network.
</p><div align="center"><video src="/assets/media/face.1419adca.mp4" controls="controls" width="75%" height="75%"></video></div> <br><p></p></li></ul> <h3 id="contact"><a href="#contact" class="header-anchor">#</a> Contact</h3> <p><strong>Prof. Ying Wang (wangying2009@ict.ac.cn)</strong></p></div> <div class="footer">
MIT Licensed
</div></main></div><div class="global-ui"></div></div>
<script src="/assets/js/app.6b2b11ce.js" defer></script><script src="/assets/js/2.eaadd307.js" defer></script><script src="/assets/js/3.8e12f1ca.js" defer></script>
</body>
</html>