Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Project 1: Rishabh Shah #33

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
67 changes: 62 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,10 +1,67 @@
**University of Pennsylvania, CIS 565: GPU Programming and Architecture,
Project 1 - Flocking**

* (TODO) YOUR NAME HERE
* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
* RISHABH SHAH
* Tested on: Windows 10, i7-6700HQ @ 2.6GHz 16GB, GTX 960M 4096MB (Laptop)

### (TODO: Your README)
### Boids Flocking Simulation on GPU

Include screenshots, analysis, etc. (Remember, this is public, so don't put
anything here that you don't want to share with the world.)
![](./images/coherent_5000.gif)

*simulation result of 5000 boids using coherent grid*

In this simulation, boids (particles) move in space interacting with nearby boids based on three rules: Cohesion, Separation, Alignment. At every timestep, these rules define the new velocity of every boid.

The simulation was done using three methods
- Naive - Here, we check every boid with every other boid, regardless of the distance.
- Scattered Grid - This one is a bit smarter. Here, we divide the space into a uniform grid, and only check the boids in the nearby grid cells around the boid.
- Coherent Grid - Because of the way the GPU works, the scattered grid implementation doesn't work as efficiently as it should. On a GPU, accessing neighboring values in arrays is much faster than accessing far of values. So, we rearrange the position and velocity arrays from the scattered grid implementation and make them coherent. This gives considerably faster performance for high boid counts.

Skipping cells based on neighborhood distance: Instead of just looping over the boids from the nearby cells, I determine if a part of the neighborhood lies in the cells. If it does, only then, the boids of the cell are considered.

#### Performance Analysis
*NOTE: All the timings were calculated as an average over 15 seconds.*
- Number of Boids:
- As expected, the frame rate drops with the increase in number of boids.
- When the number of boids is low (about 1000), naive is the fatest. This must be because of the overhead of the additional calculation that are needed for grid based implementations. The number of boids is so less, it is better to just compare every boid with every other boid, tham to do the additional calculations.
- As the number of boids increases, the benefits of using grid based implementations are seen.

*Analysis with visualization off*

![](./images/boidsvsfps.png)

![](./images/chart_boids.png)

*Analysis with visualization on*

![](./images/viz.png)

![](./images/chart_viz.png)

- Block Size:
- Change in frame rate can be seen as block size increases from 16 to 32. Probably because half of the threads in the warps were idle with block size of 16. A little performance increase can be seen upto 64 or 128.
- After 128, there is not much difference.

*Analysis with visualization off*

![](./images/blocksizevsfps.png)

![](./images/chart_blocksize.png)

- Cell Width:
- Grid cell width directly affects performance, as increasing it, increases the number of grid cells to be processed, but also decreases the number of boids in each cell.
- The performance for half the original cell width was quite better for all the methods.
- As can be seen in the data below, the performance actually increased when doubling the boid count from 5000 to 10000. Same thing happens with original cell width. I have no clue why.

*Analysis with visualization off*

![](./images/halfcellwidth.png)

![](./images/chart_halfcellwidth.png)

- Comparison:
- This is a comparison of the performance changes with visualization on, visualization off, and with smaller cell size (+ visualization off)

![](./images/chart_comparison.png)

#### src/CMakeLists.txt changes: OPTIONS -arch=sm_50
Binary file added images/blocksizevsfps.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added images/boidsvsfps.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added images/boidsvsfps_viz.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added images/chart_blocksize.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added images/chart_boids.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added images/chart_comparison.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added images/chart_halfcellwidth.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added images/chart_viz.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added images/coherent_5000.gif
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added images/coherent_50000.gif
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added images/halfcellwidth.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
2 changes: 1 addition & 1 deletion src/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -10,5 +10,5 @@ set(SOURCE_FILES

cuda_add_library(src
${SOURCE_FILES}
OPTIONS -arch=sm_20
OPTIONS -arch=sm_50
)
Loading