Extending Ariel

plavin edited this page Feb 28, 2022 · 2 revisions

We wish to implement ScarPhase in SST. We are primarily interested in using Ariel to gather memory traces. However, Ariel is primarily focused on capturing memory references, and therefore doesn't capture the branching information needed by ScarPhase. How can we get this information into our simulation?

This document is an unedited stream-of-consciousness account of how I worked through the code. Skip down to Plan - Option 1 to see what actually needs to be done.

Basics

We'll look first at the files in sst/elements/ariel/frontend/pin3.

Ariel has a parameter called arieltool. The error message for this is "The arieltool parameter specifying which PIN tool to run was not specified" (pin3frontend.cc). This implies that the pintool (what memtrace would call a client) can be changed by changing the Ariel configuration.

We do not typically specify this parameter, so there must be a way for SST to pick a default pintool. The location of the default is not specified in the macro declaring this parameter (pin3frontend.h). It is instead provided in the .cc file. The default is tool_path, which is set to ARIEL_TOOL_DIR/fesimple.so (.dylib on mac).
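The fallback described above can be sketched roughly as follows. This is a standalone model, not the code in pin3frontend.cc: the function name resolveToolPath and its parameters are made up for illustration, and only the fesimple.so / fesimple.dylib names and the ARIEL_TOOL_DIR idea come from the notes above.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: picks the pintool path the way the notes describe.
// `arieltool` models the user-supplied parameter (empty if not given) and
// `toolDir` models the value of ARIEL_TOOL_DIR.
std::string resolveToolPath(const std::string& arieltool,
                            const std::string& toolDir,
                            bool isMac) {
    if (!arieltool.empty()) {
        return arieltool;  // an explicit arieltool setting wins
    }
    // Otherwise fall back to the default frontend in the tool directory.
    return toolDir + "/fesimple" + (isMac ? ".dylib" : ".so");
}
```

With this model, swapping in a different pintool is just a matter of passing a non-empty arieltool value.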

ARIEL_TOOL_DIR is defined in the makefile as $(libexecdir). This is a standard variable, which the GNU Make manual defines as:

The directory for installing executable programs to be run by other programs rather than by users. This directory should normally be /usr/local/libexec, but write it as $(exec_prefix)/libexec. (If you are using Autoconf, write it as ‘@libexecdir@’.)

The definition of ‘libexecdir’ is the same for all packages, so you should install your data in a subdirectory thereof. Most packages install their data under $(libexecdir)/package-name/, possibly within additional subdirectories thereof, such as $(libexecdir)/package-name/machine/version.

As I installed in a custom directory, this directory is located at ~/morrigan/install/libexec/. The fesimple.so frontend is located there.

What files are in fesimple.so? Judging by the Makefile, the only input file is fesimple.cc. This file includes pin.H so this appears to be the pintool file we want.

ArielCore

Looking at arielcore.cc.

The coreQ holds ArielEvents. Various functions push to this queue, all of which are dispatched by refillQueue. refillQueue reads from the tunnel between the pintool and the SST component (a member named tunnel) into a variable ac (for ArielCommand).

It looks for ac.command == ARIEL_START_INSTRUCTION. When it sees this, the next commands should be some number of ARIEL_PERFORM_READ and ARIEL_PERFORM_WRITE commands, followed by ARIEL_END_INSTRUCTION. This seems a reasonable place to add another type of command. We could put all call/ret instructions here, and add a new "createXEvent" function to handle them.
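To make the command sequence concrete, here is a small standalone model of it. It is only a sketch: the enum reuses the command names mentioned above but is a stand-in for the real definitions, and Tally and consume are hypothetical names that do not exist in arielcore.cc.

```cpp
#include <cassert>
#include <vector>

// Stand-in for the command IDs named above (not the real header).
enum Command {
    ARIEL_START_INSTRUCTION,
    ARIEL_PERFORM_READ,
    ARIEL_PERFORM_WRITE,
    ARIEL_END_INSTRUCTION
};

// Hypothetical summary of one pass over a command stream.
struct Tally {
    int instructions = 0;
    int reads = 0;
    int writes = 0;
};

// Walks the stream in the order described above: START, then any number of
// READ/WRITE commands, then END. Memory commands seen outside a START/END
// pair are ignored in this model.
Tally consume(const std::vector<Command>& stream) {
    Tally t;
    bool inInstruction = false;
    for (Command c : stream) {
        switch (c) {
        case ARIEL_START_INSTRUCTION:
            inInstruction = true;
            break;
        case ARIEL_PERFORM_READ:
            if (inInstruction) ++t.reads;
            break;
        case ARIEL_PERFORM_WRITE:
            if (inInstruction) ++t.writes;
            break;
        case ARIEL_END_INSTRUCTION:
            if (inInstruction) ++t.instructions;
            inInstruction = false;
            break;
        // A new command type would get its own case here.
        }
    }
    return t;
}
```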

So, how do we get those instructions on the tunnel?

The tunnel is passed in to the ArielCore constructor.

ArielCPU

Looking at arielcpu.cc.

This file gets the tunnel from fesimple. It then passes the tunnel to each ArielCore. It seems to be the same tunnel. Need to investigate whether all the traced threads just push all their info to this one tunnel, and whether there is any way to figure out where the transactions came from.

This file creates all the cores, one ArielCore each, and passes in the tunnel.

Back to fesimple

fesimple.cc writes to the tunnel.

InstrumentTrace loops over every basic block and, if a block ends in a call instruction, instruments it to call ariel_stack_call. That function pushes the call information to arielStack[thr]. Returns are handled similarly.

For writing to the tunnel, the function InstrumentInstruction is used. For memory operations, an appropriate callback is used, e.g. WriteInstructionReadOnly. For all other instructions, the WriteNoOp function is used as a callback.

So, that's where we'd have to start.
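The dispatch in InstrumentInstruction can be pictured with the standalone model below. It does not use the Pin API: InsInfo and chooseCallback are hypothetical, and the callback names are returned as strings only to show which one would be picked. WriteInstructionReadWrite is my assumption for the read-and-write case; the other names come from the notes in this document.

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of what Pin would report about one instruction.
struct InsInfo {
    bool readsMemory;
    bool writesMemory;
};

// Returns the name of the callback that would be attached to `ins`.
std::string chooseCallback(const InsInfo& ins) {
    if (ins.readsMemory && ins.writesMemory) {
        return "WriteInstructionReadWrite";  // assumed name, see lead-in
    }
    if (ins.readsMemory) {
        return "WriteInstructionReadOnly";
    }
    if (ins.writesMemory) {
        return "WriteInstructionWriteOnly";
    }
    // Everything else, including branches today, falls through to here.
    return "WriteNoOp";
}
```

The last branch is the interesting one: conditional branches currently land in the WriteNoOp case, so that is where a new check would be added.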

Plan

Goal: Capture information about executed conditional branches and provide this to Ariel cores.

Option 1: Send conditional branch information over the tunnel

  1. In fesimple.cc, extend InstrumentInstruction to also capture information about conditional branches. According to this StackOverflow post, you can detect conditional branches with if(INS_IsBranch(ins) && INS_HasFallThrough(ins)). This should happen around line 579.

  2. This function will need a new callback, like WriteInstructionWriteOnly, e.g. WriteConditionalBranch. (Note - WriteInstructionWriteOnly is just a wrapper that calls three other functions. We can safely just look at WriteInstructionWrite for reference). The new callback will need access to the thread id and the instruction pointer. This should be enough, but the target address and whether the branch was taken can also be grabbed. Respectively, the first three are IARG_THREAD_ID, IARG_INST_PTR, and IARG_BRANCH_TARGET_ADDR. I'm not sure how to check if it was taken, but I don't think ScarPhase needs this. Will update if that's wrong. The IARG_* values are filled in automatically by Pin.

  3. New functions need to be added to properly write this information to the tunnel. Looking at WriteInstructionWrite, we will need to extend ac.command to include a command we can use, such as ARIEL_CONDITIONAL_BRANCH. We will then fill in the ip and addr fields as normal. We would probably need to extend the command struct to have space for whether a branch is taken or not. That should conclude the changes to fesimple.cc.

  4. In the previous step, we mentioned a new ac.command. This is of type ArielShmemCmd_t. We can add a new command in ariel_shmem.h. Looking at this struct, it also needs a type for the instruction. Ariel isn't really set up to handle branches, so we should add a new type in ariel_inst_class.h, such as ARIEL_INST_BRANCH. Furthermore, we could probably encode the taken/not taken information in the payload member, perhaps by changing payload to a union type, or in one of the other members.

  5. I don't think pin3frontend.{h,cc} need to be changed. Nor do arielcpu.{h,cc}.

  6. Now, let's look at arielcore.cc. Line 818 is a switch on the type of command. We'll have a new case here for the branch command. Since we're not actually doing anything with the branch, this case just needs to send the information to the phase detector.
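The steps above can be sketched end to end with a standalone model. Nothing here is the real Ariel code: SketchCmd, SketchCommand, makeConditionalBranch, PhaseDetectorStub and handleCommand are all hypothetical names, and the struct layout is a guess at the minimum that would be needed rather than the layout in ariel_shmem.h. On the open question in step 2, Pin's instrumentation-argument list also documents IARG_BRANCH_TAKEN, which looks like the way to get the taken flag; that is worth confirming against the Pin version in use.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for the command enum; only the branch entry is the new proposal.
enum SketchCmd { SKETCH_PERFORM_READ, SKETCH_PERFORM_WRITE, SKETCH_CONDITIONAL_BRANCH };

// Hypothetical command record written to the tunnel.
struct SketchCommand {
    SketchCmd command;
    std::uint64_t ip;    // address of the instruction
    std::uint64_t addr;  // memory address, or the target for a branch
    bool taken;          // only meaningful for SKETCH_CONDITIONAL_BRANCH
};

// Pintool side (steps 2-3): what a WriteConditionalBranch callback would fill in.
SketchCommand makeConditionalBranch(std::uint64_t ip, std::uint64_t target, bool taken) {
    SketchCommand ac;
    ac.command = SKETCH_CONDITIONAL_BRANCH;
    ac.ip = ip;
    ac.addr = target;
    ac.taken = taken;
    return ac;
}

// Core side (step 6): placeholder for whatever the phase detector exposes.
struct PhaseDetectorStub {
    std::vector<std::uint64_t> branchIps;
    int takenCount = 0;
    void onBranch(std::uint64_t ip, bool taken) {
        branchIps.push_back(ip);
        if (taken) ++takenCount;
    }
};

// Models the new case in the command switch. Returns true if the command
// was forwarded to the phase detector.
bool handleCommand(const SketchCommand& ac, PhaseDetectorStub& pd) {
    switch (ac.command) {
    case SKETCH_CONDITIONAL_BRANCH:
        pd.onBranch(ac.ip, ac.taken);
        return true;
    default:
        return false;  // reads and writes keep their existing handling
    }
}
```

Keeping the taken flag as its own field is the simplest option for a sketch; packing it into an existing member, as step 4 suggests, would avoid growing the record.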

Option 2: BB Map

Reverse engineer the instruction pointers through a map of the basic block extents and knowledge of where each conditional branch is. This might be faster, but will require more engineering.

Actually, I don't think this approach would work. You'd have to take a memory reference, then look backwards at what conditional branch last executed. This seems impossible, or at least impractical without more tracing infrastructure. If you wanted BBVs, this would probably work, but it doesn't work for the CBRVs that ScarPhase uses.

  1. Generate a map of all basic blocks during startup of the pintool. Look at how BBs are already looped over for an idea of how to do this.

  2. You'll also need the location of every conditional branch.

  3. Somehow share this information with the arielcore. Non-trivial, as these exist in separate processes. You'd likely need to create a new shared memory structure.

  4. During runtime, every memory instruction that comes through Ariel (or perhaps only some percentage of them) will go through this map, which will tell you what BB this was in.
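The lookup in the last step could look something like the sketch below. It is a standalone model under my own assumptions: BlockExtent, BlockMap and findBlock are hypothetical names, blocks are assumed not to overlap, and the map is an ordinary in-process std::map, which sidesteps the shared-memory problem raised in step 3.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical record for one basic block.
struct BlockExtent {
    std::uint64_t end;       // one past the last byte of the block
    std::uint64_t branchIp;  // address of its conditional branch, 0 if none
};

// Blocks keyed by start address; assumes blocks do not overlap.
typedef std::map<std::uint64_t, BlockExtent> BlockMap;

// Returns the block containing `ip`, or nullptr if no block covers it.
const BlockExtent* findBlock(const BlockMap& blocks, std::uint64_t ip) {
    // upper_bound gives the first block that starts after ip, so the
    // candidate is the block just before it.
    BlockMap::const_iterator it = blocks.upper_bound(ip);
    if (it == blocks.begin()) {
        return nullptr;  // ip is below every block start
    }
    --it;
    if (ip < it->second.end) {
        return &it->second;
    }
    return nullptr;  // ip falls in a gap between blocks
}
```

Each lookup is logarithmic in the number of blocks, which is part of why sampling only some percentage of memory instructions may be attractive.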

Some References

Pin Instrumentation Arguments

  • Includes enums

Pin Generic Inspection API

SST Ariel Summary

  • I'm not sure the "Multi-Level Memory Front End" is actually implemented. I only see files called "simple".

Original ScarPhase Paper

  • A. Sembrant, D. Eklov and E. Hagersten, "Efficient software-based online phase classification," 2011 IEEE International Symposium on Workload Characterization (IISWC), 2011, pp. 104-115, doi: 10.1109/IISWC.2011.6114207.

Extending ScarPhase with Multithreading Capabilities

  • A. Sembrant, D. Black-Schaffer and E. Hagersten, "Phase behavior in serial and parallel applications," 2012 IEEE International Symposium on Workload Characterization (IISWC), 2012, pp. 47-58, doi: 10.1109/IISWC.2012.6402900.

ScarPhase Github Repo

  • More ScarPhase publications in the Readme