SparseDiscreteMeanRevertingMarket

David Byrd edited this page Jul 23, 2019 · 10 revisions

Sparse Discrete Mean Reverting Market Example

This document briefly outlines how to configure a basic sparse discrete mean reverting financial market with one exchange and a collection of "zero intelligence" background agents.

By sparse discrete we refer to a simulation process whereby events can occur at any nanosecond during the day, but arbitrary amounts of time can also be skipped with zero computation cost. This is possible due to the use of event-based rather than stepwise time advancement, and the use of a continuous mean reverting fundamental process which can be evaluated at any time without the necessity of evaluating all prior time "steps" (as in the discrete case).

This permits simulation of "realistic" market time scales where the timing of individual trades is precise, and high frequency trading is possible, but the simulation still proceeds quickly in the usual case of less frequent actions per trader (seconds or minutes).
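To make the "skip arbitrary amounts of time" idea concrete, here is a minimal sketch that assumes the fundamental follows an Ornstein-Uhlenbeck (continuous mean reverting) process. It is not the oracle's actual code. The function name sample_ou and the numeric values are hypothetical; r_bar and kappa mirror parameter names used later in this configuration. The value at any future nanosecond is drawn directly from the closed-form transition distribution, so no intermediate steps are computed.

```python
import math
import random

def sample_ou(r_prev, dt_ns, r_bar, kappa, sigma, rng):
    """Draw the process value dt_ns nanoseconds after it was observed at r_prev.

    Uses the exact Ornstein-Uhlenbeck transition: the mean decays toward r_bar
    and the variance grows toward its stationary level sigma^2 / (2 * kappa).
    """
    decay = math.exp(-kappa * dt_ns)
    mean = r_bar + (r_prev - r_bar) * decay
    variance = (sigma ** 2) / (2.0 * kappa) * (1.0 - decay ** 2)
    return rng.gauss(mean, math.sqrt(variance))

rng = random.Random(100)
one_hour_ns = 60 * 60 * 1_000_000_000

# Jump one full hour ahead with a single evaluation (illustrative values only).
value = sample_ou(r_prev=101_000, dt_ns=one_hour_ns, r_bar=100_000,
                  kappa=1e-13, sigma=1e-4, rng=rng)
```

The cost of the call does not depend on dt_ns, which is why quiet periods between events are effectively free.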

Assuming your Python environment has the required libraries, this example can be invoked simply by executing python abides.py -c sparse_zi on the command line from the top-level ABIDES directory. You may add -v to request verbose mode (which explains much of the simulation's behavior) or specify a particular random seed with -s 100 (using any integer) for reproducible behavior.

More information on configuring your environment and obtaining the required libraries will be added soon.

The Exchange Agent

The ExchangeAgent class defines an ABIDES agent derived from the base Agent class by way of the FinancialAgent class. This agent acts as a typical stock exchange with limited hours, a continuous double auction mechanism, multiple trading symbols each with a limit order book, and a core set of actionable messages and responses dealing with orders and administration. For our baseline example, the exchange will support a modest population of traders producing a "background" market into which a strategy could be injected.

Here we briefly discuss the important parameters and methods of the ExchangeAgent class.

Initialization

class ExchangeAgent(FinancialAgent):

  def __init__(self, id, name, type, mkt_open, mkt_close, symbols, book_freq='S',
               pipeline_delay = 40000, computation_delay = 1, stream_history = 0,
               log_orders = False, random_state = None):

Each exchange requires an opening (mkt_open) and closing time (mkt_close) as pandas.Timestamp objects. Prior to the opening time, the market will respond to limited queries, such as requests to know the opening time. After the closing time, the market will respond to limited queries, such as the last trade price for an equity. Order-related requests will not be honored outside these operating hours.

Each exchange also requires symbols, a dictionary of the symbols which can be traded on the exchange. Each trading symbol is a dictionary key. The value for each symbol is another dictionary holding important parameters for the symbol (for example, hyperparameters used to control the random generation of the fundamental value series for the symbol).

The book_freq parameter controls if and how the exchange will archive snapshots of the order book at the end of the simulation. A parameter value of None disables archival of the order book. Any other value is passed to Pandas as a Timestamp frequency to which the order book should be resampled (e.g. 1S for one second, 5T for five minutes, etc).

The pipeline_delay parameter adds a parallel processing pipeline delay (in integer nanoseconds) to order-related responses from the exchange. Order actions are applied immediately, but any outbound messages to agents are delayed by this duration. The exchange itself is not blocked for this duration: it can still accept and process new messages during the pipeline delay.

The computation_delay parameter functions as for any Agent. It delays the agent from processing new messages for the specified integer nanosecond delay after returning from message handling. Outbound messages also receive the delay. This simulates the agent being "busy" for the specified time, and outbound messages being generated after the processing is complete.
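The difference between the two delays can be sketched with a little timestamp arithmetic. This is an illustration under the assumption that the delays simply add to the network latency; the helper names delivery_time and next_available are hypothetical and are not part of ABIDES.

```python
def delivery_time(current_time, computation_delay, latency, pipeline_delay=0):
    """Nanosecond timestamp at which the recipient receives an outbound message.

    Assumes the message leaves after the sender's computation delay (plus the
    pipeline delay for order-related responses) and then crosses the network.
    """
    sent_time = current_time + computation_delay + pipeline_delay
    return sent_time + latency

def next_available(current_time, computation_delay):
    """Earliest time the sender can process another message.

    Only the computation delay keeps the agent busy; the pipeline delay does not.
    """
    return current_time + computation_delay

# Using the ExchangeAgent defaults shown above and a 20 microsecond latency.
order_ack = delivery_time(1000, computation_delay=1, latency=20000, pipeline_delay=40000)
query_reply = delivery_time(1000, computation_delay=1, latency=20000)
busy_until = next_available(1000, computation_delay=1)
```

In this sketch the order acknowledgement arrives 40000 ns later than the query reply, while the exchange is ready for its next message after just 1 ns.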

The stream_history parameter specifies an integer length of executed trades for the exchange to remember per symbol. It will also store all order activity leading up to those trades. This is useful for certain types of order-book aware agents like the Heuristic Belief Learning agent.
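One simple way to picture a bounded trade memory is a fixed-length queue. The sketch below is an assumption for illustration only; the TradeHistory class is hypothetical and the exchange's real storage may be organized differently.

```python
from collections import deque

class TradeHistory:
    """Remember at most stream_history executed trades for one symbol."""

    def __init__(self, stream_history):
        # A deque with maxlen silently drops the oldest entry when full.
        self.trades = deque(maxlen=stream_history)

    def record(self, price, quantity):
        self.trades.append((price, quantity))

history = TradeHistory(stream_history=2)
for price, quantity in [(10000, 5), (10010, 3), (9990, 7)]:
    history.record(price, quantity)

recent = list(history.trades)  # Only the two most recent trades remain.
```

With stream_history = 0 (the default in the signature above), a queue like this retains nothing.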

The log_orders parameter (boolean) controls whether the exchange will log, to its personal Agent log, all order-related activity. Disabling this can speed the simulation when not needed.

Simulation Lifecycle

The ExchangeAgent class overrides two of the four main simulation lifecycle methods: Agent.kernelInitializing() to additionally obtain oracular opening prices, and Agent.kernelTerminating() to additionally log exchange-specific data.

The kernelInitializing method simply retains a kernel reference (via the superclass call) and then iterates over each symbol traded on the exchange to obtain an opening price. This is recorded as a "last trade price". There are no quotes in the book when simulation begins.

To enable a generative sparse discrete mean reverting process for the stocks traded on an exchange (as opposed to historical data), select the util.oracle.SparseMeanRevertingOracle oracle class. For a generative dense discrete mean reverting process, use util.oracle.MeanRevertingOracle instead.

# The exchange agent overrides this to obtain a reference to an oracle.
# The exchange requires an oracle to generate an opening price in case agents
# query before simulated trades are made.  Once an opening cross auction is
# implemented, this should not be needed.
def kernelInitializing (self, kernel):
  super().kernelInitializing(kernel)

  self.oracle = self.kernel.oracle

  # Obtain opening prices (in integer cents).  These are not noisy.
  for symbol in self.order_books:
    self.order_books[symbol].last_trade = self.oracle.getDailyOpenPrice(symbol, self.mkt_open)
    log_print ("Opening price for {} is {}", symbol, self.order_books[symbol].last_trade)

The kernelTerminating method is quite busy for the exchange agent because of all its special data. First, it writes its agent log as usual (via the superclass call), which will contain raw order logs if requested in the configuration. Then it checks to see if the oracle has a log of fundamental prices, which it serializes for each symbol. Then, if requested, it serializes periodic, full-depth snapshots of the order book for each symbol during the trading period.

# The exchange agent overrides this to additionally log the full depth of its
# order books for the entire day.
def kernelTerminating (self):
  super().kernelTerminating()

  # If the oracle supports writing the fundamental value series for its
  # symbols, write them to disk.
  if hasattr(self.oracle, 'f_log'):
    for symbol in self.oracle.f_log:
      ### Truncated for brevity: write out this symbol's fundamental. ###

  # Skip order book dump if requested.
  if self.book_freq is None: return

  # Iterate over the order books controlled by this exchange.
  for symbol in self.order_books:
    book = self.order_books[symbol]

    # Log full depth quotes (price, volume) from this order book at some pre-determined frequency.
    # Here we are looking at the actual log for this order book (i.e. are there snapshots to export,
    # independent of the requested frequency).
    if book.book_log:
      ### Truncated for brevity: write out this symbol's order book snapshots. ###

Simulation Event Loop

The receiveMessage lifecycle method awakens the exchange to respond to the requests of other agents. It requires the Kernel to deliver the exchange agent's current simulation timestamp and an instance of the Message class. It first allows superclass behavior and then configures the exchange agent's computation delay, discussed with the __init__ method above.

def receiveMessage (self, currentTime, msg):
  super().receiveMessage(currentTime, msg)

  # Note that computation delay MUST be updated before any calls to sendMessage.
  self.setComputationDelay(self.computation_delay)

Next, the exchange tests to see if it is after the market close time. If so, it will still process query-style messages normally, to allow agents to obtain information about the final state of the market. To all other requests, it will respond with a "market closed" message, without any further processing.

  # Is the exchange closed?  (This block only affects post-close, not pre-open.)
  if currentTime > self.mkt_close:
    # Most messages after close will receive a 'MKT_CLOSED' message in response.  A few things
    # might still be processed, like requests for final trade prices or such.
    if msg.body['msg'] in ['LIMIT_ORDER', 'CANCEL_ORDER']:
      log_print ("{} received {}: {}", self.name, msg.body['msg'], msg.body['order'])
      self.sendMessage(msg.body['sender'], Message({ "msg": "MKT_CLOSED" }))

      # Don't do any further processing on these messages!
      return
    elif 'QUERY' in msg.body['msg']:
      # Specifically do allow querying after market close, so agents can get the
      # final trade of the day as their "daily close" price for a symbol.
      pass
    else:
      log_print ("{} received {}, discarded: market is closed.", self.name, msg.body['msg'])
      self.sendMessage(msg.body['sender'], Message({ "msg": "MKT_CLOSED" }))

      # Don't do any further processing on these messages!
      return
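The post-close branching above can be condensed into a small, self-contained decision function. This is a simplified sketch for illustration: the name after_close_action and its string return values are hypothetical, and plain integers stand in for timestamps.

```python
ORDER_MESSAGES = ('LIMIT_ORDER', 'CANCEL_ORDER')

def after_close_action(msg_type, current_time, mkt_close):
    """Return 'process' or 'reject' following the post-close rules above."""
    if current_time <= mkt_close:
        # This check only affects post-close handling, not pre-open.
        return 'process'
    if msg_type in ORDER_MESSAGES:
        return 'reject'    # Answered with MKT_CLOSED.
    if 'QUERY' in msg_type:
        return 'process'   # Agents may still ask for final prices.
    return 'reject'        # Everything else also receives MKT_CLOSED.

decisions = [after_close_action(m, current_time=17, mkt_close=16)
             for m in ('LIMIT_ORDER', 'QUERY_LAST_TRADE', 'WHEN_MKT_OPEN')]
```

Note that the real method distinguishes the two rejection branches only by what it logs; both send the same MKT_CLOSED reply.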

Next, the exchange logs messages that could result in placing or cancelling an order, if that was requested during initialization.

  # Log order messages only if that option is configured.  Log all other messages.
  if msg.body['msg'] in ['LIMIT_ORDER', 'CANCEL_ORDER']:
    if self.log_orders: self.logEvent(msg.body['msg'], js.dump(msg.body['order']))
  else:
    self.logEvent(msg.body['msg'], msg.body['sender'])

The final and longest section of the method handles all the specific message types understood by the exchange. They are: WHEN_MKT_OPEN, WHEN_MKT_CLOSE, QUERY_LAST_TRADE, QUERY_SPREAD, QUERY_ORDER_STREAM, LIMIT_ORDER, CANCEL_ORDER, and MODIFY_ORDER.

  # Handle all message types understood by this exchange.
  if msg.body['msg'] == "WHEN_MKT_OPEN":
    ### Truncated for brevity ###

The sendMessage lifecycle method requires an agent id for the recipient and an instance of the Message class. It applies the configured parallel processing pipeline delay requested during initialization to all message types that could have altered the order book.

def sendMessage (self, recipientID, msg):
  # The ExchangeAgent automatically applies appropriate parallel processing pipeline delay
  # to those message types which require it.
  if msg.body['msg'] in ['ORDER_ACCEPTED', 'ORDER_CANCELLED', 'ORDER_EXECUTED']:
    # Messages that require order book modification (not simple queries) incur the additional
    # parallel processing delay as configured.
    super().sendMessage(recipientID, msg, delay = self.pipeline_delay)
    if self.log_orders: self.logEvent(msg.body['msg'], js.dump(msg.body['order']))
  else:
    # Other message types incur only the currently-configured computation delay for this agent.
    super().sendMessage(recipientID, msg)

The Sparse ZI Configuration File

Each ABIDES experiment begins with a configuration file in the ./config directory. These configuration files are simply Python code that sets up and then kicks off the desired simulation environment. There is a lot of similarity among experimental configurations. The primary element that will change is the agent population and parameterization.

import argparse
import sys

parser = argparse.ArgumentParser(description='Detailed options for sparse_zi config.')
parser.add_argument('-b', '--book_freq', default=None,
                    help='Frequency at which to archive order book for visualization')
parser.add_argument('-c', '--config', required=True,
                    help='Name of config file to execute')
parser.add_argument('-l', '--log_dir', default=None,
                    help='Log directory name (default: unix timestamp at program start)')
parser.add_argument('-n', '--obs_noise', type=float, default=1000000,
                    help='Observation noise variance for zero intelligence agents (sigma^2_n)')
parser.add_argument('-o', '--log_orders', action='store_true',
                    help='Log every order-related action by every agent.')
parser.add_argument('-s', '--seed', type=int, default=None,
                    help='numpy.random.seed() for simulation')
parser.add_argument('-v', '--verbose', action='store_true',
                    help='Maximum verbosity!')
parser.add_argument('--config_help', action='store_true',
                    help='Print argument options for this config file')

args, remaining_args = parser.parse_known_args()

if args.config_help:
  parser.print_help()
  sys.exit()

ABIDES uses the standard Python argparse module to handle command line parameters. The sparse_zi configuration uses three custom parameters in addition to those standard to all ABIDES configurations. The arguments are parsed with the parse_known_args method because this permits abides.py and the config files to 'peel off' arguments in multiple stages, leaving the rest for possible subsequent parsing.

The -c <config name> parameter is read by the abides.py bootstrapper and used to select a configuration file. The -l <log dir> optional parameter can be used to force a particular name on the log subdirectory (under ./log) where all log files will be written; if omitted, the directory name will be the current timestamp. The -s <seed> optional parameter can be used to set a global random seed to kick off the experiment, producing consistent results if all underlying code depends on this seed; if omitted, the current timestamp is used to seed the experiment. The optional -v parameter requests verbose mode, in which all calls to the util.log_print method will actually be processed; if omitted, the simulation runs in silent mode and calls to util.log_print are efficiently discarded. The optional --config_help parameter displays help text about the parameters and exits without starting a simulation.

There are three custom parameters for the sparse_zi config. The -b <freq> parameter is a Pandas frequency string that controls the frequency of order book snapshots to log, with None disabling order book snapshots. The -n <float> parameter controls the variance (noise) in agent observations of the fundamental value of a stock at a particular time. The -o parameter, if specified, instructs agents to log all order-related activity; if omitted, order-related activity will not be logged. These logged order events are detailed and their serialization can be slow. Thus, omitting them from the logs can provide a notable performance boost when they are not required for analysis.
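The 'peel off' behavior of parse_known_args can be seen in isolation with a two-stage sketch. The split of options between the two parsers here is a simplification for illustration (the function name parse_in_stages is hypothetical); in the real configuration each stage declares whichever options it needs.

```python
import argparse

def parse_in_stages(argv):
    # Stage one: a bootstrapper that only knows the common options.
    bootstrap = argparse.ArgumentParser(add_help=False)
    bootstrap.add_argument('-c', '--config', required=True)
    bootstrap.add_argument('-v', '--verbose', action='store_true')
    common, remaining = bootstrap.parse_known_args(argv)

    # Stage two: the config file parses what the bootstrapper left behind.
    config = argparse.ArgumentParser(add_help=False)
    config.add_argument('-n', '--obs_noise', type=float, default=1000000)
    config.add_argument('-s', '--seed', type=int, default=None)
    custom, leftover = config.parse_known_args(remaining)

    return common, remaining, custom, leftover

common, remaining, custom, leftover = parse_in_stages(
    ['-c', 'sparse_zi', '-v', '-n', '500', '-s', '100'])
```

Unlike parse_args, parse_known_args does not raise an error on unrecognized options; it returns them as a list so a later stage can handle them.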

historical_date = pd.to_datetime('2014-01-28')

midnight = historical_date
kernelStartTime = midnight

kernelStopTime = midnight + pd.to_timedelta('17:00:00')

defaultComputationDelay = 1000000000    # One second, in nanoseconds.

This section controls the basic setup of time in the simulation. A specific date to simulate must be given as a Pandas Timestamp object. If no particular real-world date is required (as it would be when replaying historical market data), any arbitrary date may be given. The sparse_zi configuration uses generated market data, not historical, so an arbitrary date is used.

For the sparse_zi configuration, the Kernel will begin simulating time at midnight on the selected date. The call to pd.to_datetime has already defaulted to midnight on the given date, so this is the correct time. The Kernel will end simulation time at 5:00 PM the same day, as computed using a 17 hour offset with pd.to_timedelta. This is the general method for working with times in ABIDES: Pandas Timestamp objects modified by Pandas Timedelta objects.

The final line configures the default computation delay for all agents in the simulation. ABIDES implements the idea of "thinking time" for each agent. This permits experiments which explicitly account for (and effectively penalize) agents that take a long time to compute their decisions. All messages sent by the agent during a wake cycle will be delayed by the agent's computation time, and the agent will be unable to receive any new outside information (wakeup or receiveMessage) until its computation time has passed.

kernel = Kernel("Base Kernel", random_state = np.random.RandomState(seed=np.random.randint(low=0,high=2**32)))

This line instantiates the ABIDES Kernel, which is the primary driver and gatekeeper for the simulation. It handles message passing, enforces the passage of time, and performs other core functions. Here we supply it with an initialized RandomState object (using the global seed) to allow reproduction of the same simulation.
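The pattern of drawing each component's seed from one globally seeded generator can be sketched with the standard library. This is an analogue of the np.random.RandomState(seed=np.random.randint(low=0, high=2**32)) idiom used in this configuration, not a replacement for it; the function name make_child_rngs is hypothetical.

```python
import random

def make_child_rngs(global_seed, count):
    """Create independent generators whose seeds all derive from global_seed."""
    master = random.Random(global_seed)
    # Each child receives its own seed drawn from the master generator.
    return [random.Random(master.randrange(2 ** 32)) for _ in range(count)]

# Two runs with the same global seed reproduce the same per-agent streams.
first_run = [rng.random() for rng in make_child_rngs(100, 3)]
second_run = [rng.random() for rng in make_child_rngs(100, 3)]
```

Giving every agent its own generator means that one agent's random draws do not shift the stream seen by any other agent.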

agent_count = 0
agents = []
agent_types = []

These lines are used to track the agent population as we instantiate various instances of the Agent class and its children. agent_count tracks the total number of agents created, agents is a list of the actual initialized agent objects (to pass to the Kernel), and agent_types is a matching list of the exact subclass to which each agent belongs (for summarization of results by agent "type").

mkt_open = midnight + pd.to_timedelta('09:30:00')
mkt_close = midnight + pd.to_timedelta('16:00:00')

oracle = SparseMeanRevertingOracle(mkt_open, mkt_close, symbols)

num_exchanges = 1
agents.extend([ ExchangeAgent(j, "Exchange Agent {}".format(j), "ExchangeAgent", mkt_open, mkt_close, [s for s in symbols], log_orders=log_orders, book_freq=book_freq, pipeline_delay = 0, computation_delay = 0, stream_history = 10, random_state = np.random.RandomState(seed=np.random.randint(low=0,high=2**32))) for j in range(agent_count, agent_count + num_exchanges) ])
agent_types.extend(["ExchangeAgent" for j in range(num_exchanges)])
agent_count += num_exchanges

The above section of code configures one exchange agent for the simulation, with market hours of 9:30 AM to 4:00 PM. It creates a SparseMeanRevertingOracle for the exchange, which will generate a random fundamental value series for each stock that can be sampled at any point in time, without the requirement to compute all intervening time steps.

The new ExchangeAgent instance is appended to the agents list, its agent class name is appended to the agent_types list, and the count of all agents is incremented appropriately. See the ExchangeAgent section for discussion of parameters particular to that agent.

The symbols dictionary specifies the important properties used to compute the fundamental value series for each symbol traded on a particular exchange in the simulation. The exact properties required depend on the Oracle in use (from the util.oracle module). Each symbol's trading ticker appears as a key in the dictionary, with the value being a dictionary of properties for that symbol. The symbol ticker strings (only) are given to the exchange agent, while the full dictionary of parameters is given to the oracle.
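As a rough illustration of the shape described above, a symbols dictionary might look like the following. The keys r_bar, agent_kappa, and fund_vol are the ones this configuration reads when creating agents; the numeric values are hypothetical placeholders, and the oracle in use may require additional keys that are not shown here.

```python
# Hypothetical values for illustration only; prices are in integer cents.
symbols = {
    'IBM': {
        'r_bar': 100000,       # Mean fundamental value.
        'agent_kappa': 1e-15,  # Mean reversion rate assumed by agents.
        'fund_vol': 1e-4,      # Volatility of the fundamental series.
    }
}

# The exchange receives only the ticker strings...
exchange_symbols = [s for s in symbols]

# ...while the oracle receives the full per-symbol parameter dictionary.
oracle_symbols = symbols
```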

### Configure some zero intelligence agents.
# Cash in this simulator is always in CENTS.
starting_cash = 10000000
symbol = 'IBM'
s = symbols[symbol]

# Tuples are: (# agents, R_min, R_max, eta).
zi = [ (15, 0, 250, 1), (15, 0, 500, 1), (14, 0, 1000, 0.8), (14, 0, 1000, 1), (14, 0, 2000, 0.8), (14, 250, 500, 0.8), (14, 250, 500, 1) ]

for i,x in enumerate(zi):
  strat_name = "Type {} [{} <= R <= {}, eta={}]".format(i+1, x[1], x[2], x[3])
  agents.extend([ ZeroIntelligenceAgent(j, "ZI Agent {} {}".format(j, strat_name), "ZeroIntelligenceAgent {}".format(strat_name), random_state = np.random.RandomState(seed=np.random.randint(low=0,high=2**32)),log_orders=log_orders, symbol=symbol, starting_cash=starting_cash, sigma_n=sigma_n, r_bar=s['r_bar'], kappa=s['agent_kappa'], sigma_s=s['fund_vol'], q_max=10, sigma_pv=5e6, R_min=x[1], R_max=x[2], eta=x[3], lambda_a=1e-12) for j in range(agent_count,agent_count+x[0]) ])
  agent_types.extend([ "ZeroIntelligenceAgent {}".format(strat_name) for j in range(x[0]) ])
  agent_count += x[0]

The above section sets up the configuration of zero intelligence agents for this simulation. Each agent will begin with $100,000.00 in cash and will trade only the symbol 'IBM'. (The symbol is arbitrary here, since we are using a generated fundamental value series.) The list of tuples configures groups of agents with different parameterizations: 15 agents that select requested surplus uniformly from the range 0-250 with a strategic threshold parameter (eta) of 1.0, 15 that select requested surplus from 0-500 with eta 1.0, and so on.
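As a quick sanity check on the population described above, the tuple list can be summarized without instantiating any agents. The helper describe_population is hypothetical; the tuple values and the name format are copied from the configuration.

```python
starting_cash = 10000000  # Cash is always in cents.

# Tuples are: (# agents, R_min, R_max, eta).
zi = [(15, 0, 250, 1), (15, 0, 500, 1), (14, 0, 1000, 0.8), (14, 0, 1000, 1),
      (14, 0, 2000, 0.8), (14, 250, 500, 0.8), (14, 250, 500, 1)]

def describe_population(zi):
    """Return (count, strategy name) pairs using the config's naming scheme."""
    return [(count, "Type {} [{} <= R <= {}, eta={}]".format(i + 1, r_min, r_max, eta))
            for i, (count, r_min, r_max, eta) in enumerate(zi)]

population = describe_population(zi)
total_agents = sum(count for count, _ in population)
starting_dollars = starting_cash / 100
```

The seven groups add up to 100 zero intelligence agents, each starting with 10,000,000 cents.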

latency = np.random.uniform(low = 21000, high = 13000000, size=(len(agent_types),len(agent_types)))

for i, t1 in zip(range(latency.shape[0]), agent_types):
  for j, t2 in zip(range(latency.shape[1]), agent_types):
    if j > i:
      if (t1 == "ZeroIntelligenceAgent" and t2 == "ZeroIntelligenceAgent"):
        latency[i,j] = 1000000000 * 60 * 60 * 24    # Twenty-four hours.
    elif i > j:
      latency[i,j] = latency[j,i]
    else:
      latency[i,j] = 20000

This section of code configures our pairwise communication latency matrix for the agent population. We initialize a square matrix (indexed by sending agent and receiving agent, with latency in ns) with latency values that range from "colocation" latencies to "New York City to Seattle" latencies. The subsequent for loops override this with a special 24 hour latency for strategy agents communicating with other strategy agents (because this is forbidden in our experiment) and a 20 microsecond delay for agents communicating with themselves via loopback, and mirror the bottom half of the matrix to match the top half so that all latency is pairwise symmetric. Note that the type comparison is an exact string match, while the zero intelligence agents in this configuration are registered with type strings that also contain the strategy name, so as written the 24 hour override does not apply to them.
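The same construction can be sketched without NumPy to make the three cases (upper triangle, lower triangle, diagonal) explicit. This is an illustrative variant rather than the configuration's code: the function build_latency and its blocked_prefix parameter are hypothetical, and the prefix match is a deliberate change from the exact string comparison used above.

```python
import random

DAY_NS = 1000000000 * 60 * 60 * 24  # Twenty-four hours in nanoseconds.

def build_latency(agent_types, rng, blocked_prefix=None):
    """Build a symmetric pairwise latency matrix (nested lists, values in ns)."""
    n = len(agent_types)
    latency = [[rng.uniform(21000, 13000000) for _ in range(n)] for _ in range(n)]
    for i, t1 in enumerate(agent_types):
        for j, t2 in enumerate(agent_types):
            if j > i:
                # Upper triangle: keep the random draw unless the pair is blocked.
                if (blocked_prefix and t1.startswith(blocked_prefix)
                        and t2.startswith(blocked_prefix)):
                    latency[i][j] = DAY_NS
            elif i > j:
                # Lower triangle: mirror the already-final upper triangle.
                latency[i][j] = latency[j][i]
            else:
                # Diagonal: loopback latency of 20 microseconds.
                latency[i][j] = 20000
    return latency

types = ["ExchangeAgent", "ZeroIntelligenceAgent Type 1", "ZeroIntelligenceAgent Type 2"]
matrix = build_latency(types, random.Random(100), blocked_prefix="ZeroIntelligenceAgent")
```

Because rows are processed in order, every upper-triangle entry is final before the row that mirrors it is reached.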

noise = [ 0.25, 0.25, 0.20, 0.15, 0.10, 0.05 ]

This line defines an extremely simple latency noise model, which can add a few nanoseconds of extra delay to message delivery. The list index is the number of nanoseconds to add and the value is the probability of selecting that index. Thus it represents a discrete probability distribution (a probability mass function) for latency noise, and its values sum to one.
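Sampling from such a model amounts to a weighted choice over the list indices. The sketch below uses the standard library and is only an illustration of the idea; the function name sample_noise_ns is hypothetical and this is not how the Kernel is known to draw its noise.

```python
import random

noise = [0.25, 0.25, 0.20, 0.15, 0.10, 0.05]

def sample_noise_ns(noise, rng):
    """Pick an extra delay in nanoseconds: index i is chosen with probability noise[i]."""
    return rng.choices(range(len(noise)), weights=noise, k=1)[0]

rng = random.Random(100)
samples = [sample_noise_ns(noise, rng) for _ in range(1000)]

# Expected extra delay implied by the model, in nanoseconds.
expected_ns = sum(i * p for i, p in enumerate(noise))
```

For the list above, the expected added delay works out to 1.75 ns, so the noise is tiny compared with the base latencies.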

kernel.runner(agents = agents, startTime = kernelStartTime,
              stopTime = kernelStopTime, agentLatency = latency,
              latencyNoise = noise,
              defaultComputationDelay = defaultComputationDelay,
              oracle = oracle, log_dir = log_dir)

Once everything is configured, we finally kick off the Kernel we configured earlier, passing it the agent population list, the start and stop times for the simulation, the latency matrix and noise model, the default computation delay for agents, the oracle used for fundamental value series generation, and the output log directory. Note that agents can always update their own individual computation delay to better represent their activities.