Skip to content

Latest commit

 

History

History
1233 lines (1056 loc) · 49.2 KB

2022-07-06.md

File metadata and controls

1233 lines (1056 loc) · 49.2 KB

Minesweeping in Materialize

Today we are going to write a game! A massively multiplayer game!

I've been increasingly interested in board games. Partly this is a byproduct of the pandemic, but shaped by an interest in creating rules and structure that nonetheless allow people to interact in creative ways. Games have a lot of variety, but I'm personally interested most in ones that are 1. cooperative, 2. asynchronous, and 3. accommodate players of varying levels of commitment.

We are going to use Materialize to implement a cooperative version of Minesweeper, inspired by https://m3o.xyz. We are also going to make what I think is a pretty cool tweak to it, that will allow teams of folks to collaborate in more interesting ways! Tremendous credit to Andreas Fuchs who proposed the name Oursweeper.

Of course, we are also going to show off some of the new and interesting properties of Materialize as we do! We'll evolve from a "relaxing" game of clicking on things at your own pace, to one where a team of players collaborately play related games that inform each other, to one of scale-out automation where teams write views to propose rafts of clicks using independent compute resources. I haven't concluded that these changes to the base game are as much "fun" as they are "interesting" and "instructive", so don't run over to KickStarter just yet.

We will do all of this in steps, to try and avoid overwhelming you.

  1. Background on Minesweeper, and our goals.
  2. Oursweeper in SQL.
  3. Using SQL to play Oursweeper.
  4. Architecture in Materialize: tables, views, indexes, materialized views.

Background on Minesweeper

Minesweeper is a game played on a large grid, in which various mines are concealed under grid squares. Your job is to discover where the mines are, indirectly. Rather than directly interact with any mines (ouch) you are instead meant to click on non-mine squares, and be told the number of adjacent squares (at most eight) that hide a mine. Using these numbers and your high-powered noggin you are meant to deduce which squares must have mines, and which squares cannot have mines, click on the latter type of square, and repeat indefinitely.

Here's an example visual that we will actually build up to. Each grid location is a count of the number of adjacent mines, or a "X" to indicate a flagged mine, or a "." to indicate "as yet unknown".

 . . . . . . . . . . . . . . . . . . . . .
 . . . . . . . . . . . . . . . . . . . . .
 . . . . . . . . . . . . . . . . . . . . .
 . . . . . . . . . . . . . . . . . . . . .
 . . . . . . . . . . . . . . . . . . . . .
 . . . . . . . . . . . . . . . . . . . . .
 . . . . . . . . . . . . . . . . . . . . .
 . . . . . . . . . . . . . . . . . . . . .
 . . . . . . . . . 2 1 2 1 3 X . . . . . .
 . . . . . . . . . 1 0 0 0 1 2 X . . . . .
 . . . . . . . . . 1 0 0 0 0 1 1 1 1 . . .
 . . . . . . . . . 1 1 1 1 1 0 0 0 1 . . .
 . . . . . . . . . . . . X 2 1 0 1 1 . . .
 . . . . . X X 2 2 2 . . . X 1 0 1 X . . .
 . . . 2 1 2 2 1 0 1 X 1 1 1 1 0 1 2 . . .
 . . . 2 0 0 0 0 0 1 1 1 0 0 0 0 0 1 . . .
 . . . 2 1 0 0 0 0 0 0 0 0 0 0 0 1 2 . . .
 . . . X 1 1 1 1 0 0 1 1 2 1 1 1 2 X . . .
 . . . . . . X 1 1 1 2 X . . . . . . . . .
 . . . . . . . . . . . . . . . . . . . . .
 . . . . . . . . . . . . . . . . . . . . .

Mechanically, you the user "left click" on a square to reveal its count (or explode), and "right click" on a square to flag it as a purported mine.

Scaling Minesweeping: Size

Minesweeper is normally played on a bounded grid, but there is no reason this needs to be the case.

The rules of Minesweeper make plenty of sense even if the grid cells spill off the four sides of the window you are looking at. Moreover, there is no reason you shouldn't be able to scroll around, beyond the limits of the window you might initially open. There is no reason you shouldn't be able to play minesweeper on a 2^64 x 2^64 game board, where you just happen to pick your favorite coordinate to center your screen on. This is what https://m3o.xyz does, as far as I can tell, and we are going to do that too.

To make this work, obviously we won't actually populate 2^128 cells. Instead, we will use a pseudorandom function from (x, y) to determine the nature of the mine at location (x, y). We will only evaluate the locations around those that a user has clicked on, determining where mines exist lazily, but consistently. If other folks visit the same location as you, either collaborators or adversaries, they will see the same mines that you saw.

The intended experience is that of a vast and limitless sea uniformly full of dangerous mines and logical opportunities.

Scaling Minesweeper: Players

Why not let multiple folks interact with the Minesweeper board at the same time?

You would need to back it in something like a database, and provide handy views that implement "the rules". Fortunately, Materialize is pretty much a databases, SQL lets us write views, so this could plausibly work out.

We'll spend most of the rest of the post doing exactly this. We will assemble SQL tables, views, and queries to mock up a massively multiplayer minesweeper on Materialize.

Scaling Minesweeper: Fun

We're going to make a few changes to Minesweeper in the interest of fun.

First, clicking on a mine would normally end the game. It won't do that here. We'll keep track of the mines you've clicked on, both flags and tests, correct and incorrect, and perhaps the leaderboards will report all in something like a Pareto frontier.

If you are a minesweeper purist, you can stop playing as soon as you click on a mine, as for better or for worse Materialize will have durably recorded your error.

Second, rather than rolling a die to see if a grid cell has a mine for everyone, each grid cell will have a mine and we'll roll a die to stamp a number onto the mine. You can play the game where 0 means that a mine is present. Your friend can play the game where 1 means that a mine is present. For either player, it is as if they are playing Minesweeper with random mine locations. But if two or more players team up, they can share their information can across the team: locations correctly flagged by players with one number are safe for players with another number.

This second change has the cool property that one player racing ahead of others leaves behind breadcrumbs that allow the others to catch up. Teammates that are slower, less invested, or perhaps just out getting lunch, don't slow down your game. And, if you get stuck, you can switch numbers and try to unstick yourself.

If you aren't wild about the change, just get all your friends to pick the same number and you'll see the same mines.

Oursweeeper in SQL

We are going to start with some module-level documentation describing our plan:

-- Oursweeper: cooperative, asynchronous minesweeper.
--
-- Each (x, y) location has a "mine", a value from 0 through 5.
-- For player `i`, mines with that value are active and may cause them to explode!
-- Locations that are not `i` can be "clicked" to report the number of adjacent `i` squares.
-- Each location has exactly one mine, so one player's discovery can inform their teammates!

Base Tables

We'll model user input as either a guess that a location is a mine ("has the value i") or a guess that a location is mine-free ("does not have the value i"). We'll refer to these as flags and tests, respectively.

Let's start with the guesses that a location is a mine (represented in the game as a flag). We'll add a few helpful views to make sure we don't accidentally treat flags as the truth.

-- Each "flag" is a supposition by `team` that the `mine` value is `flag`.
create table flags (x int, y int, flag int, team text);

-- Protect ourselves against the confusion of multiple entries.
create view distinct_flags as select distinct * from flags;

-- Restrict ourselves to correctly flagged mines.
create materialized view correct_flags as
select x, y, flag, team from distinct_flags
where get_byte(sha256((x::text || y::text)::bytes), 0) % 6 = flag;

Here we see the first instance of our function to determine how each grid cell is mined:

    -- A value 0 .. 6 determined by the pair `(x, y)`.
    get_byte(sha256((x::text || y::text)::bytes), 0) % 6

The x and y coordinates are merged, then hashed, and we pick out one byte to kinda-sorta get a random (if not fair) die roll. You could certainly change this and get a different game; no strong feelings that this is the correct function. In particular, the % 6 is where we determine what fraction of grid cells are mines (one out of six, somewhere between medium and expert difficulties for the Windows version of Minesweeper).

The guesses that a location is not a mine are handled similarly, but with an inequality instead:

-- Each "test" is a supposition that the `mine` value is not `test`.
create table tests (x int, y int, test int, team text);

-- Protect ourselves against the confusion of multiple entries.
create view distinct_tests as select distinct * from tests;

-- Restrict ourselves to correctly tested locations.
create materialized view correct_tests as
select x, y, test, team from distinct_tests
where get_byte(sha256((x::text || y::text)::bytes), 0) % 6 != test;

These two tables, flags and tests, will record the state of the game. Players play by inserting into these tables using their team name.

Observing some output

If players guess wrong the game still goes on, but let's make some views to notice their errors.

-- Errors are too bad. But they don't end the game.
create materialized view errors as
select x, y, flag as mine, 'flag' as action, team
from distinct_flags where get_byte(sha256((x::text || y::text)::bytes), 0) % 6 != flag
union all
select x, y, test as mine, 'test' as action, team
from distinct_tests where get_byte(sha256((x::text || y::text)::bytes), 0) % 6 = test;

We'll now start up in one session a subscription to a view of errors:

materialize=> copy (subscribe errors) to stdout;

There isn't any output, which makes sense as there also is not yet any input.

Let's add an incorrect guess about a mine. I happen to know that the mine at (0, 0) has value 1, so let's guess 0.

materialize=> insert into flags values (0, 0, 0, 'frank');
INSERT 0 1

Over in the other session, we get our first line of output:

materialize=> copy (subscribe errors) to stdout;
1657277533154   1       0       0       0       flag    frank

This is Materialize's SUBSCRIBE output. It reports a timestamp, a change in count, and then the columns of the changed row. In this case, our flag guess of 0 at coordinate (0, 0) is an error.

Let's introduce the correct guess:

materialize=> insert into flags values (0, 0, 1, 'frank');
INSERT 0 1

Nothing to report in the other session with the SUBSCRIBE. This entry wasn't erroneous, and so "no change" in the view. Had we run the SUBSCRIBE with the PROGRESS option, there would be a steady heartbeat of timestamp messages confirming that no changes have occured, which affirmatively distinguishes from the case that the session is just backed up.

Let's set up a scoreboard as well!

You may have noticed the team column, with 'frank' filled in there. We'll use this column as a way for multiple players to self-identify as collaborators. Their guesses, errors, and other information will all feed to the same reports.

-- Count up the correct tests and flags for each team
create materialized view scoreboard as
with
    -- distinct teams from either input.
    teams(team) as (
        select distinct team
        from (
            select team from tests
            union all
            select team from flags
        )
    ),
    -- count the number of correct tests.
    test_count(team, tests) as (
        select team, count(*)
        from correct_tests
        group by team
    ),
    -- count the number of correct flags.
    flag_count(team, flags) as (
        select team, count(*)
        from correct_flags
        group by team
    ),
    -- count the number of errors.
    error_count(team, errors) as (
        select team, count(*)
        from errors
        group by team
    )
-- left join the teams against each of the things to report.
select teams.team, tests, flags, errors
from teams
    left join test_count on (teams.team = test_count.team)
    left join flag_count on (teams.team = flag_count.team)
    left join error_count on (teams.team = error_count.team);

We can now pull down these reports for each team. We can SELECT from the view, or SUBSCRIBE to it, as you like. Maybe in the future we will reduce down to those teams that are not strictly dominated by the performance of another team ("Skyline queries" in SQL).

Probing Locations

Our next step is to take our flag and test collections, and have them produce numbers to report to the players. A correct test should reveal the count under it, as in standard minesweeper.

materialize=> select * from correct_tests;
 x | y | mine | team
---+---+------+------
(0 rows)

materialize=>

Ah. Although we know that (0, 0) contains a mine of type 1, and have flagged it as such, we haven't actually tested for 0 yet. We could do that automatically, but for now let's ask folks to do this themselves.

materialize=> insert into tests values (0, 0, 0, 'frank');
INSERT 0 1
materialize=> select * from correct_tests;
 x | y | mine | team
---+---+------+-------
 0 | 0 |    0 | frank
(1 row)

materialize=>

We need to "reach out" from each correct test, to assess the mines at each adjacent location. When doing this, we keep some notes about where we came from so that we can get the information back efficiently. Having done this, we aggregate up each probe and have a count to report!

-- Each tested location invokes a probe at neighbor locations.
create view probes as
select probe_x, probe_y, x, y, team, test
from
    correct_tests,
    generate_series(x - 1, x + 1) as probe_x,
    generate_series(y - 1, y + 1) as probe_y;

-- Each location reports some flavor of mine.
create view reports as
select x, y, get_byte(sha256((probe_x::text || probe_y::text)::bytes), 0) % 6 as mine, team, test
from probes;

-- The number of adjacent mines for probed locations.
-- We include the center and subtract one to ensure we see a zero.
create materialized view totals as
select x, y, mine, team, count(*) - 1 as count
from (
    -- union `reports` with `correct_tests` to avoid a left join.
    select * from reports
    union all
    select *, test from correct_tests
)
where mine = test
group by 1, 2, 3, 4;

If we look at the totals for the probes, we see:

materialize=> select * from totals;
 x | y | mine | team  | count 
---+---+------+-------+-------
 0 | 0 |    0 | frank |     0
(1 row)

materialize=>

Luckily, mine type 0 has no mines around it! We should probably click upon all of its neighbors!

Before going down that rabbit hole, let's first figure out how to present this information.

Presenting Information

We now have all the values for all the probes across a quite large game board. How do we get that information back to individual players without overwhelming them (or their connection)?

Each player will view the game board through a viewport. This is a description of a rectangular region, all of whose reported counts should be returned. We also need to include the mine and team, to know which subset of the information to present.

-- Each entry here results in an output viewport (window) for a certain mine.
create table viewport (x1 int, x2 int, y1 int, y2 int, mine int, team text);

Each entry in the viewport results in a small grid of locations that we want to report:

create view cells as
select distinct x, y, mine, team from
    viewport,
    generate_series(x1, x2) as x,
    generate_series(y1, y2) as y
;

To get something presentable, we'll fire off some subqueries for each element of cell, looking around in other relations that present data. In particular, we'll first look to see if there is a reported count, then if there is a flagged mine, and finally whether there is a flagged mine of a different type.

-- Convert counts to symbols, mines, clear cells, and include '.' for unknown locations.
create materialized view composited as
select x, y, mine, team,
    case
        when count is not null then count
        when mined is not null then mined
        when clear is not null then clear
        else ' .'
    end as symbol
from (
    select x, y, mine, team,
        (   -- counts from probes associated with this team.
            select ' '::text || count::text as count
            from totals
            where cells.x = totals.x
              and cells.y = totals.y
              and cells.mine = totals.mine
              and cells.team = totals.team
        ),
        (   -- a `X` for mines correctly flagged by this team.
            select ' X' as mined
            from correct_flags
            where cells.x = correct_flags.x
              and cells.y = correct_flags.y
              and cells.mine = correct_flags.flag
              and cells.team = correct_flags.team
        ),
        (   -- a `*` for locations known to be safe by flags of other mines.
            select ' *' as clear
            from correct_flags
            where cells.x = correct_flags.x
              and cells.y = correct_flags.y
              and not cells.mine = correct_flags.flag
              and cells.team = correct_flags.team
        )
    from cells
);

These should be the composited cells presenting whatever symbol we need in each location.

To present this to the user, we'll do a few shenanigans to lay out scanlines of symbols.

-- Create lines of output by grouping by `y`.
create view output as
select y, mine, string_agg(symbol, '' ORDER BY x) AS whole_row, team
from composited
group by y, mine, team;

To view the output, we'll need to introduce some viewports, specifying rectangle and mine type:

-- Define some overlapping viewports.
insert into viewport values (-10, 10, -10, 10, 0, 'frank');
insert into viewport values (-5, 10, -5, 5, 1, 'frank');

We can now peek at the output by selecting whole_row and ordering by y. Let's first look at the 0 mine:

materialize=> select whole_row from output where mine = 0 and team = 'frank' order by y;
                 whole_row
--------------------------------------------
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . 0 . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
(21 rows)

materialize=>

Fascinating!

Let's do the 1 mine next:

materialize=> select whole_row from output where mine = 1 and team = 'frank' order by y;
            whole_row
----------------------------------
  . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . .
  . . . . . X . . . . . . . . . .
  . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . .
(11 rows)

materialize=>

Equally fascinating! The same location, with different takes for different players, is visible on both outputs!

In fact, we could SUBSCRIBE to either of these views, and each player would get live updates from the other player's work. We would need a smarter interface than what we have presented, as SUBSCRIBE is perhaps overly simplistic:

materialize=> copy (subscribe (select whole_row from output where mine = 0 and team = 'frank')) to stdout;
1660587116936	20	 . . . . . . . . . . . . . . . . . . . . .
1660587116936	1	 . . . . . . . . . . 0 . . . . . . . . . .

We would want to add back in the y colunm so that we could redraw the screen. We might want to SUBSCRIBE the composited relation to get individual changed squares, rather than whole rows. There are some really promising UX ideas here, where Materialize can ship you the minimal diff of information to update your presentation.

Using SQL to play Oursweeper

You play Minesweeper by repeatedly INSERTing into flags and tests. This probably gets tedious very quickly. I couldn't tell you; I haven't tried.

Instead, we can write SQL queries to produce modifications to these tables!

Determining locations to test

We should be able to test any location that has no adjacent mines (of our type). This is just a matter of looking at totals for counts of zero, and testing all neighbors.

-- We can test locations we know have no adjacent mines!
-- In fact, we can do this over and over until we stop inserting things.
create view should_test as
select distinct new_x as x, new_y as y, mine, team
from
    (select * from totals where count = 0),
    generate_series(x - 1, x + 1) as new_x,
    generate_series(y - 1, y + 1) as new_y
except
select * from distinct_tests;

If we peek at the contents of this view, we'll see new locations that are safe to test.

materialize=> select * from should_test where team = 'frank';
 x  | y  | mine | team
----+----+------+-------
  0 |  1 |    0 | frank
  0 | -1 |    0 | frank
  1 |  0 |    0 | frank
  1 |  1 |    0 | frank
  1 | -1 |    0 | frank
 -1 |  0 |    0 | frank
 -1 |  1 |    0 | frank
 -1 | -1 |    0 | frank
(8 rows)

materialize=>

All eight locations around the (0, 0) location are safe to test for mines of type 0. We could go and INSERT each of them, or we could use a SQL query to do that for us:

materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 8

This is now an exciting moment. Let's look at the board again, for mines of type 0.

materialize=> select whole_row from output where mine = 0 and team = 'frank' order by y;
                 whole_row
-----------------------
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . 1 0 0 . . . . . . . . .
  . . . . . . . . . 1 0 0 . . . . . . . . .
  . . . . . . . . . 1 1 1 . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
(21 rows)

materialize=>

We have more zeros! You Minesweeper folks know what this means! We .. should keep clicking! We shouldn't have to keep clicking, but Materialize doesn't support WITH RECURSIVE yet (write your representative!).

materialize=> insert into tests select * from should_test;
INSERT 0 8
materialize=>

Eight more insertions! This means that things changed again.

materialize=> select whole_row from output where mine = 0 and team = 'frank' order by y;
                 whole_row
-----------------------
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . 2 1 2 1 . . . . . . . .
  . . . . . . . . . 1 0 0 0 . . . . . . . .
  . . . . . . . . . 1 0 0 0 . . . . . . . .
  . . . . . . . . . 1 1 1 1 . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
(21 rows)

materialize=>

Let's pause for a moment here and observe that the only things we are doing (twice now) are an insert followed by a select:

materialize=> insert into tests select * from should_test where team = 'frank';
materialize=> select whole_row from output where mine = 0 and team = 'frank' order by y;

The view definitions are doing the hard work behind the scenes. They make sure that the outputs are correctly updated as a function of the input changes. Combined with Materialize's Strict Serializable isolation, alternate insert and select statements show you the advancing board state you expect to see.

Ok. Enough pausing. Let's do a hand-rolled recursive process, and keep doing this INSERT INTO command until we have no more changes.

materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 4
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 3
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 5
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 6
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 6
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 3
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 9
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 8
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 4
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 3
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 3
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 6
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 7
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 7
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 4
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 4
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 3
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 0
materialize=>

You can imagine why WITH RECURSIVE can occasionally be popular.

Let's take a peek at the output again.

materialize=> select whole_row from output where mine = 0 and team = 'frank' order by y;
                 whole_row
-----------------------
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . 2 1 2 1 3 . . . . . . .
  . . . . . . . . . 1 0 0 0 1 2 . . . . . .
  . . . . . . . . . 1 0 0 0 0 1 1 1 1 . . .
  . . . . . . . . . 1 1 1 1 1 0 0 0 1 . . .
  . . . . . . . . . . . . . 2 1 0 1 1 . . .
  . . . . . . . 2 2 2 . . . . 1 0 1 . . . .
  . . . 2 1 2 2 1 0 1 . 1 1 1 1 0 1 2 . . .
  . . . 2 0 0 0 0 0 1 1 1 0 0 0 0 0 1 . . .
  . . . 2 1 0 0 0 0 0 0 0 0 0 0 0 1 2 . . .
  . . . . 1 1 1 1 0 0 1 1 2 1 1 1 2 . . . .
  . . . . . . . 1 1 1 2 . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
(21 rows)

materialize=>

That's pretty neat, isn't it! We've built up a little island. If you are familiar with Minesweeper you might now start thinking about where we can certainly place mines.

Determining locations to flag

We can place flags using a variety of rules. We'll use a simple one for now: if a grid cell with a count is short mines exactly the number of unmarked neighbors, all those neighbors must be mines. This rule is great for filling in all of those nooks and crannies you see up above.

-- We can flag locations when the count equals the number of untested neighbors.
create view should_flag_from as
select x, y, test as flag, team
from totals
where totals.count > 0 and totals.count = 9 - (select count(*) from (
    select distinct *
    from
        generate_series(x - 1, x + 1) as probe_x,
        generate_series(y - 1, y + 1) as probe_y,
        correct_tests as tests
    where tests.x = probe_x and tests.y = probe_y and tests.test = totals.test
));

-- All adjacent, unmarked locations must be mines.
create view should_flag as
select distinct * from (select probe_x as x, probe_y as y, flag, team
from
    should_flag_from,
    generate_series(x - 1, x + 1) as probe_x,
    generate_series(y - 1, y + 1) as probe_y
)
except all
select * from correct_tests
except all
select * from distinct_flags;

We can take a peek at the should_flag view, containing locations known to be mines.

materialize=> select * from should_flag where team = 'frank';
 x  | y  | flag | team
----+----+------+-------
  0 |  4 |    0 | frank
  1 |  8 |    0 | frank
  2 |  2 |    0 | frank
  3 |  3 |    0 | frank
  4 | -2 |    0 | frank
  5 | -1 |    0 | frank
  7 |  3 |    0 | frank
  7 |  7 |    0 | frank
 -7 |  7 |    0 | frank
 -5 |  3 |    0 | frank
 -4 |  3 |    0 | frank
 -4 |  8 |    0 | frank
(12 rows)

materialize=>

Let's just add them in, and check out the picture again.

materialize=> insert into flags select * from should_flag where team = 'frank';
INSERT 0 12
materialize=> select whole_row from output where mine = 0 and team = 'frank' order by y;
                 whole_row
-----------------------
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . 2 1 2 1 3 X . . . . . .
  . . . . . . . . . 1 0 0 0 1 2 X . . . . .
  . . . . . . . . . 1 0 0 0 0 1 1 1 1 . . .
  . . . . . . . . . 1 1 1 1 1 0 0 0 1 . . .
  . . . . . . . . . . . . X 2 1 0 1 1 . . .
  . . . . . X X 2 2 2 . . . X 1 0 1 X . . .
  . . . 2 1 2 2 1 0 1 X 1 1 1 1 0 1 2 . . .
  . . . 2 0 0 0 0 0 1 1 1 0 0 0 0 0 1 . . .
  . . . 2 1 0 0 0 0 0 0 0 0 0 0 0 1 2 . . .
  . . . X 1 1 1 1 0 0 1 1 2 1 1 1 2 X . . .
  . . . . . . X 1 1 1 2 X . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
(21 rows)

materialize=>

Look at all those X things!

Ready Player 1

Having marked all these mines of type 0, the boards for other mine types can be updated! Remember, each of these mines mean that the grid cell is free of mines of other types. If we peek, we'll see asterisks indicating where mines have been flagged by other players on the team.

materialize=> select whole_row from output where mine = 1 and team = 'frank' order by y;
 . . . . . . . . . . . . . . . .
 . . . . . . . . . . . . . . . .
 . . . . . . . . . . . . . . . .
 . . . . . . . . . * . . . . . .
 . . . . . . . . . . * . . . . .
 . . . . . X . . . . . . . . . .
 . . . . . . . . . . . . . . . .
 . . . . . . . * . . . . . . . .
 * * . . . . . . * . . . * . . .
 . . . . . * . . . . . . . . . .
 . . . . . . . . . . . . . . . .
materialize=>

Let's add all of our correct mine placement to tests for another user.

materialize=> insert into tests select x, y, 1, team from correct_flags where not flag = 1 and team = 'frank';
INSERT 0 12
materialize=> select whole_row from output where mine = 1 and tem = 'frank' order by y;
            whole_row
----------------------------------
  . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . .
  . . . . . . . . . 3 . . . . . .
  . . . . . . . . . . 1 . . . . .
  . . . . . X . . . . . . . . . .
  . . . . . . . . . . . . . . . .
  . . . . . . . 2 . . . . . . . .
  1 1 . . . . . . 1 . . . 2 . . .
  . . . . . 1 . . . . . . . . . .
  . . . . . . . . . . . . . . . .
(11 rows)

materialize=>

Interesting! We still see that mine there, but now have several other locations that have been explored. No zeros though, so no automatic progress to make.

How about mines of type 2?

materialize=> insert into tests select x, y, 2, team from correct_flags where not flag = 2 and team = 'frank';
INSERT 0 13
materialize=> insert into viewport values (-10, 10, -10, 10, 2, 'frank');
INSERT 0 1
materialize=> select whole_row from output where mine = 2 and team = 'frank' order by y;
                 whole_row
--------------------------------------------
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . 0 . . . . . .
  . . . . . . . . . . . . . . . 2 . . . . .
  . . . . . . . . . . 2 . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . 0 . . . . . . . .
  . . . . . 1 3 . . . . . . 3 . . . 3 . . .
  . . . . . . . . . . 2 . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . 0 . . . . . . . . . . . . . 0 . . .
  . . . . . . 1 . . . . 1 . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
(21 rows)

materialize=>

This player now has some zeros to explore, beyond the 2 at (0, 0).

materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 30
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 19
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 15
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 21
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 16
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 21
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 25
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 29
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 12
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 9
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 8
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 2
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 0

All inserted, we arrive at the following picture for player 2:

materialize=> select whole_row from output where mine = 2 and team = 'frank' order by y;
                 whole_row
--------------------------------------------
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . 1 1 1 2 . . . . .
  . . . . . 2 2 1 1 1 . . 1 0 0 1 . . . . .
  . 1 1 2 2 1 0 0 0 1 . . 2 1 2 2 . . . . .
  1 1 0 0 0 0 1 1 1 1 2 . . . . . . . . 1 1
  1 0 0 0 0 1 2 . 1 0 1 1 2 1 . . . . . 1 0
  1 0 0 0 0 1 . . 2 0 0 0 0 1 . . . . . 2 1
  1 0 0 0 0 1 3 . 4 2 1 1 2 3 . . . 3 . . 1
  1 0 1 1 1 1 . . . . 2 . . . . . . . . 2 1
  1 0 1 . . . . . . . . . . . . . 1 1 1 1 0
  1 0 1 1 1 1 3 . . . . . . . . . 2 0 0 0 0
  0 0 0 0 0 0 2 . . . . . . . . . 2 0 0 0 0
  0 1 1 2 1 1 1 . . . . 1 . . . . 4 1 2 1 1
  0 1 . . . . . . . . . . . . . . . . . . .
  0 1 1 2 1 2 . . . . . . . . . . . . . . .
(21 rows)

materialize=>

The expanded regions spill off the viewport in a few directions!

Architecture in Materialize

Forthcoming!

In-progress material

Most of what we've seen above can be implemented atop most SQL databases. If you removed all materialized from view definitions, you would ask the system to reconstruct all results from scratch each query. That isn't why we are here, but it leads us to a case study of Materialize fundamentals.

There are three fundamental ways that data are produced and maintained:

  1. Tables and sources are exogenous: someone puts stuff in there, and Materialize writes it down. The data are storage durably in the persistence layer, and need to be re-streamed each time they are used.

In fact, I used one keyword that

However, if you've been paying careful attention, you may have noticed that

Universal Minesweepers

While clicking on things can be fun, I hope the previous section has convinced you that SQL can be much better if your goal is to populate as many squares as possible.

This leads us to a new meta-game, in the spirit of Universal Paperclips: how efficiently can independent teams populate the board with flagged mines and safe tests?

We'll go through a series of Materialize mechanisms that let you build out more efficient (and better isolated) infrastructure.

Clusters

A "cluster" is Materialize's unit of isolation for computational resources, a bit like a "process" in an operating system. Within a cluster, views can share access to in-memory arrangements of data. At the same time, views within a cluster contend for resources; one badly behaved view may stall (or crash) the whole cluster.

You can create multiple clusters for isolation, but the sharing between them is less efficient. All clusters share access to all tables, sources, views, and materialized views. Indexes are not shared across clusters, and you'll need to build them on the clusters that need them.

By default you get put on a default cluster, with limited resources (and limited impact on other clusters).

The first thing we'll want to do is create a new cluster for our team, to make sure we have exclusive access to some compute resources. This is done with the create cluster command, followed promptly by the set cluster command (

-- in the cloud product sizes are opaque strings: `small`, `large`, etc.
materialize=> create cluster team_frank replicas (smol (size 'small'));
-- in the local release sizes are strings containing numbers.
materialize=> create cluster team_frank replicas (smol (size '1'));
-- in either case, set the cluster to be the one we created.
materialize=> set cluster = 'team_frank';

This spins up a new team_frank cluster with modest resources, and makes it our active cluster. Until we change the cluster, team_frank will be used for our query processing.

This alone would make all of our select * from should_test where team = 'frank' queries that much faster: we'll have more compute resources for them, and we wont be contending with other teams for those resources. At the moment there are no such other teams, but we can imagine.

Indexes

Indexes are Materialize's mechanism to maintain collections in memory and share them within the cluster.

When you create an index (on a collection) Materialize will leap in to action and build a dataflow that computes and maintains the results. When you query that collection, say select * from indexed_thing;, Materialize will pull directly from the in-memory results.

Where materialized views are about sharing data between clusters, indexes are about arranging data within a cluster. You create an index on a source, table, view, or materialized view, and it loads the contents into the memory of your cluster. More specifically, you can create an index with some subset of columns, and the data will be indexed by those columns.

You can see that selecting from should_test is not as fast as we might like:

materialize=> select * from should_test;
 x | y | test | team
---+---+------+------
(0 rows)

Time: 381.977 ms

This is because should_test is a view, and we do a fair bit of work from its inputs to its outputs.

materialize=> explain select * from should_test;
                  Optimized Plan
--------------------------------------------------
 Source materialize.public.tests (u4):           +
 | Project (#0..=#3)                             +
                                                 +
 Source materialize.public.totals (u11):         +
 | Filter (#3 = 0)                               +
 | Project (#0..=#4)                             +
                                                 +
 Query:                                          +
 %0 =                                            +
 | Get materialize.public.totals (u11)           +
 | Filter (#3 = 0)                               +
 | Project (#0..=#2, #4)                         +
 | FlatMap generate_series((#0 - 1), (#0 + 1), 1)+
 | FlatMap generate_series((#1 - 1), (#1 + 1), 1)+
 | Filter ((#0 != #4) OR (#1 != #5))             +
 | Project (#2..=#5)                             +
 | Distinct group=(#2, #3, #0, #1)               +
                                                 +
 %1 =                                            +
 | Get materialize.public.tests (u4)             +
 | Distinct group=(#0, #1, #2, #3)               +
 | Negate                                        +
                                                 +
 %2 =                                            +
 | Union %0 %1                                   +
 | Threshold                                     +

(1 row)

Time: 13.305 ms
materialize=>

When we create an index, things improve dramatically.

materialize=> create default index on should_test;
CREATE INDEX
Time: 254.244 ms
materialize=> select * from should_test;
 x | y | test | team
---+---+------+------
(0 rows)

Time: 13.896 ms
materialize=>

That time is now roughly the same time it takes to explain the query, which is mostly the overhead of me reaching Materialize in us-east-1.

Of course, this isn't for free. We still had to do the work we did above, it's just it took me at least that long to type the new query. But, if we repeatedly select from should_test we see that it is consistently prompt:

materialize=> select * from should_test;
 x | y | test | team
---+---+------+------
(0 rows)

Time: 14.153 ms
materialize=> select * from should_test;
 x | y | test | team
---+---+------+------
(0 rows)

Time: 15.938 ms
materialize=> select * from should_test;
 x | y | test | team
---+---+------+------
(0 rows)

Time: 15.749 ms
materialize=> select * from should_test;
 x | y | test | team
---+---+------+------
(0 rows)

Time: 13.309 ms
materialize=>

You can also create an index on a subset of columns (or any expressions), giving fast random access to those columns.

materialize=> create index should_test_by_team on should_test (team);

This index will make it very fast to run our select * from should_test where team = 'frank' query. You can see in the explained plan the ReadExistingIndex node is annotated with Lookup value ("frank").

materialize=> explain select * from should_test where team = 'frank';
                       Optimized Plan
------------------------------------------------------------
 %0 =                                                      +
 | ReadExistingIndex materialize.public.should_test_by_team+
 | | Lookup value ("frank")                                +
 | Project (#1..=#3, #0)                                   +

(1 row)
materialize=>

Indexes come with a cost: you have to maintain the results. Maintaining the results costs CPU and memory. Since the team_frank cluster is meant for the exclusive use of team frank, we can thin down the results before we index them.

materialize=> create view should_test_frank as select * from should_test where team = 'frank';
materialize=> create default index on should_test_frank;

We now have a view should_test_frank that is locally indexed. We'll only keep records with team = 'frank', which should reduce the work we have to do.

Indexes doing a lot of other great things, from serving results fast to speeding up join processing. There is much more to learn about them later!

Materialized views

Materialized views are Materialize's mechanism to maintain views durably and share them between clusters.

Like indexes, materialized views maintain views, and can dramatically reduce the work that is re-done for repeated queries. Unlike indexes, materialized views can be shared between clusters, which opens up new data architectures.

You may have noticed that we used several create materialized view commands in the examples above. I did these intentionally, at moments where I thought other clusters might benefit from the results. Let's take a closer look.

When we created errors we did it as a materialized view.

-- Errors are too bad. But they don't end the game.
create materialized view errors as ...

This will let anyone else observe the errors collection from their own cluster. They don't need to hop on to the default cluster and potentially overwhelm it.

We also made totals a materialized view.

-- The number of adjacent mines for probed locations.
create materialized view totals as ...

This represents the board state, and multiple teams might each like to use this on their own clusters. In our definitions of should_test and should_flag we use totals, and having it pre-defined is helpful!

Finally, we made composited a materialized view.

-- Convert counts to symbols, mines, clear cells, and include '.' for unknown locations.
create materialized view composited as ...

This is the end result of a bunch of reasoning about the board state, and likely to be interesting to many folks. By making this a materialized view, we can both allow other clusters to read the results, and reduce the query load when one selects from it.

Appendix

Why not use mutual recursion?!?

In proposed syntax, Materialize allows you to write multiple views that depend on each other, potentially recursively. This would allow you to integrate the views we used to determine should_test, and feed them back into themselves. The result is a recursive query that can take large steps without repeatedly inserting in to the tests table.

For example, you might write:

-- extract all transitively testable locations in one query
create materialized view should_test_recursive as
with mutually recursive
    -- locations we should test for a mine of type `test`.
    tests (x int, y int, test int, team text) as (
        select * from correct_tests
        union
        select new_x as x, new_y as y, mine as test, team
        from
            (select * from totals where total = 0),
            generate_series(x - 1, x + 1) as new_x,
            generate_series(y - 1, y + 1) as new_y
        where (x != new_x or y != new_y)
    ),
    -- locations that we will probe to get a mine count.
    probes (probe_x int, probe_y int, x int, y int, team text, test int) as (
        select probe_x, probe_y, x, y, team, test
        from
            tests,
            generate_series(x - 1, x + 1) as probe_x,
            generate_series(y - 1, y + 1) as probe_y
        where x != probe_x OR y != probe_y
    ),
    -- reports back from the probed locations.
    reports (x int, y int, mine int, team text, test int) AS (
        select x, y, get_byte(sha256((probe_x::text || probe_y::text)::bytes), 0) % 6 as mine, team, test
        from probes
    ),
    -- counts for each probe of the number of adjacent mines
    totals (x int, y int, mine int, team text, total int8) as (
        select x, y, mine, team, count(*) - 1 as total
        from (
            select * from reports
            union all
            select *, test from tests
        )
        where mine = test
        group by 1, 2, 3, 4
    )
select * from tests except select * from correct_tests order by x, y;

This query starts with empty collections for each bound view, but then repeatedly updates all views in sequence, as if we ran them manually. But we don't have to run them manually, and instead just get out all the new locations we should test, even those we might learn about transitively.