Today we are going to write a game! A massively multiplayer game!
I've been increasingly interested in board games. Partly this is a byproduct of the pandemic, but shaped by an interest in creating rules and structure that nonetheless allow people to interact in creative ways. Games have a lot of variety, but I'm personally interested most in ones that are 1. cooperative, 2. asynchronous, and 3. accommodate players of varying levels of commitment.
We are going to use Materialize to implement a cooperative version of Minesweeper, inspired by https://m3o.xyz. We are also going to make what I think is a pretty cool tweak to it, that will allow teams of folks to collaborate in more interesting ways! Tremendous credit to Andreas Fuchs who proposed the name Oursweeper.
Of course, we are also going to show off some of the new and interesting properties of Materialize as we do! We'll evolve from a "relaxing" game of clicking on things at your own pace, to one where a team of players collaborately play related games that inform each other, to one of scale-out automation where teams write views to propose rafts of clicks using independent compute resources. I haven't concluded that these changes to the base game are as much "fun" as they are "interesting" and "instructive", so don't run over to KickStarter just yet.
We will do all of this in steps, to try and avoid overwhelming you.
- Background on Minesweeper, and our goals.
- Oursweeper in SQL.
- Using SQL to play Oursweeper.
- Architecture in Materialize: tables, views, indexes, materialized views.
Minesweeper is a game played on a large grid, in which various mines are concealed under grid squares. Your job is to discover where the mines are, indirectly. Rather than directly interact with any mines (ouch) you are instead meant to click on non-mine squares, and be told the number of adjacent squares (at most eight) that hide a mine. Using these numbers and your high-powered noggin you are meant to deduce which squares must have mines, and which squares cannot have mines, click on the latter type of square, and repeat indefinitely.
Here's an example visual that we will actually build up to.
Each grid location is a count of the number of adjacent mines, or a "X
" to indicate a flagged mine, or a ".
" to indicate "as yet unknown".
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . 2 1 2 1 3 X . . . . . .
. . . . . . . . . 1 0 0 0 1 2 X . . . . .
. . . . . . . . . 1 0 0 0 0 1 1 1 1 . . .
. . . . . . . . . 1 1 1 1 1 0 0 0 1 . . .
. . . . . . . . . . . . X 2 1 0 1 1 . . .
. . . . . X X 2 2 2 . . . X 1 0 1 X . . .
. . . 2 1 2 2 1 0 1 X 1 1 1 1 0 1 2 . . .
. . . 2 0 0 0 0 0 1 1 1 0 0 0 0 0 1 . . .
. . . 2 1 0 0 0 0 0 0 0 0 0 0 0 1 2 . . .
. . . X 1 1 1 1 0 0 1 1 2 1 1 1 2 X . . .
. . . . . . X 1 1 1 2 X . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
Mechanically, you the user "left click" on a square to reveal its count (or explode), and "right click" on a square to flag it as a purported mine.
Minesweeper is normally played on a bounded grid, but there is no reason this needs to be the case.
The rules of Minesweeper make plenty of sense even if the grid cells spill off the four sides of the window you are looking at. Moreover, there is no reason you shouldn't be able to scroll around, beyond the limits of the window you might initially open. There is no reason you shouldn't be able to play minesweeper on a 2^64 x 2^64 game board, where you just happen to pick your favorite coordinate to center your screen on. This is what https://m3o.xyz does, as far as I can tell, and we are going to do that too.
To make this work, obviously we won't actually populate 2^128 cells.
Instead, we will use a pseudorandom function from (x, y)
to determine the nature of the mine at location (x, y)
.
We will only evaluate the locations around those that a user has clicked on, determining where mines exist lazily, but consistently.
If other folks visit the same location as you, either collaborators or adversaries, they will see the same mines that you saw.
The intended experience is that of a vast and limitless sea uniformly full of dangerous mines and logical opportunities.
Why not let multiple folks interact with the Minesweeper board at the same time?
You would need to back it in something like a database, and provide handy views that implement "the rules". Fortunately, Materialize is pretty much a databases, SQL lets us write views, so this could plausibly work out.
We'll spend most of the rest of the post doing exactly this. We will assemble SQL tables, views, and queries to mock up a massively multiplayer minesweeper on Materialize.
We're going to make a few changes to Minesweeper in the interest of fun.
First, clicking on a mine would normally end the game. It won't do that here. We'll keep track of the mines you've clicked on, both flags and tests, correct and incorrect, and perhaps the leaderboards will report all in something like a Pareto frontier.
If you are a minesweeper purist, you can stop playing as soon as you click on a mine, as for better or for worse Materialize will have durably recorded your error.
Second, rather than rolling a die to see if a grid cell has a mine for everyone, each grid cell will have a mine and we'll roll a die to stamp a number onto the mine.
You can play the game where 0
means that a mine is present.
Your friend can play the game where 1
means that a mine is present.
For either player, it is as if they are playing Minesweeper with random mine locations.
But if two or more players team up, they can share their information can across the team: locations correctly flagged by players with one number are safe for players with another number.
This second change has the cool property that one player racing ahead of others leaves behind breadcrumbs that allow the others to catch up. Teammates that are slower, less invested, or perhaps just out getting lunch, don't slow down your game. And, if you get stuck, you can switch numbers and try to unstick yourself.
If you aren't wild about the change, just get all your friends to pick the same number and you'll see the same mines.
We are going to start with some module-level documentation describing our plan:
-- Oursweeper: cooperative, asynchronous minesweeper.
--
-- Each (x, y) location has a "mine", a value from 0 through 5.
-- For player `i`, mines with that value are active and may cause them to explode!
-- Locations that are not `i` can be "clicked" to report the number of adjacent `i` squares.
-- Each location has exactly one mine, so one player's discovery can inform their teammates!
We'll model user input as either a guess that a location is a mine ("has the value i
") or a guess that a location is mine-free ("does not have the value i
").
We'll refer to these as flags
and tests
, respectively.
Let's start with the guesses that a location is a mine (represented in the game as a flag).
We'll add a few helpful views to make sure we don't accidentally treat flags
as the truth.
-- Each "flag" is a supposition by `team` that the `mine` value is `flag`.
create table flags (x int, y int, flag int, team text);
-- Protect ourselves against the confusion of multiple entries.
create view distinct_flags as select distinct * from flags;
-- Restrict ourselves to correctly flagged mines.
create materialized view correct_flags as
select x, y, flag, team from distinct_flags
where get_byte(sha256((x::text || y::text)::bytes), 0) % 6 = flag;
Here we see the first instance of our function to determine how each grid cell is mined:
-- A value 0 .. 6 determined by the pair `(x, y)`.
get_byte(sha256((x::text || y::text)::bytes), 0) % 6
The x
and y
coordinates are merged, then hashed, and we pick out one byte to kinda-sorta get a random (if not fair) die roll.
You could certainly change this and get a different game; no strong feelings that this is the correct function.
In particular, the % 6
is where we determine what fraction of grid cells are mines (one out of six, somewhere between medium and expert difficulties for the Windows version of Minesweeper).
The guesses that a location is not a mine are handled similarly, but with an inequality instead:
-- Each "test" is a supposition that the `mine` value is not `test`.
create table tests (x int, y int, test int, team text);
-- Protect ourselves against the confusion of multiple entries.
create view distinct_tests as select distinct * from tests;
-- Restrict ourselves to correctly tested locations.
create materialized view correct_tests as
select x, y, test, team from distinct_tests
where get_byte(sha256((x::text || y::text)::bytes), 0) % 6 != test;
These two tables, flags
and tests
, will record the state of the game.
Players play by inserting into these tables using their team
name.
If players guess wrong the game still goes on, but let's make some views to notice their errors.
-- Errors are too bad. But they don't end the game.
create materialized view errors as
select x, y, flag as mine, 'flag' as action, team
from distinct_flags where get_byte(sha256((x::text || y::text)::bytes), 0) % 6 != flag
union all
select x, y, test as mine, 'test' as action, team
from distinct_tests where get_byte(sha256((x::text || y::text)::bytes), 0) % 6 = test;
We'll now start up in one session a subscription to a view of errors:
materialize=> copy (subscribe errors) to stdout;
There isn't any output, which makes sense as there also is not yet any input.
Let's add an incorrect guess about a mine.
I happen to know that the mine at (0, 0)
has value 1
, so let's guess 0
.
materialize=> insert into flags values (0, 0, 0, 'frank');
INSERT 0 1
Over in the other session, we get our first line of output:
materialize=> copy (subscribe errors) to stdout;
1657277533154 1 0 0 0 flag frank
This is Materialize's SUBSCRIBE output.
It reports a timestamp, a change in count, and then the columns of the changed row.
In this case, our flag
guess of 0
at coordinate (0, 0)
is an error.
Let's introduce the correct guess:
materialize=> insert into flags values (0, 0, 1, 'frank');
INSERT 0 1
Nothing to report in the other session with the SUBSCRIBE
.
This entry wasn't erroneous, and so "no change" in the view.
Had we run the SUBSCRIBE
with the PROGRESS
option, there would be a steady heartbeat of timestamp messages confirming that no changes have occured, which affirmatively distinguishes from the case that the session is just backed up.
Let's set up a scoreboard as well!
You may have noticed the team
column, with 'frank'
filled in there.
We'll use this column as a way for multiple players to self-identify as collaborators.
Their guesses, errors, and other information will all feed to the same reports.
-- Count up the correct tests and flags for each team
create materialized view scoreboard as
with
-- distinct teams from either input.
teams(team) as (
select distinct team
from (
select team from tests
union all
select team from flags
)
),
-- count the number of correct tests.
test_count(team, tests) as (
select team, count(*)
from correct_tests
group by team
),
-- count the number of correct flags.
flag_count(team, flags) as (
select team, count(*)
from correct_flags
group by team
),
-- count the number of errors.
error_count(team, errors) as (
select team, count(*)
from errors
group by team
)
-- left join the teams against each of the things to report.
select teams.team, tests, flags, errors
from teams
left join test_count on (teams.team = test_count.team)
left join flag_count on (teams.team = flag_count.team)
left join error_count on (teams.team = error_count.team);
We can now pull down these reports for each team.
We can SELECT
from the view, or SUBSCRIBE
to it, as you like.
Maybe in the future we will reduce down to those teams that are not strictly dominated by the performance of another team ("Skyline queries" in SQL).
Our next step is to take our flag
and test
collections, and have them produce numbers to report to the players.
A correct test
should reveal the count under it, as in standard minesweeper.
materialize=> select * from correct_tests;
x | y | mine | team
---+---+------+------
(0 rows)
materialize=>
Ah.
Although we know that (0, 0)
contains a mine of type 1
, and have flagged it as such, we haven't actually tested for 0
yet.
We could do that automatically, but for now let's ask folks to do this themselves.
materialize=> insert into tests values (0, 0, 0, 'frank');
INSERT 0 1
materialize=> select * from correct_tests;
x | y | mine | team
---+---+------+-------
0 | 0 | 0 | frank
(1 row)
materialize=>
We need to "reach out" from each correct test, to assess the mines at each adjacent location. When doing this, we keep some notes about where we came from so that we can get the information back efficiently. Having done this, we aggregate up each probe and have a count to report!
-- Each tested location invokes a probe at neighbor locations.
create view probes as
select probe_x, probe_y, x, y, team, test
from
correct_tests,
generate_series(x - 1, x + 1) as probe_x,
generate_series(y - 1, y + 1) as probe_y;
-- Each location reports some flavor of mine.
create view reports as
select x, y, get_byte(sha256((probe_x::text || probe_y::text)::bytes), 0) % 6 as mine, team, test
from probes;
-- The number of adjacent mines for probed locations.
-- We include the center and subtract one to ensure we see a zero.
create materialized view totals as
select x, y, mine, team, count(*) - 1 as count
from (
-- union `reports` with `correct_tests` to avoid a left join.
select * from reports
union all
select *, test from correct_tests
)
where mine = test
group by 1, 2, 3, 4;
If we look at the totals for the probes, we see:
materialize=> select * from totals;
x | y | mine | team | count
---+---+------+-------+-------
0 | 0 | 0 | frank | 0
(1 row)
materialize=>
Luckily, mine type 0
has no mines around it!
We should probably click upon all of its neighbors!
Before going down that rabbit hole, let's first figure out how to present this information.
We now have all the values for all the probes across a quite large game board. How do we get that information back to individual players without overwhelming them (or their connection)?
Each player will view the game board through a viewport.
This is a description of a rectangular region, all of whose reported counts should be returned.
We also need to include the mine
and team
, to know which subset of the information to present.
-- Each entry here results in an output viewport (window) for a certain mine.
create table viewport (x1 int, x2 int, y1 int, y2 int, mine int, team text);
Each entry in the viewport results in a small grid of locations that we want to report:
create view cells as
select distinct x, y, mine, team from
viewport,
generate_series(x1, x2) as x,
generate_series(y1, y2) as y
;
To get something presentable, we'll fire off some subqueries for each element of cell
, looking around in other relations that present data.
In particular, we'll first look to see if there is a reported count, then if there is a flagged mine, and finally whether there is a flagged mine of a different type.
-- Convert counts to symbols, mines, clear cells, and include '.' for unknown locations.
create materialized view composited as
select x, y, mine, team,
case
when count is not null then count
when mined is not null then mined
when clear is not null then clear
else ' .'
end as symbol
from (
select x, y, mine, team,
( -- counts from probes associated with this team.
select ' '::text || count::text as count
from totals
where cells.x = totals.x
and cells.y = totals.y
and cells.mine = totals.mine
and cells.team = totals.team
),
( -- a `X` for mines correctly flagged by this team.
select ' X' as mined
from correct_flags
where cells.x = correct_flags.x
and cells.y = correct_flags.y
and cells.mine = correct_flags.flag
and cells.team = correct_flags.team
),
( -- a `*` for locations known to be safe by flags of other mines.
select ' *' as clear
from correct_flags
where cells.x = correct_flags.x
and cells.y = correct_flags.y
and not cells.mine = correct_flags.flag
and cells.team = correct_flags.team
)
from cells
);
These should be the composited cells presenting whatever symbol we need in each location.
To present this to the user, we'll do a few shenanigans to lay out scanlines of symbols.
-- Create lines of output by grouping by `y`.
create view output as
select y, mine, string_agg(symbol, '' ORDER BY x) AS whole_row, team
from composited
group by y, mine, team;
To view the output, we'll need to introduce some viewports, specifying rectangle and mine type:
-- Define some overlapping viewports.
insert into viewport values (-10, 10, -10, 10, 0, 'frank');
insert into viewport values (-5, 10, -5, 5, 1, 'frank');
We can now peek at the output by selecting whole_row
and ordering by y
.
Let's first look at the 0
mine:
materialize=> select whole_row from output where mine = 0 and team = 'frank' order by y;
whole_row
--------------------------------------------
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . 0 . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
(21 rows)
materialize=>
Fascinating!
Let's do the 1
mine next:
materialize=> select whole_row from output where mine = 1 and team = 'frank' order by y;
whole_row
----------------------------------
. . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . .
. . . . . X . . . . . . . . . .
. . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . .
(11 rows)
materialize=>
Equally fascinating! The same location, with different takes for different players, is visible on both outputs!
In fact, we could SUBSCRIBE
to either of these views, and each player would get live updates from the other player's work.
We would need a smarter interface than what we have presented, as SUBSCRIBE
is perhaps overly simplistic:
materialize=> copy (subscribe (select whole_row from output where mine = 0 and team = 'frank')) to stdout;
1660587116936 20 . . . . . . . . . . . . . . . . . . . . .
1660587116936 1 . . . . . . . . . . 0 . . . . . . . . . .
We would want to add back in the y
colunm so that we could redraw the screen.
We might want to SUBSCRIBE
the composited
relation to get individual changed squares, rather than whole rows.
There are some really promising UX ideas here, where Materialize can ship you the minimal diff of information to update your presentation.
You play Minesweeper by repeatedly INSERT
ing into flags
and tests
.
This probably gets tedious very quickly.
I couldn't tell you; I haven't tried.
Instead, we can write SQL queries to produce modifications to these tables!
We should be able to test any location that has no adjacent mines (of our type).
This is just a matter of looking at totals
for counts of zero, and testing all neighbors.
-- We can test locations we know have no adjacent mines!
-- In fact, we can do this over and over until we stop inserting things.
create view should_test as
select distinct new_x as x, new_y as y, mine, team
from
(select * from totals where count = 0),
generate_series(x - 1, x + 1) as new_x,
generate_series(y - 1, y + 1) as new_y
except
select * from distinct_tests;
If we peek at the contents of this view, we'll see new locations that are safe to test.
materialize=> select * from should_test where team = 'frank';
x | y | mine | team
----+----+------+-------
0 | 1 | 0 | frank
0 | -1 | 0 | frank
1 | 0 | 0 | frank
1 | 1 | 0 | frank
1 | -1 | 0 | frank
-1 | 0 | 0 | frank
-1 | 1 | 0 | frank
-1 | -1 | 0 | frank
(8 rows)
materialize=>
All eight locations around the (0, 0)
location are safe to test for mines of type 0
.
We could go and INSERT
each of them, or we could use a SQL query to do that for us:
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 8
This is now an exciting moment.
Let's look at the board again, for mines of type 0
.
materialize=> select whole_row from output where mine = 0 and team = 'frank' order by y;
whole_row
-----------------------
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . 1 0 0 . . . . . . . . .
. . . . . . . . . 1 0 0 . . . . . . . . .
. . . . . . . . . 1 1 1 . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
(21 rows)
materialize=>
We have more zeros!
You Minesweeper folks know what this means!
We .. should keep clicking!
We shouldn't have to keep clicking, but Materialize doesn't support WITH RECURSIVE
yet (write your representative!).
materialize=> insert into tests select * from should_test;
INSERT 0 8
materialize=>
Eight more insertions! This means that things changed again.
materialize=> select whole_row from output where mine = 0 and team = 'frank' order by y;
whole_row
-----------------------
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . 2 1 2 1 . . . . . . . .
. . . . . . . . . 1 0 0 0 . . . . . . . .
. . . . . . . . . 1 0 0 0 . . . . . . . .
. . . . . . . . . 1 1 1 1 . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
(21 rows)
materialize=>
Let's pause for a moment here and observe that the only things we are doing (twice now) are an insert
followed by a select
:
materialize=> insert into tests select * from should_test where team = 'frank';
materialize=> select whole_row from output where mine = 0 and team = 'frank' order by y;
The view definitions are doing the hard work behind the scenes.
They make sure that the outputs are correctly updated as a function of the input changes.
Combined with Materialize's Strict Serializable isolation, alternate insert
and select
statements show you the advancing board state you expect to see.
Ok.
Enough pausing.
Let's do a hand-rolled recursive process, and keep doing this INSERT INTO
command until we have no more changes.
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 4
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 3
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 5
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 6
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 6
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 3
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 9
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 8
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 4
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 3
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 3
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 6
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 7
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 7
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 4
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 4
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 3
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 0
materialize=>
You can imagine why WITH RECURSIVE
can occasionally be popular.
Let's take a peek at the output again.
materialize=> select whole_row from output where mine = 0 and team = 'frank' order by y;
whole_row
-----------------------
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . 2 1 2 1 3 . . . . . . .
. . . . . . . . . 1 0 0 0 1 2 . . . . . .
. . . . . . . . . 1 0 0 0 0 1 1 1 1 . . .
. . . . . . . . . 1 1 1 1 1 0 0 0 1 . . .
. . . . . . . . . . . . . 2 1 0 1 1 . . .
. . . . . . . 2 2 2 . . . . 1 0 1 . . . .
. . . 2 1 2 2 1 0 1 . 1 1 1 1 0 1 2 . . .
. . . 2 0 0 0 0 0 1 1 1 0 0 0 0 0 1 . . .
. . . 2 1 0 0 0 0 0 0 0 0 0 0 0 1 2 . . .
. . . . 1 1 1 1 0 0 1 1 2 1 1 1 2 . . . .
. . . . . . . 1 1 1 2 . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
(21 rows)
materialize=>
That's pretty neat, isn't it! We've built up a little island. If you are familiar with Minesweeper you might now start thinking about where we can certainly place mines.
We can place flags using a variety of rules. We'll use a simple one for now: if a grid cell with a count is short mines exactly the number of unmarked neighbors, all those neighbors must be mines. This rule is great for filling in all of those nooks and crannies you see up above.
-- We can flag locations when the count equals the number of untested neighbors.
create view should_flag_from as
select x, y, test as flag, team
from totals
where totals.count > 0 and totals.count = 9 - (select count(*) from (
select distinct *
from
generate_series(x - 1, x + 1) as probe_x,
generate_series(y - 1, y + 1) as probe_y,
correct_tests as tests
where tests.x = probe_x and tests.y = probe_y and tests.test = totals.test
));
-- All adjacent, unmarked locations must be mines.
create view should_flag as
select distinct * from (select probe_x as x, probe_y as y, flag, team
from
should_flag_from,
generate_series(x - 1, x + 1) as probe_x,
generate_series(y - 1, y + 1) as probe_y
)
except all
select * from correct_tests
except all
select * from distinct_flags;
We can take a peek at the should_flag
view, containing locations known to be mines.
materialize=> select * from should_flag where team = 'frank';
x | y | flag | team
----+----+------+-------
0 | 4 | 0 | frank
1 | 8 | 0 | frank
2 | 2 | 0 | frank
3 | 3 | 0 | frank
4 | -2 | 0 | frank
5 | -1 | 0 | frank
7 | 3 | 0 | frank
7 | 7 | 0 | frank
-7 | 7 | 0 | frank
-5 | 3 | 0 | frank
-4 | 3 | 0 | frank
-4 | 8 | 0 | frank
(12 rows)
materialize=>
Let's just add them in, and check out the picture again.
materialize=> insert into flags select * from should_flag where team = 'frank';
INSERT 0 12
materialize=> select whole_row from output where mine = 0 and team = 'frank' order by y;
whole_row
-----------------------
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . 2 1 2 1 3 X . . . . . .
. . . . . . . . . 1 0 0 0 1 2 X . . . . .
. . . . . . . . . 1 0 0 0 0 1 1 1 1 . . .
. . . . . . . . . 1 1 1 1 1 0 0 0 1 . . .
. . . . . . . . . . . . X 2 1 0 1 1 . . .
. . . . . X X 2 2 2 . . . X 1 0 1 X . . .
. . . 2 1 2 2 1 0 1 X 1 1 1 1 0 1 2 . . .
. . . 2 0 0 0 0 0 1 1 1 0 0 0 0 0 1 . . .
. . . 2 1 0 0 0 0 0 0 0 0 0 0 0 1 2 . . .
. . . X 1 1 1 1 0 0 1 1 2 1 1 1 2 X . . .
. . . . . . X 1 1 1 2 X . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
(21 rows)
materialize=>
Look at all those X
things!
Having marked all these mines of type 0
, the boards for other mine types can be updated!
Remember, each of these mines mean that the grid cell is free of mines of other types.
If we peek, we'll see asterisks indicating where mines have been flagged by other players on the team.
materialize=> select whole_row from output where mine = 1 and team = 'frank' order by y;
. . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . .
. . . . . . . . . * . . . . . .
. . . . . . . . . . * . . . . .
. . . . . X . . . . . . . . . .
. . . . . . . . . . . . . . . .
. . . . . . . * . . . . . . . .
* * . . . . . . * . . . * . . .
. . . . . * . . . . . . . . . .
. . . . . . . . . . . . . . . .
materialize=>
Let's add all of our correct mine placement to tests
for another user.
materialize=> insert into tests select x, y, 1, team from correct_flags where not flag = 1 and team = 'frank';
INSERT 0 12
materialize=> select whole_row from output where mine = 1 and tem = 'frank' order by y;
whole_row
----------------------------------
. . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . .
. . . . . . . . . 3 . . . . . .
. . . . . . . . . . 1 . . . . .
. . . . . X . . . . . . . . . .
. . . . . . . . . . . . . . . .
. . . . . . . 2 . . . . . . . .
1 1 . . . . . . 1 . . . 2 . . .
. . . . . 1 . . . . . . . . . .
. . . . . . . . . . . . . . . .
(11 rows)
materialize=>
Interesting! We still see that mine there, but now have several other locations that have been explored. No zeros though, so no automatic progress to make.
How about mines of type 2
?
materialize=> insert into tests select x, y, 2, team from correct_flags where not flag = 2 and team = 'frank';
INSERT 0 13
materialize=> insert into viewport values (-10, 10, -10, 10, 2, 'frank');
INSERT 0 1
materialize=> select whole_row from output where mine = 2 and team = 'frank' order by y;
whole_row
--------------------------------------------
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . 0 . . . . . .
. . . . . . . . . . . . . . . 2 . . . . .
. . . . . . . . . . 2 . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . 0 . . . . . . . .
. . . . . 1 3 . . . . . . 3 . . . 3 . . .
. . . . . . . . . . 2 . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . 0 . . . . . . . . . . . . . 0 . . .
. . . . . . 1 . . . . 1 . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
(21 rows)
materialize=>
This player now has some zeros to explore, beyond the 2
at (0, 0)
.
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 30
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 19
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 15
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 21
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 16
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 21
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 25
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 29
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 12
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 9
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 8
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 2
materialize=> insert into tests select * from should_test where team = 'frank';
INSERT 0 0
All inserted, we arrive at the following picture for player 2
:
materialize=> select whole_row from output where mine = 2 and team = 'frank' order by y;
whole_row
--------------------------------------------
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . 1 1 1 2 . . . . .
. . . . . 2 2 1 1 1 . . 1 0 0 1 . . . . .
. 1 1 2 2 1 0 0 0 1 . . 2 1 2 2 . . . . .
1 1 0 0 0 0 1 1 1 1 2 . . . . . . . . 1 1
1 0 0 0 0 1 2 . 1 0 1 1 2 1 . . . . . 1 0
1 0 0 0 0 1 . . 2 0 0 0 0 1 . . . . . 2 1
1 0 0 0 0 1 3 . 4 2 1 1 2 3 . . . 3 . . 1
1 0 1 1 1 1 . . . . 2 . . . . . . . . 2 1
1 0 1 . . . . . . . . . . . . . 1 1 1 1 0
1 0 1 1 1 1 3 . . . . . . . . . 2 0 0 0 0
0 0 0 0 0 0 2 . . . . . . . . . 2 0 0 0 0
0 1 1 2 1 1 1 . . . . 1 . . . . 4 1 2 1 1
0 1 . . . . . . . . . . . . . . . . . . .
0 1 1 2 1 2 . . . . . . . . . . . . . . .
(21 rows)
materialize=>
The expanded regions spill off the viewport in a few directions!
Forthcoming!
Most of what we've seen above can be implemented atop most SQL databases.
If you removed all materialized
from view definitions, you would ask the system to reconstruct all results from scratch each query.
That isn't why we are here, but it leads us to a case study of Materialize fundamentals.
There are three fundamental ways that data are produced and maintained:
- Tables and sources are exogenous: someone puts stuff in there, and Materialize writes it down. The data are storage durably in the persistence layer, and need to be re-streamed each time they are used.
In fact, I used one keyword that
However, if you've been paying careful attention, you may have noticed that
While clicking on things can be fun, I hope the previous section has convinced you that SQL can be much better if your goal is to populate as many squares as possible.
This leads us to a new meta-game, in the spirit of Universal Paperclips: how efficiently can independent teams populate the board with flagged mines and safe tests?
We'll go through a series of Materialize mechanisms that let you build out more efficient (and better isolated) infrastructure.
A "cluster" is Materialize's unit of isolation for computational resources, a bit like a "process" in an operating system. Within a cluster, views can share access to in-memory arrangements of data. At the same time, views within a cluster contend for resources; one badly behaved view may stall (or crash) the whole cluster.
You can create multiple clusters for isolation, but the sharing between them is less efficient. All clusters share access to all tables, sources, views, and materialized views. Indexes are not shared across clusters, and you'll need to build them on the clusters that need them.
By default you get put on a default
cluster, with limited resources (and limited impact on other clusters).
The first thing we'll want to do is create a new cluster for our team, to make sure we have exclusive access to some compute resources.
This is done with the create cluster
command, followed promptly by the set cluster
command (
-- in the cloud product sizes are opaque strings: `small`, `large`, etc.
materialize=> create cluster team_frank replicas (smol (size 'small'));
-- in the local release sizes are strings containing numbers.
materialize=> create cluster team_frank replicas (smol (size '1'));
-- in either case, set the cluster to be the one we created.
materialize=> set cluster = 'team_frank';
This spins up a new team_frank
cluster with modest resources, and makes it our active cluster.
Until we change the cluster, team_frank
will be used for our query processing.
This alone would make all of our select * from should_test where team = 'frank'
queries that much faster:
we'll have more compute resources for them, and we wont be contending with other teams for those resources.
At the moment there are no such other teams, but we can imagine.
Indexes are Materialize's mechanism to maintain collections in memory and share them within the cluster.
When you create an index (on a collection) Materialize will leap in to action and build a dataflow that computes and maintains the results.
When you query that collection, say select * from indexed_thing;
, Materialize will pull directly from the in-memory results.
Where materialized views are about sharing data between clusters, indexes are about arranging data within a cluster. You create an index on a source, table, view, or materialized view, and it loads the contents into the memory of your cluster. More specifically, you can create an index with some subset of columns, and the data will be indexed by those columns.
You can see that selecting from should_test
is not as fast as we might like:
materialize=> select * from should_test;
x | y | test | team
---+---+------+------
(0 rows)
Time: 381.977 ms
This is because should_test
is a view, and we do a fair bit of work from its inputs to its outputs.
materialize=> explain select * from should_test;
Optimized Plan
--------------------------------------------------
Source materialize.public.tests (u4): +
| Project (#0..=#3) +
+
Source materialize.public.totals (u11): +
| Filter (#3 = 0) +
| Project (#0..=#4) +
+
Query: +
%0 = +
| Get materialize.public.totals (u11) +
| Filter (#3 = 0) +
| Project (#0..=#2, #4) +
| FlatMap generate_series((#0 - 1), (#0 + 1), 1)+
| FlatMap generate_series((#1 - 1), (#1 + 1), 1)+
| Filter ((#0 != #4) OR (#1 != #5)) +
| Project (#2..=#5) +
| Distinct group=(#2, #3, #0, #1) +
+
%1 = +
| Get materialize.public.tests (u4) +
| Distinct group=(#0, #1, #2, #3) +
| Negate +
+
%2 = +
| Union %0 %1 +
| Threshold +
(1 row)
Time: 13.305 ms
materialize=>
When we create an index, things improve dramatically.
materialize=> create default index on should_test;
CREATE INDEX
Time: 254.244 ms
materialize=> select * from should_test;
x | y | test | team
---+---+------+------
(0 rows)
Time: 13.896 ms
materialize=>
That time is now roughly the same time it takes to explain
the query, which is mostly the overhead of me reaching Materialize in us-east-1
.
Of course, this isn't for free. We still had to do the work we did above, it's just it took me at least that long to type the new query.
But, if we repeatedly select from should_test
we see that it is consistently prompt:
materialize=> select * from should_test;
x | y | test | team
---+---+------+------
(0 rows)
Time: 14.153 ms
materialize=> select * from should_test;
x | y | test | team
---+---+------+------
(0 rows)
Time: 15.938 ms
materialize=> select * from should_test;
x | y | test | team
---+---+------+------
(0 rows)
Time: 15.749 ms
materialize=> select * from should_test;
x | y | test | team
---+---+------+------
(0 rows)
Time: 13.309 ms
materialize=>
You can also create an index on a subset of columns (or any expressions), giving fast random access to those columns.
materialize=> create index should_test_by_team on should_test (team);
This index will make it very fast to run our select * from should_test where team = 'frank'
query.
You can see in the explained plan the ReadExistingIndex
node is annotated with Lookup value ("frank")
.
materialize=> explain select * from should_test where team = 'frank';
Optimized Plan
------------------------------------------------------------
%0 = +
| ReadExistingIndex materialize.public.should_test_by_team+
| | Lookup value ("frank") +
| Project (#1..=#3, #0) +
(1 row)
materialize=>
Indexes come with a cost: you have to maintain the results.
Maintaining the results costs CPU and memory.
Since the team_frank
cluster is meant for the exclusive use of team frank
, we can thin down the results before we index them.
materialize=> create view should_test_frank as select * from should_test where team = 'frank';
materialize=> create default index on should_test_frank;
We now have a view should_test_frank
that is locally indexed.
We'll only keep records with team = 'frank'
, which should reduce the work we have to do.
Indexes doing a lot of other great things, from serving results fast to speeding up join processing. There is much more to learn about them later!
Materialized views are Materialize's mechanism to maintain views durably and share them between clusters.
Like indexes, materialized views maintain views, and can dramatically reduce the work that is re-done for repeated queries. Unlike indexes, materialized views can be shared between clusters, which opens up new data architectures.
You may have noticed that we used several create materialized view
commands in the examples above.
I did these intentionally, at moments where I thought other clusters might benefit from the results.
Let's take a closer look.
When we created errors
we did it as a materialized view.
-- Errors are too bad. But they don't end the game.
create materialized view errors as ...
This will let anyone else observe the errors
collection from their own cluster.
They don't need to hop on to the default
cluster and potentially overwhelm it.
We also made totals
a materialized view.
-- The number of adjacent mines for probed locations.
create materialized view totals as ...
This represents the board state, and multiple teams might each like to use this on their own clusters.
In our definitions of should_test
and should_flag
we use totals
, and having it pre-defined is helpful!
Finally, we made composited
a materialized view.
-- Convert counts to symbols, mines, clear cells, and include '.' for unknown locations.
create materialized view composited as ...
This is the end result of a bunch of reasoning about the board state, and likely to be interesting to many folks. By making this a materialized view, we can both allow other clusters to read the results, and reduce the query load when one selects from it.
Why not use mutual recursion?!?
In proposed syntax, Materialize allows you to write multiple views that depend on each other, potentially recursively.
This would allow you to integrate the views we used to determine should_test
, and feed them back into themselves.
The result is a recursive query that can take large steps without repeatedly inserting in to the tests
table.
For example, you might write:
-- extract all transitively testable locations in one query
create materialized view should_test_recursive as
with mutually recursive
-- locations we should test for a mine of type `test`.
tests (x int, y int, test int, team text) as (
select * from correct_tests
union
select new_x as x, new_y as y, mine as test, team
from
(select * from totals where total = 0),
generate_series(x - 1, x + 1) as new_x,
generate_series(y - 1, y + 1) as new_y
where (x != new_x or y != new_y)
),
-- locations that we will probe to get a mine count.
probes (probe_x int, probe_y int, x int, y int, team text, test int) as (
select probe_x, probe_y, x, y, team, test
from
tests,
generate_series(x - 1, x + 1) as probe_x,
generate_series(y - 1, y + 1) as probe_y
where x != probe_x OR y != probe_y
),
-- reports back from the probed locations.
reports (x int, y int, mine int, team text, test int) AS (
select x, y, get_byte(sha256((probe_x::text || probe_y::text)::bytes), 0) % 6 as mine, team, test
from probes
),
-- counts for each probe of the number of adjacent mines
totals (x int, y int, mine int, team text, total int8) as (
select x, y, mine, team, count(*) - 1 as total
from (
select * from reports
union all
select *, test from tests
)
where mine = test
group by 1, 2, 3, 4
)
select * from tests except select * from correct_tests order by x, y;
This query starts with empty collections for each bound view, but then repeatedly updates all views in sequence, as if we ran them manually. But we don't have to run them manually, and instead just get out all the new locations we should test, even those we might learn about transitively.