SPX Index Provider Testing #8087
Replies: 19 comments 26 replies
-
One small note: because only a small number of storage providers will initially be running this version, the notifications of indexing generated at step 11 are not very likely to actually reach our network indexer. Fear not! We will have the indexer poll for new index availability regularly. If you get to step 11, feel free to ping us, and we can initiate an additional poll to pick up your provider. As long as your markets peer ID and multiaddr are associated with your provider's on-chain activity, your indexing will be found and ingested by the network indexer.
-
Ozzy's IP Report Example
Results of
Title | Metric |
---|---|
Are you only running IP process on your market subsystem (no deal making) when collecting the data below? | [x] yes [ ] no |
number of the deals made | 333 |
size of the dagstore repo as it’s shipping with a new top level index | 7GiB |
growth rate of the $LOTUS_MARKETS_PATH/index-provider folder’s size | 5 MiB / 10 min |
growth rate of the $LOTUS_MARKETS_PATH/datastore/metadata folder’s size | 3 MiB / 10 min |
IO/mem usage | with MaxSimultaneousTransfer = 10, usage is.. |
-
Is there a sampling rate you want for this? This will be easy to script and hand to the community for broader testing later on.
-
WIP 🚧 Phi-rjan´s IP Report Example
Result of
Title | Metric |
---|---|
Are you only running IP process on your market subsystem (no deal making) when collecting the data below? | [x] yes [ ] no |
number of the deals made | 8398 |
size of the dagstore repo as it’s shipping with a new top level index | 8.1GB |
growth rate of the $LOTUS_MARKETS_PATH/index-provider folder’s size | n/a / n/a |
growth rate of the $LOTUS_MARKETS_PATH/datastore/metadata folder’s size | n/a / n/a |
IO/mem usage | with MaxSimultaneousTransfer = 10, usage is.. |
WIP 🚧
-
@TippyFlits IP Report
Results of
Title | Metric |
---|---|
Are you only running IP process on your market subsystem (no deal making) when collecting the data below? | [x] yes [ ] no |
number of the deals made | 4145 |
size of the dagstore repo as it’s shipping with a new top level index | 8.3GiB |
growth rate of the $LOTUS_MARKETS_PATH/index-provider folder’s size | TBD |
growth rate of the $LOTUS_MARKETS_PATH/datastore/metadata folder’s size | TBD |
IO/mem usage | with MaxSimultaneousTransfer = 20, usage is normal |
-
@tmyuu IP Report
Results of
Title | Metric |
---|---|
Are you only running IP process on your market subsystem (no deal making) when collecting the data below? | [x] yes [ ] no |
number of the deals made | 1564 |
size of the dagstore repo as it’s shipping with a new top level index | 6.7GB |
growth rate of the $LOTUS_MARKETS_PATH/index-provider folder’s size | TBD |
growth rate of the $LOTUS_MARKETS_PATH/datastore/metadata folder’s size | TBD |
IO/mem usage | usage is TBD |
-
New tag is out!
https://github.com/filecoin-project/lotus/releases/tag/master-spx.idxprov.rc-3 is out! Update your nodes at your convenience! cc @masih to share high-level changes!
-
# @Meest NAM IP Report
-
## BenjaminH IP Report
-
RC-4 Tag is out
The new tag
Going forward, please use this tag as the release target for the testing assignment.
Already ran migration with earlier tags?
You will need to re-run the dagstore migration if your installation used any of the previous tags. However, if you installed this commit (or any commits after it) from the indexing integration PR, there is no need to re-run the migration or re-announce indices to the indexers. Simply install the RC-4 tag when you get a chance.
Massive thanks to @aarshkshah1992 for resolving the dagstore issue and to the SPX folks who helped verify the fix.
-
@stephane's IP Report
Results of
Title | Metric |
---|---|
Are you only running IP process on your market subsystem (no deal making) when collecting the data below? | [x] yes [ ] no |
number of the deals made | 2847 |
size of the dagstore repo as it’s shipping with a new top level index | 26GiB |
growth rate of the $LOTUS_MARKETS_PATH/index-provider folder’s size | TBD |
growth rate of the $LOTUS_MARKETS_PATH/datastore/metadata folder’s size | TBD |
IO/mem usage | TBD |
-
@cryptowhizzard Report
Results of
Title | Metric |
---|---|
Are you only running IP process on your market subsystem (no deal making) when collecting the data below? | [x] yes [ ] no |
number of the deals made | 703 |
size of the dagstore repo as it’s shipping with a new top level index | 20 GB |
growth rate of the $LOTUS_MARKETS_PATH/index-provider folder’s size | 10G |
growth rate of the $LOTUS_MARKETS_PATH/datastore/metadata folder’s size | 4.5G |
IO/mem usage | with MaxSimultaneousTransfer = 10, usage has not changed |
-
Step 11 (announcement) worked, as I got the message:
but I'm getting the following error about some index files not being found:
Those index files were not in the original dagstore index (from backup). If so, could you run your magic @willscott and run the manual step on your end?
-
@alex Fox's IP Report
Results of
Title | Metric |
---|---|
Are you only running IP process on your market subsystem (no deal making) when collecting the data below? | [x] yes [ ] no |
number of the deals made | TBA |
size of the dagstore repo as it’s shipping with a new top level index | TBA |
growth rate of the $LOTUS_MARKETS_PATH/index-provider folder’s size | TBA |
growth rate of the $LOTUS_MARKETS_PATH/datastore/metadata folder’s size | TBA |
IO/mem usage | with MaxSimultaneousTransfer = 10, usage is.. TBA |
-
Results of
Title | Metric |
---|---|
Are you only running IP process on your market subsystem (no deal making) when collecting the data below? | [x] yes [ ] no |
number of the deals made | 662 |
size of the dagstore repo as it’s shipping with a new top level index | 4.2GiB |
growth rate of the $LOTUS_MARKETS_PATH/index-provider folder’s size | x MiB / 10 min |
growth rate of the $LOTUS_MARKETS_PATH/datastore/metadata folder’s size | x MiB / 10 min |
IO/mem usage | with MaxSimultaneousTransfer = 10, usage is.. |
-
Question STEP 12
#!/usr/bin/env sh
MINER_ID="${12D3KooWG66fpxbAxbn4d4Tj3ewY6CceHmuShmQrSvAk9sA1qC9Z}"
echo "MINER_ID: ${MINER_ID}"
for idx in ${DAGSTORE_REPO}/index/*.full.idx
./scrip.sh
It can't be executed. Can I know what the problem is?
-
James' IP Report
Results of
Title | Metric |
---|---|
Are you only running IP process on your market subsystem (no deal making) when collecting the data below? | [x] yes [ ] no |
number of the deals made | 2240 |
size of the dagstore repo as it’s shipping with a new top level index | 7.7 GiB |
growth rate of the $LOTUS_MARKETS_PATH/index-provider folder’s size | x MIB / 10 min |
growth rate of the $LOTUS_MARKETS_PATH/datastore/metadata folder’s size | x MIB / 10 min |
IO/mem usage | with MaxSimultaneousTransfer = 10, usage is (WIP) |
-
Results of provider verify-ingest - master-spx.idxprov.rc-5
Monitoring Metrics: f024184
-
2022-03-10T07:57:10.621+0900 INFO dt-impl impl/events.go:303 successfully sent completion message to initiator {"chid": "12D3KooWRKk7RMYvs5NQKmk5zRAPHPSxvNuEKeAVqmexfVZXi33u-12D3KooWM4wsQ3kdd8CDHiVDQthU9JZ9KqsxSdSQT2xj6TAdDth5-1646858319637830640"}
goroutine 3578830 [running]:
"Process" shutdown after 3 hours of running "provider-index".
-
💙 Huge shout-out to the datasystem team (@masih, @willscott, @gammazero, @rvagg, @MarcoPolo, @hannahhoward) & the ignite team (@aarshkshah1992, @nonsense, @dirkmc) for building this part of the interplanetary network content addressing system and bringing it into lotus, so that Filecoin storage providers that use lotus may easily participate as early content & retrieval providers in the network!
Table of Contents
Overview
Interplanetary network indexer nodes store the indexes that provide a content routing sub-system, which identifies which providers in the interplanetary network (IPFS, Filecoin, etc.) are able to provide what content.
To build such a content addressing system, index providers act as the entities that advertise content to indexer nodes and serve retrieval requests over GraphSync. An index provider may run as a standalone service or be embedded into an existing Golang application. In the case of Lotus, it is embedded into the market subsystem so that Filecoin storage providers may serve as index providers (and later retrieval providers) in the interplanetary network (IPFS & Filecoin).
Becoming an index provider involves three main parts:
DagStore Migration
The DagStore migration is required to roll out the new CARv2 indexing format, `MultihashIndexSorted`. It regenerates the indices with the latest CARv2 index format, which includes the full multihash of CIDs in each DagStore shard. For more information, see the DagStore Lotus documentation.
Indexing Announcement
The indexing announcement publishes advertisements about the content available for retrieval onto a gossipsub topic, which is listened to by a set of indexer nodes.
The advertisements are then processed by the indexer in order to provide an endpoint that allows clients to look up where and how to retrieve data for a given multihash. For more information, see Indexer Node Design.
Content Routing Verification
The content routing verification involves checking that the advertisements published by the Lotus instance are ingested by the indexer nodes and that, for a given stored multihash, the correct provider is returned by the indexer node.
Note that ingestion of advertisements by the indexer nodes is progressive; total ingestion time depends on the number of multihashes being advertised. Therefore, verification needs to be performed with some delay after the indexing announcement is made.
Expectations
New CAR index format in DagStore
The new index format stores the multihash code as well as the digest. As a result, DagStore index files are slightly larger, but the size increase is negligible.
Index Provider GossipSub Announcements
The index provider integration announces changes to the chain of advertisements onto a gossipsub topic named `/indexer/ingest/mainnet`, which is propagated through the Lotus daemon onto the indexer nodes.
Index Provider GraphSync Server
The index provider integration exposes a GraphSync server, which serves requests from indexer nodes to sync the list of advertised multihashes provided by the SP.
Note that the server is exposed on the address configured under the `IndexProvider` configuration section, with key `ListenAddresses`. The advertisements will include the address configured under the same section with key `AnnounceAddresses`. You must make sure both sets of addresses are publicly reachable. For more information, see Indexer Provider Configuration.
Index Provider Storage Usage
The index provider integration shares a datastore with the `markets` process, wrapped under the namespace `index-provider`. The datastore entries stored include:
The storage used by the internal mappings is negligible.
The storage used by caching is bound by an LRU cache, the maximum size of which is configured to `1024`, i.e. the number of chunks cached. The maximum length of each chunk is configured to `16,384`.
The exact storage usage, then, depends on the number of multihashes stored in a single chunk and the size of each multihash, and can be calculated as:
maximum cache size = number of cached chunks × multihashes per chunk × size of each multihash
For example, caching 128-bit long multihashes will result in chunk sizes of 0.25 MiB (16,384 × 16 bytes) with maximum cache growth of 256 MiB (1024 × 0.25 MiB).
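The arithmetic behind that example can be checked with a couple of shell helpers. This is only a sketch: the function names `max_cache_bytes` and `bytes_to_mib` are made up for illustration, and the inputs are the defaults quoted above (1024 chunks, 16,384 multihashes per chunk, 16-byte multihashes).

```sh
#!/usr/bin/env sh
# Upper bound on the cache size, in bytes.
# Arguments: cached chunks, multihashes per chunk, multihash size in bytes.
# (Hypothetical helper names, not part of lotus or the provider CLI.)
max_cache_bytes() {
    echo $(( $1 * $2 * $3 ))
}

# Convert a byte count to whole MiB (integer division).
bytes_to_mib() {
    echo $(( $1 / 1048576 ))
}
```

For example, `bytes_to_mib "$(max_cache_bytes 1024 16384 16)"` gives the 256 MiB figure quoted above; swap in 32 for the last argument to estimate 256-bit multihashes.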
Note that the LRU cache may grow beyond its max size if the generated chain of chunks is longer than the configured `LinkChunkSize`. This is to avoid partial caching of chunks within a single advertisement. The cache expansion is logged at `INFO` level in the `provider/engine` logging subsystem and can be monitored for diagnosis purposes.
`provider verify-ingest`.
`$LOTUS_MARKETS_PATH/index-provider` folder’s size
`$LOTUS_MARKETS_PATH/datastore/metadata` folder’s size
`MaxSimultaneousTransfers`
`daemon`, `miner` and `markets` processes.
`daemon` and `miner` processes first.
`lotus-miner storage-deals list`
`lotus-miner sectors list`
`markets` process.
Steps to Become an Index Provider
1. Stop the `daemon`, `miner` and `markets` processes
Stop all lotus processes to suspend any changes to the state of DagStore during the rollout.
The daemon needs to be stopped in order to roll out a change to the API that protects connections between markets and the daemon libp2p node.
2. Back up the existing DagStore repository
The DagStore repository is located at `$LOTUS_MARKETS_PATH/dagstore` by default. Make a copy of that folder. This is necessary for:
3. Delete the existing DagStore repository
Delete the DagStore repository, located at `$LOTUS_MARKETS_PATH/dagstore` by default. The absence of the repository signals to the Lotus instance that a DagStore migration is needed, and one is triggered automatically upon `markets` instance start-up.
4. Rotate any existing Lotus log files and adjust log level
For easier debugging, rotate any existing logs so that the new logs only include output generated by the target release.
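The backup and delete steps (2 and 3) can be sketched as two small helpers. This is a minimal sketch, not an official script: the function names and the backup location are hypothetical, and the delete helper deliberately refuses to run unless a backup directory is already in place.

```sh
#!/usr/bin/env sh
# Copy the DagStore repository to a backup location, preserving attributes.
# Refuses to overwrite an existing backup. (Hypothetical helper.)
backup_dagstore() {
    src="$1"
    dst="$2"
    [ -d "$src" ] || { echo "no DagStore repository at $src" >&2; return 1; }
    [ ! -e "$dst" ] || { echo "backup target $dst already exists" >&2; return 1; }
    cp -Rp "$src" "$dst"
}

# Delete the DagStore repository, but only if the backup directory exists.
delete_dagstore() {
    src="$1"
    backup="$2"
    [ -d "$backup" ] || { echo "no backup at $backup, refusing to delete" >&2; return 1; }
    rm -rf "$src"
}
```

Usage would look like `backup_dagstore "$LOTUS_MARKETS_PATH/dagstore" "$LOTUS_MARKETS_PATH/dagstore.bak"` followed by `delete_dagstore "$LOTUS_MARKETS_PATH/dagstore" "$LOTUS_MARKETS_PATH/dagstore.bak"`, where `dagstore.bak` is just an example path. Run them only after all processes from step 1 are stopped.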
5. Deploy `daemon` and `miner` processes
Deploy the index provider tag, `master-spx.idxprov.rc-1`, to the `daemon` and `miner` processes and wait until they are fully started and ready.
Note: This tag is based off release/v1.15.0, thus it also supports the upcoming OhSnap network v15 upgrade!
They are ready when the following commands succeed:
`lotus-miner storage-deals list`
`lotus-miner sectors list`
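If you would rather not re-run those two commands by hand, a polling loop along these lines could do it. It is a sketch under assumptions: `wait_until_ready` and the `LOTUS_MINER` override variable are names invented here, and the attempt count and interval are up to you.

```sh
#!/usr/bin/env sh
# Binary to invoke; overridable so the loop can be tried without a running node.
# (LOTUS_MINER is a hypothetical variable used only by this sketch.)
LOTUS_MINER="${LOTUS_MINER:-lotus-miner}"

# Poll until both readiness commands succeed.
# Arguments: maximum attempts, seconds to sleep between attempts.
wait_until_ready() {
    attempts="$1"
    interval="$2"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$LOTUS_MINER" storage-deals list >/dev/null 2>&1 &&
           "$LOTUS_MINER" sectors list >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    return 1
}
```

For instance, `wait_until_ready 60 10` checks every 10 seconds for up to 10 minutes and returns non-zero if the processes never became ready.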
6. Deploy the target release on `markets` process
Download and deploy `master-spx.idxprov.rc-1` to the market process.
7. Start the `markets` process
Start only the `markets` process and wait for the following log line in the markets process logs:
`dagstore migration completed successfully`
This indicates that the list of shards that require initialisation has been queued for processing. See [DagStore First-time Migration](https://lotus.filecoin.io/docs/storage-providers/dagstore/#first-time-migration) for more information. See Indexer Provider Configuration to customize the configuration of the subsystem that announces indexes.
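Waiting for that log line can be scripted with a simple poll over the log file. This is a sketch only: `wait_for_log_line` is a made-up helper, and where your markets process writes its logs depends on your setup, so the path is left as an argument.

```sh
#!/usr/bin/env sh
# Poll a log file until it contains the given fixed string.
# Arguments: log file, string to look for, maximum attempts,
#            seconds to sleep between attempts.
# (Hypothetical helper; the log file location is deployment-specific.)
wait_for_log_line() {
    logfile="$1"
    needle="$2"
    attempts="$3"
    interval="$4"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if grep -qF -- "$needle" "$logfile" 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    return 1
}
```

Something like `wait_for_log_line /path/to/markets.log "dagstore migration completed successfully" 120 5` would then block until the migration line shows up or the attempts run out.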
8. Configure logging subsystems
Make sure your lotus installation persists the log files for future debugging.
Set the log level for the following subsystems on the market node to `INFO`:
`go-legs-gpubsub`
`provider/engine`
`dagstore`
To do this run the following command:
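A minimal sketch of setting those three levels, assuming your build exposes `lotus-miner log set-level` with a `--system` flag (check `lotus-miner log set-level -h` first). The wrapper function and the `LOTUS_MINER` variable are invented for this example, and a split markets setup may need the call directed at the markets process instead.

```sh
#!/usr/bin/env sh
# Binary to invoke; overridable for a dry run (hypothetical variable).
LOTUS_MINER="${LOTUS_MINER:-lotus-miner}"

# Set each subsystem listed above to INFO, one call per subsystem.
# Assumes `log set-level --system <name> <level>` is available in your build.
set_index_provider_log_levels() {
    for system in go-legs-gpubsub provider/engine dagstore; do
        "$LOTUS_MINER" log set-level --system "$system" info || return 1
    done
}
```

Setting `LOTUS_MINER=echo` before calling `set_index_provider_log_levels` prints the three commands instead of running them, which is a cheap way to review what would be executed.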
9. Initialise the DagStore shards
To start the initialisation of DagStore shards, run:
`lotus-miner dagstore initialize-all --concurrency=N` if you run a monolith miner process or
`lotus-miner --call-on-market dagstore initialize-all --concurrency=N` if you have split your market subsystem.
`N` controls the number of deals that are concurrently initialised. See the DagStore Force Bulk Initialisation docs for more information.
Wait for the initialisation to complete. The initialisation time is a factor of the volume of data stored, since it involves re-indexing the data blocks.
10. Verify re-creation of DagStore repository
The successful completion of the previous step should recreate the DagStore repository, located at `$LOTUS_MARKETS_PATH/dagstore`. Navigate to that directory. Under the subfolder `index`, verify that matching `*.full.idx` files can be found for all files under the same sub-directory in the backup of DagStore taken in step 2.
11. ✨ Announce all indices to the indexers
To announce all the indices in bulk to the indexers, run:
`lotus-miner index announce-all` if you run a monolith miner process or
`lotus-miner --call-on-market index announce-all` if you have split your market subsystem.
This command generates advertisements and publishes indices onto the indexer gossipsub channel. In the markets logs, look for a series of logs that include `deal announcement sent to index provider`. You should see one such log per deal. The log line also includes the advertisement CID, the deal proposal CID to which it belongs and the shardKey from which its multihash entries are generated. The logs should also include lines that provide information about the number of multihash entries each advertisement includes. For example:
Note that the bulk advertisement only announces deals that are not expired and have been handed over to the sealing subsystem. Expired deals will not be advertised. For any remaining deals, the advertisement will occur after they are handed over to the sealing subsystem.
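Since there should be one such log line per announced deal, counting them gives a quick sanity check on progress. A rough sketch, with `count_announcements` being a name made up here and the log file path depending on your deployment:

```sh
#!/usr/bin/env sh
# Count log lines that report a deal announcement.
# Argument: path to the markets log file (deployment-specific).
# (Hypothetical helper; prints 0 when nothing has been announced yet.)
count_announcements() {
    grep -cF -- "deal announcement sent to index provider" "$1"
}
```

Comparing the output of `count_announcements /path/to/markets.log` with the number of active deals you expect to be advertised should show whether the bulk announcement is still in flight.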
Wait for the bulk indexing announcement to complete. The bulk announcement is complete when `finished announcing active deals to index provider` is logged.
12. Verify indices in DagStore repository are ingested by indexer nodes
To verify ingestion, download and install the latest `provider` CLI tool from:
The built binaries can be found under assets attached for each target platform.
Once installed, download the following script.
Script below to be provided as a downloadable `.sh` file; for now pasted below for review purposes.
The script takes two mandatory arguments:
`$LOTUS_MARKETS_PATH/dagstore`
A third argument may optionally be specified as a number between `0.0` and `1.0`, which sets the selection probability of the multihash sample that is verified for ingestion, set to 5% by default.
You can find out what your peer ID is using `lotus-miner net id`
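To give a feel for what a selection probability means in practice, here is an illustrative sketch that keeps each input line with a given probability. It does not reproduce how the `provider` tool samples multihashes; `sample_lines` is a hypothetical helper, and only the 5% default is taken from the description above.

```sh
#!/usr/bin/env sh
# Print each line of standard input with the given probability.
# Argument: selection probability between 0.0 and 1.0 (default 0.05).
# (Illustration only; not the provider CLI's own sampling logic.)
sample_lines() {
    prob="${1:-0.05}"
    awk -v p="$prob" 'BEGIN { srand() } rand() < p'
}
```

With a probability of `1.0` every line is kept and with `0.0` none are, so a higher value checks more multihashes at the cost of a longer verification run.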
This script will:
`*.full.idx` name,
`MINER_ID` as the provider.
A verification result is printed for each file. Verify that verification is successful for each of the files. See `provider verify-ingest -h` for example output.
Indexer Provider Configuration
You can adjust the values under the `IndexProvider` section in the config.toml of your market process to configure index announcements to the indexer.
If the section doesn't exist, you can manually add it:
Almost Done
In case you forgot the action items listed in the previous section, here's a reminder to collect the requested metrics after you successfully deploy and become an Index Provider! As always, you may use Ozzy's Report as a template 😉 . We are looking forward to hearing back from you, good luck and have fun!
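For the two folder growth-rate metrics, a small sampler like the one below may save some manual `du` runs. It is a sketch under assumptions: the helper names are invented, sizes are reported in KiB as given by `du -sk`, and the 10-minute interval simply mirrors the unit used in Ozzy's report.

```sh
#!/usr/bin/env sh
# Print the size of a directory in KiB.
# (Hypothetical helper built on POSIX `du -sk`.)
dir_size_kib() {
    du -sk "$1" | awk '{ print $1 }'
}

# Print how much a directory grew, in KiB, over one sampling interval.
# Arguments: directory, interval in seconds.
growth_kib() {
    before="$(dir_size_kib "$1")"
    sleep "$2"
    after="$(dir_size_kib "$1")"
    echo $(( after - before ))
}
```

For example, `growth_kib "$LOTUS_MARKETS_PATH/index-provider" 600` reports growth over 10 minutes, and the same call with `$LOTUS_MARKETS_PATH/datastore/metadata` covers the second metric; divide by 1024 to report MiB.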