Overview of Filecoin Storage Provider (SP) components and how Boost fits in an SP stack
The Lotus stack is responsible for running a Filecoin full chain node, handling sector lifecycle, storing sectors, sector sealing, sector proving, handling wallets, and more. You can read about the Lotus components at https://docs.filecoin.io/storage-provider/architecture/lotus-components/
lotus: Lotus daemon following the Filecoin chain
lotus-miner: Lotus miner handling sector storage, job scheduling, etc.
lotus-worker: Lotus worker handling various types of jobs (PC1, UNS, PC2, C1, DC, etc.)
The Boost stack is responsible for handling storage and retrieval requests, transferring data, indexing data, announcing data to network indexers, and ensuring that clients can retrieve their data once it is stored on the SP's infrastructure.
boost: Boost client-side executable that can be used to send a deal proposal to a Boost Storage Provider.
boostd: Boost SP daemon handling storage deals processing, Boost UI server, GraphQL server, etc.
booster-http: Serve blocks and files over HTTP
booster-bitswap: Serve blocks and files over Bitswap
boostd-data: Data service handling Boost SP daemon state, and providing interface to the Local Index Directory (hosted on YugabyteDB, or LevelDB)
yugabyted: YugabyteDB backend for Local Index Directory
By default the Boost daemon repository is located at ~/.boost
It contains the following files:
api: The local multi-address of Boost's libp2p API
boost.db: The sqlite database with all deal metadata
boost.logs.db: The sqlite database with the logs for deals
config.toml: The config file with all Boost's settings
repo.lock: A lock file created when Boost is running
storage.json: Deprecated (needed by legacy markets)
token: The token used when calling Boost's JSON RPC endpoints
It has the following directories:
datastore: Contains metadata about deals for legacy markets
deal-staging: The directory used by legacy markets for incoming data transfers
incoming: The directory used by Boost for incoming data transfers
journal: Contains journal events (used by legacy markets)
keystore: Contains the secret keys used by libp2p (e.g. the peer ID)
kvlog: Used by legacy markets datastore
Salient features of Boost
Boost supports multiple options for data transfer when making storage deals, including HTTP. Clients can host their CAR file on an HTTP server, such as S3, and provide that URL when proposing the storage deal. Once accepted, Boost will automatically fetch the CAR file from the specified URL.
Boost comes with a web interface that can be used to manage deals, watch disk usage, monitor funds, adjust settings and more.
Boost comes with a client that can be used to make storage deals, and can be configured to point at a public Filecoin API endpoint. That means clients don't need to run a Filecoin node or sync from chain.
See for more details.
See for more details.
Boost stores metadata about deals in a SQLite database in the root directory of the Boost repository.
To open the database use a SQLite client:
sqlite3 boost.db
The database tables are:
Deals: metadata about Boost storage deals (e.g. deal proposal) and their current state (e.g. checkpoint)
FundsLogs: log of each change in funds reserved for a deal
FundsTagged: how much FIL is tagged for deal collateral and publish message for a deal
StorageLogs: log of each change in storage reserved for a deal
StorageTagged: how much storage is tagged for a deal
Boost keeps a separate database just for deal logs, so as to make it easier to manage log data separately from deal metadata. The logs database is named boost.logs.db and it has a single table DealLogs that stores logs for each deal, indexed by UUID.
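If you prefer to inspect these databases from a script rather than the sqlite3 shell, the following is a minimal sketch using Python's standard sqlite3 module. The list_tables helper is a name chosen for illustration, not part of Boost. It opens the file read-only so that inspection cannot modify deal metadata.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the names of all tables in a SQLite database, sorted by name."""
    # Open read-only so that inspecting the file cannot modify deal metadata.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

For example, list_tables(Path.home() / ".boost" / "boost.db") would list the tables of the deal metadata database, assuming the default repository location.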
Boost uses the goose tool and library (https://pressly.github.io/goose/) for handling sqlite3 migrations.
goose can be installed following the instructions at https://pressly.github.io/goose/installation/
Migrations in Boost are stored in the /db/migrations directory.
Boost handles database migrations on start-up. If a user is running an older version of Boost, migrations up to the latest version are automatically applied on start-up.
Developers can use goose to inspect and apply migrations using the CLI:
Boost introduction
Boost is a tool for Storage Providers to manage data onboarding and retrieval on the Filecoin network.
Boost exposes libp2p interfaces for making storage and retrieval deals, a web interface for managing storage deals, and a GraphQL interface for accessing and updating real-time deal information.
Boost daemon exposes a GraphQL API that is used by the Web UI to query and update information about storage and retrieval deals.
You can test out queries, or explore the GraphQL API by clicking on the < Docs link at the top right of the page:
To run a GraphQL query with curl:
This one-minute video shows how to use these tools to build and run a GraphQL query against Boost:
1. Query failed deals
2. Cancel a deal where ab12345c-5678-90de-12f3-45a6b78cd9ef is the deal ID
This page contains all Boost API definitions. Interfaces defined here are exposed as JSON-RPC 2.0 endpoints by the boostd daemon.
To use the Boost Go client, the Go RPC-API library can be used to interact with the Boost API node.
Import the necessary Go module:
Create the following script:
Run go mod init to set up your go.mod file
You should now be able to interact with the Boost API.
The JSON-RPC API can also be used programmatically from other languages. Here is an example written in Python. Note that the method must be prefixed with Filecoin.
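As a minimal sketch of that prefixing rule, the helper below builds (but does not send) a JSON-RPC 2.0 request using only the Python standard library. The build_rpc_request name, the URL you pass in, and the method name are placeholders for illustration. Sending the token from the repository's token file as a bearer token in the Authorization header is an assumption here, so check it against your own deployment.

```python
import json
import urllib.request


def build_rpc_request(url, token, method, params=None, request_id=1):
    """Build (but do not send) a JSON-RPC 2.0 POST request."""
    payload = {
        "jsonrpc": "2.0",
        # The method name must carry the "Filecoin." prefix.
        "method": "Filecoin." + method,
        "params": params if params is not None else [],
        "id": request_id,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the API token is presented as a bearer token.
            "Authorization": "Bearer " + token,
        },
        method="POST",
    )
```

A request built this way can be sent with urllib.request.urlopen(request), and the response body decoded with json.load.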
Perms: read
Inputs: null
Response:
Perms: admin
Inputs:
Response: "Ynl0ZSBhcnJheQ=="
Perms: read
Inputs:
Response:
There are not yet any comments for this method.
Perms: read
Inputs:
Response: "Ynl0ZSBhcnJheQ=="
Perms: read
Inputs:
Response: 123
Perms: read
Inputs:
Response: true
Perms: admin
Inputs:
Response:
Perms: admin
Inputs:
Response:
Perms: admin
Inputs:
Response:
Perms: admin
Inputs:
Response:
There are not yet any comments for this method.
Perms: admin
Inputs: null
Response: {}
Perms: admin
Inputs:
Response: null
Perms: admin
Inputs:
Response: null
Perms: admin
Inputs: null
Response: null
Perms: admin
Inputs:
Response: null
Perms: admin
Inputs:
Response: null
Perms: admin
Inputs:
Response:
Perms: admin
Inputs:
Response:
Perms: admin
Inputs:
Response:
Perms: write
Inputs: null
Response:
Perms: write
Inputs:
Response: {}
Perms: read
Inputs: null
Response:
Perms: read
Inputs: null
Response: "12D3KooWGzxzKZYveHXtpG6AsrUJBcWxHBFS2HsEoGTxrMLvKXtf"
Perms: read
Inputs: null
Response:
Perms: read
Inputs:
Response: "string value"
Perms: read
Inputs: null
Response:
Perms: read
Inputs: null
Response:
Perms: read
Inputs: null
Response:
Perms: read
Inputs: null
Response:
Perms: admin
Inputs:
Response: {}
Perms: read
Inputs: null
Response:
Perms: admin
Inputs:
Response: {}
Perms: write
Inputs:
Response: {}
Perms: read
Inputs:
Response: 1
Perms: write
Inputs:
Response: {}
Perms: read
Inputs:
Response:
Perms: read
Inputs:
Response:
Perms: read
Inputs:
Response:
Perms: read
Inputs: null
Response:
Perms: read
Inputs:
Response: 60000000000
Perms: admin
Inputs:
Response: {}
Perms: read
Inputs: null
Response:
Perms: admin
Inputs:
Response: {}
Perms: read
Inputs: null
Response:
Perms: admin
Inputs:
Response: {}
Perms: read
Inputs:
Response:
There are not yet any comments for this method.
Perms: admin
Inputs:
Response: {}
There are not yet any comments for this method.
Perms: admin
Inputs:
Response: {}
Perms: admin
Inputs: null
Response: {}
Perms: admin
Inputs:
Response: {}
Local Index Directory index types
In order to support the described retrieval use cases, LID maintains the following indexes:
To look up which pieces contain a block
To look up which sector a piece is in
To look up where in the piece a block is and the block’s size
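The three indexes above can be pictured as three lookup tables. The following is a toy in-memory sketch, not LID's actual storage layout: the ToyIndex and BlockLocation names are hypothetical, and for simplicity it assumes each piece lives in exactly one sector.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BlockLocation:
    offset: int  # byte offset of the block within the piece
    size: int    # size of the block in bytes


@dataclass
class ToyIndex:
    # block cid -> set of piece cids that contain the block
    pieces_by_block: dict = field(default_factory=dict)
    # piece cid -> sector number (simplification: one sector per piece)
    sector_by_piece: dict = field(default_factory=dict)
    # (piece cid, block cid) -> BlockLocation
    location_in_piece: dict = field(default_factory=dict)

    def add_block(self, piece_cid, sector, block_cid, offset, size):
        """Record one block of a piece in all three indexes."""
        self.pieces_by_block.setdefault(block_cid, set()).add(piece_cid)
        self.sector_by_piece[piece_cid] = sector
        self.location_in_piece[(piece_cid, block_cid)] = BlockLocation(offset, size)

    def locate(self, block_cid):
        """Return (piece cid, sector, BlockLocation) for every piece holding the block."""
        return [
            (piece, self.sector_by_piece[piece], self.location_in_piece[(piece, block_cid)])
            for piece in sorted(self.pieces_by_block.get(block_cid, ()))
        ]
```

Looking up a block then touches each index once: block to pieces, piece to sector, and piece plus block to offset and size.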
The GraphQL API query endpoint is at
You can also run your own queries against the GraphQL API using curl or a programming language that has a GraphQL client.
Boost has a built-in GraphQL explorer at
YugabyteDB is the backend database that hosts the Local Index Directory
YugabyteDB is used to store the retrieval indexes. When a client makes a retrieval request, the LID service is used to look up the requested data within the Lotus miner. For more details see the architecture of LID.
Depending on the size of data your SP is holding, you can run YugabyteDB in a variety of ways. You can find more information about how to run YugabyteDB here.
For production use, YugabyteDB can be deployed as a highly available (HA) cluster. The cluster can be deployed on a Kubernetes instance or as a manual deployment. Be sure to read all the prerequisites before proceeding.
YugabyteDB must be backed up regularly to avoid losing deal metadata. Once this data is lost, there is no reliable way to recover it.
You can then start boostd-data on :8044 and connect it to the yugabyte with:
The PGX driver from Yugabyte supports cluster-aware Postgres connections out of the box. If you are deploying a multi-node YugabyteDB cluster, update your connect-string to use a cluster-aware connection.
With Cluster Mode: "postgresql://postgres:postgres@127.0.0.1:5433?load_balance=true"
With Cluster Mode + No SSL: "postgresql://postgres:postgres@127.0.0.1:5433?sslmode=disable&load_balance=true"
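If you generate the connect-string from configuration, a small helper keeps the query parameters consistent. This is a sketch only: yugabyte_connect_string is a hypothetical helper name, and it reproduces just the two variants shown above.

```python
from urllib.parse import quote, urlencode


def yugabyte_connect_string(user, password, host, port=5433,
                            load_balance=False, disable_ssl=False):
    """Build a postgresql:// connect-string with optional cluster-mode flags."""
    params = {}
    if disable_ssl:
        params["sslmode"] = "disable"
    if load_balance:
        # Cluster-aware connections are requested with load_balance=true.
        params["load_balance"] = "true"
    # Percent-encode credentials so special characters do not break the URL.
    base = f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"
    return base + ("?" + urlencode(params) if params else "")
```

Calling it with load_balance=True yields the cluster-mode string, and adding disable_ssl=True yields the variant without SSL.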
What is YugabyteDB and why YugabyteDB?
YugabyteDB is a high-performance distributed SQL database. Built using a unique combination of high-performance document store, per-shard distributed consensus replication and multi-shard ACID transactions (inspired by Google Spanner), YugabyteDB serves both scale-out RDBMS and internet-scale OLTP workloads with low query latency, extreme resilience against failures and global data distribution. We tested multiple open source databases for the Boost use case and found YugabyteDB to be well suited as it is highly performant and scales well horizontally.
How do I learn about YugabyteDB and what do I need to know about YugabyteDB as an SP?
SPs should familiarize themselves with the “Deploy and Manage” section of the documentation along with the architecture before deploying YugabyteDB.
YugabyteDB will also be utilized by the Lotus V2 architecture. We plan to allow SPs to connect all of their Boost instances to a single LID (YugabyteDB) with the Boost v2.1.0 release. This will also allow SPs to serve retrievals from any of their miners using a single booster-http or booster-bitswap process.
Which deployment should I choose?
We recommend deploying YugabyteDB either locally on bare metal or using a managed deployment. Users can choose to deploy on the cloud if they can guarantee that scaling up the DB will not be impacted by the network bandwidth and infrastructure. It is recommended that YugabyteDB is highly available so that if one of the nodes is not available, your SP operations will not be impacted. You can find some example YugabyteDB deployments here. Please feel free to add your experience and deployment details to the discussion.
Can I convert YugabyteDB processes to systemd services?
YugabyteDB does not ship as a systemd service by default. You will need to create a new service based on the commands you are running to start the yb-master and yb-tserver processes. These commands can be customized based on user requirements and infrastructure.
Once YugabyteDB is deployed, do I need to perform any additional steps?
Ideally, once the deployment is complete and can be reached over the network by the boostd-data service, users do not need to perform any additional steps. If you wish to change the default username/password, you must also update them in the --connect-string of the boostd-data service.
Boost is not currently tied to a specific version of YugabyteDB. We recommend setting up the latest stable version of YugabyteDB when creating a new LID instance. Over time, users can upgrade their YugabyteDB. When upgrading:
Boost-related processes must be stopped beforehand. If YugabyteDB is being utilised by other services apart from Boost then those services must be stopped as well.
Read through and follow the upgrade guide and reach out to YugabyteDB team on Slack if you have any questions.
As per the industry best practices, YugabyteDB should be regularly backed up for redundancy. Check out YugabyteDB docs if you need any help with management of the database.
If you require help with YugabyteDB, we recommend checking out the troubleshooting guide. If the problem is not resolved, reach out to the YugabyteDB team via Slack or GitHub for support. You can also reach out to other users in the #boost-help channel of the Filecoin Slack to seek help from your peers.
Hardware requirements for YugabyteDB
For detailed instructions, playbooks and hardware recommendations, see the YugabyteDB website - https://docs.yugabyte.com
YugabyteDB is designed to run on bare-metal machines, virtual machines (VMs), and containers.
CPU and RAM
You should allocate adequate CPU and RAM. YugabyteDB has adequate defaults for running on a wide range of machines, and has been tested from 2 core to 64 core machines, and up to 200GB RAM.
YugabyteDB requires the SSE2 instruction set support, which was introduced into Intel chips with the Pentium 4 in 2001 and AMD processors in 2003. Most systems produced in the last several years are equipped with SSE2.
In addition, YugabyteDB requires SSE4.2.
To verify that your system supports SSE2, run the following command:
cat /proc/cpuinfo | grep sse2
To verify that your system supports SSE4.2, run the following command:
cat /proc/cpuinfo | grep sse4.2
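The same check can be scripted, for example as a pre-flight step before installing YugabyteDB. The sketch below parses the text of /proc/cpuinfo; the helper names are illustrative. Note that on x86 Linux the SSE4.2 flag is spelled sse4_2 in that file (the grep pattern above still matches it because the dot matches any character).

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags listed in /proc/cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def missing_sse_support(cpuinfo_text):
    """Return the required SSE flags that the CPU does not report."""
    required = ("sse2", "sse4_2")
    present = cpu_flags(cpuinfo_text)
    return [flag for flag in required if flag not in present]
```

On a Linux host, missing_sse_support(open("/proc/cpuinfo").read()) returns an empty list when both instruction sets are available.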
It is recommended that at least 1TiB is allocated for YugabyteDB, depending on the amount of deal data you store and its average block size.
Assuming you've kept unsealed copies of all your data and have consistently indexed deal data, the size of your DAG store directory should be comparable with the requirements for YugabyteDB.
Hardware requirements for Storage Providers running Boost
The hardware requirements for Boost are tied to the Lotus stack in a Filecoin SP deployment.
Depending on how much data you need to onboard, and how many deals you need to make with clients, hardware requirements in terms of CPU and Disk will vary.
A miner will need an 8+ core CPU.
We strongly recommend a CPU model with support for Intel SHA Extensions: AMD since Zen microarchitecture, or Intel since Ice Lake. Lack of SHA Extensions results in a very significant slow down.
The most significant computation that Boost has to do is the Piece CID calculation (also known as Piece Commitment or CommP). When Boost receives data from a client, it calculates the Merkle root out of the hashes of the Piece (padded .car file). The resulting root of the clean binary Merkle tree is the Piece CID.
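To give a feel for why this is hash-heavy work, here is a simplified sketch of computing the root of a binary Merkle tree with SHA-256. It is not the actual Piece CID calculation, which applies additional padding and hashing rules that are deliberately left out here; toy_merkle_root and the 32-byte leaf size are assumptions made for illustration.

```python
import hashlib

NODE_SIZE = 32  # bytes per leaf in this toy example


def toy_merkle_root(data):
    """Return the root of a binary SHA-256 Merkle tree over 32-byte leaves."""
    if not data:
        raise ValueError("data must not be empty")
    # Split the input into fixed-size leaves, zero-padding the last one.
    leaves = [
        data[i:i + NODE_SIZE].ljust(NODE_SIZE, b"\x00")
        for i in range(0, len(data), NODE_SIZE)
    ]
    # Pad the leaf count to a power of two so every node has two children.
    while len(leaves) & (len(leaves) - 1):
        leaves.append(b"\x00" * NODE_SIZE)
    level = leaves
    while len(level) > 1:
        # Each parent is the hash of its two children concatenated.
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]
```

Every byte of the piece is hashed once at the leaf level, and each further level halves the number of nodes, which is why hardware SHA acceleration makes such a difference.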
128 GiB of RAM is needed at the very least.
Boost stores all data received from clients before Piece CID is calculated and compared against deal parameters received from clients. Next, deals are published on-chain, and Boost waits for a number of epoch confirmations before proceeding to pass data to the Lotus sealing subsystem. This means that depending on the throughput of your operation, you must have disk space for at least a few staged sectors.
For small deployments 100 GiB of disk are needed at the very least if we assume that Boost is to keep three 32 GiB sectors before passing them to the sealing subsystem.
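The arithmetic behind that figure is simply the number of staged sectors multiplied by the sector size. A tiny sketch, with staging_space_gib as a hypothetical helper name:

```python
def staging_space_gib(staged_sectors, sector_size_gib=32):
    """Return the disk space in GiB needed to hold the given number of staged sectors."""
    return staged_sectors * sector_size_gib
```

Three 32 GiB sectors come to 96 GiB, which is where the figure of roughly 100 GiB comes from. Scale the sector count to match your own deal throughput.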
We recommend using NVMe disks for Boost. As the Dagstore grows in size, overall performance might degrade on slow disks.
This page describes the Local Index Directory component in Boost, what it is used for, how it works and how to start using it
The Local Index Directory (LID) manages and stores indices of deal data so that it can be retrieved by a content identifier (cid).
Currently this task is performed by the DAG store component. The DAG store keeps its indices on disk on a single machine. LID replaces the DAG store and introduces a horizontally scalable backend database for storing the data - YugabyteDB.
LID is designed to provide a more intuitive experience for the user, by surfacing problems and providing various repair tools.
To summarize, LID is the component which keeps fine-grained metadata about all the deals on Filecoin that a given Storage Provider stores, and without it, the client would only be able to retrieve full pieces, which generally are between 8GiB and 32GiB in size.
At the moment there are two implementations of LID:
- a simple LevelDB implementation, for small SPs who want to keep all information in a single process database.
- a scalable YugabyteDB implementation, for medium and large size SPs with tens of thousands of deals.
When designing LID we considered the needs of various Storage Providers (SPs) and the operational overhead LID would have on their systems. We built a solution for:
- small SPs, holding up to 1PiB, and
- mid- and large-size SPs, holding anywhere from 1PiB up to 100PiB of data
The index size varies with the underlying block size and data format. Typically block sizes are between 16KiB and 1MiB.
When a client uploads deal data to Boost, LID records the sector that the deal data is stored in and scans the deal data to create an index of all its blocks indexed by block cid. This way clients can later retrieve subsets of the original deal data, without retrieving the full deal data.
When a client makes a request for data by cid, LID:
- checks which piece the cid is in, and where in the piece the data is
- checks which sector the piece is in, and where in the sector the piece is
- reads the data from the sector
The retrieval use cases that the Local Index Directory supports are:
Request one root cid with a selector, receive many blocks
LID is able to:
- look up which piece contains the root cid
- look up which sector contains the piece
- for each block, get the offset into the piece for the block
Request one block at a time
LID is able to:
- look up which piece contains the block
- get the size of the block (Bitswap asks for the size before getting the block data)
- look up which sector contains the piece
- get the offset into the piece for the block
Request a whole piece
LID is able to look up which sector contains the piece.
Request an individual block
LID is able to:
- look up which piece contains the block
- look up which sector contains the piece
- get the offset into the piece for the block
Request a file by root cid
LID is able to:
- look up which piece contains the block
- look up which sector contains the piece
- for each block, get the offset into the piece for the block
booster-http is a service which allows SPs to serve blocks and files over HTTP.
Go to the following page for more information on booster-http:
This page covers all the configuration related to the HTTP transfer limiter in Boost
Boost can limit the number of simultaneous HTTP transfers in progress when downloading deal data from clients.
This configuration was introduced in ConfigVersion = 3 of the Boost configuration file.
The transferLimiter maintains a queue of transfers with a soft upper limit on the number of concurrent transfers.
To prevent slow or stalled transfers from blocking up the queue there are a couple of mitigations:
The queue is ordered such that we
start transferring data for the oldest deal first
prefer to start transfers with peers that don't have any ongoing transfer
once the soft limit is reached, don't allow any new transfers with peers that have existing stalled transfers
Note that peers are distinguished by their host (e.g. foo.bar:8080), not by libp2p peer ID. For example, if there is
one active transfer with peer A
one pending transfer (peer A)
one pending transfer (peer B)
The algorithm will prefer to start a transfer with peer B rather than peer A. This helps to ensure that slow peers don't block the transfer queue.
The limit on the number of concurrent transfers is soft. Example: if there is a limit of 5 concurrent transfers and there are
three active transfers
two stalled transfers
then two more transfers are permitted to start (as long as they're not with one of the stalled peers)
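The queueing rules above can be sketched as a small selection function. This is one possible reading of those rules, not Boost's actual implementation: the Transfer type and pick_next function are hypothetical, idle hosts are given priority over deal age, and only transfers that are not stalled count against the limit.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    deal_id: str
    host: str            # peers are distinguished by host, e.g. "foo.bar:8080"
    created_at: float    # smaller means older
    stalled: bool = False


def pick_next(pending, ongoing, limit):
    """Return the pending transfer to start next, or None if none may start."""
    healthy = [t for t in ongoing if not t.stalled]
    if len(healthy) >= limit:
        return None  # every slot is taken by a transfer that is making progress
    busy_hosts = {t.host for t in ongoing}
    stalled_hosts = {t.host for t in ongoing if t.stalled}
    # Oldest deal first.
    candidates = sorted(pending, key=lambda t: t.created_at)
    if len(ongoing) >= limit:
        # Soft limit reached: skip hosts that already have a stalled transfer.
        candidates = [t for t in candidates if t.host not in stalled_hosts]
    # Prefer hosts with no ongoing transfer (the sort is stable, so age order is kept).
    candidates.sort(key=lambda t: t.host in busy_hosts)
    return candidates[0] if candidates else None
```

With a limit of 5, three active transfers and two stalled ones, this function still hands out two more slots, but never to a host that already has a stalled transfer.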
This section details how to compile and install Boost if you are a storage provider or a client
Node 20.x
Linux / Ubuntu
macOS
Depending on your architecture, you will want to export additional environment variables:
Please ignore any output or onscreen instruction during the npm build unless there is an error.
To build boost for calibnet, please complete the above pre-requisites and build using the following commands.
The Boost source code repository is hosted at
Please make sure you have installed: Go - following
Rust - following
< v2.0.0 | NA
v2.0.0 | v1.23.x | 1.20.x | NA
v2.1.0, v2.1.1, v2.1.2 | v1.24.x, v1.25.x | 1.20.x | NA
v2.2.0 | v1.26.0, v1.26.1 | 1.21.7 | NA
v2.3.0 | v1.27.x, v1.28.x | 1.22.3 | 1.22.x, 1.23.x
v2.4.0 | v1.30.0 | 1.22.3 | 1.24.2
v2.4.1 | v1.32.0-rc1 | 1.22.8 | 1.24.3, 1.24.4
v2.4.2 | v1.32.1 | 1.23.7 | NA
How to use deal filters
Your use case might demand very precise and dynamic control over a combination of deal parameters.
Lotus provides two IPC hooks allowing you to name a command to execute for every deal before the miner accepts it:
Filter for storage deals.
RetrievalFilter for retrieval deals.
The executed command receives a JSON representation of the deal parameters on standard input, and upon completion, its exit code is interpreted as:
0: success, proceed with the deal.
non-0: failure, reject the deal.
The most trivial filter rejecting any retrieval deal would be something like: RetrievalFilter = "/bin/false". /bin/false is a binary that immediately exits with a code of 1.
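A filter can be any executable that follows this contract. The sketch below shows the decision logic in Python under clearly stated assumptions: the JSON field names (Proposal, Client, StartEpoch), the deny list and the delay threshold are all hypothetical, so check them against the JSON your node actually passes to the filter.

```python
import json

# Hypothetical policy values, for illustration only.
DENIED_CLIENTS = {"f01234"}
MAX_START_DELAY_EPOCHS = 2 * 2880  # about two days at 30 seconds per epoch


def accept_deal(deal, current_epoch):
    """Return True if the deal passes the deny list and start-time checks."""
    proposal = deal.get("Proposal", {})  # assumed field name
    if proposal.get("Client") in DENIED_CLIENTS:
        return False
    start_epoch = proposal.get("StartEpoch")
    if not isinstance(start_epoch, int):
        return False
    return start_epoch - current_epoch <= MAX_START_DELAY_EPOCHS


def exit_code(raw_json, current_epoch):
    """Map the decision to the filter contract: 0 accepts, non-zero rejects."""
    try:
        deal = json.loads(raw_json)
    except json.JSONDecodeError:
        return 1  # reject anything that cannot be parsed
    if not isinstance(deal, dict):
        return 1
    return 0 if accept_deal(deal, current_epoch) else 1
```

A wrapper script would read the deal with sys.stdin.read(), determine the current epoch, and finish with sys.exit(exit_code(raw, epoch)).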
This Perl script lets the miner deny specific clients and only accept deals that are set to start relatively soon.
You can also use a third party content policy framework like CIDgravity or bitscreen by Murmuration Labs:
Boost configuration options available in UI
By default, the web UI listens on the localhost interface on port 8080. We recommend keeping the UI listening on localhost or some internal IP within your private network to avoid accidentally exposing it to the internet.
To access a web UI that is listening on the localhost interface of a remote server, you can open an SSH tunnel from your local machine:
Price / epoch / GiB | 500000000 | Asking price for a deal in attoFIL. This price is per epoch per GiB of data in a deal
Verified Price / epoch / GiB | 500000000 | Asking price for a verified deal in attoFIL. This price is per epoch per GiB of data in a deal
Min Piece Size | 256 | Minimum size of a piece that the storage provider will accept, in bytes
Max Piece Size | 34359738368 | Maximum size of a piece that the storage provider will accept, in bytes
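To see what these defaults mean in practice, the sketch below multiplies the asking price by the piece size and the deal duration. It is plain proportional arithmetic with hypothetical helper names; how Boost itself rounds or validates prices is not covered here. It relies on 1 FIL being 10^18 attoFIL and on Filecoin's 30 second epochs (2880 per day).

```python
from decimal import Decimal

ATTOFIL_PER_FIL = 10**18
BYTES_PER_GIB = 2**30


def deal_price_attofil(price_per_epoch_per_gib, piece_size_bytes, duration_epochs):
    """Return the total asking price in attoFIL, using integer arithmetic."""
    return price_per_epoch_per_gib * piece_size_bytes * duration_epochs // BYTES_PER_GIB


def attofil_to_fil(amount_attofil):
    """Convert an attoFIL amount to FIL as an exact decimal."""
    return Decimal(amount_attofil) / Decimal(ATTOFIL_PER_FIL)
```

For a 32 GiB piece (34359738368 bytes) stored for one day (2880 epochs) at the default price of 500000000, this gives 46080000000000 attoFIL, or 0.00004608 FIL.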
This tutorial goes through the steps required to run our Docker monitoring setup to collect and visualize metrics for various Boost processes
The monitoring stack we will use includes:
Prometheus - collects metrics and powers dashboards in Grafana
Tempo - collects traces and powers traces search in Grafana with Jaeger
Grafana - provides visualization tools and dashboards for all metrics and traces
Lotus and Boost are already instrumented to produce traces and stats for Prometheus to collect.
The Boost team also packages a set of Grafana dashboards that are automatically provisioned as part of this setup.
The monitoring setup should be done on the same node where boostd and other Boost services are running. If you are running multiple Boost services (like boostd-data/booster-http) across multiple machines then any one of those machines can be used.
This setup has been tested on macOS and on Linux. We haven’t tested it on Windows, so YMMV.
All the monitoring stack containers run in Docker.
Install Docker
We have tested this setup with Docker 20.10.23 on macOS and Ubuntu.
DNS resolution for Prometheus
Depending on where your Filecoin processes (boostd, lotus, lotus-miner, booster-bitswap, etc.) are running, you need to confirm that they are reachable from Prometheus so that it can scrape their metrics.
By default the setup expects to find them within the same Docker network, so if you are running them elsewhere (e.g. on the `host` network), make sure to update the docker-compose file accordingly.
Loki plugin for Docker
The loki plugin for Docker is required to allow collecting logs from the services running on Docker itself
Storage location
The default value is $HOME/.boost-monitoring
Please ensure that the storage directory exists and has permission 0775 or 0777 depending on your user.
Update metrics endpoints for Prometheus
If you are running all Boost services and the monitoring stack on the same node, you can skip this step.
Start the monitoring stack using docker-compose
To access Grafana when it is listening on the localhost interface of a remote server, you can open an SSH tunnel from your local machine:
If you are running a software firewall such as `ufw`, you might need to modify your iptables and allow access from the Prometheus container / network to the Filecoin stack network, for example:
sudo docker network inspect monitoring
# note the Subnet for the network
sudo ufw allow from 172.18.0.0/16
The default username and password for the Grafana dashboard are "admin" and "admin" respectively.
Update extra_hosts in docker-compose.yaml for prometheus, so that the Prometheus container can reach all its targets - boostd, lotus-miner, booster-bitswap, booster-http, etc.
The prometheus and tempo services require access to local storage to ensure all historical data is safe across Docker restarts. This path can be defined in the .
If any of your Boost services like boostd-data or booster-http are running on a different host then you must modify the Prometheus configuration to update the endpoints to be scraped. Please edit file and update the IP addresses and ports used to scrape the metrics. Example:
Confirm that Prometheus targets are scraped at Targets
Go to Grafana at and inspect the dashboards.
How to configure and use HTTP retrievals in Boost
booster-http is a binary that can be run alongside the boostd process in order to serve retrievals over HTTP. This can be used to serve full pieces and also block data in the form of CAR files or raw block bytes. This can be further extended using a bifrost-gateway server if a Storage Provider wishes to also serve plain files and directories.
HTTP retrievals currently have no in-built payment or permissioning mechanism. Storage Providers may wish to front the booster-http server with a reverse proxy such as NGINX to handle operational concerns like SSL, authentication and load balancing.
booster-http was introduced with Boost v1.2.0, able to serve full pieces using the /piece/ endpoint.
In Boost v1.7.0 booster-http was extended to be able to serve CAR files, IPLD blocks and a full "trusted" gateway (files and directories).
In Boost v2.1.0 booster-http was simplified to only serve "trustless" data: full pieces, CAR files and IPLD blocks; deferring optional "trusted" mode to bifrost-gateway for those Storage Providers who wish to offer this service to their users.
"Trustless" refers to a mode in which the client does not need to trust a data provider in order to have confidence in the integrity and authenticity of the data they are receiving. Instead, trust is achieved by content-addressing, by having the server deliver to the client everything they require in order to verify the data they are receiving.
"Trusted" refers to a mode in which the client must trust the data provider to deliver the correct data. This is the standard mode that most web servers operate in, and is a common method of interacting with IPFS gateways (including bifrost-gateway). In this mode, the client must trust that the server is delivering the correct data, and that the server is not maliciously modifying the data in transit. Typically, the client does not receive enough information in order to verify the relationship between the content received and the content-address (CID) requested, hence the need to "trust" the server.
Trustless retrievals are always preferable where there is no trust relationship between the parties. By serving data only in a trustless mode, Storage Providers can avoid accusations of data alteration, and clients can be confident that they are receiving the data they requested. It also reduces the risk of having an HTTP endpoint flagged for serving illicit content. Learn more at Gateways and gatekeepers.
However, there are many circumstances in which a Storage Provider may wish to offer a trusted retrieval service, particularly due to convenience and interoperability with standard HTTP clients.
When performing certain actions, such as replicating deals, it can be convenient to retrieve the entire Piece (with padding) to ensure CommP integrity. When piece retrievals are enabled with booster-http with the --serve-pieces flag (enabled by default), the /piece/ endpoint is exposed.
booster-http supports the IPFS Trustless Gateway specification for both CAR and raw IPLD data contained within deal payloads. When booster-http is run with the --serve-cars flag (enabled by default), the /ipfs/ endpoint is exposed.
To download a CAR file containing all the blocks linked under a given root CID, you can either pass an Accept: application/vnd.ipld.car header, or a ?format=car query parameter to the /ipfs/ endpoint. Likewise, to download raw single IPLD block bytes, you can either pass an Accept: application/vnd.ipld.raw header, or a ?format=raw query parameter to the /ipfs/ endpoint.
The /ipfs/ endpoint must be followed by a CID and one of the indicators above, for example:
Alternatively, using Accept header:
It is also possible to use Lassie, a Trustless retrieval client, to perform and verify retrievals from booster-http:
See the IPFS Trustless Gateway specification for more information on the ranges of queries possible with this protocol; most of which are implemented by Lassie as a client.
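The two ways of selecting a response format described above, a query parameter or an Accept header, can be sketched with the Python standard library. The helper names, base URL and CID below are placeholders; only the /ipfs/ path, the format values and the media types come from the text above.

```python
import urllib.request

CAR_MEDIA_TYPE = "application/vnd.ipld.car"
RAW_MEDIA_TYPE = "application/vnd.ipld.raw"


def ipfs_url(base_url, cid, fmt=None):
    """Build an /ipfs/ URL, optionally selecting the format by query parameter."""
    url = f"{base_url.rstrip('/')}/ipfs/{cid}"
    if fmt is not None:
        if fmt not in ("car", "raw"):
            raise ValueError("fmt must be 'car' or 'raw'")
        url += f"?format={fmt}"
    return url


def ipfs_request(base_url, cid, media_type):
    """Build a request that selects the format with an Accept header instead."""
    return urllib.request.Request(
        ipfs_url(base_url, cid), headers={"Accept": media_type}
    )
```

Either form can then be fetched with urllib.request.urlopen. Unlike Lassie, this sketch does not verify the returned blocks against the requested CID.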
For Storage Providers that wish to offer trusted access to payload data, in the same way that an IPFS Gateway bridges IPLD data to standard browser-accessible files and directories, bifrost-gateway can be used to translate booster-http's trustless data into trusted data. This is a separate process that must be configured to communicate with booster-http.
See Trusted HTTP Gateway Setup for full setup instructions.
SPs should try a local setup and test their HTTP retrievals before proceeding to run booster-http in production.
The booster-http binary is built and installed together with the boostd binary. If you are planning to run booster-http on a different node, you can build and install the new binary. Otherwise, skip to step 3.
Clone the Boost repo and check out the same release as your boostd version
Build and install the new binary
Collect the token information for lotus-miner and lotus daemon API
Start the booster-http
server with the above details
You can run multiple booster-http
processes on the same machine by using a different port for each instance with the --port
flag. You can also run multiple instances of booster-http
on different machines.
By default, the booster-http
server listens on 0.0.0.0
port 7777
. This can be changed with --address
and --port
; for example, you may wish to change the --address
to 127.0.0.1
to restrict it to localhost-only.
When exposing booster-http
to a public address for general retrievals, it is recommended that you run an HTTP reverse proxy such as NGINX to handle operational concerns such as:
SSL
Authentication and authorization
Compression
Load balancing
Logging
booster-http
can provide some of this functionality itself, but dedicated reverse proxy software is recommended for scalable production deployments. See HTTP Reverse Proxy Setup for more information and example configuration.
To enable public discovery of the Boost HTTP server, Storage Providers should set the domain root in boostd's config.toml
. Under the [DealMaking]
section, set HTTPRetrievalMultiaddr
to the public domain root in multi-address format.
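As a rough illustration of what multi-address format looks like, the sketch below converts an ordinary HTTP(S) URL into a multiaddr string. The helper name and the example URLs are assumptions, and the exact value Boost expects for `HTTPRetrievalMultiaddr` should be confirmed against the Boost configuration reference.

```python
import ipaddress
from urllib.parse import urlsplit


def url_to_multiaddr(url):
    """Convert an http(s) URL into a multiaddr string (illustrative only)."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError("only http and https URLs are supported")
    host = parts.hostname
    if not host:
        raise ValueError("URL has no host")
    # Fall back to the scheme's default port when none is given.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        # IP literals use the ip4/ip6 protocols rather than dns.
        proto = f"ip{ipaddress.ip_address(host).version}"
    except ValueError:
        proto = "dns"
    return f"/{proto}/{host}/tcp/{port}/{parts.scheme}"
```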
Example config.toml
section:
Clients can determine if a Storage Provider offers HTTP retrieval by running:
Clients can also check the HTTP URL scheme version and supported queries:
How retrieving data from a given Storage Provider works
Storage Providers on Filecoin support various modes of retrievals for deal data that they have accepted from clients.
Reputation systems periodically crawl the network and record the retrieval success rate of each Storage Provider. How well a provider serves retrievals determines whether future FIL+ deals are made with it.
This is a step-by-step guide to upgrade from Boost v2.1.x to Boost v2.2.0.
Boost v2.2.0 enables support for Direct Data Onboarding (DDO) and no longer supports legacy deals in any form. These changes introduce some breaking changes in the configuration. Please follow the steps below to upgrade.
Please save your current configuration to a file with boostd config updated --diff
Note down the asking price setting from the boostd
UI.
Pull the new stable release v2.2.0 or release candidate v2.2.0-rcx
Rebuild the binaries with the new version
Stop booster-http
, booster-bitswap
, boostd
and boostd-data
in that order
Start boostd-data
service first.
Start boostd
after the upgrade, then shut it down after a successful start
Review the <boost repo>/config.toml
for configuration parameters saved in step 1 and update them if they have been reset to default values.
Start boostd
, booster-http
and booster-bitswap
services.
Restore the asking price setting from UI.
This page describes the order in which the different Boost processes must be started and stopped.
Start YugabyteDB and wait for services to come online.
Start boostd-data
process.
Start boostd
process.
Start booster-http
and booster-bitswap
processes.
Stop booster-http
and booster-bitswap
processes.
Stop boostd
process.
Stop boostd-data
process.
Finally, you can stop YugabyteDB services.
YugabyteDB services do not need to be stopped if your maintenance work does not directly make changes to the DB itself. The DB can simply be disconnected from the Boost processes by shutting down the boostd-data
process.
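The ordering above amounts to "start dependencies first, stop them last". A minimal sketch of that rule is shown below; the list and function names are illustrative and not part of Boost.

```python
# Start order as described in the text: each entry depends on the ones before it.
START_ORDER = [
    "YugabyteDB",
    "boostd-data",
    "boostd",
    "booster-http and booster-bitswap",
]


def stop_order(keep_database_running=False):
    """Return the stop order, which is the start order reversed.

    When maintenance does not touch the database itself, YugabyteDB can be
    left running and only the Boost processes are stopped.
    """
    order = list(reversed(START_ORDER))
    if keep_database_running:
        order = [name for name in order if name != "YugabyteDB"]
    return order
```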
Customizing your booster-http instance
Before proceeding any further, we suggest you read about the basics of HTTP retrieval with booster-http. This section is an extension of HTTP retrievals and deals with advanced configuration options.
booster-http
is an independent process that can be run on the same machine as boostd
or on a different machine. Multiple instances can also be run with different listen addresses if required.
booster-http
must have network or localhost access to a full lotus
node, a lotus-miner
and a LID
instance. The following options are required:
--api-lid
- the Local Index Directory (LID) service endpoint
--api-fullnode
- the lotus
full node API endpoint, discoverable by running lotus auth api-info --perm=admin
--api-storage
- the lotus-miner
API endpoint, discoverable by running lotus-miner auth api-info --perm=admin
--address
and --port
configure the listen address and port of the booster-http
server. By default the HTTP server listens on 0.0.0.0:7777
. This can be set to other addresses and ports as required, e.g. 127.0.0.1
to serve localhost-only if running a reverse-proxy on the same server with a public listen address.
--serve-pieces
is enabled by default and allows retrieval of raw pieces on the /piece/
endpoint of your booster-http
server. Requests for pieces require the full piece CID appended to /piece/
.
Piece retrieval is typically performed to replicate deals, or by clients that are able to decode raw piece data.
--serve-cars
is enabled by default and allows IPFS Trustless Gateway retrievals on the /ipfs/
endpoint. This is not a full "trusted" gateway, and requests must either ask for CARs containing one or more blocks from a root CID, or raw block bytes for a single CID. Requests can either pass an Accept: application/vnd.ipld.car
header, or a ?format=car
query parameter for CAR data. Or, to download raw single IPLD block bytes, either an Accept: application/vnd.ipld.raw
header, or a ?format=raw
query parameter.
A trustless retrieval client is recommended for performing and verifying retrievals from booster-http
. See Lassie for more information. Providing Lassie with the --provider http://{booster-http exposed url}
will perform verified, trustless retrievals to your booster-http
instance.
booster-http
(and booster-bitswap
) automatically filter out known flagged content using the denylist maintained at https://badbits.dwebops.pub/denylist.json. Use one or more --badbits-denylists
flags to point to a custom, valid BadBits denylist and override the default.
By default, booster-http
will compress responses with gzip compression. --compression-level
can be set between values of 0
and 9
, where 0
is no compression and 9
is maximum compression. The default value is 1
, which optimises for speed over compression ratio, but this may be increased if required.
Compression can be disabled by setting --compression-level 0
. If you are running a reverse proxy, such as NGINX, in front of booster-http
that performs compression, you should disable compression in booster-http
to avoid double-compression.
booster-http
logs HTTP requests and errors to stdout by default. This can be overridden with --log-file
to log to a file instead. The log file format is similar to typical NGINX or Apache log file formats and is suitable for ingestion into log aggregation tools such as Splunk or ELK. The format is as follows:
Where the elements are:
RFC 3339 timestamp
Remote address
HTTP Method
Request path
Response status code
Response duration (in milliseconds)
Response size
Compression ratio (or -
if no compression)
Remote user agent
Error (or ""
if no error)
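For log ingestion, a line with these ten elements can be split into named fields. The sketch below assumes the fields are space-separated in the order listed, with the user agent and error wrapped in double quotes; that layout and the sample line in the usage note are assumptions, so check them against real booster-http output before relying on the parser.

```python
import shlex

# Field names follow the order of the elements listed above.
FIELDS = (
    "timestamp",
    "remote_addr",
    "method",
    "path",
    "status",
    "duration_ms",
    "size",
    "compression_ratio",
    "user_agent",
    "error",
)


def parse_log_line(line):
    """Split one log line into a dict keyed by FIELDS (assumed layout)."""
    tokens = shlex.split(line)  # honours double-quoted fields, including ""
    if len(tokens) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(tokens)}")
    record = dict(zip(FIELDS, tokens))
    for key in ("status", "duration_ms", "size"):
        record[key] = int(record[key])
    if record["compression_ratio"] == "-":
        record["compression_ratio"] = None  # "-" means no compression
    return record
```

For example, a hypothetical line such as `2023-01-01T00:00:00Z 127.0.0.1 GET /ipfs/bafyEXAMPLE 200 12 3456 - "curl/8.0" ""` would yield a record with an integer status code and an empty error string.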
When using a reverse proxy, log output from the reverse proxy may be more suitable for storage and analysis than booster-http
logging.
Where a Storage Provider wishes to serve plain files, including streaming video, audio and other media, a trusted HTTP gateway can be used to translate booster-http
's trustless data into trusted data. This is a separate process that must be configured to communicate with booster-http
.
Be aware that trusted HTTP responses are typically not counted as "successful" retrievals by reputation systems and other retrieval checkers. It is also not possible to use trusted HTTP gateways to retrieve CAR files, which are required for verified retrievals by clients such as Lassie, a recommended client for Filecoin downloads. When enabling trusted HTTP gateways, it is recommended to also enable the trustless CAR gateway to allow CAR retrievals; this includes via reverse proxies (see below).
bifrost-gateway is an IPFS Trusted Gateway server that can be used to translate booster-http
's trustless data into trusted data. bifrost-gateway is a separate process that must be configured to communicate with booster-http
.
When running bifrost-gateway
, two environment variables must be set:
PROXY_GATEWAY_URL=http://{booster-http exposed url}
to point to the booster-http
address (without path)
GRAPH_BACKEND=true
to instruct bifrost-gateway
to perform full CAR retrievals rather than individual IPLD block retrievals for efficiency
Additionally, --gateway-port
may be used to override the default listen port of 8081
.
A reverse proxy can be configured to talk to bifrost-gateway
, but be aware that IPFS gateways are typically exposed on the /ipfs/
endpoint, which is also the endpoint of the trustless gateway which is required for standard Filecoin retrievals (e.g. using Lassie). A reverse proxy combining both the booster-http
trustless endpoint and a bifrost-gateway
trusted endpoint must be configured to route /ipfs/
requests to booster-http
where the Accept
header contains application/vnd.ipld.car
or application/vnd.ipld.raw
, and /ipfs/
requests to bifrost-gateway
where the Accept
header contains anything else, such as text/html
or */*
. Alternatively, separate reverse proxies may be configured for both booster-http
and bifrost-gateway
.
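The routing rule for a combined reverse proxy can be expressed as a small decision function. This Python sketch only illustrates the logic; in practice the rule would be written in the reverse proxy's own configuration language, and the function name and return values are assumptions.

```python
# Media types that must be served by the trustless gateway.
TRUSTLESS_TYPES = ("application/vnd.ipld.car", "application/vnd.ipld.raw")


def choose_backend(path, accept_header):
    """Pick the backend for a request, based on path and Accept header.

    Returns "booster-http", "bifrost-gateway", or None when the path is not
    under /ipfs/ and this rule does not apply.
    """
    if not path.startswith("/ipfs/"):
        return None
    accept = (accept_header or "").lower()
    if any(media_type in accept for media_type in TRUSTLESS_TYPES):
        return "booster-http"
    # Anything else (text/html, */*, missing header) goes to the trusted gateway.
    return "bifrost-gateway"
```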
Storage Providers should secure their booster-http
before exposing it to the public. Storage Providers may use any tool available to limit who can download files, the number of requests per second, and the download bandwidth each client can use per second.
NGINX is one such reverse proxy which may be used in front of a booster-http
instance. This section provides only basic coverage of the ways in which NGINX can set access limits, rate limits and bandwidth limits. In particular, it is also possible to add limits by request token, or by using JWT tokens. The examples in this section are adapted from Deploying NGINX as an API Gateway, which goes into more detail.
By default NGINX puts configuration files into /etc/nginx
The default configuration file is /etc/nginx/sites-available/default
In this example, we set up an NGINX server to listen on port 7575
and forward requests to booster-http
on port 7777
.
The IPFS Trustless Gateway serves content from /ipfs/
and the piece retrieval endpoint is /piece/
. A location
block that matches both of these paths will forward requests to booster-http
.
Alternatively, to only forward /ipfs/
requests to booster-http
our location
directive can be simplified:
We can limit access to the IPFS gateway using the standard .htaccess
file. This file contains usernames and passwords. In this example we create a user named alice
:
Include the .htaccess
file in the /etc/nginx/sites-available/default
Now when we open any URL under the path /ipfs/
we will be presented with a Sign-in dialog.
To prevent users from making too many requests per second, we can add rate limits.
Create a file with the rate limiting configuration at /etc/nginx/ipfs-gateway.conf.d/ipfs-gateway.conf
Add a request zone limit to the file of 1 request per second, per client IP
Include ipfs-gateway.conf
in /etc/nginx/sites-available/default
and set the response for too many requests to HTTP response code 429
If you click the refresh button in your browser on any path under /ipfs/
more than once per second, you will see a 429 error page.
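The effect of a limit of 1 request per second per client IP can be modelled in a few lines. This is a deliberately simplified sketch for intuition only: it is not how NGINX implements rate limiting internally, and the class and method names are made up for the example.

```python
class PerClientRateLimiter:
    """Allow each client IP at most one request per minimum interval."""

    def __init__(self, requests_per_second=1.0):
        self.min_interval = 1.0 / requests_per_second
        self.last_allowed = {}  # client IP -> time of last accepted request

    def status_for(self, client_ip, now):
        """Return the HTTP status for a request arriving at time `now` (seconds)."""
        last = self.last_allowed.get(client_ip)
        if last is not None and now - last < self.min_interval:
            return 429  # Too Many Requests
        self.last_allowed[client_ip] = now
        return 200
```

The caller passes the current time explicitly, which keeps the sketch deterministic and easy to test.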
It is also recommended to limit the amount of bandwidth that clients can take up when downloading data from booster-http
. This ensures a fair bandwidth distribution to each client and prevents situations where one client ends up choking the booster-http
instance.
Create a new .htaccess user called bob
Add a mapping from .htaccess
username to bandwidth limit in /etc/nginx/ipfs-gateway.conf.d/ipfs-gateway.conf
Add the bandwidth limit to /etc/nginx/sites-available/default
To verify bandwidth limiting, use curl
to download a file with user alice
and then bob
Note the difference in the Average Dload column (the average download speed).
We can configure NGINX's compression settings in its main http
block, typically in the /etc/nginx/nginx.conf
file.
See the NGINX gzip module documentation for more information.
NGINX can be configured to serve HTTPS traffic. This is recommended for production deployments. See NGINX HTTPS configuration for more information.
NGINX can be configured to cache responses from booster-http
. Since booster-http
serves content addressed data that does not change, this is particularly well suited to caching in cases where certain content is frequently requested. booster-http
sets a long Cache-Control
header by default, so NGINX will cache responses for a long time by default.
See the NGINX proxy module documentation for more information on how to configure caching.
How to configure and use Bitswap retrievals in Boost
booster-bitswap
is a binary that runs alongside the boostd
process to serve retrievals over the Bitswap protocol. This feature of Boost provides a number of tools for managing a production-grade Bitswap retrieval service for a Storage Provider's content.
There is currently no payment method in booster-bitswap. This endpoint is intended to serve free content.
Bitswap retrieval introduces interoperability between IPFS and Filecoin, as it enables clients to retrieve Filecoin data over IPFS. This expands the reach of the Filecoin network considerably, increasing the value proposition for users to store data on the Filecoin network. This benefits the whole community, including SPs. Users will be able to access data directly via IPFS, as well as benefit from retrieval markets (e.g. Saturn) and compute over data projects (e.g. Bacalhau).
booster-bitswap
modes
There are two primary "modes" for exposing booster-bitswap
to the internet.
In private mode
the booster-bitswap
peer ID is not publicly accessible to the internet. Instead, public Bitswap traffic goes to boostd
itself, which then acts as a reverse proxy, forwarding that traffic on to booster-bitswap
. This is similar to the way one might configure NGINX as a reverse proxy for an otherwise private web server. private mode
is simpler to set up but may place greater load on boostd
as a protocol proxy.
In public mode
the public internet firewall must be configured to forward traffic directly to the booster-bitswap
instance. boostd
is configured to announce the public address of booster-bitswap
to the network indexer (the network indexer is the service that clients can query to discover where to retrieve content). This mode offers greater flexibility and performance. You can even set up booster-bitswap
to run over a separate internet connection from boostd
. However, it might require additional configuration and changes to your overall network infrastructure.
You can configure booster-bitswap in demo mode to familiarise yourself with the configuration. Once you are confident and familiar with the options, please go ahead and configure booster-bitswap
for production use.
The booster-bitswap
binary is built and installed with the boostd
binary. If you are planning to run booster-bitswap
on a different node, you can build and install the new binary. Otherwise, skip to step 3.
1. Clone the Boost repo and check out the same release as your boostd
version
2. Build and install the booster-bitswap
binary:
3. Initialize booster-bitswap
:
4. Record the peer ID output by booster-bitswap init
-- we will need this peer id later
5. Run booster-bitswap
6. By default, booster-bitswap runs on port 8888. You can use --port
to override this behaviour
7. Fetch over Bitswap by running:
Where peerID
is the peer id recorded when you ran booster-bitswap init
and rootCID
is the root CID of data known to be stored on your SP.
booster-bitswap
to serve retrievals
As described above, booster-bitswap
can be configured to serve the retrievals in 2 modes. We recommend using public mode
to avoid greater load on boostd
as a protocol proxy.
The booster-bitswap
binary is built and installed with the boostd
binary. If you are planning to run booster-bitswap
on a different node, you can build and install the new binary. Otherwise, skip to step 3.
1. Clone the Boost repo and check out the same release as your boostd
version
2. Build and install the booster-bitswap
binary:
3. Initialize booster-bitswap
:
4. Record the peer ID output by booster-bitswap init
-- we will need this peer id later
5. Stop boostd
and edit ~/.boost/config.toml to set the peer ID for bitswap
6. Start boostd
service again
7. Collect the token information for lotus-miner and lotus daemon API
8. Run booster-bitswap
You can get a boostd
multiaddress by running boostd net listen
and using any of the returned addresses
9. By default, booster-bitswap runs on port 8888. You can use --port
to override this behaviour
10. Try to fetch a payload CID over bitswap to verify your configuration
The booster-bitswap
binary is built and installed with the boostd
binary. If you are planning to run booster-bitswap
on a different node, you can build and install the new binary. Otherwise, skip to step 3.
1. Clone the Boost repo and check out the same release as your boostd
version
2. Build and install the booster-bitswap
binary:
3. Initialize booster-bitswap
:
4. Record the peer ID output by booster-bitswap init
-- we will need this peer id later
5. Stop boostd
and edit ~/.boost/config.toml to set the peer ID for bitswap
The libp2p private key file for booster-bitswap can generally be found at <booster-bitswap repo path>/libp2p.key
The reason boost needs to know the public multiaddresses and libp2p private key for booster-bitswap
is so it can properly announce these records to the network indexer.
6. Start boostd
service again
7. Collect the token information for lotus-miner and lotus daemon API
8. Run booster-bitswap
9. By default, booster-bitswap runs on port 8888. You can use --port
to override this behaviour
10. Try to fetch a payload CID over bitswap to verify your configuration
booster-bitswap
configuration
booster-bitswap
provides a number of performance and safety tools for managing a production grade bitswap server without overloading your infrastructure.
Depending on your hardware you may wish to increase or decrease the default parameters for the bitswap server internals. In the following example we are increasing the worker count for various components up to 600. This will utilize more CPU and I/O, but improve the performance of retrievals. See the command line help docs for details on each parameter.
booster-bitswap is automatically set up to deny all requests for CIDs that are on the BadBits Denylist. The default BadBits list can be overridden, or additional BadBits lists can be provided to the booster-bitswap
instance.
booster-bitswap
provides a number of controls for filtering requests and limiting resource usage. These are expressed in a JSON configuration file <booster-bitswap repo>/retrievalconfig.json
You can create a new retrievalconfig.json
file if one does not exist
To make changes to the current configuration, you need to edit the retrievalconfig.json
file and restart booster-bitswap
for the changes to take effect. All configs are optional and absent parameters generally default to no filtering at all for the given parameter.
You can also configure booster-bitswap
to fetch your retrieval config from a remote HTTP API, possibly one provided by a third party configuration tool like CIDGravity. To do this, start booster-bitswap
with the --api-filter-endpoint {url} option, where {url} is the HTTP URL for an API serving the above JSON format. Optionally, add --api-filter-auth {authheader} if you need to pass a value for the HTTP Authorization header with your API.
When you set up an API endpoint, booster-bitswap
will update its local configuration from the API every five minutes, so you won't have to restart booster-bitswap
to make a change. Please, be aware that the remote config will overwrite, rather than merge, with the local config.
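The difference between overwriting and merging matters when the remote config omits a setting that exists locally. The sketch below contrasts the two behaviours using plain dictionaries. The key names in the usage note are purely hypothetical placeholders and are not real retrievalconfig.json fields.

```python
def apply_remote_config(local_config, remote_config):
    """Overwrite semantics: the remote config fully replaces the local one."""
    return dict(remote_config)


def merged_config(local_config, remote_config):
    """Merge semantics, shown only for contrast. This is NOT the behaviour
    described in the text."""
    return {**local_config, **remote_config}
```

With a local config of `{"HypotheticalFilterA": 1, "HypotheticalFilterB": 2}` and a remote config of `{"HypotheticalFilterA": 5}`, overwriting leaves only `HypotheticalFilterA`, so any local-only setting has to be duplicated in the remote config if it should stay in force.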
Limiting bandwidth within booster-bitswap will not provide the optimal user experience. Depending on the individual setup, setting up limitations within the software could have a larger impact on the storage provider's operations. Therefore, we recommend that storage providers set up their own bandwidth limitations using existing tools.
There are multiple options for setting up bandwidth limiting.
At the ISP level - dedicated bandwidth is provided to the node running booster-bitswap.
At the router level - we recommend configuring the bandwidth at the router level as it provides more flexibility and can be updated as needed. To configure the bandwidth on your router, please check with your manufacturer.
Limit the bandwidth using different tools available in Linux. Here are some examples of such tools. Please feel free to use any other tools not listed here and open a GitHub issue to add your example to this page.
TC is used to configure Traffic Control in the Linux kernel. There are examples available online detailing how to configure rate limiting using TC.
You can use the below commands to run a very basic configuration.
Trickle is a portable lightweight user space bandwidth shaper, that either runs in collaborative mode (together with trickled) or in standalone mode. You can read more about rate limiting with trickle here. Here's a starting point for configuration in trickle to rate limit the booster-bitswap service.
Another way of controlling network traffic is to limit bandwidth on individual network interface cards (NICs). Wondershaper is a small Bash script that uses the tc command-line utility in the background to let you regulate the amount of data flowing through a particular NIC. As you can imagine, while you can use wondershaper on a machine with a single NIC, its real advantage is on a machine with multiple NICs. Just like trickle, wondershaper is available in the official repositories of mainstream distributions. To limit network traffic with wondershaper, specify the NIC on which you wish to restrict traffic with the download and upload speed in kilobits per second.
For example,
How to set up a new Boost instance for a new lotus-miner
If you are already running a Boost node then please follow the migration tutorial and do not attempt the below steps. This can result in permanent data loss.
Make sure you have read the Components page before proceeding. Boost v2 introduces a new service called boostd-data which requires a database to be installed - YugabyteDB or LevelDB.
Boost v2 introduces the Local Index Directory as a replacement for the DAG store. It scales horizontally and provides a more intuitive experience for users, by surfacing problems in the UI and providing repair functionality.
The Local Index Directory (LID) stores retrieval indices in YugabyteDB. Retrieval indices store the size and location of each block in the deal data.
It is recommended to run YugabyteDB on a dedicated host with SSD drives. Depending on how many blocks there are in the user data, the retrieval indices may require up to 2% of the size of the unsealed data, e.g. 1 TiB of unsealed user data may require a 20 GiB index.
YugabyteDB requires about the same amount of space as your DAG store requires today.
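The 2% upper bound gives a quick way to size the database disk. The helper below is an illustrative calculation only; the function name is an assumption, and real index sizes depend on how many blocks the data contains.

```python
GIB_PER_TIB = 1024


def max_index_size_gib(unsealed_tib, index_ratio=0.02):
    """Upper-bound estimate of retrieval index size in GiB.

    unsealed_tib is the amount of unsealed user data in TiB; index_ratio is
    the "up to 2%" figure from the text.
    """
    return unsealed_tib * GIB_PER_TIB * index_ratio
```

For 1 TiB of unsealed data this gives roughly 20.5 GiB, in line with the "20 GiB" figure above.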
You can find more information about YugabyteDB in the Components
section:
Follow these instructions in order to set up a new Boost instance with boostd-data
service:
Make sure you have a Lotus node and miner running
Create and send funds to two new wallets on the lotus node to be used for Boost. Boost currently uses two wallets for storage deals:
The publish storage deals wallet - This wallet pays the gas cost when Boost sends the PublishStorageDeals
message.
The deal collateral wallet - When the Storage Provider accepts a deal, they must put collateral for the deal into escrow. Boost moves funds from this wallet into escrow with the StorageMarketActor
.
Set the publish storage deals wallet as a control wallet
Run the boostd-data
service
boostd-data
is a data proxy service which abstracts the access to LID through an established API. It makes it easier to secure the underlying database and not expose it. boostd-data
listens on a websocket interface, which is the entry point that should be exposed to boostd
and booster-http
Start the boostd-data
service with parameters to connect to YugabyteDB on its Cassandra and PostgreSQL interfaces:
The boostd-data
service can run on a separate machine from Boost as long as the service is reachable via the network from the Boost node
Create and initialize the Boost repository
Boost keeps all data in a directory called the repository. By default the repository is at ~/.boost
. To use a different location pass the --boost-repo
parameter (must precede any particular command verb, e.g. boostd --boost-repo=<PATH> init
).
Export the environment variables needed for boostd init
to connect to the lotus daemon and lotus miner.
Export environment variables that point to the API endpoints for the sealing and mining processes. They will be used by the boost
node to make JSON-RPC calls to the mining/sealing/proving
node.
Run boostd init
to create and initialize the repository:
--api-sealer
is the API info for the lotus-miner instance that does sealing
--api-sector-index
is the API info for the lotus-miner instance that provides storage
--max-staging-deals-bytes
is the maximum amount of storage to be used for downloaded files (once the limit is reached Boost will reject subsequent incoming deals)
Update ulimit
file descriptor limit if necessary. Boost deals will fail if the file descriptor limit for the process is not set high enough. This limit can be raised temporarily before starting the Boost process by running the command ulimit -n 1048576
. We recommend setting it permanently by following the Permanently Setting Your ULIMIT System Value guide.
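Before starting Boost, the current limit can be checked programmatically. This is a minimal sketch using Python's standard `resource` module (Unix only); the function name is an assumption, and the threshold is the value from the `ulimit -n` command above.

```python
import resource

# Value used in the ulimit command above.
RECOMMENDED_NOFILE = 1048576


def nofile_limit_ok(required=RECOMMENDED_NOFILE):
    """Return True if the soft open-file limit of this process meets `required`."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft == resource.RLIM_INFINITY or soft >= required
```

The check only reflects the limits of the process that runs it, so it should be run from the same shell or service environment that will start boostd.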
Before you start boostd, it is important that it is reachable from any peer in the Filecoin network. For this, you will need a stable public IP, and you will need to edit your <boostd repo>/config.toml
as follows:
Add boostd-data
details to the boostd
config
Configure boostd
repository config (located at <boostd repo>/config.toml
) to point to the exposed boostd-data
service endpoint. Note that the connection must be configured to go over a websocket.
For example:
Start the boostd
process
Make sure that the correct peer id and multiaddr for your SP are set on chain, given that boostd init
generates a new identity. Use the following commands to update the values on chain:
<MULTIADDR> should be the same as the AnnounceAddresses
you set in the Libp2p
section of the config.toml of Boost
<PEER_ID> can be found in the output of boostd net id
command
At this stage you should have the latest version of Boost running with the Local Index Directory. Go to the Local Index Directory page and review the following sections:
Pieces section shows counters for total pieces of user data that your SP is storing as well as whether you are keeping unsealed and indexed copies of them.
Flagged pieces are pieces that either lack an unsealed copy or are missing an index. For the sealed-only user data, you should make sure that you unseal individual sectors if you want this data to be retrievable.
Sealed-only copies of data are not retrievable and are only being proven on-chain within the corresponding deadline / window. Typically sealed-only data is considered archival as it is not immediately retrievable. If the client requests it, the SP sealing pipeline must first unseal it, which typically takes 1-2 hours, and only then does the data become available.
Flagged (unsealed) pieces are user data that your SP is hosting but that is not indexed.
We recommend that you trigger re-indexing for these pieces, so that data becomes retrievable. Check the tutorial on re-indexing flagged unsealed pieces for more information.
Deal Sector Copies section displays counters for the state of your sectors: whether you keep unsealed copies for all sectors or not. Ideally the SP should keep unsealed copies for all data that should be immediately retrievable.
Sector Proving State section displays counters of your active and inactive sectors - active sectors are those that are actively proven on-chain, inactive sectors are those that you might have failed to publish a WindowPoSt for, or are expired or removed.
Migrate from Boost version 1 to version 2
Make sure you have read the Components page before proceeding. Boost v2 introduces a new service called boostd-data which requires a database to be installed - YugabyteDB or LevelDB.
Boost v2 introduces the Local Index Directory as a replacement for the DAG store. It scales horizontally and provides a more intuitive experience for users, by surfacing problems in the UI and providing repair functionality.
When boost receives a storage deal, it creates an index of all the block locations in the deal data, and stores the index in LID.
When boostd / booster-http etc. gets a request for a block, it:
gets the block sector and offset from the LID index
requests the data at that sector and offset from the miner
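The two-step lookup can be sketched with in-memory stand-ins for the index and the miner. Everything here is illustrative: the class, the dictionaries and the function are assumptions that only model the flow, not the real LID or lotus-miner APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockLocation:
    """Where a block lives: which sector, at what offset, and how long it is."""
    sector: int
    offset: int
    size: int


def read_block(index, sectors, cid):
    """Resolve a block via the index, then read it from the sector data.

    index maps CID strings to BlockLocation (stand-in for LID);
    sectors maps sector numbers to bytes (stand-in for the miner).
    """
    location = index.get(cid)
    if location is None:
        raise KeyError(f"block {cid} is not indexed")
    data = sectors[location.sector]
    return data[location.offset:location.offset + location.size]
```

A block that is stored but missing from the index cannot be located at all, which is why unindexed pieces are flagged.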
A large miner with many incoming retrieval requests needs many boostd / booster-http / booster-bitswap processes to serve those requests. These processes need to look up block locations in a centralized index.
We tested several databases and found that YugabyteDB is best suited to the indexing workload because
it performs well on off-the-shelf hardware
it's easy to scale up by adding more machines
it has great documentation
once set up, it can be managed through a web UI
It is possible to connect multiple boostd instances to a single LID instance. In this scenario, each boostd instance still stores data to a single miner, e.g. boostd A stores data to miner A, boostd B stores data to miner B. However, each boostd instance saves retrieval indexes in a single, shared LID instance.
For retrieval, each boostd instance can query the shared LID instance (to find out which miner has the data) and retrieve data from any miner in the cluster.
booster-bitswap and booster-http can also be configured to query the shared LID instance, and retrieve data from any miner in the cluster.
If you are deploying multiple boostd
instances with a single LID instance you will need to set up the networking so that each boostd
, booster-bitswap
and booster-http
instance can query all miners and workers in the cluster. We recommend assigning all of your miner instances and boostd instances to the same subnet. Note also that the YugabyteDB instance will need enough space for retrieval indexes for all of the miners.
The Local Index Directory stores retrieval indices in a YugabyteDB database. Retrieval indices store the size and location of each block in the deal data.
We recommend running YugabyteDB on a dedicated machine with SSD drives. Depending on how many blocks there are in the user data, the retrieval indices may require up to 2% of the size of the unsealed data. e.g. 1 TiB of unsealed user data may require a 20 GiB index.
YugabyteDB should require about the same amount of space as your DAG store requires today.
You can find more information about YugabyteDB in the Components
section:
Follow these instructions in order to migrate your existing DAG store into the new Local Index Directory and upgrade from Boost v1 to Boost v2:
1. Clone the Boost repository to a temporary directory
Note: Don’t overwrite your existing boost instance at this stage
2. Check out the Boost v2 release
3. Build from source
4. Migrate dagstore indices
Depending on the amount of data your SP is storing, this step could take anywhere from a few minutes to a few hours. You can run it even while Boost v1 continues to run. The command can be stopped and restarted. It will continue from where it left off.
Run the migration with parameters to connect to YugabyteDB on its Cassandra and PostgreSQL interfaces:
The PGX driver from Yugabyte supports cluster aware Postgres connection out of the box. If you are deploying a multi-node YugabyteDB cluster, then please update your connect-string to use a cluster aware connection.
With Cluster Mode: "postgresql://postgres:postgres@127.0.0.1:5433?load_balance=true"
With Cluster Mode + No SSL: "postgresql://postgres:postgres@127.0.0.1:5433?sslmode=disable&load_balance=true"
It will output a progress bar, and also a log file with detailed migration information at migrate-yugabyte.log
If you are deploying a single LID instance with multiple boost instances, you will need to repeat this step for each boost instance in the cluster.
5. Run the boostd-data service
boostd-data is a data proxy service that abstracts access to LID through an established interface. It makes it easier to secure the underlying database without exposing it. boostd-data listens on a websocket interface, which is the entrypoint that should be exposed to boostd and booster-http.
Start the boostd-data service with parameters to connect to YugabyteDB on its Cassandra and PostgreSQL interfaces:
The PGX driver from Yugabyte supports cluster aware Postgres connection out of the box. If you are deploying a multi-node YugabyteDB cluster, then please update your connect-string to use a cluster aware connection.
With Cluster Mode: "postgresql://postgres:postgres@127.0.0.1:5433?load_balance=true"
With Cluster Mode + No SSL: "postgresql://postgres:postgres@127.0.0.1:5433?sslmode=disable&load_balance=true"
--hosts takes the IP addresses of the YugabyteDB YT-Servers, separated by ",". Example:
--hosts 10.0.0.1,10.0.0.2,10.0.0.3
--addr is the <IP>:<PORT> that the boostd-data service should listen on. The IP here can be a private one (recommended) and should be reachable by all Boost-related processes. Please make sure to update your firewall configuration accordingly.
If you are deploying a single LID instance with multiple boostd instances, you should run a single boostd-data process on one of the hosts where YugabyteDB is installed. All boostd, booster-bitswap and booster-http instances should be able to reach this single boostd-data process.
6. Update boostd repository config
Configure the boostd repository config (located at <boostd repo>/config.toml) to point to the exposed boostd-data service endpoint. Note that the connection must be configured to go over a websocket.
For example:
6.1 Add miners to boostd repository config
If you are deploying a single LID instance with multiple boost instances, you will also need to add to config the RPC endpoint for each miner in the cluster. This allows boostd to serve data for each miner over Graphsync.
Make sure to test that this boostd instance can reach each miner by running
7. Install Boost v2
Note that in v2 booster-http and booster-bitswap take slightly different parameters (see below).
8. Stop boostd, booster-http and booster-bitswap
You need to stop boostd before migrating piece info data.
9. Migrate piece info data (information about which sector each deal is stored in)
This should take no more than a few minutes.
If you are deploying a single LID instance with multiple boostd instances, you will need to repeat this step for each boostd instance in the cluster.
10. Start the upgraded versions of boostd, booster-http and booster-bitswap
Note that booster-http and booster-bitswap take slightly different parameters:
--api-boost is removed
There is a new parameter --api-lid that points to the boostd-data service (which hosts LID), e.g. --api-lid="ws://<boostd-data>:8044"
If you are deploying a single LID instance with multiple booster-bitswap and booster-http instances, you should supply a --api-storage flag for each one, e.g. --api-storage=MINER_API_INFO_1 --api-storage=MINER_API_INFO_2
Make sure to test that this booster-http / booster-bitswap instance can reach each miner by running
11. Clean up the dagstore directory from the boostd repo and the temporary Boost GitHub repo
Be careful when running the commands below to make sure that you do not remove the wrong directory:
$ rm -rf <boostd repo>/dagstore
$ rm -rf /tmp/boostv2
1. Test how long it takes to reindex a piece
2. Perform a retrieval using Graphsync, Bitswap and HTTP
At this stage you should have the latest version of Boost running with the Local Index Directory. Go to the Local Index Directory page and review the number sections:
Pieces section shows counters for total pieces of user data that your SP is storing as well as whether you are keeping unsealed and indexed copies of them.
Flagged pieces are pieces that either lack an unsealed copy or are missing an index. For the sealed-only user data, you should make sure that you unseal individual sectors if you want this data to be retrievable.
Sealed only copies of data are not retrievable and are only being proven on-chain within the corresponding deadline / window. Typically sealed only data is considered as archival as it is not immediately retrievable. If the client requests it, the SP sealing pipeline must first unseal it, which typically takes 1-2 hours, and only then the data becomes available.
Flagged (unsealed) pieces is user data that your SP is hosting, which is not indexed.
We recommend that you trigger re-indexing for these pieces, so that data becomes retrievable. Check the tutorial on re-indexing flagged unsealed pieces for more information.
Deal Sector Copies section displays counters of your sectors state - whether you keep unsealed copies for all sectors or not. Ideally the SP should keep unsealed copies for all data that should be immediately retrievable.
Sector Proving State section displays counters of your active and inactive sectors - active sectors are those that are actively proven on-chain, inactive sectors are those that you might have failed to publish a WindowPoSt for, or are expired or removed.
This page explains how to re-index unsealed pieces flagged by the Piece Doctor in the Local Index Directory so that they are retrievable
Individual flagged unsealed pieces can be fixed directly from the Boost Web UI.
If the SP has a lot of flagged pieces, you can automate the re-indexing of pieces with the following commands:
piececids of all the flagged unsealed pieces from LID
piececid
This little script will fetch up to a thousand flagged pieces that are not processed yet, and will process them 4 at a time. Change the -P4 parameter to change the concurrency.
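The batching behaviour described above (at most a thousand flagged pieces, four at a time) can be sketched in Python as follows. This is not the script referred to above: `process_flagged` and the `reindex` callable are hypothetical names, and `reindex` stands in for whatever command or API call actually triggers re-indexing of one piece.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def process_flagged(
    piece_cids: Iterable[str],
    reindex: Callable[[str], bool],  # hypothetical: re-indexes one piece
    limit: int = 1000,               # "up to a thousand flagged pieces"
    workers: int = 4,                # plays the role of the -P4 concurrency
) -> Dict[str, bool]:
    """Re-index at most `limit` pieces, `workers` at a time, and return a
    mapping of piececid -> success flag."""
    batch = list(piece_cids)[:limit]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(reindex, batch))
    return dict(zip(batch, results))
```

Raising `workers` increases load on the backing database, so a conservative value is a reasonable starting point.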
This is a step by step guide to upgrade from Boost v2.0.0 to Boost v2.1.0
Pull the new stable release v2.1.0 or release candidate v2.1.0-rcx
Rebuild the binaries with the new version
Stop booster-http, booster-bitswap, boostd and boostd-data, in that order
Start the boostd-data service first. If you are using systemd service files to manage the process, then please start it manually without using the systemd files.
The Postgres <CONNECT_STRING> will need to be updated to not use SSL mode. Otherwise, you might see an error when connecting to the DB. The updated connect string should look like the one below:
postgresql://<username>:<password>@<yugabytedb>:5433?sslmode=disable
The PGX driver from Yugabyte supports cluster aware Postgres connection out of the box. If you are deploying a multi-node YugabyteDB cluster, then please update your connect-string to use a cluster aware connection.
With Cluster Mode: "postgresql://postgres:postgres@127.0.0.1:5433?load_balance=true"
With Cluster Mode + No SSL: "postgresql://postgres:postgres@127.0.0.1:5433?sslmode=disable&load_balance=true"
Once the boostd-data service starts, it will throw the below error and quit the process
If you do not see the above error and the process does not exit, then you do not require a migration. At this point, please skip to step 9.
Run the one-time migration.
Please make sure to use the minerID of an already connected miner. Other miners can only be connected once the migration is complete.
Once the migration is finished (a few seconds to 2 minutes), SPs using systemd to maintain the boostd-data service should also update the connect-string in their systemd service files.
Start the boostd-data process (or service) and start your boostd instance. Go to the UI and confirm that you can see your minerID on the top left side of the page and that the LID page is being populated correctly.
Start the booster-http and booster-bitswap services.
How to backup and restore Boost
Boost now supports both online and offline backups. The backup command will output a backup directory containing the following files.
metadata - contains backup of leveldb
boostd.db - backup of deals database
keystore - directory containing libp2p keys
token - API token
config - directory containing all config files and the config.toml link
storage.json - file containing storage details
The backup does not include deal logs or the Local Index Directory.
You can take an online backup with the below command
The online backup supports running only one instance at a time and you might see a locking error if another instance of backup is already running.
Shut down boostd before taking a backup
Take a backup using the command line
Make sure that the --boost-repo flag is set if you wish to restore to a custom location. Otherwise, it will be restored to the ~/.boost directory
Restore the boost repo using the command line
Advanced configurations you can tune to optimize your legacy deal onboarding
This section controls parameters for making storage and retrieval deals:
The final value of ExpectedSealDuration should equal (TIME_TO_SEAL_A_SECTOR + WaitDealsDelay) * 1.5. This equation ensures that the miner does not commit to having the sector sealed too soon.
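As a worked instance of the formula above, the sketch below computes ExpectedSealDuration from two durations. The function names and the sample values (6 hours to seal, 2 hours of WaitDealsDelay) are illustrative assumptions; the conversion to epochs assumes the 30-second Filecoin epoch.

```python
from datetime import timedelta

EPOCH_SECONDS = 30  # assumption: Filecoin epoch duration of 30 seconds


def expected_seal_duration(time_to_seal: timedelta,
                           wait_deals_delay: timedelta) -> timedelta:
    """(TIME_TO_SEAL_A_SECTOR + WaitDealsDelay) * 1.5"""
    return (time_to_seal + wait_deals_delay) * 1.5


def to_epochs(duration: timedelta) -> int:
    """Express a duration as a whole number of epochs."""
    return int(duration.total_seconds() // EPOCH_SECONDS)


if __name__ == "__main__":
    # Sample values only; measure your own sealing time.
    d = expected_seal_duration(timedelta(hours=6), timedelta(hours=2))
    print(d, "=", to_epochs(d), "epochs")
```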
StartEpochSealingBuffer allows lotus-miner to seal a sector before a certain epoch. For example: if the current epoch is 1000 and a deal within a sector must start on epoch 1500, then lotus-miner must wait until the current epoch is 1500 before it can start sealing that sector. However, if Boost sets StartEpochSealingBuffer to 500, then lotus-miner can start sealing the sector at epoch 1000.
If there are multiple deals in a sector, StartEpochSealingBuffer is based on the deal with the start time closest to the current epoch. So, if the sector in our example has three deals that start on epochs 1000, 1200, and 1400, then lotus-miner will start sealing the sector at epoch 500.
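The two examples above reduce to one small calculation: take the earliest deal start epoch in the sector and subtract the buffer. The sketch below only illustrates that arithmetic; the function name is hypothetical and this is not the lotus-miner implementation.

```python
from typing import Iterable


def sealing_start_epoch(deal_start_epochs: Iterable[int],
                        start_epoch_sealing_buffer: int) -> int:
    """Epoch at which sealing of the sector may begin: the earliest deal
    start epoch minus StartEpochSealingBuffer."""
    starts = list(deal_start_epochs)
    if not starts:
        raise ValueError("sector contains no deals")
    return min(starts) - start_epoch_sealing_buffer


if __name__ == "__main__":
    print(sealing_start_epoch([1500], 500))              # single deal
    print(sealing_start_epoch([1000, 1200, 1400], 500))  # three deals
```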
The PublishStorageDeals message can publish multiple deals in a single message. When a deal is ready to be published, Boost will wait up to PublishMsgPeriod for other deals to be ready before sending the PublishStorageDeals message.
However, once MaxDealsPerPublishMsg deals are ready, Boost will immediately publish all the deals.
For example, if PublishMsgPeriod is 1 hour:
At 1:00 pm, Deal 1 is ready to publish. Boost will wait until 2:00 pm for other deals to be ready before sending PublishStorageDeals.
At 1:30 pm, Deal 2 is ready to publish.
At 1:45 pm, Deal 3 is ready to publish.
At 2:00 pm, Boost publishes Deals 1, 2, and 3 in a single PublishStorageDeals message.
If MaxDealsPerPublishMsg is 2, then in the above example, when Deal 2 is ready to be published at 1:30 pm, Boost would immediately publish Deals 1 & 2 in a single PublishStorageDeals message. Deal 3 would be published in a subsequent PublishStorageDeals message.
If any of the deals in the PublishStorageDeals message fails validation upon execution, or if the start epoch has passed, all deals will fail to be published.
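The batching rules above can be simulated with a few lines of Python. This is a simplified model, not Boost's publisher: times are plain minutes since midnight, the function name is hypothetical, and it assumes that a new waiting period starts when the first deal of the next batch becomes ready.

```python
from typing import List, Tuple


def plan_publish_batches(ready: List[Tuple[int, int]],
                         period: int,
                         max_deals: int) -> List[Tuple[int, List[int]]]:
    """ready: (deal_id, ready_minute) pairs.
    Returns (publish_minute, [deal_ids]) for each PublishStorageDeals
    message under PublishMsgPeriod=period and
    MaxDealsPerPublishMsg=max_deals."""
    batches: List[Tuple[int, List[int]]] = []
    pending: List[int] = []
    deadline = 0
    for deal_id, minute in sorted(ready, key=lambda d: d[1]):
        # The waiting period expired before this deal became ready.
        if pending and minute > deadline:
            batches.append((deadline, pending))
            pending = []
        if not pending:
            deadline = minute + period  # first deal starts the timer
        pending.append(deal_id)
        # Enough deals are ready: publish immediately.
        if len(pending) >= max_deals:
            batches.append((minute, pending))
            pending = []
    if pending:
        batches.append((deadline, pending))
    return batches


if __name__ == "__main__":
    deals = [(1, 13 * 60), (2, 13 * 60 + 30), (3, 13 * 60 + 45)]
    print(plan_publish_batches(deals, period=60, max_deals=8))
    print(plan_publish_batches(deals, period=60, max_deals=2))
```

With the three deals from the example and a large MaxDealsPerPublishMsg, the model produces a single batch at minute 840 (2:00 pm). With a limit of 2 it produces one batch at minute 810 (1:30 pm) and a later batch holding Deal 3.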
Your use case might demand very precise and dynamic control over a combination of deal parameters.
Boost provides two IPC hooks allowing you to name a command to execute for every deal before the miner accepts it:
Filter for storage deals.
RetrievalFilter for retrieval deals.
The executed command receives a JSON representation of the deal parameters on standard input, and upon completion, its exit code is interpreted as:
0: success, proceed with the deal.
non-0: failure, reject the deal.
The most trivial filter rejecting any retrieval deal would be something like: RetrievalFilter = "/bin/false". /bin/false is a binary that immediately exits with a code of 1.
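A slightly richer filter than /bin/false could be written as a small script that reads the JSON from standard input and signals its decision through the exit code. The sketch below is illustrative only: the JSON keys "Client" and "StartEpoch", the deny list and the 7-day limit are hypothetical placeholders, so inspect the JSON your own Boost version actually sends before adapting it. The current epoch is derived assuming Filecoin mainnet (genesis at Unix time 1598306400, 30-second epochs). The "--run" argument is also an assumption of this sketch; it lets the file be imported for testing without blocking on standard input.

```python
#!/usr/bin/env python3
import json
import sys
import time

GENESIS_UNIX = 1598306400          # assumption: Filecoin mainnet genesis
EPOCH_SECONDS = 30                 # assumption: 30-second epochs
DENIED_CLIENTS = {"f1exampledeniedclient"}   # hypothetical deny list
MAX_START_DELAY_EPOCHS = 2880 * 7            # hypothetical: about 7 days


def current_epoch(now=None):
    """Approximate chain epoch from wall-clock time."""
    now = time.time() if now is None else now
    return int((now - GENESIS_UNIX) // EPOCH_SECONDS)


def decide(deal, epoch_now):
    """Return the exit code: 0 accepts the deal, non-zero rejects it."""
    if not isinstance(deal, dict):
        return 1
    client = deal.get("Client")        # hypothetical key
    start = deal.get("StartEpoch")     # hypothetical key
    if client is None or not isinstance(start, int):
        return 1  # reject anything we cannot evaluate
    if client in DENIED_CLIENTS:
        return 1
    if start - epoch_now > MAX_START_DELAY_EPOCHS:
        return 1  # deal starts too far in the future
    return 0


def main():
    try:
        deal = json.load(sys.stdin)
    except json.JSONDecodeError:
        return 1
    return decide(deal, current_epoch())


if __name__ == "__main__" and "--run" in sys.argv[1:]:
    sys.exit(main())
```

Rejecting on any parse problem keeps the filter fail-closed: a malformed or unexpected input never results in an accepted deal.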
As explained in the tutorial , in Boost v2 the Local Index Directory periodically checks all pieces that the SP stores and confirms if there exists an unsealed copy of the data and whether it is indexed. If the index is missing, the piece is flagged, meaning that the operator of the SP should fix it if they wish to make it retrievable.
Boost v2.1.0 supports . This would enable the SP to serve retrievals from all the connected miners via a single boostd, booster-bitswap or booster-http instance. To enable this feature, we need to update our existing LID tables to add minerID to the piece metadata.
If you wish to connect other miners to the upgraded LID service, then please follow the .
ExpectedSealDuration is an estimate of how long sealing will take and is used to reject deals whose start epoch might be earlier than the expected completion of sealing. It can be estimated by or by .
lets the miner deny specific clients and only accept deals that are set to start relatively soon.
You can also use a third party content policy framework like or bitscreen by Murmuration Labs:
If the client cannot connect to Boost running on a Storage provider, with an error similar to the following:
The problem is that:
The SP registered their peer id and address on chain, e.g. "Register the peer id 123abcd at address /ip4/123.456.12.345/tcp/1234"
The SP changed their peer id locally but didn't update the peer id on chain.
The client wants to make a storage deal with peer 123abcd. The client looks on chain for the address of peer 123abcd and sees peer 123abcd has registered an address /ip4/123.456.12.345/tcp/1234.
The client sends a deal proposal for peer 123abcd to the SP at address /ip4/123.456.12.345/tcp/1234.
The SP has changed their peer ID, so the SP responds to the deal proposal request with an error: peer id mismatch
To fix the problem, the SP should register the new peer id on chain:
Clients will not be able to connect to Boost running on a Storage Provider after an IP change. This happens because clients look up the registered peer id and address on chain for an SP. When an SP changes their IP or address locally, they must also update it on chain.
The SP should register the new address on chain using the following lotus-miner command
Please make sure to use the public IP and port of the Boost node, and not of the lotus-miner node, if your miner and boostd run on separate machines.
The on-chain address change requires access to the worker key, and thus the command lives in lotus-miner instead of Boost.
After migrating to Boost, the following error is seen when running lotus-miner info:
lotus-miner is making a call to the lotus-market process, which has been replaced by Boost, but lotus-miner is not aware of the new market process.
Export the MARKETS_API_INFO variable on your lotus-miner node.
Users might find log lines indicating client timeouts when reading or writing to the YugabyteDB's Cassandra API
The CQL timeouts can be caused by multiple issues:
YugabyteDB prerequisites are not met. This can be due to lack of hardware resources or incorrect hardware types.
Slow network connection between the boostd-data service and YugabyteDB.
A low client timeout value.
Multiple indexing operations running in parallel for large indices (tens of millions of records within the piece index).
YugabyteDB has not been scaled up to support the workload in case of extreme workloads.
Make sure that you don't have commp hash computations running locally for incoming deals while you also have YugabyteDB on the same node. YugabyteDB is CPU intensive; if another process (such as boostd commp computation) takes over its resources, you may see cql timeout errors.
We recommend verifying that all prerequisites for YugabyteDB are met.
Verify that you are running the latest version of Boost. If you are running an older version, we recommend that you upgrade to the latest stable version and check whether the issue persists.
Reduce the number of parallel indexing operations to 1 from the default of 4 by updating the value of ParallelAddIndexLimit in the config.toml file
Consider moving CommP computations to remote nodes, using the RemoteCommp flag in the config.toml file, if you are running YugabyteDB on the same host as boostd
Open a support ticket if the issue persists even after trying the above steps. Please make sure to include details about your YugabyteDB hardware and DEBUG logs from the boostd and boostd-data services.
This page explains how to start monitoring and accepting deals published on-chain on the FVM
With the release of FVM, it is now possible for smart contracts to make deal proposals on-chain. This is made possible through the DealProposal FRC.
DataDAOs, as well as other clients who want to store data on Filecoin, can now deploy a smart contract on the FVM which adheres to the DealProposal FRC, and make deal proposals that are visible to every storage provider who monitors the chain.
Boost already has support for the DealProposal FRC.
The code for FVM monitoring resides in the latest release of Boost. It should be used with caution in production. SPs must enable FEVM on the lotus daemon before proceeding to the next step.
In order to enable contract deals, you have to edit your config.toml and enable chain monitoring. By default it is disabled. Here is an example configuration:
The AllowlistContracts field can be left empty if you want to accept deals from any client. If you only want to accept deals from some clients, you can specify their contract addresses in the field.
The From field should be set to your SP's FEVM address. Some clients may implement a whitelist which allows specific SPs to accept deal proposals from their contract. This field will help those clients identify your SP and match it to their whitelist.
A contract publishes a DealProposalCreate event on the chain.
Boost monitors the chain for such events from all the clients by default. When such an event is detected, Boost fetches the data for the deal.
The deal is then run through the basic deal validation filters, such as checking that the client has enough funds, the SP has enough funds, etc.
Once the deal passes validation, Boost creates a new deal handler and passes this deal for execution like other Boost deals.
Direct data onboarding deals
Wait till the Boost UI is reachable at http://localhost and then open a terminal to the boost container
Setup notary and add balance to client market actor
Grant the datacap to the client
Local Index Directory dependencies
Local Index Directory depends on a backend database to store various indices. Currently we support two implementations - YugabyteDB or LevelDB - depending on the size of deal data and indices a storage provider holds.
LevelDB is an open source on-disk key-value store, and can be used when indices fit on a single host.
YugabyteDB is an open source modern distributed database designed to run in any public, private, hybrid or multi-cloud environment.
Storage providers who hold more than 1PiB data are encouraged to use YugabyteDB as it is horizontally scalable, provides better monitoring and management utilities and could support future growth.
What is data segment indexing and how it affects storage providers
A large majority of users onboard data onto the Filecoin network via an Aggregator, a third party combining small pieces of data into a singular large deal. Today the work done by aggregators is unverifiable and unprovable. The user relies on the Aggregator to perform the work correctly and at the same time, it is impossible to prove to a third party that a given piece of data was included in a deal which is a highly requested functionality for user-programmable data use cases.
FRC 58 enables the data aggregators to produce a Proof of Data Segment Inclusion certifying proper aggregation of Client's data. The produced proof assures:
an inclusion of Client's data within the on-chain deal
the Client's data can be trivially discovered within the deal to enable retrieval
malicious behaviour of an Aggregator or another user, whose data was aggregated, does not interfere with retrievability of Client's data
This is a critical link in enabling and exposing small pieces of data to the FEVM ecosystem. In the majority of cases, small pieces of data undergo an aggregation process, combining them into a large deal for acceptance by a Storage Provider. Without the proposed proof, data within aggregated deals becomes a second-class citizen in the Filecoin ecosystem. A significant portion of the F(E)VM use-case is enabling the ability to process and reason about the data stored by Filecoin Storage Providers. The Proof of Data Segment Inclusion makes it possible to apply this new capability to segments of data which are too small to be on-boarded in their own deals due to economic constraints.
After upgrading to Boost v2.1.0-rc1, users can build boostd using the branch feat/noncar-files. Once the new binary is used to start the boostd process, the feature is automatically enabled on the storage provider side.
The index attached at the end of the aggregated car files allows Boost to index the aggregated deals correctly. Once the deals are indexed, the client can retrieve any payload CID from that deal using one of the 3 available data transfer protocols.
Clients can use the mkpiece utility to generate an aggregated car file for deal making. The utility takes multiple car files and writes the resulting aggregated file to standard output.
Please note that each car file is padded up to the next power of two (2^n bytes). So, the resulting aggregated file can be much larger than the original car files.
Example:
car1 - 4.5 GiB - Padded to 8 GiB
car2 - 10 GiB - Padded to 16 GiB
car3 - 5 GiB - Padded to 8 GiB
Total car size = 4.5+10+5 = 19.5 GiB
Aggregated car size = 8+16+8 = 32 GiB
This aggregated file can be used to generate the piece CID and size for a Boost deal.
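The padding arithmetic in the example above can be reproduced with a short sketch. It models only the simplified rule stated here (each car file is rounded up to the next power of two and the padded sizes are summed); the function names are hypothetical and this is not what mkpiece runs internally.

```python
from typing import Iterable

GIB = 1024 ** 3


def padded_size(size_bytes: int) -> int:
    """Round a size up to the next power of two (unchanged if it already
    is one)."""
    if size_bytes <= 0:
        raise ValueError("size must be positive")
    return 1 << (size_bytes - 1).bit_length()


def aggregated_size(car_sizes: Iterable[int]) -> int:
    """Sum of the padded sizes of all car files in the aggregate."""
    return sum(padded_size(s) for s in car_sizes)


if __name__ == "__main__":
    cars = [int(4.5 * GIB), 10 * GIB, 5 * GIB]
    for size in cars:
        print(f"{size / GIB:g} GiB -> {padded_size(size) / GIB:g} GiB")
    print(f"aggregate: {aggregated_size(cars) / GIB:g} GiB")
```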
This page details the Boost SP daemon - boostd
The boostd executable runs as a daemon alongside a Lotus full node and Lotus miner. The daemon exposes a libp2p interface for storage and retrieval deals. It performs on-chain operations by making API calls to the Lotus node. The daemon hands off downloaded data to the Lotus miner for sealing via API calls to the Lotus miner.
boostd has a web interface for fund management and deal monitoring. The web interface is a React application that consumes a GraphQL interface exposed by the daemon.
The typical flow for a Storage Deal is:
The Client puts funds in escrow with the Storage Market Actor on chain.
The Client uploads a CAR file to a web server.
The Client sends a storage deal proposal to Boost with the URL of the CAR file.
Boost checks that the client has enough funds in escrow to pay for storing the file.
Boost accepts the storage deal proposal.
Boost downloads the CAR file from the web server.
Boost publishes the deal on chain.
The client checks that the deal was successfully published on chain.
Boost exposes a libp2p interface to listen for storage deal proposals from clients.
Boost communicates with the Lotus node over its JSON-RPC API for on-chain operations like checking client funds and publishing the deal.
Once the deal has been published, Boost hands off the downloaded file to lotus-miner for sealing.
This tutorial goes through all the steps required to make a storage deal with Boost on Filecoin.
The init command will output your new wallet address, and warn you that the market actor is not initialised.
Then you need to send funds to the wallet, and add funds to the market actor (in the example below we are adding 1 FIL).
You can use the boostx utilities to add funds to the market actor:
You can confirm that the market actor has funds by running boost init again.
Then you need to calculate the commp and piece size for the generated car file:
Place the generated car file on a public HTTP server, so that a storage provider can later fetch it.
Finally, trigger an online storage deal with a given storage provider:
This section describes how to upgrade your lotus-miner markets service to boostd v2.x.x
Migrating from a monolithic Lotus node or Lotus markets is a two-step process.
Frequently asked questions about Boost
Is there a way to stop the boostd daemon?
You can use the regular Unix OS signals
Is Boost compatible with the Lotus client? Can a client use lotus client deal to send a deal to Boost storage providers or do they have to use the boost client?
No. Boost no longer supports deals from lotus client. Boost will work with any client if the client uses deal making protocol /fil/storage/mk/1.2.0 or newer.
Can Boost make verified deals?
Yes, payments for deals can be made either from a regular wallet, or from DataCap. Deals that are paid for with DataCap are called verified deals.
Can I run both Boost and markets at the same time?
No, Boost replaces the legacy markets process.
booster-bitswap is a service which allows an SP to serve blocks and files over the Bitswap protocol.
Go to the following page for more information on booster-bitswap:
This page explains how to initialise LID and start using it to provide retrievals to clients
Considering that the Local Index Directory is a new feature, Storage Providers should initialise it after upgrading their Boost deployments.
There are two ways a Storage Provider can do that:
Migrate existing indices from the DAG store into LID: this solution assumes that the Storage Provider has been keeping an unsealed copy for every sector they prove on-chain, and has already indexed all their deal data into the DAG store.
Typically index sizes for a given sector range from 100KiB up to 1GiB, depending on the deal data and its block sizes. The DAG store keeps these indices in the repository directory of Boost under the ./dagstore/index and ./dagstore/datastore directories. This data should be migrated to LID with the migrate-lid utility.
Recreate indices for deal data based on unsealed copies of sectors: this solution assumes that the Storage Provider has unsealed copies for every sector they prove on-chain. If this is not the case, then the SP should first trigger an unseal (UNS) job on their system for every sector that contains user data and produce an unsealed copy.
SPs can use the boostd recover lid utility to produce an index for all deal data within an unsealed sector and store it in LID so that they enable retrievals for the data. Depending on the SP's deployment, where unsealed copies are hosted (NFS, Ceph, external disks, etc.) and the performance of the hosting system, producing an index for a 32GiB sector can take anywhere from a few seconds up to a few minutes, as the unsealed copy needs to be processed by the utility.
If you are migrating from Boost v1, make sure to read the tutorial:
This page details the Boost data service - boostd-data
The boostd-data service is a proxy to the underlying backend database that hosts the Local Index Directory (LID). Currently there is support for two stores:
LevelDB
YugabyteDB
We recommend creating a systemd file for this service and utilising it to easily start and stop the boostd-data service. Please note that this service is an independent process and is not controlled by any other Boost process.
The PGX driver from Yugabyte supports cluster aware Postgres connection out of the box. If you are deploying a multi-node YugabyteDB cluster, then please update your connect-string to use a cluster aware connection.
With Cluster Mode: "postgresql://postgres:postgres@127.0.0.1:5433?load_balance=true"
With Cluster Mode + No SSL: "postgresql://postgres:postgres@127.0.0.1:5433?sslmode=disable&load_balance=true"
--hosts takes the IP addresses of the YugabyteDB YT-Servers. Example:
--hosts=10.0.0.1 --hosts=10.0.0.2 --hosts=10.0.0.3
--addr is the <IP>:<PORT> that the boostd-data service should listen on. The IP here can be a private one (recommended) and should be reachable by all Boost-related processes. Please make sure to update your firewall configuration accordingly.
Boost exposes libp2p protocols so that clients can initiate storage deals with the SP
The client makes a deal proposal over v1.2.0 or v1.2.1 of the Propose Storage Deal Protocol: /fil/storage/mk/1.2.0 or /fil/storage/mk/1.2.1
It is a request / response protocol, where the request and response are CBOR-marshalled.
There are two new fields in the request of v1.2.1 of the protocol, described in the table below.
The client requests the status of a deal over v1.2.0 of the Storage Deal Status Protocol: /fil/storage/status/1.2.0
It is a request / response protocol, where the request and response are CBOR-marshalled.
First, you need to initialise a new Boost client and also set the endpoint for a public Filecoin node. In this example we are using
After that you need to generate a car file for the data you want to store on Filecoin, and note down its payload-cid.
We recommend using CLI to generate the car file.
(Please use Boost v1.7.5 to migrate)
Does Boost provide retrieval functionality? Yes, Boost currently provides 3 protocols for retrievals. By default, Boost has Graphsync retrieval enabled. SPs can offer Bitswap and HTTP retrievals by running booster-bitswap and booster-http respectively.
Does Boost client have retrieval functionality? Yes, Boost client supports retrieval over graphsync protocol. But we highly recommend, using client for Filecoin/IPFS retrievals.
| Field | Type | Description |
|---|---|---|
| DealUUID | uuid | A uuid for the deal specified by the client |
| IsOffline | boolean | Indicates whether the deal is online or offline |
| ClientDealProposal | ClientDealProposal | Same as <v1 proposal>.DealProposal |
| DealDataRoot | cid | The root cid of the CAR file. Same as <v1 proposal>.Piece.Root |
| Transfer.Type | string | eg "http" |
| Transfer.ClientID | string | Any id the client wants (useful for matching logs between client and server) |
| Transfer.Params | byte array | Interpreted according to Type. eg for "http" Transfer.Params contains the http headers as JSON |
| Transfer.Size | integer | The size of the data that is sent across the network |
| SkipIPNIAnnounce (v1.2.1) | boolean | Whether the provider should skip announcing the deal to IPNI (default: false) |
| RemoveUnsealedCopy (v1.2.1) | boolean | Whether the provider should remove the unsealed copy of the deal (default: false) |

| Field | Type | Description |
|---|---|---|
| Accepted | boolean | Indicates whether the deal proposal was accepted |
| Message | string | A message about why the deal proposal was rejected |

| Field | Type | Description |
|---|---|---|
| DealUUID | uuid | The uuid of the deal |
| Signature | | A signature over the uuid with the client's wallet |

| Field | Type | Description |
|---|---|---|
| DealUUID | uuid | The uuid of the deal |
| Error | string | Non-empty if there's an error getting the deal status |
| IsOffline | boolean | Indicates whether the deal is online or offline |
| TransferSize | integer | The total size of the transfer in bytes |
| NBytesReceived | integer | The number of bytes that have been downloaded |
| DealStatus.Error | string | Non-empty if the deal has failed |
| DealStatus.Status | string | The checkpoint that the deal has reached |
| DealStatus.Proposal | DealProposal | |
| SignedProposalCid | cid | cid of the client deal proposal + signature |
| PublishCid | cid | The cid of the publish message, if the deal has been published |
| ChainDealID | integer | The ID of the deal on chain, if it's been published |
Boost is introducing a new feature that allows computing commP during the deal on a lotus-worker node.
This should reduce the overall resource utilisation on the Boost node.
In order to enable remote commP on a Boost node, update your config.toml:
Then restart the Boost node
Boost configuration options with examples and description.
SealerApiInfo
"eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJBbGxvdyI6WyJyZwdyIiwid3JpdGUiLCJzaWduIiwiYWRtaW4iXX0.nbSvy11-tSUbXqo465hZqzTohGDfSdgh28C4irkmE10:/ip4/0.0.0.0/tcp/2345/http"
Miner API info passed during boost init. Requires admin permissions. Connect string for the miner/sealer instance API endpoint
SectorIndexApiInfo
"eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJBbGxvdyI6WyJyZwdyIiwid3JpdGUiLCJzaWduIiwiYWRtaW4iXX0.nbSvy11-tSUbXqo465hZqzTohGDfSdgh28C4irkmE10:/ip4/0.0.0.0/tcp/2345/http"
Miner API info passed during boost init. Requires admin permissions. Connect string for the miner/sealer instance API endpoint
ListenAddress
"/ip4/127.0.0.1/tcp/1288/http"
# Format: multiaddress Address Boost API will be listening on. No need to update unless you are planning to make API calls from outside the boost node
RemoteListenAddress
"0.0.0.0:1288"
Address boost API can reached at from outside. No need to update unless you are planning to make API calls from outside the boost node
Timeout
"30s"
RPC timeout value
ListenAddresses
# Format: multiaddress ["/ip4/209.94.92.3/tcp/24001"]
Binding address for the libp2p host - 0 means random port.
AnnounceAddresses
# Format: multiaddress ["/ip4/209.94.92.3/tcp/24001"]
Addresses to explicitly announce to other peers. If not specified, all interface addresses are announced. The on-chain address needs to be updated whenever this address changes: lotus-miner actor set-addrs /ip4/<YOUR_PUBLIC_IP_ADDRESS>/tcp/24001
NoAnnounceAddresses
# Format: multiaddress ["/ip4/209.94.92.3/tcp/24001"]
Addresses to not announce. Can be used if you want to announce addresses with exceptions
ConnMgrLow
150
ConnMgrLow is the number of connections that the basic connection manager will trim down to. Too low a number can cause frequent connectivity issues
ConnMgrHigh
200
ConnMgrHigh is the number of connections that, when exceeded, will trigger a connection GC operation. Note: protected/recently formed connections don't count towards this limit. A high limit can cause very high resource utilization
ConnMgrGrace
"20s"
ConnMgrGrace is the time duration during which new connections are immune from being closed by the connection manager.
ParallelFetchLimit
10
Upper bound on how many sectors can be fetched in parallel by the storage system at a time
The Dealmaking section handles deal-making configuration specifically for Boost deals that use the new /fil/storage/mk/1.2.0 protocol.
Miner
f032187
Miner ID
PublishStorageDeals
f3syzhufifmnbzcznoquhy4mlxo3byetqlamzbeijk62bjpoohrj3wiphkgxe3yjrlh5dmxlca3zqxp3yvd33a #BLS wallet address
This value is taken during init with --wallet-publish-storage-deals.
This wallet is used to send PublishDeal messages. It can be hosted on the remote daemon node and does not need to be present locally.
DealCollateral
f3syzhufifmnbzcznoquhy4mlxo3byetqlamzbeijk62bjpoohrj3wiphkgxe3yjrlh5dmxlca3zqxp3yvd33a #BLS wallet address
This value is taken during init with --wallet-deal-collateral.
This wallet is used to provide collateral for the deal. Funds from this wallet are moved to the market actor and locked for the duration of the deal.
It can be hosted on the remote daemon node and does not need to be present locally.
MaxPublishDealsFee
"0.05 FIL"
Maximum fee the user is willing to pay for a PublishDeal message
MaxMarketBalanceAddFee
"0.007 FIL"
The maximum fee to pay when sending the AddBalance message (used by legacy markets)
RootDir
Empty
If a custom value is specified, boost instance will refuse to start. This will be deprecated and removed in the future.
MaxConcurrentIndex
5
The maximum number of indexing jobs that can run simultaneously. 0 means unlimited.
MaxConcurrentReadyFetches
0
The maximum number of unsealed deals that can be fetched simultaneously from the storage subsystem. 0 means unlimited.
MaxConcurrentUnseals
0
The maximum number of unseals that can be processed simultaneously from the storage subsystem. 0 means unlimited.
MaxConcurrencyStorageCalls
100
The maximum number of simultaneous inflight API calls to the storage subsystem.
GCInterval
"1m0s"
The time between calls to periodic dagstore GC, in time.Duration string representation, e.g. 1m, 5m, 1h.
Enable
True/False
Enable or disable the index-provider subsystem
EntriesCacheCapacity
5
EntriesCacheCapacity sets the maximum capacity to use for caching the indexing advertisement entries. Defaults to 1024 if not specified. The cache is evicted using LRU policy. The maximum storage used by the cache is a factor of EntriesCacheCapacity, EntriesChunkSize and the length of multihashes being advertised.
EntriesChunkSize
0
EntriesChunkSize sets the maximum number of multihashes to include in a single entries chunk. Defaults to 16384 if not specified. Note that chunks are chained together for indexing advertisements that include more multihashes than the configured EntriesChunkSize.
TopicName
""
TopicName sets the topic name on which the changes to the advertised content are announced. If not explicitly specified, the topic name is automatically inferred from the network name in following format: '/indexer/ingest/'
PurgeCacheOnStart
false
PurgeCacheOnStart sets whether to clear any cached entries chunks when the provider engine starts. By default, the cache is rehydrated from previously cached entries stored in datastore if any is present.
GCInterval
"1m0s"
The time between calls to periodic dagstore GC, in time.Duration string representation, e.g. 1m, 5m, 1h.
Advertising 128-bit multihashes with the default EntriesCacheCapacity and EntriesChunkSize means the cache can grow to 256 MiB when full.
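The 256 MiB figure can be checked with a short calculation. The sketch below multiplies the documented defaults (1024 cached chunks, 16384 multihashes per chunk) by the size of a 128-bit multihash. It counts only the raw hash bytes and ignores any per-entry encoding or datastore overhead, so treat the result as an estimate rather than an exact bound.

```python
DEFAULT_ENTRIES_CACHE_CAPACITY = 1024   # documented default: number of cached chunks
DEFAULT_ENTRIES_CHUNK_SIZE = 16384      # documented default: multihashes per chunk


def max_cache_bytes(capacity: int, chunk_size: int, multihash_bits: int) -> int:
    """Estimate the cache size when full, counting raw multihash bytes only."""
    return capacity * chunk_size * (multihash_bits // 8)


size = max_cache_bytes(DEFAULT_ENTRIES_CACHE_CAPACITY, DEFAULT_ENTRIES_CHUNK_SIZE, 128)
print(f"{size} bytes = {size / 1024**2:.0f} MiB")  # 268435456 bytes = 256 MiB
```

The same function can be used to estimate the effect of changing either setting; for example, halving EntriesChunkSize halves the estimate.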
How to set up monitoring for Boost services
Boost provides multiple metrics for monitoring the services and APIs. All the metrics are emitted in Prometheus format and can be used to monitor and identify bottlenecks.
Apart from the Prometheus metrics, all Boost services also provide tracing spans. These spans can help you debug bottlenecks and slow sections of the execution. You can enable tracing using the --tracing flag.
By default, Boost ships a fully configured monitoring stack. This stack can be deployed on Docker and allows storage providers to get started with monitoring in less than 10 minutes. We highly recommend using this stack to monitor your Boost services unless you are familiar with how to set up monitoring manually and create dashboards.
How to store data on Filecoin with Boost as a client
Boost comes with a client executable, boost, that can be used to send a deal proposal to a Boost Storage Provider.
The client is intentionally minimal and meant for developer testing. It is not a full-featured client and is not intended to become one. It does not require a daemon process, and can be pointed at any public Filecoin API for on-chain operations. This means that users of the client do not need to run a Filecoin node that syncs the chain.
There are a number of public Filecoin APIs run by various organisations, such as Infura and Glif. For test purposes you can try:
export FULLNODE_API_INFO=https://api.node.glif.io
The init command:
Creates a Boost client repository (at ~/.boost-client by default)
Generates a libp2p peer ID key
Generates a wallet for on-chain operations and outputs the wallet address
To make deals you will need to: a) add funds to the wallet b) add funds to the market actor for that wallet address
Currently, we don't distribute binaries, so you will have to build from source.
When a storage provider accepts the deal, you should see output of the command similar to:
You can check the deal status with the following command:
The metrics endpoint for all Boost services is /metrics. The full list of metrics emitted by boostd, booster-http and booster-bitswap is published separately. The boostd-data (LID) metrics are separate from the other metrics and have their own list.
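For ad hoc checks without a full Prometheus setup, the text returned by a /metrics endpoint can be parsed directly. The sketch below is a deliberately small parser for the Prometheus text exposition format; it is not part of Boost. The metric names in the sample are invented for illustration, and the host and port to scrape depend on how each service is configured.

```python
def parse_metrics(text: str) -> dict:
    """Parse Prometheus text-format samples into {series: value}.

    Comment lines (# HELP, # TYPE) are skipped, and an optional trailing
    timestamp is ignored. The series key keeps its label set verbatim.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Labels may contain spaces, so split after the closing brace.
            idx = line.rindex("}") + 1
            series, rest = line[:idx], line[idx:]
        else:
            series, _, rest = line.partition(" ")
        fields = rest.split()
        samples[series] = float(fields[0])
    return samples


# Hypothetical sample; real metric names come from the service being scraped.
sample = """\
# HELP example_requests_total Example counter.
# TYPE example_requests_total counter
example_requests_total{path="/piece"} 3
example_queue_depth 1.5 1700000000000
"""
for series, value in parse_metrics(sample).items():
    print(series, value)
```

In practice the input text would come from an HTTP GET of the /metrics path on the service in question.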
The default URL for exporting traces is not correct outside of a Kubernetes or Docker environment. Users must set the --tracing-endpoint flag to the correct IP address or hostname of their tempo instance.
Step by step guide to various Boost tasks
This is a step-by-step guide on how to make verified DDO deals with Boost.
First, you need to initialise a new Boost client and also set the endpoint for a public Filecoin node. In this example we are using https://glif.io.
The init command will output your new wallet address, and warn you that the market actor is not initialised.
Now, you need to send some funds and Datacap to the wallet.
You can confirm that the market actor has funds and Datacap by running boost wallet list.
After that you need to generate a car file for the data you want to store on Filecoin, and note down its payload-cid.
We recommend using the go-car CLI to generate the car file.
Then you need to calculate the commp and piece size for the generated car file:
boostx generate-rand-car -c=50 -l=$links -s=5120000 .
Create a new verified allocation for this piece using the boost client. You can use other methods to create allocations as long as the piece details match the generated commP.
Import the piece for the newly created allocation using boostd.
Watch the boostd UI to verify that the new DDO deal reaches the "Complete" and "Claim Verified" states.
This configuration parameter allows SPs to send PSD (PublishStorageDeals) messages based on their own requirements.
By default boost publishes storage deals automatically once 8 deals are in the publish queue, or after 24 hours. However some SPs need to be able to control exactly which deals to publish and when. This new feature allows SPs to turn on manual PSD. Once it is turned on, Boost will no longer send any PSD messages unless explicitly prompted by the user. The feature can be turned on with the below config variable:
Deals that have not yet been published can be queried using Boost's GraphQL endpoint. The curl command for the query is:
curl -X POST -H "Content-Type: application/json" -d '{"query":"query { dealPublish{ ManualPSD Deals {ID IsLegacy ClientAddress ProviderAddress CreatedAt PieceCid PieceSize ProviderCollateral StartEpoch EndEpoch ClientPeerID PublishCid Transfer { Type Size Params ClientID} Message }}}"}' http://localhost:8080/graphql/query | jq
SPs can create a decision script that queries the GraphQL endpoint using the above curl command and decides which deals to publish.
Once the decision has been made, SPs can use the GraphQL endpoint to publish the deals. The mutation is:
publishPendingDeals(ids: [ID!]!): [ID!]!
It takes an array of the UUIDs of the deals to be published; an error message is returned if the request fails. The following curl command can be used to publish the deals.
curl -X POST -H "Content-Type: application/json" -d '{"query":"mutation {publishPendingDeals(ids: [\"d9f849f1-d5d8-4bfc-b034-2866bddfc8cb\", \"a8eb58ef-7381-4251-ae7a-1227c032c0b9\"])}"}' http://localhost:8080/graphql/query | jq
Successful output
Failure output
The "Publish all" button in the UI keeps the same functionality as before. The same is true for the GraphQL mutation dealPublishNow.
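A decision script along these lines could be sketched as follows. The endpoint URL, the dealPublish query fields and the publishPendingDeals mutation are taken from the curl examples above; the allow-list policy, the function names and the assumption that results arrive under the standard GraphQL "data" key are illustrative only, and the sketch does no error handling.

```python
import json
import urllib.request

GRAPHQL_URL = "http://localhost:8080/graphql/query"  # as used in the curl examples

# A subset of the fields from the documented dealPublish query.
PENDING_QUERY = "query { dealPublish { Deals { ID ClientAddress PieceSize StartEpoch } } }"


def build_publish_mutation(ids):
    """Build the request body for publishPendingDeals from a list of deal UUIDs."""
    quoted = ", ".join(json.dumps(i) for i in ids)
    return {"query": f"mutation {{ publishPendingDeals(ids: [{quoted}]) }}"}


def select_deals(deals, allowed_clients):
    """Hypothetical policy: publish only deals from an allow-list of client addresses."""
    return [d["ID"] for d in deals if d["ClientAddress"] in allowed_clients]


def post_graphql(payload, url=GRAPHQL_URL):
    """POST a GraphQL request body as JSON and return the decoded response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def publish_allowed(allowed_clients):
    """Query the pending deals, apply the policy, and publish the selected ones."""
    pending = post_graphql({"query": PENDING_QUERY})["data"]["dealPublish"]["Deals"]
    ids = select_deals(pending, allowed_clients)
    return post_graphql(build_publish_mutation(ids)) if ids else None
```

Calling publish_allowed({"f1..."}) with the set of client addresses you want to serve would run one query-decide-publish cycle; a real script would typically run this on a schedule and log the mutation response.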
Storage providers might demand very precise and dynamic control over a combination of deal parameters.
Boost, similarly to Lotus, provides two IPC hooks allowing you to name a command to execute for every deal before the storage provider accepts it:
Filter for storage deals.
RetrievalFilter for retrieval deals.
The executed command receives a JSON representation of the deal parameters, as well as the current state of the sealing pipeline, on standard input, and upon completion, its exit code is interpreted as:
0: success, proceed with the deal.
non-0: failure, reject the deal.
The most trivial filter rejecting any retrieval deal would be something like: RetrievalFilter = "/bin/false". /bin/false is a binary that immediately exits with a code of 1.
This Perl script lets the miner deny specific clients and only accept deals that are set to start relatively soon.
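The same kind of policy can also be sketched in Python. The version below is an illustration, not the script referred to above: it assumes the JSON on standard input nests the proposal under DealParams.ClientDealProposal.Proposal with Client and StartEpoch fields, assumes Filecoin mainnet timing (genesis at Unix time 1598306400, 30-second epochs), and uses a made-up deny-list and start window. Check the field paths against the actual filter input before relying on it.

```python
import json
import sys
import time

GENESIS_UNIX = 1598306400            # Filecoin mainnet genesis (assumed network)
EPOCH_SECONDS = 30                   # mainnet epoch duration
DENIED_CLIENTS = {"f1exampledeniedclient"}   # hypothetical deny-list
MAX_START_DELAY_SECONDS = 60 * 3600  # hypothetical: deal must start within 60 hours


def decide(deal: dict, now: float):
    """Return (exit_code, reason): 0 accepts the deal, non-zero rejects it."""
    # Assumed location of the proposal inside the filter input.
    proposal = deal["DealParams"]["ClientDealProposal"]["Proposal"]

    if proposal["Client"] in DENIED_CLIENTS:
        return 1, "deals from this client are not accepted"

    start_unix = GENESIS_UNIX + proposal["StartEpoch"] * EPOCH_SECONDS
    if start_unix - now > MAX_START_DELAY_SECONDS:
        return 1, "deal start time is too far in the future"

    return 0, "accepted"


def main(stdin=sys.stdin) -> int:
    """Read the deal JSON from stdin, print the reason, and return the exit code."""
    code, reason = decide(json.load(stdin), time.time())
    print(reason)
    return code

# When installed as a Filter command, end the script with: sys.exit(main())
```

Keeping the decision in a pure function (decide) makes the policy easy to test against saved filter inputs before pointing Filter at the script.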
You can also use a third-party content policy framework like bitscreen by Murmuration Labs, or CID gravity:
Here is a sample JSON representation of the input sent to the deal filter:
How to get help for Boost
You can report any issues or bugs here.
If you are having trouble, check the Troubleshooting page for common problems and solutions.
If you have a question or require support, please open a support ticket or join the Filecoin Slack and ask for support in #boost-help.
You can also start a discussion about new features and improvement ideas for Boost.
Configure to publish IPNI announcements over HTTP
IndexProvider.HttpPublisher.AnnounceOverHttp must be set to true to enable HTTP announcements. Once HTTP announcements are enabled, the local index provider will continue to announce over libp2p gossipsub, in addition to announcing over HTTP to the specified indexers.
The advertisements are sent to the indexer nodes defined in DirectAnnounceURLs. You can specify more than one URL to announce to multiple indexer nodes.
Once an IPNI node starts processing the advertisements, it will reach out to the Boost node to fetch the data. The Boost node therefore needs to specify a public IP and port that the indexer node can use to query for data.
This section covers the current experimental features available in Boost
Boost is developing new market features on a regular basis as part of the overall market development. This section covers the experimental features released by Boost, along with details on how to use them.
It is not recommended to run experimental features in production environments. The features should be tested as per your requirements, and any issues or requests should be reported to the team via GitHub or Slack.
Once the new features have been tested and vetted, they will be released as part of a stable Boost release and all documentation concerning those features will be moved to an appropriate section of this site.
Current experimental features are listed below.