Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Chain Integration + Minor changes to Chain Integration Overview #725

Merged
merged 16 commits into from
Aug 8, 2024
2 changes: 1 addition & 1 deletion website/pages/en/_meta.js
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,7 @@ export default {
'operating-graph-node': '',
'chain-integration-overview': '',
'supported-network-requirements': '',
'new-chain-integration': 'Integrating New Networks',
'new-chain-integration': '',
firehose: '',
graphcast: '',
'mips-faqs': '',
Expand Down
6 changes: 3 additions & 3 deletions website/pages/en/chain-integration-overview.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -6,12 +6,12 @@ A transparent and governance-based integration process was designed for blockcha

## Stage 1. Technical Integration

- Teams work on a Graph Node integration and Firehose for non-EVM based chains. [Here's how](/new-chain-integration/).
- Please visit [New Chain Integration](/new-chain-integration) for information on `graph-node` support for new chains.
benface marked this conversation as resolved.
Show resolved Hide resolved
- Teams initiate the protocol integration process by creating a Forum thread [here](https://forum.thegraph.com/c/governance-gips/new-chain-support/71) (New Data Sources sub-category under Governance & GIPs). Using the default Forum template is mandatory.

## Stage 2. Integration Validation

- Teams collaborate with core developers, Graph Foundation and operators of GUIs and network gateways, such as [Subgraph Studio](https://thegraph.com/studio/), to ensure a smooth integration process. This involves providing the necessary backend infrastructure, such as the integrating chain's JSON RPC or Firehose endpoints. Teams wanting to avoid self-hosting such infrastructure can leverage The Graph's community of node operators (Indexers) to do so, which the Foundation can help with.
- Teams collaborate with core developers, Graph Foundation and operators of GUIs and network gateways, such as [Subgraph Studio](https://thegraph.com/studio/), to ensure a smooth integration process. This involves providing the necessary backend infrastructure, such as the integrating chain's JSON-RPC, Firehose or Substreams endpoints. Teams wanting to avoid self-hosting such infrastructure can leverage The Graph's community of node operators (Indexers) to do so, which the Foundation can help with.
- Graph Indexers test the integration on The Graph's testnet.
- Core developers and Indexers monitor stability, performance, and data determinism.

Expand All @@ -38,7 +38,7 @@ This process is related to the Subgraph Data Service, applicable only to new Sub

This would only impact protocol support for indexing rewards on Substreams-powered subgraphs. The new Firehose implementation would need testing on testnet, following the methodology outlined for Stage 2 in this GIP. Similarly, assuming the implementation is performant and reliable, a PR on the [Feature Support Matrix](https://github.com/graphprotocol/indexer/blob/main/docs/feature-support-matrix.md) would be required (`Substreams data sources` Subgraph Feature), as well as a new GIP for protocol support for indexing rewards. Anyone can create the PR and GIP; the Foundation would help with Council approval.

### 3. How much time will this process take?
### 3. How much time will the process of reaching full protocol support take?

The time to mainnet is expected to be several weeks, varying based on the time of integration development, whether additional research is required, testing and bug fixes, and, as always, the timing of the governance process that requires community feedback.

Expand Down
121 changes: 77 additions & 44 deletions website/pages/en/new-chain-integration.mdx
Original file line number Diff line number Diff line change
@@ -1,75 +1,108 @@
---
title: Integrating New Networks
title: New Chain Integration
Giuliano-1 marked this conversation as resolved.
Show resolved Hide resolved
---

Graph Node can currently index data from the following chain types:
Chains can bring subgraph support to their ecosystem by starting a new `graph-node` integration. Subgraphs are a powerful indexing tool opening a world of possibilities for developers. Graph Node already indexes data from the chains listed here. If you are interested in a new integration, there are 2 integration strategies:

- Ethereum, via EVM JSON-RPC and [Ethereum Firehose](https://github.com/streamingfast/firehose-ethereum)
- NEAR, via a [NEAR Firehose](https://github.com/streamingfast/near-firehose-indexer)
- Cosmos, via a [Cosmos Firehose](https://github.com/graphprotocol/firehose-cosmos)
- Arweave, via an [Arweave Firehose](https://github.com/graphprotocol/firehose-arweave)
1. **EVM JSON-RPC**
2. **Firehose**: All Firehose integration solutions include Substreams, a large-scale streaming engine based off Firehose with native `graph-node` support, allowing for parallelized transforms.

If you are interested in any of those chains, integration is a matter of Graph Node configuration and testing.
> Note that while the recommended approach is to develop a new Firehose for all new chains, it is only required for non-EVM chains.

If you are interested in a different chain type, a new integration with Graph Node must be built. Our recommended approach is developing a new Firehose for the chain in question and then the integration of that Firehose with Graph Node. More info below.
## Integration Strategies

**1. EVM JSON-RPC**
### 1. EVM JSON-RPC

If the blockchain is EVM equivalent and the client/node exposes the standard EVM JSON-RPC API, Graph Node should be able to index the new chain. For more information, refer to [Testing an EVM JSON-RPC](new-chain-integration#testing-an-evm-json-rpc).
If the blockchain is EVM equivalent and the client/node exposes the standard EVM JSON-RPC API, Graph Node should be able to index the new chain.

**2. Firehose**
#### Testing an EVM JSON-RPC

For non-EVM-based chains, Graph Node must ingest blockchain data via gRPC and known type definitions. This can be done via [Firehose](firehose/), a new technology developed by [StreamingFast](https://www.streamingfast.io/) that provides a highly-scalable indexing blockchain solution using a files-based and streaming-first approach. Reach out to the [StreamingFast team](mailto:[email protected]/) if you need help with Firehose development.
For Graph Node to be able to ingest data from an EVM chain, the RPC node must expose the following EVM JSON-RPC methods:

## Difference between EVM JSON-RPC & Firehose
- `eth_getLogs`
- `eth_call` (for historical blocks, with EIP-1898 - requires archive node)
- `eth_getBlockByNumber`
- `eth_getBlockByHash`
- `net_version`
- `eth_getTransactionReceipt`, in a JSON-RPC batch request
- `trace_filter` *(optionally required for Graph Node to support call handlers)*

While the two are suitable for subgraphs, a Firehose is always required for developers wanting to build with [Substreams](substreams/), like building [Substreams-powered subgraphs](cookbook/substreams-powered-subgraphs/). In addition, Firehose allows for improved indexing speeds when compared to JSON-RPC.
### 2. Firehose Integration

New EVM chain integrators may also consider the Firehose-based approach, given the benefits of substreams and its massive parallelized indexing capabilities. Supporting both allows developers to choose between building substreams or subgraphs for the new chain.
[Firehose](https://firehose.streamingfast.io/firehose-setup/overview) is a next-generation extraction layer. It collects history in flat files and streams in real time. Firehose technology replaces those polling API calls with a stream of data utilizing a push model that sends data to the indexing node faster. This helps increase the speed of syncing and indexing.

> **NOTE**: A Firehose-based integration for EVM chains will still require Indexers to run the chain's archive RPC node to properly index subgraphs. This is due to the Firehose's inability to provide smart contract state typically accessible by the `eth_call` RPC method. (It's worth reminding that eth_calls are [not a good practice for developers](https://thegraph.com/blog/improve-subgraph-performance-reduce-eth-calls/))
The primary method to integrate the Firehose into chains is to use an RPC polling strategy. Our polling algorithm will predict when a new block will arrive and increase the rate at which it checks for a new block near that time, making it a very low-latency and efficient solution. For help with the integration and maintenance of the Firehose, contact the [StreamingFast team](https://www.streamingfast.io/firehose-integration-program). New chains and their integrators will appreciate the [fork awareness](https://substreams.streamingfast.io/documentation/consume/reliability-guarantees) and massive parallelized indexing capabilities that Firehose and Substreams bring to their ecosystem.

---
> NOTE: All integrations done by the StreamingFast team include maintenance for the Firehose replication protocol into the chain's codebase. StreamingFast tracks any changes and releases binaries when you change code and when StreamingFast changes code. This includes releasing Firehose/Substreams binaries for the protocol, maintaining Substreams modules for the block model of the chain, and releasing binaries for the blockchain node with instrumentation if need be.

## Testing an EVM JSON-RPC
#### Specific Firehose Instrumentation for EVM (`geth`) chains

For Graph Node to be able to ingest data from an EVM chain, the RPC node must expose the following EVM JSON RPC methods:
For EVM chains, there exists a deeper level of data that can be achieved through the `geth` [live-tracer](https://github.com/ethereum/go-ethereum/releases/tag/v1.14.0), a collaboration between Go-Ethereum and StreamingFast, in building a high-throughput and rich transaction tracing system. The Live Tracer is the most comprehensive solution, resulting in [Extended](https://streamingfastio.medium.com/new-block-model-to-accelerate-chain-integration-9f65126e5425) block details. This enables new indexing paradigms, like pattern matching of events based on state changes, calls, parent call trees, or triggering of events based on changes to the actual variables in a smart contract.

- `eth_getLogs`
- `eth_call` \_(for historical blocks, with EIP-1898 - requires archive node):
- `eth_getBlockByNumber`
- `eth_getBlockByHash`
- `net_version`
- `eth_getTransactionReceipt`, in a JSON-RPC batch request
- _`trace_filter`_ _(optionally required for Graph Node to support call handlers)_
![Base block vs Extended block](/img/extended-vs-base-substreams-blocks.png)

> NOTE: This improvement upon the Firehose requires chains make use of the EVM engine `geth version 1.13.0` and up.

### Graph Node Configuration
## EVM considerations - Difference between JSON-RPC & Firehose

**Start by preparing your local environment**
While the JSON-RPC and Firehose are both suitable for subgraphs, a Firehose is always required for developers wanting to build with [Substreams](https://substreams.streamingfast.io). Supporting Substreams allows developers to build [Substreams-powered subgraphs](/cookbook/substreams-powered-subgraphs) for the new chain, and has the potential to improve the performance of your subgraphs. Additionally, Firehose — as a drop-in replacement for the JSON-RPC extraction layer of `graph-node` — reduces by 90% the number of RPC calls required for general indexing.

- All those `getLogs` calls and roundtrips get replaced by a single stream arriving into the heart of `graph-node`; a single block model for all subgraphs it processes.

> NOTE: A Firehose-based integration for EVM chains will still require Indexers to run the chain's archive RPC node to properly index subgraphs. This is due to the Firehose's inability to provide smart contract state typically accessible by the `eth_call` RPC method. (It's worth reminding that `eth_calls` are not a good practice for developers)

## Graph Node Configuration

Configuring Graph Node is as easy as preparing your local environment. Once your local environment is set, you can test the integration by locally deploying a subgraph.

1. [Clone Graph Node](https://github.com/graphprotocol/graph-node)
2. Modify [this line](https://github.com/graphprotocol/graph-node/blob/master/docker/docker-compose.yml#L22) to include the new network name and the EVM JSON RPC compliant URL
> Do not change the env var name itself. It must remain `ethereum` even if the network name is different.
3. Run an IPFS node or use the one used by The Graph: https://api.thegraph.com/ipfs/
2. Modify [this line](https://github.com/graphprotocol/graph-node/blob/master/docker/docker-compose.yml#L22) to include the new network name and the EVM JSON-RPC compliant URL

> Do not change the env var name itself. It must remain `ethereum` even if the network name is different.

3. Run an IPFS node or use the one used by The Graph: https://api.thegraph.com/ipfs/

**Test the integration by locally deploying a subgraph**
### Testing an EVM JSON-RPC by locally deploying a subgraph

1. Install [graph-cli](https://github.com/graphprotocol/graph-tooling/tree/main/packages/cli)
1. Install [graph-cli](https://github.com/graphprotocol/graph-cli)
2. Create a simple example subgraph. Some options are below:
1. The pre-packed [Gravitar](https://github.com/graphprotocol/example-subgraph/tree/f89bdd4628efa4badae7367d4919b3f648083323) smart contract and subgraph is a good starting point
2. Bootstrap a local subgraph from any existing smart contract or solidity dev environment [using Hardhat with a Graph plugin](https://github.com/graphprotocol/hardhat-graph)
3. Adapt the resulting `subgraph.yaml` by changing `dataSources.network` to the same name previously passed on to Graph Node.
4. Create your subgraph in Graph Node: `graph create $SUBGRAPH_NAME --node $GRAPH_NODE_ENDPOINT`
5. Publish your subgraph to Graph Node: `graph deploy $SUBGRAPH_NAME --ipfs $IPFS_ENDPOINT --node $GRAPH_NODE_ENDPOINT`
1. The pre-packed [Gravitar](https://github.com/graphprotocol/example-subgraph/tree/f89bdd4628efa4badae7367d4919b3f648083323) smart contract and subgraph is a good starting point
2. Bootstrap a local subgraph from any existing smart contract or solidity dev environment [using Hardhat with a Graph plugin](https://github.com/graphprotocol/hardhat-graph)
3. Adapt the resulting `subgraph.yaml` by changing `dataSources.network` to the same name previously passed on to Graph Node.
4. Create your subgraph in Graph Node: `graph create $SUBGRAPH_NAME --node $GRAPH_NODE_ENDPOINT`
5. Publish your subgraph to Graph Node: `graph deploy $SUBGRAPH_NAME --ipfs $IPFS_ENDPOINT --node $GRAPH_NODE_ENDPOINT`

Graph Node should be syncing the deployed subgraph if there are no errors. Give it time to sync, then send some GraphQL queries to the API endpoint printed in the logs.

---
## Substreams-powered Subgraphs

For StreamingFast-led Firehose/Substreams integrations, basic support for foundational Substreams modules (e.g. decoded transactions, logs and smart-contract events) and Substreams-powered subgraph codegen tools are included (check out [Injective](https://substreams.streamingfast.io/documentation/intro-getting-started/intro-injective/injective-first-sps) for an example).

There are two options to consume Substreams data through a subgraph:

- **Using Substreams triggers:** Consume from any Substreams module by importing the Protobuf model through a subgraph handler and move all your logic into a subgraph. This method creates the subgraph entities directly in the subgraph.
- **Using EntityChanges:** By writing more of the logic into Substreams, you can consume the module's output directly into `graph-node`. In `graph-node`, you can use the Substreams data to create your subgraph entities.

It is really a matter of where you put your logic, in the subgraph or the Substreams. Keep in mind that having more of your logic in Substreams benefits from a parallelized model, whereas triggers will be linearly consumed in `graph-node`. Consider the following example implementing a subgraph handler:

```ts
export function handleTransactions(bytes: Uint8Array): void {
let transactions = assembly.eth.transaction.v1.Transactions.decode(bytes.buffer).trasanctions // 1.
if (transactions.length == 0) {
log.info('No transactions found', [])
return
}

## Integrating a new Firehose-enabled chain
for (let i = 0; i < transactions.length; i++) {
// 2.
let transaction = transactions[i]

Integrating a new chain is also possible using the Firehose approach. This is currently the best option for non-EVM chains and a requirement for substreams support. Additional documentation focuses on how Firehose works, adding Firehose support for a new chain and integrating it with Graph Node. Recommended docs for integrators:
let entity = new Transaction(transaction.hash) // 3.
entity.from = transaction.from
entity.to = transaction.to
entity.save()
}
}
```

1. [General docs on Firehose](firehose/)
2. [Adding Firehose support for a new chain](https://firehose.streamingfast.io/integrate-new-chains/integration-overview)
3. [Integrating Graph Node with a new chain via Firehose](https://github.com/graphprotocol/graph-node/blob/master/docs/implementation/add-chain.md)
The `handleTransactions` function is a subgraph handler that receives the raw Substreams bytes as parameter and decodes them into a `Transactions` object. Then, for every transaction, a new subgraph entity is created. For more information about Substreams triggers, visit the [StreamingFast documentation](https://substreams.streamingfast.io/documentation/consume/subgraph/triggers) or check out community modules at [substreams.dev](https://substreams.dev/).
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading