submerge.io
Open-Source Indexer, Analysis, KYT, and AML Compliance Platform for the Polkadot Ecosystem
Proponent: HELIKON (15fT...yBzL)
Beneficiary: HELIKON (15fT...yBzL)
Date: September 10, 2024
Request: $180,000.00 USDT for Milestone 1 of 2
Summary: An open-source, high-performance, high-availability multi-chain indexer, scanner, data analysis, real-time analytics, Know Your Transaction (KYT) and Anti-Money Laundering (AML) compliance platform for the Polkadot ecosystem.
Category: Software Development
Proposal Document: Google Docs, submerge.io
PDF IPFS CID: bafyb...xsuye
CID Remark: 22499544-2 (0xe23f0...5a5c7)
📢 Please take some time to view The Kusamarian's excellent coverage of the Submerge proposal at https://x.com/TheKusamarian/status/1834321085980180928.
1. Introduction
Submerge is an open-source data platform designed and developed by Polkadot natives. It addresses several critical gaps in the Polkadot ecosystem for developers and users:
- Lack of indexer/explorer/scanner alternatives.
- Limited open-access data analysis and analytics tools.
- Insufficient KYT integration and intelligence gathering.
- No AML compliance research, development, and tooling.
- Insufficient real-time visualization of entities, relationships and dynamics.
Submerge is a continuation of Helikon’s previous work, building on the foundation of:
- SubVT (indexing, ETL, data services, notifications, UI/UX design, mobile development),
- Chainviz (data visualization, UI/UX design, web development),
- and followthedot.live (indexing, graph visualization).
This proposal seeks funding for Milestone 1 of the project, which will deliver core functionalities over the 3 months following the approval of the proposal.
2. Problem Statement
A. Vendor Lock-In: Polkadot primarily relies on Subscan as the only fully functional scanner, indexer, and explorer. Alternatives such as Polkascan, Polkastats, and Polkaholic are either discontinued or unmaintained, while Statescan provides limited functionality.
B. Limited Open-Access Data Analysis and Analytics Tools: Although the Parity Data Dashboards and the Dune integration expanded insights into the ecosystem networks, a comprehensive data platform is still needed. Such a platform would enable various algorithms (clustering, community detection, links, anomaly detection, autoencoders, association mining) to be applied to indexed data, offering deeper insights into account behaviors and network dynamics. Currently, no platform provides comprehensive insights into fluid network dynamics through 2D or 3D real-time and historical visualizations.
C. Insufficient KYT Integration and Intelligence Gathering: Currently, there is no open-access KYT and screening solution available for institutional and individual DOT holders. While Merkle Science offers its account attribution API to Subscan and various wallets, the ecosystem needs a dedicated application to meet its attribution, analysis, KYT, and screening requirements.
D. No AML Compliance Research, Development, and Tooling: There is no comprehensive effort in AML research, compliance or tooling development within the Polkadot ecosystem. This limits the network's ability to detect and prevent suspicious (financial) activities. A reliable solution is crucial for ensuring long-term trust and regulatory compliance.
3. Proposed Solution
Submerge proposes a comprehensive data processing platform to address the challenges identified in the Problem Statement section.
A. Submerge Indexer: We will leverage our expertise from developing the SubVT Indexer to build a high-performance, high-availability, open-source multi-chain indexer in Rust. The indexer will fetch blockchain data from RPC archive nodes and feed it into the ETL layer, which in turn will store the data in OLAP, OLTP, and in-memory databases to meet the availability and performance requirements of different use cases.
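The data flow described above can be illustrated with a minimal sketch. It assumes a mock archive node and three toy in-process stores standing in for the OLTP, OLAP, and in-memory roles; `BlockSource`, `Sink`, `MockArchiveNode`, and the store types are hypothetical names introduced only for illustration and do not reflect the actual Submerge design.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily simplified block record.
#[derive(Debug, Clone, PartialEq)]
struct Block {
    number: u64,
    extrinsic_count: u32,
}

/// Stand-in for an RPC archive node client (illustrative, not a real API).
trait BlockSource {
    fn fetch_block(&self, number: u64) -> Option<Block>;
}

/// Stand-in for one storage backend fed by the ETL layer.
trait Sink {
    fn store(&mut self, block: &Block);
}

/// Mock node that serves blocks 0..=tip with made-up contents.
struct MockArchiveNode {
    tip: u64,
}

impl BlockSource for MockArchiveNode {
    fn fetch_block(&self, number: u64) -> Option<Block> {
        if number > self.tip {
            return None;
        }
        Some(Block { number, extrinsic_count: (number % 5) as u32 + 1 })
    }
}

/// Keyed row store, playing the OLTP role.
#[derive(Default)]
struct RowStore {
    blocks: HashMap<u64, Block>,
}

impl Sink for RowStore {
    fn store(&mut self, block: &Block) {
        self.blocks.insert(block.number, block.clone());
    }
}

/// Running aggregates, playing the OLAP role.
#[derive(Default)]
struct AggregateStore {
    blocks_seen: u64,
    total_extrinsics: u64,
}

impl Sink for AggregateStore {
    fn store(&mut self, block: &Block) {
        self.blocks_seen += 1;
        self.total_extrinsics += u64::from(block.extrinsic_count);
    }
}

/// Latest-block cache, playing the in-memory role.
#[derive(Default)]
struct LatestCache {
    latest: Option<Block>,
}

impl Sink for LatestCache {
    fn store(&mut self, block: &Block) {
        self.latest = Some(block.clone());
    }
}

/// Fetches blocks in order and fans each one out to every sink.
/// Stops at the first block the source cannot serve; returns the count indexed.
fn index_range(source: &dyn BlockSource, sinks: &mut [&mut dyn Sink], from: u64, to: u64) -> u64 {
    let mut indexed = 0;
    for number in from..=to {
        match source.fetch_block(number) {
            Some(block) => {
                for sink in sinks.iter_mut() {
                    sink.store(&block);
                }
                indexed += 1;
            }
            None => break,
        }
    }
    indexed
}

fn main() {
    let node = MockArchiveNode { tip: 9 };
    let mut rows = RowStore::default();
    let mut aggregates = AggregateStore::default();
    let mut cache = LatestCache::default();
    let mut sinks: [&mut dyn Sink; 3] = [&mut rows, &mut aggregates, &mut cache];
    let indexed = index_range(&node, &mut sinks, 0, 20);
    println!("indexed {} blocks, {} extrinsics", indexed, aggregates.total_extrinsics);
    println!("rows stored: {}, latest: {:?}", rows.blocks.len(), cache.latest);
}
```

Keeping the stores behind one trait means a new backend can be added without touching the fetch loop; a production indexer would additionally need batching, retries, and handling of chain reorganizations, none of which are shown here.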
We will implement a block scanner/explorer web application to present the complete blockchain data to the users. It will provide an interface to all Submerge services and account management functionalities (visualizations, Analyst, Sentinel, settings, credits, API keys, etc.).
B. Submerge Analyst and Real-Time Analytics: The Analyst component will host data analysis algorithms that run on the indexed data, allowing users to discover clusters, relationships, associations, and anomalies in the blockchain data. The component will also run graph algorithms for deep, network-wide and inter-network analytics. The Real-Time Analytics component will monitor newly produced blocks and run analysis algorithms on state data, allowing users to gain real-time insights and receive notifications.
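As one illustration of the kind of clustering the Analyst could run, the sketch below groups accounts into connected components of the transfer graph using a union-find structure. The proposal does not specify which algorithms will be implemented, so this choice is an assumption; `DisjointSet` and `cluster_accounts` are hypothetical names, and the account labels are made up.

```rust
use std::collections::HashMap;

/// Disjoint-set (union-find) over account indices.
struct DisjointSet {
    parent: Vec<usize>,
    size: Vec<usize>,
}

impl DisjointSet {
    fn new(n: usize) -> Self {
        Self { parent: (0..n).collect(), size: vec![1; n] }
    }

    /// Returns the representative of x's set, shortening the path as it goes.
    fn find(&mut self, mut x: usize) -> usize {
        while self.parent[x] != x {
            let grandparent = self.parent[self.parent[x]];
            self.parent[x] = grandparent; // path halving
            x = grandparent;
        }
        x
    }

    /// Merges the sets containing a and b, attaching the smaller to the larger.
    fn union(&mut self, a: usize, b: usize) {
        let (mut ra, mut rb) = (self.find(a), self.find(b));
        if ra == rb {
            return;
        }
        if self.size[ra] < self.size[rb] {
            std::mem::swap(&mut ra, &mut rb);
        }
        self.parent[rb] = ra;
        self.size[ra] += self.size[rb];
    }
}

/// Two accounts land in the same cluster if a chain of transfers connects them.
/// Clusters and their members are returned in sorted order.
fn cluster_accounts(transfers: &[(&str, &str)]) -> Vec<Vec<String>> {
    let mut index: HashMap<&str, usize> = HashMap::new();
    let mut names: Vec<&str> = Vec::new();
    for &(from, to) in transfers {
        for account in [from, to] {
            if !index.contains_key(account) {
                index.insert(account, names.len());
                names.push(account);
            }
        }
    }

    let mut sets = DisjointSet::new(names.len());
    for &(from, to) in transfers {
        sets.union(index[from], index[to]);
    }

    let mut groups: HashMap<usize, Vec<String>> = HashMap::new();
    for (i, name) in names.iter().enumerate() {
        groups.entry(sets.find(i)).or_default().push(name.to_string());
    }

    let mut clusters: Vec<Vec<String>> = groups.into_values().collect();
    for cluster in clusters.iter_mut() {
        cluster.sort();
    }
    clusters.sort();
    clusters
}

fn main() {
    let transfers = [("alice", "bob"), ("bob", "carol"), ("dave", "erin")];
    for cluster in cluster_accounts(&transfers) {
        println!("{:?}", cluster);
    }
}
```

Connected components are only the coarsest form of clustering; community detection or anomaly detection on the same graph would require weighting edges by amount, frequency, or time.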
C. Submerge Sentinel: The Sentinel monitors blockchain state changes in real time for anomalies and security breaches, identifying and flagging suspicious activities in coordination with the Analyst and Real-Time Analytics components. We will collaborate with regulators and institutions to understand their compliance requirements and ensure they are effectively integrated into the Polkadot ecosystem. We will also help secure the ecosystem by integrating external KYT/AML service providers while building our own intelligence-gathering capabilities.
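A rule-based screening step of the kind the Sentinel might apply to each observed transfer could look like the sketch below. The rules, the `Screener` type, the watchlist contents, and the threshold value are all assumptions made for illustration; the proposal does not define the Sentinel's actual rules or data sources.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified transfer record.
#[derive(Debug, Clone, PartialEq)]
struct Transfer {
    from: String,
    to: String,
    amount: u128,
}

/// Reasons a transfer may be flagged (illustrative only).
#[derive(Debug, PartialEq)]
enum Flag {
    WatchlistedSender,
    WatchlistedRecipient,
    LargeAmount,
}

/// Holds the screening inputs: a watchlist (for example, addresses supplied by
/// an external KYT provider) and an amount threshold. Both are assumptions.
struct Screener {
    watchlist: HashSet<String>,
    large_amount_threshold: u128,
}

impl Screener {
    /// Returns every rule the transfer trips; an empty list means no flags.
    fn screen(&self, transfer: &Transfer) -> Vec<Flag> {
        let mut flags = Vec::new();
        if self.watchlist.contains(&transfer.from) {
            flags.push(Flag::WatchlistedSender);
        }
        if self.watchlist.contains(&transfer.to) {
            flags.push(Flag::WatchlistedRecipient);
        }
        if transfer.amount >= self.large_amount_threshold {
            flags.push(Flag::LargeAmount);
        }
        flags
    }
}

fn main() {
    let screener = Screener {
        watchlist: HashSet::from(["mallory".to_string()]),
        large_amount_threshold: 1_000_000, // made-up threshold
    };
    let transfers = [
        Transfer { from: "alice".into(), to: "bob".into(), amount: 500 },
        Transfer { from: "mallory".into(), to: "bob".into(), amount: 2_000_000 },
    ];
    for transfer in &transfers {
        println!("{} -> {}: {:?}", transfer.from, transfer.to, screener.screen(transfer));
    }
}
```

Returning a list of reasons rather than a single boolean lets downstream components, such as a notification service, report why a transfer was flagged.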
4. Key Benefits
• Ecosystem Resilience: Submerge offers a robust alternative to existing indexing solutions, eliminating the dependency on a single solution. This ensures resilience and stability.
• Seamless Onboarding: We will help onboard new and existing development teams by providing a seamless process that minimizes costs and initial integration efforts. Submerge will offer the best path for teams to make their data and insights available to their audiences.
• Enhanced Security, Compliance, and Transparency: Submerge will enable real-time monitoring and reporting of suspicious activities, historical analysis through previously unavailable facilities, and increased visibility into account relationships and network dynamics. These features will introduce new security, compliance, and transparency levels for the Polkadot ecosystem.
• Stake-to-Access (S2A), a Brand New Utility for the DOT Token: Submerge's basic features will be free for all DOT holders, while premium features will reward larger DOT stakeholders. By staking DOT, users unlock premium features, driving increased demand for the token and creating additional network utility.
5. Project Overview
The project consists of 2 milestones. Each milestone will be a separate proposal. Below is an overview of Milestones 1 and 2.
We will publish monthly progress reports to keep the community informed of developments.
Legend:
| Symbol | Meaning |
|---|---|
| T0 | Milestone 1 official start date. Planned November 1st, 2024. |
| T1 | End of Milestone 1, beginning of Milestone 2. Planned February 1st, 2025. |
| ✅ | Task completed as part of the milestone. |
| ⏳ | Task partially completed as part of the milestone. |
Milestone 1 (Current Proposal):
- Duration: T0 + 3 months
- Cost: $180,000 USDT
- Deliverables:
  - ✅ Data Repository
  - ✅ Indexer
  - ✅ Analyst
  - ✅ Notification Manager
  - ✅ KYT/AML
  - ⏳ API
  - ⏳ Web Application
  - ⏳ Real-Time Analytics
  - ⏳ Sentinel
- A detailed breakdown of Milestone 1 activities and deliverables is available in the full proposal document.
Milestone 2 (Next Proposal):
- Duration: T1 + 3 months
- Cost: $189,000 USDT
- Deliverables:
  - ✅ API
  - ✅ Web Application
  - ✅ Real-Time Analytics
  - ✅ Sentinel
- A detailed breakdown of Milestone 2 activities and deliverables is available in the full proposal document.
6. Company Background
The team behind Submerge has a proven track record in delivering high-quality open-source projects within the Polkadot ecosystem, including SubVT, Chainviz, and followthedot.live.
Helikon is a member of the Polkadot and Kusama Thousand Validators programs, and the Infrastructure Builders Program (IBP). Helikon provides RPC services for 28 ecosystem chains through 56 archive nodes.
These projects and services demonstrate our technical capability and commitment to the growth of the Polkadot network.
Please refer to the full proposal document for project details, milestone deliverables, team structure, cost breakdown and other details.
How does the indexer plan to interact with JAM? I am not sure if it is useful to build more indexers for the current architecture, when it is about to change.
Statescan is open source and seems to work fine.