Thanks a lot Ioannis for these periodic updates on this activity, really much appreciated indeed!!!
I have finished writing the chat server and all the OpenAI assistant logic. The chat server sits between the user client and the Assistants API + database. Luckily, OpenAI recently released websocket/streaming support for assistants, so I implemented everything over sockets for a better user experience.
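The relay shape can be sketched roughly like this. This is a minimal, dependency-free sketch, not the actual implementation: the socket is modeled as an interface (in the real server it would be a websocket connection), and `streamAssistantReply` and the frame format are hypothetical stand-ins for the OpenAI Assistants streaming call and the real wire protocol.

```typescript
// Minimal sketch: user client <-> this server <-> Assistants API + database.
interface ChatSocket {
  send(data: string): void;
}

type Frame = { type: "token" | "done" | "error"; data: string };

// Every socket message is a small JSON frame so the client can tell
// streamed tokens apart from control messages.
export function frame(type: Frame["type"], data = ""): string {
  return JSON.stringify({ type, data });
}

// Hypothetical stand-in: yields the assistant's reply token by token.
async function* streamAssistantReply(userText: string): AsyncGenerator<string> {
  for (const token of ["You", " said:", ` ${userText}`]) yield token;
}

// Forward each streamed token to the socket as soon as it arrives,
// then signal completion (or an error) with a control frame.
export async function handleUserMessage(socket: ChatSocket, text: string): Promise<void> {
  try {
    for await (const token of streamAssistantReply(text)) {
      socket.send(frame("token", token));
    }
    socket.send(frame("done"));
  } catch (err) {
    socket.send(frame("error", String(err)));
  }
}
```

Streaming per-token frames rather than one final response is what makes the chat feel live on the client side.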
This is the first time I get to “chat with the chain”, the vision that kept me going for so many months!
I still need to fix some bugs with charts, and optimize the assistant to reduce mistakes.
this is going to be very cool. can't wait to see it live. thanks for the update.
Totally agree, looking forward to it more and more!!!
- Ironed out some bugs on the chat websocket server
- Broke the indexing workflow from one workflow (with 5 activities) to 4 workflows with 2 + 1 + 1 + 1 activities. This will make future changes easier, and will speed up syncing by ~15%.
- Profiled the final db ingestion activity to see why it was taking longer than expected. Added tx batching and removed some unnecessary object de/re-structurings.
- Re-deployed the production server (Hetzner AX162-S, 256 GB RAM, 4 + 4 + 7 TB NVMe drives)
- Finished testing indexing across all sample ranges, at 1, 1M, …, 6M blocks.
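The workflow split in the second item can be pictured as follows. This is a toy sketch in plain async functions so the shape is visible; the real stages run as Temporal workflows calling activities, and all stage/activity names here are illustrative.

```typescript
type Log = string[];

// An "activity" here just records that it ran, in order.
async function activity(name: string, log: Log): Promise<void> {
  log.push(name);
}

// Four workflows with 2 + 1 + 1 + 1 activities, as described above.
async function fetchWorkflow(log: Log)     { await activity("fetchBlocks", log); await activity("fetchState", log); }
async function decodeWorkflow(log: Log)    { await activity("decode", log); }
async function transformWorkflow(log: Log) { await activity("transform", log); }
async function ingestWorkflow(log: Log)    { await activity("ingest", log); }

// Parent orchestration; in Temporal these would be child-workflow
// executions, so each stage can be retried, versioned, and scaled
// independently -- which is what makes future changes easier.
export async function indexBlockRange(log: Log): Promise<Log> {
  await fetchWorkflow(log);
  await decodeWorkflow(log);
  await transformWorkflow(log);
  await ingestWorkflow(log);
  return log;
}
```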
I have kick-started syncing from block 6M onwards so that I have enough data to test the front-end and present it.
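The transaction batching added to the ingestion activity boils down to committing records in fixed-size groups instead of one transaction per record. A generic sketch, with a hypothetical `commitTx` standing in for the real database call:

```typescript
// Split an array into fixed-size batches.
export function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

// Commit records batch-by-batch: one db round-trip per batch rather than
// per record. Returns the number of transactions issued.
export async function ingestBatched<T>(
  records: T[],
  commitTx: (batch: T[]) => Promise<void>, // hypothetical db call
  batchSize = 500,
): Promise<number> {
  let txCount = 0;
  for (const batch of chunk(records, batchSize)) {
    await commitTx(batch);
    txCount++;
  }
  return txCount;
}
```

Cutting per-record transaction overhead is usually where most of the ingestion speedup comes from.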
keep up the good work!
Completed full indexing of 100K blocks (6M to 6,100,000) without a hitch. The underlying RocksDB holds about 50 GB of data. This gives us a good estimate of the final database size: ~3 TB for Moonbeam. Not bad.
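The projection is simple linear scaling from the sample:

```typescript
// Linear extrapolation of database size from a sample block range.
export function projectDbSizeGb(sampleBlocks: number, sampleGb: number, totalBlocks: number): number {
  return (sampleGb / sampleBlocks) * totalBlocks;
}

// 50 GB per 100K blocks, scaled to ~6M blocks:
// projectDbSizeGb(100_000, 50, 6_000_000) -> 3000 GB, i.e. ~3 TB
```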
This size means it might be possible to roll out the chatbot database feature for the relay chains too (Kusama and Polkadot), which have many more blocks. The code is generic (it is based on chain metadata), so the only issue is the amount of parallel compute (many machines) that would be required.
I am waiting for the sync to catch up to the latest block (should be some time next week) so I can share a WIP app with both the staking board and the chatbot.
So, after a couple of weeks of syncing, I was not able to SSH into the server and asked Hetzner to take a look at it. They said the server would not boot at all, so they moved the drives to a new machine.
My guess is that the load burnt out the motherboard. Hetzner does not use server-grade motherboards to keep costs low, so two weeks of 48 cores at 70% probably did it.
Working on getting the new server back up and running. I will also add some temperature checks to the indexer, so hopefully this won't happen again.
Thanks for sharing this update mate, please keep up your current dedication and commitment to this activity, you're doing an amazing job, indeed!!!
Sorry for the long silence. I was working on something else for the last couple of weeks and just let the test indexing run. It completed successfully, so that's great! I will spend some time on the UI this week to shape it up and play with the bot to improve results. So, hopefully it won't be long before I report back here with a working demo.
WIP version published at:
“Chat with chain” and the staking dashboards are operational, but there are still things to sort out, mostly on the front end.
TODO:
- Fix various bugs on the frontend
- Integrate contract interactions for staking insurance (delegator and collator dashboards)
- Implement backend endpoints for collator account management (icon, description, post-mortems, etc.)
- Add more functions to the ChatGPT assistant to allow it to query the database in more ways
- Make the assistant's db queries more robust so that, for example, a user query that would require too much data to process is declined.
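The robustness idea in the last item could take the shape of a pre-flight cost check before a query runs. A hypothetical sketch, where the query shape, the budget, and all names are illustrative:

```typescript
// A query issued by the assistant, reduced to the block range it touches.
export interface RangeQuery {
  fromBlock: number;
  toBlock: number;
}

// Illustrative budget: the widest block span a single query may scan.
const MAX_BLOCK_SPAN = 200_000;

// Decline queries whose estimated cost exceeds the budget, returning a
// reason the assistant can relay back to the user.
export function checkQuery(q: RangeQuery): { ok: boolean; reason?: string } {
  const span = q.toBlock - q.fromBlock;
  if (span < 0) return { ok: false, reason: "empty block range" };
  if (span > MAX_BLOCK_SPAN) {
    return { ok: false, reason: `range of ${span} blocks exceeds the ${MAX_BLOCK_SPAN}-block budget` };
  }
  return { ok: true };
}
```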
The indexer has processed only blocks from 6M onwards, so anything before that will not show up in the dashboards or in the chat. In production, we will obviously index from block 1.
Hetzner locked the server due to port scanning. We forgot to add a flag to the moonbeam service files. Chat should be back up in the next few hours, as soon as Hetzner unlocks the server :-/
talk about bad timing…
EDIT: Fixed
https://v2.stakeglmr.com
Video presentation
Project Report - M2 Delivery
Code Stats Summary
| Component | Language | Lines of Code |
|---|---|---|
| SvelteKit App | TypeScript | 14,740 |
| SvelteKit App | Svelte | 9,912 |
| Chat Server | TypeScript | 4,346 |
| Staking Indexer | TypeScript | 11,730 |
| Deep State Indexer | TypeScript | 8,888 |

Total TypeScript Lines: 39,704
Initial Schedule: 2-6 months
Actual Time: 1 year and 2 months
Milestone Status
Milestone 1
- SvelteKit frontend app, with mock db calls
- Integration of app with wallets, chain queries (TODO: extrinsic calls)
- Indexing backend and ETL services
Milestone 2
- Additional microservices (pending only the collator management services)
- Full frontend + backend integration (pending only the integration of the above)
- Backend Testing
- TODO: Frontend Testing
- TODO: Deployment to production
Features planned and delivered
- New dashboards with more staking info for collators
- New personal account dashboards for delegators
- Faster load time - V2 loads all data straight from the database in a single HTTP call. Collator detail data loads in a separate call, when requested.
- We ended up not using Subsquid and built our own custom indexing pipeline using Temporal for cloud-level process durability on our own hardware.
- Wallet Connect was used to support multiple wallets.
- The new setup allows unlimited time to resolve errors, should one occur during the ETL process. Of course, this causes the indexer to fall behind, but there is no deadline after which data starts getting lost.
- Cost-efficiency: the new setup requires one 48-core Hetzner server to house the database and workflow/activity programs ($200 per month), one Vercel app ($30), and one S3-compatible data bucket. It also requires Temporal cloud services (we don't run our own Temporal cluster), which is a flat $50 (pro-rata, as I also run other services on Temporal) + $10 variable.
Features not planned and delivered
- Deep State Substrate Indexer. We have built an indexing pipeline that goes well beyond indexing what is readily available (extrinsic, event, and block data) to indexing the values returned by pallet methods, for any valid arguments. The indexer is built around Substrate and the Polkadot.js library. In a nutshell, it works by reading the chain metadata, deconstructing every method's input and output values into basic types, creating new basic and composite inputs, feeding them back into the methods, and repeating the process (outputs create new inputs).
- Chat interface for the staking data and the deep state indexer. This is a ChatGPT-assistant-based interface that uses ~35 methods to query the database. I plan to add more methods in the future that will give the bot some interesting graph-based abilities.
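The recursive metadata deconstruction at the heart of the deep state indexer can be illustrated with a toy type descriptor. This is a simplified, hypothetical structure, not the actual Substrate metadata types: the idea is that composite types are walked recursively until only primitive leaf types remain, at which point argument values can be generated for them.

```typescript
// Hypothetical, simplified type descriptor (real chain metadata is far richer).
type TypeDef =
  | { kind: "primitive"; name: string }
  | { kind: "struct"; fields: TypeDef[] }
  | { kind: "vec"; inner: TypeDef };

// Recursively deconstruct a descriptor into its primitive leaf types,
// mirroring how method inputs/outputs are broken down before new
// argument values are created and fed back into the methods.
export function leafTypes(def: TypeDef): string[] {
  switch (def.kind) {
    case "primitive":
      return [def.name];
    case "struct":
      return def.fields.flatMap(leafTypes);
    case "vec":
      return leafTypes(def.inner);
  }
}
```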
Features planned and not delivered yet
- Multiple collator staking functionality. This includes the backend logic for recommending staking allocations, and the evm calls on the frontend.
- Full integration with insurance contracts. The contract code is there but I have to connect it to the frontend.
- Push updates of dashboard information via websockets. This is supported out of the box by SurrealDB but the functionality is still a bit buggy. Waiting for it to mature. Then, it should be as simple as rewriting the current db queries to include a LIVE keyword and opening SurrealDB's websocket port.
Features planned but will not be delivered
- We will not use a two-layer indexing architecture, due to the high computing cost of the deep state indexer. Instead, we will back up the extracted data files (large JSONs, initially cached in S3) and take snapshots of RocksDB.
Next Steps
Production Indexing
Catching up to the chain from block 0 will be a challenge. Using a single 48-core machine, it would take around 7 months to get to block 6M, and by that time the chain would be well ahead, especially given the 6-second block times. Fortunately, the process can be parallelized across multiple servers. I would like to wait for SurrealDB 2 to reach a production release before starting the sync from block 1; we are currently using SurrealDB 1.5.
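One way to parallelize the catch-up is to give each server its own contiguous block range; a sketch (worker counts and ranges are illustrative):

```typescript
// Split the inclusive block range [from, to] into `workers` contiguous
// sub-ranges, spreading any remainder across the first ranges, so each
// server can sync its slice independently.
export function partition(from: number, to: number, workers: number): Array<[number, number]> {
  const total = to - from + 1;
  const base = Math.floor(total / workers);
  const ranges: Array<[number, number]> = [];
  let start = from;
  for (let i = 0; i < workers; i++) {
    const size = base + (i < total % workers ? 1 : 0);
    ranges.push([start, start + size - 1]);
    start += size;
  }
  return ranges;
}

// e.g. partition(1, 6_000_000, 4) yields four ~1.5M-block ranges; at
// ~7 machine-months for 6M blocks, 4 servers would cut the catch-up
// to roughly 2 months (ignoring db-write contention).
```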
Finishing Up Pending Functionality
I will finish up all remaining tasks, but I can't work full-time on this anymore. Indexing will take some time anyway, so I can probably finish everything working part-time on weekends.
M2 Payment
We are way past the second milestone, so I would like to submit a vote to the committee. If it's ok with you, I would like to add $2K, which is the minimum that I will have to pay for deep-state indexing Moonbeam from block 1. This was not part of the original proposal (neither was the deep state indexer), so I am ok with covering the cost if I have to.
Questions for the committee
- Future running costs: I can run the staking indexer and the apps (for both stakeglmr and stakemovr) for less than 100 EUR per month in infra costs. I think the treasury can cover this once every year or two, no problem. However, running the deep state indexer requires renting a powerful machine from Hetzner at ~220 EUR per month for each chain. What are your thoughts on this? Should we just do one chain and see how it goes? If so, which one?
Other Questions
How was the money spent
- Abdullah received approx. $4K for working on SvelteKit code.
- I have spent around $3K in infrastructure for development and testing.
- The other $11K went to coffee and diapers over 15 months.
Why did it take so long?
I spent 70-80% of the time on getting the deep state indexer to work. As far as I know, no other chain has something similar, so I think it was worth it. Although the approach worked, it turned out to be much more challenging than I originally thought, for 4 reasons:
1. The recursive architecture of the chain metadata required complicated recursive functions that took forever to get right.
2. Running hundreds of thousands of queries to index each block required an asynchronous architecture at all 3 levels (in each process, across processes, and across machines).
3. I spent a couple of months trying to implement the deep state indexer in Golang to leverage its speed but ran into other issues.
4. SurrealDB is still not quite production ready, so I had to spend a lot of time submitting GitHub issues and finding ways around them.

Knowing what I know now, I would not have taken up this challenge, but what's done is done and (luckily) it works.
looking good, keep up the work!
Hey guys, do you have any questions? I would like to address them before I submit the proposal for the M2 payment.
@turrizt can you tag the treasury committee members so they can take a look?
hey @stakebaby, sure. @TreasuryCouncil, could you please take a look and share your thoughts on this proposal: [Proposal: MB11/MR8] StakeGlmr.com and StakeMovr.com V2 - Treasury Proposal - #95 by stakebaby
@TreasuryCouncil
Alright, here is the M2 quote breakdown.
| Total | M2 Payment | Chain | USD | 30D MA | Amount | Token |
|---|---|---|---|---|---|---|
| $44,800 | $22,400 | Moonbeam | 17,920 | 0.1958 | 91,522 | GLMR |
| | | Moonriver | 4,480 | 9.3326 | 480 | MOVR |
I did not add the $2K for the deep state indexing process. I'll cover it myself, but I can only do deep indexing for Moonbeam and not Moonriver because it's too expensive. This means that the Moonriver app will be missing the Explorer and Chat sections.
If anyone wants to review the code, send me your GitHub account email so I can invite you to the repositories.
If you have questions, please don't be shy. A meeting would also work and would probably be more efficient.
Hey @stakebaby
The council unanimously agrees on the additional $2k for the M2 payment, extending the coffee & diapers runway.
Please move forward submitting your proposal on-chain
We absolutely love getting these insights into the nitty-gritty aspects of the development. Thanks for providing them! Innovative features like “chat with chain” are just great to see and exemplify the amount of effort and dedication put into this project.
Usually three council members signal their support individually as forum posts here to ensure your on-chain proposal won't be rejected and the deposit slashed, but I can assure you we've all agreed, so please put your M2 proposals on-chain as soon as you are ready.
We confirm both 30D MAs you've provided for GLMR and MOVR respectively. As we've seen varying deviations of the 30D MA compared to our calculations in the past, for future proposals please let us provide them for you, given the $ amount requested.
Again, keep up your great work, we are happy to keep funding it!