Heya, I just joined the forum, was browsing through the latest posts and saw this. I know we just spoke on Telegram the other day after I saw your tweet. I think this is a great initiative, and as a user and as an NFT project and marketplace developer, I support it. Obviously I am not on a council, so I'm just expressing my personal support here. But I do have some technical comments to make.
While I think it's great that you used the tools you are most comfortable with, which allowed for a quick turnaround, and it's great that it works, I think much more web3-native, on-chain ways to subscribe to NFT changes could be explored in place of scraping. Scraping has a number of issues, in my opinion:
- Websites that rely heavily on indexers might have out-of-sync state, so the scraper would also get out-of-sync data.
- The latency and caching will always depend on how the scraped website handles them: they might have server-side caching enabled; the connection between their website and their indexer might not be performant; and their indexer might be slow because it processes a lot of unnecessary data or does a lot of async operations. On top of that, the indexer SDK itself might handle unfinalised blocks and re-orgs in such a way that it sits 30-200 blocks behind the latest block (I've seen this myself on some of the well-known indexers in Polkadot).
- By scraping these websites you can incur additional costs for them, or make their users' experience worse if you accidentally send too many requests too often or their servers don't expect the extra pressure.
Some things I would propose to try, depending on the type of data you are after:
- Subsquid's ArrowSquid has a great example of a simple ERC-721 indexer. ERC-721 is one of the simplest things to index, as it's mostly just a single event, `Transfer`, which does everything; the only additional thing you might have to do is fetch its metadata for name, image, description and attributes, but again, the same Subsquid example shows how to use Multicall v3 for that. Here's the example I am talking about: GitHub - subsquid-labs/evm-multicall-example (there's a rough sketch of the Transfer handling right after this list).
- Subquery recently released their Universal NFT API: Unified NFT API (Beta). I don't think they have websocket subscriptions available, but you can just use a cron job. You can filter by chainId there to get all Moonbeam NFTs, for example, and limit the results + order by block so the cron job picks up just the latest ones (see the second sketch below).
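To make the first option concrete, here's a minimal sketch of the `Transfer`-handling part; it's not the actual Subsquid squid, just plain ethers.js against Moonbeam's public RPC, and the collection address is a placeholder:

```ts
import { ethers } from "ethers";

// Human-readable ABI fragments for the only two things we need.
const ERC721_ABI = [
  "event Transfer(address indexed from, address indexed to, uint256 indexed tokenId)",
  "function tokenURI(uint256 tokenId) view returns (string)",
];

const provider = new ethers.JsonRpcProvider("https://rpc.api.moonbeam.network");
const COLLECTION = "0x0000000000000000000000000000000000000000"; // placeholder: your collection
const nft = new ethers.Contract(COLLECTION, ERC721_ABI, provider);

nft.on("Transfer", async (from: string, to: string, tokenId: bigint) => {
  // AddressZero as `from` means a mint; as `to`, a burn.
  if (from === ethers.ZeroAddress) {
    console.log(`Mint: token ${tokenId} to ${to}`);
  } else if (to === ethers.ZeroAddress) {
    console.log(`Burn: token ${tokenId} from ${from}`);
  } else {
    console.log(`Transfer: token ${tokenId} from ${from} to ${to}`);
  }
  // Name/image/description/attributes live behind tokenURI; the Subsquid
  // example batches these reads through Multicall v3 instead of one-by-one.
  console.log("tokenURI:", await nft.tokenURI(tokenId));
});
```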
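And for the second option, a rough cron-style poller. The endpoint URL and query shape are assumptions on my part (I haven't used the beta myself), so check SubQuery's docs for the real schema:

```ts
// Placeholder endpoint; replace with the real Unified NFT API (Beta) URL.
const ENDPOINT = "https://api.subquery.network/nft";

let lastSeenBlock = 0; // persist this somewhere in a real job

async function pollLatestTransfers(): Promise<void> {
  // Hypothetical query shape: Moonbeam is chainId 1284; newest blocks first.
  const query = `{
    transfers(
      filter: { chainId: { equalTo: "1284" }, blockNumber: { greaterThan: ${lastSeenBlock} } }
      orderBy: BLOCK_NUMBER_DESC
      first: 50
    ) { nodes { from to tokenId blockNumber } }
  }`;
  const res = await fetch(ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query }),
  });
  const { data } = await res.json();
  for (const t of data.transfers.nodes) {
    lastSeenBlock = Math.max(lastSeenBlock, Number(t.blockNumber));
    console.log(t);
  }
}

// Poor man's cron: poll once a minute.
setInterval(() => pollLatestTransfers().catch(console.error), 60_000);
```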
These 2 solutions would get you just Mints (if `from` is AddressZero), Transfers, and Burns (if `to` is AddressZero). For sales it might be trickier. Most marketplaces use their own trading implementation, but I suspect (although I haven't tried it myself) that you should be able to look for ERC-20 transfers in the logs immediately after or before the Transfer event (when neither `to` nor `from` is AddressZero) and detect sales that way; that also gets you the price and who purchased the NFT.
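I haven't battle-tested this, but as a hedged sketch in ethers.js: ERC-20 and ERC-721 emit a `Transfer` event with the same signature, yet the ERC-20 one has 3 topics (the value is unindexed) while the ERC-721 one has 4 (the tokenId is indexed), so you can scan the receipt of the NFT transfer for an ERC-20 leg and treat it as the payment:

```ts
import { ethers } from "ethers";

// keccak256 of the shared Transfer signature; same topic0 for ERC-20 and ERC-721.
const TRANSFER_TOPIC = ethers.id("Transfer(address,address,uint256)");

// Given the tx that emitted the NFT Transfer, look for an ERC-20 Transfer log
// in the same receipt and treat it as the payment leg of a sale.
async function detectSale(provider: ethers.JsonRpcProvider, txHash: string) {
  const receipt = await provider.getTransactionReceipt(txHash);
  if (!receipt) return null;
  for (const log of receipt.logs) {
    // 3 topics => ERC-20 style Transfer (the value sits in `data`, not indexed).
    if (log.topics[0] === TRANSFER_TOPIC && log.topics.length === 3) {
      const payer = ethers.getAddress("0x" + log.topics[1].slice(26));
      const payee = ethers.getAddress("0x" + log.topics[2].slice(26));
      const amount = ethers.toBigInt(log.data); // price, in the token's decimals
      return { paymentToken: log.address, payer, payee, amount };
    }
  }
  // No ERC-20 leg found: a plain transfer, or a sale paid in native GLMR.
  return null;
}
```

Treat a hit as a hint rather than proof; a transaction can contain unrelated ERC-20 transfers, so you'd want to sanity-check the payer/payee against the NFT's `from`/`to`.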
If you need to track Marketplace Listings, it might get trickier. Some marketplace contracts (the ones responsible for trading) are verified (like Moonbeans); others are not (like ours at RMRK/Singular), so their ABI is not available or the code is obfuscated (bytecode only in the block explorer). But since only a handful of the marketplaces you support don't have a publicly available ABI on Moonscan, you can just look at the block explorer's Events tab to work out which event is the Listing, copy its topic hash, and check against it on the indexer. You should be able to detect the topic pretty easily IMO, and then you'll just have to work out what each encoded parameter value means (if the contract is not verified). While annoying, I don't think there are that many unverified marketplace contracts, and with a bit of fiddling you can work out the events. While this approach to Listing detection is a bit hacky and not the cleanest, it's used by other projects; for example, Evrloot's Discord bot worked out how to detect our Listings without our help: https://github.com/theshadesofsummer/evrloot-listings/blob/main/src/publish-listing.js
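To illustrate the topic-hash approach with plain ethers.js (the marketplace address and `LISTING_TOPIC` below are placeholders; take the real topic0 from the contract's Events tab on Moonscan):

```ts
import { ethers } from "ethers";

const provider = new ethers.JsonRpcProvider("https://rpc.api.moonbeam.network");

const MARKETPLACE = "0x0000000000000000000000000000000000000000"; // placeholder
const LISTING_TOPIC = "0x..."; // placeholder: topic0 copied from the explorer

// Pull all Listing events from the marketplace in a block range by topic0 alone.
async function fetchListings(fromBlock: number, toBlock: number) {
  const logs = await provider.getLogs({
    address: MARKETPLACE,
    topics: [LISTING_TOPIC],
    fromBlock,
    toBlock,
  });
  for (const log of logs) {
    // Without a verified ABI you have to decode `topics`/`data` yourself;
    // compare against known listings on the site to work out each parameter.
    console.log(log.transactionHash, log.topics, log.data);
  }
}
```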
A lot of indexers have free plans/options. Subsquid is free for now, and if you self-host it, it might be free forever; you can always use a public RPC there too. But in the long run, reading from the chain will be a much more robust and more real-time solution than scraping, and as a bonus, you will gain some good web3 experience.