ChatGPT, launched in late November 2022, spurred rapid growth in AI throughout 2023, and a lot of capital and attention has poured into the field as a result. As with any technological innovation, AI is currently going through a hype phase, as seen in the elevated forward revenue and earnings multiples of stocks such as Nvidia. This reminds me of the hype phases cryptocurrencies went through around Ethereum's crowdsale in 2014, the emergence of decentralized finance in 2020 and the metaverse boom of 2021. The current popularity of AI has cast a shadow over the blockchain space. However, the intersection between AI and blockchain has not garnered much attention, and I feel it will be an important sector to investigate in the coming years. In this article series, I will unpack the synergies between blockchain and AI and how each can cover the other’s shortcomings.
Introduction to the problem
In this article I will look at how decentralized storage solutions can help reduce infrastructure costs for AI solutions and bolster the security of the system. Currently, AI tools like ChatGPT rely on data scraped from various sources on the internet, which is then stored in their own databases. Each time a user chats with the AI, the conversations and user inputs may be saved as well to further train the model. This is a lot of data, and storing it on a centralized platform is expensive. ChatGPT is hosted on Microsoft’s Azure cloud, and it costs OpenAI around US$3 million per month to run ChatGPT (Tech Desk, 2023). The cost involved in storing data makes it hard for start-ups to build similar AI models, and this will slow the rate of growth in the space.
How Decentralized Storage Solutions work
One way to solve this is to make use of decentralized storage solutions such as Filecoin and Arweave. Their model is similar to Airbnb: users who have spare storage space on their computers can rent it out as storage nodes and quote a price for it. Users who want to make use of this space then pay a fee for it in the network's native token.
When users upload their data to a decentralized storage network, the data is typically encrypted so that only the user, who holds the private key, can decrypt it. Cryptographic hashes are used alongside this to verify that the stored data has not been altered. The data is then split into small pieces and sent to different nodes on the network, a process known as sharding. Sharding ensures that no single node holds the complete dataset, which helps protect data privacy. When the user needs the file, the network retrieves the pieces from the nodes storing them and reassembles the file for the user to download. Node operators usually must lock up a certain amount of the network's native tokens as collateral to show their commitment to keeping the data secure and available upon request.
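The sharding and reassembly steps described above can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions, not how Filecoin or Arweave actually implement storage: the function names are hypothetical, in-memory dictionaries stand in for storage nodes, shards are placed round-robin, and SHA-256 is used for the integrity check. Encryption is left out and would normally happen on the user's side, with a proper cryptography library, before the data is split.

```python
import hashlib
from typing import Dict, List, Tuple

# Hypothetical helper names; in-memory dicts stand in for real storage nodes.

def split_into_shards(data: bytes, shard_size: int) -> List[bytes]:
    """Cut the (already encrypted) data into fixed-size pieces."""
    return [data[i:i + shard_size] for i in range(0, len(data), shard_size)]

def distribute(shards: List[bytes], node_count: int):
    """Place shards on nodes round-robin and record where each one went."""
    nodes: Dict[int, Dict[str, bytes]] = {n: {} for n in range(node_count)}
    manifest: List[Tuple[int, str]] = []  # (node id, shard hash), in file order
    for index, shard in enumerate(shards):
        node_id = index % node_count
        digest = hashlib.sha256(shard).hexdigest()
        nodes[node_id][digest] = shard
        manifest.append((node_id, digest))
    return nodes, manifest

def reassemble(nodes: Dict[int, Dict[str, bytes]],
               manifest: List[Tuple[int, str]]) -> bytes:
    """Fetch every shard, check its hash, and join the pieces in order."""
    pieces = []
    for node_id, digest in manifest:
        shard = nodes[node_id][digest]
        if hashlib.sha256(shard).hexdigest() != digest:
            raise ValueError("shard failed integrity check")
        pieces.append(shard)
    return b"".join(pieces)

data = b"training data for an AI model"
shards = split_into_shards(data, shard_size=4)
nodes, manifest = distribute(shards, node_count=3)
restored = reassemble(nodes, manifest)
```

In this sketch the user keeps the manifest, and each node only ever sees a fraction of the shards. Real networks add replication, storage proofs and payments on top of this basic idea.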
How AI can benefit from using Decentralized Storage Solutions
According to a report published by Foresight Ventures, decentralized file storage on networks such as Filecoin and Arweave costs around $4 per terabyte per month, while Web 2 services such as Amazon Web Services or Microsoft Azure range from $16 to $23 per terabyte per month. To put things into context, GPT-3 was trained on 45 terabytes of text data. Storing that much data on a centralized service would cost up to around US$10.3K more per year, taking the upper end of that price range. The cost savings may seem insignificant today, especially for bigger companies. However, as AI continues to grow, models are going to need more and more data and these costs will add up. Smaller companies in the space will need a cheaper alternative, and this is where decentralized technologies will come into play.
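The comparison above can be reproduced with some quick arithmetic. The sketch below uses only the prices and dataset size quoted in this section; the function name is my own, and it assumes a flat monthly price per terabyte with no retrieval or bandwidth fees.

```python
def extra_annual_cost(terabytes: int,
                      centralized_per_tb_month: int,
                      decentralized_per_tb_month: int) -> int:
    """Extra US$ paid per year for centralized storage over decentralized."""
    monthly_gap = centralized_per_tb_month - decentralized_per_tb_month
    return terabytes * monthly_gap * 12

GPT3_DATASET_TB = 45  # size of the GPT-3 text dataset quoted above

low = extra_annual_cost(GPT3_DATASET_TB, 16, 4)   # cheapest centralized price
high = extra_annual_cost(GPT3_DATASET_TB, 23, 4)  # priciest centralized price

print(f"Extra cost per year: US${low:,} to US${high:,}")
# prints "Extra cost per year: US$6,480 to US$10,260"
```

The gap scales linearly with the size of the dataset, so a model trained on ten times as much data would see ten times the difference.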
Another benefit of using decentralized storage solutions is the increased security they bring. AI models are only as good as the data they work with. Attackers can deliberately introduce malicious data into a training data set to corrupt the model, or copy already-trained models and use them for nefarious purposes (Platz, Forbes 2023). This becomes easier when all the data is centralized and stored in a small number of databases. The 2012 Dropbox breach, in which credentials for around 68 million accounts were stolen and later surfaced online, emphasizes the need for greater security. By contrast, to attack a decentralized storage service, hackers would need to compromise many independent nodes holding pieces of the data, which is often very costly to pull off.
How Decentralized Storage Solutions can benefit from AI
Currently, blockchain storage solutions may face difficulties in retrieving data due to high bandwidth requirements. In addition, despite having over 17 million terabytes of storage capacity at the end of 2022, network utilization remains relatively low at around 3.1%. One reason could be the lack of appeal of decentralized storage solutions among retail users. Tools such as Dropbox and Google Drive are currently free to use. Making payments in cryptocurrencies is not yet user friendly, and there are many costs involved, such as on-ramp fees and network gas fees. Institutional adoption has also been mostly limited to life sciences and Web 3 firms. As a result, profitability has been an issue for decentralized storage solutions.
This is where AI models can help to spur demand in the decentralized storage space. As more AI companies and other institutions make use of decentralized storage solutions, network utilization will increase, driving up the profitability of these solutions. This in turn will invite more investment in the space, leading to better decentralized solutions that can handle the high bandwidth requirements and make using such platforms seamless for users.
Closing thoughts
We are currently in the early days of AI technology, and growth in this space will bring about more challenges that require unique solutions. The drawbacks of having all your data in one location, and the cost of maintaining it, are going to be key problems moving forward. While decentralized storage platforms can tackle these problems effectively, they currently lack Web 2 adoption. I believe that as technology in the blockchain space continues to improve, decentralized storage technology will develop as well to handle the demands of Web 2 companies. The future will not be blockchain vs AI but rather blockchain and AI working hand in hand.
Note: This is the first article I have written, so I would love to hear thoughts on what can be done better, or things you would like me to cover in the next few articles.