Cloud Storage for $2 / TB / Mo


One of the frequent claims of the Sia network is that over the long term, storage will be cheaper than $2 / TB / Mo, assuming that storage economics do not change. Though we’ve claimed this many times, we’ve never published a detailed model explaining where this number comes from. Until now.

For the purposes of this model, we are going to assume an endgame where Sia has substantially outgrown all of the latent / unused storage in the world, and where the only way Sia can continue to grow is by having new datacenters established for the sole purpose of profiting from selling storage to the Sia network. You can follow along with the math using this spreadsheet.

Uptime Math

One of the key ways that the Sia network distinguishes itself from traditional cloud storage is its datacenter architecture and requirements. The Sia network only expects hosts to have a 95–98% uptime. Despite this, the Sia network is able to achieve 99.9999% uptime for files. This is because each piece of data on the Sia network is stored on many hosts, requiring only a subset of them to be online in order for the file to remain available.

Today, data is typically stored on the Sia network with a 10-of-30 redundancy scheme. This means that there’s a 3x overhead, and that as long as any 10 hosts are online, the file itself is still retrievable. Once the Sia network is more mature, we will likely be switching to 64-of-96 hosts.

If we assume that the 30 hosts go offline independently, and each host has a 95% chance of being online over a given time interval, then the probability that a file is unavailable is the probability that fewer than 10 of the 30 hosts are online at the same time, which we put at around 10^-19, or 18 nines of uptime. Practically speaking, host failures are not truly independent, and you have to account for black swan situations like World War Three. In reality, no system actually hits 18 nines of uptime (nor 11 nines of reliability).

Amazingly, even though 64-of-96 is only 1.5x redundancy, exactly half the redundancy of 10-of-30, the uptime equation has a nearly identical result: 17 nines of uptime.
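The uptime estimate can be sketched in a few lines of Python. This is a minimal illustration, assuming hosts fail independently and each is online 95% of the time; the function name `unavailability` is ours, not part of any Sia tooling. The exact values depend on these modeling assumptions and need not match the rounded figures quoted above digit for digit.

```python
from math import comb

def unavailability(k: int, n: int, p_online: float) -> float:
    """Probability that fewer than k of n independent hosts are online.

    Assumes every host is online with the same probability p_online and
    that outages are independent (a simplification, as noted in the text).
    """
    p_offline = 1.0 - p_online
    # Sum the binomial probabilities of exactly i hosts online, for i < k.
    return sum(
        comb(n, i) * p_online**i * p_offline ** (n - i)
        for i in range(k)
    )

if __name__ == "__main__":
    for k, n in [(10, 30), (64, 96)]:
        p = unavailability(k, n, 0.95)
        print(f"{k}-of-{n}: {n / k:.1f}x redundancy, P(unavailable) ~ {p:.1e}")
```

Both schemes come out many orders of magnitude below any realistic correlated-failure risk, which is why the lower 1.5x overhead is the attractive choice.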

In Sia’s endgame, 1.5x redundancy is the number that is most likely to be used in production contexts, so that is the number we are going to be using when we model the long term cost of storage.

Raw Storage Cost

An efficient storage rig is going to need the following parts:

What’s going on here is that we’re buying 32 HDDs and then finding a low-cost way to put them together. This technique was championed by the Bitcoin mining industry: with unprecedented levels of corner cutting, Bitcoin mining farms were able to get prices absurdly low. We can employ the same strategies here. With 95% uptime requirements, we don’t need to splurge on expensive parts like a full computer tower. The mobo selected above supports 48 Gbps in data transfer, which means we still end up with over 1 Gbps per HDD even though we are doing a ton of splitting. It also turns out that 32 HDDs only consume about 200 W, so the 750 W PSU we picked is more than sufficient.

In total, we’re spending $4945 to get a rig with 192 TB, or about $26/TB. This rig was priced using consumer parts at consumer prices; when buying a large number of rigs at scale, a datacenter should be able to get much better pricing. So we will be using a rig cost of $4500 in our spreadsheet.

Buildout

Often for datacenter buildout, you budget around $1 million per megawatt in costs, or about $1 per watt. These rigs are going to consume about 500 W each (6 watts per HDD, 65 for the CPU, 70 for RAM and mobo, and then loss from the PSU and datacenter PUE). In total, that means budgeting about $500 in capex for datacenter buildout.

Another cost that needs to be considered is networking equipment. If we’re assuming that you can fit about 8 machines per rack, and you need $2,000 of networking equipment in the datacenter per rack, you get $250 per rig.

Finally, everything is going to have to be assembled and constructed. Including screws and such, each rig has in the ballpark of 400 parts. That means about 2 hours of labor per rig. We’ll call that $50, bringing our total buildout expenses to $800 per rig.
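The buildout total can be checked with a short sketch. The constant names are ours, and the 500 W figure is an assumption: it is the power draw that the $500 budget implies at $1 per watt.

```python
# All figures are per rig and come from the estimates in this section.
RIG_WATTS = 500            # assumed wall power per rig, incl. PSU loss and PUE
BUILDOUT_PER_WATT = 1.0    # $1 million per megawatt, i.e. $1 per watt
NETWORK_PER_RACK = 2000.0  # networking equipment per rack, in dollars
RIGS_PER_RACK = 8
ASSEMBLY_LABOR = 50.0      # about 2 hours of labor per rig

def buildout_per_rig() -> float:
    """Datacenter buildout plus networking plus assembly, in dollars per rig."""
    power_capex = RIG_WATTS * BUILDOUT_PER_WATT
    network_capex = NETWORK_PER_RACK / RIGS_PER_RACK
    return power_capex + network_capex + ASSEMBLY_LABOR

if __name__ == "__main__":
    print(f"Buildout per rig: ${buildout_per_rig():.0f}")
```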

Depreciation and Profit

The whole purpose of understanding capex for a profitable datacenter build is to model depreciation. We need to understand how much value we are losing on our hardware every year that the datacenter is in operation. Hard drives last on average about 7 years. Hard drives also go down in price over time, meaning that you lose value a little faster than hard drives break. For this example, we are going to assume a 15% depreciation rate on our storage rigs and a 5% depreciation rate on our buildout. With a $4500 rig and $800 of buildout, that means our total depreciation costs are about $715 per year.

Investors putting money into datacenter buildout are going to expect to make more revenue than just operational costs and depreciation costs. I think a reasonable profit expectation for a datacenter built to support a mature Sia ecosystem is around 10%, because the risk of building storage for a mature platform should be relatively low. Earlier in the life of the Sia network, risk may be higher, and therefore profit requirements may be higher. At a 10% profit expectation on $5300 of capital, each rig will need to earn $530 per year in profit to be considered a good investment.
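As a rough sketch of the depreciation and profit arithmetic, assuming the $4500 rig cost and $800 buildout cost from the earlier sections (the function names are ours). The results shift upward if the $4945 consumer price is used for the rig instead.

```python
def annual_depreciation(
    rig_cost: float,
    buildout_cost: float,
    rig_rate: float = 0.15,
    buildout_rate: float = 0.05,
) -> float:
    """Yearly value lost on the rig and on the buildout, in dollars."""
    return rig_cost * rig_rate + buildout_cost * buildout_rate

def annual_profit_target(
    rig_cost: float, buildout_cost: float, profit_rate: float = 0.10
) -> float:
    """Yearly profit investors expect on the total capital outlay."""
    return (rig_cost + buildout_cost) * profit_rate

if __name__ == "__main__":
    rig, buildout = 4500.0, 800.0
    print(f"Depreciation:  ${annual_depreciation(rig, buildout):.0f} per year")
    print(f"Profit target: ${annual_profit_target(rig, buildout):.0f} per year")
```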

Rent, Utilities, Maintenance

Our rig takes up about 2 square feet. The shelf that I linked should actually be able to hold about 3 of these rigs, and should be able to stack 3 or more of these shelves together. If you account for leaving space for airflow and aisles, you end up with about 1 square foot per rig, meaning a 2,000 square foot datacenter can hold 200,000 TB. The cost of rent is going to be negligible compared to other expenses, and is therefore excluded. Electricity aside, utilities should be approximately negligible as well.

If we assume a PSU efficiency of 93% (this is easily attained at datacenters) and a PUE of 1.4 (generally attainable for this type of setup), you get about 500 W per rig. Mining farms often have electricity costs as low as 4 cents, but Sia datacenters are restricted by the fact that they need to have access to good network connections, so we will budget 10 cents per kilowatt hour for electricity. At 500 watts per rig, this comes out to about $450 per year in electricity expenses.
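The power and electricity numbers can be reproduced with a small sketch. It assumes the component draws listed in the Buildout section (6 W per HDD, 65 W for the CPU, 70 W for RAM and mobo); the function names are ours. The unrounded result lands slightly under 500 W, which the article rounds up.

```python
HOURS_PER_YEAR = 24 * 365

def wall_watts(
    hdds: int = 32,
    watts_per_hdd: float = 6.0,
    cpu_watts: float = 65.0,
    ram_mobo_watts: float = 70.0,
    psu_efficiency: float = 0.93,
    pue: float = 1.4,
) -> float:
    """Power drawn per rig after PSU loss and datacenter overhead (PUE)."""
    component_watts = hdds * watts_per_hdd + cpu_watts + ram_mobo_watts
    return component_watts / psu_efficiency * pue

def annual_electricity_cost(watts: float, dollars_per_kwh: float = 0.10) -> float:
    """Yearly electricity bill for a rig that runs continuously."""
    return watts / 1000.0 * HOURS_PER_YEAR * dollars_per_kwh

if __name__ == "__main__":
    watts = wall_watts()
    print(f"Estimated draw: {watts:.0f} W per rig")
    print(f"Cost at that draw: ${annual_electricity_cost(watts):.0f} per year")
    print(f"Cost at a rounded 500 W: ${annual_electricity_cost(500):.0f} per year")
```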

On Sia, bandwidth is priced separately: renters pay for upload bandwidth and download bandwidth independently from storage, which means that we can exclude all ISP costs from this equation. Those expenses are covered as renters utilize the bandwidth.

Typical datacenters often have multiple redundant power agreements, backup power setups, batteries, etc, so that they can maintain 99.99% uptime. They will also have technical staff at the datacenter 24/7 to react quickly to any failures or outages. None of these things are required when your uptime target is 95%, and therefore this is a huge set of costs we can ignore when designing Sia datacenters.

Sia datacenters will however need at least a bit of maintenance. For a 32 HDD system, you expect about 5 drives to fail per year. This takes time to repair and you will need on-site staff (just not 24/7). To account for these costs, we will budget $50 per year per rig.

All told, we are at about $1750 per year in required revenue. This combines the depreciation costs, the electricity costs, the maintenance costs, and the profit requirements.

Utilization and Sia Specific Costs

A mature datacenter should be able to maintain a utilization of 90% or more by purchasing equipment on an as-needed basis. Assuming that a datacenter does achieve 90% utilization, a 192 TB storage rig is going to need to earn all of its revenue on only 172 TB.

Something that is unique to the Sia network is collateral lockup. When hosting someone’s data, you need to lock up collateral to retain that data. When storing 172 TB of data, a host should expect to have about 6 months’ worth of revenue locked up. At a 10% profit expectation, that comes down to losing about 1 month’s worth of revenue to capital expenses per year on actively used storage. So really, we should only be counting 11 months of revenue, as that 12th month is going to pay for our collateral lockup expenses. We compute the collateral expenses as a number of months lost because this way it can be computed independently of other values, which simplifies the math.

The final expense is the siafund fee. All storage on the network incurs a roughly 10% fee, which means the renter is going to be paying more than the host is earning.

Bringing Everything Together: Final Cost

With a total revenue requirement of about $1750, an effective utilization of about 172 TB, and an effective earnings period of 11 months, we compute that the host needs to be earning about 90 cents per TB per month in revenue for an investment to make sense. Accounting for siafund fees, the renter needs to pay about $1.00 per TB per month. And then accounting for 1.5x redundancy, the final cost of redundant storage is about $1.50 per TB per month.
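The final chain of calculations can be sketched as follows. The function name and defaults are ours, and one modeling choice is an assumption: the siafund fee is treated as roughly 10% added on top of what the host earns. Treating it as 10% of what the renter pays would raise the result by about a cent.

```python
def price_per_tb_month(
    annual_revenue: float = 1750.0,  # required revenue per rig per year
    raw_tb: float = 192.0,           # raw capacity of one rig
    utilization: float = 0.90,       # fraction of capacity actually rented
    earning_months: float = 11.0,    # 12 months minus ~1 for collateral lockup
    siafund_fee: float = 0.10,       # assumed to be added on top of host price
    redundancy: float = 1.5,         # 64-of-96 erasure coding overhead
) -> tuple[float, float, float]:
    """Return (host price, renter price, redundant price) in $/TB/month."""
    host = annual_revenue / (raw_tb * utilization * earning_months)
    renter = host * (1.0 + siafund_fee)
    redundant = renter * redundancy
    return host, renter, redundant

if __name__ == "__main__":
    host, renter, redundant = price_per_tb_month()
    print(f"Host earns:        ${host:.2f} per TB per month")
    print(f"Renter pays:       ${renter:.2f} per TB per month")
    print(f"With 1.5x overhead: ${redundant:.2f} per TB per month")
```

Changing any single input, such as the electricity-driven revenue requirement or the utilization rate, shows how sensitive the final price is to that assumption.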

This is possible because the Sia network is designed so that datacenters can take shortcuts while building. The traditional model for the cloud assumes that datacenters need to be ultra high reliability, and often also assumes certain performance requirements on the internal network and servers of the datacenter. Sia is much more relaxed: everything needs to happen on the order of milliseconds as opposed to microseconds, and the network is perfectly comfortable with datacenters and storage rigs that have over a day of downtime per month on average.

More to the point, as a whole the Sia network has been carefully designed to be optimal as a decentralized system. Many of the fundamental design choices for Sia had to be different to achieve decentralization, and we have been able to leverage the strengths and weaknesses of these requirements to create something entirely new, and ultimately far superior to the traditional cloud.

Try out Sia’s personal storage solution at https://sia.tech or Sia’s content publishing and distribution solution at https://siasky.net

Article reprinted from David Vorick
