10.6084/m9.figshare.11929515.v1
Dan Sun
Dan
Sun
Big internet pipe and cloud saved my storage in crisis
eResearch NZ
2020
NeSI
eResearch
eResearch NZ 2020
2020-03-10 03:54:25
Presentation
https://eresearchnz.figshare.com/articles/presentation/Big_internet_pipe_and_cloud_saved_my_storage_in_crisis/11929515
<div>
<div>
<div>
<div>
<p>The current storage solutions at AgResearch are all based on Network Attached Storage
(NAS) technologies. They were simple, quick and cost-effective to deploy. In some instances, it
was even easy to scale up their capacity. However, individual fileservers have become
data silos and we regularly suffer from their limitations. This talk is based on an incident
caused by one of those limitations. It also covers how we recovered from it quickly by utilising
the Cloud, and our thoughts on our future storage platform.
</p><p><br></p>
<p>Over one weekend in early October 2019, an unexpectedly large amount of data was placed on one of
our user-accessible fileservers, pushing its utilisation over 85%. Consequently, its
performance started to degrade. Unfortunately, no other storage in the same physical location had
enough spare capacity to take on this additional load.
</p><p><br></p>
<p>We decided to remove some large datasets which had not been accessed by users for over 2
years to reclaim capacity quickly. At the same time, we had to maintain the same data
protection level (two separate copies of the same data stored in two different locations).
To achieve this, we uploaded a copy of each dataset’s offsite replica to Microsoft
Azure Blob storage before removing the original copy from the server. We also
configured the Cloud storage to automatically migrate data from the Cool tier to the Archive
tier once it has been in the Cloud for 7 days. This significantly reduces the cost of storing
data in the Cloud for the long term, although we acknowledge the additional cost and time
of retrieving such data if that is ever required. We deem the probability of such a retrieval low;
it would only be necessary in a disaster recovery scenario.
</p><p><br></p>
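<p>The Cool-to-Archive rule described above can be expressed as an Azure Blob lifecycle management policy. The following is a minimal sketch rather than the policy we actually deployed: the rule name, the optional prefix filter and the helper function are hypothetical, and it assumes that "7 days in the Cloud" is measured from the blob's last-modified time, which for a freshly uploaded blob is its upload time.</p>

```python
import json


def build_archive_policy(days_in_cool=7, rule_name="cool-to-archive", prefix=None):
    """Build a lifecycle policy document that moves block blobs to the Archive tier.

    rule_name and prefix are illustrative placeholders, not values from the talk.
    """
    filters = {"blobTypes": ["blockBlob"]}
    if prefix:
        # Optionally restrict the rule to one container or path prefix.
        filters["prefixMatch"] = [prefix]

    return {
        "rules": [
            {
                "enabled": True,
                "name": rule_name,
                "type": "Lifecycle",
                "definition": {
                    "actions": {
                        "baseBlob": {
                            # Assumption: age is counted from last modification.
                            "tierToArchive": {
                                "daysAfterModificationGreaterThan": days_in_cool
                            }
                        }
                    },
                    "filters": filters,
                },
            }
        ]
    }


policy = build_archive_policy()
print(json.dumps(policy, indent=2))
```

<p>The printed JSON is the kind of document that can be attached to a storage account as its lifecycle management policy. Blobs in the Archive tier must be rehydrated before they can be read, which is the retrieval overhead acknowledged above.</p>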
<p>We were extremely pleased with the performance of REANNZ’s network while we were
uploading data to Microsoft Azure’s instance in Australia. We were able to upload 2 TB of
data in just over 37 minutes, which translates to roughly 7 Gbps on average over our
10 Gbps WAN link. It took us another 2 hours to remove the dataset from the fileserver
that was running out of capacity. Overall, it took us just under 3 hours to stabilise
this fileserver, which we consider a fairly good outcome. After the initial crisis was over, we
uploaded a further 6 TB of data to the Cloud to reclaim more capacity on the same fileserver. We
plan to use the same approach whenever we encounter similar issues in the short term, until
we are able to replace our current generation of storage solutions.
</p><p><br></p><p>
</p><div>
<div>
<div>
<div>
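<p>As a quick check of the upload figures quoted above (2 TB in just over 37 minutes on a 10 Gbps WAN), the average throughput can be worked out as follows. This is an illustrative calculation only: the function name is hypothetical, the duration is taken as exactly 37 minutes, and because the abstract does not say whether "TB" is decimal or binary, both interpretations are shown.</p>

```python
def average_gbps(bytes_transferred, seconds):
    """Average throughput in gigabits per second (1 Gb = 10**9 bits)."""
    return bytes_transferred * 8 / seconds / 1e9


TB = 10**12   # decimal terabyte
TIB = 2**40   # binary tebibyte

upload_seconds = 37 * 60   # "just over 37 minutes", taken as exactly 37
wan_gbps = 10              # WAN speed quoted in the abstract

decimal_rate = average_gbps(2 * TB, upload_seconds)
binary_rate = average_gbps(2 * TIB, upload_seconds)

print(f"2 TB  in 37 min: {decimal_rate:.2f} Gbps "
      f"({decimal_rate / wan_gbps:.0%} of the link)")
print(f"2 TiB in 37 min: {binary_rate:.2f} Gbps "
      f"({binary_rate / wan_gbps:.0%} of the link)")
```

<p>Either interpretation gives an average between 7 and 8 Gbps, consistent with the figure of roughly 7 Gbps and with a large share of the 10 Gbps link being used for the duration of the upload.</p>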
<p>Almost all of our storage solutions will reach their end of life in the next 12 to 24 months,
and we are currently planning a new generation storage platform to replace them. From the
lessons we have learned to date, we think a scale-out storage solution is a much better fit for
our purposes than NAS appliances or fileservers. Based on our use of the Cloud, we are starting to see the value
of object stores, although we won’t be getting rid of our unstructured data stores (filesystems)
any time soon. Our ambition is to integrate the two through some smart software. We also think
data replication is more practical and appropriate than the traditional backup/restore model
for the volume of data we have to keep. Lastly, the possibility of replicating data to
the Cloud is attractive, particularly to low-cost archival storage, but its high retrieval
overhead (both time and cost) is a risk that needs to be further investigated and mitigated.</p><p><br></p>
<div>
<div>
<div>
<div>
<p><b><u>ABOUT THE AUTHOR</u></b></p>
</div>
</div>
</div>
</div><p>Dan currently works for AgResearch as an HPC consultant and maintains a smallish Linux
cluster and its storage. He is passionate about helping researchers do science using
advanced technologies. When he is not firefighting at work, he enjoys barista-made
coffee, fancy burgers and donuts with his collaborators and friends.</p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>