NFS Storage Recommendations
A common mechanism for provisioning the storage required for Anaconda Enterprise persistence is a Network File System (NFS) server. This includes many cloud offerings, such as Amazon EFS and Google Filestore, and many on-premises NAS/SAN implementations. In this section, we provide specific recommendations for server- and client-side configuration. These recommendations augment the general storage requirements on our install requirements page.
If you are building a new machine to serve as your NFS server:
It should have at least 4 cores and 16GiB of RAM.
Increase the number of threads created by the NFS daemon to at least 64 to reduce the likelihood of contention between multiple users. For information on how to do this, see your operating system documentation; for instance, this Red Hat article.
If possible, use this file server as your administration server as well. This is a great way to manage and administer this persistence. If this is not possible, make sure to export the volume to the administration server as well as the Kubernetes cluster.
If you intend to use the same server for both anaconda-persistence and a second Anaconda Enterprise volume, you should consolidate to a single PersistentVolume, as discussed in the general storage requirements.
The use of premium storage tiers, and SSD-based storage in particular, is strongly recommended.
In many environments, the performance of the volume (e.g., IOPS) is tightly coupled to the size of the disk. For this reason, Anaconda recommends over-provisioning the size of the disk to obtain the higher performance that comes with it. In some environments, IOPS can be provisioned separately, but it can still be cost-effective to over-provision size instead.
Anaconda recommends the use of the
Anaconda recommends against the use of the root_squash option. While a seemingly sensible option for security reasons, in practice we find that it too often leads to unexpected permissions issues. That said, a similar and more reliable option is to use the all_squash option along with anonuid and anongid. This effectively forces all remote access to be translated to the same UID and GID on the server. In summary, in order of preference, Anaconda recommends:
no_root_squash, for maximum administration flexibility, and to allow the containers to utilize GID 0, the Kubernetes default;
all_squash with anonuid and anongid, for a reliable option that avoids UID 0 and GID 0;
root_squash, only if there is no other alternative.
To improve both security and performance, locate the file server on the same private subnet as the Kubernetes cluster, and limit the exports to that subnet.
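As a rough sketch of the server-side recommendations above, the following shell snippet assembles an /etc/exports entry for the all_squash case and lists the commands that would apply it. The export path, subnet, and UID/GID values are hypothetical placeholders, and the privileged commands are left as comments because the exact procedure depends on your distribution.

```shell
#!/bin/sh
# Hypothetical values: substitute your own export path, subnet, and IDs.
EXPORT_PATH=/srv/anaconda
CLIENT_SUBNET=10.0.0.0/24
ANON_UID=1000
ANON_GID=1000

# Map every remote user to a single UID/GID and limit the export to one subnet.
EXPORT_LINE="${EXPORT_PATH} ${CLIENT_SUBNET}(rw,all_squash,anonuid=${ANON_UID},anongid=${ANON_GID})"

# Print the entry so it can be reviewed before it is added to /etc/exports.
echo "${EXPORT_LINE}"

# On the NFS server, as root (not executed by this sketch):
#   echo "${EXPORT_LINE}" >> /etc/exports
#   exportfs -ra                 # re-export everything listed in /etc/exports
#   rpc.nfsd 64                  # raise the running nfsd thread count to 64
#   cat /proc/fs/nfsd/threads    # confirm the current thread count
```

If you choose this option, the GID selected here is the value to use for the volume's group ID in the Kubernetes specifications. Note that rpc.nfsd only changes the running thread count; consult your operating system documentation to make the setting persistent.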
When mounting the NFS share, Anaconda recommends overriding the default read and write block sizes by using the options rsize=65536 and wsize=65536. Smaller block sizes are preferred because creating conda environments frequently involves manipulating thousands of small files, and large block sizes make those operations significantly less efficient.
We also recommend the use of the noatime option. This eliminates the updating of file access times over NFS, further reducing network overhead. Note that file modification times are still preserved.
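For a manual mount outside Kubernetes (for example, on an administration server), the client options above could be combined as in the sketch below. The server name, export path, and mount point are hypothetical; the snippet only prints the mount command so it can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical values: substitute your own server, export path, and mount point.
NFS_SERVER=nfs.example.internal
EXPORT_PATH=/srv/anaconda
MOUNT_POINT=/mnt/anaconda

# Client-side options recommended above.
MOUNT_OPTS="rsize=65536,wsize=65536,noatime"

# Print the command rather than running it; mounting requires root.
echo "mount -t nfs -o ${MOUNT_OPTS} ${NFS_SERVER}:${EXPORT_PATH} ${MOUNT_POINT}"

# After mounting, the options actually in effect can be inspected with:
#   nfsstat -m
#   grep ' nfs' /proc/mounts
```

The server may negotiate smaller block sizes than requested, so it is worth checking the effective values after mounting.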
Persistent Volume specifications
Encapsulating the client recommendations into the PersistentVolume and PersistentVolumeClaim specifications is relatively simple.
Begin with the following template, called (for instance) pv.yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: <NAME>
  annotations:
    pv.beta.kubernetes.io/gid: "<GID>"
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  mountOptions:
    - rsize=65536
    - wsize=65536
    - noatime
  nfs:
    server: <ADDRESS>
    path: <PATH>
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: <NAME>
spec:
  accessModes:
    - ReadWriteMany
  volumeName: <NAME>
  storageClassName: ""
  resources:
    requests:
      storage: 100Gi
Perform the following replacements:
<NAME>: you can give this any name you wish, or adhere to our naming conventions, such as anaconda-persistence. This name will ultimately be supplied to the Helm chart values. Note that <NAME> appears in three places; use the same value for all.
<GID>: this is the group ID which has write access to the volume. As discussed above, the recommended value is 0; but if you are forced to use all_squash, make sure this has the value of the selected GID. The quotes must be preserved.
<ADDRESS>: the FQDN or numeric IP address of the NFS server.
<PATH>: the exported path from the NFS server.
The size entry in both resources (storage: 100Gi) does not need to be changed, even if your volume is (as is likely) significantly larger. All that matters in this case is that the values are the same.
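If you prefer to script the replacements rather than edit the file by hand, one possible approach is a small sed filter such as the sketch below. The function name fill_template, the file name pv-template.yaml, and the server and path values are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical values: substitute your own.
NAME=anaconda-persistence
GID=0
ADDRESS=nfs.example.internal
NFS_PATH=/srv/anaconda

# Replace each placeholder on standard input and write the result to standard output.
# "|" is used as the sed delimiter because the export path contains "/".
fill_template() {
  sed -e "s|<NAME>|${NAME}|g" \
      -e "s|<GID>|${GID}|g" \
      -e "s|<ADDRESS>|${ADDRESS}|g" \
      -e "s|<PATH>|${NFS_PATH}|g"
}

# Small demonstration on two lines of the template.
printf 'name: <NAME>\npv.beta.kubernetes.io/gid: "<GID>"\n' | fill_template

# With the unmodified template saved as pv-template.yaml:
#   fill_template < pv-template.yaml > pv.yaml
```

Because the substitution leaves the surrounding text untouched, the quotes around the GID value are preserved.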
Once this template is properly populated, you can create the resources with the command:
kubectl create -f pv.yaml
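After creating the resources, it can be useful to confirm that the claim has bound to the volume. The helper below is a minimal sketch, assuming kubectl is already configured for the target cluster and namespace; the function name pv_status is hypothetical.

```shell
#!/bin/sh
# Print the phase of a PersistentVolume and of the claim with the same name.
pv_status() {
  name="$1"
  kubectl get pv "$name" -o jsonpath='{.status.phase}{"\n"}'
  kubectl get pvc "$name" -o jsonpath='{.status.phase}{"\n"}'
}

# Example, run after the create command above:
#   pv_status anaconda-persistence
```

Both resources would normally report Bound once the claim has been matched to the volume.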
If you have allocated two different NFS volumes (one for anaconda-persistence and one for a second volume), repeat this template for each.
When creating the Persistent Volume (PV) using NFS as the provider, specify your NFS version under mountOptions to avoid performance issues within your Kubernetes platform. For example:
mountOptions:
  - hard
  - nfsvers=3
  - rsize=65536
  - wsize=65536
  - noatime