EFS in AWS


Overview

In computing, different types of servers serve different purposes: web servers, application servers, file servers, database servers, and so on. A file server is a central machine on the network (reachable over IPv4 or IPv6) that provides a shared location for file access. Amazon EFS (Elastic File System) plays the role of a file server for Linux systems: it supports thousands of concurrent NFS clients (such as EC2 instances) and can grow to petabyte scale.

What is EFS in AWS?

Amazon EFS is a serverless file storage service managed by AWS. Because EFS is serverless, we don't need to provision, scale, or manage a file server ourselves. With EFS, we pay for the amount of data stored rather than for pre-provisioned capacity.

Why do we need EFS?

Imagine the below situation:

A customer has thousands of EC2 instances, and all of them need to share a common data store. Each EC2 instance has its own EBS volume, but an EBS volume lives in a single AZ, and EBS Multi-Attach is supported only for certain volume and instance types and for a small number of instances. We therefore cannot attach a single EBS volume to 1,000 EC2 instances.

In this scenario, Amazon EFS is the better choice because:

  • EFS is a highly scalable and pay-as-you-go storage service.

  • A single EFS file system can be mounted on thousands of EC2 instances at the same time.

  • EFS also provides data durability, security and backup, and other options similar to EBS volumes.

Amazon EFS Features

EFS includes numerous features such as elasticity, encryption, and performance, among others.

We will discuss each feature in detail.

Highly Available

Amazon EFS stores data redundantly across multiple AZs with the Standard storage class and redundantly within a single AZ with the One Zone storage class. With the Standard (Multi-AZ) storage class, the file system is designed to stay available even if one AZ fails, with no manual failover required.

Elastic and Scalable

EFS storage grows (as files are added) and shrinks (as files are removed) dynamically according to the application's needs. It is designed to scale to petabytes while serving parallel access from many EC2 instances simultaneously.

Shared File System with NFS

The connection between EC2 instances and EFS is established over the NFS protocol.

What is the NFS protocol? NFS (Network File System) is a protocol developed in the 1980s that lets multiple devices and users access files over the network. Amazon EFS supports NFSv4 (versions 4.0 and 4.1).

Encryption

In EFS, AWS provides transparent encryption for both data at rest and data in transit. Data at rest is encrypted with a key managed in AWS KMS. Data in transit is encrypted with industry-standard Transport Layer Security (TLS).

Modes

Amazon EFS provides two kinds of modes for tuning performance and latency.

They are:

  1. Performance Mode
  2. Throughput Mode

| No. | Performance modes | Throughput modes |
| --- | --- | --- |
| 1 | General Purpose | Bursting |
| 2 | Max I/O | Provisioned |

Before learning about the types of modes, it helps to know a few storage and networking terms associated with file sharing.

  • IOPS
  • Throughput
  • Burst

IOPS:

IOPS (input/output operations per second) measures the performance of a storage device: how many read and write operations it can complete each second.

Example: On a computer, when we click an application icon, the application takes some time to load and display. The IOPS and latency of the storage device (HDD or SSD) largely determine that time.

Representation: operations per second (latency, by contrast, is measured in milliseconds)

Throughput:

Throughput measures the data transfer rate from source to destination.

Representation: MB/s

Burst:

Burst refers to the time period when data/files are sent at a rate that is faster than the normal transfer rate. Throughput can be increased by using burst.
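These terms are related: throughput is roughly the number of operations per second multiplied by the size of each operation. The following is a minimal Python sketch of that arithmetic. The function name and the sample numbers are illustrative assumptions, not EFS limits.

```python
def throughput_mib_per_s(iops: float, io_size_kib: float) -> float:
    """Approximate throughput as operations per second times operation size."""
    # KiB/s divided by 1024 gives MiB/s
    return iops * io_size_kib / 1024


# Illustrative workload: 1,000 operations per second, 64 KiB each
print(throughput_mib_per_s(1000, 64), "MiB/s")
```

The same workload at a smaller I/O size yields lower throughput for the same IOPS, which is why the two figures are usually quoted together.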

Performance Mode:

  • General Purpose
  • Max I/O

General Purpose

General Purpose mode provides the lower latency of the two performance modes, on the order of a millisecond per read or write operation.

Use case: Web server, CMS, etc.

Max I/O

Max I/O mode scales to higher aggregate throughput and more operations per second, at the cost of slightly higher latency per operation. This mode is intended for thousands of concurrent NFS clients (EC2 instances).

Use case: Big Data, Media Processing

Note: This mode is available only on Amazon EFS file systems using Standard storage classes.

Throughput Mode:

  • Bursting Throughput
  • Provisioned Throughput

Bursting Throughput

In this mode, throughput scales with the size of the file system, and the file system can burst above its baseline for short periods to absorb spikes in file-based workloads.

Use case: Workloads with spiky throughput, such as periodic batch jobs
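To make the relationship between file system size and throughput concrete, here is a small Python sketch. The three rates are assumptions based on figures AWS has published for Bursting mode (50 MiB/s of baseline and 100 MiB/s of burst per TiB stored, with a 100 MiB/s burst floor); they may change, so check the current EFS documentation before relying on them.

```python
# Assumed rates for Bursting mode; verify against the current EFS documentation.
BASELINE_MIB_S_PER_TIB = 50.0
BURST_MIB_S_PER_TIB = 100.0
MIN_BURST_MIB_S = 100.0


def bursting_throughput(size_tib: float) -> tuple:
    """Return (baseline, burst) throughput in MiB/s for a file system size in TiB."""
    baseline = size_tib * BASELINE_MIB_S_PER_TIB
    # Small file systems can still burst up to the assumed floor
    burst = max(MIN_BURST_MIB_S, size_tib * BURST_MIB_S_PER_TIB)
    return baseline, burst


for size in (0.5, 2.0, 10.0):
    baseline, burst = bursting_throughput(size)
    print(f"{size} TiB: baseline {baseline} MiB/s, burst {burst} MiB/s")
```

Under these assumptions, a small file system has a low baseline but can still burst, which is the behavior that makes this mode a fit for spiky workloads.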

Provisioned Throughput

In this mode, throughput is fixed at a specified amount in MB/s regardless of file system size. It is suitable for applications that require dedicated throughput, and the provisioned throughput incurs additional cost.

Amazon EFS Backup Options

Amazon EFS Replication

Amazon EFS Replication is an option provided by AWS to copy file system data from one AWS Region to another, or from one AZ to another, without any infrastructure changes. We can enable EFS Replication using the AWS EFS console, AWS CLI, or APIs.

AWS Backup

AWS Backup is a service that automates the backup process for AWS resources such as EC2, RDS, EFS, DynamoDB, etc. Using AWS Backup, we can automatically back up EFS and apply retention policies from a centrally configured AWS Backup dashboard. We also have the option to enable automatic backups while creating the EFS file system in the console.

Storage Classes and Life Cycle Management

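There are four storage classes, organized by availability (Standard or One Zone) and by access frequency.

| Access type | Frequent access | Infrequent access |
| --- | --- | --- |
| Storage class (Multi-AZ) | Standard | Standard-IA |
| Storage class (single AZ) | One Zone | One Zone-IA |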

Amazon EFS Intelligent-Tiering

Using lifecycle policies, we can automate file transitions from the Standard tier to the Standard-IA tier and from the One Zone tier to the One Zone-IA tier.

IA: Infrequent Access

All we have to do is create two policies.

First policy:

If a file in an IA class is accessed, the policy automatically moves the file/data back to EFS Standard or One Zone.

Second policy:

If a file is not accessed for the configured period, the policy automatically moves the file/data to EFS Standard-IA or One Zone-IA.
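The two policies can be pictured as a simple decision rule. The Python sketch below only simulates that rule; it does not call any AWS API. The 30-day threshold and the function name are assumptions chosen for illustration, since the actual period is whatever the lifecycle policy is configured with.

```python
# Assumption: files idle for this many days move to an IA class.
TRANSITION_TO_IA_DAYS = 30

# Each primary class and its Infrequent Access counterpart
IA_CLASS = {"Standard": "Standard-IA", "One Zone": "One Zone-IA"}
PRIMARY_CLASS = {ia: primary for primary, ia in IA_CLASS.items()}


def next_storage_class(current: str, days_since_last_access: int) -> str:
    """Return the class a file would be in after the policies are applied."""
    # Policy: an idle file in a primary class moves to its IA class
    if current in IA_CLASS and days_since_last_access >= TRANSITION_TO_IA_DAYS:
        return IA_CLASS[current]
    # Policy: a file in an IA class that was just accessed moves back
    if current in PRIMARY_CLASS and days_since_last_access == 0:
        return PRIMARY_CLASS[current]
    return current


print(next_storage_class("Standard", 45))
print(next_storage_class("One Zone-IA", 0))
```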

According to AWS, EFS Intelligent-Tiering can reduce storage costs by up to 92%.

Reference: EFS Intelligent-Tiering

Create an EFS File System and Mount it to Your Instance

For this demo, we need two resources: an EC2 instance running Linux and an EFS file system. To create an EC2 instance with a Linux AMI, see the link below.

How to create an EC2 Linux instance

Create an EFS File System

  • Before creating EFS, we should have our EC2 instance with Linux AMI in a running state to establish a connection between EC2 and EFS.
  • Confirm in the EC2 console that the Linux instance is in the running state.

  • Once logged in to the AWS console, search for EFS in the search bar and click it. In the EFS console, click Create a File System. Enter the options below and click Create.

Name: EFS-Demo
VPC: Select your VPC
Storage Class: Standard

Note: For advanced EFS settings, we can choose to customize the option while creating the EFS.


Mount the File System

  • Once the EFS file system is created, click its ID and go to the Network tab. In the Network tab, select the EFS security group and open it in another browser tab.


In the EFS security group:

Allow rule for inbound traffic

  • Port: 2049
  • Type: NFS (protocol TCP)
  • Source: Private IP of the EC2 instance

Allow rule for outbound traffic

  • Port: All
  • Protocol: All
  • Destination: 0.0.0.0/0

In the EC2 security group:

Allow rule for inbound traffic

  • Port: 2049
  • Type: NFS (protocol TCP)
  • Source: Security group ID of the EFS

Allow rule for outbound traffic

  • Port: All
  • Protocol: All
  • Destination: 0.0.0.0/0
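For readers who script their setup, the inbound rule on the EFS security group can be written down as data. The Python sketch below builds a dictionary in the shape of the EC2 API's IpPermissions structure; it only constructs and validates the rule and does not call AWS. The function name and the sample IP address are hypothetical.

```python
import ipaddress

NFS_PORT = 2049  # NFS traffic between EC2 and EFS uses TCP port 2049


def efs_inbound_rule(instance_private_ip: str) -> dict:
    """Build an inbound NFS rule that allows a single EC2 private IPv4 address."""
    # Raises ValueError if the input is not a valid IPv4 address
    ip = ipaddress.IPv4Address(instance_private_ip)
    return {
        "IpProtocol": "tcp",
        "FromPort": NFS_PORT,
        "ToPort": NFS_PORT,
        "IpRanges": [{"CidrIp": f"{ip}/32"}],
    }


# Hypothetical private IP of the EC2 instance
print(efs_inbound_rule("10.0.1.25"))
```

Restricting the source to a single /32 address mirrors the console setting above; allowing the EC2 instance's security group as the source is a common alternative when many instances need access.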

  • In Linux, we need a directory to serve as the mount point.

Log in to the EC2 machine and create a directory (for example, with mkdir).


Test the File System

  • Once the directory is created, run the mount command for the file system, using the new directory as the mount point.

  • The command is available in the AWS EFS console: go to EFS and click the file system you created. The Attach button is at the top right. Click Attach and choose Mount via IP to get the command.
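The console generates the exact command, so prefer copying it from there. As a rough sketch of what a mount-via-IP command typically looks like, the Python snippet below assembles one from a mount target IP and a local directory. The NFS options are the ones AWS documents for mounting EFS with the standard NFS client; the IP address and directory are hypothetical placeholders.

```python
import shlex

# NFS client options AWS documents for EFS; the console output is authoritative.
NFS_OPTIONS = (
    "nfsvers=4.1,rsize=1048576,wsize=1048576,"
    "hard,timeo=600,retrans=2,noresvport"
)


def efs_mount_command(mount_target_ip: str, mount_dir: str) -> str:
    """Return a shell command that mounts the file system root at mount_dir."""
    args = [
        "sudo", "mount", "-t", "nfs4", "-o", NFS_OPTIONS,
        f"{mount_target_ip}:/", mount_dir,
    ]
    # shlex.join quotes any argument that needs it
    return shlex.join(args)


# Hypothetical mount target IP and mount point
print(efs_mount_command("10.0.1.50", "/mnt/efs"))
```

After running the resulting command on the instance, `df -h` should list the file system at the mount point.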

Troubleshooting EFS and EC2 Connectivity

  • Check the security group in EFS and EC2 for proper connection.

  • In the case of a One Zone Storage tier, check whether EC2 AZ and EFS AZ are the same or not.

  • Check whether EFS and EC2 belong to the same VPC or not.
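A quick way to narrow down a security group or routing problem is to test whether the instance can open a TCP connection to the mount target on port 2049. The following Python sketch, run from the EC2 instance, does only that; the function name is an assumption and the address is a placeholder for your mount target IP.

```python
import socket


def can_reach_nfs(host: str, port: int = 2049, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks
        return False


if __name__ == "__main__":
    # Placeholder: replace with the mount target IP shown in the EFS console
    print(can_reach_nfs("10.0.1.50"))
```

A False result usually points at the security group rules, the VPC, or the AZ, which are the three checks listed above. A True result means the network path is open and the problem is more likely in the mount command itself.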

Clean up

  • Stop the EC2 instance and terminate it if it is no longer needed.

  • In the EFS console, select the file system and click the Delete button in the top right corner.

Pricing

  • EFS pricing is a pay-as-you-go model. We will pay for only the amount of data stored in EFS.

  • EFS pricing varies by Region and storage class. The prices below are based on the following assumptions:

    • * One Zone: 80% of data is infrequently accessed and stored in One Zone-IA

    • ** Standard: 80% of data is infrequently accessed and stored in Standard-IA

| Region | Effective storage price ($/GB-Mo) - One Zone* | Effective storage price ($/GB-Mo) - Standard** |
| --- | --- | --- |
| US East (N. Virginia) | $0.043 | $0.08 |
| Africa (Cape Town) | $0.054 | $0.10 |
| Asia Pacific (Seoul) | $0.047 | $0.09 |
| Europe (Ireland) | $0.046 | $0.09 |
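An effective price like the ones in the table is a weighted average of the primary and IA class prices. The Python sketch below shows the arithmetic for the 80% infrequent-access assumption. The per-class prices are assumed list prices for US East (N. Virginia) used purely for illustration; they are not taken from the table and may be out of date, so check the EFS pricing page for current values.

```python
# Assumed per-class prices in $/GB-month for US East (N. Virginia).
# Illustrative inputs only; verify against the EFS pricing page.
PRICES = {
    "Standard": 0.30,
    "Standard-IA": 0.025,
    "One Zone": 0.16,
    "One Zone-IA": 0.0133,
}


def effective_price(primary: str, ia: str, ia_fraction: float = 0.8) -> float:
    """Blend two class prices, weighting the IA class by ia_fraction."""
    return (1 - ia_fraction) * PRICES[primary] + ia_fraction * PRICES[ia]


standard = effective_price("Standard", "Standard-IA")
one_zone = effective_price("One Zone", "One Zone-IA")
print(f"Standard: ${standard:.3f}/GB-month")
print(f"One Zone: ${one_zone:.3f}/GB-month")
```

With these assumed inputs the results round to $0.080 and $0.043, in line with the US East (N. Virginia) row.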

For more Regions and their pricing, refer to the links below.

EFS Pricing

EFS Estimation Calculator

Conclusion

To conclude,

  • Amazon EFS is a serverless file storage service provided by AWS.

  • Amazon EFS provides security, durability, encryption, and backups for our files/data with a pay-as-you-go model.

  • EFS supports 1000s of parallel EC2 instance connections.

  • Customers can also connect to Amazon EFS from on-premises environments via Direct Connect or Site-to-Site VPN connections.