AWS Elasticsearch

Overview

Amazon Elasticsearch is a fully managed service for hosting Elasticsearch. It handles the deployment and operation of Elasticsearch and lets us quickly scale our instances up and down to match our software or business requirements. It also delivers the monitoring, performance, and security that we rely on across all AWS services. Rather than maintaining servers and instances ourselves, we let the service automate most of the low-level configuration.

What is Elasticsearch and How Does It Work?

Elasticsearch is an open-source search engine written in Java and designed for distributed or multi-tenant environments. It’s built for scalability while still offering speed and flexibility for indexing and searching data. Given this flexibility, Elasticsearch has a wide variety of use cases, from storing analytics data and logs to more general search purposes, and from inventory data to full-text document search.

Benefits of Elasticsearch:

  • Open source
  • Fast time to value
  • Easy ingestion
  • Easy visualization
  • High performance and distributed
  • Powerful analytics and search

Working:

Raw data flows into Elasticsearch from a variety of sources, including logs, system metrics, and web applications. Data ingestion is the process by which this raw data is parsed, normalized, and enriched before it is indexed in Elasticsearch. Once indexed in Elasticsearch, users can run complex queries against their data and use aggregations to retrieve complex summaries of their data. From Kibana, users can create powerful visualizations of their data, share dashboards, and manage the Elastic Stack.
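To make the query and aggregation step more concrete, here is a minimal sketch of a search request body that filters documents and summarizes the matches per group. The field names (message, host.keyword) and the helper name are hypothetical and only serve as an illustration.

```python
import json


def build_summary_query(term: str, text_field: str, group_field: str) -> dict:
    """Build a search body that matches `term` and counts the matches per group."""
    return {
        # Full-text match on one field of the indexed documents.
        "query": {"match": {text_field: term}},
        # Terms aggregation: one bucket per distinct value of `group_field`.
        "aggs": {"per_group": {"terms": {"field": group_field}}},
        # Return only the aggregation summary, not the individual hits.
        "size": 0,
    }


body = build_summary_query("error", "message", "host.keyword")
print(json.dumps(body, indent=2))
```

A body like this would be sent to the `_search` endpoint of an index. Setting `size` to 0 is a common choice when only the summary is of interest.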

Concept of AWS Elasticsearch

AWS Elasticsearch is a fully managed service that makes it easy to deploy, secure, and manage Elasticsearch clusters at petabyte scale. It takes care of hardware provisioning, software installation and patching, failure recovery, backups, and monitoring.

The AWS Elasticsearch service supports the open-source Elasticsearch APIs and integrates seamlessly with popular data ingestion and visualization tools such as Logstash and Kibana, as well as with other AWS services, so you can use your existing code and tools to extract insights quickly and securely.

The AWS Elasticsearch concepts are as follows:

  • An AWS Elasticsearch domain is synonymous with an Elasticsearch cluster. A domain wraps the hardware and software needed to run an Elasticsearch cluster. You can deploy a domain through the AWS SDK, the CLI, or CloudFormation.

  • A domain is a cluster that runs multiple compute nodes, such as data nodes (a single domain can contain many data nodes). Data nodes are virtual machines that store data and execute query operations.

  • A domain also has master nodes, which are responsible for orchestrating the cluster and managing the data nodes. Dedicated master nodes do not store data or serve queries; they improve the stability of the domain.

  • An AWS Elasticsearch index is a collection of related documents. Each index is divided into shards. A shard holds a subset of the documents in the index, and shards do not overlap: every document belongs to exactly one shard.
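The non-overlapping nature of shards can be illustrated with a simplified routing sketch: each document ID is hashed and mapped to exactly one shard. This is only an illustration. CRC32 stands in for the hash function that Elasticsearch uses internally, and the function names are hypothetical.

```python
import zlib


def pick_shard(doc_id: str, num_primary_shards: int) -> int:
    """Map a document ID to one shard number (CRC32 is a stand-in hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


def distribute(doc_ids, num_primary_shards: int) -> dict:
    """Group document IDs by the shard they are routed to."""
    shards = {number: [] for number in range(num_primary_shards)}
    for doc_id in doc_ids:
        shards[pick_shard(doc_id, num_primary_shards)].append(doc_id)
    return shards


doc_ids = [f"book-{i}" for i in range(10)]
shards = distribute(doc_ids, 3)
for number, members in shards.items():
    print(number, members)
```

Because the shard is a pure function of the ID, a document can never land in two shards, which is the property the text describes.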

  • Elasticsearch holds data in the form of JSON documents. Each document associates a set of keys (field or property names) with their associated values (strings, numbers, Booleans, dates, arrays of values, geolocations, or other types of data).
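As an illustration of such a document, the sketch below builds a JSON document for a book with string, number, Boolean, date, array, and geolocation values. All field names and values are invented for the example.

```python
import json
from datetime import date

# A hypothetical book document; every key and value is illustrative.
book = {
    "title": "Example Title",                 # string
    "author": "Jane Doe",                     # string
    "pages": 320,                             # number
    "in_print": True,                         # Boolean
    "published": date(2019, 5, 1).isoformat(),  # date as an ISO 8601 string
    "tags": ["technical", "search"],          # array of values
    "warehouse": {"lat": 47.6, "lon": -122.3},  # geolocation as lat/lon
}

document = json.dumps(book)
print(document)
```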

Figure: Deployment of indices to a cluster

  • Amazon Elasticsearch uses a blue/green deployment strategy when it applies changes to a domain. Shards come in two flavours: primary shards, which hold the first copy of each shard's data, and replica shards, whose number per primary you can set dynamically from 1 to n (in the example above, instances 1 and 2 hold a single primary shard and a single replica shard).

  • If you take no action on a required update within a specified period, AWS updates the service software automatically.
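The relationship between primary and replica shards can be sketched as an index settings body plus a small calculation of the total shard count. The helper names are hypothetical; `number_of_shards` and `number_of_replicas` are the standard Elasticsearch index settings.

```python
def index_settings(primaries: int, replicas: int) -> dict:
    """Build the settings body used when creating an index."""
    if primaries < 1 or replicas < 0:
        raise ValueError("need at least one primary and a non-negative replica count")
    return {
        "settings": {
            "index": {
                "number_of_shards": primaries,   # fixed when the index is created
                "number_of_replicas": replicas,  # can be changed later
            }
        }
    }


def total_shards(primaries: int, replicas: int) -> int:
    """Each primary shard has `replicas` copies, so the cluster hosts both kinds."""
    return primaries * (1 + replicas)


print(index_settings(1, 1))
print(total_shards(1, 1))
```

With one primary and one replica, the cluster hosts two shard copies in total, matching the single primary and single replica in the example.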

AWS Elasticsearch Architecture

  • We can use the SDK, CLI, Console, or CloudFormation to deploy an Elasticsearch service cluster. Because Elasticsearch is a clustered technology, the deployment consists of a collection of data and master nodes.

  • Inside the VPC, the cluster consists of the data and master nodes. A load balancer sits in front of the cluster to spread the load across its nodes.

  • In front of Elastic Load Balancing, AWS IAM provides security for the cluster. Monitoring information is sent to CloudTrail and CloudWatch so that you can track the performance of the cluster and make sure that everything is running well.

Features of AWS Elasticsearch

AWS Elasticsearch provides various features, some of which are listed here:

  • Easy to Use

AWS Elasticsearch is a managed service, which makes it simple to use. You spend less time on monitoring, software patching, backup, and failure recovery. AWS Elasticsearch users can deploy a production-ready Elasticsearch cluster in minutes.

  • Open-Source APIs

AWS Elasticsearch provides direct access to the open-source Elasticsearch APIs, with no new software or programming skills required. It is compatible with Logstash, an open-source data ingestion tool, and also supports Kibana, a data visualization tool.

  • Secure

AWS Elasticsearch offers several layers of security. It is straightforward to set up secure access to Amazon Elasticsearch Service from your VPC (Virtual Private Cloud). AWS IAM policies and Amazon Cognito help with authentication and access control management. Using Amazon VPC, users can isolate their Elasticsearch data at the network level.

  • Tightly Integrated with other AWS Services

AWS Elasticsearch integrates with Amazon Kinesis Firehose, AWS IoT, and Amazon CloudWatch Logs for seamless data ingestion, with AWS CloudTrail for auditing, with AWS Identity and Access Management (IAM) for security, and with AWS CloudFormation for cloud orchestration.

  • Scalable

Scale clusters from a single node up to 20 nodes. Configure clusters to meet performance requirements by selecting from a range of instance types and storage options, including SSD-powered EBS volumes.

Getting Started with AWS Elasticsearch

Amazon Elasticsearch Service is an AWS-managed service that makes it easier to set up, run, and scale Elasticsearch clusters in the cloud, and it gives us direct access to the Elasticsearch APIs. To begin using AWS Elasticsearch, follow these steps:

  1. Create an AWS account.
  2. Set up an Amazon ES domain.
  3. Submit data to the Amazon ES domain for indexing.
  4. Search for documents in the Amazon ES domain.
  5. Delete the Amazon ES domain.

First and foremost, to get started with AWS, we must register an account on AWS services.

Step 1: Sign up for an AWS Account

Register with AWS to create a new account: go to the AWS website and choose the Create an AWS Account option in the upper right corner. If you already have an AWS account, log in instead.

Step 2: Create an Amazon ES domain

An Amazon ES domain and an Elasticsearch cluster are the same thing. After you've set up your AWS account, you're ready to set up an Amazon Elasticsearch domain. In this step, we will create an Amazon ES domain called books. Setting up and running the Elasticsearch service domain involves the following stages:

  1. Define your domain
  2. Configure your domain
  3. Set up the access policy
  4. Review

The procedures to create an Amazon ES domain are outlined below.

Define your domain

  • Sign in to your AWS account with your credentials. Go to the Analytics section and click Elasticsearch Service to open the Elasticsearch Service page.
  • Click the Create a new domain option, then select Development and testing.
  • You must pick the Deployment type and the Elasticsearch version here. Elasticsearch 7.4.0 is the most recent version at the time of writing, and it is the one we use.

Configure your domain

  • Enter the domain name (for example, books) and select the Instance type from the drop-down box.
  • Use the default value for data node storage and set the number of instances to 1.
  • For the instance type, we select small.elasticsearch, which is eligible for the free tier. Ignore the other fields and click Next to go to the Set up access page.

Set up access policy

  • To access this domain, we must grant the necessary permissions, so access is set up on this page.
  • For simplicity, we recommend choosing Public access for this tutorial. However, you can restrict access to a VPC or an IAM role so that only a limited set of users can reach your Elasticsearch cluster.
  • For the time being, leave the Amazon Cognito Authentication configuration alone.
  • Under Access policy, select a template to set the domain access policy. For this tutorial, select the Allow open access to the domain option.
  • Leave the encryption option at its default value and click Next.
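An open access policy corresponds roughly to a resource-based policy document like the one sketched below. The region, the account ID, and the helper name are placeholders chosen for illustration, and a policy this permissive is only suitable for short-lived test domains.

```python
import json


def open_access_policy(region: str, account_id: str, domain_name: str) -> dict:
    """Build a resource-based policy that allows any principal to call the domain."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": "*"},  # anyone: acceptable for a test domain only
                "Action": "es:*",
                "Resource": f"arn:aws:es:{region}:{account_id}:domain/{domain_name}/*",
            }
        ],
    }


# Placeholder region and account ID, used only for the example.
policy = open_access_policy("us-east-1", "123456789012", "books")
print(json.dumps(policy, indent=2))
```

To restrict access, the `Principal` could instead name a specific IAM role ARN, which matches the IAM-based option mentioned in the list.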

Review

Review is the final stage of the domain creation process. The review page displays the settings you have configured so that you can check them before finalizing.

  • Double-check your settings and click Confirm. It takes about 10-15 minutes to create and initialize a new domain (cluster), and possibly longer depending on the settings.

When these processes are done, you will receive the notification, "You have successfully built an Elasticsearch domain."

Your ES domain will now be operational. The domain status will be set to Active, and the cluster health will be green.
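The same domain could also be created through the AWS SDK instead of the console. The sketch below assumes the boto3 library and its `es` client. The instance type, volume type, and volume size are assumptions standing in for the console defaults, and `create_domain` is a hypothetical helper name.

```python
def domain_request(name: str) -> dict:
    """Collect the choices from the console walkthrough into API parameters."""
    return {
        "DomainName": name,
        "ElasticsearchVersion": "7.4",
        "ElasticsearchClusterConfig": {
            "InstanceType": "t2.small.elasticsearch",  # assumed free-tier instance type
            "InstanceCount": 1,
        },
        # Assumed storage values; check the console defaults for your account.
        "EBSOptions": {"EBSEnabled": True, "VolumeType": "gp2", "VolumeSize": 10},
    }


def create_domain(client, name: str) -> str:
    """Create the domain and return its ARN. `client` is a boto3 "es" client."""
    response = client.create_elasticsearch_domain(**domain_request(name))
    return response["DomainStatus"]["ARN"]


print(domain_request("books"))
```

With boto3 installed and credentials configured, `create_domain(boto3.client("es", region_name="us-east-1"), "books")` would submit the request. The region here is only an example.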

Step 3: Uploading Data for Indexing

The next step is to upload data for indexing. We can upload data to the Amazon ES Service domain using the command line or a programming language. In this step, we will submit a small amount of test data.

Use the command line to upload a single document

To upload a single document to the Amazon ES domain, use the command below from the command line.
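As a rough Python counterpart to this step, the sketch below builds the HTTP request that indexes one document. The endpoint URL, the credentials, and the document fields are placeholders, and HTTP basic authentication with a master user is an assumption about how the domain is secured.

```python
import base64
import json
import urllib.request


def build_index_request(endpoint, index, doc_id, document, user, password):
    """Build a PUT request that stores `document` under `index`/_doc/`doc_id`."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"{endpoint}/{index}/_doc/{doc_id}",
        data=json.dumps(document).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


def send(request):
    """Send the request and decode the JSON response (needs network access)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder endpoint and credentials; replace them with your domain's values.
request = build_index_request(
    "https://search-books-example.us-east-1.es.amazonaws.com",
    "books", "1", {"title": "Example Title", "author": "Jane Doe"},
    "master-user", "master-password",
)
print(request.get_method(), request.full_url)
```

Calling `send(request)` against a real domain would perform the upload. The example stops at building the request so that it runs without a network connection.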

Submit a JSON file with numerous documents

  1. To begin, build a JSON file that contains the documents.
  2. Next, upload the JSON file to the books domain.
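A file with several documents is typically written in the newline-delimited format expected by the Elasticsearch `_bulk` API, where each document is preceded by an action line. The sketch below produces such a body. The sample documents and the helper name are invented for illustration.

```python
import json


def to_bulk_body(index: str, documents: dict) -> str:
    """Render {doc_id: document} as newline-delimited JSON for the _bulk API."""
    lines = []
    for doc_id, document in documents.items():
        # Action line: tells Elasticsearch where to index the next line.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(document))
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"


# Hypothetical sample documents.
sample_books = {
    "1": {"title": "Example Title One", "author": "Jane Doe"},
    "2": {"title": "Example Title Two", "author": "John Roe"},
}
bulk_body = to_bulk_body("books", sample_books)
print(bulk_body, end="")
```

The resulting text would be sent with a POST request to the `_bulk` endpoint of the domain, using the `application/x-ndjson` content type.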

Step 4: Searching Documents in the Amazon ES Domain

The Elasticsearch Search APIs let users search for documents within the Amazon Elasticsearch Service domain. Alternatively, you can use Kibana (a data visualization tool) to search for documents in the domain. Search is the most essential operation in Elasticsearch. When there is a large quantity of data, searching it with a specific query string is an efficient way to find what you need.

We will use the example below to search for technical books within the books domain.

Using the command line to search for a document

To search the domain you've created, use the command below from the command line.
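A query-string search like this can be sketched in Python by building the `_search` URL for the index. The endpoint is a placeholder and the helper name is hypothetical; `q` and `pretty` are standard Elasticsearch URI search parameters.

```python
import urllib.parse


def build_search_url(endpoint: str, index: str, query: str, pretty: bool = True) -> str:
    """Build a URI search URL of the form <endpoint>/<index>/_search?q=<query>."""
    params = {"q": query}
    if pretty:
        params["pretty"] = "true"  # ask for human-readable JSON in the response
    return f"{endpoint}/{index}/_search?{urllib.parse.urlencode(params)}"


# Placeholder endpoint; replace it with the endpoint of your own domain.
url = build_search_url(
    "https://search-books-example.us-east-1.es.amazonaws.com", "books", "technical"
)
print(url)
```

Sending a GET request to this URL, with whatever credentials the domain requires, returns the matching documents together with their `_score` values.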

Using Kibana to search for a document

  1. Open your browser and go to the Kibana plugin for your Amazon ES domain. The Kibana endpoint appears on your domain dashboard in the Amazon ES console. The URL format will be as follows:
  2. Enter your master username and password to access the console.
  3. Kibana requires at least one index pattern, because it uses these patterns to determine which indices to analyze. Because our domain is books, enter books for this lesson and then click Create.
  4. The Index Pattern page displays the document fields, such as book name, author, and publisher. For now, select Discover to search your data.
  5. Type Mars into the search field and hit Enter. Note how the relevance score (_score) increases when you search for the phrase mars assaults.

Step 5: Delete an Amazon ES domain

In step 2, we set up an Amazon ES domain called books. This domain is just for testing purposes, so in this step we will delete it. To delete an Amazon ES domain, take the following steps:

  • Sign in to the Amazon Elasticsearch Service console with your username and password. Select the books domain under My Domains in the navigation pane.
  • Next, choose Action and then Delete domain. Finally, check the Delete Domain option and choose Delete.
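The deletion can also be scripted. The sketch below assumes the boto3 `es` client and its `delete_elasticsearch_domain` call; `delete_domain` is a hypothetical helper, and the client is passed in so that the function can be exercised without AWS access.

```python
def delete_domain(client, name: str) -> bool:
    """Request deletion of the domain and report whether AWS marked it as deleted.

    `client` is expected to be a boto3 "es" client (or any object with the same
    delete_elasticsearch_domain method).
    """
    response = client.delete_elasticsearch_domain(DomainName=name)
    return bool(response["DomainStatus"]["Deleted"])
```

With boto3 installed and credentials configured, `delete_domain(boto3.client("es"), "books")` would issue the request. Deletion is permanent, so the domain name deserves a second look before running it.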

Use Cases of AWS Elasticsearch

  • Log Analytics

Analyze unstructured and semi-structured logs generated by websites, mobile devices, servers, sensors, and more for a wide variety of applications such as digital marketing, operational intelligence, fraud detection, ad tech, gaming, and IoT.

  • Full-Text Search

Provide a highly performant, rich search and navigation experience over a diverse set of documents, with support for features such as text matching, faceting, filtering, fuzzy search, auto-complete, and highlighting.

  • Distributed Document Store

Power your application with an easy-to-use, highly performant JSON document-oriented storage platform that can store and retrieve billions of documents, with integrated replication across Availability Zones.

  • Real-Time Application Monitoring

Capture activity logs across your customer-facing applications and websites by indexing data for analysis in near real-time (less than one second), then visualize it and perform statistical aggregations to identify root causes and fix issues.

  • Click-Stream Analytics

Deliver real-time metrics on digital content and enable authors and marketers to connect with their customers. Stream billions of small messages into Elasticsearch, then aggregate, filter, and process the data to power content performance dashboards.

AWS Elasticsearch Pricing

  • With the AWS Elasticsearch Service, you pay only for what you use. There are no long-term commitments, minimum fees, or up-front costs.

  • The cost is determined by the number of Amazon Elasticsearch instance hours used, the amount of Amazon EBS storage, and standard data transfer charges.

  • The free tier is a good way to get started: it allows up to 750 hours per month of a single-AZ t2.micro.elasticsearch or t2.small.elasticsearch instance.

  • The free tier also includes 10 GB per month of optional Amazon EBS storage (Magnetic or General Purpose).
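The pricing components above can be combined into a simple monthly estimate. This is a sketch only. The rates in the example call are made-up numbers rather than AWS prices, and the function name is hypothetical; check the current price list for real figures.

```python
def monthly_cost(instance_hours, hourly_rate, ebs_gb, ebs_rate_per_gb,
                 transfer_gb=0.0, transfer_rate_per_gb=0.0,
                 free_hours=0, free_ebs_gb=0):
    """Estimate a monthly bill from instance hours, EBS storage, and data transfer."""
    # Free-tier allowances reduce the billable quantities but never below zero.
    billable_hours = max(instance_hours - free_hours, 0)
    billable_gb = max(ebs_gb - free_ebs_gb, 0)
    return (billable_hours * hourly_rate
            + billable_gb * ebs_rate_per_gb
            + transfer_gb * transfer_rate_per_gb)


# Made-up rates, for illustration only.
print(monthly_cost(instance_hours=100, hourly_rate=0.5, ebs_gb=20, ebs_rate_per_gb=0.25))

# One instance running all month within the free-tier allowances.
print(monthly_cost(744, 0.5, 10, 0.25, free_hours=750, free_ebs_gb=10))
```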

Benefits and Limitations of AWS Elasticsearch

  • Simple to Use

You can quickly and simply deploy an Elasticsearch cluster using AWS Elasticsearch. You do not need to worry about installing, provisioning, or maintaining Elasticsearch software. The service is fully managed, which saves time on failure recovery, backup, software patching, and monitoring.

  • Compatibility with Open Source APIs and Tools

It gives you direct access to the open-source Elasticsearch API without requiring new software or programming expertise. It supports Logstash, an open-source tool for data ingestion, transformation, and loading. It also works with Kibana, an open-source visualization tool.

  • Secure

It is simple to set up secure access to Amazon Elasticsearch Service from your VPC, which keeps traffic between the VPC and the service within the AWS network. The service also applies security patches and keeps the domain up to date at regular intervals.

  • Scalability

You can easily monitor your clusters with Amazon CloudWatch metrics. You can also resize a cluster with a few clicks in the AWS Management Console or with a single API request.
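Resizing through a single API request could look like the sketch below. It assumes the boto3 `es` client and its `update_elasticsearch_domain_config` call. `resize_domain` is a hypothetical helper, and the 1 to 20 node range mirrors the limit quoted earlier in this article.

```python
def resize_domain(client, name: str, instance_count: int) -> int:
    """Change the number of data nodes and return the count AWS reports back.

    `client` is expected to be a boto3 "es" client (or a compatible object).
    """
    if not 1 <= instance_count <= 20:
        raise ValueError("instance_count must be between 1 and 20")
    response = client.update_elasticsearch_domain_config(
        DomainName=name,
        ElasticsearchClusterConfig={"InstanceCount": instance_count},
    )
    return response["DomainConfig"]["ElasticsearchClusterConfig"]["Options"]["InstanceCount"]
```

With boto3 installed and credentials configured, `resize_domain(boto3.client("es"), "books", 3)` would request three data nodes.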

AWS Elasticsearch has a few drawbacks in addition to its many benefits, and they are as follows:

  • You can launch a domain with either a public endpoint or VPC access, but you cannot combine the two on the same domain.

  • The AWS Elasticsearch free tier lasts only 12 months, so the service is not free indefinitely. You must pay to use it beyond the first year after signing up.

Conclusion

  • Elasticsearch is a Java-based open-source search engine developed for distributed or multi-tenant applications. It is designed for scalability while maintaining speed and flexibility in indexing and searching data.

  • Amazon Elasticsearch Service is fully managed, which saves time on failure recovery, backup, software patching, and monitoring.

  • AWS Elasticsearch gives direct access to the open-source Elasticsearch API without requiring new software or programming expertise. It also works with Kibana, an open-source visualization tool.

  • You can quickly set up secure access to Amazon Elasticsearch Service from your VPC, keeping traffic between the VPC and the service within the AWS network.

  • AWS Elasticsearch has built-in integrations with AWS services such as Kinesis Firehose, Amazon CloudWatch Logs, and AWS IoT for easy data ingestion.

  • You can easily monitor your clusters using Amazon CloudWatch metrics and resize a cluster with a few clicks in the AWS Management Console or with a single API request.