Kinesis OpenSearch

Overview

With AWS OpenSearch Service, users can search, analyze, visualize, and secure up to petabytes of text and unstructured data. Users pay only for what they use. Charges are based on three factors: instance hours, storage, and data transferred into and out of the service. Storage pricing depends on the storage tier and instance type selected.

Introduction to Amazon OpenSearch Service

AWS OpenSearch Service is a fully managed service that simplifies deploying, operating, and scaling OpenSearch clusters in the AWS Cloud. It supports OpenSearch and legacy Elasticsearch OSS (up to 7.10, the final open-source version of the software). Users choose which search engine to run when they create a cluster.

OpenSearch is a fully open-source search and analytics engine that can be used for log analytics, real-time application monitoring, and clickstream analytics.

AWS OpenSearch Service provisions and deploys the cluster's resources. It also detects and replaces failed OpenSearch Service nodes automatically, reducing the overhead of self-managed infrastructure. Users can scale a cluster with a single API call.

To start using OpenSearch Service, users first create an OpenSearch Service domain, which is equivalent to an OpenSearch cluster. Each EC2 instance in the cluster acts as one OpenSearch Service node.

Users can set up and configure a domain in minutes using the OpenSearch Service console. For programmatic access, the AWS CLI and the AWS SDKs are available.

Features of Amazon OpenSearch Service

Security

Authorization and Authentication

Give your users secure access using the authentication and authorization methods of your choice, such as native SAML support, Amazon Cognito, and AWS IAM.

Encryption

Using AES-256 encryption with keys managed through AWS KMS, users can protect their data from unauthorized access by enabling encryption at rest for data on disk, system logs, and automated snapshots. TLS 1.2 encrypts data in transit between nodes.

Granular Access Control

AWS gives users a controlled and reliable way to access business data and monitor cluster settings, with additional access control tools such as AWS IAM policies and fine-grained access control.

Network Isolations and Access Management

Manage the domain's perimeter by using AWS identity-based and resource-based policies to tie identities and resources to specific allow or deny actions. Use an Amazon VPC and security groups to establish a logically isolated network that admits traffic only from known entities.

Audit Documentation and Compliance

Track domain changes, monitor user activity, and audit data queries, including specific connection details. Monitor use of the configuration APIs and access to the data with AWS CloudTrail and OpenSearch audit logs.

Security Patches and Upgrades

Protect your data against security vulnerabilities. OpenSearch Service delivers backward-compatible security patches and updates for all supported versions of OpenSearch and Elasticsearch, reducing the need for version upgrades.

Security of Indexes, Documents, and Fields

Advanced security controls protect access to sensitive or confidential data. To restrict access to specific indexes, documents, or fields, use index-level, document-level, or field-level security.

Secure Programmatic Access

Use SigV4-signed requests, produced by the AWS SDKs or the AWS CLI, to communicate with the OpenSearch domain.
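To make the signing step more concrete, the sketch below derives a SigV4 signing key and signs a string using only the Python standard library. In practice the AWS SDKs and the AWS CLI handle this automatically. The secret key, date, and Region are placeholder values, and `es` is assumed here to be the signing service name for OpenSearch Service domains.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """Return the raw HMAC-SHA256 digest of message under key."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str,
                       service: str = "es") -> bytes:
    """Derive a SigV4 signing key by chaining four HMAC operations.

    date_stamp uses the YYYYMMDD format. The default service name "es"
    is an assumption for OpenSearch Service domains.
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature that goes into the Authorization header."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()


# Placeholder credentials for illustration only.
example_key = derive_signing_key("EXAMPLE-SECRET-KEY", "20240101", "us-east-1")
example_signature = sign(example_key, "AWS4-HMAC-SHA256\n(placeholder string to sign)")
```

Building the canonical request and the full string to sign is omitted; an SDK or a signing library is the safer route for real traffic.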

Facilitate Compliance and Governance

Meet your organization's stringent compliance and governance requirements. AWS OpenSearch Service is compliant with various industry standards, including HIPAA, FedRAMP, DoD CC SRG, SOC, PCI, ISO, CSA STAR, and FIPS 140-2.

Stability

  • Multiple geographic locations for your resources, known as Regions and Availability Zones (AZs).
  • Node allocation across two or three AZs in the same AWS Region, known as Multi-AZ.
  • Dedicated master nodes to offload cluster management tasks.
  • Simplified snapshot backups and restoration of the OpenSearch Service domain.

Flexibility

  • SQL access for business intelligence (BI) apps
  • Custom packages to improve search results

Setup and Configuration

AWS OpenSearch Service is simple to use. Users can configure and manage their clusters through the AWS Management Console or the AWS CLI. At any time, users can change the number of instances, the instance type, and the storage configuration, and modify or delete existing clusters.

Event Tracking and Alerting

AWS OpenSearch Service includes event tracking and alerting, allowing users to monitor the data in their clusters and send notifications based on predefined thresholds. This feature, built on the OpenSearch alerting plugin, lets users configure and manage alerts through the Kibana or OpenSearch Dashboards interface and the REST API.

AWS Kinesis Data Firehose Data Ingestion

Without having to build their own processing pipeline, users can transform raw streaming data from their data sources into the format required by their Elasticsearch or OpenSearch indexes and load it into AWS OpenSearch Service using AWS Kinesis Firehose.

To use this capability, select a Lambda function on the AWS Kinesis Firehose delivery stream configuration page in the AWS Management Console. AWS Kinesis Firehose then applies the Lambda function to each incoming data record and loads the transformed data into the AWS OpenSearch Service index.

AWS Kinesis Firehose offers pre-built Lambda blueprints, usable as-is or customized, for converting common data sources such as Apache logs and system logs to JSON and CSV formats. AWS Kinesis Firehose can also be configured to automatically retry failed jobs and back up the raw streaming data.
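A transformation function of the kind described above might look like the following sketch. It assumes the usual Firehose transformation contract (base64-encoded records in, records with a `recordId`, `result`, and base64-encoded `data` out) and parses Apache Common Log Format lines into JSON. The regular expression and the output field names are illustrative choices, not one of the official blueprints.

```python
import base64
import json
import re

# Apache Common Log Format, for example:
# 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


def lambda_handler(event, context):
    """Convert each incoming log line to a JSON document for indexing."""
    output = []
    for record in event["records"]:
        line = base64.b64decode(record["data"]).decode("utf-8").strip()
        match = LOG_PATTERN.match(line)
        if match is None:
            # Return unparseable records unchanged and flag them as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
            continue
        document = match.groupdict()
        document["status"] = int(document["status"])
        payload = json.dumps(document).encode("utf-8")
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(payload).decode("utf-8"),
        })
    return {"records": output}
```

Records marked `ProcessingFailed` are the ones the retry and backup settings mentioned above would deal with.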

Logstash Data Ingestion

AWS OpenSearch Service supports Logstash, an open-source data pipeline that takes data from a source, transforms it, and loads it into Elasticsearch or OpenSearch. Logstash can be deployed on EC2 and configured to use an AWS OpenSearch Service domain as the backend store for all logs that pass through the Logstash deployment.

Ingestion of Data via AWS CloudWatch Logs

AWS CloudWatch Logs lets users monitor and troubleshoot their systems and applications using existing system, application, and custom log files. A CloudWatch Logs log group can be configured to stream data to an AWS OpenSearch Service domain.

AWS IoT Data Ingestion

AWS IoT is a fully managed platform that lets connected devices interact with cloud applications and other devices easily and securely. Users can use AWS IoT to collect data from connected devices such as household appliances, embedded sensors, and TV set-top boxes.

Other Features

Deployment and Management

  • Setup and configuration: Getting started with Amazon OpenSearch Service is easy. You can set up and configure your Amazon OpenSearch Service cluster using the AWS Management Console or a single API call through the AWS Command Line Interface (CLI). You can specify the number of instances, instance types, and storage options, and modify or delete existing clusters at any time.
  • In-place upgrades: Amazon OpenSearch Service makes it easy to upgrade your OpenSearch and Elasticsearch clusters (up to version 7.10) to newer versions without any downtime, using in-place version upgrades. In-place upgrades eliminate the hassle of taking a manual snapshot, restoring it to a cluster running the newer version, and updating all your endpoint references.

Event Monitoring and Alerting

Amazon OpenSearch Service provides built-in event monitoring and alerting, enabling you to monitor the data stored in your cluster and automatically send notifications based on pre-configured thresholds. Built using the OpenSearch alerting plugin, this feature lets you configure and manage alerts using your Kibana or OpenSearch Dashboards interface and the REST API. You can receive notifications via custom webhooks, Slack, Amazon Simple Notification Service (SNS), and Amazon Chime. You can also view cluster health metrics including the number of instances, cluster health, searchable documents, CPU, and memory, as well as disk utilization for data and master nodes through Amazon CloudWatch, at no additional charge.

Support for Multiple Query Languages

With Amazon OpenSearch Service, there’s no need for OpenSearch query domain-specific language (DSL) proficiency. Write SQL queries with OpenSearch SQL or use the OpenSearch Piped Processing Language (PPL), a query language that lets you use pipe (|) syntax, to explore, discover, and query your data. OpenSearch Dashboards also includes a SQL and PPL workbench.

Integration with Open Source Tools

Amazon OpenSearch Service offers built-in OpenSearch Dashboards and Kibana (Elasticsearch version 7.10 and previous) and integrates with Logstash, so you can ingest and visualize your data using the open source tools you prefer. Perform trace analytics with Amazon OpenSearch Service’s support for the open source OpenTelemetry standard and continue to use your existing code with direct access to Elasticsearch APIs and plugins such as Kuromoji, Phonetic Analysis, Ingest Processor Attachment, Ingest User Agent Processor, and Mapper Murmur3.
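As an illustration of the SQL and PPL support described earlier, the sketch below builds the request URL and JSON body for each language. It assumes the `_plugins/_sql` and `_plugins/_ppl` endpoints of the OpenSearch SQL plugin. The `movies` index and its `title`, `director`, and `year` fields are hypothetical, and `https://domain-endpoint` is a placeholder.

```python
import json


def build_query_request(domain_endpoint: str, language: str, query: str):
    """Return (url, body) for a SQL or PPL query against an OpenSearch domain."""
    if language not in ("sql", "ppl"):
        raise ValueError("language must be 'sql' or 'ppl'")
    url = f"{domain_endpoint.rstrip('/')}/_plugins/_{language}"
    body = json.dumps({"query": query})
    return url, body


# The same question expressed in both languages (hypothetical index and fields).
sql_url, sql_body = build_query_request(
    "https://domain-endpoint",
    "sql",
    "SELECT title, director FROM movies WHERE year > 2010 LIMIT 5",
)
ppl_url, ppl_body = build_query_request(
    "https://domain-endpoint",
    "ppl",
    "source=movies | where year > 2010 | fields title, director | head 5",
)
```

Each body would be sent as an HTTP POST with a `Content-Type: application/json` header, authenticated in whatever way the domain requires.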

Connection with VPC

With Amazon OpenSearch Service, you can securely connect your applications to your managed Elasticsearch (version 7.10 and previous) or OpenSearch environment from your Amazon Virtual Private Cloud (VPC) or via the public Internet, configuring network access using VPC security groups or IP-based access policies. You can also securely authenticate users and control access using Amazon Cognito, AWS Identity and Access Management (IAM), or basic authentication with a username and password. Amazon OpenSearch Service leverages the OpenSearch security plugin, enabling you to define granular permissions for indices, documents, or fields. You can also extend Kibana with read-only views and secure multi-tenant support. Amazon OpenSearch Service also supports built-in encryption for data at rest and in-transit, so you can protect your data when it is stored in your domain or in automated snapshots and transferring between nodes in your domain. Amazon OpenSearch Service is HIPAA-eligible and compliant with PCI DSS, SOC, ISO, and FedRAMP standards, making it easy for you to build applications that meet compliance requirements.

UltraWarm

Hot storage allows for fast retrieval of frequently accessed data. UltraWarm is a warm storage tier that complements Amazon OpenSearch Service’s hot storage tier by providing less expensive storage for older and less-frequently accessed data while still providing an interactive querying experience. UltraWarm stores data in Amazon S3 and uses custom, highly-optimized nodes, purpose-built on the AWS Nitro System, to cache, pre-fetch, and query that data quickly.

With UltraWarm, you can retain up to 3 PB of data in a single Amazon OpenSearch Service cluster while reducing cost per GB by nearly 90% compared to the hot storage tier. You can also easily query and visualize the data in your Kibana (version 7.10 and previous) or OpenSearch Dashboards interface. Analyze both your recent (weeks) and historical (months or years) log data without spending hours or days restoring archived logs.

Cold Storage

Cold storage is the lowest-cost storage option for Amazon OpenSearch Service, which allows you to retain infrequently accessed data in Amazon S3 and only pay for compute when you need it. Cold storage builds on UltraWarm, which provides specialized nodes that store data in Amazon S3 and uses a sophisticated caching solution to provide an interactive experience. By decoupling compute resources from storage, cold storage lets you retain any amount of data in your Amazon OpenSearch Service domain while reducing cost per GB to near Amazon S3 storage prices. Detach historical or infrequently accessed warm data while not in use and free up compute to help lower costs. Discover and selectively attach your cold data to your domain’s UltraWarm nodes in seconds with your choice of a Kibana (version 7.10 and previous) or OpenSearch Dashboards interface and easy-to-use APIs. With cold storage, you can query the attached cold data with a similar interactive experience and performance as your warm data.
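The tiering idea can be sketched as a small policy that picks a tier from an index's age and builds the matching migration request. The age thresholds are hypothetical, and the `_ultrawarm/migration/<index>/_warm` and `_cold` paths reflect the service's migration API as commonly documented; verify them against the current developer guide before relying on them.

```python
def choose_tier(age_days: int, warm_after_days: int = 14,
                cold_after_days: int = 180) -> str:
    """Pick a storage tier from index age. The thresholds are hypothetical."""
    if age_days >= cold_after_days:
        return "cold"
    if age_days >= warm_after_days:
        return "warm"
    return "hot"


def migration_request(index: str, target_tier: str):
    """Return (method, path) for moving an index to warm or cold storage.

    An index moves to cold storage from warm, so a hot index
    needs a warm migration first.
    """
    if target_tier not in ("warm", "cold"):
        raise ValueError("target_tier must be 'warm' or 'cold'")
    return "POST", f"_ultrawarm/migration/{index}/_{target_tier}"


# Hypothetical daily log index that is 30 days old.
tier = choose_tier(30)
method, path = migration_request("logs-2024-01-01", tier)
```

Automating this on a schedule is usually left to an index lifecycle policy rather than hand-written code.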

Supported Versions of OpenSearch

Various OpenSearch versions are presently supported by OpenSearch Service:

  • 2.3, 1.3, 1.2, 1.1, 1.0

OpenSearch Service also supports the following legacy Elasticsearch OSS versions:

  • 7.10, 7.9, 7.8, 7.7, 7.4, 7.1
  • 6.8, 6.7, 6.5, 6.4, 6.3, 6.2, 6.0
  • 5.6, 5.5, 5.3, 5.1
  • 2.3
  • 1.5

Users starting a new OpenSearch Service project are strongly encouraged to choose the most recent supported OpenSearch version. Users who already have a domain running an older Elasticsearch version can either keep it or migrate the data.

Getting started with Amazon OpenSearch Service

Create a Domain

  1. Navigate to link and select Login to Console.
  2. Select AWS OpenSearch Service from the Analytics menu.
  3. Select Create domain.
  4. Enter a name for the domain. The name movies is used in the examples throughout this article.
  5. Clear the Custom endpoint option.
  6. For Deployment type, select Development and testing.
  7. For Version, select the most recent version.
  8. Under Data nodes, change the instance type to t3.small.search and keep the default number of nodes.
  9. For simplicity, this tutorial uses a public access domain. Under Network, select Public access.
  10. Under Fine-grained access control, select Create master user. Enter a username and a password.
  11. For now, skip the SAML authentication and Amazon Cognito authentication sections.
  12. Under Access policy, select Only use fine-grained access control. In this tutorial, fine-grained access control, rather than the domain access policy, handles authentication.
  13. Skip the remaining settings and choose Create.
  14. New domains typically take 15-30 minutes to initialize, though this can vary with the configuration. When the domain has finished initializing, select it to open its configuration pane. In General information, note the domain endpoint.
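The console steps above can be approximated programmatically. The sketch below builds the arguments for the `create_domain` call of the boto3 `opensearch` client and keeps the actual API call in a separate function so the request can be inspected first. The Region, account ID, engine version, node count, and EBS settings are placeholder assumptions, and boto3 must be installed to run `create_domain`.

```python
import json


def build_create_domain_request(domain_name: str, master_user: str,
                                master_password: str,
                                engine_version: str = "OpenSearch_2.3",
                                instance_count: int = 3,
                                region: str = "us-east-1",
                                account_id: str = "123456789012") -> dict:
    """Build keyword arguments for the boto3 opensearch create_domain call."""
    # Open domain access policy; fine-grained access control handles authentication.
    access_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": "*"},
            "Action": "es:*",
            "Resource": f"arn:aws:es:{region}:{account_id}:domain/{domain_name}/*",
        }],
    }
    return {
        "DomainName": domain_name,
        "EngineVersion": engine_version,
        "ClusterConfig": {"InstanceType": "t3.small.search",
                          "InstanceCount": instance_count},
        "EBSOptions": {"EBSEnabled": True, "VolumeType": "gp3", "VolumeSize": 10},
        "AccessPolicies": json.dumps(access_policy),
        # Fine-grained access control needs these three protections enabled.
        "EncryptionAtRestOptions": {"Enabled": True},
        "NodeToNodeEncryptionOptions": {"Enabled": True},
        "DomainEndpointOptions": {"EnforceHTTPS": True},
        "AdvancedSecurityOptions": {
            "Enabled": True,
            "InternalUserDatabaseEnabled": True,
            "MasterUserOptions": {"MasterUserName": master_user,
                                  "MasterUserPassword": master_password},
        },
    }


def create_domain(request: dict) -> dict:
    """Send the request. Needs boto3 and AWS credentials, and may incur charges."""
    import boto3  # third-party; imported here so the builder works without it
    return boto3.client("opensearch").create_domain(**request)


# Placeholder credentials for illustration only.
request = build_create_domain_request("movies", "master-user", "Example-Passw0rd!")
```

Reading the password from a secrets store instead of hard-coding it would be the natural next step.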

Size the Domain Appropriately for Your Workload

There is no single correct way to size an AWS OpenSearch Service domain. However, by starting with an understanding of your storage needs, the service, and OpenSearch itself, you can make an educated initial estimate of your hardware requirements.

That estimate then serves as the starting point for the most important part of domain sizing: testing with a representative workload and monitoring performance.

Most OpenSearch workloads fall into one of two categories:

  • Long-lived index: Users write code that processes data into one or more OpenSearch indexes and then updates those indexes periodically as the source data changes. Website, document, and ecommerce search are common examples.
  • Rolling index: Data flows continuously into a set of temporary indexes, each with its own indexing period and retention window. Log analytics, time-series processing, and clickstream analytics are common examples.
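For the storage part of the estimate, a rough calculation can be sketched as follows. The formula and the default overheads (10% indexing overhead, 5% reserved by the operating system, 20% reserved by the service) follow commonly cited sizing guidance and are assumptions rather than figures from this article. Treat the result as a starting point for testing, not a final answer.

```python
def minimum_storage_gib(source_data_gib: float, replicas: int = 1,
                        indexing_overhead: float = 0.10,
                        os_reserved: float = 0.05,
                        service_overhead: float = 0.20) -> float:
    """Estimate the minimum storage a domain needs for the given source data.

    All overhead defaults are assumptions; adjust them for your workload.
    """
    return (source_data_gib
            * (1 + replicas)
            * (1 + indexing_overhead)
            / (1 - os_reserved)
            / (1 - service_overhead))


# Long-lived index: 66 GiB of source data with one replica.
long_lived = minimum_storage_gib(66)

# Rolling index: 10 GiB of logs per day retained for 14 days, one replica.
rolling = minimum_storage_gib(10 * 14)
```

For a rolling index, the source data is the daily volume multiplied by the retention window, which is why the second call passes `10 * 14`.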

Control Access to Your Domain Using a Domain Access Policy or Fine-Grained Access Control

When users create a domain, they can attach a resource-based policy, also known as the domain access policy. This policy specifies which actions a principal is permitted to perform on the domain's subresources.

OpenSearch indexes and APIs are examples of subresources. The Principal element specifies the accounts, users, or roles that are granted access. The Resource element specifies which subresources those principals can access.
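A domain access policy with these elements might be assembled as in the sketch below. The account ID, role name, and domain ARN are placeholders, and the choice to allow only `es:ESHttpGet` (read-only HTTP GET requests) is an illustrative assumption.

```python
import json


def build_domain_access_policy(principal_arn: str, domain_arn: str,
                               actions=("es:ESHttpGet",)) -> dict:
    """Return a resource-based policy granting a principal access to a domain."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            # Principal: who is granted access.
            "Principal": {"AWS": [principal_arn]},
            # Action: which operations are permitted.
            "Action": list(actions),
            # Resource: which subresources the principal can reach.
            "Resource": f"{domain_arn}/*",
        }],
    }


# Placeholder ARNs for illustration only.
policy = build_domain_access_policy(
    "arn:aws:iam::123456789012:role/example-read-only-role",
    "arn:aws:es:us-east-1:123456789012:domain/movies",
)
policy_json = json.dumps(policy, indent=2)
```

Narrowing the `Resource` value to a single index path, for example `domain/movies/movies/*`, would restrict the principal to that index.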

Fine-grained access control gives users additional ways to manage access to their data on AWS OpenSearch Service. For example, depending on who makes the request, a search might need to return results from only one index. It might also be necessary to hide certain fields in documents or to exclude certain documents altogether. Fine-grained access control offers the following benefits:

  • Access control based on roles
  • Security at the index, document, and field level
  • Multi-tenancy for OpenSearch Dashboards
  • HTTP basic authentication for OpenSearch and OpenSearch Dashboards
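To show how index-level, document-level, and field-level security fit together, the sketch below builds a role definition in the shape used by the OpenSearch security plugin REST API (`PUT _plugins/_security/api/roles/<role-name>`). The role, the `movies` index, the hidden `budget` field, and the `rating` filter are all hypothetical.

```python
import json


def build_role(index_pattern: str, hidden_fields=(), dls_query=None) -> dict:
    """Return a read-only role body for the security plugin roles API."""
    index_permission = {
        "index_patterns": [index_pattern],   # index-level security
        "allowed_actions": ["read"],
    }
    if hidden_fields:
        # Field-level security: a leading "~" excludes the field from results.
        index_permission["fls"] = [f"~{field}" for field in hidden_fields]
    if dls_query is not None:
        # Document-level security: the query DSL filter is passed as a JSON string.
        index_permission["dls"] = json.dumps(dls_query)
    return {"cluster_permissions": [], "index_permissions": [index_permission]}


# Hypothetical role: read movies, hide the budget field, only PG-rated documents.
role = build_role(
    "movies",
    hidden_fields=["budget"],
    dls_query={"term": {"rating": "PG"}},
)
role_body = json.dumps(role)
```

The role only takes effect once it is mapped to users or backend roles, which is a separate API call.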

Index Data Manually or From Other AWS Services

Because AWS OpenSearch Service exposes a REST API, there are many ways to index documents. Users can send HTTP requests with standard clients such as curl or from any programming language that can issue HTTP requests.

OpenSearch provides clients in several programming languages to simplify this interaction. Advanced users can skip ahead to signing HTTP requests to AWS OpenSearch Service or loading streaming data into AWS OpenSearch Service.
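A minimal sketch of manual indexing over the REST API follows. It builds a newline-delimited body for the `_bulk` endpoint and, in a separate function, sends it with HTTP basic authentication using only the standard library. The sample documents and their fields are hypothetical, and `send_bulk` assumes a domain that accepts the master user's credentials.

```python
import base64
import json
import urllib.request


def build_bulk_body(index: str, docs: dict) -> str:
    """Build a _bulk request body: one action line and one document line per doc."""
    lines = []
    for doc_id, doc in docs.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


def send_bulk(domain_endpoint: str, body: str, username: str, password: str) -> dict:
    """POST the bulk body to the domain. Requires network access to the domain."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        f"{domain_endpoint.rstrip('/')}/_bulk",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson",
                 "Authorization": f"Basic {token}"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Hypothetical sample documents.
bulk_body = build_bulk_body("movies", {
    "1": {"title": "Example Movie", "director": "A. Director", "year": 2011},
    "2": {"title": "Another Movie", "director": "B. Director", "year": 2013},
})
```

Domains that rely on IAM rather than basic authentication would need SigV4-signed requests instead of the `Authorization: Basic` header.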

Use OpenSearch Dashboards to Search Your Data and Create Visualisations

OpenSearch Dashboards can be used to search for documents in an OpenSearch Service domain.

  • Go to the domain's OpenSearch Dashboards URL. The URL is shown on the domain's dashboard in the OpenSearch Service console and takes the form domain-endpoint/_dashboards/. Log in with the master username and password.
  • OpenSearch Dashboards requires at least one index pattern, which it uses to determine which indexes to analyze. In the left navigation pane, choose Stack Management, then Index Patterns, then Create index pattern. For this tutorial, enter movies.
  • Choose Next step, then Create index pattern. After the pattern is created, you can inspect the various document fields, such as actor and director.
  • Return to the Index Patterns page and make sure movies is set as the default. If it isn't, select the pattern and choose the star icon to make it the default.
  • To start searching the data, return to the left navigation pane and choose Discover. You can then search the data, for example by actor or director name.
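A comparable search can also be issued against the REST API instead of through Discover. The sketch below builds a query DSL body that matches a search term across several fields; it would be sent to `movies/_search`. The field names are hypothetical and depend on how the documents were indexed.

```python
import json


def build_search_body(text: str, fields=("title", "actor", "director"),
                      size: int = 10) -> dict:
    """Return a query DSL body that matches text against several fields."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                "fields": list(fields),
            }
        },
    }


# Hypothetical search for a director's name.
search_body = build_search_body("A. Director", size=5)
search_json = json.dumps(search_body)
```

The `size` parameter caps the number of hits returned, which keeps exploratory queries cheap.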

Use Cases of Amazon OpenSearch Service

Simpler Than Provisioned

OpenSearch Serverless removes much of the complexity of managing OpenSearch clusters and capacity. It automatically sizes and tunes your clusters, and takes care of shard and index lifecycle management. It also manages service software updates and OpenSearch version upgrades. All updates and upgrades are non-disruptive.

Cost-effective

When you use OpenSearch Serverless, you pay only for the resources that you consume. This removes the need for upfront provisioning and overprovisioning for peak workloads.

Highly Available

OpenSearch Serverless supports production workloads with redundancy to protect against Availability Zone outages and infrastructure failures.

Scalable

OpenSearch Serverless automatically scales resources to maintain consistently fast data ingestion rates and query response times.

Make Customized, Seamless Search Possible

Fast, personalized search across applications, websites, and data lake catalogs helps users find relevant information quickly.

Pricing for Amazon OpenSearch Service

Users pay for each hour of use of an EC2 instance and for the cumulative size of any EBS storage volumes attached to the instances. Standard AWS data transfer charges also apply.

There are, however, some notable data transfer exceptions. If a domain uses multiple AZs, OpenSearch Service does not bill for traffic between them. Significant data transfer occurs within a domain during shard allocation and rebalancing; OpenSearch Service neither meters nor bills for this traffic. Similarly, OpenSearch Service does not bill for data transfer between UltraWarm or cold nodes and AWS S3.

Free Tier

With the AWS Free Tier, customers can start using AWS OpenSearch Service at no cost. Free Tier customers can use t2.small.search or t3.small.search instances, which are entry-level instances typically used for test workloads, free of charge for up to 750 hours a month. In addition, up to 10 GB of AWS EBS storage per month is included. Users who exceed the Free Tier limits are charged the standard AWS OpenSearch Service rates for the additional usage.
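The arithmetic behind these limits can be sketched as follows. The hourly and per-GB rates are hypothetical placeholders, not actual AWS prices, and data transfer charges are left out for simplicity.

```python
FREE_TIER_INSTANCE_HOURS = 750   # free instance hours per month
FREE_TIER_EBS_GB = 10            # free EBS storage per month, in GB


def estimate_monthly_cost(instance_hours: float, ebs_gb: float,
                          hourly_rate: float, ebs_rate_per_gb: float,
                          free_tier: bool = True) -> float:
    """Estimate a monthly bill for instance hours plus EBS storage.

    hourly_rate and ebs_rate_per_gb are caller-supplied; look up the
    current prices for your Region and instance type.
    """
    if free_tier:
        billable_hours = max(0.0, instance_hours - FREE_TIER_INSTANCE_HOURS)
        billable_gb = max(0.0, ebs_gb - FREE_TIER_EBS_GB)
    else:
        billable_hours = instance_hours
        billable_gb = ebs_gb
    return billable_hours * hourly_rate + billable_gb * ebs_rate_per_gb


# One small instance running all month (744 hours) with 10 GB of storage
# stays within the Free Tier. The rates below are hypothetical.
within_free_tier = estimate_monthly_cost(744, 10, hourly_rate=0.04, ebs_rate_per_gb=0.10)

# Two such instances (1,488 hours) with 20 GB exceed it.
over_free_tier = estimate_monthly_cost(1488, 20, hourly_rate=0.04, ebs_rate_per_gb=0.10)
```

The 750 hours cover roughly one instance running continuously for a month, so a multi-node domain exceeds the allowance quickly.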

Pricing for Reserved Instances

AWS OpenSearch Service Reserved Instances let users reserve instances for a one-year or three-year term at a significant discount compared with On-Demand Instances. On-Demand and Reserved Instances are functionally identical; the difference is purely in billing.

Volume Pricing for AWS EBS

AWS OpenSearch Service lets users choose the type of AWS EBS volume. Users who choose Provisioned IOPS (SSD) storage are charged for the storage and for the throughput they provision. They are not, however, charged for the I/O operations they consume.

Pricing for UltraWarm and Cold Storages

UltraWarm is an AWS OpenSearch Service storage tier that lets users retain large volumes of data cost-effectively while keeping the same interactive analysis experience.

Cold storage is the lowest-cost storage tier for AWS OpenSearch Service. It lets users detach and store infrequently accessed data in AWS S3 and pay for compute only when they need it.

Storage of Automatic Snapshots

AWS OpenSearch Service increases data durability with both automated and manual cluster snapshots. The service provides storage for automated snapshots free of charge for every AWS OpenSearch Service domain and retains these snapshots for 14 days. Manual snapshots are stored in AWS S3 and incur standard AWS S3 usage charges. Data transfer for using the snapshots is free of charge.

Conclusion

  • AWS OpenSearch Service simplifies interactive log analytics, real-time application monitoring, website search, and other tasks.
  • OpenSearch is an open-source, distributed search and analytics suite derived from Elasticsearch.
  • AWS OpenSearch Service provides the most recent version of OpenSearch, support for 19 Elasticsearch versions (1.5 to 7.10), and visualization capabilities powered by OpenSearch Dashboards and Kibana.
  • AWS OpenSearch Service presently has thousands of active customers and manages hundreds of thousands of clusters that handle hundreds of billions of queries each month.