AWS Kinesis


Overview

AWS Kinesis is a managed service that scales elastically to process streaming data in real time at massive scale. It can collect large streams of data records, which are then consumed by applications, typically running on Amazon EC2 instances.

Introduction to AWS Kinesis

Streaming Data:

Streaming data is data that is generated continuously by hundreds of data sources, which typically send their data records simultaneously and in small sizes.

Examples of streaming data include log files generated by customers, e-commerce purchases, real-time stock trades, ride-sharing apps, in-game player activity, information from social media, financial trading floors, geospatial services, and instrumentation in data centers.

AWS Kinesis:

AWS Kinesis is used to collect, process, and analyze streaming data so that you can get timely insights and respond quickly to new information.

It also provides key capabilities for processing streaming data cost-effectively at any scale, along with the flexibility to choose the tools that best suit your application. With AWS Kinesis, you can ingest real-time data such as video, audio, application logs, and website clickstreams for machine learning, analytics, and other applications.

Architecture of Kinesis Stream

Kinesis streams are made up of shards. Each shard supports up to 5 read transactions per second, with a maximum total read rate of 2 MB per second, and up to 1,000 records per second for writes, with a maximum total write rate of 1 MB per second.

The overall capacity of a Kinesis stream is the sum of the capacities of its shards.
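As a rough illustration of how shard capacity adds up, the sketch below multiplies the per-shard figures quoted above by the shard count, and also estimates the smallest shard count for a given workload. The function names are hypothetical, chosen only for this example.

```python
import math

# Per-shard limits as quoted in the text above.
WRITE_MB_PER_SEC = 1
WRITE_RECORDS_PER_SEC = 1000
READ_MB_PER_SEC = 2


def stream_capacity(shard_count):
    """Total stream capacity: per-shard capacity times the number of shards."""
    return {
        "write_mb_per_sec": shard_count * WRITE_MB_PER_SEC,
        "write_records_per_sec": shard_count * WRITE_RECORDS_PER_SEC,
        "read_mb_per_sec": shard_count * READ_MB_PER_SEC,
    }


def shards_needed(write_mb_per_sec, records_per_sec, read_mb_per_sec):
    """Smallest shard count that satisfies all three per-shard limits."""
    return max(
        1,
        math.ceil(write_mb_per_sec / WRITE_MB_PER_SEC),
        math.ceil(records_per_sec / WRITE_RECORDS_PER_SEC),
        math.ceil(read_mb_per_sec / READ_MB_PER_SEC),
    )
```

For example, a workload that writes 3.5 MB and 2,500 records per second and reads 9 MB per second is limited by its read rate, so `shards_needed(3.5, 2500, 9)` evaluates to 5.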


Assume we have EC2 instances, mobile phones, laptops, and IoT devices producing data. These are called producers because they generate the data.

The data is sent to a Kinesis stream and stored in its shards. By default, data is retained in the shards for 24 hours. You can extend the retention period to 7 days, and current AWS documentation also describes long-term retention of up to 365 days.

Once the data is in the shards, consumers (for example, applications running on EC2 instances) read it and turn the raw records into meaningful data.

Once the consumers have completed their calculations, the results are sent to an AWS service such as DynamoDB, Amazon S3, EMR, or Redshift.
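The producer side of this flow can be sketched in a few lines of Python. This is a minimal example, assuming a client object that behaves like `boto3.client("kinesis")`; the helper name `send_event` and the JSON encoding of the event are illustrative choices, not something the architecture requires.

```python
import json


def send_event(client, stream_name, event, partition_key):
    """Serialize one event as JSON and write it to a Kinesis data stream.

    `client` is assumed to behave like boto3.client("kinesis").
    """
    response = client.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        # Records that share a partition key are routed to the same shard.
        PartitionKey=partition_key,
    )
    # The response identifies the shard that stored the record.
    return response["ShardId"], response["SequenceNumber"]
```

With boto3 installed and AWS credentials configured, a call such as `send_event(boto3.client("kinesis"), "example-stream", {"device": "sensor-1"}, "sensor-1")` would write one record, where `example-stream` is a placeholder stream name.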

How to Use Amazon Kinesis?

  • Log in to the AWS Management Console, click Services, and then click Kinesis in the Analytics section.


  • Now click on "Get Started."


  • You can now see the four capabilities of AWS Kinesis: data streams, delivery streams, analytics, and video streams.


  • Click "Create data stream." Specify the stream name, scroll down, and enter the number of shards. Then click "Create Kinesis stream."


  • The stream is created within a few seconds. You can now connect your application to the Kinesis stream, and you are good to go.
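The console steps above can also be scripted. The following is a minimal sketch, assuming a client that behaves like `boto3.client("kinesis")`; the helper name `create_stream_and_wait` is hypothetical.

```python
def create_stream_and_wait(client, stream_name, shard_count):
    """Create a data stream and block until it is ready to use.

    `client` is assumed to behave like boto3.client("kinesis").
    """
    client.create_stream(StreamName=stream_name, ShardCount=shard_count)
    # Stream creation is asynchronous; the waiter polls until the stream is active.
    client.get_waiter("stream_exists").wait(StreamName=stream_name)
    summary = client.describe_stream_summary(StreamName=stream_name)
    return summary["StreamDescriptionSummary"]["StreamStatus"]
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub before pointing it at a real AWS account.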


Benefits/Features

  1. Real-Time: AWS Kinesis simplifies the collection, processing, and analysis of real-time streaming data, such as stock transaction prices, so that you can get immediate insights and respond rapidly to new information.

  2. Fully Managed: Run all your streaming apps without having to build or manage expensive infrastructure.

  3. Scalable: AWS Kinesis can ingest large amounts of streaming data from hundreds of sources and process it with minimal latency.

  4. Simple to use: With Amazon Kinesis, you can quickly create a new stream, configure it, and begin streaming data.

  5. Integration: Kinesis integrates with other Amazon services, such as DynamoDB, Amazon S3, and Amazon Redshift.

  6. Create Kinesis applications: Amazon Kinesis provides developers with client libraries to create and run real-time data processing applications. When you include the Amazon Kinesis Client Library in your Java application, it notifies the application when new data is ready for processing.

  7. Cost-effective: It provides key capabilities for processing streaming data at any scale at a low cost, using flexible tools.

Limitations

  • By default, the on-demand option allows customers to create up to 50 data streams in their AWS account. You must contact AWS support if you wish to raise this limit.

  • The data payload of a record can be up to 1 MB in size, measured before base64-encoding.

  • By default, stream records are available for up to 24 hours. This can be extended to 7 days by enabling extended data retention.

  • A single shard can handle up to 1000 PUT records per second.

  • Every shard can handle five read transactions per second. Each read transaction can supply up to 10,000 records, with a maximum transaction size of 10 MB.
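Two of these limits are easy to guard against in code. The sketch below checks a payload against the 1 MB record limit before sending and raises the retention period above the 24-hour default. It assumes a client that behaves like `boto3.client("kinesis")`; the helper names are hypothetical, and reading "1 MB" as 1,048,576 bytes is an assumption made for this example.

```python
MAX_RECORD_BYTES = 1024 * 1024  # assumption: "1 MB" is read as 1 MiB
DEFAULT_RETENTION_HOURS = 24


def check_payload(data):
    """Raise ValueError if a record payload exceeds the per-record limit."""
    if len(data) > MAX_RECORD_BYTES:
        raise ValueError(
            f"payload is {len(data)} bytes; limit is {MAX_RECORD_BYTES}"
        )
    return data


def extend_retention(client, stream_name, hours=168):
    """Raise the retention period above the default (168 hours is 7 days)."""
    if hours <= DEFAULT_RETENTION_HOURS:
        raise ValueError("hours must be greater than the 24-hour default")
    client.increase_stream_retention_period(
        StreamName=stream_name,
        RetentionPeriodHours=hours,
    )
    return hours
```

Checking the size on the client side avoids a round trip for a request that the service would reject anyway.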

Core Services Of Kinesis

  • Kinesis Video Streams

    It is a data streaming service that focuses on video. It lets you securely stream video from connected devices and makes the data available for playback, machine learning, analytics, or other processing. It can ingest data from many kinds of devices: security cameras, smartphone video, drones, RADARs, LIDARs, satellites, and more.

    It enables you to build apps with real-time computer vision capabilities through integration with Amazon Rekognition Video, and with video analytics through popular open-source machine learning frameworks.


    Kinesis Video Streams also supports live or recorded video playback in browsers or mobile applications through HTTP Live Streaming (HLS), and two-way real-time streaming among web browsers, mobile apps, and connected devices through WebRTC.
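As an illustration of HLS playback from Kinesis Video Streams, the sketch below requests a live HLS session URL in two steps. It assumes `make_client` behaves like `boto3.client`; the helper name `hls_playback_url` is hypothetical.

```python
def hls_playback_url(make_client, stream_name):
    """Return an HLS URL for live playback of a Kinesis video stream.

    `make_client` is assumed to behave like boto3.client.
    """
    # Step 1: ask the control-plane API which endpoint serves HLS sessions.
    control = make_client("kinesisvideo")
    endpoint = control.get_data_endpoint(
        StreamName=stream_name,
        APIName="GET_HLS_STREAMING_SESSION_URL",
    )["DataEndpoint"]
    # Step 2: call the archived-media API on that endpoint.
    media = make_client("kinesis-video-archived-media", endpoint_url=endpoint)
    session = media.get_hls_streaming_session_url(
        StreamName=stream_name,
        PlaybackMode="LIVE",
    )
    return session["HLSStreamingSessionURL"]
```

The returned URL can then be handed to an HLS-capable player in a browser or mobile app.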

  • Kinesis Data Streams

    Amazon Kinesis Data Streams (KDS) is a highly scalable and durable real-time data streaming service from Amazon. KDS is used when a large volume of data flows in from a large number of disparate data sources. It can capture gigabytes of data per second from sources such as website clicks, IoT devices, financial transactions, gaming micro-transactions, database event streams, and location-tracking events. In other words, if the data you wish to stream must be directly accessible to a service or application, or must quickly trigger analysis, KDS is the option to choose. The collected data is available for real-time analytics almost immediately (typically within about 70 milliseconds of being written), enabling real-time dashboards, real-time anomaly detection, dynamic pricing, and more.
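To show how an application might read data back from Kinesis Data Streams, here is a minimal consumer sketch. It assumes a client that behaves like `boto3.client("kinesis")`; the helper name `read_batch` and the choice of `TRIM_HORIZON` as the starting position are illustrative.

```python
def read_batch(client, stream_name, shard_id, limit=100):
    """Read one batch of records from the start of a shard.

    `client` is assumed to behave like boto3.client("kinesis").
    """
    iterator = client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",  # start at the oldest retained record
    )["ShardIterator"]
    response = client.get_records(ShardIterator=iterator, Limit=limit)
    payloads = [record["Data"] for record in response["Records"]]
    # The next iterator lets the caller continue where this batch ended.
    return payloads, response.get("NextShardIterator")
```

A long-running consumer would call `get_records` repeatedly with the returned iterator, which is the kind of loop the Kinesis Client Library manages for you.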

  • Kinesis Firehose

    Kinesis Firehose reliably loads large volumes of streaming data into data lakes, data stores, and analytics services. Firehose can ingest, transform, and deliver streaming data to a range of destinations and services.


    Supported destinations include Amazon S3, Redshift, Elasticsearch Service, and generic HTTP endpoints, as well as third-party service providers. Firehose can batch, compress, transform, and encrypt data streams before loading them, which improves security and lowers storage costs. In short, Firehose is used to quickly move a flood of data to a central repository (in whatever form that repository may take) for further processing.
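Writing to a Firehose delivery stream looks similar to writing to a data stream. The sketch below assumes a client that behaves like `boto3.client("firehose")`; the helper name `deliver` and the newline-delimited JSON encoding are illustrative choices.

```python
import json


def deliver(client, delivery_stream_name, event):
    """Send one JSON event to a Firehose delivery stream.

    `client` is assumed to behave like boto3.client("firehose").
    """
    # A trailing newline keeps records separable once Firehose batches them.
    data = (json.dumps(event) + "\n").encode("utf-8")
    response = client.put_record(
        DeliveryStreamName=delivery_stream_name,
        Record={"Data": data},
    )
    return response["RecordId"]
```

Unlike a data stream producer, the caller supplies no partition key; Firehose handles buffering and delivery to the configured destination.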

  • Kinesis Analytics

    Kinesis Data Analytics transforms and analyzes streaming data in real time, using the open-source Apache Flink framework and engine. It is intended to simplify the development, management, and integration of Flink applications with other AWS services.


    Kinesis Analytics supports application development in widely used languages such as SQL, Java, Scala, and Python. It also integrates with other AWS services, such as Kinesis Data Streams (KDS), Amazon Managed Streaming for Apache Kafka (Amazon MSK), Kinesis Firehose, Amazon Elasticsearch Service, and others.
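To give a feel for the kind of computation a streaming analytics application performs, the sketch below counts events per key in fixed, non-overlapping (tumbling) time windows. It is plain Python for illustration only and does not use the Flink or Kinesis Data Analytics APIs; the function name and the event shape are assumptions.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Count events per (window start, key).

    `events` is an iterable of (timestamp_in_seconds, key) pairs.
    """
    counts = Counter()
    for timestamp, key in events:
        # Every timestamp belongs to exactly one fixed-size window.
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)
```

A real streaming engine computes the same aggregate incrementally as records arrive, rather than over a finished list.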

  • Kinesis Client Library

    The KCL simplifies consuming and processing data from a Kinesis data stream by handling many of the complex tasks involved in distributed computing. Examples include load balancing across multiple consumer application instances, responding to consumer application instance failures, checkpointing processed records, and responding to resharding. The KCL is distinct from the Kinesis Data Streams APIs in the AWS SDKs: it adds a layer of abstraction around these subtasks, allowing you to concentrate on your consumer application's own record-processing logic.
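Checkpointing, one of the tasks listed for the KCL, can be illustrated without the library itself. The sketch below is plain Python and not the KCL API: it remembers the last processed sequence number per shard in a dictionary (a stand-in for durable storage) and skips records that were already handled. All names are hypothetical.

```python
def process_with_checkpoint(records, handle, checkpoint_store, shard_id):
    """Process records in order, skipping any at or before the checkpoint.

    `records` is a list of dicts with "SequenceNumber" (numeric string) and "Data".
    `checkpoint_store` is a dict standing in for durable checkpoint storage.
    """
    last = checkpoint_store.get(shard_id)
    processed = 0
    for record in records:
        if last is not None and int(record["SequenceNumber"]) <= int(last):
            continue  # already processed before a restart
        handle(record["Data"])
        # Record progress only after the handler has succeeded.
        checkpoint_store[shard_id] = record["SequenceNumber"]
        processed += 1
    return processed
```

If the consumer restarts and sees the same batch again, the stored checkpoint prevents the records from being handled twice.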

Conclusion

  • AWS Kinesis simplifies the collection, processing, and analysis of real-time streaming data such as stock transaction prices, allowing you to get immediate insights and respond rapidly to new information.

  • Amazon Kinesis supports Kinesis Data Streams, Kinesis Video Streams, Kinesis Data Analytics, and Kinesis Data Firehose.

  • Using AWS Kinesis Video Streams, you can stream live or recorded video from your Kinesis video streams to your web or mobile application.

  • AWS Kinesis provides key capabilities for processing streaming data at any scale at a low cost, using flexible tools.