Amazon S3 # Amazon Simple Storage Service (Amazon S3) provides cloud object storage for a variety of use cases. You can use S3 with Flink for reading and writing data, as well as in conjunction with the streaming state backends. You can use S3 objects like regular files by specifying paths in the following format: s3://<your-bucket>/<endpoint>. The endpoint can either be a single file or a directory, for example:
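A minimal sketch of reading text lines from a bucket with the file source, picking up the FileSource snippet that the excerpt cuts off; the bucket name and path are placeholders, and it assumes one of the S3 filesystem plugins (flink-s3-fs-hadoop or flink-s3-fs-presto) is installed on the plugin path.

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.connector.file.src.reader.TextLineInputFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// Read text lines from an S3 bucket ("my-bucket" and "input/" are placeholder names).
FileSource<String> fileSource = FileSource
        .forRecordStreamFormat(new TextLineInputFormat(), new Path("s3://my-bucket/input/"))
        .build();

DataStream<String> lines =
        env.fromSource(fileSource, WatermarkStrategy.noWatermarks(), "s3-source");
```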
Performance Tuning # SQL is the most widely used language for data analytics. Flink’s Table API and SQL enable users to define efficient stream analytics applications with less time and effort. Moreover, the Table API and SQL are carefully optimized: they integrate many query optimizations and tuned operator implementations. Not all of these optimizations are enabled by default, however, so for some workloads it is possible to improve performance by turning on certain options.
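Mini-batch aggregation is one such opt-in optimization; a minimal sketch of enabling it through the table configuration (the latency and size values are arbitrary example settings):

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());

// Enable mini-batch aggregation to buffer input records and reduce state access.
Configuration conf = tEnv.getConfig().getConfiguration();
conf.setString("table.exec.mini-batch.enabled", "true");      // turn on mini-batching
conf.setString("table.exec.mini-batch.allow-latency", "5 s");  // how long records may be buffered
conf.setString("table.exec.mini-batch.size", "5000");          // max buffered records per task
```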
User-defined Functions # User-defined functions (UDFs) are extension points to call frequently used logic or custom logic that cannot be expressed otherwise in queries. User-defined functions can be implemented in a JVM language (such as Java or Scala) or Python. An implementer can use arbitrary third-party libraries within a UDF. This page focuses on JVM-based languages; please refer to the PyFlink documentation for details on writing general and vectorized UDFs in Python.
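A sketch of a simple Java scalar UDF and its registration; the function name, the MyTable table, and its name column are made up for illustration:

```java
import org.apache.flink.table.functions.ScalarFunction;

// A scalar function that returns a substring of its first argument.
public static class SubstringFunction extends ScalarFunction {
    public String eval(String s, Integer begin, Integer end) {
        return s.substring(begin, end);
    }
}

// Assuming an existing TableEnvironment tEnv:
// register the function under a name and call it from SQL.
tEnv.createTemporarySystemFunction("SubstringFunction", SubstringFunction.class);
tEnv.executeSql("SELECT SubstringFunction(name, 0, 5) FROM MyTable");
```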
Configuration # By default, the Table & SQL API is preconfigured for producing accurate results with acceptable performance. Depending on the requirements of a table program, it might be necessary to adjust certain parameters for optimization. For example, unbounded streaming programs may need to ensure that the required state size is capped (see streaming concepts). Overview # When instantiating a TableEnvironment, EnvironmentSettings can be used to pass the desired configuration for the current session.
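A minimal sketch of that pattern: create a TableEnvironment from EnvironmentSettings and cap the required state size via a state time-to-live; the one-hour TTL is an arbitrary example value.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

EnvironmentSettings settings = EnvironmentSettings.newInstance()
        .inStreamingMode()   // run the table program in streaming mode
        .build();
TableEnvironment tEnv = TableEnvironment.create(settings);

// Cap the required state size by expiring idle state after one hour (example value).
tEnv.getConfig().getConfiguration().setString("table.exec.state.ttl", "1 h");
```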
Versioned Tables # Flink SQL operates over dynamic tables that evolve over time and may be either append-only or updating. Versioned tables represent a special type of updating table that remembers the past values for each key. Concept # Dynamic tables define relations over time. Often, particularly when working with metadata, a key’s old value does not become irrelevant when it changes. Flink SQL can define versioned tables over any dynamic table with a PRIMARY KEY constraint and a time attribute.
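A sketch of a versioned table backed by the upsert-kafka connector; the table name, columns, topic, and broker address are hypothetical. The PRIMARY KEY together with the watermarked update_time column is what makes the table versioned.

```java
// Assuming an existing TableEnvironment tEnv.
tEnv.executeSql(
    "CREATE TABLE product_prices (" +
    "  product_id STRING," +
    "  price DECIMAL(10, 2)," +
    "  update_time TIMESTAMP(3)," +
    "  WATERMARK FOR update_time AS update_time - INTERVAL '5' SECOND," +  // time attribute
    "  PRIMARY KEY (product_id) NOT ENFORCED" +                            // key to version by
    ") WITH (" +
    "  'connector' = 'upsert-kafka'," +
    "  'topic' = 'product-prices'," +                  // hypothetical topic
    "  'properties.bootstrap.servers' = 'localhost:9092'," +
    "  'key.format' = 'raw'," +
    "  'value.format' = 'json'" +
    ")");
```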
Time Attributes # Flink can process data based on different notions of time. Processing time refers to the system time of the machine that is executing the respective operation (also known as epoch time, e.g. Java’s System.currentTimeMillis()). Event time refers to the processing of streaming data based on timestamps that are attached to each row. The timestamps can encode when an event happened. For more information about time handling in Flink, see the introduction about event time and watermarks.
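A sketch of declaring both kinds of time attribute in a DDL; the table, its columns, and the connector choice are placeholders for illustration.

```java
// Assuming an existing TableEnvironment tEnv.
tEnv.executeSql(
    "CREATE TABLE user_actions (" +
    "  user_name STRING," +
    "  action STRING," +
    "  ts TIMESTAMP(3)," +
    "  proc_time AS PROCTIME()," +                        // processing-time attribute
    "  WATERMARK FOR ts AS ts - INTERVAL '5' SECOND" +    // event-time attribute on ts
    ") WITH ('connector' = 'datagen')");
```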
Temporal Table Function # A temporal table function provides access to the version of a temporal table at a specific point in time. In order to access the data in a temporal table, one must pass a time attribute that determines the version of the table that will be returned. Flink uses the SQL syntax of table functions to provide a way to express it. Unlike versioned tables, temporal table functions can only be defined on top of append-only streams; they do not support changelog inputs.
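A sketch of defining and registering a temporal table function over a hypothetical append-only currency_rates table; the table and column names are made up.

```java
import static org.apache.flink.table.api.Expressions.$;

import org.apache.flink.table.functions.TemporalTableFunction;

// Assuming an append-only table "currency_rates" with columns
// (currency STRING, rate DECIMAL(10, 2), update_time TIMESTAMP(3)),
// a watermark on update_time, and an existing TableEnvironment tEnv.
TemporalTableFunction rates = tEnv
        .from("currency_rates")
        .createTemporalTableFunction($("update_time"), $("currency"));

// Register the function so queries can call rates(<time attribute>).
tEnv.createTemporarySystemFunction("rates", rates);
```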
Streaming Concepts # Flink’s Table API and SQL support are unified APIs for batch and stream processing. This means that Table API and SQL queries have the same semantics regardless of whether their input is bounded batch input or unbounded stream input. The following pages explain concepts, practical limitations, and stream-specific configuration parameters of Flink’s relational APIs on streaming data. State Management # Table programs that run in streaming mode leverage all capabilities of Flink as a stateful stream processor.
Dynamic Tables # SQL - and the Table API - offer flexible and powerful capabilities for real-time data processing. This page describes how relational concepts elegantly translate to streaming, allowing Flink to achieve the same semantics on unbounded streams. Relational Queries on Data Streams # Traditional relational algebra and stream processing differ in their input data, execution, and output results: in relational algebra / SQL, relations (or tables) are bounded (multi-)sets of tuples, whereas a stream is an infinite sequence of tuples.
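A sketch of a continuous query over a hypothetical dynamic table named clicks; the result is itself a dynamic table whose rows are inserted or updated as new clicks arrive.

```java
import org.apache.flink.table.api.Table;

// Assuming a table "clicks(user_name STRING, url STRING, ...)" registered in tEnv.
// Each new click either inserts a new (user_name, cnt) row or updates an existing one.
Table result = tEnv.sqlQuery(
    "SELECT user_name, COUNT(url) AS cnt " +
    "FROM clicks " +
    "GROUP BY user_name");
```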
Apache Flink Documentation # Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink has been designed to run in all common cluster environments, perform computations at in-memory speed and at any scale. Try Flink # If you’re interested in playing around with Flink, try one of our tutorials: Fraud Detection with the DataStream API, Real Time Reporting with the Table API, Intro to PyFlink, or the Flink Operations Playground.
INSERT Statement # INSERT statements are used to add rows to a table. Run an INSERT statement # In Java, a single INSERT statement can be executed through the executeSql() method of the TableEnvironment. The executeSql() method submits the INSERT as a Flink job immediately and returns a TableResult instance associated with the submitted job. Multiple INSERT statements can be executed through the addInsertSql() method of a StatementSet, which can be created by the TableEnvironment.
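A sketch of both variants, assuming the source and target tables are already registered in a TableEnvironment tEnv; all table names are placeholders.

```java
import org.apache.flink.table.api.StatementSet;
import org.apache.flink.table.api.TableResult;

// Single INSERT: submits a Flink job immediately and returns a TableResult.
TableResult result = tEnv.executeSql(
    "INSERT INTO target_table SELECT id, name FROM source_table");
result.await();  // optionally block until the job finishes

// Multiple INSERTs: collect statements in a StatementSet and submit them as one job.
StatementSet stmtSet = tEnv.createStatementSet();
stmtSet.addInsertSql("INSERT INTO target_a SELECT id, name FROM source_table");
stmtSet.addInsertSql("INSERT INTO target_b SELECT id, name FROM source_table");
stmtSet.execute();
```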
Apache Kafka SQL Connector # Scan Source: Unbounded; Sink: Streaming Append Mode. The Kafka connector allows for reading data from and writing data into Kafka topics. Dependencies # The Kafka connector is not part of the binary distribution; see how to link with it for cluster execution. How to create a Kafka table # The example below shows how to create a Kafka table:
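A sketch of such a DDL; the topic, broker address, consumer group, and columns are placeholder values.

```java
// Assuming an existing TableEnvironment tEnv.
tEnv.executeSql(
    "CREATE TABLE KafkaTable (" +
    "  `user_id` BIGINT," +
    "  `item_id` BIGINT," +
    "  `behavior` STRING," +
    "  `ts` TIMESTAMP(3) METADATA FROM 'timestamp'" +   // read the Kafka record timestamp
    ") WITH (" +
    "  'connector' = 'kafka'," +
    "  'topic' = 'user_behavior'," +                    // placeholder topic
    "  'properties.bootstrap.servers' = 'localhost:9092'," +
    "  'properties.group.id' = 'testGroup'," +
    "  'scan.startup.mode' = 'earliest-offset'," +
    "  'format' = 'csv'" +
    ")");
```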
Windowing table-valued functions (Windowing TVFs) # Batch / Streaming. Windows are at the heart of processing infinite streams. Windows split the stream into “buckets” of finite size, over which we can apply computations. This document focuses on how windowing is performed in Flink SQL and how the programmer can get the most out of the functionality it offers. Apache Flink provides several window table-valued functions (TVFs) to divide the elements of your table into windows, including tumble, hop, cumulate, and session windows.
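For example, a sketch of a tumbling-window aggregation with the TUMBLE TVF; the Bid table, its price column, and its bidtime time attribute are hypothetical.

```java
import org.apache.flink.table.api.Table;

// Ten-minute tumbling windows over the Bid table; bidtime must be a time attribute.
Table windowed = tEnv.sqlQuery(
    "SELECT window_start, window_end, SUM(price) AS total_price " +
    "FROM TABLE(" +
    "  TUMBLE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '10' MINUTES)) " +
    "GROUP BY window_start, window_end");
```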
Joins # Batch / Streaming. Flink SQL supports complex and flexible join operations over dynamic tables. There are several different types of joins to account for the wide variety of semantics that queries may require. By default, the order of joins is not optimized; tables are joined in the order in which they are specified in the FROM clause. You can tweak the performance of your join queries by listing the tables with the lowest update frequency first and the tables with the highest update frequency last.
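A sketch illustrating that ordering hint with three hypothetical tables, listed from least to most frequently updated; all table and column names are made up.

```java
import org.apache.flink.table.api.Table;

// Customers changes rarely, Orders more often, Shipments most often,
// so they are listed in that order in the FROM/JOIN clauses.
Table joined = tEnv.sqlQuery(
    "SELECT o.order_id, c.customer_name, s.status " +
    "FROM Customers c " +
    "JOIN Orders o ON o.customer_id = c.customer_id " +
    "JOIN Shipments s ON s.order_id = o.order_id");
```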