This article is part of the Technical Insights Series by Perry Ma, Product Lead, Real-time Compute for Apache Flink at Alibaba Cloud.
Backing up data is like taking photos of important files - sometimes you need automatic periodic snapshots (checkpoints), sometimes you need manual snapshots of important moments (savepoints). In early versions of Flink, these two "snapshot" methods operated independently, making them less convenient to use. FLIP-10 aims to unify these two methods, making data backup simpler and more flexible.
Before understanding FLIP-10, let's look at the differences between these two "snapshot" methods:
Let's use room cleaning as an analogy:
The original design caused some inconveniences:
FLIP-10's core is breaking down the barriers between checkpoints and savepoints, making them convertible. Specifically:
Persistent checkpoints are like adding a new option to room cleaning: after tidying, you can choose to keep a record of the result, so you can return to this state at any time.
Periodic savepoints are equivalent to adding a "timer reminder" feature to savepoints, so you no longer have to remember to save manually.
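The "timer reminder" idea can be illustrated in plain Java. This is only a minimal sketch of the scheduling decision, not Flink code: the class `PeriodicSavepointTrigger` and its method `shouldTrigger` are hypothetical names introduced here, and the caller is assumed to pass in the current time in milliseconds.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper, not part of Flink: decides when a periodic savepoint is due. */
final class PeriodicSavepointTrigger {
    private final long intervalMillis;
    private long lastTriggerMillis;

    PeriodicSavepointTrigger(long interval, TimeUnit unit, long startMillis) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        this.intervalMillis = unit.toMillis(interval);
        this.lastTriggerMillis = startMillis;
    }

    /** Returns true, and restarts the timer, once a full interval has elapsed. */
    boolean shouldTrigger(long nowMillis) {
        if (nowMillis - lastTriggerMillis >= intervalMillis) {
            lastTriggerMillis = nowMillis;
            return true;
        }
        return false;
    }
}
```

Passing the clock value in explicitly keeps the decision deterministic and easy to test; a real scheduler would call such a check from a timer thread.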
FLIP-10 implemented these improvements through:
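Usage is simple and goes mainly through CheckpointConfig. The method names below are the ones proposed in FLIP-10; released Flink versions expose retained checkpoints under the name "externalized checkpoints", so check the API of the version you run: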
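// env is the job's StreamExecutionEnvironment; TimeUnit is java.util.concurrent.TimeUnit

// Enable persistent checkpoints
env.getCheckpointConfig()
    .enablePersistentCheckpoints("/path/to/save");

// Enable periodic savepoints (here: once per hour)
env.getCheckpointConfig()
    .enablePeriodicSavepoints(1, TimeUnit.HOURS, "/path/to/save");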
To simplify management, all savepoints are stored uniformly in the filesystem. It's like storing all photos in the same album for easy finding and management.
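To make the "one album" idea concrete, here is a small sketch using only the JDK. It is an illustration under assumptions, not Flink's on-disk format: the class `SavepointDirectory`, the `savepoint-<id>` file naming, and the string metadata are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical store: every snapshot lives as one file under a single root directory. */
final class SavepointDirectory {
    private final Path root;

    SavepointDirectory(Path root) {
        try {
            this.root = Files.createDirectories(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Convenience factory for experiments: uses a fresh temporary directory. */
    static SavepointDirectory inTempDir() {
        try {
            return new SavepointDirectory(Files.createTempDirectory("savepoints"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes one snapshot as a file named after its zero-padded checkpoint ID. */
    Path store(long checkpointId, String metadata) {
        Path target = root.resolve(String.format("savepoint-%010d", checkpointId));
        try {
            return Files.write(target, metadata.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Lists stored snapshots, oldest first (zero-padded IDs sort lexicographically). */
    List<String> list() {
        try (Stream<Path> files = Files.list(root)) {
            return files.map(p -> p.getFileName().toString())
                        .filter(name -> name.startsWith("savepoint-"))
                        .sorted()
                        .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because everything sits under one root, finding or cleaning up snapshots is a single directory listing rather than a lookup across several storage backends.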
The CheckpointCoordinator has become smarter:
These improvements bring tangible conveniences to Flink users:
If a program encounters unrecoverable errors, the most recent checkpoint is automatically saved, like adding an extra layer of protection for data.
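The behavior described above can be modeled with a toy coordinator. This is a sketch under assumptions and not Flink's CheckpointCoordinator: the class `SnapshotCoordinatorSketch` and its methods are hypothetical, and "persisting" is reduced to remembering the checkpoint ID.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Toy model with hypothetical names; it only tracks IDs, not real state. */
final class SnapshotCoordinatorSketch {
    private long latestCompletedId = -1; // -1 means no checkpoint has completed yet
    private final List<Long> persisted = new ArrayList<>();

    /** Records a completed periodic checkpoint; older ones are superseded. */
    void onCheckpointCompleted(long checkpointId) {
        latestCompletedId = Math.max(latestCompletedId, checkpointId);
    }

    /** On an unrecoverable failure, keeps the most recent checkpoint, if there is one. */
    Optional<Long> onUnrecoverableFailure() {
        if (latestCompletedId < 0) {
            return Optional.empty();
        }
        persisted.add(latestCompletedId);
        return Optional.of(latestCompletedId);
    }

    /** Returns a copy of the IDs that were kept for later recovery. */
    List<Long> persistedSnapshots() {
        return new ArrayList<>(persisted);
    }
}
```

The point of the sketch is the ordering: the coordinator already knows the latest completed checkpoint, so keeping it on failure costs nothing extra at runtime.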
Programs that have not enabled checkpointing can now create savepoints as well, giving users more choices.
When using these new features, here are some recommendations:
1. Choose Storage Location Wisely:
2. Set Save Frequency:
3. Monitoring and Maintenance:
FLIP-10 makes Flink's data backup system more unified and intelligent. It's like integrating two different photo album systems, maintaining their individual features while making usage more convenient. These improvements not only make data backup more reliable but also make operations work easier. This is the significance of technological progress - making complex things simple and tedious work efficient.