DataWorks: Create and use an exclusive resource group for Data Integration

Last Updated: Mar 07, 2025

DataWorks provides exclusive resource groups for Data Integration. You can configure an exclusive resource group for Data Integration for your synchronization task to allocate exclusive computing resources to the task. This helps improve the running efficiency and stability of your synchronization task. Before you use an exclusive resource group for Data Integration that you purchase, you must perform operations such as configuring network settings and IP address whitelists. This topic describes the process from the purchase of an exclusive resource group for Data Integration to the use of the resource group.

Important

If you did not activate DataWorks before June 10, 2024, you can purchase and use only serverless resource groups after you activate DataWorks. You cannot purchase or use old-version resource groups.

Prerequisites

You understand the performance and billing of each specification of exclusive resource groups for Data Integration. The performance of an exclusive resource group for Data Integration is measured by the number of tasks that can be run in parallel. We recommend that you determine the specifications and subscription duration based on your business requirements before you purchase an exclusive resource group for Data Integration. For more information, see Billing of exclusive resource groups for Data Integration (subscription).

Precautions

Exclusive resource groups for Data Integration support data synchronization in complex network environments. For example, you can use an exclusive resource group for Data Integration to synchronize data across cloud environments (Alibaba Finance Cloud and Alibaba Gov Cloud), across Alibaba Cloud accounts, or from or to data centers. Before you run a synchronization task, you must make sure that network connections are established between your resource group and data sources and the IP address whitelists of the data sources are configured to ensure accessibility. If the network connections are not established, your synchronization task cannot be run. For more information about solutions for the network connectivity between an exclusive resource group for Data Integration and a data source, and precautions for configuring an IP address whitelist for a data source, see Exclusive resource groups for Data Integration.

Note
  • If you do not need to connect your resource group to your data source and only want to resolve task latency caused by insufficient resources in the shared resource group for Data Integration, you can ignore the network settings described in this topic. You can purchase an exclusive resource group for Data Integration in any zone without configuring network settings for the resource group.

  • By default, an exclusive resource group for Data Integration can access the Internet. However, the access performance cannot be ensured because the access uses an Internet Shared Bandwidth instance. To ensure the access performance, you can use a serverless resource group.

Limits

  • Only an Alibaba Cloud account or a RAM user to which the AliyunBSSOrderAccess and AliyunDataWorksFullAccess policies are attached can create a resource group.

  • Only a workspace administrator can associate a resource group with a workspace and change the workspace with which a resource group is associated.

  • For information about the permissions that are required to use the features and perform operations on the Resource Groups page of the DataWorks console, see Custom policies used to manage permissions on the entities in the DataWorks console.

  • For information about how to create a custom policy and attach the custom policy to a RAM user, see (Optional) Create a custom policy.

  • You can associate an exclusive resource group for Data Integration that uses the specifications of 4 vCPUs and 8 GiB of memory with a maximum of two VPCs. You can associate an exclusive resource group for Data Integration that uses the other specifications with a maximum of three VPCs.
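The VPC association limit in the last item can be expressed as a small check. The following Python sketch is illustrative only: the function names are hypothetical, and the 8 vCPU and 16 GiB specification in the usage note is just an example of a specification other than 4 vCPUs and 8 GiB.

```python
def max_vpc_associations(vcpus: int, memory_gib: int) -> int:
    """Return the VPC association limit stated in the Limits section."""
    # The 4 vCPU / 8 GiB specification is limited to two VPCs.
    if (vcpus, memory_gib) == (4, 8):
        return 2
    # All other specifications are limited to three VPCs.
    return 3


def can_add_vpc(vcpus: int, memory_gib: int, associated_vpcs: int) -> bool:
    """Return True if one more VPC can still be associated."""
    return associated_vpcs < max_vpc_associations(vcpus, memory_gib)
```

For example, can_add_vpc(4, 8, 2) returns False because the limit of two VPCs is already reached, whereas can_add_vpc(8, 16, 2) returns True.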

Step 1: Create an exclusive resource group for Data Integration

DataWorks provides exclusive resource groups that are charged based on the subscription billing method. You can purchase such a resource group based on the instructions provided in this section.

Note

Only an Alibaba Cloud account or a RAM user to which the AliyunBSSOrderAccess and AliyunDataWorksFullAccess policies are attached can create a resource group.

  1. Log on to the DataWorks console.

  2. In the left-side navigation pane, click Resource Group. On the Exclusive Resource Groups tab of the Resource Groups page, click Create Resource Group for Data Integration of Old Version. On the buy page, configure the parameters. The following table describes the parameters.

    Parameter

    Description

    Region

    The region in which you want to create and use the exclusive resource group.

    Note

    An exclusive resource group for Data Integration cannot be shared across regions. For example, exclusive resource groups in the China (Shanghai) region can be used only by the workspaces in the China (Shanghai) region.

    Type

    The type of the exclusive resource group. Select Exclusive Resource Groups for Data Integration for this parameter.

    Resource Group Name

    The name of the exclusive resource group for Data Integration. The name must be unique within a tenant. Otherwise, an error is reported when you confirm the purchase operation.

    Note

    A tenant refers to an Alibaba Cloud account. Each tenant can have multiple RAM users.

    Resource Group Description

    The description of the exclusive resource group for Data Integration.

    Duration

    Exclusive resource groups for Data Integration are charged based on the subscription billing method. To ensure service continuity, we recommend that you select Auto-renewal. You can also go to the Renewal Management page to enable or disable auto renewal after the resource group is created. For more information, see General reference: Stop using DataWorks commodities.

    You can configure other parameters based on your business requirements.

  3. Click Buy Now, and pay the order as prompted.

    Then, DataWorks starts to initialize the resource group. When the resource group enters the Running state, the resource group is created in the DataWorks console.

    Note

    DataWorks requires approximately 20 minutes to initialize the exclusive resource group for Data Integration. Wait until the status of the resource group changes to Running.

After the exclusive resource group for Data Integration is created in the DataWorks console, you must associate the resource group with a workspace. This way, you can select the resource group when you configure a task in the workspace.
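Because initialization takes approximately 20 minutes, a script that depends on the resource group may need to wait for the Running state. The following Python sketch shows one way to poll for it. It makes no assumption about how the status is obtained: get_status is a hypothetical callable that you supply, and the status name "Initializing" in the usage note, the 30-minute timeout, and the 60-second interval are illustrative values, not DataWorks defaults.

```python
import time
from typing import Callable


def wait_until_running(
    get_status: Callable[[], str],
    timeout_s: float = 30 * 60,
    interval_s: float = 60,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_status() until it returns "Running" or the timeout expires.

    get_status is a caller-supplied function (hypothetical here) that
    returns the current status of the resource group as a string.
    """
    deadline = clock() + timeout_s
    while True:
        if get_status() == "Running":
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

For example, if get_status returns "Initializing" twice and then "Running", the function returns True after the third check. The sleep and clock parameters exist only so that the loop can be exercised without real waiting.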

Step 2: Associate the exclusive resource group for Data Integration with a workspace

You must associate the exclusive resource group for Data Integration with a workspace before you can use the resource group in the workspace. An exclusive resource group for Data Integration can be shared among multiple workspaces but cannot be used across regions. For example, you can associate an exclusive resource group for Data Integration in the China (Shanghai) region only with workspaces in the China (Shanghai) region. To associate the created exclusive resource group for Data Integration with a workspace, perform the following steps:

Note

Only a workspace administrator can associate a resource group with a workspace and change the workspace with which a resource group is associated.

  1. Log on to the DataWorks console.

  2. In the left-side navigation pane, click Resource Group. On the Exclusive Resource Groups tab of the Resource Groups page, find the exclusive resource group for Data Integration and click Associate Workspace in the Actions column.

  3. In the Associate Workspace panel, find the workspace with which you want to associate the resource group and click Associate in the Actions column.
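The region rule described in this step can be sketched as a simple filter. The Workspace class, the function name, and the workspace names below are hypothetical and exist only for illustration. The sketch assumes that regions are compared by their region IDs, such as cn-shanghai.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Workspace:
    name: str
    region: str  # region ID, for example "cn-shanghai"


def associable_workspaces(
    resource_group_region: str, workspaces: List[Workspace]
) -> List[Workspace]:
    """Return the workspaces that are in the same region as the resource group.

    A resource group can be shared among multiple workspaces,
    but only among workspaces in its own region.
    """
    return [w for w in workspaces if w.region == resource_group_region]
```

Given a resource group in cn-shanghai and two workspaces, one in cn-shanghai and one in cn-beijing, only the first workspace is returned.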

Step 3: Configure network settings for the exclusive resource group

Associate the exclusive resource group with a VPC

Exclusive resource groups reside in the VPC in which DataWorks is hosted and are disconnected from other network environments. To use an exclusive resource group, you must associate the exclusive resource group with a VPC that can connect to data sources. This way, the exclusive resource group can access the data sources over the VPC. To associate the exclusive resource group for Data Integration with a VPC, perform the following steps:

Important

You can associate an exclusive resource group for Data Integration that uses the specifications of 4 vCPUs and 8 GiB of memory with a maximum of two VPCs. You can associate an exclusive resource group for Data Integration that uses the other specifications with a maximum of three VPCs.

  1. Log on to the DataWorks console.

  2. In the left-side navigation pane, click Resource Group. On the Exclusive Resource Groups tab of the Resource Groups page, find the exclusive resource group for Data Integration and click Network Settings in the Actions column.

    Before you associate the exclusive resource group for Data Integration with a VPC, you must log on to the RAM console with your Alibaba Cloud account and authorize DataWorks to access your cloud resources. You can go to the Cloud Resource Access Authorization page to authorize DataWorks to access your cloud resources. You can also authorize DataWorks to access your cloud resources by clicking the related button in the dialog box that is displayed the first time you log on to the DataWorks console with your Alibaba Cloud account.

  3. Associate the exclusive resource group with a VPC.

    1. On the VPC Binding tab of the page that appears, click Add VPC Association. In the Add VPC Association dialog box, configure the parameters. You must configure the parameters based on the network environments of your data source and resource group. The following table describes the details.

      Note

      If you want to use the resource group to access a data source, such as an Alibaba Cloud data source or a self-managed data source hosted on an Elastic Compute Service (ECS) instance, you can select a network connectivity solution and configure network settings based on whether the resource group and data source belong to the same Alibaba Cloud account.

      Parameter: VPC

      Description (same region and Alibaba Cloud account)

      If your data source and the exclusive resource group belong to the same Alibaba Cloud account, we recommend that you select the VPC in which your data source resides.

      If your data source and the exclusive resource group belong to different Alibaba Cloud accounts, configure this parameter based on the description for the scenario where your data source and the exclusive resource group reside in different regions.

      Description (different regions or Alibaba Cloud accounts)

      If your data source and the exclusive resource group belong to different Alibaba Cloud accounts or reside in different regions, you must select a VPC that connects to the data source. For example, if your data source does not reside in a VPC, you can click Create VPC to create a VPC for the exclusive resource group. After the VPC is created, you can select it from the VPC drop-down list. You can also select a VPC that connects to your data source.

      Note

      If your data source and the exclusive resource group reside in different regions or belong to different Alibaba Cloud accounts, you must use VPN Gateway or Express Connect to establish a connection between the VPC with which the exclusive resource group is associated and the VPC in which the data source resides and add a route that points to the IP address of the data source for the exclusive resource group. For more information, see Network connectivity solutions.

      Parameter: Zone

      Description (same region and Alibaba Cloud account)

      Select the zone in which your data source resides.

      Description (different regions or Alibaba Cloud accounts)

      Select a zone from which a network connection to your data source is established.

      Parameter: vSwitch

      Description (same region and Alibaba Cloud account)

      If you set the VPC parameter to the VPC in which your data source resides, we recommend that you select the vSwitch with which the data source is associated.

      Note

      After you associate the exclusive resource group with the VPC in which the data source resides and a vSwitch that resides in the VPC, a route that points to the CIDR block of the VPC is automatically added. This ensures that the exclusive resource group can access the data sources in this VPC.

      Description (different regions or Alibaba Cloud accounts)

      Select the vSwitch to which the data source is connected. If no vSwitch is available, you can click Create VSwitch to create a vSwitch for the exclusive resource group. After a vSwitch is created, select the vSwitch.

    2. Click OK.

    Note

    If your data source and the exclusive resource group reside in different regions or belong to different Alibaba Cloud accounts, you must add a route that points to the IP address of your data source after you associate the exclusive resource group with a VPC.

  4. Add host configurations. This operation is optional.

    Some data sources cannot be accessed by IP address and can be accessed only by hostname. In this case, you must perform the following steps to add host configurations. Otherwise, the connectivity test fails when you add the data source by using its hostname.

    1. Click the Hostname-to-IP Mapping tab. On this tab, click Add. In the Create Hostname-to-IP Mapping dialog box, configure the parameters. The following table describes the parameters.

      Parameter

      Description

      IP Address

      The actual IP address of the data source.

      Hostname

      The hostname that is used to access the data source. If you want to specify multiple hostnames, place each hostname in a separate row.

    2. If the data source has multiple IP addresses, click Add to add more host configurations.

      Note
      • The IP address and hostnames that you add in a host configuration must be different from the IP addresses and hostnames in existing host configurations.

      • You can map one IP address to multiple hostnames in a host configuration. However, one hostname can point to only one IP address.
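The mapping rules in the preceding note can be checked before you enter host configurations in the console. The following Python sketch is a local validation aid, not part of DataWorks: the function name and the sample hostnames are hypothetical, and treating hostnames as case-insensitive is an assumption of this sketch.

```python
import ipaddress
from typing import Dict, List, Tuple


def build_host_map(entries: List[Tuple[str, List[str]]]) -> Dict[str, str]:
    """Validate host configurations and return a hostname -> IP mapping.

    Each entry is (ip_address, [hostnames]). One IP address may map to
    several hostnames, but an IP address or hostname must not appear in
    more than one entry. Raises ValueError on any violation.
    """
    host_to_ip: Dict[str, str] = {}
    seen_ips = set()
    for ip, hostnames in entries:
        # ip_address() raises ValueError if the string is not a valid IP.
        ip_norm = str(ipaddress.ip_address(ip))
        if ip_norm in seen_ips:
            raise ValueError(f"IP address {ip_norm} is already configured")
        seen_ips.add(ip_norm)
        for name in hostnames:
            key = name.strip().lower()  # assumption: compare case-insensitively
            if key in host_to_ip:
                raise ValueError(f"hostname {key} is already configured")
            host_to_ip[key] = ip_norm
    return host_to_ip
```

For example, one entry that maps 192.168.0.10 to two hostnames is accepted, whereas two entries that map the same hostname to 192.168.0.10 and 192.168.0.11 raise ValueError.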

Configure the IP address whitelist of a data source

Even if your exclusive resource group for Data Integration and your data source reside in the same zone, same VPC, and same vSwitch, the access from the resource group to the data source may still fail due to restrictions of the IP address whitelist of the data source. In this case, you must configure the IP address whitelist of your data source based on the following instructions:

  • If you want to establish a network connection between the exclusive resource group and your data source over an internal network, you must add the CIDR block of the vSwitch with which the resource group is associated to the IP address whitelist of your data source.

    To view the CIDR block of the vSwitch with which the resource group is associated, perform the following operations: Log on to the DataWorks console and click Resource Group in the left-side navigation pane. On the Exclusive Resource Groups tab of the Resource Groups page, find the exclusive resource group and click Network Settings in the Actions column. On the VPC Binding tab of the page that appears, you can view the CIDR block in the VSwitch CIDR Block column.

  • If you want to establish a network connection between the exclusive resource group and your data source over the Internet, you must add the EIP of the resource group to the IP address whitelist of your data source.
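To see why the vSwitch CIDR block or the EIP must be in the whitelist, it can help to check whether a given source address is covered by the whitelist entries. The following Python sketch uses the standard ipaddress module. The function name and all addresses are hypothetical, and the sketch assumes that whitelist entries are either single IP addresses or CIDR blocks.

```python
import ipaddress
from typing import Iterable


def is_whitelisted(source_ip: str, whitelist: Iterable[str]) -> bool:
    """Return True if source_ip matches any whitelist entry.

    Entries may be single IP addresses ("203.0.113.7") or CIDR blocks
    ("192.168.1.0/24"). A single address is treated as a one-host network.
    """
    addr = ipaddress.ip_address(source_ip)
    for entry in whitelist:
        net = ipaddress.ip_network(entry, strict=False)
        if addr.version == net.version and addr in net:
            return True
    return False
```

For example, an address of 192.168.1.25 is covered by a whitelist that contains 192.168.1.0/24, but 192.168.2.25 is not.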

Step 4: Test the network connectivity of the exclusive resource group

After you complete the preceding network configuration, you need to test the network connectivity between the resource group and your data source by performing the following operations:

  1. Go to the Data Sources page.

    1. Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose More > Management Center. On the page that appears, select the desired workspace from the drop-down list and click Go to Management Center.

    2. In the left-side navigation pane of the SettingCenter page, click Data Sources.

  2. On the Data Sources tab of the Data Sources page, find the desired data source and click Modify in the Operation column.

  3. In the Connection Configuration section of the page that appears, select Data Integration, find the exclusive resource group for Data Integration that you want to use, and then click Test Network Connectivity in the Connection Status column. If the connectivity status is Connected, a network connection is established between the resource group and data source.

    Note

    If a network connection cannot be established between the resource group and the data source, click Self-service Troubleshoot in the Connection Status column to choose a diagnostic tool to diagnose network connection exceptions. For more information about the solutions for network connectivity between an exclusive resource group and data sources that reside in various network environments, see Network connectivity solutions.

  4. Click Complete Modification.
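If the console test fails, a basic TCP check from a machine in the same VPC and vSwitch can help you narrow down the problem. The following Python sketch is such a check. It does not replace Test Network Connectivity, because it runs on the machine where you execute it and not on the resource group. The function name is hypothetical, and the host and port that you pass are those of your own data source.

```python
import socket


def can_connect(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A result of False indicates only that the TCP connection could not be opened from that machine. The cause can be routing, a security group, an IP address whitelist, or hostname resolution.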

What to do next

View the resource usage of the exclusive resource group and monitor the resource group

You can view the resource usage of the exclusive resource group and the number of instances that are waiting for resources in the resource group in the DataWorks console. You can also use the intelligent monitoring feature provided in Operation Center to monitor the resource usage of the resource group and the number of instances that are waiting for resources in the resource group. For more information about how to view the resource usage of a resource group, see View the resource usage of an exclusive resource group. For more information about how to monitor a resource group, see Create a custom alert rule.

Change the zone of the exclusive resource group

To change the zone of the exclusive resource group, perform the following steps:

  1. Log on to the DataWorks console.

  2. In the left-side navigation pane, click Resource Group. On the Exclusive Resource Groups tab of the Resource Groups page, find an exclusive resource group whose purpose is Data Integration.

  3. In the Actions column, click the icon and select Change Zone.

  4. In the Change Zone for Resource Group dialog box, configure the Current Zone, Machines, New Zone, and Number of Machines to Use parameters.

  5. Click OK.

If you change the zone of a resource group to another zone, network changes may occur:

  • CIDR blocks of the resource group: Each zone in which the ECS instances of a resource group reside has an independent CIDR block. If you change the zone of the resource group, the CIDR blocks of the resource group also change.

  • Primary ENI IP addresses of the resource group: If you change the zone of an ECS instance in the resource group, the primary ENI IP address of the ECS instance also changes and the system assigns an IP address that falls into the CIDR block of the new zone to the ECS instance.

  • Elastic IP address (EIP) associated with the resource group: If you add the CIDR block of the vSwitch with which the resource group is associated to the IP address whitelist of the desired data source, you do not need to update the address information in the IP address whitelist after the change. If you add the EIP associated with the resource group to the IP address whitelist, you must update the address information in the IP address whitelist after the change. This ensures that the resource group has permissions to access the data source and you can perform operations on the data source.
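After a zone change, you can compare the addresses that the resource group now uses with the whitelist of the data source. The following Python sketch reports which required entries are not covered. The function name and all addresses are hypothetical. The sketch assumes that you have read the current vSwitch CIDR block and EIP from the Network Settings page and copied the whitelist from the data source.

```python
import ipaddress
from typing import Iterable, List


def missing_whitelist_entries(
    required: Iterable[str], whitelist: Iterable[str]
) -> List[str]:
    """Return the required entries that no whitelist entry fully covers.

    Both arguments contain single IP addresses or CIDR blocks.
    """
    allowed = [ipaddress.ip_network(e, strict=False) for e in whitelist]
    missing = []
    for entry in required:
        net = ipaddress.ip_network(entry, strict=False)
        # subnet_of requires matching IP versions, so check the version first.
        covered = any(
            net.version == a.version and net.subnet_of(a) for a in allowed
        )
        if not covered:
            missing.append(entry)
    return missing
```

For example, if the whitelist contains 10.0.1.0/24 and 203.0.113.9, and the resource group now uses 10.0.2.0/24 and 203.0.113.9, the function returns ["10.0.2.0/24"], which is the entry to add.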

Appendix: Change the resource groups used by tasks to an exclusive resource group for Data Integration

After an exclusive resource group for Data Integration is created and configured, you can change the resource groups that are used to run tasks to the exclusive resource group by using one of the following methods.

Environment for the operation

Supported change operation

Entry point

Production environment

Change the resource groups for Data Integration for multiple tasks in the production environment at the same time

In the left-side navigation pane of the Operation Center page, choose Auto Triggered Node O&M > Auto Triggered Nodes.

On the page that appears, select the tasks for which you want to change the resource groups, click Actions in the lower part of the page, and then select Modify Data Integration Resource Group.

Development environment

  • Change the resource group for Data Integration for a single task in the development environment

  • Change the resource groups for Data Integration for multiple tasks in the development environment at the same time

Go to the DataStudio page.

  • Change the resource group for Data Integration for a single task in the development environment

    Go to the configuration tab of the task for which you want to change the resource group, click More in the Configure Network Connections and Resource Group step, select the type of the resource group that you want to use, and then select the desired resource group from the drop-down list.

  • Change the resource groups for Data Integration for multiple tasks in the development environment at the same time

    Click the Batch Operation icon. On the Batch Operation-Data Development tab, select the tasks for which you want to change the resource groups, click More in the lower part of the tab, and then select Modify Data Synchronization Task.

Note

If you cannot find the entry point of changing the resource groups for tasks, you can select Offline Synchronization from the Node Type drop-down list in the filter condition section to search for all batch synchronization tasks.