CloudBees Jenkins Operations Center administration guide


Shared clouds

Operations Center enables the sharing of cloud-provisioned agent executors across the client masters in the Operations Center cluster.

Currently there are a number of restrictions:

Restriction Description

Non-standard Launcher

If a non-standard launcher is used, the plugin defining the launcher must be installed on all the client masters within scope for using the shared agent, and the plugin versions on the client master and the Operations Center server must have compatible configuration data models. (A known incompatibility: versions of the ssh-slaves plugin before 0.23 use a significantly different configuration data model from versions after 1.0. This specific difference is not a concern, because all currently supported versions of Jenkins bundle a version of the ssh-slaves plugin newer than 1.0.)

Shared Agent Usage

Shared agents can only be used by sibling client masters or by client masters in sub-folders of the container where the shared agent item is defined.

One Shot Build Mode

Shared agents operate in a "one-shot" build mode, unless the client master loses its connection with the Operations Center server. While the connection is interrupted, client masters use any matching agents on-lease to perform builds. If there are no agents on-lease to the client master when the connection is interrupted, the client master may be unable to perform any builds (unless it has dedicated executors available).

Shared Agent with More than One Executor

If an agent is configured with more than one executor, the other executors are available to start builds as long as at least one executor is in use and no more builds than the number of configured executors have been started on the agent. In other words, an agent configured with four executors accepts up to four builds on a client master, but once at least one build has completed, the agent is returned as soon as it becomes idle, even if fewer than four builds have run during the lease.

Built-in Garbage Collection

If the backing cloud provider plugin performs built-in garbage collection and does not use the Node Iterator API to iterate the nodes that are in use, then the built-in garbage collection may result in the termination of in-progress builds when nodes are on-lease to client masters.
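The multi-executor restriction above can be illustrated with a small model. This is only a sketch of the rule as described in this section, not the product's implementation; the class `LeasedAgent` and its method names are hypothetical.

```python
class LeasedAgent:
    """Hypothetical model of one shared agent during a single lease."""

    def __init__(self, executors):
        self.executors = executors  # number of configured executors
        self.started = 0            # builds started during this lease
        self.busy = 0               # executors currently running a build
        self.returned = False       # True once the lease has ended

    def can_start_build(self):
        # No more builds than configured executors may start on one lease.
        if self.returned or self.started >= self.executors:
            return False
        # The first build is always accepted; later builds are accepted
        # only while at least one executor is still in use.
        return self.started == 0 or self.busy > 0

    def start_build(self):
        if not self.can_start_build():
            raise RuntimeError("agent cannot accept another build on this lease")
        self.started += 1
        self.busy += 1

    def finish_build(self):
        if self.busy == 0:
            raise RuntimeError("no build is running")
        self.busy -= 1
        # The agent is returned as soon as it becomes idle, even if fewer
        # builds than the number of executors were run.
        if self.busy == 0:
            self.returned = True
```

For example, an agent with four executors that starts two builds is returned once both have finished, without ever running a third or fourth build.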

The sharing model used for shared agents is the same as the credentials propagation and role-based access control plugin’s group inheritance model. Consider the following configuration:

sharing model
Figure 1. Sample configuration
  • There are three folders: F1, F2 and F1/F3

  • There are three shared clouds: F1/C1, F2/C2 and C3

  • There are four client masters: F1/F3/M1, F1/M2, F2/M3 and M4

The following logic is used to locate a shared cloud:

  • If there is a shared cloud with available capacity (or an idle shared agent) at the current level and that cloud can provision the labels required by the job, then that cloud is asked to provision an agent for lease.

  • If there is no matching shared cloud with available capacity (and no matching idle shared agent) at the current level, proceed to the parent level and repeat.

Thus:

  • F1/F3/M1 and F1/M2 are able to perform builds on agents provisioned from F1/C1 and C3 but not from F2/C2. F1/C1 is preferred as it is "nearer".

  • F2/M3 is able to perform builds on agents provisioned from F2/C2 and C3 but not from F1/C1. F2/C2 is preferred.

  • M4 is only able to perform builds on agents provisioned from C3.
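The lookup logic above can be sketched as a walk from the client master's folder up to the root. This is a minimal illustration under stated assumptions, not the product's code: the `SharedCloud` class, the `locate_cloud` function, and the representation of items as slash-separated paths are all hypothetical, and idle shared agents are folded into a single `has_capacity` flag.

```python
from dataclasses import dataclass


@dataclass
class SharedCloud:
    path: str                 # e.g. "F1/C1"; a root-level cloud is just "C3"
    labels: frozenset         # labels this cloud can provision
    has_capacity: bool = True


def container(path):
    """Return the folder that contains an item; the root is ''."""
    return path.rpartition("/")[0]


def locate_cloud(master_path, required_labels, clouds):
    """Walk from the master's folder towards the root and return the
    nearest cloud with capacity that can provision the required labels."""
    level = container(master_path)
    while True:
        for cloud in clouds:
            if (container(cloud.path) == level
                    and cloud.has_capacity
                    and required_labels <= cloud.labels):
                return cloud
        if level == "":
            return None       # reached the root without finding a match
        level = container(level)


# The sample configuration from Figure 1 (labels are illustrative).
linux = frozenset({"linux"})
clouds = [SharedCloud("F1/C1", linux),
          SharedCloud("F2/C2", linux),
          SharedCloud("C3", linux)]

for master in ("F1/F3/M1", "F1/M2", "F2/M3", "M4"):
    found = locate_cloud(master, linux, clouds)
    print(master, "->", found.path if found else None)
```

Because the walk only ever moves towards the root, a master under F1 can never reach F2/C2, which matches the outcomes listed above.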

Under normal operation, an agent leased to a client master is leased for one and only one build. Once the build is completed, the agent is returned from its lease. This is known as "one-shot" build mode. If the connection between the client master and the Operations Center server is interrupted while the agent is on lease, the client master is unable to return the agent until the connection is re-established. While in this state, the agent is available for re-use by the client master.
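The effect of a connection interruption on one-shot build mode can be modeled as a small state machine. This is a sketch of the behavior described in the paragraph above, assuming one build at a time for simplicity; the class `OneShotLease` and its methods are hypothetical and do not correspond to any product API.

```python
class OneShotLease:
    """Hypothetical model of a lease held by a client master."""

    def __init__(self):
        self.connected = True       # connection to Operations Center
        self.building = False       # a build is currently running
        self.builds_completed = 0
        self.returned = False       # True once the agent is returned

    def can_start_build(self):
        if self.returned or self.building:
            return False
        # One-shot: only the first build is allowed, unless the connection
        # is down and the agent therefore cannot be returned yet.
        return self.builds_completed == 0 or not self.connected

    def start_build(self):
        if not self.can_start_build():
            raise RuntimeError("lease does not allow another build")
        self.building = True

    def finish_build(self):
        if not self.building:
            raise RuntimeError("no build is running")
        self.building = False
        self.builds_completed += 1
        self._return_if_possible()

    def set_connected(self, connected):
        self.connected = connected
        self._return_if_possible()

    def _return_if_possible(self):
        # The agent can only be returned while connected and idle.
        if self.connected and not self.building and self.builds_completed > 0:
            self.returned = True
```

With the connection up, the lease ends after the first build. If the connection drops mid-build, the agent stays with the client master and can run further builds until the connection is re-established.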

Installing a shared cloud

To install a shared cloud, first decide where to create it:

  • If you want to create the shared cloud item at the root of the CloudBees Jenkins Operations Center server (so that the cloud is available to all client masters), navigate to the root and select New Job.

  • If you want to create the shared cloud item within a folder in the CloudBees Jenkins Operations Center server (so that the cloud is available only to client masters within that folder or within the folder's sub-folders), navigate to that folder and select New Item.

In either case, the standard new item screen appears:

new item screen
Figure 2. New item screen
  1. Provide a name and select Shared Cloud as the type.

    Once the cloud is created, you are redirected to the configuration screen. A newly created shared cloud is in the offline state:

    configuration screen
    Figure 3. Configuration screen

    The configuration screen options are analogous to the standard Jenkins cloud configuration options.

  2. When the cloud configuration is complete, save or apply the configuration. If you do not want the cloud available for immediate use, deselect the Take on-line after save/apply checkbox before saving the configuration.

Adding Cloud-defined Node Properties

The Jenkins Cloud API allows a cloud to define the node properties of the agents it provisions. This lets an implementation of the cloud extension point use node properties to track and identify the agents it has provisioned. As such, you cannot directly define node properties for agents provisioned through the Jenkins Cloud API: doing so would remove any tracking node properties injected by the cloud implementation.

You can, however, inject or override additional node properties. The initial implementation supports two specific types of node property:

  • Environment Variables

  • Tool Locations

Note

While the configuration may look similar to that of directly attached agents, it must be stressed that these properties are injected and merged into the list of properties provided by the implementation of the Jenkins Cloud API itself.

Jenkins does not provide an API for merging NodeProperty implementations, so each NodeProperty implementation needs its own explicit merge logic.

Customers who have written their own plugins that provide custom implementations of NodeProperty can write a custom plugin for Operations Center (see Creating Custom Extensions) that provides implementations of com.cloudbees.opscenter.server.properties.NodePropertyCustomizer with the appropriate injection/override logic for these custom node properties.

Injecting and/or Overriding Node Properties

To inject and/or override node properties, navigate to and select the Inject/Override Node Properties option on the configuration screen, then add the required node property customizers:

  • Environment Variables adds/updates the environment variables node property with the supplied values.

  • Tool Locations adds/updates the tool locations. The required tool installers must be defined with the same names both on Operations Center and on the client masters to which agents are leased.

node property override
Figure 4. Injecting/overriding the environment variable node properties of provisioned agents
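The add/update behavior of the two customizers can be pictured as a dictionary merge in which injected values win over values supplied by the cloud implementation. This is a conceptual sketch only; the function names, variable names, and paths below are hypothetical, and the real merge logic lives in the Java NodePropertyCustomizer implementations.

```python
def merge_environment(cloud_env, injected_env):
    """Return the environment seen by the agent: variables supplied by the
    cloud implementation are kept, injected values add to or override them."""
    merged = dict(cloud_env)
    merged.update(injected_env)
    return merged


def missing_tool_definitions(tool_locations, oc_tools, master_tools):
    """Return the tool names from an injected tool-location mapping that are
    not defined under the same name on both Operations Center and the
    client master."""
    return sorted(name for name in tool_locations
                  if name not in oc_tools or name not in master_tools)


# Illustrative values only.
cloud_env = {"CLOUD_NODE_ID": "node-123", "JAVA_HOME": "/usr/lib/jvm/default"}
injected = {"JAVA_HOME": "/opt/jdk", "DEPLOY_ENV": "staging"}
print(merge_environment(cloud_env, injected))

tool_locations = {"maven-3": "/opt/maven", "jdk-11": "/opt/jdk"}
print(missing_tool_definitions(tool_locations,
                               oc_tools={"maven-3", "jdk-11"},
                               master_tools={"maven-3"}))
```

Note that the tracking value supplied by the cloud (here `CLOUD_NODE_ID`) survives the merge, which is the reason properties are merged rather than replaced wholesale.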

Reviewing Common Shared Cloud Tasks

Taking a Cloud Offline

To take a shared cloud offline (for example: for maintenance of the server that hosts the shared cloud or to make configuration changes to the shared cloud):

Navigate to the shared cloud screen and select Take off-line.

online state
Figure 5. An online shared cloud

Taking a Cloud Online

To take a shared cloud online (for example: after maintenance of the server that hosts the shared cloud or after making configuration changes to the shared cloud):

Navigate to the shared cloud screen and select Take on-line.

offline state
Figure 6. An offline shared cloud

Configuring a Shared Cloud

To configure a shared cloud, you must take the cloud offline first. If the shared cloud is online, selecting the Configure action prompts you to take the cloud offline first.

configure online
Figure 7. Attempting to configure an online shared cloud

Deleting a Shared Cloud

To delete a shared cloud, you must take the cloud offline first. If the shared cloud is online, selecting the Delete action prompts you to take the cloud offline first.

delete online
Figure 8. Attempting to delete an online shared cloud

Moving a Shared Cloud

A shared cloud can be moved between folders by selecting the Move action.

Note

When moving shared clouds, remember that the JNLP agent launch commands include the path to the cloud. If you move a JNLP shared cloud, you need to update the JNLP agents to connect to the new location. Any agents that are connected while the move is in progress are unaffected. However, if the Operations Center master is restarted or fails over in an HA cluster, the JNLP agents are unable to reconnect until they are reconfigured with the new path.

Recovering "Lost" Agents

Occasionally, due to lost connections between client masters and the Operations Center server, a shared cloud's agent node may become temporarily stuck in an on-lease state, whereby the Operations Center server believes the node to be leased to a specific client master, but the client master has no knowledge of the node.

Built-in safety mechanisms kick in to identify and recover such "lost" nodes. By their nature, these processes perform cross-checks to ensure that an in-use node is not incorrectly recovered.

To start the recovery process, the client master that the node was leased to must be connected to the Operations Center server. Once the connection is established, it can take up to 15 minutes for the recovery process to progress through its checks. Under normal circumstances, the checks complete within two to three minutes.

If the automatic recovery process fails, use the Force release link on each lease record to force the lease into a returned state.

Caution
Forcing a lease into a released state bypasses all the safety checks that ensure that the agent is no longer in use.

Reviewing CLI Operations that Support Shared Clouds

The following CLI operations are designed to support management of shared clouds:

  • create-job can be used to create a shared cloud

  • disable-agent-trader can be used to take a shared cloud off-line

  • enable-agent-trader can be used to take a shared cloud on-line

  • list-leases queries the active leases of a shared cloud

  • shared-cloud-force-release can be used to force release of a "stuck" lease record

  • shared-cloud-delete can be used to delete a shared cloud

Note

The following CLI operations have been deprecated, and will be removed in a future release:

  • disable-slave-trader has been replaced with disable-agent-trader

  • enable-slave-trader has been replaced with enable-agent-trader
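When scripting these operations, it can help to map the deprecated command names to their replacements in one place. The sketch below only builds the argument list for the standard Jenkins CLI client (`java -jar jenkins-cli.jar -s URL command`); it does not run anything. The server URL, the item path `F1/C1`, and the assumption that a command takes the item path as its argument are illustrative, so check each command's actual arguments with the CLI `help` command before relying on them.

```python
# Deprecated command names and their replacements, as listed above.
DEPRECATED = {
    "disable-slave-trader": "disable-agent-trader",
    "enable-slave-trader": "enable-agent-trader",
}

# CLI operations that this section lists for shared clouds.
SHARED_CLOUD_COMMANDS = {
    "create-job",
    "disable-agent-trader",
    "enable-agent-trader",
    "list-leases",
    "shared-cloud-force-release",
    "shared-cloud-delete",
}


def build_cli_command(server_url, command, *args, cli_jar="jenkins-cli.jar"):
    """Return the argv for a Jenkins CLI call, replacing deprecated names.

    The positional args are passed through unchanged; which arguments each
    command accepts is an assumption to verify against the CLI help output.
    """
    command = DEPRECATED.get(command, command)
    if command not in SHARED_CLOUD_COMMANDS:
        raise ValueError(f"not a shared cloud CLI operation: {command}")
    return ["java", "-jar", cli_jar, "-s", server_url, command, *args]


# Hypothetical server URL and cloud path, for illustration only.
print(build_cli_command("https://oc.example.com/", "disable-slave-trader", "F1/C1"))
```

The resulting list can be passed to `subprocess.run` once the arguments have been confirmed for the installed version.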