This weekend I am attending Markup UK at King’s College London, a two-day conference on XML and other markup technologies. I am presenting a paper on running XSpec tests in a serverless environment with AWS Lambda (which I blatantly titled XSpec in the Cloud with Diamonds). The paper is available here and the slides of my presentation are available here.
AWS Lambda and Jenkins Integration
Serverless is gaining attention as the next big thing in the DevOps space after containers. Developers are excited because they don’t have to worry about servers any more; Ops may be sceptical and slightly worried to hear about a world without servers (and without the sysadmins who maintain them). Can these two worlds co-exist? Can serverless just be another tool in the DevOps toolkit?
I recently implemented a real use case at work where we took advantage of an event-driven workflow to trigger Jenkins jobs originally created to be executed manually or on a schedule. The workflow is as follows:
1. New data is uploaded to an S3 bucket
2. The S3 event calls a lambda function that triggers a Jenkins job via the Jenkins API
3. The Jenkins job validates the data according to various criteria
4. If the job passes, the data is uploaded to an S3 bucket and a success message is sent to a Slack channel
5. If the job fails, a message with a link to the failed job is sent to a Slack channel
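Step 2 comes down to an authenticated HTTP POST against the Jenkins remote API. As a rough sketch (not the code used later in this post, which relies on the python-jenkins module), the URL for a parameterised build of a job inside a folder can be assembled as follows. It is written for Python 3, the host is a placeholder, `build_with_parameters_url` is a name introduced here for illustration, and the `buildWithParameters` endpoint assumes the job declares a `filename` parameter.

```python
from urllib.parse import quote, urlencode


def build_with_parameters_url(base_url, job_path, params):
    """Return the Jenkins remote-API URL that queues a parameterised build.

    job_path uses '/' between folder and job names, e.g. 'Pipeline/My_Job'.
    """
    # Jenkins addresses nested items as /job/<folder>/job/<name>
    segments = "/job/".join(quote(part, safe="") for part in job_path.split("/"))
    return "{}/job/{}/buildWithParameters?{}".format(
        base_url.rstrip("/"), segments, urlencode(params)
    )


# Placeholder host and parameter value, for illustration only
example_url = build_with_parameters_url(
    "http://jenkins.example.internal:8080/",
    "Pipeline/Digiteum_File_Transfer",
    {"filename": "test.xml"},
)
print(example_url)
```

A POST to that URL with the credentials of a user allowed to build the job is what queues the build; depending on the Jenkins security settings, a CSRF crumb may also be required.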
Jenkins User
Let’s start by creating a new user with the correct permissions in Jenkins. This lets us restrict what the lambda function can do in Jenkins.
In Manage Jenkins -> Manage Users -> Create User I create a user called lambda:
In Manage Jenkins -> Configure Global Security -> Authorization -> Matrix-based Security add the user lambda to User/group to add and set the permissions as in the matrix below:
This is a minimal setup that allows the lambda user to build jobs. Depending on your security policies, you may want to restrict the permissions of the lambda user further so that it can run only some specific jobs (you may need role-based authorization to set this up).
AWS IAM Role
Now let’s move to AWS and set up an IAM role for the lambda function. Head to IAM -> Roles and create a new role with the following policies (my role name is digiteum-file-transfer; sensitive information is obfuscated for security reasons):
This role allows the function to execute as a lambda function and to access S3 buckets as well as the Virtual Private Cloud (VPC).
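The exact policies attached to the role are not reproduced in the text. Purely as an illustration, the sketch below builds the kind of trust policy and permission statements a role like this typically needs. The action lists are an assumed minimal set (logging, reading the bucket, VPC networking), not the policies attached to digiteum-file-transfer, and `permissions_policy` is a hypothetical helper.

```python
import json

# Trust policy: lets the Lambda service assume the role
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "lambda.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}


def permissions_policy(bucket_name):
    """Assumed minimal permissions: logging, reading the bucket, VPC networking."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # write function logs to CloudWatch Logs
                "Effect": "Allow",
                "Action": ["logs:CreateLogGroup", "logs:CreateLogStream",
                           "logs:PutLogEvents"],
                "Resource": "arn:aws:logs:*:*:*",
            },
            {   # read objects from the triggering bucket
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": "arn:aws:s3:::{}/*".format(bucket_name),
            },
            {   # network interfaces used by a function attached to a VPC
                "Effect": "Allow",
                "Action": ["ec2:CreateNetworkInterface",
                           "ec2:DescribeNetworkInterfaces",
                           "ec2:DeleteNetworkInterface"],
                "Resource": "*",
            },
        ],
    }


print(json.dumps(permissions_policy("gadictionaries-leap-dev-digiteum"), indent=2))
```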
S3 Configuration
I create an empty S3 bucket using the wizard configuration in S3 and name it gadictionaries-leap-dev-digiteum. This is the bucket that is going to trigger the lambda function.
AWS Lambda Configuration
Finally, let’s write the lambda function. Go to Lambda -> Functions -> Create a Lambda Function. Select Python 2.7 as the runtime environment (read Limitations to see why I’m not using Python 3) and select a blank function.
In Configure Trigger, set up the trigger from S3 to Lambda: select the S3 bucket you created above (my S3 bucket is named gadictionaries-leap-dev-digiteum), select the event type to react to in S3 (my trigger is set to respond to any new file dropped in the bucket) and optionally select prefixes or suffixes for directories and file names (I only want the trigger to fire on XML files). Here is my trigger configuration:
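Expressed as data, a trigger like the one described (any object creation, `.xml` suffix only) would look roughly like the structure below, which follows the shape of an S3 bucket notification configuration. This is a sketch rather than an export of my settings: the function ARN passed in is up to you, and `key_matches` is a hypothetical helper that only mimics the prefix/suffix filter locally.

```python
def notification_configuration(function_arn, suffix=".xml"):
    """Build an S3 notification configuration that invokes a Lambda function."""
    return {
        "LambdaFunctionConfigurations": [{
            "LambdaFunctionArn": function_arn,
            # fire on every kind of object creation (put, post, copy, ...)
            "Events": ["s3:ObjectCreated:*"],
            "Filter": {"Key": {"FilterRules": [
                {"Name": "suffix", "Value": suffix},
            ]}},
        }]
    }


def key_matches(config, key):
    """Local approximation of the prefix/suffix filter applied by S3."""
    rules = config["LambdaFunctionConfigurations"][0]["Filter"]["Key"]["FilterRules"]
    for rule in rules:
        if rule["Name"] == "suffix" and not key.endswith(rule["Value"]):
            return False
        if rule["Name"] == "prefix" and not key.startswith(rule["Value"]):
            return False
    return True
```

A dictionary of this shape is what boto3's `put_bucket_notification_configuration` accepts as `NotificationConfiguration`, should you prefer scripting the trigger over clicking through the console.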
In Configure Function, choose a name for your function (mine is file_transfer) and check out the following Python code before uploading it:
    from __future__ import print_function

    import json
    import urllib
    import boto3
    import jenkins
    import os

    print('Loading lambda function')
    s3 = boto3.client('s3')
    # TODO: private IP of the EC2 instance where Jenkins is deployed, public IP won't work
    jenkins_url = 'http://123.45.56.78:8080/'

    # TODO: these environment variables should be encrypted in Lambda
    username = os.environ['username']
    password = os.environ['password']

    def lambda_handler(event, context):

        # Get the S3 object and its filename from the S3 event
        bucket = event['Records'][0]['s3']['bucket']['name']
        filename = urllib.unquote_plus(event['Records'][0]['s3']['object']['key'].encode('utf8'))
        try:
            # Connect to Jenkins and build a job
            server = jenkins.Jenkins(jenkins_url, username=username, password=password)
            server.build_job('Pipeline/Digiteum_File_Transfer', {'filename': filename})
            return 'Job Digiteum File Transfer started on Jenkins'
        except Exception as e:
            print(e)
            print('Cannot connect to Jenkins server or run build job')
            raise e
Note the following:
- Line 6 imports the python-jenkins module. This module is not in Python’s standard library and needs to be provided within the zip file (more on this in a minute).
- Line 12 sets up the URL of the EC2 instance where Jenkins is deployed. Note that you need to use the private IP address as shown in EC2; it won’t work if you use the public IP address or the Elastic IP address.
- Lines 15 and 16 set up the credentials of the Jenkins user lambda. The credentials will be exposed to the lambda function as environment variables and, unlike in this example, it is recommended to encrypt them.
- Lines 18-31 contain the handler function that is triggered automatically by a new file upload in the S3 bucket. The handler function does the following:
  - retrieve the filename of the new file uploaded on S3 (lines 21-22)
  - log into Jenkins via username and password for the lambda user (line 25)
  - build the job called Digiteum_File_Transfer in the folder Pipeline (line 26)
  - throw an error if it can’t connect to Jenkins or start the job (lines 28-31)
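The handler reads only the first record of the event and uses the Python 2 form `urllib.unquote_plus`. As a sketch of the same lookup under Python 3 (an assumption, since this post targets Python 2.7), the helper below decodes the key with `urllib.parse.unquote_plus` and walks every record in the event. `uploaded_objects` is a name introduced here for illustration.

```python
from urllib.parse import unquote_plus


def uploaded_objects(event):
    """Return (bucket, key) pairs for every record in an S3 event."""
    pairs = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        # S3 URL-encodes object keys in events (a space arrives as '+')
        key = unquote_plus(s3_info["object"]["key"])
        pairs.append((s3_info["bucket"]["name"], key))
    return pairs
```

Inside a handler, each returned key could then be passed as the `filename` parameter of a separate build.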
As an example, here is the zip file to upload in Configure Function. It contains the lambda function and all the Python modules needed, including the python-jenkins module. Make sure you edit the private IP address of your Jenkins instance in line 12. If you need to install additional Python modules, you can follow these instructions.
Here is what my Configure Function looks like:
Note the name (it should read file_transfer instead of file_transfe), the handler (as in the Python code above), and the role (as created in IAM). Note also that the username and the password of the Jenkins user lambda are provided as environment variables (ideally, you should encrypt these values by using the option Enable encryption helpers).
Once you’ve done the basic configuration, click on Advanced Settings. Here you need to select the VPC, subnet, and security group of the EC2 instance where Jenkins is running (all these details about the instance are in EC2 -> Instances). The lambda function needs to run in the same VPC as Jenkins, otherwise it cannot connect to Jenkins. For example, here is what my advanced settings look like (sensitive information is obfuscated):
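If the function times out when calling Jenkins, one way to tell a networking problem from an authentication problem is to test plain TCP reachability from inside the function. This is a diagnostic sketch added for illustration, not part of the setup described here; `can_reach` is a hypothetical helper that uses only the standard library.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, unreachable hosts and timeouts
        return False
```

Temporarily logging `can_reach('123.45.56.78', 8080)` at the top of the handler (with your own private IP) shows whether the VPC, subnet, and security group let the function reach Jenkins at all.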
Finally, review your settings and click on Create Function.
Test the Lambda Function
Once you have created the lambda function, configure a test event to make sure it behaves as intended. Go to Actions -> Configure test event and select S3 Put to simulate a data upload in the S3 bucket. You need to replace the bucket name (in this example gadictionaries-leap-dev-digiteum) and the name of an object in that bucket (in this example I uploaded a file to the bucket and called it test.xml). Here is a test example to adapt:
    {
      "Records": [
        {
          "eventVersion": "2.0",
          "eventTime": "1970-01-01T00:00:00.000Z",
          "requestParameters": {
            "sourceIPAddress": "127.0.0.1"
          },
          "s3": {
            "configurationId": "testConfigRule",
            "object": {
              "eTag": "0123456789abcdef0123456789abcdef",
              "sequencer": "0A1B2C3D4E5F678901",
              "key": "test.xml",
              "size": 1024
            },
            "bucket": {
              "arn": "arn:aws:s3:::gadictionaries-leap-dev-digiteum",
              "name": "gadictionaries-leap-dev-digiteum",
              "ownerIdentity": {
                "principalId": "EXAMPLE"
              }
            },
            "s3SchemaVersion": "1.0"
          },
          "responseElements": {
            "x-amz-id-2": "EXAMPLE123/5678abcdefghijklambdaisawesome/mnopqrstuvwxyzABCDEFGH",
            "x-amz-request-id": "EXAMPLE123456789"
          },
          "awsRegion": "us-east-1",
          "eventName": "ObjectCreated:Put",
          "userIdentity": {
            "principalId": "EXAMPLE"
          },
          "eventSource": "aws:s3"
        }
      ]
    }
Click on Save and Test and you should see the lambda function in action. Go to Jenkins and check that the job has been executed by user lambda. If it doesn’t work, have a look at the logging in AWS Lambda to debug what went wrong.
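A similar event can also be generated in code, which is handy for calling `lambda_handler` from a local Python session before uploading the zip. The helper below is a hypothetical convenience written for Python 3. It keeps only the fields the handler reads plus a few identifying ones, so it is a trimmed approximation of the console's S3 Put template, not a full event.

```python
from urllib.parse import quote_plus


def make_s3_put_event(bucket, key, size=1024):
    """Build a minimal S3 'ObjectCreated:Put' event for local testing."""
    return {
        "Records": [{
            "eventSource": "aws:s3",
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": bucket, "arn": "arn:aws:s3:::" + bucket},
                # approximate the URL-encoding of keys in real events:
                # spaces become '+', '/' is kept as a path separator
                "object": {"key": quote_plus(key, safe="/"), "size": size},
            },
        }]
    }
```

Calling the handler with `make_s3_put_event('gadictionaries-leap-dev-digiteum', 'test.xml')` and `None` as the context exercises the same code path as the console test, provided the machine you run it on can reach Jenkins.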
Slack Configuration
Finally, I set up a Slack integration in Jenkins so that every time the Jenkins job is executed, a notification is sent to a Slack channel. This also allows several people to get notified about a new data delivery.
First, install and configure the Slack plugin in Jenkins following the instructions on the GitHub page. The main configuration is done in Manage Jenkins -> Configure System -> Global Slack Notifier Settings. For example, this is my configuration:
- Team Subdomain is the name of your Slack account
- Channel is the name of your default Slack channel (you can override this in every job)
- Integration Token Credential ID is created by clicking Add and creating a token in Jenkins’ credentials. As the message says, it is recommended to use a token for security reasons. Here is an example of a Token Credential ID for Slack in Jenkins:
You typically want to add a notification to a specific Slack channel in your Jenkins job as a post-build action in order to report the result of a job. In Jenkins go to your job’s configuration, add Post-build Actions -> Slack Notifications and use settings similar to these:
This sends a notification to the Slack channel (either the default one set in Global Slack Notifier Settings or a new one set here in Project Channel) every time a job passes or fails. When a notification is sent to Slack, it will look like this:
Now you can keep both technical and non-technical users informed without having to create specific accounts on Jenkins or AWS or spamming users with emails.
Limitations
I ran into two problems that I have not yet been able to solve due to lack of time. I want to flag them because fixing them would improve the lambda function and make it more maintainable. If anyone wants to help me fix them, please send me your comments.
- Encryption: I tried to encrypt the Jenkins password but I could not make the lambda function decrypt it. I set up an encryption key in IAM -> Encryption keys -> Configuration -> Advanced settings -> KMS key and pasted the sample code into the lambda function, but the function timed out without giving an error message. I imported b64decode from the base64 module in the Python code, but there must be an issue with this instruction, which decrypts the variable ENCRYPTED:
DECRYPTED = boto3.client('kms').decrypt(CiphertextBlob=b64decode(ENCRYPTED))['Plaintext']
- Python 2.7: I wanted to use Python 3 but I had issues with the installation of some modules. Therefore I used Python 2.7 but the code should be compatible with Python 3 (apart from the imported modules).
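The cause of the timeout in the encryption item is not established here. One possibility worth checking, offered as a guess rather than a diagnosis, is networking: a function attached to a VPC has no internet access by default, so a call to the KMS endpoint can hang until the function times out unless the subnet has a NAT gateway or a VPC endpoint for KMS. Separately, the decryption step itself can be isolated in a small function that receives the KMS client as an argument, which makes it testable without AWS. This is a sketch written for Python 3; `decrypt_env` is a hypothetical helper, and the only client call it relies on is `decrypt(CiphertextBlob=...)`, the same one used in the line above.

```python
from base64 import b64decode


def decrypt_env(name, kms_client, environ):
    """Decrypt a base64-encoded, KMS-encrypted environment variable."""
    ciphertext = b64decode(environ[name])
    response = kms_client.decrypt(CiphertextBlob=ciphertext)
    # KMS returns the plaintext as bytes
    return response["Plaintext"].decode("utf-8")


# Intended use inside the Lambda module (needs boto3 and AWS credentials):
#   password = decrypt_env('password', boto3.client('kms'), os.environ)
```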
Conclusion
Integrating AWS Lambda and Jenkins requires a little bit of configuration but I hope this tutorial may help other people to set it up. If the integration needs to be done the other way round (i.e. trigger a lambda function from a Jenkins job), check out the AWS Lambda Plugin.
I believe integrating AWS Lambda (or any FaaS) with Jenkins (or any CI/CD server) is particularly suited for the following use cases:
- Organisations that already have some DevOps practices in place and a history of build jobs but want to take advantage of the serverless workflow without completely re-architecting their infrastructure.
- CI/CD pipelines that need to be triggered by events but are too complex or long to be crammed into a single function.