How to Validate a Jenkinsfile

Jenkins Pipeline
Image  jenkins.io ©

As I use Jenkins Pipelines more and more often, I needed a way to validate a Jenkinsfile and fix syntax errors before committing the file to version control and running the pipeline job in Jenkins. This saves time during development and helps you follow best practices when writing a Jenkinsfile, as validation returns both errors and warnings.

There are several ways to validate a Jenkinsfile, and some editors like VS Code even have a built-in linter. Personally, the easiest way I have found to validate a Jenkinsfile is to run the following command from the command line (provided you have Jenkins running somewhere):

curl --user username:password -X POST -F "jenkinsfile=<Jenkinsfile" http://jenkins-url:8080/pipeline-model-converter/validate

Note the following:

  1. If your Jenkins is authenticating users, you need to pass the username and password; otherwise you can omit that part.
  2. By default, this command expects your Jenkinsfile to be called Jenkinsfile. If not, change the name in the command.
  3. Replace jenkins-url and possibly port 8080 with the URL and port where you are running Jenkins. You can also use localhost as the URL if you are running Jenkins on your machine.

If the Jenkinsfile validates, it will show a message like this one:

Jenkinsfile successfully validated.

Or, if you forgot to use steps within stage in your Jenkinsfile, the validation will flag an error like this:

Errors encountered validating Jenkinsfile:
WorkflowScript: 10: Unknown stage section "git". Starting with version 0.5, steps in a stage must be in a ‘steps’ block. @ line 10, column 9.
           stage('Checkout Code') {
           ^

WorkflowScript: 10: Expected one of "steps", "stages", or "parallel" for stage "Checkout Code" @ line 10, column 9.
           stage('Checkout Code') {
           ^
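The curl call above can also be scripted. Here is a minimal Python 3 sketch that uses only the standard library. The function names (build_multipart, validate_jenkinsfile, is_valid) are made up for this example, and it assumes the endpoint accepts the Jenkinsfile as a plain multipart form field, which is what curl sends with -F "jenkinsfile=<Jenkinsfile". Depending on your Jenkins security settings, the request may also need a CSRF crumb, which this sketch does not handle.

```python
import base64
import urllib.request
import uuid


def build_multipart(field, content, boundary):
    """Encode one text form field as a multipart/form-data body (bytes)."""
    lines = [
        "--" + boundary,
        'Content-Disposition: form-data; name="{}"'.format(field),
        "",
        content,
        "--" + boundary + "--",
        "",
    ]
    return "\r\n".join(lines).encode("utf-8")


def validate_jenkinsfile(path, base_url, username=None, password=None):
    """POST the Jenkinsfile to the validation endpoint and return the reply text."""
    with open(path, encoding="utf-8") as handle:
        content = handle.read()
    boundary = uuid.uuid4().hex
    request = urllib.request.Request(
        base_url.rstrip("/") + "/pipeline-model-converter/validate",
        data=build_multipart("jenkinsfile", content, boundary),
        headers={"Content-Type": "multipart/form-data; boundary=" + boundary},
        method="POST",
    )
    if username is not None:
        # Same effect as curl --user username:password (HTTP basic auth)
        token = base64.b64encode("{}:{}".format(username, password).encode("utf-8"))
        request.add_header("Authorization", "Basic " + token.decode("ascii"))
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def is_valid(reply):
    """True when the reply contains the success message shown above."""
    return "successfully validated" in reply
```

For example, is_valid(validate_jenkinsfile("Jenkinsfile", "http://jenkins-url:8080", "username", "password")) could be used in a pre-commit hook to reject a Jenkinsfile that does not validate.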

Happy validation!

How to encrypt and decrypt emails and files

Privacy Encryption
Image  Richard Patterson CC BY 2.0

Somewhere I read that sending unencrypted email is like sending postcards: anyone can potentially read them. This is not nice for privacy but becomes very dangerous when the content of the email or attached files contains secrets like passwords, access keys, etc. Anyone who can get hold of your email can also potentially access your systems.

For sending encrypted email I generally use Enigmail, which is a data encryption and decryption extension for the Thunderbird email client. I have also used Mailvelope, which is an add-on for Firefox and Chrome that integrates encryption into webmail providers such as Gmail, Outlook, etc. These tools simplify the encryption/decryption process, especially if you are not familiar with it.

However, I have sometimes had to encrypt large files containing data dumps. The challenge with email extensions is that they don’t allow you to send email with such huge attachments. Plus, Mailvelope doesn’t allow you to encrypt files larger than 25 MB. This is when knowing how to encrypt and decrypt a file via the command line comes in handy. You can easily upload a large encrypted file to an FTP server or cloud hosting service without worrying that the file will end up in the wrong hands. As a bonus, GPG compresses data before encrypting it by default, so the encrypted file is generally smaller than the original and the upload is also quicker.

The encryption process requires you to first get the GPG public key of the person you want to send the encrypted file or email to. Once you have the recipient’s public key, you can encrypt a file with that key. You send the email or upload the file and then ask the recipient to decrypt it at their end using their GPG private key. I’m going to cover both processes. Note that this is also useful for encrypting the content of an email that you want to keep secret and sending it as an attachment in a non-encrypted email.

Generate GPG public and private keys

  1. Install gpg or gpg2 on Linux or MacOS. This is generally part of the standard packages, for example on Ubuntu:
    sudo apt install gnupg2

    If you are on Windows, you can use Cygwin and install gpg or use the GnuPG utility which should work similarly (although I have not tried it).

  2. Generate a GPG key and follow the instructions. I recommend selecting RSA and RSA (default) as the kind of key and 4096 as the key size (on GnuPG 2.1 or later, these prompts appear only if you run gpg2 --full-generate-key instead):
    gpg2 --gen-key
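  3. You should now have two files in .gnupg within your home directory (e.g. /home/sandro/.gnupg):
    -- pubring.gpg: this is the keyring holding your public key
    -- secring.gpg: this is the keyring holding your private key

    On GnuPG 2.1 or later, new installations store public keys in pubring.kbx and private keys in the private-keys-v1.d directory instead.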

    Verify your public key with:

    gpg2 --list-keys

    Verify your private key with:

    gpg2 --list-secret-keys

Encrypt and decrypt files

You have received a public key from someone and you want to encrypt a file with their public key in order to transmit it securely. The file containing the public key will typically have an extension .gpg or .asc.

  1. Import the public key (e.g. someonekey.asc is the filename of the key):

    gpg2 --import someonekey.asc
  2. Trust the public key (user@example.com is the email associated with the key and should be shown as output of the import command):
    gpg2 --edit-key user@example.com

    You’ll get a prompt command>, type trust and select 5 = I trust ultimately. Type quit to exit.

  3. Encrypt the file with the public key of the user (replace the email address with the email address associated with the user’s public key):
    gpg2 -e -r user@example.com mysecretdocument.txt
  4. This will generate an encrypted file mysecretdocument.txt.gpg, which is usually smaller than the original file. Transmit the encrypted file and tell the user to decrypt it at their end with the following command:
    gpg2 -o mysecretdocument.txt -d mysecretdocument.txt.gpg
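If you need to encrypt files regularly, the two gpg2 commands above can be wrapped in a small script. This is only a sketch: the function names are invented for the example, the flags are copied from the commands above, and it assumes gpg2 is on the PATH and the recipient’s key has already been imported and trusted.

```python
import subprocess


def encrypt_command(recipient, path):
    """Build the gpg2 command that encrypts `path` for `recipient`."""
    return ["gpg2", "-e", "-r", recipient, path]


def decrypt_command(encrypted_path, output_path):
    """Build the gpg2 command that decrypts `encrypted_path` into `output_path`."""
    return ["gpg2", "-o", output_path, "-d", encrypted_path]


def run(command):
    """Run a command and raise CalledProcessError if it exits with an error."""
    subprocess.run(command, check=True)
```

Calling run(encrypt_command("user@example.com", "mysecretdocument.txt")) is then equivalent to step 3. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.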

Stay safe and encrypt important emails and files!

Reference: How to easily encrypt a file with GPG on Linux

Oxford Dictionaries API at Google Developer Group Meetup

Oxford Dictionaries API
Image  Oxford University Press ©

This week together with my colleagues Phil and Amos I gave a presentation on the Oxford Dictionaries API at the Google Developer Group Reading & Thames Valley Meetup.

We were kindly hosted by Perusha and Ming from the local Google Developer Group Meetup and got lots of interesting questions from the audience. This also gave us a few ideas on how to further develop the Oxford Dictionaries API and what developers may want to do with it in the fields of Natural Language Processing and Machine Learning.

XML Prague 2018

XML Prague
Image  XML Prague ©

This week I am attending XML Prague at the University of Economics College campus in Prague, a conference on markup languages and data on the web. Together with other XSpec developers I am organising the XSpec Users Meetup. I’m also giving a lightning talk in the Schematron Users Meetup on how to test Schematron with XSpec.

Slides of the XSpec Users Meetup are available here whereas my lightning talk on testing Schematron with XSpec is available here.

How to exclude a package from being updated on Linux

Packages
Image by Marc Falardeau – CC BY 2.0

Sometimes you prefer not to update a specific package on Linux. This may be because you don’t want to upgrade to a new version that brings new features but no security updates. Or maybe because upgrading requires a service restart that you want to avoid for now. This was the case for me recently when a new version of Docker came out and upgrading would have restarted the docker daemon and stopped the running containers.

It is possible to exclude a package from being updated. On RPM-based systems (RedHat, CentOS, Fedora, etc.) this is the command to install all updates but exclude a specific package (say docker):

sudo yum update --exclude=docker

On Debian-like systems (Debian, Ubuntu, Mint, etc.) it is slightly more convoluted because you need to hold the package first and then upgrade the system:

sudo apt-mark hold docker && sudo apt-get upgrade

Remember to remove the hold when you’re ready to upgrade that package too:

sudo apt-mark unhold docker
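If you manage both kinds of systems from one script, the commands above can be selected by package family. This is a small sketch only: the function name and the family labels "rpm" and "deb" are made up for the example, and the commands are the ones shown above.

```python
def update_commands(package, family):
    """Return the commands that update everything except `package`.

    `family` is "rpm" or "deb"; both labels are invented for this sketch.
    """
    if family == "rpm":
        return [["sudo", "yum", "update", "--exclude=" + package]]
    if family == "deb":
        return [
            ["sudo", "apt-mark", "hold", package],
            ["sudo", "apt-get", "upgrade"],
        ]
    raise ValueError("unknown package family: " + family)
```

On Debian-like systems the hold stays in place after the upgrade, so the script still has to run apt-mark unhold later.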

 

References:

https://access.redhat.com/solutions/10185

https://askubuntu.com/a/18656

AWS Lambda and Jenkins Integration

AWS Lambda and Jenkins
Logos by Amazon Web Services © and Jenkins ©

Serverless is gaining attention as the next big thing in the DevOps space after containers. Developers are excited because they don’t have to worry about servers any more; Ops may be sceptical and slightly worried to hear about a world without servers (and sysadmins maintaining them). Can these two worlds co-exist? Can serverless just be another tool in the DevOps toolkit?

I recently implemented a real use case at work where we took advantage of an event-driven workflow to trigger Jenkins jobs originally created to be executed manually or on a schedule. The workflow is as follows:

1. New data is uploaded to an S3 bucket
2. The S3 event calls a lambda function that triggers a Jenkins job via the Jenkins API
3. The Jenkins job validates the data according to various criteria
4. If the job passes, the data is uploaded to an S3 bucket and a success message is sent to a Slack channel
5. If the job fails, a message with a link to the failed job is sent to a Slack channel

Workflow S3 Lambda Jenkins Slack-workflow

Jenkins User

Let’s start by creating a new user with the correct permissions in Jenkins. This restricts what the lambda function can do in Jenkins.

In Manage Jenkins -> Manage Users -> Create User I create a user called lambda:

Create Jenkins User

In Manage Jenkins -> Configure Global Security -> Authorization -> Matrix-based Security add the user lambda to User/group to add and set the permissions as in the matrix below:

Set Jenkins User Permissions

This is a minimal setup that allows the lambda user to build jobs. Depending on your security policies, you may want to restrict the permissions of the lambda user further so that it can run only some specific jobs (you may need role-based authorization to set this up).

AWS IAM Role

Now let’s move to AWS and set up an IAM role for the lambda function. Head to IAM -> Roles and create a new role with the following policies (my role name is digiteum-file-transfer; sensitive information is obfuscated for security reasons):

AWS IAM Role

This role grants permission to execute lambda functions and to access S3 buckets as well as the Virtual Private Cloud (VPC).

S3 Configuration

I create an empty S3 bucket using the wizard configuration in S3 and name it gadictionaries-leap-dev-digiteum. This is the bucket that is going to trigger the lambda function.

AWS Lambda Configuration

Finally, let’s write the lambda function. Go to Lambda -> Functions -> Create a Lambda Function. Select Python 2.7 (read Limitations to see why I’m not using Python 3) as runtime environment and select a blank function.

In Configure Trigger, set up the trigger from S3 to Lambda, select the S3 bucket you created above (my S3 bucket is named gadictionaries-leap-dev-digiteum), select the event type to occur in S3 (my trigger is set to respond to any new file drop in the bucket) and optionally select prefixes or suffixes for directories and file names (I only want the trigger to occur on XML files). Here is my trigger configuration:

AWS Lambda Configure Trigger

In Configure Function, choose a name for your function (mine is file_transfer) and check out the following Python code before uploading it:

from __future__ import print_function

import json
import urllib
import boto3
import jenkins
import os

print('Loading lambda function')
s3 = boto3.client('s3')
# TODO: private IP of the EC2 instance where Jenkins is deployed, public IP won't work
jenkins_url = 'http://123.45.56.78:8080/'

# TODO: these environment variables should be encrypted in Lambda
username = os.environ['username']
password = os.environ['password']

def lambda_handler(event, context):
    
    # Get the S3 object and its filename from the S3 event 
    bucket = event['Records'][0]['s3']['bucket']['name']
    filename = urllib.unquote_plus(event['Records'][0]['s3']['object']['key'].encode('utf8'))
    try:
        # Connect to Jenkins and build a job 
        server = jenkins.Jenkins(jenkins_url, username=username, password=password)
        server.build_job('Pipeline/Digiteum_File_Transfer', {'filename': filename})
        return 'Job Digiteum File Transfer started on Jenkins'
    except Exception as e:
        print(e)
        print('Cannot connect to Jenkins server or run build job')
        raise e

Note the following:

  • Line 6 imports the python-jenkins module. This module is not in Python’s standard library and needs to be provided within the zip file (more on this in a minute).
  • Line 12 sets up the URL of the EC2 instance where Jenkins is deployed. Note that you need to use the private IP address as shown in EC2; it won’t work if you use the public IP address or the Elastic IP address.
  • Lines 15 and 16 set up the credentials of the Jenkins user lambda. The credentials will be exposed to the lambda function as environment variables and, unlike in this example, it is recommended to encrypt them.
  • Lines 18-31 contain the handler function that is triggered automatically by a new file upload in the S3 bucket. The handler function does the following:
    • retrieve the filename of the new file uploaded on S3 (lines 21-22)
    • log into Jenkins via username and password for the lambda user (line 25)
    • build the job called Digiteum_File_Transfer in the folder Pipeline (line 26)
    • print the error and re-raise it if it can’t connect to Jenkins or start the job (lines 28-31)
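The function above targets Python 2.7. For reference, this is how the event-parsing step (lines 21-22) might look in Python 3, where unquote_plus lives in urllib.parse and the .encode('utf8') call is no longer needed. It is a sketch rather than the code I deployed, and the helper name object_from_event is made up for this example.

```python
from urllib.parse import unquote_plus


def object_from_event(event):
    """Return (bucket, key) for the first record of an S3 event."""
    s3 = event["Records"][0]["s3"]
    bucket = s3["bucket"]["name"]
    # S3 URL-encodes object keys in events, e.g. spaces arrive as '+'
    key = unquote_plus(s3["object"]["key"])
    return bucket, key
```

Decoding the key matters for file names with spaces or special characters; otherwise the Jenkins job would receive the encoded name rather than the real one.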

As an example, here is the zip file to upload in Configure Function. It contains the lambda function and all the Python modules needed, including the python-jenkins module. Make sure you edit the private IP address of your Jenkins instance in line 12. If you need to install additional Python modules, you can follow these instructions.

Here is what my Configure Function looks like:

Lambda Configure Function

Note the name (it should read file_transfer instead of file_transfe), the handler (as in the Python code above), and the role (as created in IAM). Note also that the username and the password of the Jenkins user lambda are provided as environment variables (ideally, you should encrypt these values by using the option Enable encryption helpers).

Once you’ve done the basic configuration, click on Advanced Settings. Here you need to select the VPC, subnet, and security group of the EC2 instance where Jenkins is running (all these details about the instance are in EC2 -> Instances). The lambda function needs to run in the same VPC as Jenkins, otherwise it cannot connect to Jenkins. For example, here is what my advanced settings look like (sensitive information is obfuscated):

Lambda Configure Function Advanced Settings

Finally, review your settings and click on Create Function.

Test the Lambda Function

Once you have created the lambda function, configure a test event to make sure it behaves as intended. Go to Actions -> Configure test event and select S3 Put to simulate a data upload to the S3 bucket. You need to replace the bucket name (in this example gadictionaries-leap-dev-digiteum) and the name of an object in that bucket (in this example I uploaded a file to the bucket and called it test.xml). Here is a test example to adapt:

{
  "Records": [
    {
      "eventVersion": "2.0",
      "eventTime": "1970-01-01T00:00:00.000Z",
      "requestParameters": {
        "sourceIPAddress": "127.0.0.1"
      },
      "s3": {
        "configurationId": "testConfigRule",
        "object": {
          "eTag": "0123456789abcdef0123456789abcdef",
          "sequencer": "0A1B2C3D4E5F678901",
          "key": "test.xml",
          "size": 1024
        },
        "bucket": {
          "arn": "arn:aws:s3:::gadictionaries-leap-dev-digiteum",
          "name": "gadictionaries-leap-dev-digiteum",
          "ownerIdentity": {
            "principalId": "EXAMPLE"
          }
        },
        "s3SchemaVersion": "1.0"
      },
      "responseElements": {
        "x-amz-id-2": "EXAMPLE123/5678abcdefghijklambdaisawesome/mnopqrstuvwxyzABCDEFGH",
        "x-amz-request-id": "EXAMPLE123456789"
      },
      "awsRegion": "us-east-1",
      "eventName": "ObjectCreated:Put",
      "userIdentity": {
        "principalId": "EXAMPLE"
      },
      "eventSource": "aws:s3"
    }
  ]
}

Click on Save and Test and you should see the lambda function in action. Go to Jenkins and check that the job has been executed by user lambda. If it doesn’t work, have a look at the logging in AWS Lambda to debug what went wrong.
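Since only the bucket name and the object key change between test events, the event can also be generated. Below is a minimal sketch; the function name s3_put_event is made up for this example, and the event keeps only a subset of the fields from the sample above, including the ones the handler actually reads.

```python
import json


def s3_put_event(bucket, key):
    """Return a minimal S3 Put test event for the given bucket and object key."""
    return {
        "Records": [
            {
                "eventSource": "aws:s3",
                "eventName": "ObjectCreated:Put",
                "awsRegion": "us-east-1",
                "s3": {
                    "s3SchemaVersion": "1.0",
                    "bucket": {
                        "name": bucket,
                        "arn": "arn:aws:s3:::" + bucket,
                    },
                    "object": {"key": key, "size": 1024},
                },
            }
        ]
    }


def as_json(event):
    """Serialise the event so it can be pasted into Configure test event."""
    return json.dumps(event, indent=2)
```

For example, as_json(s3_put_event("gadictionaries-leap-dev-digiteum", "test.xml")) produces JSON that can be pasted into the test event editor.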

Slack Configuration

Finally, I set up a Slack integration in Jenkins so that every time the Jenkins job is executed, a notification is sent to a Slack channel. This also allows several people to get notified about a new data delivery.

First, install and configure the Slack plugin in Jenkins following the instructions on the GitHub page. The main configuration is done in Manage Jenkins -> Configure System -> Global Slack Notifier Settings. For example, this is my configuration:

Jenkins Slack Notifier Settings

Note that:

  • Team Subdomain is the name of your Slack account
  • Channel is the name of your default slack channel (you can override this in every job)
  • Integration Token Credential ID is created by clicking Add and creating a token in Jenkins’ credentials. As the message says, it is recommended to use a token for security reasons. Here is an example of a Token Credential ID for Slack in Jenkins:

Jenkins Slack Integration Token

You typically want to add a notification to a specific Slack channel in your Jenkins job as a post-build action in order to report the result of the job. In Jenkins go to your job’s configuration, add Post-build Actions -> Slack Notifications and use settings similar to these:

Jenkins Post-build Actions

This sends a notification to the Slack channel (either the default one set in Global Slack Notifier Settings or a new one set here in Project Channel) every time a job passes or fails. When a notification is sent to Slack, it will look like this:

Slack Notifications

Now you can keep both technical and non-technical users informed without having to create specific accounts on Jenkins or AWS or spamming users with emails.

Limitations

I ran into two problems that I have not yet been able to solve due to lack of time. I want to flag them because solving them would improve the lambda function and make it more maintainable. If anyone wants to help me fix them, please send me your comments.

  • Encryption: I tried to encrypt the Jenkins password but I could not make the lambda function decrypt it. I set up an encryption key in IAM -> Encryption keys -> Configuration -> Advanced settings -> KMS key and pasted the sample code into the lambda function, but the function timed out without giving an error message. I imported the b64decode function from the base64 module in the Python code, but there must be an issue with this instruction, which decrypts the variable ENCRYPTED:
    DECRYPTED = boto3.client('kms').decrypt(CiphertextBlob=b64decode(ENCRYPTED))['Plaintext']
  • Python 2.7: I wanted to use Python 3 but I had issues with the installation of some modules. Therefore I used Python 2.7 but the code should be compatible with Python 3 (apart from the imported modules).

Conclusion

Integrating AWS Lambda and Jenkins requires a little bit of configuration but I hope this tutorial may help other people to set it up. If the integration needs to be done the other way round (i.e. trigger a lambda function from a Jenkins job), check out the AWS Lambda Plugin.

I believe integrating AWS Lambda (or any FaaS) with Jenkins (or any CI/CD server) is particularly suited to the following use cases:

  • Organisations that already have some DevOps practices in place and a history of build jobs but want to take advantage of the serverless workflow without completely re-architecting their infrastructure.
  • CI/CD pipelines that need to be triggered by events but are too complex or long to be crammed into a single function.

XSpec v0.5.0

XSpec
Image  XSpec MIT License

XSpec is a unit test and behaviour driven development (BDD) framework for XSLT and XQuery. I picked up this open source project at work for testing our XSLT and I’m now actively contributing to it together with the XSpec community.

XSpec v0.5.0 has just been released and is the first XSpec release in 5 years. It includes new features like XSLT 3 support, JUnit reports for integration with Continuous Integration tools, support for Saxon-B, etc. It also fixes long-standing regression bugs as well as integration bugs in the code coverage, provides feature parity between the shell and batch scripts, integrates an automated test suite, and updates the documentation in the wiki. More information is available in the official release notes.