We were kindly hosted by Perusha and Ming from the local Google Developer Group Meetup and got lots of interesting questions from the audience. This also gave us a few ideas on how to further develop the Oxford Dictionaries API and what developers may want to do with it in the fields of Natural Language Processing and Machine Learning.
This week I am attending XML Prague at the University of Economics College campus in Prague, a conference on markup languages and data on the web. Together with other XSpec developers I am organising the XSpec Users Meetup. I’m also giving a lightning talk in the Schematron Users Meetup on how to test Schematron with XSpec.
Slides of the XSpec Users Meetup are available here whereas my lightning talk on testing Schematron with XSpec is available here.
Sometimes you prefer not to update a specific package in Linux. This may be because you don’t want to upgrade to a new version with new features but no security updates. Or maybe because upgrading requires a service restart that you want to avoid for now. This was the case for me recently when a new version of Docker came out and upgrading would have restarted the docker daemon and stopped the running containers.
It is possible to exclude a package from being updated. On Linux RPM systems (RedHat, CentOS, Fedora, etc.) this is the command to install all updates but exclude a specific package (say docker):
sudo yum update --exclude=docker
On Debian-like systems (Debian, Ubuntu, Mint, etc.) it is slightly more convoluted because you need to hold the package first and then upgrade the system:
sudo apt-mark hold docker && sudo apt-get upgrade
and remember to remove the hold when you’re ready to upgrade that package too:
sudo apt-mark unhold docker
This weekend I am attending XML London at University College London, a two-day conference on XML, Open Data, Digital Publishing, and Data Management. I am presenting a paper on XSpec. The paper is available here and the slides of my presentation are available here.
Serverless is gaining attention as the next big thing in the DevOps space after containers. Developers are excited because they don’t have to worry about servers any more; Ops may be sceptical and slightly worried to hear about a world without servers (and without the sysadmins maintaining them). Can these two worlds co-exist? Can serverless just be another tool in the DevOps toolkit?
I recently implemented a real use case at work where we took advantage of an event-driven workflow to trigger Jenkins jobs originally created to be executed manually or on a schedule. The workflow is as follows:
1. New data is uploaded to an S3 bucket
2. The S3 event calls a lambda function that triggers a Jenkins job via the Jenkins API
3. The Jenkins job validates the data according to various criteria
4. If the job passes, the data is uploaded to an S3 bucket and a success message is sent to a Slack channel
5. If the job fails, a message with a link to the failed job is sent to a Slack channel
Let’s start by creating a new user with the correct permissions in Jenkins. This allows us to restrict what the lambda function can do in Jenkins.
In Manage Jenkins -> Manage Users -> Create User I create a user called lambda.
In Manage Jenkins -> Configure Global Security -> Authorization -> Matrix-based Security add the user lambda to User/group to add and set the permissions as in the matrix below:
This is a minimal setup that allows the lambda user to build jobs. Depending on your security policies, you may want to further restrict the permissions of the lambda user so that it can run only specific jobs (you may need role-based authentication to set this up).
AWS IAM Role
Now let’s move to AWS and set up an IAM role for the lambda function. Head to IAM -> Roles and create a new role with the following policies (my role name is digiteum-file-transfer; sensitive information is obfuscated for security reasons):
This role grants permission to execute lambda functions and to access S3 buckets as well as the Virtual Private Cloud (VPC).
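As a rough sketch of what a role with this scope might grant, here is a policy document written as a Python dictionary. The selection of actions is an assumption and not the exact set of policies attached to my role; the bucket name is the one used in this post.

```python
import json

# Hypothetical policy document: the actions are standard IAM actions for
# logging, VPC networking and S3 reads, but their selection is an assumption.
BUCKET = "gadictionaries-leap-dev-digiteum"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {   # let the function write its logs to CloudWatch
            "Effect": "Allow",
            "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
            "Resource": "arn:aws:logs:*:*:*",
        },
        {   # let the function attach to a VPC through network interfaces
            "Effect": "Allow",
            "Action": [
                "ec2:CreateNetworkInterface",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DeleteNetworkInterface",
            ],
            "Resource": "*",
        },
        {   # let the function read objects from the trigger bucket
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::%s/*" % BUCKET,
        },
    ],
}

print(json.dumps(policy, indent=2))
```

You would normally tighten the Resource entries to match your own security policies.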
I create an empty S3 bucket using the wizard configuration in S3 and name it gadictionaries-leap-dev-digiteum. This is the bucket that is going to trigger the lambda function.
AWS Lambda Configuration
Finally, let’s write the lambda function. Go to Lambda -> Functions -> Create a Lambda Function. Select Python 2.7 (read Limitations to see why I’m not using Python 3) as runtime environment and select a blank function.
In Configure Trigger, set up the trigger from S3 to Lambda, select the S3 bucket you created above (my S3 bucket is named gadictionaries-leap-dev-digiteum), select the event type to occur in S3 (my trigger is set to respond to any new file drop in the bucket) and optionally select prefixes or suffixes for directories and file names (I only want the trigger to occur on XML files). Here is my trigger configuration:
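For readers who prefer to see the trigger as data, the same settings can be sketched as an S3 notification configuration. This is only an illustration: the function ARN, region and account number are placeholders, and build_notification and would_trigger are hypothetical helper names. A dictionary of this shape is what boto3's put_bucket_notification_configuration accepts as NotificationConfiguration.

```python
def build_notification(lambda_arn, suffix=".xml"):
    """Describe a trigger that fires on any new object whose key ends in `suffix`."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {"Key": {"FilterRules": [{"Name": "suffix", "Value": suffix}]}},
            }
        ]
    }


def would_trigger(key, notification):
    """Return True if an object key passes the prefix/suffix filter rules."""
    rules = notification["LambdaFunctionConfigurations"][0]["Filter"]["Key"]["FilterRules"]
    for rule in rules:
        if rule["Name"] == "suffix" and not key.endswith(rule["Value"]):
            return False
        if rule["Name"] == "prefix" and not key.startswith(rule["Value"]):
            return False
    return True


# Placeholder ARN: replace region, account id and function name with your own.
config = build_notification("arn:aws:lambda:eu-west-1:123456789012:function:file_transfer")
print(would_trigger("test.xml", config))
print(would_trigger("notes.txt", config))
```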
In Configure Function, choose a name for your function (mine is file_transfer) and check out the following Python code before uploading it:
print('Loading lambda function')
# TODO: private IP of the EC2 instance where Jenkins is deployed, public IP won't work
# TODO: these environment variables should be encrypted in Lambda
# Get the S3 object and its filename from the S3 event
return 'Job Digiteum File Transfer started on Jenkins'
print('Cannot connect to Jenkins server or run build job')
Note the following:
Line 6 imports the python-jenkins module. This module is not in Python’s standard library and needs to be provided within the zip file (more on this in a minute).
Line 12 sets up the URL of the EC2 instance where Jenkins is deployed. Note that you need to use the private IP address as shown in EC2; it won’t work if you use the public IP address or the Elastic IP address.
Lines 15 and 16 set up the credentials of the Jenkins user lambda. The credentials will be exposed to the lambda function as environment variables and, unlike in this example, it is recommended to encrypt them.
Lines 18-31 contain the handler function that is triggered automatically by a new file upload in the S3 bucket. The handler function does the following:
retrieve the filename of the new file uploaded on S3 (lines 21-22)
log into Jenkins via username and password for the lambda user (line 25)
build the job called Digiteum_File_Transfer in the folder Pipeline (line 26)
throw an error if it can’t connect to Jenkins or start the job (lines 28-31)
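Putting the notes above together, the function might look roughly like the sketch below. It is a reconstruction under assumptions rather than the original file, so the line numbers quoted above do not apply to it. The environment variable names (JENKINS_URL, JENKINS_USER, JENKINS_PASSWORD) and the placeholder IP address are hypothetical, while jenkins.Jenkins and build_job come from the python-jenkins module.

```python
import os

try:  # Python 3
    from urllib.parse import unquote_plus
except ImportError:  # Python 2.7
    from urllib import unquote_plus

print('Loading lambda function')

# Assumed names: these environment variables and the private IP are placeholders.
JENKINS_URL = os.environ.get('JENKINS_URL', 'http://10.0.0.1:8080')
JENKINS_USER = os.environ.get('JENKINS_USER', 'lambda')
JENKINS_PASSWORD = os.environ.get('JENKINS_PASSWORD', '')
JOB_NAME = 'Pipeline/Digiteum_File_Transfer'  # job inside the Pipeline folder


def get_s3_object(event):
    """Return (bucket, key) for the first record of an S3 event."""
    record = event['Records'][0]['s3']
    bucket = record['bucket']['name']
    key = unquote_plus(record['object']['key'])  # keys arrive URL-encoded
    return bucket, key


def trigger_job(server, job_name):
    """Start a Jenkins job; `server` is expected to be a jenkins.Jenkins object."""
    server.build_job(job_name)


def lambda_handler(event, context):
    bucket, key = get_s3_object(event)
    print('New object %s in bucket %s' % (key, bucket))
    import jenkins  # python-jenkins, shipped inside the zip file
    try:
        server = jenkins.Jenkins(JENKINS_URL, username=JENKINS_USER,
                                 password=JENKINS_PASSWORD)
        trigger_job(server, JOB_NAME)
        return 'Job Digiteum File Transfer started on Jenkins'
    except Exception:
        print('Cannot connect to Jenkins server or run build job')
        raise
```

Keeping get_s3_object and trigger_job as separate functions makes it possible to exercise them locally with a fake server object before uploading the zip file.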
As an example, here is the zip file to upload in Configure Function. It contains the lambda function and all the Python modules needed, including the python-jenkins module. Make sure you edit the private IP address of your Jenkins instance in line 12. If you need to install additional Python modules, you can follow these instructions.
Here is what my Configure Function looks like:
Note the name (it should read file_transfer instead of file_transfe), the handler (as in the Python code above), and the role (as created in IAM). Note also that the username and the password of the Jenkins user lambda are provided as environment variables (ideally, you should encrypt these values by using the option Enable encryption helpers).
Once you’ve done the basic configuration, click on Advanced Settings. Here you need to select the VPC, subnet, and security group of the EC2 instance where Jenkins is running (all these details about the instance are in EC2 -> Instances). The lambda function needs to run in the same VPC as Jenkins, otherwise it cannot connect to Jenkins. For example, here is what my advanced settings look like (sensitive information is obfuscated):
Finally, review your settings and click on Create Function.
Test the Lambda Function
Once you have created the lambda function, configure a test event to make sure it behaves as intended. Go to Actions -> Configure test event and select S3 Put to simulate a data upload in the S3 bucket. You need to replace the bucket name (in this example gadictionaries-leap-dev-digiteum) and the name of an object in that bucket (in this example I uploaded a file to the bucket and called it test.xml). Here is a test example to adapt:
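A trimmed-down S3 Put event can be sketched as a Python dictionary and dumped to JSON for pasting into the console. The S3 Put template in the console carries many more fields; the subset below is an assumption about what a handler that only reads the bucket name and object key needs, and s3_put_event is a hypothetical helper name.

```python
import json


def s3_put_event(bucket, key):
    """Build a minimal S3 Put event with only the bucket name and object key."""
    return {
        "Records": [
            {
                "eventSource": "aws:s3",
                "eventName": "ObjectCreated:Put",
                "s3": {
                    "bucket": {"name": bucket, "arn": "arn:aws:s3:::" + bucket},
                    "object": {"key": key},
                },
            }
        ]
    }


event = s3_put_event("gadictionaries-leap-dev-digiteum", "test.xml")
print(json.dumps(event, indent=2))
```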
Click on Save and Test and you should see the lambda function in action. Go to Jenkins and check that the job has been executed by user lambda. If it doesn’t work, have a look at the logging in AWS Lambda to debug what went wrong.
Finally, I set up a Slack integration in Jenkins so that every time the Jenkins job is executed, a notification is sent to a Slack channel. This also allows several people to get notified about a new data delivery.
First, install and configure the Slack plugin in Jenkins following the instructions on the GitHub page. The main configuration is done in Manage Jenkins -> Configure System -> Global Slack Notifier Settings. For example, this is my configuration:
Team Subdomain is the name of your Slack account
Channel is the name of your default Slack channel (you can override this in every job)
Integration Token Credential ID is created by clicking Add and creating a token in Jenkins’ credentials. As the message says, it is recommended to use a token for security reasons. Here is an example of a Token Credential ID for Slack in Jenkins:
You typically want to add a notification to a specific Slack channel in your Jenkins job as a post-build action in order to notify the result of a job. In Jenkins go to your job’s configuration, add Post-build Actions -> Slack Notifications and use settings similar to these:
This sends a notification to the Slack channel (either the default one set in Global Slack Notifier Settings or a new one set here in Project Channel) every time a job passes or fails. When a notification is sent to Slack, it will look like this:
Now you can keep both technical and non-technical users informed without having to create specific accounts on Jenkins or AWS or spamming users with emails.
I ran into two problems that I have not yet been able to solve due to lack of time. I want to flag them because fixing them would improve the lambda function and make it more maintainable. If anyone wants to help me fix them, please send me your comments.
Encryption: I tried to encrypt the Jenkins password but I could not make the lambda function decrypt the password. I set up an encryption key in IAM -> Encryption keys -> Configuration -> Advanced settings -> KMS key and pasted the sample code in the lambda function but the lambda function timed out without giving an error message. I imported the b64decode function from base64 in the Python code but there must be an issue with the instruction that decrypts the variable.
Python 2.7: I wanted to use Python 3 but I had issues with the installation of some modules. Therefore I used Python 2.7 but the code should be compatible with Python 3 (apart from the imported modules).
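For reference, the decryption step usually takes the shape sketched below. It is not a verified fix: it assumes the environment variable holds base64-encoded ciphertext and that kms_client is a boto3 KMS client, and decrypt_env is a hypothetical helper name. One possible explanation for the timeout, untested here, is that a function running inside a VPC cannot reach the KMS endpoint unless the VPC has a NAT gateway or a VPC endpoint for KMS.

```python
import os
from base64 import b64decode


def decrypt_env(name, kms_client):
    """Decrypt a base64-encoded, KMS-encrypted environment variable."""
    ciphertext = b64decode(os.environ[name])
    response = kms_client.decrypt(CiphertextBlob=ciphertext)
    return response["Plaintext"].decode("utf-8")


# In Lambda the client would be created once, outside the handler, e.g.:
#   import boto3
#   kms = boto3.client("kms")
#   password = decrypt_env("JENKINS_PASSWORD", kms)
```

Passing the client in as an argument keeps the function easy to test with a stand-in object before deploying it.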
Integrating AWS Lambda and Jenkins requires a little bit of configuration but I hope this tutorial may help other people to set it up. If the integration needs to be done the other way round (i.e. trigger a lambda function from a Jenkins job), check out the AWS Lambda Plugin.
I believe integrating AWS Lambda (or any FaaS) with Jenkins (or any CI/CD server) is particularly suited for the following use cases:
Organisations that already have some DevOps practices in place and a history of build jobs but want to take advantage of the serverless workflow without completely re-architecting their infrastructure.
CI/CD pipelines that need to be triggered by events but are too complex or long to be crammed into a single function.
XSpec is an open source unit testing and behaviour driven development framework for XML technologies. I recently wrote an article on XSpec for XML.com that describes what XSpec is and provides a tutorial on how to write XSpec unit tests for XSLT.
XSpec is a unit test and behaviour driven development (BDD) framework for XSLT and XQuery. I picked up this open source project at work for testing our XSLT and I’m now actively contributing to it together with the XSpec community.
XSpec v0.5.0 has just been released and is the first XSpec release in 5 years. It includes new features like XSLT 3 support, JUnit report for integration with Continuous Integration tools, support for Saxon-B, etc. It also fixes long-standing regression bugs as well as integration bugs in the code coverage, provides feature parity between the shell and batch scripts, integrates an automated test suite, and updates the documentation in the wiki. More information is available in the official release notes.
Once you have your SSL certificate installed on your server, you may want to force HTTPS so that any request for HTTP pages will automatically be redirected to HTTPS.
The Apache web server provides the .htaccess file to store Apache configuration on a per-directory basis. For example, if your website is stored under /var/www/html/mysite and you’re using Apache, you can create the following .htaccess file in that directory:
The third line is the rewrite rule that forces HTTPS for any request made to the web server. Note that you need to have the mod_rewrite module installed on Apache to add rewrite rules for URL redirection.
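To make the shape of such a file concrete, here is a small Python helper that renders a three-line .htaccess for a given host name. The directives are common mod_rewrite usage and are an assumption rather than a copy of my file; render_htaccess is a hypothetical helper name.

```python
def render_htaccess(host):
    """Render a three-line .htaccess that redirects HTTP requests to HTTPS.

    `host` must be the host name covered by the SSL certificate.
    """
    lines = [
        "RewriteEngine On",
        "RewriteCond %{HTTPS} off",
        "RewriteRule ^(.*)$ https://" + host + "/$1 [R=301,L]",
    ]
    return "\n".join(lines) + "\n"


print(render_htaccess("sandrocirulli.net"))
```

Taking the host name as a parameter keeps the one value that must match the certificate in a single place.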
Make sure that the URL in the rewrite rule is the one used in the SSL certificate. I initially put www.sandrocirulli.net in the rewrite rule even though I had registered the SSL certificate for sandrocirulli.net and all its sub-domains (including www.sandrocirulli.net) and got nasty security warnings in the browser. You can easily check the SSL certificate in any browser by clicking on the green padlock near the URL and selecting View Certificate or the like:
If the padlock near the URL displays a warning, click on it and see what the problem is. I initially encountered issues with mixed content. This occurred because I had links to images on the website with HTTP instead of HTTPS. All the major browsers allow you to see where the error occurs: just click on the warning and then Details or the like. Changing these links to HTTPS solved the issue with mixed content.
This week I’m giving a talk about Continuous Security with Jenkins, Docker Bench, and Amazon Inspector at CD Summit & Jenkins Days in Amsterdam and in Berlin. CD Summit & Jenkins Days are a series of conferences in the US and in Europe focusing on Continuous Integration (CI) and Continuous Delivery (CD).
This is the abstract of my talk:
Security testing is often left out of CI/CD pipelines and perceived as an ad hoc, one-off audit performed by external security experts. However, integrating security testing into a DevOps workflow (aka DevSecOps) makes it possible to achieve security by design and to continuously assess software vulnerabilities within a CI/CD pipeline. But how does security fit in the world of cloud and microservices?
In this talk I show how to leverage tools like Jenkins, Docker Bench, and Amazon Inspector to perform security testing at the operating system and container levels in a cloud environment and how to integrate them into a typical CI/CD workflow. I discuss how these tools can help assess the risk of security vulnerabilities during development, improve security and compliance, and lower support costs in the long term.