Graham Nicholls


Professional Profile

(last updated 4th Feb 2020)

Intelligent, adaptable, passionate, and very experienced Unix/Linux DevOps engineer and problem solver with a verifiable track record of providing effective solutions. Excellent communicator at all levels, with a strong work ethic. Certified AWS Associate Solutions Architect as of 28th June 2016 (now lapsed). I'm told that I find solutions to problems which others had thought impossible, or at least too difficult or complex to be worth tackling. As you may gather from this CV, making attractive front ends is not where my skillset lies. Also, I am very happy at Macquarie, who are an excellent employer, doing interesting work - aside from the usual banking bureaucracy required by current (and perfectly reasonable) regulatory frameworks - so please don't contact me with job opportunities.

I am a highly experienced programmer: I have written tens of thousands of lines of C, C++, bash/shell, Python, Lua, and Ruby, as well as other languages such as Expect, D, Tcl/Tk, and SQL. (I've even written Perl and PHP, but would rather not do so again.) My current language of choice is Google's Go (golang, for the CV scrapers). I'd claim to have been a DevOps engineer since before the term was invented - if it's not automated, it's not finished.

Core Technical Skills

  • Developer/Architect with an ability to visualise and communicate proposed solutions.  Proficient in many programming languages. Fast learner of new ones.
  • Highly skilled Linux developer/platform engineer.  Exposed to many flavours of Linux - most recently RHEL/CentOS and, of course, Amazon Linux.
  • Significant experience with AWS - via console, command-line, API, and third party tooling. 
  • Knowledgeable in Docker and Virtualisation (mainly KVM).
  • Extensive experience with open source tooling: Git, Ansible, Terraform, Jenkins, Bamboo, Wireshark, SonarQube, etc.
  • Good understanding of networking and TCP/IP.
  • Highly regarded mentor/instructor, having taught Unix/Linux, AWS, and Python courses for Learning Tree over the last 20 years, in the UK, Ireland, Sweden, the USA, Canada, Belgium, and Germany.
  • Competent in Linux systems administration.

Recent Career History

November 2017 - Present: Macquarie Bank Limited, London

London Infrastructure development team lead. Responsible for a small graduate team developing solutions to infrastructure problems, as well as mentoring the team members. I am proud to be working with such a talented (albeit small) team. During a hiatus between projects, I learnt Go: initially to create a small project to manage server host keys (the issue being that users always unthinkingly type "yes" when warned that a host key is new, or even a mismatch), then for various other projects. For an old Unix/C programmer, Go is the perfect language for writing tools and small web tools/APIs. Why would anyone use Python when Go offers the same programming facilities and a far easier concurrency model, without the ludicrous problems Python versioning and packaging cause?
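The host-key problem above comes down to comparing fingerprints centrally rather than trusting the user's "yes". The tool itself is written in Go; the core check can be sketched in a few lines of Python (the function name is mine, not the tool's):

```python
import base64
import hashlib

def openssh_fingerprint(pubkey_b64: str) -> str:
    """Return a host key fingerprint in the format OpenSSH prints:
    'SHA256:' plus the unpadded base64 of the SHA-256 of the raw key blob."""
    raw = base64.b64decode(pubkey_b64)
    digest = hashlib.sha256(raw).digest()
    return "SHA256:" + base64.b64encode(digest).decode().rstrip("=")

# A management tool compares this against a trusted inventory of host keys,
# instead of asking the user to eyeball a prompt and type "yes".
```
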

Currently working on a "compliance agent" project, which I started in response to my security concerns with other solutions. Using a privileged account to run arbitrary scripts on servers strikes me as offering a large attack surface, with the potential to create a botnet out of the fleet. The solution I have engineered is a single, statically compiled Go program with a set of tests it can run: checksumming files, listing loaded kernel modules (other solutions look at module blacklist files, which is pointless), enumerating SSH algorithms, open ports offered by the server, logins since the last test, and so on. The agent - which runs as an ordinary user, but with appropriate capabilities - writes data to an HTTPS connection on an unprivileged port, in text, JSON, or Prometheus format, and is sufficiently performant to run every 15 seconds or so in response to Prometheus scrapes. The agent never forks, uses mTLS for authentication, runs tests configured by a JSON file which must match its checksum (burnt into the binary at build time) in order to load, and is generally written with security in mind. ServerSpec is a great tool for an initial check of server config, but should never, in my opinion, be considered a continuous assurance tool.
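The config-integrity idea - the agent refuses any test configuration whose checksum doesn't match the one embedded at build time - can be sketched as follows. The agent itself is Go; this is an illustrative Python sketch with invented names and config contents:

```python
import hashlib
import json

# Hypothetical config. In the real agent the expected checksum is burnt
# into the binary at build time; here it is computed up front to illustrate.
TRUSTED_CONFIG = b'{"tests": ["sshd_algorithms", "open_ports", "kernel_modules"]}'
EXPECTED_SHA256 = hashlib.sha256(TRUSTED_CONFIG).hexdigest()

def load_config(raw: bytes) -> dict:
    """Load the JSON test configuration only if its SHA-256 matches the
    checksum embedded in the binary; otherwise refuse to run anything."""
    if hashlib.sha256(raw).hexdigest() != EXPECTED_SHA256:
        raise ValueError("config checksum mismatch - refusing to load")
    return json.loads(raw)
```

This is what keeps the agent from becoming the arbitrary-script-runner it was designed to replace: a tampered config simply never loads.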

I work closely with another talented group of individuals - the Unix platforms team - and offer and accept advice and assistance on a frequent basis.

March 2015 - October 2017: BJSS Limited, London

BJSS aims to be the UK's leading privately-held IT consultancy.  Initially deployed to a large government project migrating to AWS, subsequently at other clients.

Worked with the Python boto library to automate the process of setting default ASG scheduling rules, so that a manual override - done, for example, to allow overnight testing - would automatically be reverted the next day. This runs from Jenkins and is passed parameters through the environment. Wrote AWS Lambdas to manage infrastructure sprawl by auto-decrementing "TTL counter tags" on EC2 assets, and stopping (and eventually terminating) outlived instances. Performed extensive optimisation of the instance sizes and counts for scheduled/auto-scaled groups to assess the best mix of static and scaled instances to handle diurnal loads.
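The decision each scheduled TTL Lambda run applies can be sketched roughly like this; the tag name and stop-before-terminate behaviour are illustrative, not the exact production scheme, and the boto calls that would act on the decision are omitted:

```python
def ttl_action(tags: dict) -> str:
    """Decide what to do with an EC2 instance on each daily run, based on
    a 'ttl' tag holding the remaining number of days (illustrative name)."""
    ttl = int(tags.get("ttl", "0"))
    if ttl > 1:
        tags["ttl"] = str(ttl - 1)  # still within its lifetime: count down
        return "decrement"
    if ttl == 1:
        tags["ttl"] = "0"
        return "stop"               # grace period: stopped, not yet gone
    return "terminate"              # outlived its TTL: reclaim the instance
```
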

Another project involved using Terraform to automate the (AWS) cloud infrastructure for development and pre-prod environments for a major banking client.  This involved multiple VPCs, security groups, RDS instances, and of course, EC2 instances.

I wrote a Ruby script to assist with enumerating all AWS resources - across all regions - within an account. This is on GitHub as: https://github.com/grahamnicholls/aws_info

Prior to this, I spent some time at Hogarth Worldwide Ltd, as a platform engineer.

Hogarth offer, in particular, a couple of class-leading products: Zonza and Copycentral. These are digital asset management tools, and I was part of the BJSS techops team who manage them. My role ranged from infrastructure to application support - managing servers in the VMware cluster, up to monitoring and debugging the application. I scripted the process of auditing the ingest of a large number (around 1 million) of assets from AWS S3 storage; initially with shell scripts, then with Lua programs when runtime speed became an issue. The applications are a complex structure of various open- and closed-source tools such as Solr, Redis, RabbitMQ, etc., running in a virtualised estate managed with Puppet and Ansible, and monitored using New Relic and other, open-source, tools. Automated the backup and restore of Postgres databases into a pre-prod environment, using Postgres tooling, bash, etc.

November 2013 - March 2015: 7Ticks / Interactive Data Corporation, London

Interactive Data Corporation provides financial market data, analytics, and related solutions to financial institutions, active traders, and individual investors. The company's businesses supply real-time market data, time-sensitive pricing, evaluations, and reference data for securities trading, including hard-to-value instruments. 7Ticks was the subsidiary offering on- or near-exchange infrastructure to financial clients.

  • Troubleshooting and analysing system and network performance, and tuning network interfaces and other kernel parameters. I diagnosed and resolved jitter issues with PTP, presenting the results graphically, and analysed and graphed sar and other system performance monitoring tool output.
  • Proprietary tool and system development: I wrote a system to audit user access to (Linux) servers, using PAM (Pluggable Authentication Modules) and shell scripts. This was done in a manner which made it difficult to subvert, while not being unduly onerous on users. Wrote a tool in Python to allow network engineers to configure the system they use for access to switches and routers.
  • Running KVM virtual servers for testing and debugging, with occasional use of VMware/vSphere and, locally, VirtualBox/Vagrant.
  • Server commissioning and decommissioning - installing the operating system, PTP, and SolarFlare/Mellanox drivers with complex routing, kernel rebuilds, etc. When performed remotely, network latency can cause problems, so I configured geo-local Amazon EC2 instances offering OS ISOs via an nginx webserver.
  • Used AWS for other admin systems, and Ansible for server configuration discovery. 
  • Miscellaneous other Linux admin/performance troubleshooting/automation tasks.
  • Ran team meetings with colleagues in Chicago and Melbourne, organizing workflow, enhancing procedures for on-boarding/exiting systems staff, etc.

May 2008 - November 2013: FTSE Ltd, London

FTSE Group (FTSE) is a global leader in indexing and analytic solutions. FTSE calculates thousands of unique indices that measure and benchmark markets and asset classes in more than 80 countries around the world.

Senior Systems Engineer (RTF Platform):

  • Understanding and supporting the platform, using Unix skills to monitor and maintain it.
  • Fixing issues with the platform and its performance.
  • Diagnosing issues with firewalls and routing, and using Linux's iptables utility to test and resolve connectivity problems.

RTF Platform Developer:

  • Maintained and enhanced the platform according to business needs using Lua.
  • Provided Unix/Linux expertise and advice to the team.
  • Scripted build and deployment of the system which reduced the effort required from days to seconds.
  • Analysis of log data, and conversion of packet data into a format suitable for use with testing tools.

Earlier History

This is in summary form, with dates as best I remember.

2007: Netstore Ltd (contract)

Unix Project Engineer

Third line support and project work for this hosting and solutions provider in Reading. Resolving client issues as well as project work, such as implementing new mail servers (Exim, Postfix), adding DNS entries, and planning and installing client and infrastructure upgrades. Installation of new servers with Linux or Solaris (9 and 10) on SPARC hardware such as the SunFire V-series. Monitoring services and servers with Nagios. Various sysadmin/automation tasks.

2007: Guardian Unlimited (contract)                   

Unix Consultant

Guardian Unlimited is the web site of the Guardian newspaper, one of the busiest news sites on the web. Enhanced the existing install system to increase automation, saving administrator time and reducing human error.

Working with Java applications within an Apache/Tomcat environment on Linux and Solaris systems. Monitoring application performance, debugging memory problems, and working with the Jumpstart/Kickstart utilities for automated systems building. Troubleshooting network issues. User administration across multiple systems. Blade configuration and maintenance. Scripting with shell and Ruby. Managing the ticket system. Worked peripherally on the Puppet implementation, which was, at the time, in its infancy.

 

1992 – 2007: HSB Haughton Insurance (freelance)

Unix Consultant

Systems administration, network engineering, programming, support, training.

SCO Unix, then Linux; TCP/IP, shell scripting, Perl, Python, Ruby, PostScript.

Specified, built, installed, and administered their first SCO Unix system in around 1995. All aspects of server admin - e.g. NFS, Samba, adding users and printers, monitoring, installation, adding, cloning and partitioning disks, RAID configuration, and backups. Involved in DR/BCP planning and testing, and in network design and troubleshooting; installed firewalls and set up VPNs. Off-site telephone support. Writing and delivering internal courses. Programming of various "glue" systems, as described below. Kernel patching, build, and configuration.

One project involved adding logos to PDF documents according to page content - so for a 10-page document the first page, which would be a covering letter, would have the standard company letterhead applied. If there was a continuation page, this would have the same logo. Otherwise, the page orientation would be used to decide the logo type, unless the page was a report, in which case industry certification logos would apply. This was programmed in Ruby, using PostScript to place the images within the document. Rules were abstracted out of the program code into a config file, to allow non-programmers to write them based on characteristics of the page such as the text, size, orientation, etc. This allowed the client to move to Internet delivery of documentation, and to make significant savings on printing.
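The rule-file idea - rules evaluated top to bottom against page characteristics, first match wins, with a default - might look something like this. The original was Ruby; the field and logo names here are invented for illustration:

```python
# Hypothetical externalised rules, editable by non-programmers.
RULES = [
    {"when": {"page_number": 1},           "logo": "letterhead"},
    {"when": {"kind": "report"},           "logo": "certification"},
    {"when": {"orientation": "landscape"}, "logo": "wide_banner"},
]
DEFAULT_LOGO = "standard"

def choose_logo(page: dict) -> str:
    """Pick a logo for a page by matching its characteristics against the
    rules in order; the first rule whose conditions all hold wins."""
    for rule in RULES:
        if all(page.get(key) == value for key, value in rule["when"].items()):
            return rule["logo"]
    return DEFAULT_LOGO
```

Keeping the rules as data rather than code is what let non-programmers maintain them without a release.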

 

1990-1992: Primar (then Prism) Ltd.

C Developer/Unix Specialist

Primar was a computer bureau specializing in direct marketing. I was responsible for the Unix systems and network, which I implemented from scratch, using TCP/IP over thin Ethernet. Sysadmin for the Unix systems; responsible for backups, adding users, troubleshooting, etc. As a C programmer, I became involved in the data import/export side of the bureau business: reading data from a variety of sources - mainframes, DOS systems, Windows systems, other Unix systems, DEC VAX, etc. Data came in a variety of formats - plain text, report format, BCD, COMP-3, EBCDIC, etc. I would often have to deduce the data format (as specs were rare!), and then write programs to convert it to CSV plaintext. For export, I would have to produce half-inch magnetic tapes in formats suitable for various platforms, using the same data types. I wrote a Unix-based hex editor during this time, and released it into the public domain.
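Two of those mainframe formats are easy to show concretely. EBCDIC text decodes directly with a stdlib codec in modern Python, and COMP-3 (packed decimal) stores two digits per byte with a sign nibble at the end. The original conversion programs were C; this is a minimal Python sketch of the same kind of decoding:

```python
def unpack_comp3(data: bytes) -> int:
    """Decode an IBM packed-decimal (COMP-3) field: two decimal digits per
    byte, with the final nibble holding the sign (0xD means negative)."""
    nibbles = []
    for byte in data:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign = -1 if nibbles[-1] == 0x0D else 1
    value = 0
    for digit in nibbles[:-1]:      # every nibble except the trailing sign
        value = value * 10 + digit
    return sign * value

# EBCDIC is just a codec: b'\xc8\x85\x93\x93\x96'.decode('cp037') -> 'Hello'
```
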

I was responsible for the design and entire back-end development (in C) of Primar's award-winning bitmap database, which used variable-size bitfields to store numeric and enumerated data, producing significant performance and space benefits over a traditional database.  I later recoded this in C++ and enhanced it in numerous ways.  Primar continued to be a client when I became self-employed/contracting in around 1992/1993.

 

Prior to 1990

I entered the IT industry in around 1984, after attending a government-sponsored TOPS course at Systime in Slough. My first role was with Tetra Business Systems, supporting their Unix/DOS-based accounts package. Later I was moved to the development team. From there I was recruited by Computer Answers, a Tetra dealer, and then by Shortlands Computing Services, a competitor. From there I joined Primar, as outlined above.