I was employed by Hewlett-Packard Enterprise from May 2015 to May 2017, and during that time I worked on The Machine. The Machine is intended to be a faster, more secure, always-on server solution for enterprise-level operations, built on fiber-optic connections and a revolutionary architecture. For my part, I worked on the “secure” half of that promise for the better part of two years. Before I left the project, the Key Manager was accepted as complete and reliable code.
The principal aspect of this security is the key manager. The Machine can comprise a great many nodes, each containing vast amounts of non-volatile memory that would make a beautiful target for corporate espionage, so it is imperative that these nodes be safe from third parties. This security is accomplished using the following basic elements:
Trusted Platform Module
The TPM is a secure storage device that can be accessed only with specific authorization values at specific points in a computer’s operation. The TPM not only stores key data, but also creates a secure channel by which to transmit that data safely to its user, and it provides additional protections such as read-only access. By the nature of such a device, storage space is scarce, so part of this project entailed identifying which data elements needed to be stored in the TPM and how to fit them into the space provided.
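The space-budgeting exercise can be sketched in C as a packed record whose size is checked at compile time. The field names and sizes below are purely illustrative, not The Machine’s actual record format:

```c
#include <stdint.h>

/* Hypothetical layout for key material kept in a single TPM NVRAM
 * index. Every field is sized explicitly and the struct is packed,
 * so the total footprint is known exactly. */
#pragma pack(push, 1)
typedef struct {
    uint8_t version;          /* record format version            */
    uint8_t key_id[4];        /* which node key this record holds */
    uint8_t wrapped_key[32];  /* 256-bit key, wrapped by the TPM  */
    uint8_t hmac[32];         /* integrity check over the above   */
} tpm_key_record;
#pragma pack(pop)

/* TPM NVRAM is scarce, so verify at compile time that the record
 * has not silently grown past its budget. */
_Static_assert(sizeof(tpm_key_record) == 69, "record layout drifted");
```

A compile-time assertion like this turns an accidental padding or field change into a build failure rather than a corrupted NVRAM write.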
While the TPM securely stores and transmits specific key values to individual components of a node, nodes also communicate with a top-of-rack server, which coordinates many of the features that give The Machine its power and reliability. Communication between the top-of-rack server and the nodes must be secure to prevent unauthorized access to sensitive data, which could include keys or proprietary information. To accomplish this efficiently and in a system-oriented way, I integrated various SSL/TLS library routines into a streamlined, cohesive security platform that ensures security is maintained at both ends of the transmission.
Key Generation
The majority of cybersecurity relies on key values to identify communication agents and encrypt data elements. These values must be securely generated and stored in order to prevent impersonation and data interception. Much of this project involved identifying all security-related values required for full system operation, generating those values in a manner that ensured uniqueness and secrecy, and transmitting any required communication keys over a secure channel.
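On Linux, secure generation can be as simple as drawing bytes from the kernel’s CSPRNG. This sketch stands in for whichever hardware or library entropy source a real key manager would use (a TPM RNG, OpenSSL’s `RAND_bytes()`, and so on):

```c
#include <stdio.h>
#include <string.h>

/* Fill `key` with `len` bytes from the kernel CSPRNG.
 * Returns 0 on success, -1 on failure. Illustrative only:
 * a production key manager would likely use a dedicated
 * entropy source and handle short reads more carefully. */
int generate_key(unsigned char *key, size_t len)
{
    FILE *f = fopen("/dev/urandom", "rb");
    if (f == NULL)
        return -1;
    size_t got = fread(key, 1, len, f);
    fclose(f);
    return got == len ? 0 : -1;
}
```

Drawing from a CSPRNG rather than a deterministic source is what gives each generated key the uniqueness and unpredictability the text above calls for.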
Top of Rack Security
As previously mentioned, The Machine relies on a top-of-rack server to coordinate events and provide secure communication. As part of this project, I implemented an Enterprise Secure Key Manager (ESKM) solution to store communication keys, minimizing the window of opportunity for a malicious agent to access the encryption keys the top-of-rack server needs to communicate with individual nodes. In addition, I established secure key generation procedures for the top-of-rack server, as well as secure communication channels using TLS. All of this is integrated into the code base to ensure both security and efficiency.
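Minimizing that window of opportunity amounts to a simple pattern: fetch the key from the ESKM, use it, and wipe it from memory immediately. A sketch of the wipe step in C (the surrounding fetch and cipher calls are hypothetical stand-ins, not the real ESKM client API):

```c
#include <stddef.h>

/* Wipe key material in a way the compiler cannot optimize away.
 * Intended usage pattern (fetch/encrypt calls are hypothetical):
 *     eskm_fetch(key, len);
 *     encrypt_with(key, len, ...);
 *     wipe_key(key, len);
 * The volatile pointer prevents dead-store elimination of the
 * zeroing loop. */
void wipe_key(unsigned char *buf, size_t len)
{
    volatile unsigned char *p = buf;
    while (len--)
        *p++ = 0;
}
```

Keeping the plaintext key resident only between fetch and wipe is what makes centralized ESKM storage an improvement over leaving keys on the top-of-rack server’s disk.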
OpenSSL vs. GnuTLS
Two open-source packages dominate SSL/TLS implementations: OpenSSL and GnuTLS. OpenSSL is by far the more widespread, due in part to its accessible standalone tools and clean interfaces, while GnuTLS is published under the GNU Lesser General Public License and so allows for somewhat greater independence. In working with both packages, I found that OpenSSL has much cleaner libraries and interfaces than GnuTLS, with a license that is among the most favorable to developers. However, working with GnuTLS exposed me to the inner workings of SSL/TLS libraries generally and taught me to work without clear documentation.
The Project Life Cycle
Working on this project, I experienced the majority of a product’s life cycle. I joined the company in time to help finalize the initial specification, design and build the initial libraries, retool the specification to adapt to changing requirements and new information, finish an initial prototype, test and refine that prototype, maintain the code and add features, receive final approval, and ship a product.
Managing a Code Base
I acted as the primary maintainer of the project’s code repositories within my team. In this role I learned to reorganize a large file structure without breaking functionality or reliability, develop and maintain makefiles, handle review requests and approve updates to the repositories, work with git’s core functionality daily, and keep documentation fully updated at all times.
Developing Tool Awareness
As this was my first job out of college, I had to pick up, and develop some level of mastery with, a number of tools I’d never worked with before. In the first few months alone I learned to work with git, awk and sh scripting, makefiles, ctags, vi, ReviewBoard, gdb, gcc, Python, rpm, tar, ssh, and several other essential tools without which my work would be far more difficult. This not only gave me a suite of tools with which I am comfortable and which make my work more efficient and painless, but it also taught me how to identify useful tools and quickly make them part of my workflow.
Agile Development
I studied various development strategies in college, but only in the real world did I truly understand the strengths and weaknesses of the Agile method. Breaking a large, complex project into manageable chunks that can be completed in a few days or weeks allows programmers to produce constant, reliable output while regularly experiencing the accomplishments that justify hours of relentless programming and testing. Regular meetings, both daily stand-ups and bimonthly planning sessions, create accountability while allowing any problems to be considered and addressed while still fresh in memory. The constant cycling of project chunks allows priority shifts and bugs to be addressed swiftly, lessons learned to be integrated into the process efficiently, and development to progress organically.
However, the Agile method does lend itself to missing the larger picture. While most team members can focus on individual smaller tasks, a few must dedicate themselves largely to overarching architecture, planning, and management so that the project proceeds sensibly. This means that Agile development requires a diverse set of development skills in a certain proportion, and it largely isolates some skilled programmers from writing code.
In the end, Agile must itself be agile and adapt to the projects and teams which implement it.
Documentation
There are many documentation styles, and most rely on documenting after the fact: a programmer designs some code, tests it, tweaks it, and only after the purpose of each function and object has grown stale attempts to explain it. I quickly discovered that documenting as I go, leaving a metaphorical paper trail even at the expense of large comment blocks and abundant inline comments, makes maintenance and final documentation much more efficient.
In addition, I discovered the Doxygen documentation tool, which requires only minimal adaptation of common in-file comments to produce easily navigated hypertext documentation for C and other languages. With Doxygen, a document-as-you-go style becomes even more efficient, as one need only maintain a single set of documentation in the code to produce countless permutations of human-readable documents.
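A minimal example of the style, using an illustrative function of my own invention rather than project code:

```c
/**
 * @brief Clamp a value to an inclusive range.
 *
 * The comment is written alongside the code, in Doxygen's standard
 * special-comment format, so the same text serves as an inline
 * paper trail and as generated hypertext documentation.
 *
 * @param value The value to clamp.
 * @param lo    Lower bound (inclusive).
 * @param hi    Upper bound (inclusive).
 * @return value limited to the range [lo, hi].
 */
int clamp(int value, int lo, int hi)
{
    if (value < lo)
        return lo;
    if (value > hi)
        return hi;
    return value;
}
```

Running `doxygen` over a tree of files commented this way produces cross-linked HTML (among other formats) with no further effort.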
The Importance of Style
Every programmer has a preferred coding style, and some styles are more accessible than others. I found that capping line lengths at 80 characters makes wrapping unnecessary, so the code I see on my wide screen presents the same way on paper and vice versa. Employing standard tab sizes across the board likewise ensures that a change of medium does not create a visible change in the code, improving readability. In addition, using white space to logically separate and define code segments makes code more appealing to the human eye and easier to visually scan. Finally, meaningful naming conventions that logically group and distinguish elements expedite understanding.
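A small, invented snippet showing those conventions applied together (the names and layout choices are illustrative, not a prescribed house style):

```c
#include <stddef.h>

/* Constants: upper-case, grouped by role; lines kept under 80 cols. */
#define NODE_KEY_BYTES 32

/* Functions and variables: lower_snake_case, verbs for actions.
 * Blank lines separate setup, work, and result. */
static size_t count_zero_bytes(const unsigned char *buf, size_t len)
{
    size_t zeros = 0;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0)
            zeros++;
    }

    return zeros;
}
```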
Almost more important than clear style, though, is consistent style. Knowing what to expect and how to read a file makes that process faster, while mixing styles creates some mental confusion and limits the speed at which work and understanding can be achieved.