Wednesday, October 21, 2009

Draft SHA-1 Implementation

I recently committed a draft implementation of the SHA-1 hash algorithm to the ACO repository. This implementation is only a draft because it currently lacks any test cases and documentation. I thus consider it to be incomplete at this time. However, this is the first activity in the ACO project in a while so I thought it would be good to post a message about it anyway. I am alive! Hopefully I will be able to make steady progress from now on.

My SHA-1 implementation follows as directly as possible the description of the primary algorithm given in FIPS 180-1. My idea was to create an implementation that could eventually be proved correct with respect to the specification. I assumed that would be easier if I followed the specification as literally as possible. Alas, constructing a proof of correctness will be complicated by the fact that the specification isn't as formal as one might like. I will worry about that problem later. In the meantime it is my hope that my implementation is at least easy to review manually.

Ultimately I would also like to provide an implementation of the low memory consuming "alternate" algorithm described in FIPS 180-1. Of course it will additionally be necessary to formally show that the alternate algorithm produces the same result as the primary algorithm (FIPS 180-1 doesn't really do this), and that my implementation of it is correct with respect to its specification. What fun!

Tuesday, February 10, 2009

Refactoring for SPARK

I decided to try using SPARK after I had already started the design of the ACO library and implemented one encryption algorithm (Blowfish). Once I decided to use SPARK for the core algorithms, I knew that I needed to refactor my existing code to make it conform to SPARK rules. I wasn't sure how that was going to work out so I felt the refactoring had to be done right away to ensure that it was even going to be possible.

I just completed the refactoring yesterday. I had to rearrange a number of packages, as expected, but overall it went fine and the resulting code is quite reasonable. The main tricky aspect was related to the intrinsic shift procedures. GNAT provides Shift_Left and Shift_Right procedures as "intrinsics" in package Interfaces. This means that they are compiled directly into single machine instructions without procedure call overhead. However, I didn't want to use the types defined in package Interfaces because I wanted to use representation clauses to define the properties of my basic types precisely (specifically, the size). Such clauses must appear in the same package as where the type is defined, and I don't have the option of editing the library package Interfaces.

However, my benchmark experiments with Blowfish showed that it was very important to performance to use the intrinsic procedures. Correctness may have first priority in ACO, but performance is important too. I didn't feel like I could worry about this matter later. Fortunately GNAT allows me to specify procedures of my own as intrinsic provided the procedures have the proper names and operate on types with appropriate properties. This allowed me to define Shift_Left and Shift_Right procedures for my type ACO.Octet while retaining the speed of the intrinsic operations.

The complication that arose was that I also wanted to do the same for the types ACO.Double_Octet, ACO.Quadruple_Octet, and ACO.Octuple_Octet. To make all the necessary shifts intrinsic I had to use particular procedure names. However, SPARK does not allow overloading of procedure names so this created a conflict between SPARK rules and my desire for an efficient implementation. To work around this, I created separate packages for the operations on each type. This caused something of a proliferation of packages, but it worked out to be a reasonable approach.

I also factored out the core algorithms from the object oriented interfaces so that the core algorithms could be made SPARKable while still retaining the convenient interfaces for users who want them. This went smoothly. There is now slightly more overhead involved when using the object oriented interfaces than there used to be because of the extra layer of procedure calling involved. However, benchmark tests on Blowfish show only a slight slowdown due to this effect. The beauty of this approach is that while the main algorithms are (will be) SPARKable, the high level interfaces made available by the library can continue to use full Ada 2005.

So far I have only applied flow annotations to the packages that I intend to be SPARKable. This helps increase confidence in those packages somewhat, but I believe the real gain in reliability will come when I start using the proof annotations. I intend to put that off for a while, however, until the library has been fleshed out a bit more. Yet even the flow annotations are useful. They forced me to properly check error status on all called procedures; something I hadn't been doing consistently before.

I feel that now the basic structure of ACO is complete and what remains is mostly a matter of building on that structure with actual algorithm implementations. I will probably continue by implementing SHA1 next.

Tuesday, November 18, 2008

Testing Crypto Implementations

How does one test an implementation of a cryptographic algorithm? This question is surprisingly tricky. In general many algorithms have at least some easy to determine correct results, so writing test cases for them is relatively straight forward. For example consider red-black trees. If one inserts a collection of integers into a red-black tree it is easy to see that a search operation on one of those integers should succeed. The test case author can now focus on writing test cases that search for particularly "interesting" integers so the insertion and search code are well exercised. However, the output of an encryption algorithm looks indistinguishable from random data. How can one know what the correct "random" data should be? Of course one could decrypt the result of encryption and verify that it agrees with the original input. Such round trip testing is certainly valuable but it's not enough. A round trip might succeed even though the ciphertext produced by the encryption is wrong. The implementation of decryption would have to be flawed in a complementary way, but it is easy to imagine that happening.

It seems like the only way to really test the validity of an encryption implementation is to check its output against a known correct implementation. The problem is how do we know a correct implementation when we see it? Certainly if multiple independently written implementations produce the same output that increases confidence in the correctness of all of them. However, how many implementations of an algorithm are really independently written?

For example, the Blowfish algorithm is initialized with 4168 digits from the hexadecimal expansion of Pi. The implementation of Blowfish currently in ACO uses the values for those digits that are posted on Bruce Schneier's web site. As a result ACO's implementation of Blowfish is not 100% independent of Schneier's. If those digits of Pi are wrong, we are both wrong in the same way. Even when multiple implementation of an algorithm are written independently there is still a possibility of them all making the same error, each on their own. While this might seem unlikely, it is certainly conceivable.

Published cryptographic algorithms are analyzed for security using mathematical methods. That analysis is done based on the published formal description of the algorithm. In order to ensure that an implementation of a cryptographic algorithm actually provides the security the theoretical results suggest, it is important to be using code that actually implements the algorithm's specification correctly. No amount of checking against other implementations will verify that the formal mathematical specification is being followed. Thus I submit that one should instead try to prove that an implementation is correct using formal methods.

It is for this reason that I intend to use the SPARK tools during the development of ACO. Although ACO is not at this point yet, I hope to eventually use SPARK to formally prove the correctness of all the core algorithms in the library. Of course formal proof at the level of the source code will not find errors in the compiler's code generator, or in numerous other places where errors might arise. Thus I fully intend to also check ACO's output against other implementations. It is my hope that the combination of testing and formal proof will give users a high level of confidence that ACO library actually provides the security assurances claimed by the published mathematical analysis.

Saturday, November 15, 2008

Introduction

This is a blog about the Ada Cryptographic Objects (ACO) library and all things related to it. ACO is a pet project of mine to build a high quality crypto library in Ada. I have a number of Ada projects in mind that could use cryptography and for various reasons none of the existing free crypto libraries really suited my tastes. In any case, as a hobby project this effort is not necessarily about quick development or maximum software reuse. Rather this project is just for my own enjoyment. Using other people's code, while highly practical, isn't nearly as much fun as writing my own.

Right now ACO is in its early stages. Most of my work has been focused on organizing the library rather than actually implementing cryptographic algorithms. I believe this is as it should be since I want to create a good foundation for the library before spending a lot of time writing the wrong code. I am using the AUnit unit test framework for my unit tests, I am building benchmark programs to check the performance of my algorithm implementations, and I am writing detailed documentation for ACO as I write the code. I hate dealing with software that has incomplete documentation and I don't want ACO to have that problem.

My plan is also to use the SPARK tool set to analyze my implementation of the core algorithms. SPARK is a powerful static analysis tool that can be used, for example, to help prove the correctness of one's code. I'm not sure right now how far I will take my use of SPARK; I'm using it here partly to learn about SPARK's issues and capabilities.

Right now ACO lives on SourceForge. I have been using the developers forum there to make notes about my work as I go along. My thought was that if another developer joins me in the future he or she might want to review those notes to get a better understanding of my design decisions and why I made them. After making a few entries in the developers forum, I realized that I probably really should be posting such material in a public place. Others who are interested in Ada, cryptography, or highly reliable software might be interested in following or commenting on ACO's development. It is in that spirit that I created this blog.