In the early years of the past decade, the exponential growth of the sequential computing performance that has characterized the previous fifty years suffered a setback. Although the quantity of transistors on a chip continues to follow Moore’s law, it has become increasingly difficult to continue to improve the performance of sequential processors by simply raising the clock frequency, mainly due to power and cooling motivations. To remedy this situation, the industry released the so-called “multicore”, or “chip multiprocessors” systems, which provide for the presence of multiple processing units on a single chip and connected through a shared memory. In subsequent years the number of processors on a chip will increase at the Moore’s law rate, as well as the peak number of instructions executed per seconds, allowing this architecture to be the potential solution to the problem of stalled performance growth.
On the other hand, a parallel program is far more difficult to design than an equivalent sequential program, and rarely offers a significant performance increase which may be attributable to the nature of the program and the impossibility to structure it in a set of parallel independent tasks. The most real-world computational problems cannot be effectively parallelized without incurring the cost of inter-processor communication and coordination because, while parts of the program mandatory need to be performed in a serial manner, the parallel parts may also need to access shared data, which in turn require particular synchronization techniques. Parallelization and synchronization can have therefore a dominant effect on performance resulting in this way in an extremely non-linear speed-up curve tending to stall or worse. Thus, parallel programming makes more complex the programmer work with respect to the sequential one; a simple task requires much more attention in order to guarantee fundamental properties such as “safety” and “liveness”, and an approach that seems to be very good in solving a problem could give rise to bad outcomes in solving another one.
While parallelism has been a difficult problem for general-purpose programming, database systems have successfully exploited parallel hardware for decades by executing many queries concurrently on multiple processors. The author of the query does not care anymore about parallelism and has only to focus on the correctness of the query itself, leaving the hard task of ensuring atomicity, consistency, isolation and durability (A.C.I.D.) to the transactional engine that is part of the database management system (DBMS). The transaction is indeed the heart of the programming model for databases and can be expressed as a group of read and write operations performed on shared objects which must appear to be executed atomically at a single point in time, before and after the effect of other transactions running or not concurrently, in a serial one-at-a-time order. Transactions offer therefore a proven abstraction mechanism in database systems for constructing reusable parallel computations, and the advent of multicore processors has renewed interest in an old idea, that of incorporating transactions into the programming model used to write parallel programs.
This is the approach followed in Transactional Memory (TM), a paradigm that enables programmers of concurrent applications to rely on atomicity and isolation guarantees provided by some TM layer. In this thesis I will present a fully innovative mechanism enabling in-memory transactions to be executed as preemptable tasks, with preemption actuated according to very fine grain intervals. This in turn enables a single thread to exploit one CPU-core in the most effective manner with respect to different priority levels of the transactions to be processed, an issue that as not yet been tackled by any literature study.