Lexington, MA, United States

Alistarh D.,Ecole Polytechnique Federale de Lausanne | Bender M.A.,Tokutek, Inc. | Gilbert S.,National University of Singapore | Guerraoui R.,Ecole Polytechnique Federale de Lausanne
Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS | Year: 2012

Asynchronous task allocation is a fundamental problem in distributed computing in which p asynchronous processes must execute a set of m tasks. Also known as write-all or do-all, this problem has been studied extensively, both independently and as a key building block for various distributed algorithms. In this paper, we break new ground on this classic problem: we introduce the To-Do Tree concurrent data structure, which improves on the best known randomized and deterministic upper bounds. In the presence of an adaptive adversary, the randomized To-Do Tree algorithm has O(m + p log p log^2 m) work complexity. We then show that there exists a deterministic variant of the To-Do Tree algorithm with work complexity O(m + p log^5 m log^2 max(m, p)). For all values of m and p, our algorithms are within log factors of the Ω(m + p log p) lower bound for this problem. The key technical ingredient in our results is a new approach for analyzing concurrent executions against a strong adaptive scheduler. This technique allows us to handle the complex dependencies between the processes' coin flips and their scheduling, and to tightly bound the work needed to perform subsets of the tasks. © 2012 IEEE.

Bender M.A.,Tokutek, Inc. | Gilbert S.,National University of Singapore
Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS | Year: 2011

This paper presents a new algorithm for mutual exclusion in which each passage through the critical section costs amortized O(log^2 log n) RMRs with high probability. The algorithm operates in a standard asynchronous, local spinning, shared memory model with an oblivious adversary. It guarantees that every process enters the critical section with high probability. The algorithm achieves its efficient performance by exploiting a connection between mutual exclusion and approximate counting. © 2011 IEEE.

Alistarh D.,Massachusetts Institute of Technology | Bender M.A.,Tokutek, Inc. | Gelashvili R.,Massachusetts Institute of Technology | Gilbert S.,National University of Singapore
Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms | Year: 2014

Task allocation is a classic distributed problem in which a set of p potentially faulty processes must cooperate to perform a set of tasks. This paper considers a new dynamic version of the problem, in which tasks are injected adversarially during an asynchronous execution. We give the first asynchronous shared-memory algorithm for dynamic task allocation, and we prove that our solution is optimal within logarithmic factors. The main algorithmic idea is a randomized concurrent data structure called a dynamic to-do tree, which allows processes to pick new tasks to perform at random from the set of available tasks, and to insert tasks at random empty locations in the data structure. Our analysis shows that these properties avoid duplicating work unnecessarily. On the other hand, since the adversary controls the input as well as the scheduling, it can induce executions where many processes contend for a few available tasks, which is inefficient. However, we prove that every algorithm has the same problem: given an arbitrary input, if OPT is the worst-case complexity of the optimal algorithm on that input, then the expected work complexity of our algorithm on the same input is O(OPT log^3 m), where m is an upper bound on the number of tasks that are present in the system at any given time. Copyright © 2014 by the Society for Industrial and Applied Mathematics.
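The two operations named in the abstract, picking a random available task and inserting a task at a random empty location, can be illustrated with a small sequential toy model. This is only a sketch under stated assumptions, not the authors' algorithm: the class name ToDoTreeSketch, the array layout, and the per-node counters are hypothetical choices made for illustration, and the sketch ignores concurrency, faults, and the adversarial scheduler that the paper is actually about.

```python
import random


class ToDoTreeSketch:
    """Sequential toy model (hypothetical): a complete binary tree whose
    nodes count the occupied leaves beneath them."""

    def __init__(self, capacity, rng=None):
        if capacity < 1 or capacity & (capacity - 1):
            raise ValueError("capacity must be a power of two")
        self.capacity = capacity
        # count[1] is the root; leaves live at capacity .. 2*capacity - 1.
        self.count = [0] * (2 * capacity)
        self.slots = [None] * capacity
        self.rng = rng or random.Random()

    def _descend(self, weight):
        """Walk from the root to a leaf, choosing each child with
        probability proportional to weight(child, leaves_under_child)."""
        node, span = 1, self.capacity
        while node < self.capacity:
            span //= 2
            left = 2 * node
            w_left = weight(left, span)
            w_right = weight(left + 1, span)
            node = left if self.rng.randrange(w_left + w_right) < w_left else left + 1
        return node

    def _update(self, leaf, delta):
        """Adjust the occupancy counters on the path from leaf to root."""
        node = leaf
        while node >= 1:
            self.count[node] += delta
            node //= 2

    def insert(self, task):
        """Place a task in a uniformly random empty leaf."""
        if self.count[1] == self.capacity:
            raise OverflowError("no empty slot")
        leaf = self._descend(lambda n, span: span - self.count[n])
        self.slots[leaf - self.capacity] = task
        self._update(leaf, +1)

    def take(self):
        """Remove and return a uniformly random available task, or None."""
        if self.count[1] == 0:
            return None
        leaf = self._descend(lambda n, span: self.count[n])
        task = self.slots[leaf - self.capacity]
        self.slots[leaf - self.capacity] = None
        self._update(leaf, -1)
        return task
```

Weighting each descent step by the counters makes every empty leaf (for insert) or occupied leaf (for take) equally likely, which is the intuition behind spreading processes out so that they rarely pick the same task.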

Agency: National Science Foundation | Branch: | Program: SBIR | Phase: Phase I | Award Amount: 100.00K | Year: 2008

This Small Business Innovation Research (SBIR) Phase I project will investigate adaptive techniques to speed up key database operations dramatically. Specifically, the project will investigate adaptive algorithms and idle-time re-balancers which can respond to bursts of insertions and to changing insertion and query patterns. Many applications insert millions of indexed records per second into storage systems. The proposed research is based on new algorithms for transactional databases that improve insertion speeds by two orders of magnitude, achieving about 2% of disk bandwidth for worst-case insertion patterns of 100-byte records, as compared to 0.01% for traditional B-tree-based databases, a 200-fold speedup. Although impressive, there remains another factor of 50 before disk bandwidth is fully utilized. The specific research objective is to obtain another order-of-magnitude speedup for insertions, allowing databases to insert millions of indexed records per second on a modestly sized disk array. The anticipated outcome of the research is an algorithm with a theoretical performance analysis, along with a design document for incorporating the design into a proposed database product. The market for databases and file systems is over $15 billion per year and growing. Furthermore, there are many application areas which do not employ databases because their performance is too slow. Orders-of-magnitude speedups for databases can help grow the market by additional billions of dollars per year. Societal impact: Applications in finance, retail, homeland security, telecommunications, and scientific computing will benefit from high-performance databases. Enhanced scientific and technological understanding: The proposed research will further understanding of how to organize data on disk, which is a core problem for computation on large persistent data.
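The bandwidth figures quoted above are mutually consistent, as a short check shows. The snippet below is only an arithmetic sketch: the variable and function names are hypothetical, and the two percentages are taken directly from the abstract (with "about 2%" treated as exactly 2%).

```python
from fractions import Fraction

# Figures quoted in the abstract, as exact fractions of disk bandwidth.
NEW_UTILIZATION = Fraction(2, 100)       # "about 2%", treated as exactly 2%
BTREE_UTILIZATION = Fraction(1, 10_000)  # 0.01%


def speedup(new, old):
    """Ratio between two bandwidth utilizations."""
    return new / old


# Improvement of the new algorithms over a traditional B-tree.
speedup_over_btree = speedup(NEW_UTILIZATION, BTREE_UTILIZATION)

# Headroom left before the disk bandwidth is fully used (100%).
remaining_headroom = speedup(Fraction(1), NEW_UTILIZATION)

print(speedup_over_btree, remaining_headroom)
```

Exact fractions are used instead of floats so that the ratios come out as whole numbers without rounding noise.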

Agency: NSF | Branch: Standard Grant | Program: | Phase: SMALL BUSINESS PHASE II | Award Amount: 425.00K | Year: 2011

This Small Business Innovation Research (SBIR) Phase II project will apply multithreading techniques to provide multi-terabyte (and larger) high-performance databases in MySQL. The company has developed a high-performance storage engine for MySQL, which maintains indexes on live data 100 times faster than current commonly used structures. The technology solves the problem of maintaining indexes on large databases in the face of high trickle-load indexing rates. In Phase I, the company developed a multithreaded bulk loader to solve the problem of how to load data quickly. The next significant research problems for large MySQL databases are to allow online, or hot, schema changes in which, for example, an index can be added without taking the database down, and to use multithreading to speed up joins and reductions so that the large data sets can be queried quickly. In this project, the researchers will investigate the use of multithreading to support hot indexing and parallel joins and reductions.

If successful, multi-terabyte and larger databases will be manageable and fast on modest hardware, and the hardware will be scalable both with CPU cores and disks. The broader impact of this work is driven by faster, cheaper, lower-power on-disk storage. Organizations that have very large databases will be able to use much less hardware, both saving money and reducing power consumption significantly. Currently many application areas do not employ databases because their performance is too slow. Speeding up databases by two orders of magnitude can help grow the market. Currently, many organizations fail to make good use of the data they have collected because they cannot manage it, index it, or query it fast enough to be useful. Applications in finance, retail, homeland security, telecommunications, and scientific computing will benefit from improved manageability and performance. As users' appetite for data continues to outstrip the availability of fast memory, organizing multithreaded queries on disk-based data for performance will continue to grow in importance.
