Monday, September 10, 2018

Exactly Once is NOT Exactly the Same: Article Analysis


I decided to analyze an article describing some interesting details of the stream processing exactly-once. The thing is that sometimes the authors misunderstand important terms. The analysis of the article just will clarify many aspects and details so revealing illogicalities and oddities allows you to fully experience the concepts and meaning.

Let’s get started.


Everything starts very well:

Distributed event stream processing has become an increasingly hot topic in the area of Big Data. Notable Stream Processing Engines (SPEs) include Apache Storm, Apache Flink, Heron, Apache Kafka (Kafka Streams), and Apache Spark (Spark Streaming). One of the most notable and widely discussed features of SPEs is their processing semantics, with “exactly-once” being one of the most sought after and many SPEs claiming to provide “exactly-once” processing semantics.

Meaning that data processing is extremely important bla-bla-bla and the topic under discussion is exactly-once processing. Let us discuss it.

There exists a lot of misunderstanding and ambiguity, however, surrounding what exactly “exactly-once” is, what it entails, and what it really means when individual SPEs claim to provide it.

Indeed, it is very important to understand what it is. To do this, it would be nice to give a correct definition before the lengthy reasoning. And who am I to give such damn sensible advice?

Thursday, July 19, 2018

Heterogeneous Concurrent Exactly-Once Real-Time Processing

Concurrent sausage


Exactly-once data processing in real-time is an extremely non-trivial task and requires serious and thoughtful approach for the entire pipeline. Someone even believes that such a task is impossible. In reality, one wants to have an approach that provides generic fault-tolerant processing without any delays together with using the different data storages, which puts forward an even stronger requirement for the system: concurrent exactly-once and heterogeneity of the persistent layer. To date, any of the existing systems do not support this requirement.

The proposed approach will consistently reveal secret ingredients and necessary concepts allowing to implement heterogeneous concurrent exactly-once processing relatively easy literally based on two components.


The developer of distributed systems passes several stages:

Stage 1: Algorithms. Here we study the basic algorithms, data structures, approaches to OOP type programming, and so on. The code is solely single-threaded. The initial phase of entering the profession. Nevertheless, it is rather difficult and can last for years.

Stage 2: Multithreading. Then there are questions of extracting maximum efficiency from hardware, there is multithreading, asynchrony, races, debugging, stracing, sleepless nights... Many get stuck at this stage and even start from some moment to catch an inexplicable buzz. But only a few come to understand the architecture of virtual memory and memory models, lock-free/wait-free algorithms, various asynchronous models. And almost no one ever comes to the verification of multi-threaded code.

Stage 3: Distributed programming. Here such shit happens that words cannot describe it.

Thursday, April 26, 2018

Attainability of the Lower Bound of the Processing Time of Highly Available Distributed Transactions


Recently I've read another article from the series: "we are better than a two-phase commit". Here I will not analyze the contents of this article (although I'm thinking about giving a detailed analysis). The task of my opus is to offer the most effective version of the distributed commit in terms of number of round trips. Of course, such a commit comes at a price. However, the goal is to assess and show that the two-phase commit is not a drag, as many believe.

It should also be noted that there will be no full-scale experiments and fake comparisons. Algorithms and theoretical analysis will be given simply. If desired, you can independently implement and test it in practice. Of course, it would be much better that this was presented in the current article, but it all depends on free time and motivation. In my opinion, it's more important to describe algorithms than to present charts, because almost everyone can draw the charts based on algorithms while the opposite is not true.

Thursday, November 16, 2017

Replicated Object. Part 3: Subjector Model

Parallel execution


This article is a continuation of the series of articles about asynchrony:

  1. Asynchronous Programming: Back to the Future.
  2. Asynchronous Programming Part 2: Teleportation through Portals.

After 3 years, I have decided to expand and generalize the available spectrum of asynchronous interaction based on coroutines. In addition to these articles, it is also recommended to read the article related to god adapter:

  1. Replicated Object. Part 2: God Adapter.


Consider an electron. What do we know about it? A negatively charged elementary particle, a lepton having some mass. This means that it can participate in at least electromagnetic and gravitational interactions.

Saturday, August 19, 2017

Kinetics of Large Clusters


  1. Martin Kleppmann's fatal mistake.
  2. Physicochemical kinetics does mathematics.
  3. The half-life of the cluster.
  4. We solve nonlinear differential equations without solving them.
  5. Nodes as a catalyst.
  6. The predictive power of graphs.
  7. 100 million years.
  8. Synergy.

In the previous article, we discussed in detail Brewer's article and Brewer's theorem. This time we will analyze the post of Martin Kleppmann "The probability of data loss in large clusters".

In the mentioned post, the author attempts to simulate the following task. To ensure the preservation of data, the data replication method is usually used. In this case, in fact, it does not matter whether erasure is used or not. In the original post, the author sets the probability of dropping one node, and then raises the question: what is the probability of data loss when the number of nodes increases?

The answer is shown in this picture:

Data loss

Sunday, August 13, 2017

Latency of Geo-Distributed Databases

Theorem 0. The minimum guaranteed latency for the globally highly available strong consistency database is 133 ms.


1 Abstract

The article introduces step-by-step the formal vocabulary and auxiliary lemmas and theorems to prove the main theorem 0. Finally, as a consequence, the CAL theorem is formulated.

2 Introduction

Modern applications require intensive work with huge amount of data. It includes both massive transactional processing and analytical research. As an answer to the current demand, the new generation of databases appears: NewSQL databases. Those databases provide the following important characteristics: horizontal scalability, geo-availability, and strong consistency.

NewSQL era opens new possibilities to store and process so called Big Data. At the same time, the important question appears: "how fast the databases might be?". It is very challenging task to improve the performance and latency parameters because it involves almost all layers while building the databases: from hardware questions about data centers connectivity and availability to software sophisticated algorithms and architectural design.

Thus, we need to understand the degree of latency optimizations and corresponding limitations that we have to deal with. The article tries to find answers to that challenge.

Saturday, March 4, 2017

CAP Theorem Myths



The article explains the most widespread myths of CAP theorem. One of the reason is to analyze recent Spanner, TrueTime & The CAP Theorem article and to make clear understanding about terms involved in the theorem and discussed a lot under different contexts.

We consider that article closer to the end, armed with the concepts and knowledge. Before that, we analyze the most common myths associated with the CAP theorem.