Saturday, March 31, 2018

Book review: Last lecture by Randy Pausch

Of course you have watched the late Prof. Randy Pausch's last lecture. If for some reason you haven't, please right that wrong now.

I recently read the book "Last Lecture" by Randy Pausch. I got the book from my local library, yes, a physical library building. They still exist, and they are a great resource.

It is a short book, and I have a very brief review.

The book does not repeat the "last lecture" presentation. It tells mostly disjoint and complementary story, and it gets into more personal stuff. I learned several more interesting anectodes about Randy in this book.

It is of course a very heartbreaking story. The book is a "memento mori".

It makes you think about "What would you want to be known for?" and "How would you prioritize your life if you had 5 years left?"

I am still thinking about those long after I returned the book to the library.

I am currently reading about "Crypto: How the Code Rebels Beat the Government Saving Privacy" by Steven Levy and "Tribe of Mentors" by Tim Ferriss. Both courtesy of my local library, and both very interesting. I will write short summaries about them once I am done.

Tuesday, March 27, 2018

Master your tools

Couple months ago, a friend sent me this article about "Speed learner Max Deutsch challenging chess grandmaster Magnus Carlsen". My friend called the story a life-hack story, and remarked that this is the "Silicon Valley bro" frame of mind in action: "Nobody deserves any respect because I am so confident and valley savvy I could naturally write an app and do it better".

This was my reply on the article.
This is a very nice illustration of a hack versus expert: Expert had years of battle-scar, and internalized everything. Hacks/shortcuts are going to take you only to where the expert's domain starts.
Also another take away is, we humans are dumb. We don't have anything to be proud of about our mental prowess. I make so many stupid mistakes every week, it is frustrating. It is a shame we can't get much smarter on the spot. But in the offline mode (doing research/thinking by writing), I think we can do better. We can only make up for the shortcoming of our brains by using discipline (writing or math-symbolic computing).

This story is a perfect illustration of the need for tooling and the need for mastering your tools.

There are many levels of learning/mastering

This is what I wrote in 2012 about the many levels of learning. It has a nice backstory as well, about how I almost flunked differential mathematics as a freshman.
There are many levels of knowing. The first level is "knowing by word". With passive learning (like I did for studying for the diff final), you get the most limited form of understanding/knowing. It is ephemeral, and you are not able to reconstruct that on your own. The second level is "knowing by first hand experience". At this level, you are able to understand/witness and if necessary rediscover the knowledge on your own. Finally, the third level is "knowing by heart" or grokking it. At this level, you internalized the knowledge so that it becomes part of your nature.
The best way of learning is by doing. You need to work hands-on, make a lot of mistakes so you can start to internalize that knowledge. That means you should make, and produce things continuously. For this to work in a sustainable manner, you need to change your default mode from the consumer mode to the producer mode.

As it comes to learning/mastering a tool, hacks/shortcut does not cut it. Just having a passing familiarity with the tool doesn't empower you.  You need to master and internalize your tools.

No pain, no gain. If you don't bend yourself, you did not master the tool. You are short-cutting and won't get the full benefits of the tool and level up.

You truly master a tool only when you master yourself and bend yourself with that tool's thinking mode. Mastering the tool and yourself through the tool, takes many weeks, and sometimes years. When you pay your dues and go deep, only then, you transcend and level up.

Learning/mastering is a gradient descent process

You (well, at least I) progress through the levels of learning gradually almost via a gradient descent like process. Let me give an example from Paxos.

The first time I taught Paxos in my distributed systems class around 10 years ago, I don't think I understood it well. I was a hack. I felt very unsatisfied about how I couldn't explain it better. Through these regrets, I started getting better at it. This echoes what I wrote earlier about "what I learn from teaching".
In that down hour after teaching, what do I do? I have regret-flashbacks about some parts of the lecture I delivered. I have remorse about how I botched some parts of the lecture and how I couldn't answer a student's question better. The reason I botch up a part of the lecture is often because I haven't internalized that part yet. The reason I wasn't able to make a smooth transition from one part to another part is often because I didn't have a better understanding/insight about how those parts fit. It is likely I was missing some context about it. Or maybe more likely that I knew the context, but I didn't internalize it well enough. There are many different levels of knowing, and the more reliable/resilient/dependable level of knowing is knowing by experience. Teaching is a way to have hands-on training in your research area. You learn through your regrets about how you couldn't explain it better. 

So the way I got better at internalizing and explaining Paxos was by making mistakes and feeling like an idiot. Once in a while I would be saying "Oh, wait, this doesn't make sense in Paxos", or "Wait, how does Paxos deal with this?", and then work my way out of these problems. I would feel bad and say to myself "How come I didn't notice this after this many years?", but I now got used to this feeling. I know this is a natural part of the learning/mastering the tool. This is a good kind of screw up, because it makes me learn. When I screw up and learn from it, I am leveling up.

Tools grow your brain

A good tool is a force multiplier. So it is definitely worth the time to master a good tool, because you will get a lot of mileage out of that tool as multiplicative factor.

Young Steve Jobs was pitching computers as bicycles for the mind.
I think one of the things that really separates us from the high primates is that we’re tool builders. I read a study that measured the efficiency of locomotion for various species on the planet. The condor used the least energy to move a kilometer. And, humans came in with a rather unimpressive showing, about a third of the way down the list. It was not too proud a showing for the crown of creation. So, that didn’t look so good. But, then somebody at Scientific American had the insight to test the efficiency of locomotion for a man on a bicycle. And, a man on a bicycle, a human on a bicycle, blew the condor away, completely off the top of the charts.
And that’s what a computer is to me. What a computer is to me is it’s the most remarkable tool that we’ve ever come up with, and it’s the equivalent of a bicycle for our minds.

To make a transformative contribution, sometimes you will need to invent your own tools as I wrote earlier in "How to go 10X":
Adopt/Invent better tools "Give me six hours to chop down a tree and I will spend the first four sharpening the axe." -- Abraham Lincoln
I mean, not just better but transformative tools of course. Most often, you may need to invent that transformative tool yourself. When you are faced with an inherent worthy problem, don't just go for a point solution, generalize your solution, and ultimately make it in to a tool to benefit for the future. Generalizing and constructing the tool/system pays that technical debt and gets you to have truly 10X benefit. MapReduce as a tool comes to my mind as a good example for this. By generalizing on the kind of big data processing tasks/programs written at Google, Jeff Dean was able to see an underlying pattern, write a good tool to solve the problem once and for all.
Scientists spend decades to invent transformative tools (Hadron Collider, Hubble telescope) and then they get breakthrough results. As researchers in computer science, we should try to adopt/cultivate this mentality more.
The good news is mastering a tool makes it easier for you to pick up another one. As you become expert in one tool/domain, you can map other tools/domain to that and contextualize and master them much faster.

Acquiring new tools is the surest way to get smarter.

After you master a tool

After you master a tool, you internalize it and go minimalist, and reduce it to first principles. You not only internalize how to use it, you also internalize its limitations and learn when not to use it. You learn from your tools and start converging to more minimalistic tools and first principles.

As to methods, there may be a million and then some, but principles are few. The man who grasps principles can successfully select his own methods. The man who tries methods, ignoring principles, is sure to have trouble.
-- Ralph Waldo Emerson

MAD questions

1. What qualifies as a tool? 
Does critical reading qualify? How about writing well?

2. I want to learn from you, what are the tools that benefited you the most?

3. What does your mastering process look like?

4. What are some good tools for computer science students?
Here is John Regehr's answer, but this is just one answer and it is given with a pretty literal interpretation of "tool".

5. Refining from the tools/methods to principles, what are some good principles for "computer science" thinking?

Related links

I really liked "You are your tools" post by Daniel Lemire.


Intuition pumps. 

How to go 10X. 

Good engineers use good tools. 

Teach Yourself Programming in Ten Years.

Monday, March 26, 2018

Replying to why decentralization matters

Last month Chris Dixon published "Why decentralization matters".

I felt it was one-sided and lacked solid arguments. I picked up some populist vibes as well.
"There go my people. I must find out where they are going, so I can lead them." --Alexandre Auguste Ledru-Rollin

At that time, I had jotted down my criticisms to some of the paragraphs in that article in my draft posts folder. Then I saw Todd Hoff's (from High Scalability fame) brief write up about the article. It captures nicely my reactions to the article. So I will start with Todd's response and cover only the remaining parts of my criticisms in the rest of this post.
"Nerds always dream of decentralization, I certainly do, but every real world force aligns on the side of centralization. We still have NAT with IPv6! Ironically, the reason given why decentralization will win is exactly why it won't: "Decentralized networks can win the third era of the internet for the same reason they won the first era: by winning the hearts and minds of entrepreneurs and developers." Decentralization is harder for developers, it's harder to build great software on top of, it's harder to monetize, it's harder to monitor to control, it's harder to onboard users, everything becomes harder. The advantages are technical, ideological, and mostly irrelevant to users. Wikipedia is crowd-sourced, it's in no way decentralized, it's locked down with process like Fort Knox. Decentralization is great for dumb pipes, that is the original internet, but not much else. Cryptonetworkwashing can't change architectures that have long proven successful in the market."

The myth of the decentralized Internet

Chris says: "During the first era of the internet, internet services were built on open protocols that were controlled by the internet community."

Open protocols do not imply a decentralized system.

Also the myth of decentralized Internet (or early Internet) needs to die already. "The Internet is better viewed as being distributed, both in terms of technologies and governance arrangements. The shift in perspective, from decentralised to distributed, is essential to understand the past and present internet, and to imagine possible future internets which preserve and support the public good."

The excellent book "Where Wizards Stay Up Late: The Origins Of The Internet" tells the story of how ARPA and the involved universities built the origins of Internet and captures really well the thinking/perspectives at that time.

Render unto GAFA

Chris says:
> During the second era of the internet for-profit tech companies, Google, Apple, Facebook, and Amazon (GAFA), built software and services that rapidly outpaced the capabilities of open protocols.
Yes, this is because [logically] centralized coordination is effective and efficient.

> For 3rd parties, this transition from cooperation to competition feels like a bait-and-switch. Over time, the best entrepreneurs, developers, and investors have become wary of building on top of centralized platforms. We now have decades of evidence that doing so will end in disappointment. In addition, users give up privacy, control of their data, and become vulnerable to security breaches. These problems with centralized platforms will likely become even more pronounced in the future.
The first part is a fair point (below I talk a bit about what could be a better model). But I am not sure I agree with the second part. Centralized services provide easy to use platforms for third parties to reach to millions of people that use the service. IoS appstore, Android appstore, AWS, Facebook, and Google ad platforms are examples. These platforms are sharing a great resource/platform to bootstrap and scale to millions to the 3rd parties, and therefore can afford to leverage on that and be whimsical and demanding. This comes with freemarket cut-throat capitalism. I am very disappointed (but not surprised) with how this is abused and I hope we can converge to a better model that prevents the abuses. Hopefully some of the ideas from the decentralized systems can be adopted to help converge to that model.

That said, I don't think decentralized systems would be immune to abuses and tyranny, only the type of power wielders will change.

> Decentralized systems start out half-baked but, under the right conditions, grow exponentially as they attract new contributors.
It is still unclear to me how a decentralized network can bootstrap easily. Decentralized does not automatically resolve the bootstrapping and traction problem, right? Why would I join a decentralized system in the beginning? I don't know how many Byzantine nodes are out there? At the beginning of the network, it is easy to have a Byzantine gang trap. I think the cryptocurrencies play to the greed and FOMO of the people while there is hype in this field. Initially it is cheap to join, and there is a chance to bank on by being the early adapter and make money off the late comers. But that works only if this particular thing/platform becomes popular. That is not a sustainable model. For the dozens of cryptocurrencies that become popular, we have thousands of cryptocurrencies that tanked which are missed. I think we have a case of survivorship bias in evaluating blockchain bootstrapping process.

Appeal of the emotional appeal

Chris says: "Decentralized networks can win the third era of the internet for the same reason they won the first era: by winning the hearts and minds of entrepreneurs and developers."

If you  make an emotional appeal to substitute for a logical argument, you are not doing well. On a side note, recently I had watched a speaker at an important venue that made the following appeal for decentralization: "The cloud took away our personal computers and it is trying to move us back to the mainframe age. With blockchain and peer-to-peer computing we will take back the control to our computers." Oh, please, cry me a river.

As for the response to Chris, Todd's assessment is right about the developers' perspective: "Decentralization is harder for developers, it's harder to build great software on top of, it's harder to monetize, it's harder to monitor to control, it's harder to onboard users, everything becomes harder. The advantages are technical, ideological, and mostly irrelevant to users."

Since the [logically] centralized cloud model has been around for a long time, it has done a lot of improvements in its developer support. Thanks to the logical centralization, the model is also easier for the developers to build on and provide high-availability and virtually unlimited high-scalability as well. But for the decentralized systems, the current support is bad, and there is no indication it can improve fast quickly. 

I think what attracts developers to fully-decentralized platforms currently is not that their minds and hearts are won, but it is the novelty, curiosity, and excitement factors. There is an opportunity to make a winning in this domain if you are one of the first movers. But due to the volatility of the domain, lack of a killer application as of yet, and uncertainty about what could happen when regulations hit, or the bubble burst, it is still a risky move. 

> Cryptonetworks combine the best features of the first two internet eras: community-governed, decentralized networks with capabilities that will eventually exceed those of the most advanced centralized services.
This is another big unsubstantiated claim. Chris says this as a matter of fact and as inevitable without providing any justification for it. I am not convinced that a fully decentralized architecture can be made to exceed the capabilities of [logically] centralized cloud services. Instead I think the centralized solutions can quickly copy the best parts of the technology and outpace decentralized solutions. Paypal can just implement the smartcontracts as the good idea, and still do this as a centralized solution, and avoid all the problems of decentralization. Maybe they can throw in attestation and verifiable claims exchange, which they can more easily implement in their logically-centralized platform. Linked-in may implement attestation and identity services. And identity services will come to social network platforms inevitably/eventually, given the current rate of manipulations and propaganda going on. Dropbox can easily add content-based addressing and leave IPFS in obscurity. Kickstarter can easily implement the token/ICO idea as a better way to support projects with stakeholders.

> Today’s cryptonetworks suffer from limitations that keep them from seriously challenging centralized incumbents. The most severe limitations are around performance and scalability. The next few years will be about fixing these limitations and building networks that form the infrastructure layer of the crypto stack. After that, most of the energy will turn to building applications on top of that infrastructure.
As for the efficiency limitations of decentralized systems, it may not be possible to fix the limitations of decentralized systems. Decentralized is at an inherent disadvantage from a coordination perspective.

What is a better model for preventing abuses with centralized services?

The hand of the free market helps somewhat for keeping centralized services straight: when one big corporation starts playing foul and upsets the users, new companies and startups quickly move in to disrupt the space and fill in the void. But the hand of the free market moves slowly.

This is not exactly my domain of expertise, but here are some things that may help prevent abuses from centralized services.

1. The services can provide some open core protocols/contracts/APIs that won't change. This will give these services an edge for adoption by developers compared to others. Maybe they may even make these enforcable by law with some penalty. Cloud providers offer SLAs, maybe they should offer these contracts to developers as well.

2. Users and developers can be made stakeholders via initial token offerings. Why can't a centralized service not be able to offer tokens? As long as there is a real cost of getting a token (the user and developer makes some contribution to the growth of the platform) and there is eventually going to be real value associated with a token, this incentive economy can also work for centralized services. Badges are a form of token, and it worked OK for a long time for some services for FourSquare for example. Why not take this to the next level with some cryptotokens, and tokens that can be exchanged/sold on and off the platform?

3. The services may be build closer to Unix philosophy of specialized components with better defined inputs/outputs. This can enable third party services to integrate transform. And collaboration and cooperation may bring win-win business to each service involved as well. Why do the services need to be close walled gardens?

4. Finally better regulations from law-makers can help, such as EU's laws about customer privacy and the right to be forgotten.

I realize number 3 is a bit far-fetched. But if we could have it, we could use our contributions and our data to have some leverage over the services. I had thought about this problem for the smartphone crowdsourcing model back in 2013, but did not follow up because it is a very ambitious undertaking.

Other MAD questions

1. What if the decentralized approach wins a decisive victory quickly, what would that look like?
I find this unlikely but let's say this is a black swan event: somehow the decentralized approach click really well with developers and users alike and takes off. And we see more and more of the blockchain based services. What does that world look like? Does that mean there are more stakeholders in hit services. I am all for spreading the wealth. If blockchain wins, is it automatically guaranteed that we have more stakeholders and more accountable services. Or would we still see relatively few number of big money stakeholders dominating the ecosystem? How do you convince a skeptical this would not be the case?

2.  What are some very good applications for PoW blockchains? (I had the same question for IPFS recently.) Chris doesn't give any killer application, but insists that the applications will come. Even if we still miss the killer application, at this point we should be able to name some reasonably good fit applications there, and name applications from the general territory. Here is a checklist to eliminate some answers in advance.
  • It is store of value against the devaluating currency of developing countries. (They have been using gold, USD, Euro for that for many decades thank you.)
  • It is for international money transfer. (Does it beat Paypal, Western Union, banks, etc., significantly in the features that consumers value the most?)
  • But blockchains have this feature and that feature. (What is the killer application? Let's talk applications not features.)
And while we are at it, I repeat Todd's plea. Let's not call Wikipedia a decentralized system ever again. Crowdsourced does not imply decentralized, and certainly not the permissionless fully-hyperbolically-decentralized systems suggested in blockchain PoW approach.

Related links

The comments under Chris's tweet includes a lot of interesting counterpoints and very valid concerns.

The comments under Chris's medium blog post also has several counterpoints presented.

Blockchain applications in Supply Chain Management

Paper review. IPFS: Content addressed, versioned, P2P file system

Paper review. Blockchains from a distributed computing perspective

Saturday, March 24, 2018

Blockchain applications in Supply Chain Management

Here is the whitepaper that started this post, which is admittedly outside my domain of expertise. Also here is some discussion we had on this on Twitter. 

The white paper includes quotes like "7 in 10 consumer industry executives expect to have a blockchain production network in 3 years" and "Blockchains are expected to greatly reduce nine frictions. Including: inaccessible marketplaces, restrictive regulations, institutional inertia, invisible threats, imperfect information, inaccessible information."

Man, if that last statement is true, the companies should just employ blockchain and fire their CEO's and management teams as they are no longer needed. Blockchain is a miracle worker indeed. It enters inaccessible marketplaces, it loosens restrictive regulations, eliminates institutional inertia, makes visible the invisible threats (watchout ninjas), corrects/curates the imperfect information, and makes all information accessible.

I know that I tend to go hypercritical when I am analyzing some new idea/application. It is a vocational disease. On another day, I may try to write a more positive article about blockchain applications in supply chains. But, since there are already many overly optimistic articles on the blockchain applications in supply chains, this is the article you get from me today.

Do you need blockchains for supply-chain management?

There is this nice paper, titled "Do you need a blockchain?", that explains the application context blockchains would be useful for. This figure is really nice. I will keep this handy here, and refer blockchain fanatics to this figure.

And this table provides a nice comparison of your options as well.

As a case study the paper looks at supply-chain management (SCM). In SCM, the flow of materials and services required in manufacturing a given product is managed. This includes various intermediate storage and production cycles across multiple companies. Consider how complex the SCM for Walmart is. And, while many people don't realize this, military SCM problems rival or even exceed that of giant corporations such as Walmart. (Military is the biggest consumer of RFIDs behind Walmart.) If the military screws up their SCM logistics that may mean life and death. Due to misallocation of resources, some troops in deployment may suffer while some others would have surplus food/items that may need to be thrashed. (Don't search for military applications of blockchain. I searched for it and regretted reading superficial first-mover articles hyping blockchain use for military supply chain management.)

Of course an immediate issue facing the use of blockchains for SCM is that too much transparency could be a bad thing for business strategy (even more dangerous for military strategy).

Here is another great takeaway from the "Do you need a blockchain?" paper.  Harkening back to Figure 1, what is the nature of trust to the writer of the blockchain entry? How is integrity enforced at entry time (for recording the type, quantity, and condition of the goods being shipped)? If we trust the writer, what is the benefit of the blockchain system, let's replicate it across some append-only stores (that is how accounting does ledgers anyway).
"This reasoning leaves us with the question whether all writers can be trusted. SCM has the inherent problem of the interface between the digital and the physical world. A human, or some machine under the control of a single writer, typically is required to register that a certain good has arrived in a warehouse, and if for example its quality is appropriate. If there is no trust in the operation of these employees, then the whole supply chain is technically compromised as any data can be supplied by a malicious writer. If, on the other hand, all writers are trusted, a blockchain is not needed as a regular database with shared write access can be used instead. Note that if through some technical means, the connection between the digital and physical world could be realized in a secure manner, then the previous reasoning might change."
If you like to avoid the too much transparency problem and have some trust in the writers (or can have leverage on them via no-payment and no-future-business threats), then instead of a permissionless/public blockchain, going with a private blockchain could be a better option. But then why not just use a database or Kafka streaming to a database. Hyperledger in its current form is pretty close to that actually. If you have trust issues with writers, run PBFT on top of that.

MAD questions

1. What are some SCM use-cases where (distributed) databases/datastores would not work, and we positively need blockchains?

2. How much data is SCM applications going to generate?
In my previous post, I wrote up my notes from Dr. Laura Haas's talk on data science challenges in food supply chains. The talk mentioned from one food testing sample for metagenomics, you can expect to see 2400 microbial species. As a result, one metagenomics file is 250 GB, and 50 metagenomics samples result in 149 workflows invoked 42K times producing 271K files and 116K graphs.

I was surprised by the size of this data. And this is only from the food safety aspect of SCM. So the data sizes SCM generate has a potential to grow big quickly and we have a big data problem in SCMs. Then how do we store this huge data on all blockchain nodes? Big data will not only overwhelm the storage resources of the nodes but also strain the network as well. Are we going to just store the hash of the big data? If so, why can't we store the hash in an append-only multiversion distributed datastore and call it a day.

Moreover, don't you need to run some analytics on this data? So you will need a database/data store any how. Instead of investigating ways to do SCM with blockchains, wouldn't it make more sense to look for ways to enforce existing database solutions with provenance/attestation? (I have been hearing about provenance work in databases but never had time to learn about them.)

3. How do we teach blockchains to keep secrets?
For SCM, organizations would not like other parties to know the information about goods being transported in order not to lose their strategic advantage. I guess there are some techniques to anonymize information and some blockchains use them. But given that this information is permanently stored, how hard is it for an incentivized third party to data-mine (in machine learning sense, not in blockchain sense :-) this information and de-anonymize the information.

4. One chain to rule them all?
If we use blockchains for SCM, how do we get the public miners on board? I guess we give some tokens via an ICO, and ask them to mine for more tokens. People say blockchains facilitate crowdfunding new initiatives and I certainly appreciate some of the advantages ICOs bring. On the other hand, if done responsibly (and not just playing to people's greed) I am sure this won't be as easy as it sounds. Would there be dozens of SCM blockchains? Would it be possible to maintain a healthy number of diverse miner pools? When things start to level off would we end up with blockchain graveyards? A responsible startup should also have an end-of-life plan, right?

Thursday, March 22, 2018

Anatomical similarities and differences between Paxos and blockchain consensus protocols

Now that we read through some blockchain papers in our seminar, I started to develop some understanding of the "blockchain way". Since I already know about the "Paxos way", I thought it would be instructive and fun to compare & contrast Paxos and blockchain consensus protocols. If you know about either of these protocols well, this post can help you get a headstart on learning the other one.

I will be referring to the below Paxos figure for the protocol phases, so let's pin it here.

Leader election: silent vs loud

A couple days ago I tweeted that. That is how leader election is done in blockchain consensus protocol. The first miner to solve the puzzle first becomes the leader for the current round. For blockchain it is par for the course that for each round another leader (the one who solves the next puzzle first) serves. In other words, instead of Phase1a "propose leadership" and Phase1b "promise to follow" back-and-forth communication among the nodes, blockchain uses a proof-of-work puzzle to elect a leader. What if multiple miners solve the puzzle at about the same time? Well, see the below section for that comparison.

For Paxos, the leader election is done via Phase 1. A leader candidate sends a phase 1a message with its ballot number (essentially a counter or a logical clock if you like) and try to get OKs from a majority number of participants in the quorum. Receiving one "Not OK" message is a deal breaker, because it indicates there is another leader candidate with a higher ballot number reaching to the participants.

In Paxos applications, typically we like to continue with the same leader for many rounds, actually as long as we can. So to avoid paying the cost of phase 1 repeatedly, MultiPaxos skips phase 1 and continues with iterating over phase 2 messages for the consecutive rounds. If an incumbent leader arises, the phase 1 leader election may need to happen again. An orthogonal mechanism, such as a failure detection service, is used to ensure that there are not many candidates running to become a leader at a given time, as that may interfere with the progress of the protocol. That situation is called the "dueling leaders" situation. But worry not. Even when there are multiple leader candidates, Paxos is safe as we discuss below.

Multiple leaders case: fork vs no-fork

In blockchain, the difficulty of the puzzle makes sure that the probability of having two leaders at the same time is low. When there are multiple leaders, you will have a fork. The fork is resolved eventually, within a couple more block additions because most likely one fork will grow faster (the chances of both forks growing at the same speed becomes negligible after a couple of blocks) and the participants prefer to mine on the longest chain for having a chance to receive incentive: the mining fee if they become the leader for that round. In other words blockchain provides a eventually-consistent/stabilizing consensus protocol.

In Paxos, a failure detector service (which relies on heartbeats and often has some randomized backoff implementation) often helps with converging the number of leaders/candidates to 1. However, Paxos has the built-in mechanism (with the ballot numbers and majority acknowledgment via Phase 1 and Phase 2) that ensures safety of agreement even when there are multiple active leaders/candidates. If a participant is aware of an alternative candidate/leader with a higher ballot number, it sends a "Not OK" message to the incumbent leader. The incumbent leader waits for receiving an acknowledgment from a majority of participants to commit a value. If it receives a "Not OK" message, it needs to go back to Phase 1 to duel in the leader election again. On the other hand, it is safe for the incumbent leader to commit a value after a majority acknowledgment is received because even when the incumbent leader is dethroned, the newcomer is bound by Paxos protocol rules to repropose and accept the same value as the consensus decision. Therefore, even when there are multiple leaders, you won't have forks in Paxos, because only one of them--if at all-- will be able to complete Phase 2 and commit a value.

Communication in Accept phase: 1-way vs 2-way 

The blockchain protocol employs communication only once and only in one way, from the leader to the participants, and this corresponds to the phase 2a of Paxos: the Accept message Phase.

Since the blockchain consensus protocol skips Phase 2b and Phase 3, it provides unavoidably a probabilistic consensus.(I had talked about why this is the case for leader based distributed consensus such as Paxos in this previous post.) Since blockchain consensus is inherently probabilistic, you need to wait 6 more blocks to see if that block sticks. At that point the work required to rewrite history is so huge that you can conclude the block is finally committed.

For phase 2 in Paxos, the leader has a two-way acknowledged communication. The two-way communication works OK for the leader because the number of the participants is typically less than half a dozen (for the reasons explained below).

On the other hand, if 1000s nodes participated in a Paxos protocol, the leader would not be able to survive receiving 1000s of ack messages for each round. (Well, I guess it could have, but the phases would take several seconds than the typical 10s of milliseconds.) To avoid this problem, with 1000s of participants, the Blockchain has only one way broadcast communication, which is propagated from the leader to the participants via a gradual gossip protocol.

So to recap, and to refer back to the figure, here is what we have. Blockchain does a silent Phase 1 (via proof-of-work) and follows that with only Phase 2a.

The environment: public vs private

The typical use case of Paxos is to replicate state to make it consistently and safely survive crash failures in a private environment. So keeping the number of participants small (around 5 nodes) is OK for this use case.

The blockchain applications bring new constraints and requirements to the consensus problem. In blockchains, the participants can now be Byzantine, motivated by financial gains. So it is not sufficient to limit the consensus participants to be 3-5 nodes, because the rest of the network does not necessarily trust these nodes. In Blockchains, for reasons of attestability and tolerating colluding groups of Byzantine participants, it is preferred to keep the participants at 1000s to 10,000s. Thus the problem becomes: How do you design byzantine tolerant consensus algorithm that scales to 1000s of participants?

Moreover, the consensus problems solved are also slightly different. The Bitcoin-NG paper provided a nice formulation of the blockchain consensus problem in its model section.

Comparing the Nakamoto consensus problem with the classic Byzantine consensus problem is very instructional. Nakamoto consensus has probabilistic guarantees for Termination, Agreement, and Validity, whereas the classic Byzantine Consensus (which a Byzantized version of Paxos solves) has deterministic guarantees for them. (To keep the discussion simple in this post, I didn't get into a byzantine tolerant Paxos protocol, which is more complicated. By utilizing randomization/probabilities, blockchain has a really simple protocol --I have to give it to it.)

MAD questions

(This section is here due to my New Year's resolution.)
1. Well, this entire post started as a result of a MAD question: "What if I try to map Blockchain protocol components to that of Paxos protocol components?" So I chalk that up as my first MAD question :-)

2. Now that we have this mapping, is there a way to leverage on this to synthesize a new insight? What is the next step? Sometimes providing such a mapping can help give someone a novel insight, which can lead to a new protocol.

Tuesday, March 20, 2018

Leveraging data and people to accelerate data science (Dr. Laura Haas)

Last week Dr. Laura Haas gave a distinguished speaker series talk at our department. Laura is currently the Dean of the College of Information and Computer Sciences at University of Massachusetts at Amherst. Before that, she was at IBM for many years, and most recently served as the Director of IBM Research’s Accelerated Discovery Lab. Her talk was on her experiences at the Accelerated Discovery Lab, and was titled "Leveraging data and people to accelerate data science"

Accelerated Discovery Lab

The mission of the lab was to "help people get insight from data -- quickly".
The lab aimed to manage the technology and data complexity, so that clients can focus on solving their challenges. This job involved providing the clients with:

  1. expertise: math, ml, computing
  2. environment: hosted big data platforms
  3. data: curated & governed data sets to provide context
  4. analytics: rich collection of analytics and tools.

By providing these services, the lab "accelerated" around 30 projects. The talk highlighted 3 of them:

  • Working with CRM and social media information, which involved managing complex data.
  • Working with a major seed company, which involved analyzing satellite imaging, and deciding what/how to plant, and choosing the workflows that provide auditability, quality, accuracy.
  • Working with a drug company, which involved "big" collaboration across diverse teams and organizations.

The Food Safety Project

The bulk of the talk focused on the food-safety case-study application that the lab worked on.

Food safety is an important problem. Every year, in the USA alone food poisoning affects 1/6th of people, causes $50 million ilnesses, 128K hospitalizations, 3K deaths, and costs 8 billion dollars.

Food poisoning is caused by contaminants/pathogens introduced in the supply-chain. Laura mentioned that the state of the art in food testing was based on developing a suspicion and as a result testing a culture. This means you need to know what you are looking for and when you are looking for it.

Recently DNA sequencing became affordable & fast and enabled the field of metagenomics. Metagenomics is the study of genetic material (across many organisms rather than a single organism) recovered directly from environmental samples. This enabled us to  build a database of what is normal for each type of food bacteria pattern. Bacteria are the tiny witnesses to what is going on in the food! They are canary in the coal mine. Change in the bacteria population may point to several pathologies, including lead poisoning. (Weimer et al. IBM Journal of R&D 2016.)

The big data challenge in food safety

If you have a safe sample for metagenomics, you can expect to see 2400 microbial species. As a result, one metagenomics file is 250 GB! And 50 metagenomics samples result in 149 workflows invoked 42K times producing 271K files and 116K graphs.
(Murat's inner voice: Umm, good luck blockchainizing that many big files. Ok, it may be possible to store only the hashes for integrity.)

Another big challenge here is that contextual metadata is needed for the samples: when was the sample collected, how, under what conditions, etc.

Data lake helps with the management tasks. A data lake is a data ecosystem to acquire catalogue, govern, find use data in contexts. It includes components such as

  • schema: fields, columns, types, keys
  • semantics: lineage, ontology
  • governance: owner, risk, guidelines, maintenance
  • collaboration: users, applications, notebooks

Since the data collected involves sensitive information, the employees that have access to the data were made to sign away rights for ensuring privacy of the data. Violating these terms constituted a firable offence. (This doesn't seem to be a rock solid process though. This relies on good intentions of people and the coverage of the monitoring to save the day.)

The big analytics challenge in food safety

Mapping a 60Gb raw test-sample data file against a 100Gb references database may take hours to days! To make things work data and references files keep getting updated.

The lab developed a workbench for metagenomic computational analytics. The multipurpose extensible analysis platform tracks 10K datasets and their derivation, performs automatic parallelization across compute clusters, and  provides interactive UI and repeatibility/transparency/accountability. (Edund ibm journal 2016)

The big collaboration challenge in food safety

4 organizations worked together on the food safety project: IBM, mars petfood company (pet-food chains were the most susceptible to contamination), UC Davis, and Bio-Rad labs. Later Cornell also joined. The participants spanned across US, UK, and China. The participants didn't talk the same language: the disciplines spanned across business, biology, bioinformatics, physics, chemistry, cs, ml, stat, math, and operations research.

For collaboration, email is not sufficient. There are typically around 40K datasets, which ones would you be emailing? Moreover, emailing also doesn't provide enough context about the datasets and metadata.

To support collaboration the lab built a labbook integration hub. (Inner voice: I am relieved it is not Lotus Notes.) The labbook is a giant knowledge graph that is annotatable, searchable, and is interfaced with many tools, including Python, Notebooks, etc. Sort of like Jupiter Notebooks on steroids?

Things learned

Laura finished with some lessons learned. Her take on this was: Data science needs to be holistic, incorporating all these three: people, data, analytics.

  • People: interdisciplinary hard, social practices/tools can help
  • Data: data governance is hard, it needs policies, tools
  • Analytics: many heterogenous set of tools need to be integrated for handling uncertainty

As for the future, Laura mentioned that being very broad does not work/scale well for a data science organization, since it is hard to please every one. She said that the accelerated discovery lab will focus on metagenomic and materials science to stay more relevant.

MAD questions

(This section is here due to my New Year's resolution.)

1. At the question answer time for the talk, I asked Laura about IBM's recent push for blockchains in food safety and supply-chain applications. Given that the talk outlined many hard challenges for food safety supply chains, I wanted to learn about her views on what roles blockchains can play here. Laura said that in food supply-chains there were some problems with faked provenance of sources and blockchains may help address that issue. She also said that she won't comment if blockchain is the right way to address it, since it is not her field of expertise.

2. My question was prompted by IBM's recent tweet of this whitepaper which I found to be overly exuberant about the role blockchains can play in the supply-chain problems.( Below is the Twitter exchange ensued after I took issue with the report. Notice how mature @IBMServices account is about this? They pressed on to learn more about the criticism, which is a good indicator of intellectual honesty. (I wish I could have been a bit more restrained.)

I like to write a post about the use of blockchains in supply-chains. Granted I don't know much about the domain, but when did that ever stop me?

3. This is a little unrelated, but was there any application of formal methods in the supply-chain problem before?

4. The talk also mentioned a bit about applying datascience to measure contributions of datascience in these projects. This is done via collaboration trace analysis: who is working with who, how much, and in what contexts? A noteworthy finding was the "emergence of new vocabulary right before a discovery events". This rings very familiar in our research group meeting discussions. When we observe an interesting new phenomenon/perspective, we are forced to give it a made-up name, which sounds clumsy at first. When we keep referring to this given name, we know there is something interesting coming up from that front.

Sunday, March 18, 2018

Paper review. Enhancing Bitcoin Security and Performance with Strong Consistency via Collective Signing.

This paper appeared in USENIX Security in 2016, and is by Eleftherios Kokoris Kogias, Philipp Jovanovic, Nicolas Gailly, Ismail Khoffi, Linus Gasser, and Bryan Ford at EPFL.

The link includes the conference presentation video which is useful. Kudos to USENIX for providing this. Other organizations should take a hint. It is 2018-- how hard is it to make the conference videos and presentation material available online?

The problem

Bitcoin does not provide instant irreversibility. Bitcoin protocol needs 6 consecutive blocks to be appended to conclude the irreversibility of a block with very high probability. A useful analogy here is to imagine the 6 additional blocks trapping the original block in amber layers. After that the adversaries don't have the computing power to go back 6 blocks to rewrite history and catch up and beat the current longest chain.

Instant irreversibility would be helpful, because it would save you from having to wait 6 more blocks to be added (which amounts to 1 hour in Bitcoin's block mining rate) to finalize a transaction.

The key idea

To provide instant irreversibility in a PoW-based blockchain, the paper proposes the Byzcoin protocol which employs practical byzantine consensus protocol (PBFT) over a special committee/council of nodes.

Byzcoin assembles the special council group by populating it with the PoW winners in the last 24 hours, and updates this council in a sliding window fashion. In contrast to Bitcoin which employs PoW for electing a leader that has the ultimate say on the next block, in Byzcoin PoW is used for electing members to the council which collectively endorses the next block. For consensus the council runs PBFT and signs the block with their blessing. This makes the block instantly irreversible. The instant irreversibility works provided that the council has less than 1/3 byzantine nodes.

The 24 hour window-frame for assembling council members from is chosen because for a mining rate of 10 minutes per PoW, a 24 hour period corresponds to 144 PoW winners. A smaller number of nodes in the council would be problematic, because there would still be a non-insignificant probability that more than the 1/3 of selected members to the group is Byzantine, even though in the larger population byzantine nodes are less than 1/3. The rewarding for mining a PoW puzzle is done slowly over the window frame, and not immediately when miner mines a keyblock. This way the miner is incentivized to stay and serve as part of the council to collect the entire reward.

To enhance throughput, Byzcoin adopts Bitcoin-NG's microblocks idea.

To improve efficiency of collective endorsement of blocks, Byzcoin employs using Shnorr signatures for scalable collective signing and public validation of blocks. Collective signing reduces both the costs of PBFT rounds and the costs for light clients to verify transaction commitment.

ByzCoin implements Byzantine consensus using collective signing rounds to make PBFT's prepare and commit phases scalable. Once a miner creates a new keyblock, it forms a CoSi communication tree for collective signing with itself as the leader.  Collective signing enables the leader to request publicly validated statement through Schnorr multi signatures with communication trees that are used in multicast protocols for scalability purposes.

In the original PBFT protocol, the trustees authenticate each other via non-transferable symmetric-key MACs: each trustee must communicate directly with most other trustees in every round, thus yielding O($n^2$) communication complexity.
By replacing MAC-authenticated communication with digital signatures, and employing scalable collective signing over multicast communication trees, Byzcoin reduces per-round communication complexity further to O(log n) and reduces signature verification complexity from O(n) to O(1).

Increased attack surface?

I think Byzcoin creates a problem by increasing the attack surface. It gives a period of 24 hours (or due to the sliding window, most likely less than that) for the attackers to conspire and buy council members.

As recent studies show Bitcoin is not very decentralized. There are heavy players mining good fraction of the blocks. Within a frame of many hours, it may be possible for the known members of the council to conspire to commit a heist.

Of course you cannot invent transactions on behalf of others because you don't have their private keys, but when council members conspire such that the number of Byzantine nodes increase of 1/3rd of the council, they may rewrite history and pull off some double-spending transactions. Or, maybe they are motivated not by profit but by another motive such as making the system nonoperational for some duration.

MAD questions

1. What is the maximum damage a Byzantine council can do? I am not savvy about the type of exchanges and smartcontracts deployed in blockchains in practice. Especially, since I haven't read much about smartcontracts yet, I am still unclear about what could be the worst heist a Byzantine council can pull.

2. How does Elastico compare with Byzcoin? Elastico paper and Byzcoin papers came out around the same time, so they do not cite each other. Elastico also used councils and run PBFT in the councils as part of the sharding protocol it implemented. Elastico did not claim instant-irreversibility. While it probably comes close to achieving it, there is still a short probabilistic reversibility window remaining in Elastico. Byzcoin provides instant-irreversibility thanks to the council running PBFT to endorse blocks to be added to the chain.

Friday, March 16, 2018

Paper review. A secure sharding protocol for open blockchains.

This paper appeared in ACM CCS'16. It is authored by Loi Luu, Viswesh Narayanan, Chaodong Zheng,  Kunal Baweja, Seth Gilbert, and Prateek Saxena.

Here is a video of the conference presentation.

The problem

The Bitcoin transaction throughput does not scale. Bitcoin's PoW blockchain consumes massive computational power yet can only process up to 7 transactions per second.

The paper proposes, Elastico, a new distributed agreement protocol, based on a non-PoW Byzantine consensus protocol, for permissionless blockchains.

The challenge is that classical/non-PoW Byzantine consensus protocols do not work in an open environment:

The main idea

The key idea in Elastico is to partition the network into smaller committees, each of which processes a disjoint set of transactions (or a "shard"). The number of committees grows linearly in the total computational power of the network. Each committee has a reasonably small number of members, around 100, so they can run a classical byzantine consensus protocol to decide their agreed set of transactions in parallel.

Sharding protocols are commonly used in distributed databases and in cloud infrastructure in trusted environments. Elastico provides the first sharding protocol for permissionless blockchains tolerating a constant fraction of byzantine network nodes.

The model

The agreement property is a relaxation of the original byzantine consensus problem. The "agreement" property allows the honest processors to be in "probabilistic agreement" such that processors agree on a value with some high probability, rather than being in exact agreement.

This probabilistic part comes from step 5 of the algorithm below. There is still a proof of work component in Elastico, hence, the agreement is probabilistic.

The Elastico protocol

  1. Identity establishment and committee formation: each node uses IP, public key and PoW to locally generate an identity. 
  2. Overlay setup for committees: Nodes communicate to discover identities of other nodes in their committee. A directory committee is formed to do this more efficiently, which entails more details.
  3. Intra-committee consensus: Nodes run a standard byzantine agreement protocol, PBFT, within their assigned committees to agree on a single set of transactions.
  4. Final consensus broadcast: The final committee computes a final block from all the values received from other committees by running PBFT and broadcasts it to the network.
  5. Epoch randomness generation: the final committee runs a distributed commit-and-xor scheme to generate an exponential based but bounded set of random values. These are broadcast and used in the PoW in the next epoch.

The results

Due to its use of committees, Elastico is expected to scale transaction rates almost linearly with available computation for mining: the more the computation power in the network, the higher the number of transaction blocks selected per unit time. The scalability experiments are done on Amazon EC2 with up to 1,600 nodes and confirm the theoretical scaling properties:

"With the same network implementation as in Bitcoin, the scale up (blocks per epoch) for 100, 200, 400, 800 and 1,600 nodes with equal computational power 2 are as theoretical expectation, namely 1, 1.89, 3.61, 6.98 and 13.5 times respectively. Finally, Elastico’s clean-slate design decouples the consensus from block-data broadcasts, hence the bandwidth spent by each node remains almost constant, regardless of the size of the network. Our simulations are necessarily on a smaller scale than Bitcoin; however, if we project our results to a full deployment to a network of Bitcoin’s scale, we can expect a scale up of 10,000 in the number of agreed values per epoch. This agreement throughput is 4 orders of magnitude larger than Bitcoin's."

The related work

Recently Bitcoin-NG, Aspen, and ByzCoin are also related work. They also satisfy the same properties with Elastico in that table.

MAD questions

1. Which is more secure: a probabilistic PoW blockchain or Elastico with the "#byzantine<1/3" assumption?

This may be a false dichotomy, because even in the PoW case the byzantine nodes is assumed to be less than 1/3rd of the network to avoid selfish mining attacks. Ok, given that, let's try to analyze further. In either case the leader has limited power: it cannot invent transactions for others but can only decide on whose transactions to include or not. So only double-spending attack can be performed. The PoW does a good job of it if you wait for 6 more blocks to be added to consider a transaction to be finalized/irreversible. But for Elastico, if the "byzantine<n/3" is violated it is easier to do the double spending attack because there is no PoW and 6 blocks rule to guard the chain against it.

2. The agreement in Elastico is probabilistic, because there is still a PoW component in step 5 of the algorithm: Epoch Randomness Generation. Does that mean Elastico does not provide *instant-irreversibility* of the chain and history can rewritten? Even when the byzantine ratio is less than 1/3?

I didn't see this discussed in the paper. Elastico does not use PoW to select leader that adds a block, but rather select committees. So Elastico may indeed be providing instant-irreversibility of an added block, when byzantine<1/3.

On the other hand, maybe there is a slight chance for violating instant-reversbility. What if a different set of nodes are chosen for the committees which do not have a memory of the last block in the log? There may be a slight chance to pull this off by selecting enough number nodes to whom the last block broadcast has not reached yet. The Byzcoin paper, which I will summarize later, provides instant-irreversibility using a more cleaner approach.

3. Why do we need the committees to run PBFT? Aren't they just putting together transactions in a block? You can do that without using PBFT provided that the transactions satisfy some integrity constraints/checks, right?

I guess this is just to make sure that the block produced is the work of a group, and not by just one individual. Otherwise, a byzantine individual node masquerading the group can unduly influence the outcome. So even when byzantine<1/3, with collusion of such byzantine nodes, agreement can be violated.

4. This is more of a nitpick than a question. It looks like the paper could have provided a more clear discussion on the bound on f: byzantine nodes. At one point the paper says: "Here, 1/4 is an arbitrary constant bounded away from 1/3, selected as such to yield reasonable constant parameters."  Then later it amends: "Note that we select f = 1/4 in order to achieve a practical value of committee size. Theoretically, Elastico can work with any f less than 1/3 by increasing the committee size c accordingly to f. The 1/3 bound is because we need to run a consensus protocol (e.g., PBFT) at every committee in Step 3, which can tolerate at most 1/3 fraction of malicious committee members.)"

5. How does Aspen compare with Elastico?
Aspen also has parallel tracks/channels for processing blocks. In Aspen parallel tracks are dedicated to specific channels. In Elastico parallel track sharding is mostly for improving throughput maybe sharding with user-ids. Elastico provides improvements in faster irreversibility. On the other hand, Aspen's sharding protocol is much simpler/cleaner than Elastico's.

Wednesday, March 14, 2018

Paper review. Service-oriented sharding with Aspen

This week in our seminar, we discussed this linked whitepaper/shortpaper from Adem Efe Gencer, Robbert van Renesse, Emin Gün Sirer.

The problem

Blockchains provide trustless auditability, tamper-resistance, and transparency due to the face of Byzantine participants.  As such there is a community of users that would like to use blockchains to record in an authoritative manner transactions/provenance for many assets (in addition to the prevalent use of cryptocurrencies), such as gold, silver, diamonds, gems, land records, deeds, mortgages, boat titles, fine art, health records, and domain names.

The simplest approach address this demand is to layer these additional blockchains on top of an existing secure blockchain such as Bitcoin. The OP_RETURN opcode is adopted for this purpose, and its use has been increasing rapidly. However, this is not scalable, as it leads to a stream of costly and burdensome transactions. And Bitcoin blockchain is already overburdened and crawling.

The other approach is to create a dedicated, specialized, standalone blockchain for each asset. But that approach suffers from the lack of mining power. How do you bootstrap each blockchain? (That is not very easy problem, especially initially when the number of miners is low, how do you know Byzantine miners are in the minority? At least when piggybacking on BitCoin we are assured that the blockchain had sufficient mining power to withstand attacks.) Moreover, even if it is possible to create a separate thriving blockchain ecosystem for each asset, that would be very wasteful. Given that Bitcoin mining consumes more electricity than many countries in the world does, what would be the cost of having so many different chains? Pretty please, let's not burn down the planet.

The main idea

The proposed protocol, Aspen, securely shards the blockchain (say a popular blockchain like Bitcoin) to provide high scalability to the service users.

Sharding is a well-established technique to improve scalability by distributing contents of a database across nodes in a network. But sharding blockchains is non-trivial. The main challenge is to preserve the trustless nature while hiding transactions in other services of a blockchain from users participating in one service.

"Aspen employs a sharding approach that comes with the following benefits: (1) preserves the total computational power of miners to secure the whole blockchain, (2) prevents users from double-spending their funds while maintaining the same trustless setup assumptions as Bitcoin, (3) improves scalability by absolving non-miner participants --i.e. service users-- from the responsibility of storing, processing, and propagating irrelevant data to confirm the validity of services they are interested in."

The paper would benefit a lot from clarifying this point further. There are two different entities in the blockchain: the service users and the miners. The sharding in Aspen helps reduce the workload on the users, because the users can just focus the transactions on the service they are using. On the other hand, the miners need to know about the history of all the services/channels as well as their service and integration rules. The miners are the nodes that generate the keyblocks and publish service transactions as microblocks as described below.

The architecture

Aspen is an instantiation of service-oriented sharding on Bitcoin-NG. Please go read my short summary of the Bitcoin-NG paper, if you are unfamiliar with it. It is a great paper that prescribes a way to scale the throughput of a PoW blockchain.

Each channel corresponds to a service. Each channel contains 1) the global genesis block (the exact same special block for each channel) 2) all key blocks (the exact same set of blocks for all channels) and 3) the set of microblocks containing custom transactions exclusive to that service.

In other words, Aspen distributes transactions to microblocks with respect to services. The key blocks keep microblock chains together; the key blocks are not per-service, they are the same (i.e. common, shared) for all services.

If you like to look at this from a service-oriented perspective, there are two protocols for each channel/service:
1) A service protocol defines the validity of transactions in a given channel, including the syntax for each transaction type, the relationship between transactions within a channel, the size, frequency, and format constraints for blocks that keep transactions.
2) An integration protocol specifies the security, incentive mechanism, valid service numbers, the genesis block, and the inter-channel communication process between the payment channel and the other channels.

Each channel maintains key blocks to enforce the integration protocol. And each channel maintains the microblocks only to contain the service-specific transactions. Two special channels, payment (to exchange funds) and registration (to add/update services), are defined by Aspen to help bootstrap the blockchain. We discuss these two next.

Flow of funds

To prevent double spending, Aspen uses the payment channel to make each fund spendable only in a specific channel. A special payment channel transaction, funding pore, enables users to lock funds to other channels. Transfers across channels are allowed only in one way, from the payment channel to others.

Alternatively, users can directly buy locked funds at the target channel to pay for the service running on the corresponding channel. Aspen enforces a high minimum fee for serializing funding pores to (1) discourage users from bloating the payment channel and (2) improve the fungibility of funds in nonpayment channels.

As shown in Figure 3.b, a coinbase transaction in a key block provides separate outputs to compensate the current and the previous miner for each service they provision.

Service integration

Users propose protocols to introduce or update services by generating transactions for the registration channel. A service protocol is specified in a platform independent language such as WebAssembly or Lua. The miners conduct an election to choose a registration channel transaction and indicate their choice using ballots. If a particular transaction is referred by more than a certain fraction of ballots, its protocols become active.

The voting process works similar to that in Bitcoin.

MAD questions

1. It may be too much of a burden to require that the miners store/process the entire history of all services/channels, and could create a scalability barrier for the blockchain. Checkpoints (a.k.a. snapshots) can help for this as they make the state more compact: you don't need to go all the way to the origin block, instead you can refer to a recent checkpoint to validate transactions. The paper mentions that the key blocks include checkpoints, but it is unclear if these checkpoints are snapshots at all.

What could go wrong if we included checkpoints as snapshots? An obvious strawman attack is where the leader is proposing an incorrect checkpoint to misrepresent information. I guess the other miners can notice this and add a poison transaction.

I don't know if there are other measures need to be taken to incorporate checkpoints. On a philosophical point, does using checkpoints violate the transparency of blockchain? When the whole log history is available, the nodes have more information to go with. Starting from a recent checkpoint adds some obscurity to the history.

2. It may not be necessary to link all the services together with each key block. So it may be possible to relax Aspen to have specialized miners that only track certain services; when those miners mine a keyblock they would only link the corresponding tracked services together. Statistically, each service will get a miner that tracks it soon and get a key block that integrates it to other services.

I am not sure if this relaxation would lead to security problems. I guess one problem could be incompatibilities between to consequent miners; the rewarding/incentive mechanism may need to be rethought for miners of different kinds trailing each other. Another problem could be that some less popular services may attract less number of miners and high latency transactions. On the other hand, if there is way to make this work, this relaxation can help alleviate the scalability problem for the miners as the number of services grow.

Two-phase commit and beyond

In this post, we model and explore the two-phase commit protocol using TLA+. The two-phase commit protocol is practical and is used in man...