<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet href="/pub/Applications/RssViewTemplate/pretty-feed.xsl" type="text/xsl" ?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
  <title>Atomikos - Comments on The Achilles heel of the CAP theorem</title>
  <link>https://www.atomikos.com/Blog/TheAchillesHeelOfTheCAPTheorem</link>
  <description>Comments</description>
  <atom:link href="https://www.atomikos.com/Blog/TheAchillesHeelOfTheCAPTheorem" rel="self" type="application/rss+xml" />
  <dc:source>Atomikos</dc:source>
  <image>
	 <url>/pub/Main/SitePreferences/atomikos_logo.webp</url>
	 <title>Atomikos - Comments on The Achilles heel of the CAP theorem</title>
	 <link>https://www.atomikos.com/Blog/TheAchillesHeelOfTheCAPTheorem</link>
  </image>
  <language>en-us</language>
  <copyright>Copyright 2026 Atomikos BVBA</copyright>
<item>
  <title>» Only process reque ...</title>
  <link>https://www.atomikos.com/Blog/TheAchillesHeelOfTheCAPTheorem#comment27</link>
  <guid>https://www.atomikos.com/Blog/TheAchillesHeelOfTheCAPTheorem#comment27</guid>
  <dc:creator>barry</dc:creator>
  <dc:date>2009-03-28T21:45:54+01:00</dc:date>
  <description> <![CDATA[ » Only process requests when there is no partition problem.
<p></p>
Doesn't this mean that you are sacrificing availability? You've turned a partition failure into an availability failure. The requests and responses are queued, so none of them is lost, but that doesn't mean all is well. A response may take a long time to come back, which is as much of a problem as getting an error. ]]></description>
</item>
<item>
  <title>Hi Dan,
Sure have I ...</title>
  <link>https://www.atomikos.com/Blog/TheAchillesHeelOfTheCAPTheorem#comment12</link>
  <guid>https://www.atomikos.com/Blog/TheAchillesHeelOfTheCAPTheorem#comment12</guid>
  <dc:creator>Guy</dc:creator>
  <dc:date>2009-01-20T20:30:00+01:00</dc:date>
  <description> <![CDATA[ Hi Dan,
<p></p>
Sure, I have read "Impossibility of distributed consensus with one faulty process" - it is the basis of the heuristic exceptions in all two-phase commit solutions (including Atomikos).
<p></p>
However, what I am saying is that the failure usually lasts only so long, and afterward things can move on. Using the right tools to do that can help availability.
<p></p>
That is the main advantage of (persistent) queues, and that is all I am saying. Lynch et al. do not seem to exploit it as much as they could...
<p></p>
Guy ]]></description>
</item>
<item>
  <title>Have you read:
http ...</title>
  <link>https://www.atomikos.com/Blog/TheAchillesHeelOfTheCAPTheorem#comment11</link>
  <guid>https://www.atomikos.com/Blog/TheAchillesHeelOfTheCAPTheorem#comment11</guid>
  <dc:creator>PetrolHead</dc:creator>
  <dc:date>2009-01-20T20:11:00+01:00</dc:date>
  <description> <![CDATA[ Have you read:
<p></p>
<a href='http://portal.acm.org/citation.cfm?id=214121' class='natExternalLink' rel='nofollow noopener noreferrer' target='_blank'>http://portal.acm.org/citation.cfm?id=214121</a>
<p></p>
"Impossibility of distributed consensus with one faulty process"?
<p></p>
This is an important result, and it is relevant both to your comments and to the CAP theorem. Essentially, one can't tell the difference between a genuine failure and a slow-running machine or a busy network.
<p></p>
Thus your solution might work for a very small number of machines, all in a single data-centre. In larger installations, however, failures of machines, routers, switches, cables, etc. will happen several times a day, so quorums and clusters become considerably less practical and loose consistency becomes more attractive.
<p></p>
Note also that the theorem isn't just about clustered services in the traditional sense but also services that run across multiple data-centres.
<p></p>
I also have a specific observation:
<p></p>
"....note that quorum solutions exist to avoid that the complete cluster has to be up at the same time."
<p></p>
This is true, but in practice they are limited by a number of factors:
<p></p>
(1) The assumption that you will have a majority - this seems straightforward, but a partition plus the loss of a machine can leave you without a majority.
<p></p>
(2) Getting all members back into sync can require all sorts of special admin involvement, and it can go wrong.
<p></p>
(3) Performance - quorum protocols, especially across enough nodes to ensure survival, can be slow.
<p></p>
(4) Ensuring that clients don't continue to make use of the minority during a partition, where it may, for example, report out-of-date information.
<p></p>
(5) You can have a cluster capable of achieving consensus, but clients can't reach it because the network between the cluster and the clients is broken.
<p></p>
Best,
<p></p>
Dan.<br />http://www.dancres.org/blitzblog ]]></description>
</item>
</channel>
</rss>