Cool Threads

 
A couple of columns ago we touched on the practical rebirth of parallel computing. In case you missed that column (it's in this week's links), the short version is that Moore's Law is letting us down a bit when it comes to the traditional way of increasing the power of microprocessors, which is by raising clock speeds. We've hiked them to the point where processors are so small and running so hot that they are in danger of literally melting. Forget about higher clock speeds then; instead we'll just pack two or four or 1000 processor cores into the same can, running them in parallel at slower speeds. Instantly we can jump back onto the Moore's Law performance curve, except our software generally doesn't take advantage of this because most programs were written for single cores. So we looked back at the lessons of parallel supercomputers, circa 1985, and how some of today's software applies those lessons, such as avoiding dependencies and race conditions.

But we didn't really talk much in that column about the use of threads, which are individual streams of execution running within a program. Each time a program takes on a new task, it can spin off a thread for that task. If the threads are running on the same processor core they are multiplexed using time-slicing and only appear to run in parallel. But if the threads are assigned to different processors or different cores they can run truly in parallel, which can potentially get a lot of work done in a short amount of time.
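
Here's the simplest possible illustration: a Java sketch (Java is just my pick of language for these examples, and the task is a placeholder) of one thread being spun off while the main thread keeps going.

    // Minimal sketch: spin off a thread for a background task.
    public class ThreadDemo {
        public static void main(String[] args) throws InterruptedException {
            Thread worker = new Thread(() -> {
                // placeholder for the new task's work
                System.out.println("task running in " + Thread.currentThread().getName());
            });
            worker.start();   // runs concurrently with main
            System.out.println("main thread keeps going");
            worker.join();    // wait for the worker to finish
        }
    }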

Most traditional PC applications are single-threaded, meaning the only way to make them go faster without a completely new architecture is to run the CPU at a faster clock rate. Single-threaded apps are simpler in that they are immune to the dependencies and race conditions that can plague true parallel code. But they are also totally dependent on tasks being completed in strict sequence, so in that sense they can be dramatically slower or dramatically more resource-intensive than multi-threaded apps.

For an example of where single-threaded applications are just plain slower, consider Eudora, which is still my favorite e-mail client (I'm old, you don't have to tell me). Until not very long ago Eudora still couldn't send mail in background, so everything (and everyone, including the user -- me) had to wait until the mail was sent before completing anything else, like writing a new message or replying to an old one. I KNOW THIS IS NO LONGER THE CASE, SMARTY-PANTS -- THIS IS JUST AN EXAMPLE. The program was single-threaded and, since sending mail is a very slow activity, users were generally aware that they were waiting. Today Eudora sends mail in background, which is the same as saying "in another thread."

Multithreading has been great for user interactivity because nothing should ever stop the input of data from typing, mouse movements, etc.

There are many ways to use threads, and before we consider some, let's think about scale -- literally, how many threads are we talking about? To run at true clock speed we'd have only one thread per CPU core, but a fast processor can multiplex hundreds or even thousands of threads, and multi-core processors can handle even more. So the EFFICIENT shift to multi-threaded programming requires a significant change in thinking on the part of developers.

Here's an example: A hard problem in game programming is making something happen every so often. That's not very efficient to code because it traditionally requires a program loop that spins as fast as the CPU will let it (pushing the CPU to 100 percent) and keeps checking the clock to see whether it is time to do that thing yet. But threads are different: With threads you can very easily put them to sleep for any period of time, or even put them to sleep indefinitely until some event occurs. It's not only easier for the programmer, it takes almost no CPU power compared to the looping approach.
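
To see the difference, here's a Java sketch of the sleeping-thread version (the one-second interval and the "tick" are placeholders I made up):

    // A periodic task via a sleeping thread: near-zero CPU between ticks,
    // unlike a busy loop that polls the clock at 100 percent CPU.
    public class PeriodicTask {
        public static void main(String[] args) throws InterruptedException {
            Thread ticker = new Thread(() -> {
                while (true) {
                    try {
                        Thread.sleep(1000);  // costs almost nothing while asleep
                    } catch (InterruptedException e) {
                        return;              // stop if interrupted
                    }
                    System.out.println("tick: do the periodic thing");
                }
            });
            ticker.setDaemon(true);  // don't keep the program alive just for this
            ticker.start();
            Thread.sleep(5000);      // let it tick a few times, then exit
        }
    }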

How do you use threads to write an e-mail server that handles thousands of simultaneous incoming e-mails? Well, you write it as if you were writing a server that can only handle ONE e-mail at a time. Just write very simple code that knows how to accept e-mail, then test it by sending in an e-mail. It works? Cool. Now send 1,000 threads into that same piece of code. Each thread has its own state as to what e-mail (FROM, TO, SUBJECT, etc.) it is receiving, but despite the fact that the content is different, the process is exactly the same. Now you have an e-mail server capable of serving a thousand simultaneous connections.
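
In Java, that thread-per-connection shape looks roughly like the sketch below; the port number and the handling logic are placeholders, not a real mail protocol:

    import java.io.*;
    import java.net.*;

    // Thread-per-connection sketch: handle() is written as if it served
    // ONE connection; spawning a thread per accept() gives us thousands.
    public class MailishServer {
        public static void main(String[] args) throws IOException {
            ServerSocket server = new ServerSocket(2525);  // placeholder port
            while (true) {
                Socket conn = server.accept();             // one incoming connection
                new Thread(() -> handle(conn)).start();    // its own thread, its own state
            }
        }

        static void handle(Socket conn) {
            try (BufferedReader in = new BufferedReader(
                     new InputStreamReader(conn.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null) {
                    // each thread tracks its own FROM, TO, SUBJECT, etc. here
                }
            } catch (IOException e) {
                // drop the connection on error
            }
        }
    }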

See, writing multi-threaded apps may require a different approach but the benefits from doing so can be fantastic.

A new area of multi-threaded programming that is REALLY hard (hard even for developers who do normal multi-threaded programming really well) is the use of optimistic concurrency. Two columns ago I alluded to this in my example of Appistry's decision to forgo using a database for a credit card processing application. I said I would show a hack that was yet another approach to the same problem. Well, here comes the hack.

Let's consider this problem: My wife (the young and lovely Mary Alyce) and I happen to be in different parts of town, each of us standing in front of a bank ATM. I am a thread, Mary Alyce is a thread, and the bank's account database is main memory. Contrary to our usual behavior, in which we only take money OUT of the bank, we are paradoxically planning near-simultaneous deposits. Our balance starts at $1000. I am depositing $200 while Mary Alyce is depositing $300.

The ATM processes a deposit like this:

  1. Retrieve the existing balance
  2. Add the new deposit to that balance
  3. Write the new total back

The trick is that we are both doing this same thing at the same time. What if this happens:

Bob starts deposit transaction
Mary Alyce starts deposit transaction
Bob's ATM grabs the balance ($1000)
Mary Alyce's ATM grabs the balance ($1000)
Bob deposits $200 (his ATM computes a new balance of $1200)
Mary Alyce deposits $300 (her ATM computes a new balance of $1300)
Bob's ATM updates the main database with the new balance ($1200)
Mary Alyce's ATM updates the main database with the new balance ($1300)

This is a race condition. Mary Alyce's ATM wrote the balance last, so my deposit is lost completely. Depending on the timing it might still work out fine, but there are plenty of interleavings where it doesn't. And any traditional solution requires a LOT of back-end reconciliation and computation. We need something simpler.
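
The lost update is easy to see in code. Here's a Java sketch of the unsynchronized read-modify-write (with only two threads the bad interleaving is rare, but it is real):

    // Lost-update race: both threads can read the same balance, and the
    // second write then clobbers the first deposit.
    public class RacyAccount {
        static long balance = 1000;           // shared, unsynchronized

        static void deposit(long amount) {
            long read = balance;              // 1. retrieve the existing balance
            long updated = read + amount;     // 2. add the new deposit
            balance = updated;                // 3. write the total back (may clobber)
        }

        public static void main(String[] args) throws InterruptedException {
            Thread bob  = new Thread(() -> deposit(200));
            Thread mary = new Thread(() -> deposit(300));
            bob.start(); mary.start();
            bob.join();  mary.join();
            System.out.println(balance);      // usually 1500 -- but sometimes 1200 or 1300
        }
    }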

Classic concurrency control would basically LOCK the database. This is called, not surprisingly, "pessimistic concurrency." So I go up to the ATM and my ATM asks for the database to be locked on my behalf. If Mary Alyce then went to another ATM, it would tell her she has to wait because the database was being updated elsewhere. This ensures the race condition can't happen, but it also holds up Mary Alyce, who does not like to be kept waiting.
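
In code, pessimistic concurrency is just a lock around the whole transaction, as in this Java sketch:

    // Pessimistic concurrency: one lock serializes every deposit.
    // Always correct, but Mary Alyce waits while Bob holds the lock.
    public class LockedAccount {
        private long balance = 1000;

        public synchronized void deposit(long amount) {
            balance = balance + amount;   // no one else can interleave here
        }
    }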

Optimistic concurrency control says, "We KNOW there could be a race condition, but we'll add a very cheap way to detect if it occurred. And if so, we'll pay the very expensive cost of restarting one of the transactions from the beginning."

The only changes from the above sequence are in the last two lines:

Bob's ATM updates the main database with the new balance ($1200)
Mary Alyce's ATM updates the main database with the new balance ($1300)

These become:

Bob's ATM updates the main database with the new balance ($1200) so long as the current balance is still $1000.
Mary Alyce's ATM updates the main database with the new balance ($1300) so long as the current balance is still $1000.

That's easy to write but trickier to implement. The check of the balance and the write of the new balance must happen inside the processor as one atomic action -- basically a single indivisible operation. Some processors have had such instructions for decades, and they're common now; they go by names like "test-and-set" and "compare-and-swap" (also called "compare-and-set").
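
Java, for one, exposes compare-and-set through its atomic classes. Here's a sketch of the deposit written that way, including the restart for when the "so long as" check fails:

    import java.util.concurrent.atomic.AtomicLong;

    // Optimistic concurrency: read the balance, compute the new one, and
    // commit only if the balance is still what we read; otherwise retry.
    public class OptimisticAccount {
        private final AtomicLong balance = new AtomicLong(1000);

        public void deposit(long amount) {
            while (true) {
                long read = balance.get();                   // grab the balance
                long updated = read + amount;                // add the deposit
                if (balance.compareAndSet(read, updated)) {  // the "so long as" check
                    return;                                  // committed
                }
                // someone else got there first; restart the transaction
            }
        }
    }

The loop body is the whole "transaction," the compareAndSet is the cheap detection, and losing the race just means going around again.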

If the "so long as" part fails, the whole transaction must be restarted from scratch. In our example, Mary Alyce's $300 would pop back out of the machine and she'd have to start over.

That's very expensive for Mary Alyce, but in practice the race condition occurs very rarely. The redo is costly but hardly ever happens, and in exchange no one ever has to wait on another person's update.

Apply this to 25,000 ATMs and suddenly the database is decoupled from transaction processing, internal race conditions are handled cheaply, and the system can run with less code at full speed, which is saying something. Suddenly the system can be 100 times faster (cascading 10X improvements) or run 10 times faster on one tenth the hardware (take your pick), all thanks to the timely embrace of clever multi-threaded programming.
