Programming in the Age of Concurrency - Anders Hejlsberg and Joe Duffy: Concurrent Programming with | Going Deep

Microsoft is developing a number of technologies to simplify the expression of parallelism in code. An example of this work is Parallel Extensions for the .NET Framework (PFX), a managed programming model for data parallelism, task parallelism, scheduling, and coordination on parallel hardware.

PFX makes it easier for developers to write programs that take advantage of parallel hardware (you've all heard of multi-core and what the future holds with many-core...), without having to deal with the complexities of threads and locks in today’s concurrent programming story. Of course, PFX is not a concurrent programming silver bullet. There is still a great deal of work left to do in the imperative programming world's approach to concurrency. PFX is an excellent start with a syntax that .NET developers can relate to and understand.

Here, Joe Duffy, Senior Software Engineer, and Technical Fellow Anders Hejlsberg sit down with me to discuss the basics and some of the details of the managed PFX library's architecture and implementation, whiteboard included.

For more information on specific technologies, check out the PLINQ and TPL articles in the October 2007 issue of MSDN Magazine.

High res video download file here.

Follow the Discussion

Follow
Oops, something didn't work.

Try again?

Sign In to begin to follow these comments

What does this mean?
Following an item on Channel 9 allows you to watch for new content and comments that you are interested in. You need to be signed in to Channel 9 to use this feature.

Getting "follow" information

Stop following these comments

Start following these comments

What does this mean?
Following an item on Channel 9 allows you to watch for new content and comments that you are interested in and view them all on your notifications page.

Removing your "follow"

Setting up your "follow"

Did you know you can
sign up for email notifications?
- Subscribe to these comments

PerfectPhase "This is not war, this is pest control!" - Dalek to Cyberman

Oct 12, 2007 at 1:42 PM

Is the Task Parallel Lib built on the 2.0 CLR, wasn't clear to me if it's a pure .net assembly, or if it needs changes to and is intergrated with the CLR.
tarlano

Oct 12, 2007 at 2:42 PM

Can anyone answer the question of how PFX relates to the Concurrency and Coordination Runtime (CCR) that ships with the MS robotics studio?

Anthony
Glucose Scott Hanselman

Oct 12, 2007 at 3:27 PM

Great stuff! This is going to be a fun technology. Also check out a podcast interview with Stephen Toub, another member of the team.

--
Scott Hanselman
littleguru <3 Seattle

Oct 13, 2007 at 3:41 AM

I have some questions. Let's have this simple call:

Parallel.For(0, count, i => a(i));

I would like to know how you guys make sure that function a(i) isn't calling a x(...) that relies on a global variable (like a singleton). How are you checking that? Isn't that checked at all - is it up to the dev to check that? I can see a lot of people saying: "my code worked and now it isn't working anymore, damn PFX library. It killed it!" instead of checking out there code for possible "side effects".

Are there papers or docs around that you used for your for global variables and shared resources checks?

I liked it when we got into the more "geeky" part... the thing about what item is picked from the queue etc. Would have been nicer if we would have gotten more about that stuff - speaking of the "geeky" stuff.

But it was a very nice video! Thanks for your time guys.
ktr two sides to everything

Oct 13, 2007 at 9:47 AM

wow that is very impressive. i'm really stunned that it is that easy now. multiple threads optimized for core count, plus work stealing and all those awesome feautures essentially for free! too cool.

are there any plans to integrate this library into the actual language going into the future? that would be very cool.

maybe like:

parallel for (int i = 0; i < 10; i++)
// do something

that would feel very natural, and i don't think it would be that much of a leap. LINQ queries should be automatically implemented as parallel, just because of the nature of a query - there should be no expectation of sequentiality.
amotif No Silver Bullet

Oct 13, 2007 at 11:26 AM

Good video.

littleguru wrote:

I would like to know how you guys make sure that function a(i) isn't calling a x(...) that relies on a global variable (like a singleton).

That's an interesting thought. I have no particular insight into the answer, but I'm guessing it would be quite difficult to implement due to at least a couple reasons:

a) To generalize your question (correctly, I hope), it's not just globals that are a concern, it's any shared state--is your lambda expression using shared state in a thread-safe fashion? Here's a simple non-global example:

    R r = new R();
    Parallel.For(0, 1000,
        i => {
            int v = r.Next(1, 6);
            ...
        });

Here we have N threads accessing r. How do you determine whether r is being used in a thread-safe manner? If you know that class R is not thread-safe, then this code isn't either. But how would a compiler or runtime determine whether this is safe, through static or run-time analysis of all the possible code paths? This seems to me to be a rather difficult problem.

b) Aside from the difficultly of analyzing the code, how do you define what's "correct?" If I substitute a thread-safe class for R how would you now determine that the use of r is now thread-safe? Perhaps analysis would show the use of a locking mechanism? What if the new class is written using a lockless technique, how would you determine whether it's safe or not?

As mentioned in the video some problems require shared state and some problems don't. For example, consider applying update operations to a graph in parallel. Shared state isn't a side effect here; in fact, the whole point is to update shared state. How would analysis take that into account?

It's an interesting thought, though.
esoteric λ

Oct 13, 2007 at 2:24 PM

I assume there is a reason why you have to explicitly say you want your code parallelized. It's still an imperative world with islands of declarativeness. But I assume that in the future the compiler or CLR will auto-parallelize some code, where it can determine the safety of it.

But the team has its priorities right. It first creates the foundation for parallelism. Then later it can start to think about (or someone else can) the cases that are safe to parallelize - and where it makes sense to do so.

Maybe, with all the threads running on the machine, it wouldn't make sense to parallize some tasks, even though it would be possible to do so.

Just a thought...

One thing is for sure. The C#/CLR team has a great understanding of how to maximize impact and value whilst keeping disruption minimal.

One thing I didn't pick up from the video - is the Parallel Task Lib built on top of the existing Thread Pool stuff?

Another great video.
littleguru <3 Seattle

Oct 13, 2007 at 2:41 PM

amotif wrote:

a) To generalize your question (correctly, I hope), it's not just globals that are a concern, it's any shared state--is your lambda expression using shared state in a thread-safe fashion?

Yeah. I just picked one case (the singleton) because it's always the same domain of problems... But how you generalized the question is exactly what I had in mind...

I can't really think that static checks (or even runtime checks) are implemented... Such checks could really take very long and could possibly include a big part of the application that is compiled.

Static check would also need to be implemented in a sort of compiler task or in the compiler itself, which isn't that cool if you have a library that possibly targets n compilers. How do you make sure that all the compilers call that task or do the checks on their own?
JChung2006

Oct 13, 2007 at 6:01 PM

ktr wrote:

parallel for (int i = 0; i < 10; i++)
// do something

The way they're doing parallel for is better. The way you've suggested serializes access to i unnecessarily.
Judah

Oct 13, 2007 at 10:39 PM

Great video, Anders & Joe.

Really looking forward to this!

Some folks were asking whether PLinq and the ParallelFX library detects accessing shared state between thread and other naughty things. Perpahs Joe can confirm this, but I'm nearly certain it does not.

Plinq is about allowing developers to easily parallelize queries.

ParallelFx is about allowing developers to more easily parallelize common tasks.

Neither, however, prevents you from creating threading issues by using shared mutable state between threads.

The good news, however, is that by using LINQ, we're headed towards a more functional, declarative future. Those who have used functional languages like Haskell will know that concurrency is a breeze because of no or limited mutable state (e.g. all shared state is readonly, little or nothing has side-effects). As we move forward with Linq, I suspect C# programs will look more and more like functional programs, with less and less shared mutable state.
staceyw Before C# there was darkness...

Oct 14, 2007 at 9:38 PM

littleguru wrote:

I have some questions. Let's have this simple call:

Parallel.For(0, count, i => a(i));

I would like to know how you guys make sure that function a(i) isn't calling a x(...) that relies on a global variable (like a singleton). How are you checking that? Isn't that checked at all - is it up to the dev to check that? I can see a lot of people saying: "my code worked and now it isn't working anymore, damn PFX library. It killed it!" instead of checking out there code for possible "side effects".

Are there papers or docs around that you used for your for global variables and shared resources checks?

I liked it when we got into the more "geeky" part... the thing about what item is picked from the queue etc. Would have been nicer if we would have gotten more about that stuff - speaking of the "geeky" stuff.

But it was a very nice video! Thanks for your time guys.

Based on what I hear and see, there is no checks for RW to shared state. Naturally RO state is safe and functions with only local variable access is safe. Any RW to any shared var would require manual sync as today with a monitor, etc. TMK, there is no magic pill to solve the shared state problem (other then transactional memory which as Anders said is still an ongoing research problem). Pure safe parallel functions would be nice, but not realistic in most apps because we almost always have some shared state to deal with.

Very cool stuff, thanks guys. Look forward to using it.
JoshRoss Niner since 2004

Oct 14, 2007 at 10:06 PM

If anyone hasn't taken a look at Joe's blog, I would highly recommend that they do so. It is truly outstanding.
Charles Welcome Change

Oct 15, 2007 at 8:19 AM

tarlano wrote:

Can anyone answer the question of how PFX relates to the Concurrency and Coordination Runtime (CCR) that ships with the MS robotics studio?

Anthony

It's a different solution to a similar problem... The CCR provides a more complicated grammar and requires a different way of thinking about concurrent operations (ports and messages).

They are not related technologies (though they are both just managed class libraries), specifically. PFX provides a very simple API that is in line with LINQ semantics and easy for .NET developers to grasp with little study (lamda expressions, etc). CCR is a different approach that, for most, requires more ramp up time to get your head around. It's also extremely powerful.

Again, they are not related.
C
joedu

Oct 15, 2007 at 8:56 AM

PerfectPhase,

Yes, these technologies are written entirely in managed code, and run on top of the stock CLR. While this is true today, it's of course possible, like any .NET Framework class library, that we'll pursue opportunities for tighter integration with the runtime as the libraries are further developed.

---joe
joedu

Oct 15, 2007 at 8:58 AM

esoteric,

I'm glad to hear you agree with our direction. As to whether TPL is built on top of the existing thread-pool, it is currently not. But we know that programs will be written that use both in the same process, both moving forward and when considering legacy apps, and thus there must be some resource management cooperation in the final solution. Nothing is baked enough to discuss, but once it is you can bet we'll be looking for feedback on the approach.

---joe
joedu

Oct 15, 2007 at 9:03 AM

Now, about the more general issue of shared state. Judah hit the nail right on the head in his response. We do not (currently) reject programs due to reliance on shared state. LINQ tends to lead programmers down a more functional programming style so the problem is less pervasive (though still there) in PLINQ.

Please take a look at http://www.bluebytesoftware.com/blog/2007/09/15/ParallelFXMSDNMagArticles.aspx for some more details on what we call "parallelism blockers." This includes shared state, thread affinity, and slight changes in exception behavior. Our story here is not completely ironed out, at least not ironed out enough to describe to everybody right now. When it is, you can be sure we'll be back here on Channel9 to discuss it.

Thanks for all of the great feedback. Keep it coming.

---joe
littleguru <3 Seattle

Oct 15, 2007 at 9:44 AM

Interesting to hear that the library does not check for shared resources. I thought so - it would just introduce to much problems and issues Nice to have the confirmation.

Btw. I think it is a nice way how you guys implemented the library. And keep us informed when you get a version out there or more information available
mlalevic

Oct 16, 2007 at 4:48 AM

The way I see it this library should help us build more efficient multithreaded applications easier. It does not remove all the job from developer, you still need to parallelize your program in inteligent manner.

What I want to say is that if you wrote a program to be single threaded then there is no way that you'll just say Parallel.For and you instantly have multithreaded application; you most probably have to do some additional work. I believe though it is possible and it should be easily possible to analyze some shared state access and at least warn developer on possibility that there might be an access that is not thread safe.

I was investigating some multicore architectures with thread level speculative parallelism and they do what some of comments suggested, check on dependencies between multiple loop executions trying to speculatively execute them in parallel. Well if you want to know more about that here is the link: http://www-hydra.stanford.edu/ ; maybe that could help.

I can't wait for CTP, do we know the date when it is coming out? This should be big leap forward from efficiency point of view, and I would like to see if it could be customized to use a similar approach for, lets say, desktop grid computing?

Cheers.
Mihailo
kbac70

Oct 17, 2007 at 4:40 AM

Ever since Delphi, I can never get enough of information as provided by Anders. I love concept of Linq and the progression highlighted by this movie is amazing. Don't you simply love this stuff?

Many thanks to Anders, Joe and the channel 9 team
Dals

Oct 17, 2007 at 8:35 AM

Congratulation! I'm very excited about this technology since I saw a link to Joe's blog saying about PLINQ.

One thing I would like to know if there will be on TaskManager a way to inform a dependency between tasks. So, using some kind of topological sorting, the TaskManager would know how to deal with these dependencies.

Dals
channel9celebrity

Oct 17, 2007 at 11:39 AM

"Problems and issues" is putting it entirely in a different class of difficulty. The reality is: It is simply impossible to detect arbitrary mutable shared state without changes to the LANGUAGE and the RUNTIME.

That is several orders of magnitude more complex than implementing a library. In fact, don't bet that it will ever be done for you. Just like all those 'new programming features' talked about in the mainstream press that will automate your job for you.
tomkirbygreen

Oct 26, 2007 at 9:01 AM

Joe's book just went up on Safari. You can purchase the PDF of it now under their early access program:

   Concurrent Programming on Windows Vista: Architecture, Principles, and Patterns
   by Joe Duffy
   Last Updated on Safari: 2007/10/26
   Publisher: Addison Wesley Professional
   Pub Date: April 25, 2008 (est.)

That's my weekend sorted! Happy Dev!
gdesroches

Nov 03, 2007 at 8:02 PM

It looks like the Parallel.For call being used in the video assumes the caller wants to increment a variable by 1 until it reaches a certain number.

This is not always the case. How would I translate the following into a parallel for loop?

// ShouldCancel() creates a lock and checks some boolean value

for (int i = 99; i >= 3 && !ShouldCancel(); i -= 3) { ... }

Are there overloads for this yet? I would imagine it would be something like this...

int i = 0;
Parallel.For(
() => i = 99;
() => i >= 3 && !ShouldCancel(),
() => i -= 3,
() => ...
);
Jaime Bula

Nov 29, 2007 at 2:15 PM

Hi!

How do you break a Parallel.For

Got this...

System.Threading.Parallel.For(1, maximumIterations_, 1, dd =>
{
   s = trapzd(function, lower, upper, dd);

   if (Math.Abs(s - olds) < tolerance_ * Math.Abs(olds))
   {
      exeeded = true;
      // break;
      return;
   }
   olds = s;
});

How can I break the loop? return keeps the loop running. And the old break does not compile?

Any Help?
joedu

Nov 29, 2007 at 4:16 PM

Hi Jaime,
There is an overload of Parallel.For whose 'body' lambda is passed a ParallelState object. This object offers a 'Stop' method which is effectively the same as 'break' in a sequential loop. So for example, your code would look something like (changes highlighted):

System.Threading.Parallel.For(1, maximumIterations_, 1, (dd, state) =>
{
   s = trapzd(function, lower, upper, dd);

   if (Math.Abs(s - olds) < tolerance_ * Math.Abs(olds))
   {
      exeeded = true;
      state.Stop();
      return;
   }
   olds = s;
});

I didn't try to compile this, but it should work and do what you are looking for. Take care,
---joe
Charles Welcome Change

Nov 29, 2007 at 9:55 PM

The Parallel Computing Platform team has launched the Parallel Computing Platform dev center. Check it out and get the ParallelFX CTP bits!
Jaime Bula

Dec 03, 2007 at 5:30 AM

Worked Great! Thanx!
magicalclick C9 slogan, #dealwithit. C9 Broken Non-Scroll Editor.

Dec 04, 2007 at 6:15 PM

Hello guys,
Just a thought on shared data.

Would it be safe to say that the data is read only or write only is thread safe? It seems like if I separate the output to a dedicated field, I can merged the result and overwrite(redirect) the source data if necessary. That's just a special case though.

For some other cases, I think it would be much helpful if .Net itself have a build-in data manager to manage shared resources. Something like DB but on the level of dynamic memory. Since .Net already have its own memory manager, that able to track who is referencing the data, this could be an great add-on.

The way I think of it is that when you pass by reference, you will need the lock. If you pass by value is fine.
You can declare the variable to read-only(not const) and write-only, ~~but they can change the read-write property~~. Only one thread, the parent/master thread has the full access to update read-only data/read from write only data. With a build-in strict timer/wait timer/indefinite wait on the fields for the parent thread. Also you can suspend all other thread when the parent thread is using the field / or allow X amount of threads / allow all threads.

Just a thought
slyi

Dec 21, 2007 at 12:06 PM

Will HPC++ be to take advantage of the programs written for PFX by default or do we need to include logic for the HPC 2008 job scheduler as well?
staceyw Before C# there was darkness...

Dec 21, 2007 at 12:53 PM

magicalclick wrote:

Hello guys,
Just a thought on shared data.

Would it be safe to say that the data is read only or write only is thread safe? It seems like if I separate the output to a dedicated field, I can merged the result and overwrite(redirect) the source data if necessary. That's just a special case though.

For some other cases, I think it would be much helpful if .Net itself have a build-in data manager to manage shared resources. Something like DB but on the level of dynamic memory. Since .Net already have its own memory manager, that able to track who is referencing the data, this could be an great add-on.

The way I think of it is that when you pass by reference, you will need the lock. If you pass by value is fine.
You can declare the variable to read-only(not const) and write-only, ~~but they can change the read-write property~~. Only one thread, the parent/master thread has the full access to update read-only data/read from write only data. With a build-in strict timer/wait timer/indefinite wait on the fields for the parent thread. Also you can suspend all other thread when the parent thread is using the field / or allow X amount of threads / allow all threads.

Just a thought

Read-only is safe always as long as data was set in a ctor. Write only would be safe as long as you never read it - but then what is the point? If you have 1 read and all others write, you still need to protect the var.

They are working this very thing. It is called transactional memory, but still in research. It is not a simple fix as it get very complex.