System.Linq.Parallel Is Now Open Source

146 points | fekberg | 11 years ago | github.com

59 comments

[+] blubbi2|11 years ago|reply
Sorry to be "that guy", but what is System.Linq.Parallel? Can anyone provide a brief description or a list of links where I can read more about it? Why is open sourcing it such a big deal?
[+] ultimape|11 years ago|reply
Linq is a library that injects a whole bunch of functional-style operations on top of list-like objects. Technically speaking, it's implementing a type-safe query language (think SQL) on top of them, but the difference is very minimal in practice.

What this does is allow certain operations to be run in parallel automatically. If you wanted to add 1 to every element in an array, you could do that with Linq:

    IEnumerable<int> numbers = new List<int>() { 1, 2, 3, 4, 5 };
    var newList = from num in numbers
                  select num + 1;

but by adding ".AsParallel()" to the chain:

    var newList = from num in numbers.AsParallel()
                  select num + 1;
your code is automatically run on each element in a parallel fashion and the runtime handles marshalling it all into a queue for processing... or however it does its magic.
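
For anyone who wants to try the two queries above, here is a minimal self-contained sketch. The class and method names (`PlinqSketch`, `AddOneSequential`, `AddOneParallel`) are made up for illustration and are not part of System.Linq.Parallel. One detail it tries to show: PLINQ does not promise to return results in source order unless you ask for it with `AsOrdered()`.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class for illustration only.
public static class PlinqSketch
{
    // Plain LINQ to Objects: evaluated on the calling thread, in source order.
    public static List<int> AddOneSequential(IEnumerable<int> numbers)
    {
        return (from num in numbers
                select num + 1).ToList();
    }

    // PLINQ: AsParallel() lets the runtime partition the source across worker threads.
    // AsOrdered() asks for the results to be returned in source order.
    public static List<int> AddOneParallel(IEnumerable<int> numbers)
    {
        return (from num in numbers.AsParallel().AsOrdered()
                select num + 1).ToList();
    }
}
```

Calling `PlinqSketch.AddOneParallel(new[] { 1, 2, 3, 4, 5 })` should give 2, 3, 4, 5, 6. Without `AsOrdered()` the same values come back, but their order is not guaranteed.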

The MSDN document on the way it does it is quite clear: http://msdn.microsoft.com/en-us/library/dd997399(v=vs.110).a... - describing many linq operations as 'delightfully parallel'.

I personally love Linq. To me it feels like using a Functional Relational Mapping system when combined with Linq-To-SQL, and the same semantics can be used to create a reactive programming model similar to what is intended with kefir.js and Akka ( http://msdn.microsoft.com/en-us/data/gg577609.aspx )

The idea itself isn't new; Perl 6 has something like this planned for trivial cases like for (1..10) { $_ = $_ +1; }

----

I don't know the full story, but for me the significance of open sourcing this code is that it will potentially encourage forks and extensions that allow running stuff upon GPGPUs or other massively parallel architectures while still using Linq style syntax.

TL;DR: With Linq, anything that implements IEnumerable (lists, arrays, streams, etc.) can be made to automatically support all sorts of nifty transforms. The Parallel bits let many of them run in parallel quite trivially.

[+] useerup|11 years ago|reply
http://msdn.microsoft.com/en-us/library/dd460688(v=vs.110).a...

Parallel LINQ is an extension to LINQ (Language Integrated Query). The "parallel" hints that it is a way to process results in parallel - using multiple threads, possibly on multiple CPUs/cores.

It allows you to express patterns like fork/join in a very elegant, fluent way.

Suppose I have a set of Orders, each with a collection of OrderItems, and each OrderItem has a Consolidate method (I know, contrived, but bear with me). Then

    Orders.SelectMany(o => o.OrderItems).AsParallel().ForAll(item => item.Consolidate());
will "consolidate" all of the order items in parallel (or rather, with a degree of parallelism suited to the available CPUs/cores).
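
A runnable sketch of that one-liner, under stated assumptions: `Order`, `OrderItem`, `Consolidate`, and `OrderProcessing.ConsolidateAll` are hypothetical stand-ins invented here, since the example above only names the types. `ForAll` itself is the real PLINQ operator; it runs the action on worker threads and returns once every element has been processed.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical domain types; only their names come from the example above.
public class OrderItem
{
    public bool IsConsolidated { get; private set; }

    // Stand-in for real work. ForAll invokes this on worker threads. Each item
    // only touches its own state here; shared state would need to be thread-safe.
    public void Consolidate()
    {
        IsConsolidated = true;
    }
}

public class Order
{
    public List<OrderItem> OrderItems { get; } = new List<OrderItem>();
}

public static class OrderProcessing
{
    public static void ConsolidateAll(IEnumerable<Order> orders)
    {
        orders.SelectMany(o => o.OrderItems)        // flatten orders into one item sequence
              .AsParallel()                         // switch from LINQ to PLINQ
              .ForAll(item => item.Consolidate());  // run per item, no ordered merge step
    }
}
```

Because `ForAll` makes no ordering promise, it fits side-effecting work where each element is independent, which is the fork/join shape described above.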
[+] Quequau|11 years ago|reply
This is actually becoming my chief complaint about submissions. It appears to me as if folks now believe that it's cool to submit links which resolve as deeply and closely as possible to the subject (in this case a subdirectory inside the project's repo) with total disregard for the fact that there is insufficient (or no) information describing why we should be interested in the link or what, if anything, we should discuss.

I'm willing to bet the submitter went through a series of links beginning with some sort of announcement that something changed to get to the link they submitted. Why do folks think that we would not also benefit from that information or that it would not help to foster more interesting discussion?

[+] CmonDev|11 years ago|reply
LINQ = Enumeration Monad

Parallel = executes tasks on multiple cores simultaneously if possible

[+] NKCSS|11 years ago|reply
Hmm, I wonder why this was added:

    #if DEBUG
                    currentKey = unchecked((int)0xdeadbeef);
    #endif
https://github.com/dotnet/corefx/blob/master/src/System.Linq... line 110 :P
[+] Demiurge|11 years ago|reply
looks like a marker to see if the code got past the two previous lines?
[+] gbvb|11 years ago|reply
It seems to be there to avoid using currentKey inadvertently. If someone refers to it elsewhere they get deadbeef.
[+] Demiurge|11 years ago|reply
would be awesome if any of the dotnet open sourcing would allow porting linq to python. i would use it in a heartbeat.
[+] pekk|11 years ago|reply
I guess that's useful for people who are already bought in to the .NET world, but isn't LINQ encumbered by patents that pose a risk to anyone building on top of it?
[+] useerup|11 years ago|reply
No. That was/is just FSF FUD.

You have always been able to build anything on top of it without concerns about patents from the implementation. This claim was so hilarious that it is incomprehensible that anyone ever believed it.

You have also been able to reimplement the .NET CLR and core libraries without fear of patent litigation (from Microsoft). They have placed the CLR and core libraries under the legal estoppel of the community promise since 2007 (IIRC), in addition to publicly granting a patent license to anyone creating an implementation of the .NET CLR and core libraries from the specifications. The latter was part of the process under which C#, the .NET CLR and the core libraries were standardized under ISO. A precondition for the standardization was that any necessary patents (for implementation) be offered on RAND terms (reasonable and non-discriminatory). Microsoft has always offered the patent grants free.

The community promise was created in response to FUD from (among others) FSF that Microsoft would just sue anyway (despite patent grants), and with their vast army of lawyers and deep coffers they could bury you in court. The community promise creates legal estoppel, whereby a case by Microsoft would be dismissed if you acted "in good faith" by relying on the promise.

Open sourcing Parallel LINQ has no bearing on the patent status of anything building on top of it. If you believe the FSF FUD, you will be ensnared in .NET technology and Microsoft will sue you out of existence if you ever become successful. If you do not believe the FUD, you can continue to take advantage of LINQ, Parallel LINQ, .NET, C#, F# etc.

[+] uniformlyrandom|11 years ago|reply
The part that has been open-sourced is under the MIT license.

According to the wiki (http://en.wikipedia.org/wiki/MIT_License),

> Whether or not a court might imply a patent grant under the MIT license therefore remains an open question.

This probably means MSFT is unlikely to enforce those patents.

[+] mhaymo|11 years ago|reply
Can anyone knowledgeable tell us how this compares to Java's parallel streams?
[+] kasey_junk|11 years ago|reply
LINQ/PLINQ are the obvious inspirations for Java 8's stream API. I don't know if they are exactly functionally equivalent, but streams bring to Java iterable collections the declarative style of programming that C# has had for years.

But, don't be confused by the .NET Rx libraries which provide LINQ style methods to observable event streams and have been ported as RxJava. Personally, I think that is a better use of the term streams and find the Java 8 standard streams name confusing.

[+] QuadDamaged|11 years ago|reply
As far as I am concerned, all of the BCL has pretty much been 'open source' since Reflector came around in 200..2?
[+] rjbwork|11 years ago|reply
There's a big BIG difference between "source readable" and "open source".
[+] MichaelGG|11 years ago|reply
By that logic, essentially every binary is open source. After all, you can dump a listing for any binary, and decompilers will create C source code (though it isn't always compilable). That's very different from getting the local var names, comments, and actual source structure.
[+] CmonDev|11 years ago|reply
No, you still have to deliberately decompile it. JavaScript web sites on the other hand are always open-source (readable with a standard browser).