Is it difficult to implement Go's concurrency patterns in Scala?

There is no doubt about it: Go's syntax is much simpler than Scala's, and it has fewer language features. I quite like the ease with which one can write concurrent code in Go.

As it turns out, performant code is non-blocking code (see http://norvig.com/21-days.html#Answers), and both Go and Scala are very good at this.

My question is how one can write programs in Scala that behave exactly the same way as Go programs by implementing the same concurrency patterns. The first thing that comes to mind is using Futures in a way similar to channels.

I'm looking for

  • possible implementations of Go's concurrency patterns in Scala
  • whether the Go constructs are hard to simulate exactly in Scala
  • code snippets

Any help is much appreciated.

[Edit] A few examples of Go concurrency patterns http://talks.golang.org/2012/concurrency.slide

Fan-in

func fanIn(input1, input2 <-chan string) <-chan string {
  c := make(chan string)
  go func() {
    for {
      select {
        case s := <-input1:  c <- s
        case s := <-input2:  c <- s
      }
    }
  }()
  return c
}
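
To make the snippet above runnable, here is a minimal sketch that wraps it in a complete program. The helpers emit and collect are hypothetical names introduced only for this illustration; fanIn itself is unchanged.

```go
package main

import "fmt"

// fanIn is the function quoted above, unchanged apart from gofmt layout.
func fanIn(input1, input2 <-chan string) <-chan string {
	c := make(chan string)
	go func() {
		for {
			select {
			case s := <-input1:
				c <- s
			case s := <-input2:
				c <- s
			}
		}
	}()
	return c
}

// emit (hypothetical helper) sends each word on a new channel from its own goroutine.
func emit(words ...string) <-chan string {
	c := make(chan string)
	go func() {
		for _, w := range words {
			c <- w
		}
	}()
	return c
}

// collect (hypothetical helper) receives n messages and counts them by value.
func collect(c <-chan string, n int) map[string]int {
	counts := make(map[string]int)
	for i := 0; i < n; i++ {
		counts[<-c]++
	}
	return counts
}

func main() {
	merged := fanIn(emit("ann", "ann"), emit("joe", "joe", "joe"))
	fmt.Println(collect(merged, 5))
}
```

The order in which messages arrive on the merged channel is not fixed, which is why the sketch counts messages instead of comparing a sequence. The channels are never closed, so the forwarding goroutine simply stays parked when the program exits.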

Timeout granularity (one channel vs whole conversation)
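
A rough sketch of the two granularities in Go, assuming a hypothetical producer function as the message source. The names perMessage and wholeConversation are made up for this example; the only library call that matters is time.After.

```go
package main

import (
	"fmt"
	"time"
)

// producer (hypothetical) sends n messages, pausing for delay before each one.
func producer(n int, delay time.Duration) <-chan string {
	c := make(chan string)
	go func() {
		for i := 0; i < n; i++ {
			time.Sleep(delay)
			c <- fmt.Sprintf("msg %d", i)
		}
	}()
	return c
}

// perMessage gives every single receive its own deadline and
// returns how many messages arrived before one receive timed out.
func perMessage(c <-chan string, limit time.Duration) int {
	received := 0
	for {
		select {
		case <-c:
			received++
		case <-time.After(limit): // a fresh timer on each iteration
			return received
		}
	}
}

// wholeConversation shares one deadline across all receives.
func wholeConversation(c <-chan string, limit time.Duration) int {
	deadline := time.After(limit) // created once, outside the loop
	received := 0
	for {
		select {
		case <-c:
			received++
		case <-deadline:
			return received
		}
	}
}

func main() {
	fmt.Println(perMessage(producer(3, 10*time.Millisecond), 200*time.Millisecond))
	fmt.Println(wholeConversation(producer(1000, 10*time.Millisecond), 100*time.Millisecond))
}
```

The only difference between the two functions is where the timer is created: inside the loop it bounds each message, outside the loop it bounds the conversation as a whole.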

Replicating service calls amongst multiple instances and returning the value of the first one to respond. (This uses a bundle of patterns.)
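
As an illustration of that pattern, here is a minimal sketch in Go. The replica function is a hypothetical stand-in for a real service instance, and the buffered result channel is a choice made for this sketch so that slower replicas can finish instead of blocking forever.

```go
package main

import (
	"fmt"
	"time"
)

// replica (hypothetical) stands in for one service instance:
// it answers with its name after the given latency.
func replica(name string, latency time.Duration) func(query string) string {
	return func(query string) string {
		time.Sleep(latency)
		return name + " answered " + query
	}
}

// first sends the same query to every replica and returns
// whichever reply arrives first.
func first(query string, replicas ...func(string) string) string {
	// Buffered so the losing replicas can still send and exit.
	c := make(chan string, len(replicas))
	for _, r := range replicas {
		go func(r func(string) string) { c <- r(query) }(r)
	}
	return <-c
}

func main() {
	fast := replica("fast", 5*time.Millisecond)
	slow := replica("slow", 500*time.Millisecond)
	fmt.Println(first("ping", slow, fast))
}
```

The caller sees an ordinary blocking function call; the racing between replicas is hidden behind a single channel receive.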

All with: No locks. No condition variables. No callbacks. (Scala Futures use callbacks)

Go has concurrency features built in to the core language while Scala uses the concurrent package and concurrency primitives from Java's java.util.concurrent.

In Scala it's idiomatic to use either thread-based concurrency or the Actor Model, while Go concurrency is based on Hoare's Communicating Sequential Processes.

Although the concurrency primitives between the two languages aren't the same, it looks like there is some similarity.

In Go concurrency is usually achieved using Goroutines and Channels. There are also other more traditional low level synchronization primitives such as mutexes and wait groups.

In Scala, as far as I know, any class that implements Runnable can be run on a separate thread (for example by handing it to a Thread or an executor) and will then not block the caller. This is functionally similar to goroutines.

In Scala Queues can be used to pass information between routines in a similar fashion to Channels in Go.

EDIT: As pointed out by Chuck, "the crucial difference between Scala's Queues and Go channels is that, by default, Go's channels block on write until something is ready to read from them and block on read until something is ready to write to them.". This would need to be written into any Scala implementation of channels.
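
The blocking behaviour described in that comment can be observed with a small Go sketch. The function name rendezvous and the events log channel are assumptions made for this example only.

```go
package main

import "fmt"

// rendezvous shows that a send on an unbuffered channel cannot complete
// before a receiver has arrived. It returns the two logged events in the
// order in which they happened.
func rendezvous() (string, string) {
	c := make(chan string)         // unbuffered: send and receive must meet
	events := make(chan string, 2) // buffered log, so logging never blocks

	go func() {
		c <- "hello" // parked here until someone receives from c
		events <- "send returned"
	}()

	events <- "receiver ready" // logged before the receive below
	<-c                        // only now can the sender continue

	return <-events, <-events
}

func main() {
	firstEvent, secondEvent := rendezvous()
	fmt.Println(firstEvent)
	fmt.Println(secondEvent)
}
```

On the JVM, java.util.concurrent.SynchronousQueue has comparable hand-off semantics for put and take, so it may be a reasonable starting point for a Scala channel, though it offers nothing like select on its own.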

EDIT 2: As pointed out by Maurício Linhares, "You can do concurrency without visible callbacks in Scala using Async - github.com/scala/async - but you can't do it without callbacks at all, it's just not possible given the way the JVM is currently implemented.".

Thanks to all for the constructive comments.

For more info see:

The short answer is no, it is not difficult.

As you know, concurrency by message passing can operate with blocking or non-blocking synchronisation primitives. Go's channels can do both - they can be unbuffered or buffered - you choose.
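
A short Go sketch of that choice, using a non-blocking send to probe how many values a channel accepts while nobody is receiving. The helper names trySend and accepted are hypothetical.

```go
package main

import "fmt"

// trySend attempts a send without blocking and reports whether it went through.
func trySend(c chan<- int, v int) bool {
	select {
	case c <- v:
		return true
	default: // nobody can take the value right now
		return false
	}
}

// accepted counts how many of n sends succeed while no goroutine is receiving.
func accepted(c chan int, n int) int {
	ok := 0
	for i := 0; i < n; i++ {
		if trySend(c, i) {
			ok++
		}
	}
	return ok
}

func main() {
	fmt.Println(accepted(make(chan int), 5))    // unbuffered channel
	fmt.Println(accepted(make(chan int, 3), 5)) // buffered channel, capacity 3
}
```

An unbuffered channel accepts nothing without a waiting receiver, while a buffered one accepts values up to its capacity and only then starts to block.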

A lot is said in JVM languages about non-blocking concurrency being always superior. This is not true in general; it's just a feature of the JVM that threads are quite expensive on it. In response, most JVM concurrency APIs provide only a non-blocking model, although this is unfortunate.

For relatively modest concurrency of up to, say, 1000 JVM threads, blocking concurrency can work very effectively even on the JVM. Because this style doesn't involve any callbacks, it is easy to write and then read later.

The excellent JCSP library from the University of Kent at Canterbury is a good way to write Java/Scala/... programs using CSP channels. This is the same style used by Go; JCSP channels are very similar to Go channels, giving the option of unbuffered, buffered, or overwriting fixed-size buffers. Its select is called Alternative and has been proven correct by the JCSP developers via formal analysis.

But because the JVM cannot realistically support more than 1000 or so threads, this will not be appropriate for some application areas. But then, there's Go...


Footnote: the current version of JCSP is v1.1rc5 in the Maven repos, contrary to what the JCSP website says.

Apparently there is a third-party library (from Netflix) that provides reactive extensions for Scala (and also for Java and other JVM languages). Rx observables can be treated in a similar way to Go's channels.

https://github.com/Netflix/RxJava/tree/master/language-adaptors/rxjava-scala

The documentation is useful as well, providing visual representations of common patterns.

A good implementation is non-trivial.

You can implement one in a 'blocking' way, where each Go blocking primitive (such as a channel wait) actually blocks the executing thread. That implementation would be trivial, but useless.

The alternative is to build a mechanism that suspends and resumes the execution flow asynchronously around waits. Since the JVM has no built-in support for continuations, implementing this is quite complex and requires either AST transformations or bytecode weaving.

For an implementation of the first of these (AST transformation on top of SIP-22 async), you can look at https://github.com/rssh/scala-gopher (warning: I'm the author).

Implementing CSP-style concurrency on the JVM is not easy, whether for Java or for Scala. The reason is that CSP, as Go implements it, relies on threads with a reduced context, often called green threads. The reduced context consumes less memory, which means you can run many more green threads than OS threads or Java threads (1 Java thread corresponds to 1 OS thread). I once tried it out: with 4 GB RAM you can start about 80,000 goroutines (Go's variant of green threads) compared to about 2,000 Java threads.

Now why does that matter? The idea in CSP is that if some channel contains no data, "only" one green thread is lost, and it sits on that channel until it receives input. Let's say you have a web application being accessed by 40,000 users. The 80,000 goroutines that can be started on a machine with 4 GB RAM can handle those 40,000 connections right away (one inbound and one outbound connection per user). Without green threads you need a lot more memory or more servers.
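
To give a feel for how cheap a parked goroutine is, here is a small sketch that starts many goroutines, lets them all wait on one channel, and then releases them. The function name parkAndRelease and the count of 10,000 are arbitrary choices for this illustration, not a benchmark.

```go
package main

import "fmt"

// parkAndRelease starts n goroutines that all block on the same channel,
// wakes them by closing it, and returns how many of them reported back.
func parkAndRelease(n int) int {
	gate := make(chan struct{})
	done := make(chan int)

	for i := 0; i < n; i++ {
		go func() {
			<-gate // parked here until the gate is closed
			done <- 1
		}()
	}

	close(gate) // a closed channel releases every waiting receiver

	total := 0
	for i := 0; i < n; i++ {
		total += <-done
	}
	return total
}

func main() {
	fmt.Println(parkAndRelease(10000))
}
```

Doing the same with one JVM thread per waiter would mean creating 10,000 OS threads, which is the cost the answer above is pointing at.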

The other point about green threads is that you don't need to worry when one of them sits on a channel, because you have so many of them. With channel-oriented code, you can read code that is asynchronous under the surface as if it were synchronous. Following the message flow through channels is as easy as following ordinary method calls. Robert Pike explains this well in this YouTube video at about position 29:00. This makes CSP-style concurrent code much easier to get right from the beginning, and concurrency-related bugs easier to find.

The other issue is continuations. Let's say you have a function that consumes data from 2 channels in a row and combines the data somehow. Now, the first channel has data, but the second does not. When the second channel later receives data, the language runtime has to jump back into the function at the place where the second channel supplies its data. To be able to do that, the runtime needs to remember where to jump, and it has to store the data taken from the first channel somewhere and restore it, because it is combined with the data from the second channel.

This can be done on the JVM using continuation libraries that use byte code injection to "stash" intermediate results and remember the locations to jump to. One library for Java and Kotlin that can do this is Quasar: http://docs.paralleluniverse.co/quasar/ Quasar also has fibers, which provide something similar to green threads on the JVM. The developer of Quasar is Ron Pressler, who was hired by Oracle to work on Project Loom: http://cr.openjdk.java.net/~rpressler/loom/Loom-Proposal.html The idea of this project is to support fibers and continuations at the JVM level, which would make fibers more efficient and byte code injection for continuations less cumbersome.
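
The two-channel scenario described above looks like this in Go, where the runtime does the parking and resuming itself. The names sumPair and runPair are hypothetical and exist only for this sketch.

```go
package main

import "fmt"

// sumPair reads from a and then from b. If b has no data yet, the goroutine
// is parked at the second receive with x kept on its own stack, and it
// resumes at exactly that point once b delivers a value.
func sumPair(a, b <-chan int) int {
	x := <-a
	y := <-b
	return x + y
}

// runPair sets up the situation from the text: the first channel already
// holds a value, while the second one is fed only later.
func runPair(first, second int) int {
	a := make(chan int, 1) // buffered, so the value can wait in the channel
	b := make(chan int)
	out := make(chan int)

	a <- first
	go func() { out <- sumPair(a, b) }()

	b <- second // completes only once sumPair has reached its second receive
	return <-out
}

func main() {
	fmt.Println(runPair(1, 41))
}
```

Nothing in sumPair mentions suspension or callbacks; on the JVM the equivalent bookkeeping has to be produced by a library or a compiler transformation.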

Then there are also coroutines in Kotlin: https://kotlinlang.org/docs/reference/coroutines.html Kotlin's coroutines also implement fibers and continuations, but these are supplied by the Kotlin compiler, so the developer does not need to help a CSP library (e.g. Quasar) work out which functions need byte code injection.

Unfortunately, Kotlin's coroutines are only for Kotlin and cannot be used outside of it, so they are not available to other JVM languages. Quasar does not work with Scala, because byte code injection for continuations would be much more difficult for Scala than for Java or Kotlin, Scala being a much more elaborate language. At least that is the reasoning provided by the developer of Quasar.

So, as far as Scala is concerned, the best thing to do is to stick with Akka or wait for Project Loom to finish. Then some Scala people could start implementing CSP for Scala at a level that truly implements CSP. At the time of writing, Project Loom is in the works but not yet officially approved by Oracle, so it is not yet clear whether some future JDK will contain the things needed for full-scale CSP.