Lec 10 - Monad and Parallel Stream
Monad
At a high level, a Monad is a design pattern (from category theory) used in functional programming to wrap values and chain computations in a safe and composable way.
The following containers we have learned and implemented are all considered monads.
Maybe<T>
The value might be there (i.e., Some<T>) or might not be there (i.e., None)
Lazy<T>
The value has been evaluated or not
A logging container we implemented
The side information is a log describing the operations done on the value
Each of these classes has:
an of method to initialize the value and side information, and
a flatMap method to update the value and side information.
The class may also have other methods besides the two above. Additionally, the methods may have different names.
More specifically, monads are the classes we wrote that follow certain patterns, making them well-behaved when we create them with of and chain them with flatMap.
Monad Laws
Let Monad be a type that is a monad and monad be an instance of it. A monad should follow the following three laws:
The left identity law
Monad.of(x).flatMap(x -> f(x)) must be the same as f(x)
Our of method should not do anything extra to the value and side information; it should simply wrap the value x into the Monad. Our flatMap method should not do anything extra to the value and the side information; it should simply apply the given lambda expression to the value.
The right identity law
monad.flatMap(x -> Monad.of(x)) must be the same as monad
Since of should behave like an identity, it should not change the value or add extra side information. The flatMap above should do nothing and the expression above should be the same as monad.
The associative law
monad.flatMap(x -> f(x)).flatMap(x -> g(x)) must be the same as monad.flatMap(x -> f(x).flatMap(y -> g(y)))
Regardless of how we group those calls to flatMap, their behavior must be the same.
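To make the laws concrete, here is a minimal sketch that checks them using java.util.Optional, whose of and flatMap play the same roles as the of and flatMap of our own monads (our Maybe<T> would behave the same way):

```java
import java.util.Optional;
import java.util.function.Function;

public class MonadLaws {
  public static void main(String[] args) {
    Function<Integer, Optional<Integer>> f = n -> Optional.of(n + 1);
    Function<Integer, Optional<Integer>> g = n -> Optional.of(n * 2);
    int x = 4;
    Optional<Integer> monad = Optional.of(x);

    // Left identity: Monad.of(x).flatMap(f) is the same as f(x)
    System.out.println(Optional.of(x).flatMap(f).equals(f.apply(x)));

    // Right identity: monad.flatMap(Monad::of) is the same as monad
    System.out.println(monad.flatMap(Optional::of).equals(monad));

    // Associativity: grouping the flatMap calls differently gives the same result
    System.out.println(monad.flatMap(f).flatMap(g)
        .equals(monad.flatMap(y -> f.apply(y).flatMap(g))));
  }
}
```

All three lines print true; if any of them did not, the class would not be a well-behaved monad.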
A Monad helps us reason about our code!
Functors
A functor is a simpler construction than a monad in that it only ensures lambdas can be applied sequentially to the value, without worrying about side information.
Functor Laws
Let Functor be a type that is a functor and functor be an instance of it. A functor should follow the following two laws:
Identity Law
functor.map(x -> x) is the same as functor
Composition Law
functor.map(x -> f(x)).map(x -> g(x)) is the same as functor.map(x -> g(f(x))).
Our classes from cs2030s.fp, Lazy<T>, Maybe<T>, and InfiniteList<T> are functors as well.
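As a quick illustration (again using java.util.Optional as a stand-in for our own classes), the two functor laws can be checked like this:

```java
import java.util.Optional;
import java.util.function.Function;

public class FunctorLaws {
  public static void main(String[] args) {
    Optional<Integer> functor = Optional.of(10);
    Function<Integer, Integer> f = n -> n + 1;
    Function<Integer, Integer> g = n -> n * 3;

    // Identity law: functor.map(x -> x) is the same as functor
    System.out.println(functor.map(x -> x).equals(functor));

    // Composition law: functor.map(f).map(g) is the same as functor.map(g compose f)
    System.out.println(functor.map(f).map(g).equals(functor.map(g.compose(f))));
  }
}
```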
Parallel Stream
Parallel and Concurrent Programming

Sequential execution means that at any one time, there is only one instruction of the program running on a processor.

Concurrency refers to the ability of a program to manage multiple tasks at the same time. Even on a single-core processor — where only one instruction can be executed at any moment (this is sequential execution) — concurrency allows the program to switch between tasks quickly, giving the illusion that they are running simultaneously.
This is often achieved using threads, which are smaller units of a process. By dividing a program into multiple threads (e.g., one for handling user input, another for doing background work), the system can switch between them as needed. This improves responsiveness and efficiency, especially when some threads are waiting (like for I/O), allowing others to make progress in the meantime.

While concurrency gives the illusion of subtasks running at the same time, parallel computing refers to the scenario where multiple subtasks are truly running at the same time — either we have a processor that is capable of running multiple instructions at the same time, or we have multiple cores/processors and dispatch the instructions to the cores/processors so that they are executed at the same time.
Parallel Stream
In Java, a parallel stream is a type of stream that allows you to process elements concurrently using multiple threads. It’s created by
Calling .parallel() on a standard stream
Example
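The original snippet is not reproduced here; a minimal sketch of marking a pipeline as parallel might look like this (the range and filter are arbitrary placeholders):

```java
import java.util.stream.IntStream;

public class ParallelDemo {
  public static void main(String[] args) {
    IntStream.range(0, 10)
        .filter(x -> x % 2 == 0)
        .parallel()                     // marks the whole pipeline for parallel processing
        .forEach(System.out::println);  // output order is not guaranteed
  }
}
```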
.parallel() is a lazy operation — it merely marks the stream to be processed in parallel. As such, you can insert .parallel() anywhere in the pipeline after the data source and before the terminal operation.
Calling .parallelStream() on a collection
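For instance, a representative sketch (the list and its values are just placeholders):

```java
import java.util.List;

public class ParallelStreamDemo {
  public static void main(String[] args) {
    List<Integer> list = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
    list.parallelStream()
        .forEach(System.out::println);  // e.g. 7 6 8 5 ... : order varies from run to run
  }
}
```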
You may notice that the output is reordered.
This is because Stream has broken the original stream down into subsequences (more formally speaking, the work is split across multiple threads) and runs forEach for each subsequence in parallel. Since there is no coordination among the parallel tasks on the order of the printing, whichever parallel task completes first outputs its result to the screen first. This causes the sequence of numbers to be reordered.
What can be parallelized
To ensure that the output of the parallel execution is correct, the stream operations
must not interfere with the stream data
Interference means that one of the stream operations modifies the source of the stream during the execution of the terminal operation. For instance:
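The exact snippet from the lecture is not shown here, but a sketch like the following, which adds to the source list from inside the pipeline,

```java
import java.util.ArrayList;
import java.util.List;

public class InterferenceDemo {
  public static void main(String[] args) {
    List<String> list = new ArrayList<>(List.of("a", "b", "c"));
    list.stream()
        .peek(s -> list.add("d"))       // modifies the stream's source during execution
        .forEach(System.out::println);
  }
}
```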
would cause ConcurrentModificationException to be thrown. Note that this non-interference rule applies even if we are using stream() instead of parallelStream().
should, most of the time, be stateless
Stateless operations are stream operations where the processing of one element does not depend on elements processed before. For example, map(), filter(), flatMap() and peek() are stateless.
Stateful operations are stream operations that require knowledge of previous or all elements to produce a result. For example, sorted(), distinct() and limit() are stateful.
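A small sketch to contrast the two (the pipelines themselves are arbitrary):

```java
import java.util.stream.IntStream;

public class StatelessVsStateful {
  public static void main(String[] args) {
    // Stateless: each element is mapped and filtered independently of the others
    IntStream.range(0, 10).parallel()
        .map(x -> x * x)
        .filter(x -> x % 2 == 0)
        .forEach(x -> { });

    // Stateful: sorted() must buffer and coordinate across all elements before emitting any
    IntStream.range(0, 10).parallel()
        .map(x -> x * x)
        .sorted()
        .forEach(x -> { });
  }
}
```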
Side effects should be kept to a minimum
What are side effects? Go back to review Pure Functions from Lec 08.
Why? Because parallel streams run in multiple threads, and if multiple threads modify the same variable at the same time, you get race conditions and unpredictable behavior. For example,
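The original snippet is not reproduced here, but a representative sketch would be:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.IntStream;

public class RaceConditionDemo {
  public static void main(String[] args) {
    List<Integer> result = new ArrayList<>();
    IntStream.range(0, 10_000)
        .parallel()
        .forEach(result::add);          // multiple threads call result.add concurrently
    System.out.println(result.size());  // often less than 10000, and may even throw
  }
}
```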
The forEach lambda generates a side effect — it modifies result. ArrayList is what we call a non-thread-safe data structure; if two threads manipulate it at the same time, the result may be incorrect.
To solve this issue, we can collect with .toList() instead of mutating a list ourselves. It will be quite useful in PE2.
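For instance, the sketch above could be rewritten as follows (assuming Java 16+ for Stream.toList()):

```java
import java.util.List;
import java.util.stream.IntStream;

public class ToListDemo {
  public static void main(String[] args) {
    List<Integer> result = IntStream.range(0, 10_000)
        .parallel()
        .boxed()
        .toList();                       // the stream builds the list; no shared mutation
    System.out.println(result.size());   // always 10000
  }
}
```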
More on reduce()
The reduce() stream operation can be parallelized, because instead of reducing all elements one by one sequentially, we can split the stream into chunks, reduce each chunk in parallel, and then combine the results. For example, think of this:
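The original illustration is not shown here; a minimal sketch using a simple sum (the values are arbitrary):

```java
import java.util.stream.IntStream;

public class ParallelReduceDemo {
  public static void main(String[] args) {
    // Sequentially: ((((((1 + 2) + 3) + 4) + 5) + 6) + 7) + 8
    // In parallel:  ((1 + 2) + (3 + 4)) + ((5 + 6) + (7 + 8))
    int sum = IntStream.rangeClosed(1, 8)
        .parallel()
        .reduce(0, Integer::sum);
    System.out.println(sum);   // 36 either way, because addition is associative
  }
}
```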
But for this to work correctly, the way we combine things must not depend on order or timing.
To unlock the parallelizability, you need to call the three-parameter version of reduce(), which is:
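```java
<U> U reduce(U identity,
             BiFunction<U, ? super T, U> accumulator,
             BinaryOperator<U> combiner);
```

For instance (a small illustrative usage, not from the original notes):

```java
import java.util.stream.Stream;

public class ThreeArgReduce {
  public static void main(String[] args) {
    // Total length of all strings, with partial results merged by the combiner
    int totalLength = Stream.of("monad", "functor", "stream")
        .parallel()
        .reduce(0,
                (len, s) -> len + s.length(),  // accumulator: fold an element into a partial result
                (a, b) -> a + b);              // combiner: merge partial results from substreams
    System.out.println(totalLength);           // 18
  }
}
```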
Besides that, to unlock the full power of parallelizability, there are three rules that the identity, the accumulator, and the combiner must follow:
Identity Rule
The identity must do nothing.
If you're combining something with the identity, it shouldn’t change the value.
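Following the documentation of Stream.reduce: for every value u, combiner.apply(identity, u) must be the same as u, i.e., combining anything with the identity leaves it unchanged.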
Associativity Rule
The grouping of operations doesn’t affect the result.
Why it matters: when you split the stream, reduce in different threads, and combine the partial results later, you are changing the grouping, so associativity ensures the result stays the same.
What's the target of the Associativity Law?
Both the accumulator and the combiner should adhere to this law because
1. Accumulator
Used to combine a stream element into a result.
In parallel streams, elements are grouped arbitrarily and reduced in parallel.
If the accumulator is not associative, different groupings will give different results.
2. Combiner
Merges the results of substreams.
If combiner is not associative, merging partial results can lead to inconsistencies.
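Expressed in code, associativity of the combiner means that for all partial results u1, u2, and u3, combiner.apply(u1, combiner.apply(u2, u3)) must be the same as combiner.apply(combiner.apply(u1, u2), u3); the documentation of Stream.reduce states the same requirement for the accumulator.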
Compatibility Rule
This ensures that reducing a small part, then combining it with a bigger result, is the same as doing it all at once. It's like saying:
I reduced t with identity (got a partial result).
I combined that with u.
It had better be the same as just reducing u with t directly.
Otherwise, parallel combining would give wrong results.
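Following the documentation of Stream.reduce, compatibility can be written as: for every partial result u and stream element t, combiner.apply(u, accumulator.apply(identity, t)) must be the same as accumulator.apply(u, t).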
Performance of Parallel Stream
Parallelizing a stream does not always improve the performance. This is because creating a thread to run a task incurs some overhead, and the overhead of creating too many threads might outweigh the benefits of parallelization.