(Latest Revision: Feb 18, 2018)
Chapter Four -- Threads -- Condensed Lecture Notes
- 4.0 Objectives
- Understand the notion of a thread
- Discuss various APIs for threads
- Explore implicit threading
- Examine issues related to multithreading
- Cover operating system support for threads
- 4.1 Overview
- When a group of processes all share about as much context as possible,
we tend to refer to them as threads of one task.
- We also think of the group of threads as a single (multi-threaded)
process or "task."
- Threads do NOT share ID number, program counter, stack,
  or register set; each thread has its own copy of these.
- 4.1.1 Motivation
- It can be handy to have multiple threads running in a single application -
for example a web browser in which one thread renders an image on the
screen while another thread downloads a file from a server.
- An application with multiple threads can run some of them
on multiple CPUs at the same time.
- Many operating system kernels are multithreaded.
- 4.1.2 Benefits (of Threads)
- Interactive programs can be more RESPONSIVE because
if one thread blocks, another can respond to the user.
- Threads SHARE resources like program text and main memory
instead of using separate copies. This means there can be more
applications in memory, and IPC between threads of one process
is easier.
- ECONOMY - Because of the sharing, it takes fewer resources and
less work to create a new thread within a process than to create
a new process. The OS only has to allocate a few things
(new thread ID, program counter, stack, register set).
Also there's less to do when context-switching
among threads of the same process, so it's usually faster.
- SCALABILITY - Applications with multiple threads tend to perform
better on computers with more CPUs because more
of the threads can execute simultaneously.
- 4.2 Multicore Programming
- Multicore chips place multiple CPUs ("cores") on a single chip.
  They are very common now. Thus modern computers have the potential
for true parallel processing, not merely the concurrency
that can be implemented on uni-processors.
- 4.2.1 Programming Challenges --
- It's a challenge to write multi-threaded programs.
- There are the questions of dividing the work, load balancing,
division of data, data dependency, testing, and debugging.
- Meeting these challenges requires new approaches to designing
software.
- 4.2.2 Types of Parallelism -- To achieve parallelism,
we ...
- assign different threads to perform the same kind of work
on different parts of the data (data parallelism), and/or
- assign different threads to perform different sub-tasks
(task parallelism).
- 4.3 Multithreading Models
- There are kernel level threads and user level threads. The kernel
directly supports kernel level threads. The kernel represents each kernel
thread with a "thread control block" (tcb). The kernel schedules kernel
threads from the ready queue. Kernel threads make system calls
and block to wait for I/O - just as (heavyweight) processes do.
- The kernel is not 'aware' of user level threads. A package of library
functions implements - you might say 'simulates' - these user threads
outside the kernel. For example the effect of the library might be
to 'simulate' several threads using just one kernel thread
to support them.
- The supporting relationship (the mapping) between user level threads
and kernel level threads can be many-to-one, one-to-one, or
many-to-many.
- 4.3.1 Many-to-One Model (Refer to slide 4.14)
In the many-to-one model,
many user-level threads are
supported by a single kernel-level thread. Context switching
is extremely fast among these user-level threads
(because it's "simulated"),
and the model supports programmers that want to organize software
as a group of concurrent threads. However true parallelism is not
possible with this model, and all user threads are blocked if any one
of them makes a blocking system call.
- 4.3.2 One-to-One Model (Refer to slide 4.15)
Use of the one-to-one model allows threads to
block independently and
operate in parallel on multiprocessors.
However the creation of
large numbers of threads may tax system
resources, there is no assurance that the OS will schedule
threads to operate in parallel optimally, and context switches
are slower generally (because they are real context switches in the
kernel).
- 4.3.3 Many-to-Many Model (Refer to slide 4.16)
The many-to-many model allows creating a large
multiplicity of user-level threads that may switch context
with great rapidity. The application can have greater control
over the scheduling of the user-level threads.
Much of the
advantage of the one-to-one model remains: parallelism and
independent blocking.
- 4.4 Thread Libraries
- POSIX thread (pthread) implementation varies from system to system --
could be user-level or kernel-level.
- Windows threads are kernel-level.
- The Java thread API is typically implemented using a native thread
package on the host system (for example, Pthreads or Windows).
- In asynchronous threading, the parent creates
one or more child threads and then executes concurrently with them.
- In synchronous threading, the parent creates one or more
child threads and waits for all the child threads to exit
before resuming execution.
- (Refer to slides 4.20-4.22)
Section 4.4 contains three examples in which a parent thread
creates a child thread to execute a function. The parent
blocks until the child has exited, and then the parent
resumes execution.
- 4.5 Implicit Threading
- Implicit threading is a methodology for
coping with some of the
difficulties of programming
multithreaded applications through the
use of such tools as
compilers and run-time libraries.
- 4.5.1 Thread Pools: Generally it is
faster for a server to use an existing
thread to service a client request, rather than create one and destroy it
after it performs the service. Using a pool of threads also
builds in a limit on the number of threads a server can utilize
- protecting the system from too much thread proliferation, which could
use up too many system resources. The server creates a number of threads
at the time of process start-up
and assigns threads from the pool to service clients. Clients may have to
block until a thread from the pool becomes available.
- 4.5.2 OpenMP:
A programmer can insert labels in the code that
identify certain sections that should be executed by parallel
threads. The compiler responds to the labels by generating code
that creates threads that execute those sections of code
in parallel.
- 4.5.3 Grand Central Dispatch consists of extensions to C, an
  API, and a run-time library.
Like OpenMP, it provides
parallel processing, although details of the implementation differ.
- 4.5.4 Other Approaches include
Threading Building Blocks (Intel),
products from Microsoft, and the
java.util.concurrent package.
- 4.6 Threading Issues
- 4.6.1 The fork() and exec() system calls
When an application is multi-threaded,
should the fork()
system call duplicate all threads, or just the calling thread?
Some APIs provide both options.
- Implementations of exec() typically overwrite the entire process
of the calling thread. Therefore, if the child created by a
fork() is going to call exec() immediately, there's no point
in having the fork() duplicate all the threads in the process.
- 4.6.2 Signal Handling
Signals are a simple form of interprocess communication in some
operating systems, primarily versions of UNIX. Signals
behave something like interrupts, but they are not
interrupts.
- The OS delivers and handles signals.
Delivering signals and
handling (responding to) signals are routine
tasks the OS performs as opportunities arise. Sometimes delivery
of a signal to a process (or thread) is required as part of the
OS performance of interrupt service, or a system call.
The OS delivers signals to a process (thread) by setting a
bit in a context variable of the process (thread). Just
before scheduling a process (thread) to execute, the OS checks
to see if any signals have been delivered to the process (thread)
that have not been handled yet. If so, the OS will cause the
signal to be handled properly. Sometimes it does this by
executing code in kernel mode, and sometimes it handles a
signal by jumping into the user process at the start address
of a special handler routine the process has for responding
to the signal.
The exact appropriate way of handling a signal depends on the
nature of the signal.
- Multithreading complicates the problem of implementing signal
  delivery.
Should a signal be delivered to all the threads in a
process or just some?
- Often the handler for a signal should run only once. A signal sent
to a process may be delivered only to the first thread that is not
blocking it.
- The OS may provide a function to send a signal to one particular
thread.
- 4.6.3 Thread Cancellation
Sometimes a thread starts work but it should be cancelled
before it finishes - for example if two threads are searching
a database for a record and one of them finds it, the other
thread should be cancelled.
Thread cancellation can be implemented in a manner similar to how
signals work. In fact it may be implemented using signals.
Since
problems could be caused by instantly
cancelling a thread in a task that is in the midst of doing some
work, the implementation of cancellation typically includes ways
for threads to defer their cancellation so that they have time
to 'clean up' first - for example to deallocate resources they
are holding, or to finish updating shared data.
- 4.6.4 Thread Local Storage
Typically threads have some need for thread-specific data
(thread local storage). This is
data not shared with other threads. In Pthreads processing,
local variables can play this role, but
local variables exist
only within one function, so provision for thread local storage
that is more 'global' may be needed. Most thread APIs provide
support for such thread local storage.
- 4.6.5 Scheduler Activations
This section describes some rather arcane details of how
the relationship between user-level threads and kernel-level
threads may be implemented. We won't cover this.
- 4.7 Operating-System Examples
- 4.7.1 Windows Threads
Applications run as separate processes, which may have multiple
threads. Per-thread resources include an ID number, a register set,
a user stack and a kernel stack (for the use of the OS when executing
on behalf of the process, for example when executing a system
call for the process), and private storage used by library code.
- 4.7.2 Linux Threads
Linux has a traditional fork() system call that creates an
exact duplicate of the parent.
Linux also has a clone()
system call with flag parameters that determine what
resources will be shared between parent and child. If most resources
are flagged to be shared, then the new task is equivalent to a thread.
- 4.8 Summary
- "A thread is a flow of control within a process." There
may be many threads within a single process.
- Possible advantages of threading include responsiveness, resource
sharing, economy, scalability, and efficiency.
- Programmers can manipulate user-level threads that are not
visible to the kernel. There are many-to-one, one-to-one, and
many-to-many models for mapping user-level threads to kernel-level
threads.
- POSIX Pthreads, Windows threads, and Java threads are provided
as libraries and APIs to support threading in most modern operating
systems. Compilers and run-time libraries exist that provide
implicit threading - which frees programmers from explicitly
writing code to create and manage threads.
- "Multithreaded programs introduce many challenges for programmers"