Multi-threading is an extensive and complex subject, and many good reference texts on the subject exist. C++ multi-threading is built upon the facilities offered by the pthreads library (cf. Nichols, B., et al.'s Pthreads Programming, O'Reilly). However, in line with C++'s current-day philosophy the multi-threading implementation offered by the language provides a high-level interface to multi-threading, and using the raw pthread building blocks is hardly ever necessary (cf. Williams, A. (2012): C++ Concurrency in Action, Manning).
This chapter covers the facilities for multi-threading as supported by C++. Although the coverage aims at providing the tools and examples allowing you to create your own multi-threaded programs, coverage necessarily is far from complete. The topic of multi-threading is too extensive for that. The mentioned reference texts provide a good starting point for any further study of multi-threading.
A thread of execution (commonly abbreviated to a thread) is a single flow of control within a program. It differs from a separately executed program, as created by the fork(1) system call, in the sense that threads all run inside one program, while fork(1) creates independent copies of a running program. Multi-threading means that multiple tasks are being executed in parallel inside one program, and no assumptions can be made as to which thread runs first or last, or at what moment in time. Especially when the number of threads does not exceed the number of cores, each thread may be active at the same time. If the number of threads exceeds the number of cores, the operating system resorts to task switching, offering each thread time slices in which it can perform its tasks. Task switching takes time, and the law of diminishing returns applies here as well: if the number of threads greatly exceeds the number of available cores (also called overpopulation), then the overhead incurred may exceed the benefit of being able to run multiple tasks in parallel.
Since all threads are running inside one single program, all threads share the program's data and code. When the same data are accessed by multiple threads, and at least one of the threads is modifying these data, access must be synchronized to prevent threads from reading data while these data are being modified by other threads, and to prevent multiple threads from modifying the same data at the same time.
So how do we run a multi-threaded program in C++? Let's look at hello world, the multi-threaded way:
     1: #include <iostream>
     2: #include <thread>
     3: 
     4: void hello()
     5: {
     6:     std::cout << "hello world!\n";
     7: }
     8: 
     9: int main()
    10: {
    11:     std::thread hi(hello);
    12:     hi.join();
    13: }
In line 2 the <thread> header file is included, informing the compiler about the existence of the class std::thread (cf. section 20.2.2). In line 11 a std::thread hi object is created. It is provided with the name of a function (hello) which will be called in a separate thread. Actually, the second thread, running hello, is immediately started when a std::thread is defined this way.

The main function itself also represents a thread: the program's first thread. It should wait until the second thread has finished. This is realized in line 12, where hi.join() waits until the thread hi has finished its job. Since there are no further statements in main, the program itself ends immediately thereafter.

The function hello itself, defined in lines 4 through 7, is trivial: it simply inserts the text `hello world' into cout, and terminates, thus ending the second thread.
When compiling multi-threaded programs using the Gnu g++ compiler the -pthread option must be specified. At link-time the libpthread library must be available as well. To create a multi-threaded program defined in a source file multi.cc the g++ compiler can be called like this:

    g++ --std=c++11 -pthread -Wall multi.cc

When several pre-compiled objects must be linked, the -lpthread linker option should also be specified.
In multi-threaded programs threads are frequently suspended, albeit usually for a very short time interval. E.g., when a thread must access a variable, but the variable is currently being updated by another thread, then the first thread should wait until the second thread has completed the update. Updating a variable usually doesn't take much time, but if it may take an unexpectedly long time, then the waiting thread may want to be informed about it, so it can do something else as long as the second thread hasn't finished updating the variable.
Sleep and select can be used for waiting, but as they were designed in an era when multi-threading was commonly unavailable, their capabilities are limited when used in multi-threaded programs.
The gap is bridged by the STL offering dedicated classes for specifying time, which mix well with time-dependent thread members. Threads are the topic of the next section (20.2). Before that, we'll first have a look at facilities for specifying time.
The first of these facilities is the class template std::ratio. Before the class ratio can be used, the <ratio> header file must be included. Usually just the <chrono> header file is included, as chrono includes ratio, and facilities for specifying time are available after including the chrono header file.
The class template ratio expects two integral template arguments, defining, respectively, the numerator (amount) and denominator (fraction) of an amount. By default the denominator equals 1, resulting in the ratio's first argument (the numerator) being interpreted as the represented amount. Examples:

    ratio<1>            - representing one;
    ratio<60>           - representing 60;
    ratio<1, 1000>      - representing 1/1000.
The class template ratio defines two directly accessible static data fields: num represents its numerator, den its denominator. A ratio definition by itself simply defines a certain amount. E.g., when executing the following program

    #include <ratio>
    #include <iostream>
    using namespace std;

    int main()
    {
        cout << ratio<5, 1000>::num << ',' << ratio<5, 1000>::den << '\n';
    }

the text 1,200 is displayed, as that's the `amount' represented by ratio<5, 1000>: ratio simplifies the fraction whenever possible.
A fairly large number of predefined ratio types exist. They can be used instead of the more cumbersome ratio<x> or ratio<x, y> specifications. The predefined types correspond to the standard SI prefixes:

    yocto   10^-24          deci    10^-1
    zepto   10^-21          deca    10^1
    atto    10^-18          hecto   10^2
    femto   10^-15          kilo    10^3
    pico    10^-12          mega    10^6
    nano    10^-9           giga    10^9
    micro   10^-6           tera    10^12
    milli   10^-3           peta    10^15
    centi   10^-2           exa     10^18
                            zetta   10^21
                            yotta   10^24

(yocto, zepto, zetta and yotta use integral constants exceeding 64 bits. Although these constants are defined in C++, they are not available on 64 bit or smaller architectures.)
The class std::chrono::duration is defined in the std::chrono namespace. Objects of the class duration define amounts of time. Before using the class duration the <chrono> header file must be included (which in turn includes the <ratio> header file).
The class template duration requires two template type arguments: a numeric type (commonly int64_t) defining the duration's value, and a time-unit, called its Period, usually defined using the class template ratio.
Here predefined ratio types simplify the job of using the right granularity. E.g., to define 30 minutes you could use

    std::chrono::duration<int64_t, std::deca> halfHr(180);

(resulting in 180 deca-seconds, so 1800 seconds) but even if you specify `using namespace std' and `using namespace chrono' this is rather complex and non-intuitive. Fortunately, various predefined duration types exist (each using a signed integral value type of a suitable size):

    nanoseconds     microseconds    milliseconds
    seconds         minutes         hours

E.g., half an hour can simply be defined as std::chrono::minutes halfHour(30).
The class template duration itself defines two types:

std::chrono::duration<Value, Period>::rep:
    the Value type used by the duration type (e.g., int64_t);

std::chrono::duration<Value, Period>::period:
    the ratio type used by the duration (e.g., std::nano), so period itself has static values num and den, as mentioned in the previous section.

These types can be retrieved from a duration object using decltype. E.g.,

    auto time(minutes(3) * 3);
    cout << decltype(time)::period::num;    // displays 60
In addition to these types the class template duration offers the following constructors:

constexpr duration():
    the default constructor defines a duration of zero time units;

constexpr explicit duration(Value const &value):
    a specific duration of value time units. Here Value refers to the duration's numeric type (e.g., int64_t). So, when defining

        std::chrono::minutes halfHour(30);

    the argument 30 is stored inside an int64_t.

duration also offers copy and move constructors.

In addition, duration has these members:
Duration objects may be added, or subtracted, and they may be multiplied and divided by a numeric value. They also support the modulo operator, using an integral constant for its right-hand side operand. The binary arithmetic and the binary compound assignment operators are also available.
constexpr Value count() const:
    returns the value that is stored inside a duration object. For halfHour this would return 30, not 1800, as the time unit itself is obtained from its duration<Value, Unit>::period type.

static constexpr duration zero():
    this is a static member, returning a duration object representing a duration of zero time units.

static constexpr duration min():
    this is a static member, returning a duration object representing numeric_limits<Rep>::lowest().

static constexpr duration max():
    this is a static member, returning a duration object representing numeric_limits<Rep>::max().
Different duration types may be combined, unless precision would be lost. When the binary arithmetic operators are used the resulting duration uses the finer of the two granularities. When the binary compound assignment operator is used the granularity of the left-hand side operand must at least be equal to the granularity of the right-hand side operand, or a compilation error is issued. E.g.,

    minutes halfHour(30);
    seconds half_a_minute(30);

    cout << (halfHour + half_a_minute).count();     // displays 1830

    //halfHour += half_a_minute;    won't compile: precision is lost

    half_a_minute += halfHour;
    cout << half_a_minute.count();                  // displays 1830
The C++14 standard defines the suffixes h, min, s, ms, us, and ns for integral values, creating the corresponding duration time intervals. E.g., minutes oneMinute = 1min.
The various clock types are defined in the std::chrono namespace. Before using these clocks the <chrono> header file must be included.

A clock type must be specified when referring to a point in time using std::chrono::time_point (covered by the next section). It is also possible to define your own clock type (not covered by the C++ Annotations; clause 20.11.3 of the C++11 standard lists the requirements for a clock type).
If Clock is a predefined clock type, then Clock defines the following types:

std::chrono::Clock::duration:
    the duration type used by Clock (e.g., std::chrono::nanoseconds);

std::chrono::Clock::period:
    the time period used by Clock (e.g., std::nano);

std::chrono::Clock::rep:
    the value type used by Clock to store duration values (e.g., int64_t);

std::chrono::Clock::time_point:
    the time points used by Clock (e.g., std::chrono::time_point<system_clock, duration>).

In addition to these types predefined clocks offer a member

static std::chrono::time_point std::chrono::Clock::now():
    returns the time_point representing the current time.
There are three predefined clock types:

std::chrono::system_clock:
    the `wall clock' time, using the system's real time clock. In addition to now the class system_clock offers these two static members:

    static time_t std::chrono::system_clock::to_time_t(std::chrono::time_point const &timePoint):
        returns a time_t value (the same type as returned by C's time(2) function) representing the same point in time as timePoint. Example:

            std::chrono::system_clock::to_time_t(
                std::chrono::system_clock().now()
            )

    static std::chrono::time_point std::chrono::system_clock::from_time_t(time_t seconds):
        returns a time_point value representing the same point in time as seconds.

std::chrono::steady_clock:
    a clock whose time increases in parallel with the increase of real time.

std::chrono::high_resolution_clock:
    the computer's fastest clock (i.e., the clock having the shortest timer-tick period). In practice this is the same clock as system_clock.
As an example: to access the current time you could use:
auto point = std::chrono::system_clock::now();
The class time_point is defined in the std::chrono namespace. Objects of the class std::chrono::time_point define a point in time. Before using the class std::chrono::time_point the <chrono> header file must be included.
The class time_point is a class template, requiring two template type arguments: a Clock type and a Duration type. The Clock type usually is one of the predefined clock types, e.g., chrono::system_clock. The Duration type may be omitted, in which case the Clock's duration type is used. An explicit duration type may also be provided.
In the previous section auto was used to specify the type of the return value of system_clock::now. The explicit definition looks like this:

    std::chrono::time_point<std::chrono::system_clock> now =
                                        std::chrono::system_clock::now();
The class std::chrono::time_point features three constructors:

time_point():
    the default constructor represents the beginning of the clock's epoch (e.g., Jan 1, 1970, 00:00h);

time_point(time_point<Clock, Duration> const &timeStep):
    initializes a time_point object to represent a point in time timeStep Duration units beyond the clock's epoch;

time_point(time_point<Clock, Duration2> const &timeStep):
    this constructor is defined as a member template, using the template header template <typename Duration2>. The type Duration2 is a std::chrono::duration (or comparable) type, using a possibly larger period for its unit than time_point's Duration type. It initializes a time_point object to represent a point in time timeStep Duration2 units beyond the clock's epoch.
The class std::chrono::time_point has these operators and members:

std::chrono::time_point &operator+=(Duration const &duration):
    this operator is also available as binary arithmetic operator, expecting a std::chrono::time_point const & and a Duration const & argument (in any order). The amount of time represented by duration is added to the current time_point value. Example:

        std::chrono::system_clock::now() + seconds(5);

std::chrono::time_point &operator-=(Duration const &duration):
    this operator is also available as binary arithmetic operator, expecting a std::chrono::time_point const & left-hand side argument and a Duration const & right-hand side argument. The amount of time represented by duration is subtracted from the current time_point value. Example:

        auto point = std::chrono::system_clock::now();
        point -= seconds(5);

constexpr Duration time_since_epoch() const:
    returns the object's Duration since the epoch.

static constexpr time_point min():
    a static member returning the time point's duration::min value.

static constexpr time_point max():
    a static member returning the time point's duration::max value.
All predefined clocks use nanoseconds as their time unit. To obtain the time expressed in a larger time unit, divide the value returned by the time_point's count member by the larger time unit converted to nanoseconds. E.g., the number of hours passed since the beginning of the epoch is:

    using namespace std;
    using namespace chrono;                 // for brevity

    cout << system_clock::now().time_since_epoch().count() /
            nanoseconds(hours(1)).count() << " hours since the epoch\n";
Time points handled by the system_clock can be converted to time_t values using the std::chrono::system_clock::to_time_t function. Standard C functions can be used to convert the returned time_t values to a textual representation. E.g.,

    using namespace std;
    using namespace chrono;                 // for brevity

    time_t tm = system_clock::to_time_t(system_clock::now() + hours(1));
    cout << asctime(localtime(&tm));
Here are some additional examples showing how time_point objects can be used:

    #include <iostream>
    #include <chrono>

    using namespace std;
    using namespace chrono;

    int main()
    {
        // the current time (or use `auto')
        // 'now' is a time_point<system_clock>
        auto now(system_clock::now());

        // its value in seconds:
        cout << system_clock::to_time_t(now) << '\n';

        // now + two hours:
        cout << system_clock::to_time_t(now + hours(2)) << '\n';

        // define a time_point 1 hour after the epoch:
        time_point<system_clock> oneHrLater(hours(1));

        // show the epoch and the time in seconds of oneHrLater:
        cout << system_clock::to_time_t(time_point<system_clock>()) << ' ' <<
                system_clock::to_time_t(oneHrLater) << '\n';
    }
C++'s main tool for creating multi-threaded programs is the class std::thread, and some examples of its use have already been shown at the beginning of this chapter. Characteristics of individual threads can be queried from the std::this_thread namespace. Also, std::this_thread offers some control over the behavior of an individual thread. To synchronize access to shared data C++ offers mutexes (implemented by the class std::mutex) and condition variables (implemented by the class std::condition_variable). Members of these classes may throw system_error objects (cf. section 10.9) when encountering a low-level error condition.
The namespace std::this_thread contains functions that are uniquely associated with the currently running thread. Before using the namespace this_thread the <thread> header file must be included.

Inside the std::this_thread namespace several free functions are defined, providing information about the current thread or that can be used to control its behavior:
thread::id this_thread::get_id() noexcept:
    returns an object of type thread::id that identifies the currently active thread of execution. For an active thread the returned id is unique in the sense that it maps 1:1 to the currently active thread, and is not returned by any other thread. If a thread is currently not running thread::id() is returned by the std::thread object's get_id member.

void yield() noexcept:
    when a thread calls this_thread::yield() the current thread is briefly suspended, allowing other (waiting) threads to start.

void sleep_for(chrono::duration<Rep, Period> const &relTime) noexcept:
    when a thread calls this_thread::sleep_for(...) it is suspended for the amount of time that's specified in its argument. E.g.,

        std::this_thread::sleep_for(std::chrono::seconds(5));

void sleep_until(chrono::time_point<Clock, Duration> const &absTime) noexcept:
    when a thread calls this member it is suspended until the specified absTime is in the past. The next example has the same effect as the previous example:

        // assume using namespace std
        this_thread::sleep_until(chrono::system_clock().now() +
                                 chrono::seconds(5));

    Conversely, the sleep_until call in the next example immediately returns:

        this_thread::sleep_until(chrono::system_clock().now() -
                                 chrono::seconds(5));
Threads are implemented by the class std::thread. Each object of this class handles a separate thread. Before using thread objects the <thread> header file must be included.
Thread objects can be constructed in various ways:
thread() noexcept:
    the default constructor creates a thread object. As it receives no function to execute, it does not start a separate thread of execution. It is used, e.g., as a data member of a class, allowing class objects to start a separate thread at some later point in time;

thread(thread &&tmp) noexcept:
    the move constructor takes ownership of the thread controlled by tmp, while tmp, if it runs a thread, loses control over its thread. Following this, tmp is in its default state, and the newly created thread is responsible for calling, e.g., join.
explicit thread(Fun &&fun, Args &&...args):
    this member template (cf. section 22.1.3) expects a function (or functor) as its first argument. The function is immediately started as a separate thread. If the function (or functor) expects arguments, then these arguments can be passed to the thread's constructor immediately following its first (function) argument. Additional arguments are passed with their proper types and values to fun. Following the thread object's construction, a separately running thread of execution is started.

    The notation Args &&...args indicates that any additional arguments are passed as-is to the function. The types of the arguments that are passed to the thread constructor and that are expected by the called function must match: values must be values, references must be references, r-value references must be r-value references (or move construction must be supported). The following example illustrates this requirement:

         1: #include <iostream>
         2: #include <thread>
         3: 
         4: using namespace std;
         5: 
         6: struct NoMove
         7: {
         8:     NoMove() = default;
         9:     NoMove(NoMove &&tmp) = delete;
        10: };
        11: 
        12: struct MoveOK
        13: {
        14:     int d_value = 10;
        15: 
        16:     MoveOK() = default;
        17:     MoveOK(MoveOK const &) = default;
        18: 
        19:     MoveOK(MoveOK &&tmp)
        20:     {
        21:         d_value = 0;
        22:         cout << "MoveOK move cons.\n";
        23:     }
        24: };
        25: 
        26: void valueArg(int value)
        27: {}
        28: void refArg(int &ref)
        29: {}
        30: void r_refArg(int &&tmp)
        31: {
        32:     tmp = 100;
        33: }
        34: void r_refNoMove(NoMove &&tmp)
        35: {}
        36: void r_refMoveOK(MoveOK &&tmp)
        37: {}
        38: 
        39: int main()
        40: {
        41:     int value = 0;
        42: 
        43:     std::thread(valueArg, value).join();
        44:     std::thread(refArg, ref(value)).join();
        45:     std::thread(r_refArg, move(value)).join();
        46: 
        47:     // std::thread(refArg, value);
        48: 
        49:     std::thread(r_refArg, value).join();
        50:     cout << "value after r_refArg: " << value << '\n';
        51: 
        52:     // std::thread(r_refNoMove, NoMove());
        53: 
        54:     NoMove noMove;
        55:     // std::thread(r_refNoMove, noMove).join();
        56: 
        57:     MoveOK moveOK;
        58:     std::thread(r_refMoveOK, moveOK).join();
        59:     cout << moveOK.d_value << '\n';
        60: }
    - At lines 43 through 45 we see a value, a reference, and an r-value reference being passed to a std::thread, with the functions running the threads expecting matching argument types.
    - Line 47 fails to compile, as a value argument doesn't match the reference expected by refArg. Note that this problem was solved in line 44 by using the std::ref function.
    - On the other hand lines 49 and 58 compile OK, as int values and class-types supporting move operations can be passed as values to functions expecting r-value references. In this case notice that the functions expecting the r-value references do not access the provided arguments (except for the actions performed by their move constructors), but use move construction to create temporary values or objects on which the functions operate.
    - Lines 52 and 55 won't compile as the NoMove struct doesn't offer a move constructor.

    Be careful when passing local variables as arguments to thread objects: if the thread continues to run when the function whose local variables are used terminates, then the thread suddenly uses wild pointers or wild references, as the local variables no longer exist. To prevent this from happening (illustrated by the next example) do as follows:
    - pass an anonymous copy of the local variable as argument to the thread constructor, or
    - call join on the thread object to ensure that the thread has finished within the local variable's lifetime.

         1: #include <iostream>
         2: #include <thread>
         3: #include <string>
         4: #include <chrono>
         5: 
         6: void threadFun(std::string const &text)
         7: {
         8:     for (size_t iter = 1; iter != 6; ++iter)
         9:     {
        10:         std::cout << text << '\n';
        11:         std::this_thread::sleep_for(std::chrono::seconds(1));
        12:     }
        13: }
        14: 
        15: std::thread safeLocal()
        16: {
        17:     std::string text = "hello world";
        18:     return std::thread(threadFun, std::string(text));
        19: }
        20: 
        21: int main()
        22: {
        23:     std::thread local(safeLocal());
        24:     std::cout << "safeLocal has ended\n";
        25:     local.join();
        26: }

    In line 18 be sure not to use std::ref(text) instead of std::string(text). If the thread cannot be created a std::system_error exception is thrown.

    Since this constructor not only accepts functions but also function objects as its first argument, a local context may be passed to the function object's constructor. Here is an example of a thread receiving a function object using a local context:
        #include <iostream>
        #include <thread>
        #include <array>

        using namespace std;

        class Functor
        {
            array<int, 30> &d_data;
            int d_value;

            public:
                Functor(array<int, 30> &data, int value)
                :
                    d_data(data),
                    d_value(value)
                {}

                void operator()(ostream &out)
                {
                    for (auto &value: d_data)
                    {
                        value = d_value++;
                        out << value << ' ';
                    }
                    out << '\n';
                }
        };

        int main()
        {
            array<int, 30> data;
            Functor functor(data, 5);

            thread funThread(functor, ref(cout));
            funThread.join();
        }
std::thread does not provide a copy constructor.

The following members are available:
thread &operator=(thread &&tmp) noexcept:
    if the operator's left-hand side operand (lhs) is a joinable thread, then terminate is called. Otherwise, tmp is assigned to the operator's lhs and tmp's state is changed to the thread's default state (i.e., thread()).
void detach():
    requires joinable (see below) to return true. The thread for which detach is called continues to run. The (e.g., parent) thread calling detach continues immediately beyond the detach-call. After calling object.detach(), `object' no longer represents the (possibly still continuing but now detached) thread of execution. It is the implementation's responsibility to release the detached thread's resources when its execution ends.

    Since detach disconnects a thread from the running program, main, e.g., can no longer wait for the thread's completion. As a program ends when main ends, its still running detached threads also stop, and a program may not properly finish all its threads, as demonstrated by the following example:

        #include <thread>
        #include <iostream>
        #include <chrono>

        void fun(size_t count, char const *txt)
        {
            for (; count--; )
            {
                std::this_thread::sleep_for(std::chrono::milliseconds(100));
                std::cout << count << ": " << txt << std::endl;
            }
        }

        int main()
        {
            std::thread first(fun, 5, "hello world");
            first.detach();

            std::thread second(fun, 5, "a second thread");
            second.detach();

            std::this_thread::sleep_for(std::chrono::milliseconds(400));
            std::cout << "leaving" << std::endl;
        }

    A detached thread may very well continue to run after the function that launched it has finished. Here, too, you should be very careful not to pass local variables to the detached thread, as their references or pointers will be undefined once the function defining the local variables terminates:

        #include <iostream>
        #include <thread>
        #include <chrono>

        using namespace std;
        using namespace chrono;

        void add(int const &p1, int const &p2)
        {
            this_thread::sleep_for(milliseconds(200));
            cerr << p1 << " + " << p2 << " = " << (p1 + p2) << '\n';
        }

        void run()
        {
            int v1 = 10;
            int v2 = 20;

            // thread(add, ref(v1), ref(v2)).detach();  // DON'T DO THIS
            thread(add, int(v1), int(v2)).detach();     // OK: own copies
        }

        void oops()
        {
            int v1 = 0;
            int v2 = 0;
        }

        int main()
        {
            run();
            oops();
            this_thread::sleep_for(seconds(1));
        }
id get_id() const noexcept:
    if the current object does not represent a running thread thread::id() is returned. Otherwise, the thread's unique ID (also obtainable, from within the thread itself, from this_thread::get_id()) is returned.
void join():
    requires joinable to return true. Blocks the thread calling join until the thread for which join is called has completed. Following its completion the object whose join member was called no longer represents a running thread, and its get_id member returns std::thread::id().

    This member was used in several examples shown so far. As noted: when main ends while a joinable thread is still running, terminate is called, aborting the program.
bool joinable() const noexcept:
    returns object.get_id() != id(), where object is the thread object for which joinable was called.
void swap(thread &other) noexcept:
    the states of the thread object for which swap was called and other are swapped. Note that threads may always be swapped, even when their thread functions are currently being executed.

static unsigned thread::hardware_concurrency() noexcept:
    this static member returns the number of threads that can run at the same time on the current computer. On a stand-alone multi-core computer it (probably) returns the number of cores.
Things to note:

- Be careful when using anonymous thread objects that are to be joined immediately by calling join. E.g.,

        void doSomething();

        int main()
        {
            thread(doSomething);            // nothing happens??
            thread(doSomething).join();     // doSomething is executed??
        }

  This is similar to the situation we encountered in section 7.5: the first statement doesn't define an anonymous thread object at all. It simply defines the thread object doSomething. Consequently, compilation of the second statement fails, as there is no thread(thread &) constructor. When the first statement is omitted, the doSomething function is executed by the second statement. If the second statement is omitted, a default constructed thread object by the name of doSomething is defined.
- When a thread object is constructed from an anonymous temporary, as in

        thread object(thread(doSomething));

  the move constructor is used to transfer control from the anonymous thread executing doSomething to the thread object. Only after object's construction has completed is doSomething started in the separate thread.
- To obtain values or exceptions produced by a separate thread, facilities like a packaged_task and a future are available (cf., respectively, sections 20.12 and 20.9).
- A thread ends when the function executing a thread finishes. When a thread object is destroyed while its thread function is still running, terminate is called, aborting the program. Bad news: the destructors of existing objects aren't called and exceptions that are thrown are left uncaught. This happens in the following program as the thread is still active when main ends:

        #include <iostream>
        #include <thread>

        void hello()
        {
            while (true)
                std::cout << "hello world!\n";
        }

        int main()
        {
            std::thread hi(hello);
        }
There are several ways to solve this problem. One of them is discussed in the next section.
The thread_local keyword provides an intermediate data level, in between global and local data. Global variables declared as thread_local are global within each individual thread. Each thread owns a copy of the thread_local variables, and may modify them at will. A thread_local variable in one thread is completely separated from that variable in another thread. Here is an example:
     1: #include <iostream>
     2: #include <thread>
     3: 
     4: using namespace std;
     5: 
     6: thread_local int t_value = 100;
     7: 
     8: void modify(char const *label, int newValue)
     9: {
    10:     cout << label << " before: " << t_value << ". Address: " <<
    11:                                                     &t_value << endl;
    12:     t_value = newValue;
    13:     cout << label << " after: " << t_value << endl;
    14: }
    15: 
    16: int main()
    17: {
    18:     thread(modify, "first", 50).join();
    19:     thread(modify, "second", 20).join();
    20:     modify("main", 0);
    21: }
In line 6 the thread_local variable t_value is defined. It is initialized to 100, and that becomes the initial value for each separately running thread. In lines 8 through 14 the function modify is defined, assigning a new value to t_value. In lines 18 through 20 the two separate threads as well as main itself call modify. Each of them first observes t_value being 100, and then modifies it without affecting the values of t_value used by the other threads.
Note that, although the t_value variables are unique to each thread, identical addresses may be shown for them. Since each thread uses its own stack, these variables may occupy the same relative locations within their respective stacks, giving the illusion that their physical addresses are identical.
Consider a parent thread starting a child thread, joining the child once its own work is done:

    void childActions();
    void doSomeWork();

    void parent()
    {
        thread child(childActions);
        doSomeWork();
        child.join();
    }

However, maybe doSomeWork can't complete its work, and throws an exception, to be caught outside of parent. This, unfortunately, ends parent, and child.join() is missed. Consequently, the program aborts because of a thread that hasn't been joined.
Clearly, all exceptions must be caught, join must be called, and the exception must be rethrown. But parent cannot use a function try-block, as the thread object is already out of scope once execution reaches the matching catch-clause. So we get:

    void childActions();
    void doSomeWork();

    void parent()
    {
        thread child(childActions);
        try
        {
            doSomeWork();
            child.join();
        }
        catch (...)
        {
            child.join();
            throw;
        }
    }

This is ugly: suddenly the function's code is clobbered with a try-catch clause, as well as some unwelcome code-duplication.
This situation can be avoided using object-based programming. Like, e.g., unique pointers, which use their destructors to encapsulate the destruction of dynamically allocated memory, we can use a comparable technique to encapsulate thread joining in an object's destructor.
By defining the thread object inside a class we're sure that by the time our object goes out of scope, even if the childActions function throws an exception, the thread's join member is called. Here are the bare essentials of our JoinGuard class, providing the join-guarantee (using in-line member implementations for brevity):
     1: #include <thread>
     2: 
     3: class JoinGuard
     4: {
     5:     std::thread d_thread;
     6: 
     7:     public:
     8:         JoinGuard(std::thread &&threadObj)
     9:         :
    10:             d_thread(std::move(threadObj))
    11:         {}
    12:         ~JoinGuard()
    13:         {
    14:             if (d_thread.joinable())
    15:                 d_thread.join();
    16:         }
    17: };
JoinGuard's constructor expects a temporary thread object, which is moved, in line 10, to JoinGuard's d_thread data member. Once the JoinGuard object ceases to exist, its destructor (line 12) makes sure the thread is joined if it's still joinable (lines 14 and 15). Here is an example showing how JoinGuard could be used:
     1: #include <iostream>
     2: #include "joinguard.h"
     3: 
     4: void childActions();
     5: 
     6: void doSomeWork()
     7: {
     8:     throw std::runtime_error("doSomeWork throws");
     9: }
    10: 
    11: void parent()
    12: {
    13:     JoinGuard{std::thread{childActions}};
    14:     doSomeWork();
    15: }
    16: 
    17: int main()
    18: try
    19: {
    20:     parent();
    21: }
    22: catch (std::exception const &exc)
    23: {
    24:     std::cout << exc.what() << '\n';
    25: }
In this program childActions is declared (line 4); its implementation (not provided here) defines the child thread's actions. The main function (lines 17 through 25) provides the function try-block catching the exception thrown by parent. The parent function defines (line 13) an anonymous JoinGuard, receiving an anonymous thread object. Anonymous objects are used, as the parent function doesn't need to access them anymore. Then doSomeWork is called, which throws an exception. This ends parent, but just before that JoinGuard's destructor makes sure that the child thread has been joined.
Before using mutexes the <mutex>
header file must be included.
One of the key characteristics of multi-threaded programs is that threads may share data. Functions running as separate threads have access to all global data, and may also share the local data of their parent threads. However, unless proper measures are taken, this may easily result in data corruption, as illustrated by the following simulation of some steps that could be encountered in a multi-threaded program:
    ---------------------------------------------------------------------------
    Time step:  Thread 1:       var     Thread 2:       description
    ---------------------------------------------------------------------------
        0                        5
        1       starts                                  T1 active
        2       writes var                              T1 commences writing
        3       stopped                                 Context switch
        4                               starts          T2 active
        5                               writes var      T2 commences writing
        6                       10      assigns 10      T2 writes 10
        7                               stopped         Context switch
        8       assigns 12                              T1 writes 12
        9                       12
    ---------------------------------------------------------------------------

In this example, threads 1 and 2 share variable
var
, initially having
the value 5. At step 1 thread 1 starts, and begins writing a value into
var
. However, it is interrupted by a context switch, and thread 2 is
started (step 4). Thread 2 also wants to write a value into var
, and
succeeds until time step 7, when another context switch takes place. By now
var
is 10. However, thread 1 was also in the process of writing a value
into var
, and it is given a chance to complete its work: it assigns 12
to var
in time step 8. Once time step 9 is reached, thread 2 proceeds on
the (erroneous) assumption that var
must be equal to 10. Clearly, from the
point of view of thread 2 its data have been corrupted.
In this case data corruption was caused by multiple threads accessing the same data in an uncontrolled way. To prevent this from happening, access to shared data should be protected in such a way that only one thread at a time may access the shared data.
Mutexes are used to prevent the above-mentioned kinds of problems by offering a guarantee that data are only accessed by the thread that has locked the mutex that is used to synchronize access to those data.
Exclusive data access completely depends on cooperation between the threads. If thread 1 uses mutexes, but thread 2 doesn't, then thread 2 may freely access the common data. Of course that's bad practice, which should be avoided.
It is stressed that although using mutexes is the programmer's responsibility, their implementation isn't: mutexes offer the necessary atomic calls. When requesting a mutex-lock the thread is blocked (i.e., the locking call does not return) until the lock has been obtained by the requesting thread.
Apart from the class std::mutex
the class
std::recursive_mutex
is available. When a recursive_mutex is locked multiple times by the same thread it increases its lock-count; before other threads may access the protected data the recursive mutex must be unlocked again that number of times. Moreover, the classes std::timed_mutex and std::recursive_timed_mutex are available. Lock attempts on these mutexes can be given a maximum duration or a deadline: if the lock has not been obtained in time, the attempt fails.
The members of the mutex classes perform atomic actions: no context
switch occurs while they are active. So when two threads are trying to
lock a mutex only one can succeed. In the above example: had both threads used a mutex to control access to var, then thread 1 would not have been able to assign 12 to var while thread 2 assumed that its value was 10. We
could even have two threads running purely parallel (e.g., on two separate
cores). E.g.:
    -------------------------------------------------------------------------
    Time step:  Thread 1:       Thread 2:       description
    -------------------------------------------------------------------------
        1       starts          starts          T1 and T2 active
        2       locks           locks           Both threads try to lock the
                                                mutex
        3       blocks...       obtains lock    T2 obtains the lock, and T1
                                                must wait
        4       (blocked)       processes var   T2 processes var, T1 still
                                                blocked
        5       obtains lock    releases lock   T2 releases the lock, and T1
                                                immediately obtains the lock
        6       processes var                   now T1 processes var
        7       releases lock                   T1 also releases the lock
    -------------------------------------------------------------------------

Although mutexes can directly be used in programs, this rarely happens. It is more common to embed mutex handling in locking classes that make sure that the mutex is automatically unlocked again when the mutex lock is no longer needed. Therefore, this section merely offers an overview of the interfaces of the mutex classes. Examples of their use will be given in the upcoming sections (e.g., section 20.4).
All mutex classes offer the following constructors and members:
mutex() constexpr
:The default constexpr
constructor is the only available
constructor;
~mutex()
:The destructor does not unlock a locked mutex. If locked it must
explicitly be unlocked using the mutex's unlock
member;
void lock()
:The calling thread blocks until it owns the mutex. Unless lock is called for a recursive mutex, a system_error is thrown if the thread already owns the lock; recursive mutexes instead increment their internal lock count;
bool try_lock() noexcept
:The calling thread tries to obtain ownership of the mutex. If ownership is obtained, true is returned, otherwise false. When a recursive mutex is already owned by the calling thread true is also returned, and the mutex increments its internal lock count;
void unlock() noexcept
:The calling thread releases ownership of the mutex. A
system_error
is thrown if the thread does not own the
lock. A recursive mutex decrements its internal lock count, releasing
ownership of the mutex once the lock count has decayed to zero;
The timed-mutex classes (timed_mutex, recursive_timed_mutex
) also offer
these members:
bool try_lock_for(chrono::duration<Rep, Period> const
&relTime) noexcept
:The calling thread tries to obtain ownership of the mutex within the specified time interval. If ownership is obtained, true is returned, otherwise false. When a recursive timed mutex is already owned by the calling thread true is also returned, and the mutex increments its internal lock count. The Rep and Period types are inferred from the actual relTime argument. E.g.,

    std::timed_mutex timedMutex;
    timedMutex.try_lock_for(chrono::seconds(5));
bool try_lock_until(chrono::time_point<Clock,
Duration> const &absTime) noexcept
:The calling thread tries to obtain ownership of the mutex until absTime has passed. If ownership is obtained, true is returned, otherwise false. When a recursive timed mutex is already owned by the calling thread true is also returned, and the mutex increments its internal lock count. The Clock and Duration types are inferred from the actual absTime argument. E.g.,

    std::timed_mutex timedMutex;
    timedMutex.try_lock_until(chrono::system_clock::now() +
                              chrono::seconds(5));
Before using the std::once_flag type and the std::call_once function, introduced in this section, the <mutex> header file must be included.
In single-threaded programs the initialization of global data does not necessarily happen at the same point in code. An example is the initialization of the
object of a singleton class (cf. Gamma et al. (1995), Design Patterns,
Addison-Wesley). Singleton classes may define a single static pointer data
member Singleton *s_object
, pointing to the singleton's object, and may
offer a static member instance
, implemented something like this:
    Singleton &Singleton::instance()
    {
        return s_object ? *s_object : *(s_object = new Singleton);
    }
With multi-threaded programs this approach immediately gets complex. For
example, if two threads call instance
at the same time, while s_object
still equals 0, then both may call new Singleton
, resulting in one
dynamically allocated Singleton
object becoming unreachable. Other
threads, called after s_object
was initialized for the first time, may
either return a reference to that object, or may return a reference to the
object initialized by the second thread. Not exactly the expected behavior of
a singleton.
Mutexes (cf. section 20.3) can be used to solve these kinds of problems,
but they result in some overhead and inefficiency, as the mutex must be
inspected at each call of Singleton::instance
.
When variables must be initialized dynamically, and the initialization should take place only once, the std::once_flag type and the std::call_once function should be used.
The call_once function expects two or three arguments: its first argument is a once_flag variable, keeping track of the actual initialization status (the call_once function simply returns if the once_flag indicates that initialization already took place); its second argument is the function (or functor) performing the initialization; an argument required by that function can be passed as call_once's third argument.
Using these facilities the singleton's instance function can now easily be designed (using in-class implementations for brevity):
    class Singleton
    {
        static std::once_flag s_once;
        static Singleton *s_singleton;
        ...
        public:
            static Singleton *instance()
            {
                std::call_once(s_once, []{s_singleton = new Singleton;} );
                return s_singleton;
            }
        ...
    };
However, there are additional ways to initialize data, even in multi-threaded programs. A constructor may be declared with the constexpr keyword (cf. section 8.1.4.1), satisfying the requirements for constant initialization. In that case a static object, initialized using that constructor, is guaranteed to be initialized before any code is run, as part of the static initialization phase. This is used by std::mutex, as it eliminates the possibility of race conditions when global mutexes are initialized. Furthermore, a static local variable is initialized the first time the flow of control passes through its declaration, as illustrated by the following example:
    #include <iostream>

    struct Cons
    {
        Cons()
        {
            std::cout << "Cons called\n";
        }
    };

    void called(char const *time)
    {
        std::cout << time << "time called() activated\n";
        static Cons cons;
    }

    int main()
    {
        std::cout << "Pre-1\n";
        called("first");
        called("second");

        std::cout << "Pre-2\n";
        Cons cons;
    }
    /*
        Displays:
            Pre-1
            firsttime called() activated
            Cons called
            secondtime called() activated
            Pre-2
            Cons called
    */
This feature causes a thread to wait automatically if another thread is still initializing the static data (note that non-static data never cause problems, as non-static local variables only exist within their own thread of execution).
This section describes the class std::shared_mutex, available after including the <shared_mutex> header file.
The type std::shared_mutex
is a shared mutex type. Shared mutex types
behave like timed_mutex
types and optionally have the characteristics
described below.
In this description, m denotes an object of a mutex type, rel_time denotes an object of an instantiation of duration (cf. section 20.1), and abs_time denotes an object of an instantiation of time_point (cf. section 20.1).
The class shared_mutex
provides a non-recursive mutex with shared
ownership semantics, comparable to, e.g., the shared_ptr
type.
A program using shared_mutexes is undefined if: a thread attempts to recursively gain any ownership of a shared_mutex; a thread terminates while still possessing some ownership of a shared_mutex.
Shared mutex types provide a shared lock ownership mode. Multiple threads can
simultaneously hold a shared lock ownership of a shared_mutex
type of
object. But no thread can hold a shared lock while another thread holds an
exclusive lock on the same shared_mutex
object, and vice-versa.
The type shared_mutex
offers the following members:
void lock_shared()
:Blocks the calling thread until shared ownership of the mutex can be obtained by the calling thread. An exception is thrown if the current thread already owns the lock, if it is not allowed to lock the mutex, or if the mutex is already locked and blocking is not possible;
void unlock_shared()
:Releases a shared lock on the mutex held by the calling thread. Nothing happens if the current thread does not already own the lock;
bool try_lock_shared()
:The current thread attempts to obtain shared ownership of the mutex without blocking. If shared ownership is not obtained, there is no effect and try_lock_shared immediately returns. Returns true if the shared ownership lock was acquired, false otherwise. An implementation may fail to obtain the lock even if it is not held by any other thread. Initially the calling thread may not yet own the mutex;
bool try_lock_shared_for(rel_time)
:Attempts to obtain shared lock ownership for the calling thread within the relative time period specified by rel_time. If the time specified by rel_time is less than or equal to rel_time.zero(), the member attempts to obtain ownership without blocking (as if by calling try_lock_shared()). The member returns within the time interval specified by rel_time only if it has obtained shared ownership of the mutex object. Returns true if the shared ownership lock was acquired, false otherwise. Initially the calling thread may not yet own the mutex;
bool try_lock_shared_until(abs_time)
:Attempts to obtain shared lock ownership for the calling thread until the time specified by abs_time has passed. If the time specified by abs_time has already passed then the member attempts to obtain ownership without blocking (as if by calling try_lock_shared()). Returns true if the shared ownership lock was acquired, false otherwise. Initially the calling thread may not yet own the mutex;
Before using the locking classes introduced in this section the <mutex> header file must be included.
Whenever threads share data, and at least one of the threads may change common data, mutexes should be used to prevent threads from accessing the same data simultaneously.
Usually locks are released at the end of action blocks. This requires explicit
calls to the mutexes' unlock
function, which introduces comparable problems
as we've seen with the thread's join
member.
To simplify locking and unlocking two mutex wrapper classes are available:
std::lock_guard
:objects of this class offer the basic unlock-guarantee: their
destructors call the member unlock
of the mutexes they control;
std::unique_lock
:objects of this class offer a more extensive interface, allowing
explicit unlocking and locking of the mutexes they control, while their
destructors preserve the unlock-guarantee also offered by lock_guard
;
The class lock_guard
offers a limited, but useful interface:
lock_guard<Mutex>(Mutex &mutex)
:when defining a lock_guard object the mutex type (e.g., std::mutex, std::timed_mutex, std::shared_mutex) is specified, and a mutex of the indicated type is provided as its argument. The construction blocks until the lock_guard object owns the lock. The lock_guard's destructor automatically releases the mutex lock.
lock_guard<Mutex>(Mutex &mutex, std::adopt_lock_t)
:this constructor is used to transfer control over the mutex from the calling thread to the lock_guard. The mutex lock is released again by the lock_guard's destructor. At construction time the mutex must already be owned by the calling thread. Here is an illustration of how it can be used:

     1: void threadAction(std::mutex &mut, int &sharedInt)
     2: {
     3:     std::lock_guard<std::mutex> lg{mut, std::adopt_lock_t()};
     4:     // do something with sharedInt
     5: }
- At line 1 threadAction receives a reference to a mutex. Assume the calling thread has already locked that mutex;
- At line 3 control is transferred to the
lock_guard. Even though we don't explicitly use the lock_guard object, a named object should be defined to prevent the compiler from destroying an anonymous object before the function ends;
- When the function ends, at line 5, the mutex's lock is released by the
lock_guard's
destructor.
mutex_type
:in addition to the constructors and destructor, lock_guard<Mutex> types also define the type mutex_type: it is a synonym of the Mutex type that is passed to the lock_guard's constructor.
Here is a simple example of a multi-threaded program using lock_guards
to prevent information inserted into cout
from getting mixed.
    #include <iostream>
    #include <fstream>
    #include <string>
    #include <mutex>
    #include <thread>

    using namespace std;

    bool oneLine(istream &in, mutex &mut, int nr)
    {
        lock_guard<mutex> lg(mut);

        string line;
        if (not getline(in, line))
            return false;

        cout << nr << ": " << line << endl;
        return true;
    }

    void io(istream &in, mutex &mut, int nr)
    {
        while (oneLine(in, mut, nr))
            this_thread::yield();
    }

    int main(int argc, char **argv)
    {
        ifstream in(argv[1]);
        mutex ioMutex;

        thread t1(io, ref(in), ref(ioMutex), 1);
        thread t2(io, ref(in), ref(ioMutex), 2);
        thread t3(io, ref(in), ref(ioMutex), 3);
        t1.join();
        t2.join();
        t3.join();
    }
As with lock_guard
, a mutex-type must be specified when defining
objects of the class std::unique_lock
. The class
unique_lock
is much more elaborate than the basic lock_guard
class
template. Its interface does not define a copy constructor or overloaded
assignment operator, but it does define a move constructor and a move
assignment operator. In the following overview of unique_lock's interface
Mutex
refers to the mutex-type that is specified when defining a
unique_lock
:
unique_lock() noexcept
:the default constructor is not yet associated with a mutex object. It must be assigned a mutex (e.g., using move-assignment) before it can do anything useful;
explicit unique_lock(Mutex &mutex)
:initializes a unique_lock with an existing Mutex object, and calls mutex.lock();
unique_lock(Mutex &mutex, defer_lock_t) noexcept
:initializes a unique_lock with an existing Mutex object, but does not call mutex.lock(). Call it by passing a defer_lock_t object as the constructor's second argument, e.g.,

    unique_lock<mutex> ul(mutexObj, defer_lock_t());
unique_lock(Mutex &mutex, try_to_lock_t) noexcept
:initializes a unique_lock with an existing Mutex object, and calls mutex.try_lock(): the constructor won't block if the mutex cannot be locked;
unique_lock(Mutex &mutex, adopt_lock_t) noexcept
:initializes a unique_lock with an existing Mutex object, and assumes that the current thread has already locked the mutex;
unique_lock(Mutex &mutex, chrono::duration<Rep, Period> const
&relTime) noexcept
:this constructor tries to obtain ownership of the Mutex object by calling mutex.try_lock_for(relTime). The specified mutex type must therefore support this member (e.g., it is a std::timed_mutex). It could be called like this:

    std::unique_lock<std::timed_mutex> ulock(timedMutex,
                                             std::chrono::seconds(5));
unique_lock(Mutex &mutex, chrono::time_point<Clock, Duration> const
&absTime) noexcept
:this constructor tries to obtain ownership of the Mutex object by calling mutex.try_lock_until(absTime). The specified mutex type must therefore support this member (e.g., it is a std::timed_mutex). This constructor could be called like this:

    std::unique_lock<std::timed_mutex> ulock(
        timedMutex,
        std::chrono::system_clock::now() + std::chrono::seconds(5)
    );
void lock()
:blocks the current thread until ownership of the mutex that is managed by the unique_lock is obtained. If no mutex is currently managed, a system_error exception is thrown;
Mutex *mutex() const noexcept
:returns a pointer to the mutex object stored inside the unique_lock (a nullptr is returned if no mutex object is currently associated with the unique_lock object);
explicit operator bool() const noexcept
:returns true if the unique_lock owns a locked mutex, otherwise false is returned;
unique_lock& operator=(unique_lock &&tmp) noexcept
:if the left-hand operand owns a lock, it calls its mutex's unlock member, whereafter tmp's state is transferred to the left-hand operand;
bool owns_lock() const noexcept
:returns true if the unique_lock owns the mutex, otherwise false is returned;
Mutex *release() noexcept
:returns a pointer to the mutex object that is associated with the
unique_lock
object, discarding that association;
void swap(unique_lock& other) noexcept
:swaps the states of the current unique_lock and other;
bool try_lock()
:tries to obtain ownership of the mutex that is associated with the unique_lock, returning true if this succeeds, and false otherwise. If no mutex is currently associated with the unique_lock object, then a system_error exception is thrown;
bool try_lock_for(chrono::duration<Rep, Period> const
&relTime)
:this member function tries to obtain ownership of the Mutex object managed by the unique_lock object by calling the mutex's try_lock_for(relTime) member. The specified mutex type must therefore support this member (e.g., it is a std::timed_mutex);
bool try_lock_until(chrono::time_point<Clock,
Duration> const &absTime)
:this member function tries to obtain ownership of the Mutex object managed by the unique_lock object by calling the mutex's try_lock_until(absTime) member. The specified mutex type must therefore support this member (e.g., it is a std::timed_mutex);
void unlock()
:releases ownership of the mutex (or reduces the mutex's lock count). A system_error exception is thrown if the unique_lock object does not own the mutex.
In addition to the members of the classes std::lock_guard
and
std::unique_lock
the functions std::lock
and
std::try_lock
are available. These functions can be
used to prevent deadlocks, the topic of the next section.
This section describes the std::lock and std::try_lock functions that can be used to help prevent such situations. Before these functions can be used the <mutex> header file must be included.
In the following overview L1 &l1, ...
represents one or more
references to objects of lockable types:
void std::lock(L1 &l1, ...)
:When the function returns, locks were obtained on all li objects. If a lock could not be obtained for at least one of the objects, then all locks obtained so far are released, even if the object for which no lock could be obtained threw an exception;
int std::try_lock(L1 &l1, ...)
:This function calls the lockable objects' try_lock
members. If all
locks could be obtained, then -1 is returned. Otherwise the (0-based)
index of the first argument which could not be locked is returned,
releasing all previously obtained locks.
As an example consider the following little multi-threaded program: The
threads use mutexes to obtain unique access to cout
and to an int
value
. However, fun1
first locks cout
(line 7), and then value
(line 10); fun2
first locks value
(line 16) and then cout
(line
19). Clearly, if fun1
has locked cout, fun2 can't obtain the lock
until fun1
has released it. Unfortunately, fun2
has locked value
,
and the functions only release their locks when returning. But in order to access the information in value, fun1 must have obtained a lock on value, which it can't, as fun2 has already locked value: the threads are waiting for each other, and neither thread gives in.
     1: int value;
     2: mutex valueMutex;
     3: mutex coutMutex;
     4: 
     5: void fun1()
     6: {
     7:     lock_guard<mutex> lg1(coutMutex);
     8:     cout << "fun 1 locks cout\n";
     9: 
    10:     lock_guard<mutex> lg2(valueMutex);
    11:     cout << "fun 1 locks value\n";
    12: }
    13: 
    14: void fun2()
    15: {
    16:     lock_guard<mutex> lg1(valueMutex);
    17:     cerr << "fun 2 locks value\n";
    18: 
    19:     lock_guard<mutex> lg2(coutMutex);
    20:     cout << "fun 2 locks cout\n";
    21: }
    22: 
    23: int main()
    24: {
    25:     thread t1(fun1);
    26:     fun2();
    27:     t1.join();
    28: }
A good recipe for avoiding deadlocks is to prevent nested (or multiple) mutex
lock calls. But if multiple mutexes must be used, always obtain the locks in
the same order. Rather than doing this yourself, std::lock
and
std::try_lock
should be used whenever possible to obtain multiple mutex
locks. These functions accept multiple arguments, which must be lockable types
like lock_guard, unique_lock,
or even a plain mutex
. The previous
deadlocking program can be modified to call std::lock
to lock both
mutexes. In this example using one single mutex would also work, but the
modified program now looks as similar as possible to the previous
program. Note how in lines 10 and 21 a different
ordering of the unique_locks
arguments was used: it is not necessary to
use an identical argument order when calling std::lock
or
std::try_lock
.
     1: int value;
     2: mutex valueMutex;
     3: mutex coutMutex;
     4: 
     5: void fun1()
     6: {
     7:     unique_lock<mutex> lg1(coutMutex, defer_lock);
     8:     unique_lock<mutex> lg2(valueMutex, defer_lock);
     9: 
    10:     lock(lg1, lg2);
    11: 
    12:     cout << "fun 1 locks cout\n";
    13:     cout << "fun 1 locks value\n";
    14: }
    15: 
    16: void fun2()
    17: {
    18:     unique_lock<mutex> lg1(coutMutex, defer_lock);
    19:     unique_lock<mutex> lg2(valueMutex, defer_lock);
    20: 
    21:     lock(lg2, lg1);
    22: 
    23:     cout << "fun 2 locks cout\n";
    24:     cout << "fun 2 locks value\n";
    25: }
    26: 
    27: int main()
    28: {
    29:     thread t1(fun1);
    30:     thread t2(fun2);
    31:     t1.join();
    32:     t2.join();
    33: }
This section describes the class std::shared_lock, available after including the <shared_mutex> header file.
An object of the type std::shared_lock
controls the shared ownership of a
lockable object within a scope. Shared ownership of the lockable object may be
acquired at construction time or thereafter, and once acquired, it may be
transferred to another shared_lock
object. Objects of type shared_lock
cannot be copied, but move construction and assignment is supported.
The behavior of a program is undefined if the contained pointer to a mutex
(pm
) has a non-zero value and the lockable object pointed to by pm
does
not exist for the entire remaining lifetime of the shared_lock
object. The supplied mutex type must be a shared_mutex
or a type having
the same characteristics.
The type shared_lock
offers the following constructors, destructor and
operators:
shared_lock() noexcept
:The default constructor creates a shared_lock which is not owned by a thread and for which pm == 0;
explicit shared_lock(mutex_type &mut)
:This constructor locks the mutex, calling mut.lock_shared(). The calling thread may not already own the lock. Following the construction pm == &mut, and the lock is owned by the current thread;
shared_lock(mutex_type &mut, defer_lock_t) noexcept
:This constructor assigns &mut to pm, but the calling thread does not own the lock;
shared_lock(mutex_type &mut, try_to_lock_t)
:This constructor tries to lock the mutex, calling mut.try_lock_shared(). The calling thread may not already own the lock. Following the construction pm == &mut, and the lock may or may not be owned by the current thread, depending on the return value of try_lock_shared;
shared_lock(mutex_type &mut, adopt_lock_t)
:This constructor can be called if the calling thread has shared
ownership of the mutex. Following the construction pm == &mut
, and the
lock is owned by the current thread;
shared_lock(mutex_type &mut,
chrono::time_point<Clock, Duration> const &abs_time)
:This constructor is a member template, where Clock and Duration are types specifying a clock and an absolute time (cf. section 20.1). It can be called if the calling thread does not already own the mutex. It calls mut.try_lock_shared_until(abs_time). Following the construction pm == &mut, and the lock may or may not be owned by the current thread, depending on the return value of try_lock_shared_until;
shared_lock(mutex_type &mut,
chrono::duration<Rep, Period> const &rel_time)
:This constructor is a member template, where Rep and Period are types specifying a relative time (cf. section 20.1). It can be called if the calling thread does not already own the mutex. It calls mut.try_lock_shared_for(rel_time). Following the construction pm == &mut, and the lock may or may not be owned by the current thread, depending on the return value of try_lock_shared_for;
shared_lock(shared_lock &&tmp) noexcept
:The move constructor transfers the information in tmp to the newly constructed shared_lock. Following the construction tmp.pm == 0 and tmp no longer owns the lock;
~shared_lock()
:If the lock is owned by the current thread,
pm->unlock_shared()
is called;
shared_lock &operator=(shared_lock &&tmp) noexcept
:If the current object owns the lock, the move assignment operator calls pm->unlock_shared, whereafter the information in tmp is transferred to the current shared_lock object. Following this tmp.pm == 0 and tmp no longer owns the lock;
explicit operator bool () const noexcept
:Returns whether or not the shared_lock
object owns the lock.
The following members are provided:
void lock()
:Calls pm->lock_shared(), after which the current thread owns the shared lock. Exceptions may be thrown by lock_shared, and otherwise if pm == 0 or if the current thread already owns the lock;
mutex_type *mutex() const noexcept
:Returns pm
;
mutex_type *release() noexcept
:Returns the previous value of pm
, which is equal to zero after
calling this member. Also, the current object no longer owns the lock;
void swap(shared_lock &other) noexcept
:Swaps the data members of the current and the other shared_lock objects. There is also a free function swap, a function template, swapping two shared_lock<Mutex> objects, where Mutex represents the mutex type for which the shared lock objects were instantiated:

    void swap(shared_lock<Mutex> &one, shared_lock<Mutex> &two) noexcept;
bool try_lock()
:Calls pm->try_lock_shared(), returning this call's return value. Exceptions may be thrown by try_lock_shared, and otherwise if pm == 0 or if the current thread already owns the lock;
bool try_lock_for(const chrono::duration<Rep, Period>&
rel_time)
:A member template, where Rep and Period are types specifying a relative time (cf. section 20.1). It calls pm->try_lock_shared_for(rel_time), returning its return value. Following the call the lock may or may not be owned by the current thread, depending on the return value of try_lock_shared_for. Exceptions may be thrown by try_lock_shared_for, and otherwise if pm == 0 or if the current thread already owns the lock;
bool try_lock_until(const chrono::time_point<Clock,
Duration>& abs_time)
:A member template, where Clock and Duration are types specifying a clock and an absolute time (cf. section 20.1). It calls pm->try_lock_shared_until(abs_time), returning its return value. Following the call the lock may or may not be owned by the current thread, depending on the return value of try_lock_shared_until. Exceptions may be thrown by try_lock_shared_until, and otherwise if pm == 0 or if the current thread already owns the lock;
void unlock()
:Unlocks the shared mutex lock, releasing its ownership. Throws an exception if the shared mutex was not owned by the current thread.
Before condition variables can be used the <condition_variable>
header
file must be included.
To start our discussion, consider a classic producer-consumer scenario: the producer generates items which are consumed by a consumer. The producer can only produce a certain number of items before its storage capacity has filled up and the client cannot consume more items than the producer has produced.
At some point the producer's storage capacity has filled to the brim, and the producer has to wait until the client has at least consumed some items, thereby creating space in the producer's storage. Similarly, the consumer cannot start consuming until the producer has at least produced some items.
Implementing this scenario only using mutexes (data locking) is not an attractive option, as merely using mutexes forces a program to implement the scenario using polling: processes must continuously (re)acquire the mutex's lock, determine whether they can perform some action, followed by the release of the lock. Often there's no action to perform, and the process is busy acquiring and releasing the mutex's lock. Polling forces threads to wait until they can lock the mutex, even though continuation might already be possible. The polling interval could be reduced, but that too isn't an attractive option, as that increases the overhead associated with handling the mutexes (also called `busy waiting').
Condition variables can be used to prevent polling. Threads can use condition variables to notify waiting threads that there is something for them to do. This way threads can synchronize on data values (states).
As data values may be modified by multiple threads, threads still need to use mutexes, but only for controlling access to the data. In addition, condition variables allow threads to release ownership of mutexes until a certain value has been obtained, until a preset amount of time has been passed, or until a preset point in time has been reached.
The prototypical setup of threads using condition variables looks like this:
    lock the mutex
    while the required condition has not yet been attained (i.e., is false):
        wait until being notified (this automatically releases the mutex's lock)
    once the mutex's lock has been reacquired, and the required condition
        has been attained:
            process the data
    release the mutex's lock
    lock the mutex
    while the required condition has not yet been attained:
        do something to attain the required condition
    notify waiting threads (that the required condition has been attained)
    release the mutex's lock
This protocol hides a subtle initial synchronization requirement. The consumer
will miss the producer's notification if it (i.e., the consumer) hasn't yet
entered its waiting state. So waiting (consumer) threads should start
before notifying (producer) threads. Once threads have started, no
assumptions can be made anymore about the order in which any of the condition
variable's members (notify_one, notify_all, wait, wait_for
, and
wait_until
) are called.
Condition variables come in two flavors: objects of the class
std::condition_variable
are used in combination
with objects of type unique_lock<mutex>
. Because of
optimizations which are available for this specific combination using
condition_variable objects
is somewhat more efficient than using the more
generally applicable class
std::condition_variable_any
, which may be
used with any (e.g., user supplied) lock type.
Condition variable classes (covered in detail in the next two sections) offer
members like wait, wait_for, wait_until, notify_one
and notify_all
that may concurrently be called. The notifying members are always atomically
executed. Execution of the wait
members consists of three atomic parts:
- the release of the mutex, combined with blocking the thread;
- the unblocking of the thread (e.g., after receiving a notification);
- the reacquisition of the mutex's lock (just before returning from the
  wait call).
At the return of the wait-members the previously waiting thread
has reacquired the mutex's lock.
In addition to the condition variable classes the following free function and
enum
type are provided:
void
std::notify_all_at_thread_exit(condition_variable &cond,
unique_lock<mutex> lockObject)
:once the current thread has ended, all other threads waiting on cond are notified. It is good practice to exit the thread as soon as possible after calling notify_all_at_thread_exit. Waiting threads must verify that the thread they were waiting for has indeed ended. This is usually realized by first obtaining the lock on lockObject, followed by verifying that the condition they were waiting for is true and that the lock was not reacquired before notify_all_at_thread_exit was called.
std::cv_status
:the cv_status enum is used by several member functions of the condition variable classes (cf. sections 20.5.1 and 20.5.2):

    namespace std
    {
        enum class cv_status
        {
            no_timeout,
            timeout
        };
    }
std::condition_variable
merely offers a
default constructor. No copy constructor or overloaded assignment operator is
provided.
Before using the class condition_variable
the <condition_variable>
header file must be included.
The class's destructor requires that no thread is blocked by the thread
destroying the condition_variable
. So all threads waiting on a
condition_variable
must be notified before a condition_variable
object's lifetime ends. Calling notify_all
(see below) before a
condition_variable's
lifetime ends takes care of that, as the
condition_variable's
thread releases its lock of the mutex
variable,
allowing one of the notified threads to lock the mutex.
In the following member-descriptions a type Predicate
indicates that a
provided Predicate
argument can be called as a function without arguments,
returning a bool
. Also, other member functions are frequently referred
to. It is tacitly assumed that all members referred to below are called using
the same condition variable object.
The class condition_variable
supports several wait
members, which
block the thread until notified by another thread (or after a configurable
waiting time). However, wait
members may also spuriously unblock, without
having been notified. Therefore, when returning from wait
members, threads
should verify that the required condition is actually true. If not,
again calling wait
may be appropriate. The next piece of pseudo code
illustrates this scheme:
    while (conditionNotTrue())
        condVariable.wait(uniqueLock);
The class condition_variable
's members are:
void notify_one() noexcept
:one wait
member called by other threads returns. Which one
actually returns cannot be predicted.
void notify_all() noexcept
:all wait
members called by other threads unblock their wait
states. Of course, only one of them will subsequently succeed in
reacquiring the condition variable's lock object.
void wait(unique_lock<mutex>& uniqueLock)
:before calling wait the current thread must have acquired the lock of uniqueLock. Calling wait releases the lock, and the current thread is blocked until it has received a notification from another thread and has reacquired the lock.
void wait(unique_lock<mutex>& uniqueLock, Predicate pred)
:this is a member template, using the template header template <typename Predicate>. The template's type is automatically derived from the function's argument type and does not have to be specified explicitly. Before calling wait the current thread must have acquired the lock of uniqueLock. As long as `pred' returns false, wait(uniqueLock) is called.
cv_status wait_for(unique_lock<mutex> &uniqueLock,
std::chrono::duration<Rep, Period> const &relTime)
:this member is defined as a member template, using the template header template <typename Rep, typename Period>. The template's types are automatically derived from the types of the function's arguments and do not have to be specified explicitly. E.g., to wait for at most 5 seconds wait_for can be called like this:

    cond.wait_for(uniqueLock, std::chrono::seconds(5));

This member returns when being notified or when the time interval specified by relTime has passed. When returning due to a timeout, std::cv_status::timeout is returned, otherwise std::cv_status::no_timeout is returned. Threads should verify that the required data condition has been met after wait_for has returned.
bool wait_for(unique_lock<mutex> &uniqueLock,
chrono::duration<Rep, Period> const &relTime, Predicate
pred)
:this member is defined as a member template, using the template header template <typename Rep, typename Period, typename Predicate>. The template's types are automatically derived from the types of the function's arguments and do not have to be specified explicitly. As long as pred returns false, the previous wait_for member is called. If that member returns cv_status::timeout, then pred's return value is returned, otherwise true.
cv_status wait_until(unique_lock<mutex>& uniqueLock,
chrono::time_point<Clock, Duration> const &absTime)
:this member is defined as a member template, using the template header template <typename Clock, typename Duration>. The template's types are automatically derived from the types of the function's arguments and do not have to be specified explicitly. E.g., to wait until 5 minutes after the current time wait_until can be called like this:

    cond.wait_until(uniqueLock, chrono::system_clock::now() + std::chrono::minutes(5));

This function acts identically to the wait_for(unique_lock<mutex> &uniqueLock, chrono::duration<Rep, Period> const &relTime) member described earlier, but uses an absolute point in time, rather than a relative time specification. This member returns when being notified or when the point in time specified by absTime has passed. When returning due to a timeout, std::cv_status::timeout is returned, otherwise std::cv_status::no_timeout is returned.
bool wait_until(unique_lock<mutex> &lock,
chrono::time_point<Clock, Duration> const &absTime,
Predicate pred)
:this member is defined as a member template, using the template header template <typename Clock, typename Duration, typename Predicate>. The template's types are automatically derived from the types of the function's arguments and do not have to be specified explicitly. As long as pred returns false, the previous wait_until member is called. If that member returns cv_status::timeout, then pred's return value is returned, otherwise true
.
Threads should always verify that the required condition is actually true
when wait-members of condition variables return.

Different from the class condition_variable, the class
std::condition_variable_any can be used with
any (e.g., user supplied) lock type, and not just with the stl-provided
unique_lock<mutex>.
Before using the class condition_variable_any
the <condition_variable>
header file must be included.
The functionality that is offered by condition_variable_any
is identical
to the functionality offered by the class condition_variable
, albeit that
the lock-type that is used by condition_variable_any
is not
predefined. The class condition_variable_any
therefore requires the
specification of the lock-type that must be used by its objects.
In the interface shown below this lock-type is referred to as
Lock
. Most of condition_variable_any's
members are defined as
member templates, defining a Lock
type as one of its parameters. The
requirements of these lock-types are identical to those of the stl-provided
unique_lock
, and user-defined lock-type implementations should provide at
least the interface and semantics that are also provided by unique_lock
.
This section merely presents the interface of the class
condition_variable_any
. As its interface offers the same members as
condition_variable
(allowing, where applicable, passing any lock-type
instead of just unique_lock
to corresponding members), the reader is
referred to the previous section for a description of the semantics of the
class members.
Like condition_variable
, the class condition_variable_any
only
offers a default constructor. No copy constructor or overloaded assignment
operator is provided.
Also, like condition_variable
, the class's destructor requires that no
thread is blocked by the current thread. This implies that all other (waiting)
threads must have been notified; those threads may, however, subsequently
block on the lock specified in their wait
calls.
Note that, in addition to Lock
, the types Clock, Duration, Period,
Predicate,
and Rep
are template types, defined just like the identically
named types mentioned in the previous section.
Assuming that MyMutex
is a user defined mutex type, and that MyLock
is
a user defined lock-type (cf. section 20.4 for details about
lock-types), then a condition_variable_any
object can be defined and used
like this:
    MyMutex mut;
    MyLock<MyMutex> ul(mut);

    condition_variable_any cva;
    cva.wait(ul);
These are the class condition_variable_any's
members:
void notify_one() noexcept;
void notify_all() noexcept;
void wait(Lock& lock);
void wait(Lock& lock, Predicate pred);
cv_status wait_until(Lock& lock, const
chrono::time_point<Clock, Duration>& absTime);
bool wait_until(Lock& lock, const chrono::time_point<Clock, Duration>&
absTime, Predicate pred);
cv_status wait_for(Lock& lock, const chrono::duration<Rep,
Period>& relTime);
bool wait_for(Lock& lock, const chrono::duration<Rep, Period>&
relTime, Predicate pred);
consumer loop:
- wait until there's an item in store, then reduce the number of stored
  items
- remove the item from the store
- increment the number of available storage locations
- do something with the retrieved item

producer loop:
- produce the next item
- wait until there's room to store the item, then reduce the number of
  available storage locations
- store the item
- increment the number of stored items
It is important that the two storage administrative tasks (registering the number of available items and available storage locations) are either performed by the consumer or by the producer. Both kinds of waiting can be implemented using a class Semaphore, offering the members wait and notify_all. For a more extensive discussion of semaphores see
Tanenbaum, A.S. and Austin, T. (2013)
Structured Computer Organization, Pearson Prentice-Hall.
The data member containing the actual count is called d_available. It is
protected by the mutex d_mutex. In addition a condition_variable
d_condition is defined:

    mutable std::mutex d_mutex;
    std::condition_variable d_condition;
    size_t d_available;
The waiting process is implemented through its member function wait
:
     1: void Semaphore::wait()
     2: {
     3:     std::unique_lock<std::mutex> lk(d_mutex);   // get the lock
     4:     while (d_available == 0)
     5:         d_condition.wait(lk);   // internally releases the lock
     6:                                 // and waits, on exit
     7:                                 // acquires the lock again
     8:     --d_available;              // dec. available
     9: }                               // the lock is released
In line 5 d_condition.wait
releases the lock. It waits until receiving
a notification, and re-acquires the lock just before returning. Consequently,
wait's code always has complete and unique control over d_available.
What about notifying a waiting thread? This is handled in lines 4 and
5 of the member function notify_all
:
     1: void Semaphore::notify_all()
     2: {
     3:     std::lock_guard<std::mutex> lk(d_mutex);    // get the lock
     4:     if (d_available++ == 0)
     5:         d_condition.notify_all();   // use notify_one to notify one
     6:                                     // other thread
     7: }                                   // the lock is released
At line 4 d_available is always incremented; by using a postfix
increment it can simultaneously be tested for being zero. If it was initially
zero then d_available is now one. A thread waiting until d_available
exceeds zero may now continue. The waiting threads are notified by calling
d_condition.notify_all. In situations where at most one thread should
continue `notify_one' can also be used.
Using the facilities of the class Semaphore
whose constructor expects
the initial value of its counter
data member, the classic
consumer-producer paradigm can now be implemented using
multi-threading (A more elaborate example of the producer-consumer
program is found in the yo/threading/examples/events.cc
file in the
C++ Annotations' source archive):
    Semaphore available(10);
    Semaphore filled(0);
    std::queue<size_t> itemQueue;

    void consumer()
    {
        while (true)
        {
            filled.wait();
            // mutex-lock the queue with multiple consumers
            size_t item = itemQueue.front();
            itemQueue.pop();
            available.notify_all();
            process(item);          // not implemented here
        }
    }

    void producer()
    {
        size_t item = 0;
        while (true)
        {
            ++item;
            available.wait();
            // mutex-lock the queue with multiple producers
            itemQueue.push(item);
            filled.notify_all();
        }
    }

    int main()
    {
        thread consume(consumer);
        thread produce(producer);
        consume.join();
        produce.join();
    }
Before atomic data types can be used the <atomic> header
file must be included.
When data are shared among multiple threads, data corruption is usually
prevented using mutexes. To increment a simple int
using this strategy
code as shown below is commonly used:
    {
        lock_guard<mutex> lk(intVarMutex);
        ++intVar;
    }

The compound statement is used to limit the lock_guard's lifetime, so
that intVar is only locked for a short while.
This scheme is not complex, but having to define a lock_guard for every
single use of a simple variable, and a matching mutex for each such
variable, is annoying and cumbersome.
C++ offers a way out through the use of atomic data types. Atomic data types are available for all basic types, and also for (trivial) user defined types. Trivial types are (see also section 23.6.2) all scalar types, arrays of elements of a trivial type, and classes whose constructors, copy constructors, and destructors all have default implementations, and their non-static data members are themselves of trivial types.
The class template std::atomic<Type>
is available for all
built-in types, including pointer types. E.g., std::atomic<bool>
defines
an atomic bool
type. For many types alternative, somewhat shorter
type names are available. E.g., instead of std::atomic<unsigned short> the
the
type std::atomic_ushort
can be used. Refer to the atomic
header file
for a complete list of alternate names.
If Trivial
is a user-defined trivial type then
std::atomic<Trivial>
defines an atomic variant of Trivial
: such a type does not require
a separate mutex
to synchronize access by multiple threads.
Objects of the class template std::atomic<Type>
cannot directly be copied
or assigned to each other. However, they can be initialized by values of type
Type
, and values of type Type
can also directly be assigned to
std::atomic<Type>
objects. Moreover, since atomic<Type>
types offer
conversion operators returning their Type
values, an atomic<Type>
object can also be assigned to or initialized by another atomic<Type>
object using a static_cast
:
    atomic<int> a1 = 5;
    atomic<int> a2(static_cast<int>(a1));

The class
std::atomic<Type>
provides several public members, shown
below. Non-member (free) functions operating on atomic<Type>
objects are
also available.
The std::memory_order
enumeration defines the following symbolic
constants, which are used to specify ordering constraints of atomic operations:
memory_order_acq_rel:
the operation must be a read-modify-write
operation, combining memory_order_acquire
and
memory_order_release
;
memory_order_acquire:
the operation is an acquire operation. It
synchronizes with a release operation that wrote the same memory
location;
memory_order_consume:
the operation is a consume operation on the
involved memory location;
memory_order_relaxed:
no ordering constraints are provided by the
operation;
memory_order_release:
the operation is a release operation. It
synchronizes with acquire operations on the same location;
memory_order_seq_cst:
the default memory order specification for all
operations. Memory storing operations use memory_order_release
,
memory load operations use memory_order_acquire
, and
read-modify-write operations use memory_order_acq_rel
.
The memory order cannot be specified for the overloaded operators provided by
atomic<Type>
. Otherwise, most atomic
member functions may also be
given a final memory_order
argument. Where this is not available it is
explicitly mentioned in the function's description.
Here are the standard available std::atomic<Type>
member functions:
bool compare_exchange_strong(Type &currentValue, Type newValue) noexcept
:The value in the atomic object is compared to currentValue using byte-wise comparisons. If equal, true is returned and newValue is stored in the atomic object; if unequal, false is returned and the object's current value is stored in currentValue;
bool compare_exchange_weak(Type &oldValue,
Type newValue) noexcept
:The value in the atomic object is compared to oldValue using byte-wise comparisons. If equal, true is returned and newValue is stored in the atomic object; if unequal, or if newValue cannot atomically be assigned to the current object, false is returned and the object's current value is stored in oldValue;
Type exchange(Type newValue) noexcept
:The object's current value is returned, and newValue
is assigned
to the current object;
bool is_lock_free() const noexcept
:If the operations on the current object can be performed lock-freetrue
is returned, otherwisefalse
. This member has nomemory_order
parameter;
Type load() const noexcept
:The object's value is returned;
operator Type() const noexcept
:The object's value is returned;
void store(Type newValue) noexcept
:NewValue
is assigned to the current object. Note that the standard
assignment operator can also be used.
In addition to the above members, integral atomic types `Integral
'
(essentially the atomic variants of all built-in integral types) also offer
the following member functions:
Integral fetch_add(Integral value) noexcept
:Value
is added to the object's value, and the object's
value at the time of the call is returned;
Integral fetch_sub(Integral value) noexcept
:Value
is subtracted from the object's value, and the object's
value at the time of the call is returned;
Integral fetch_and(Integral mask) noexcept
:The bit-and operator is applied to the object's value and mask, assigning the resulting value to the current object. The object's value at the time of the call is returned;
Integral fetch_or(Integral mask) noexcept
:The bit-or operator is applied to the object's value and mask, assigning the resulting value to the current object. The object's value at the time of the call is returned;
Integral fetch_xor(Integral mask) noexcept
:The bit-xor operator is applied to the object's value and mask, assigning the resulting value to the current object. The object's value at the time of the call is returned;
Integral operator++() noexcept
:The prefix increment operator, returning object's new value;
Integral operator++(int) noexcept
:The postfix increment operator, returning the object's value before it was incremented;
Integral operator--() noexcept
:The prefix decrement operator, returning object's new value;
Integral operator--(int) noexcept
:The postfix decrement operator, returning the object's value before it was decremented;
Integral operator+=(Integral value) noexcept
:Value
is added to the object's current value and the object's
new value is returned;
Integral operator-=(Integral value) noexcept
:Value
is subtracted from the object's current value and the
object's new value is returned;
Integral operator&=(Integral mask) noexcept
:The bit-and operator is applied to the object's current value and mask, assigning the resulting value to the current object. The object's new value is returned;
Integral operator|=(Integral mask) noexcept
:The bit-or operator is applied to the object's current value and mask, assigning the resulting value to the current object. The object's new value is returned;
Integral operator^=(Integral mask) noexcept
:The bit-xor operator is applied to the object's current value and mask, assigning the resulting value to the current object. The object's new value is returned;
Some of the free functions have names ending in _explicit. The
. The
_explicit
functions define an additional parameter `memory_order
order
', which is not available for the non-_explicit
functions (e.g.,
atomic_load(atomic<Type> *ptr)
and atomic_load_explicit(atomic<Type>
*ptr, memory_order order)
).
Here are the free functions that are available for all atomic types:
bool
std::atomic_compare_exchange_strong(_explicit)(std::atomic<Type> *ptr,
Type *oldValue, Type newValue) noexcept
:returns ptr->compare_exchange_strong(*oldValue, newValue)
;
bool
std::atomic_compare_exchange_weak(_explicit)(std::atomic<Type> *ptr,
Type *oldValue, Type newValue) noexcept
:returns ptr->compare_exchange_weak(*oldValue, newValue)
;
Type
std::atomic_exchange(_explicit)(std::atomic<Type> *ptr, Type newValue)
noexcept
:returns ptr->exchange(newValue)
;
void std::atomic_init(std::atomic<Type> *ptr, Type
init) noexcept
:Stores init non-atomically in *ptr. The object pointed to by ptr must have been default constructed, and as yet no member functions must have been called for it. This function has no memory_order
parameter;
bool std::atomic_is_lock_free(std::atomic<Type>
const *ptr) noexcept
:returns ptr->is_lock_free(). This function has no memory_order
parameter;
Type
std::atomic_load(_explicit)(std::atomic<Type> *ptr) noexcept
:returns ptr->load()
;
void
std::atomic_store(_explicit)(std::atomic<Type> *ptr, Type value)
noexcept
:calls ptr->store(value)
.
In addition to the abovementioned free functions atomic<Integral>
types
also offer the following free functions:
Integral
std::atomic_fetch_add(_explicit)(std::atomic<Integral> *ptr, Integral
value) noexcept
:returns ptr->fetch_add(value)
;
Integral
std::atomic_fetch_sub(_explicit)(std::atomic<Integral> *ptr, Integral
value) noexcept
:returns ptr->fetch_sub(value)
;
Integral
std::atomic_fetch_and(_explicit)(std::atomic<Integral> *ptr, Integral
mask) noexcept
:returns ptr->fetch_and(mask)
;
Integral
std::atomic_fetch_or(_explicit)(std::atomic<Integral> *ptr, Integral
mask) noexcept
:returns ptr->fetch_or(mask)
;
Integral
std::atomic_fetch_xor(_explicit)(std::atomic<Integral> *ptr, Integral
mask) noexcept
:returns ptr->fetch_xor(mask)
.
Given an array of n
elements, the quicksort algorithm works like this: a pivot element is
selected; the remaining elements are partitioned into a group of elements
smaller than the pivot and a group of elements at least as large as the
pivot; both groups are then recursively sorted the same way.
Converting this algorithm to a multi-threaded algorithm appears to be a simple task:
    void quicksort(Iterator begin, Iterator end)
    {
        if (end - begin < 2)    // less than 2 elements are left
            return;             // and we're done

        Iterator pivot = partition(begin, end); // determine an iterator
                                                // pointing to the pivot
                                                // element

        thread lhs(quicksort, begin, pivot);    // start threads on the
                                                // left-hand side sub-arrays
        thread rhs(quicksort, pivot + 1, end);  // and on the right-hand
                                                // side sub-arrays
        lhs.join();
        rhs.join();                             // and we're done
    }
Unfortunately, this translation to a multi-threaded approach won't work for reasonably large arrays because of a phenomenon called overpopulation: more threads are started than the operating system is prepared to give us. In those cases a Resource temporarily unavailable exception is thrown, and the program ends.
Overpopulation can be avoided by using a pool of workers, where each `worker' is a thread, which in this case is responsible for handling one (sub) array, but not for the nested calls. The pool of workers is controlled by a scheduler, receiving the requests to sort sub-arrays, and passing these requests on to the next available worker.
The main data structure of the example program developed in this section is a
queue of std::pairs
containing iterators of the array to be sorted
(cf. Figure 22; the sources of the program are found in the C++ Annotations'
yo/threading/examples/multisort
directory). Two queues are being used: one
queue is a task-queue, receiving the iterators of sub-arrays to be
partitioned. Instead of immediately launching new threads (the lhs
and
rhs
threads in the above example), the ranges to be sorted are pushed on
the task-queue. The other queue is the work-queue: elements are moved from the
task-queue to the work-queue, where they will be processed by one of the
worker threads.
The program's main
function starts the workforce, reads the data, pushes
the array's begin
and end
iterators on the task queue and then starts
the scheduler. Once the scheduler ends the sorted array is displayed:
    int main()
    {
        workForce();        // start the worker threads
        readData();         // read the data into vector<int> g_data

        g_taskQ.push(       // prepare the main task
            Pair(g_data.begin(), g_data.end())
        );

        scheduler();        // sort g_data
        display();          // show the sorted elements
    }
The workforce consists of a bunch of detached threads. Each thread represents
a worker, implemented in the function void worker
. Since the number of
worker threads is fixed, overpopulation doesn't occur. Once the array has been
sorted and the program stops these detached threads simply end:
    for (size_t idx = 0; idx != g_sizeofWorkforce; ++idx)
        thread(worker).detach();
The scheduler continues for as long as there are sub-arrays to sort. When this is the case the task queue's front element is moved to the work queue. This reduces the task queue's size, and prepares an assignment for the next available worker. The scheduler now waits until a worker is available. Once workers are available one of them is informed of the waiting assignment, and the scheduler waits for the next task:
    void scheduler()
    {
        while (newTask())
        {
            g_workQ.rawPushFront(g_taskQ);

            g_workforce.wait();     // wait for a worker to be available
            g_worker.notify_all();  // activate a worker
        }
    }
The function newTask
simply checks whether the task queue is empty. If so,
and none of the workers is currently busy sorting a sub-array then the array
has been sorted, and newTask
can return false
. When the task queue is
empty but a worker is still busy, it may be that new sub-array dimensions are
going to be placed on the task queue by an active worker. Whenever a worker is
active the Semaphore g_workforce's
size is less than the size of the work
force:
    bool wip()
    {
        return g_workforce.size() != g_sizeofWorkforce;
    }
    bool newTask()
    {
        bool done;

        unique_lock<mutex> lk(g_taskMutex);
        while ((done = g_taskQ.empty()) && wip())
            g_taskCondition.wait(lk);

        return not done;
    }
Each detached worker thread performs a continuous loop. In the loop it waits for a notification by the scheduler. Once it receives a notification it retrieves its assignment from the work queue, and partitions the sub-array specified in its assignment. Partitioning may result in new tasks. Once this has been completed the worker has completed its assignment: it increments the available workforce and notifies the scheduler that it should check whether all tasks have been performed:
    void worker()
    {
        while (true)
        {
            g_worker.wait();        // wait for action
            partition(g_workQ.popFront());
            g_workforce.notify_all();

            lock_guard<mutex> lk(g_taskMutex);
            g_taskCondition.notify_one();
        }
    }
Sub-arrays smaller than two elements need no partitioning. All larger
sub-arrays are partitioned relative to their first element. The
std::partition
generic algorithm does this well, but if the pivot is
itself an element of the array to partition then the pivot's eventual location
is undetermined: it may be found anywhere in the series of elements which are
at least equal to the pivot. The two required sub-arrays, however, can easily
be constructed:
- call std::partition relative to the array's first element (the pivot),
  partitioning the array's remaining elements; it returns mid,
  pointing to the first element of the series of elements that are at
  least as large as the array's first element;
- swap the pivot element and the element to which mid - 1 points;
- the two sub-arrays then run from array.begin() to
  mid - 1 (elements all smaller than the pivot), and from mid to
  array.end() (elements all at least as large as the pivot).
    void partition(Pair const &range)
    {
        if (range.second - range.first < 2)
            return;

        auto rhsBegin = partition(range.first + 1, range.second,
                                  bind2nd(less<int>(), *range.first));
        auto lhsEnd = rhsBegin - 1;

        swap(*range.first, *lhsEnd);

        pushTask(range.first, lhsEnd);
        pushTask(rhsBegin, range.second);
    }
Objects that contain such shared states are called
asynchronous return objects. However,
due to the nature of multi threading, a thread may request the results of an
asynchronous return object before these results are actually available. In
those cases the requesting thread blocks, waiting for the results to become
available. Asynchronous return objects offer wait
and get
members
which, respectively, wait until the results have become available, and
produce the asynchronous results once they are available. The phrase that
is used to indicate that the results are available is `the shared state has
been made ready'.
Shared states are made ready by asynchronous providers. Asynchronous providers are simply objects or functions providing results to shared states. Making a shared state ready means that an asynchronous provider
stores its result (a value or an exception) in the shared state, thereby unblocking threads that were waiting for it (allowing their blocking members, like wait, to return).
Once a shared state has been made ready it contains a value, object, or exception which can be retrieved by objects having access to the shared state. While code is waiting for a shared state to become ready the value or exception that is going to be stored in the shared state may be computed. When multiple threads try to access the same shared state they must use synchronizing mechanisms (like mutexes, cf. section 20.3) to prevent access-conflicts.
Shared states use reference counting to keep track of the number of asynchronous return objects or asynchronous providers that hold references to them. These return objects and providers may release their references to these shared states (which is called `releasing the shared state'). When a return object or provider holds the last reference to the shared state, releasing that reference destroys the shared state.
On the other hand, an asynchronous provider may also abandon its shared state. In that case the provider, in sequence,
- stores an exception of the type std::future_error, holding the error condition
  std::broken_promise, in its shared state;
- makes its shared state ready;
- releases its shared state.
Objects of the class std::future
(see the next section) are asynchronous
return objects. They can be produced by the std::async
(section
20.11) family of functions, and by objects of the classes
std::packaged_task
(section 20.12), and std::promise
(section
20.13).
join
member.
Waiting may be unwelcome: instead of just waiting our thread might also be doing something useful. It might as well pick up the results produced by a sub-thread at some point in the future.
In fact, exchanging data among threads always poses some difficulties, as it
requires shared variables, and the use of locks and mutexes to prevent data
corruption. Rather than waiting and using locks it would be nice if some
asynchronous task could be started, allowing the initiating thread (or even
other threads) to pick up the result at some point in the future, when the
results are needed, without having to worry about data locks or waiting times.
For situations like these C++ provides the class std::future
.
Before using the class std::future
the <future>
header file
must be included.
Objects of the class template std::future
harbor the results
produced by asynchronously executed tasks. Its template type parameter
specifies the type of the result
returned by the asynchronously executed task. This type may be void
.
On the other hand, the asynchronously executed task may throw an exception
(ending the task). In that case the future
object catches the exception,
and rethrows it once its return value (i.e., the value returned by the
asynchronously executed task) is requested.
In this section the members of the class template future
are
described. Future
objects are commonly initialized through anonymous
future
objects returned by the factory function std::async
or by the
get_future
members of the classes std::promise
, and
std::packaged_task
(introduced in upcoming sections). Examples of the use
of std::future
objects are provided in those sections.
Some of future
's members return a value of the strongly typed
enumeration std::future_status
. This enumeration defines
three symbolic constants: future_status::ready, future_status::timeout,
and future_status::deferred
.
Error conditions are returned through std::future_error
exceptions. These error conditions are represented by the values of the
strongly typed enumeration std::future_errc
(covered in the next section).
The class future
itself provides the following constructors:
future()
:The default constructor constructs a future object that does not refer to shared results. Its valid member returns false.
future(future &&tmp) noexcept
:The move constructor is available. Its valid member returns what tmp.valid() would have returned prior to the constructor invocation. After calling the move constructor tmp.valid() returns false.
future
does not offer a copy constructor or an overloaded
assignment operator.
Here are the members of the class std::future
:
future &operator=(future &&tmp)
:The move assignment operator grabs the information from the tmp object; following this, tmp.valid() returns false.
std::shared_future<ResultType> share() &&
:Returns a std::shared_future<ResultType> (see section 20.10). After calling this function, the future's valid member returns false.
ResultType get()
:First wait (see below) is called. Once wait has returned, the results produced by the associated asynchronous task are returned. With future<Type> specifications the returned value is the moved shared value if Type supports move assignment, otherwise a copy is returned. With future<Type &> specifications a Type & is returned; with future<void> specifications nothing is returned. If the shared value is an exception, it is thrown instead of returned. After calling this member the future object's valid member returns false.
bool valid() const
:Returns true if the (future) object for which valid is called refers to an object returned by an asynchronous task. If valid returns false, the future object exists, but in addition to valid only its destructor and move constructor can safely be called. When other members are called while valid returns false a std::future_error exception is thrown (having the value future_errc::no_state).
void wait() const
:The thread is blocked until the results produced by the associated asynchronous task are available.
std::future_status wait_for(chrono::duration<Rep, Period> const &rel_time) const
:This member template derives the template types Rep and Period from the actually specified duration (cf. section 20.1.2). If the results contain a deferred function nothing happens. Otherwise wait_for blocks until the results are available or until the amount of time specified by rel_time has expired. Possible return values are:
future_status::deferred if the results contain a deferred function;
future_status::ready if the results are available;
future_status::timeout if the function is returning because the amount of time specified by rel_time has expired.
future_status wait_until(chrono::time_point<Clock, Duration> const &abs_time) const
:This member template derives the template types Clock and Duration from the actually specified abs_time (cf. section 20.1.4). If the results contain a deferred function nothing happens. Otherwise wait_until blocks until the results are available or until the point in time specified by abs_time has passed. Possible return values are:
future_status::deferred if the results contain a deferred function;
future_status::ready if the results are available;
future_status::timeout if the function is returning because the point in time specified by abs_time has passed.
std::future<ResultType>
declares the following friends:
std::promise<ResultType> (cf. section 20.13), and

template <typename Function, typename ...Args>
std::future<typename result_of<Function(Args...)>::type>
std::async(std::launch, Function &&fun, Args &&...args);

(cf. section 20.11).
std::future
may return errors by throwing
std::future_error
exceptions. These error conditions are represented by
the values of the strongly typed enumeration
std::future_errc
which defines the following symbolic
constants:
broken_promise
Broken_promise is thrown when a future object was received whose value was never assigned by a promise or packaged_task. For example, an object of the class promise<int> should set the value of the future<int> object returned by its get_future member (cf. section 20.13), but if it doesn't do so, then a broken_promise exception is thrown, as illustrated by the following program:

 1: std::future<int> fun()
 2: {
 3:     return std::promise<int>().get_future();
 4: }
 5:
 6: int main()
 7: try
 8: {
 9:     fun().get();
10: }
11: catch (std::exception const &exc)
12: {
13:     std::cerr << exc.what() << '\n';
14: }

At line 3 a promise object is created, but its value is never set. Consequently, it `breaks its promise' to produce a value: when main tries to retrieve its value (in line 9) a std::future_error exception is thrown containing the future_errc::broken_promise value.
future_already_retrieved
Future_already_retrieved is thrown when multiple attempts are made to retrieve the future object from, e.g., a promise or packaged_task object that (eventually) should be ready. For example:

1: int main()
2: {
3:     std::promise<int> promise;
4:     promise.get_future();
5:     promise.get_future();
6: }

Note that after defining the std::promise object in line 3 it has merely been defined: no value is ever assigned to its future. Even though no value is assigned to the future object, it is a valid object. I.e., after some time the future should be ready, and the future's get member should produce a value. Hence, line 4 succeeds, but then, in line 5, the exception is thrown as `the future has already been retrieved'.
promise_already_satisfied
Promise_already_satisfied is thrown when multiple attempts are made to assign a value to a promise object. Assigning a value or exception_ptr to the future of a promise object may happen only once. For example:

1: int main()
2: {
3:     std::promise<int> promise;
4:     promise.set_value(15);
5:     promise.set_value(155);
6: }
no_state
No_state is thrown when a member function (other than valid) of a future object is called when its valid member returns false. This happens, e.g., when calling members of a default constructed future object. No_state is not thrown for future objects returned by the async factory function or returned by the get_future members of promise or packaged_task type of objects. Here is an example:

1: int main()
2: {
3:     std::future<int> fut;
4:     fut.get();
5: }
The class std::future_error
is derived from the class
std::exception
, and in addition to the char const *what() const
member it also offers the member std::error_code const &code() const
,
returning an std::error_code
object associated
with the thrown exception.
When a function is called asynchronously (e.g., using std::async) then
the return value of the asynchronously called function becomes available in
its activating thread through a std::future
object. The
future
object cannot be used by another thread. If this is required (e.g.,
see this chapter's final section) the future
object must be converted to a
std::shared_future
object.
Before using the class std::shared_future
the <future>
header file
must be included.
Once a shared_future
object is available, its get
member (see below)
can repeatedly be called to retrieve the results of the original future
object. This is illustrated by the next small example:
 1: int main()
 2: {
 3:     std::promise<int> promise;
 4:     promise.set_value(15);
 5:
 6:     auto fut = promise.get_future();
 7:     auto shared1 = fut.share();
 8:
 9:     std::cerr << "Result: " << shared1.get() << '\n'
10:               << "Result: " << shared1.get() << '\n'
11:               << "Valid: " << fut.valid() << '\n';
12:
13:     auto shared2 = fut.share();
14:
15:     std::cerr << "Result: " << shared2.get() << '\n'
16:               << "Result: " << shared2.get() << '\n';
17: }
In lines 9 and 10 the promise's
results are retrieved multiple times,
but having obtained the shared_future
in line 7, the original future
object no longer has an associated shared state. Therefore, when another
attempt is made (in line 13) to obtain the shared_future
, a no
associated state exception is thrown and the program aborts.
However, multiple copies of shared_future
objects may co-exist. When
multiple copies of shared_future
objects exist (e.g. in different
threads), the results of the associated asynchronous task are made ready
(become available) at exactly the same moment in time.
The relationship between the classes future
and shared_future
resembles the relationship between the classes unique_ptr
and
shared_ptr
: there can only be one instance of a unique_pointer
,
pointing to data, whereas there can be many instances of a shared_pointer
,
each pointing to the same data.
The effect of calling any member of a shared_future
object for which
valid() == false
other than the destructor, the move-assignment operator,
or valid
is undefined.
The class shared_future
supports the following constructors:
shared_future() noexcept
:An empty shared_future object is constructed that does not refer to shared results. After using this constructor the object's valid member returns false.
shared_future(shared_future const &other)
:A shared_future object is constructed that refers to the same results as other (if any). After using this constructor the object's valid member returns the same value as other.valid().
shared_future(shared_future<Result> &&tmp) noexcept
:Move constructs a shared_future object that refers to the results that were originally referred to by tmp (if any). After using this constructor the object's valid member returns the same value as tmp.valid() returned prior to the constructor invocation, and tmp.valid() returns false.
shared_future(future<Result> &&tmp) noexcept
:Move constructs a shared_future object that refers to the results that were originally referred to by tmp (if any). After using this constructor the object's valid member returns the same value as tmp.valid() returned prior to the constructor invocation, and tmp.valid() returns false.
The class's destructor destroys the shared_future
object for which it is
called. If the object for which the destructor is called is the last
shared_future
object, and no std::promise
or
std::packaged_task
is associated with the results associated
with the current object, then the results are also destroyed.
Here are the members of the class std::shared_future
:
shared_future &operator=(shared_future &&tmp)
:The move assignment operator releases the current object's shared results, and move assigns tmp's results to the current object. After calling the move assignment operator the current object's valid member returns the same value as tmp.valid() returned prior to the invocation of the move assignment operator, and tmp.valid() returns false;
shared_future &operator=(shared_future const &rhs)
:The assignment operator releases the current object's shared results, and rhs's results are shared with the current object. After calling the assignment operator the current object's valid member returns the same value as rhs.valid();
Result const &shared_future::get() const
:(Specializations for shared_future<Result &> and shared_future<void> are also available.) This member waits until the shared results are available, and subsequently returns a Result const &. Note that access to the data stored in Result through get is not synchronized. It is the responsibility of the programmer to avoid race conditions when accessing Result's data. If Result holds an exception, it is thrown when get is called;
bool valid() const
:Returns true if the current object refers to shared results;
void wait() const
:Blocks until shared results are available (i.e., the associated asynchronous task has produced results);
future_status wait_for(chrono::duration<Rep, Period> const &rel_time) const
:(The template types Rep and Period normally are derived by the compiler from the actual rel_time specification.) If the shared results contain a deferred function (cf. section 20.11) nothing happens. Otherwise wait_for blocks until the associated asynchronous task has produced results, or until the relative time specified by rel_time has expired. The member returns
future_status::deferred if the shared results contain a deferred function;
future_status::ready if the shared results are available;
future_status::timeout if the function is returning because the amount of time specified by rel_time has expired;
future_status wait_until(chrono::time_point<Clock, Duration> const &abs_time) const
:(The template types Clock and Duration normally are derived by the compiler from the actual abs_time specification.) If the shared results contain a deferred function nothing happens. Otherwise wait_until blocks until the shared results are available or until the point in time specified by abs_time has passed. Possible return values are:
future_status::deferred if the shared results contain a deferred function;
future_status::ready if the shared results are available;
future_status::timeout if the function is returning because the point in time specified by abs_time has passed.
In this section the function template std::async is
covered. Async is used to start asynchronous tasks, returning values (or
is used to start asynchronous tasks, returning values (or
void
) to the calling thread, which is hard to realize merely using the
std::thread
class.
Before using the function async
the <future>
header file must be
included.
When starting a thread using the facilities of the class std::thread
the
initiating thread at some point commonly calls the thread's join
method. At that point the thread must have finished or execution blocks until
join
returns. While this often is a sensible course of action, it may not
always be: maybe the function implementing the thread has a return value, or
it could throw an exception.
In those cases join
cannot be used: if an exception leaves a thread, then
your program ends. Here is an example:
 1: void thrower()
 2: {
 3:     throw std::exception();
 4: }
 5:
 6: int main()
 7: try
 8: {
 9:     std::thread subThread(thrower);
10: }
11: catch (...)
12: {
13:     std::cerr << "Caught exception\n";
14: }
In line 3 thrower
throws an exception, leaving the thread. This
exception is not caught by main
's try-block (as it is defined in another
thread). As a consequence, the program terminates.
This scenario doesn't occur when std::async
is used. Async
may start a
new asynchronous task, and the activating thread may retrieve the return value
of the function implementing the asynchronous task or any exception leaving
that function from a std::future
object returned by the async
function. Basically, async
is called similarly to the way a thread is
started using std::thread
: it is passed a function and optionally
arguments which are forwarded to the function.
Although the function implementing the asynchronous task may be passed as
first argument, async's
first argument may also be a value of the strongly
typed enumeration std::launch
:
enum class launch { async, deferred };

When passing
launch::async
the asynchronous task immediately starts;
when passing launch::deferred
the asynchronous task is deferred. When
std::launch
is not specified the default value launch::async |
launch::deferred
is used, giving the implementation freedom of choice,
usually resulting in deferring execution of the asynchronous task.
So, here is the first example again, this time using async
to start the
sub-thread:
 1: bool fun()
 2: {
 3:     return bool(std::cerr << " hello from fun\n");
 4: }
 5: int exceptionalFun()
 6: {
 7:     throw std::exception();
 8: }
 9:
10: int main()
11: try
12: {
13:     auto fut1 = std::async(std::launch::async, fun);
14:     auto fut2 = std::async(std::launch::async, exceptionalFun);
15:
16:     std::cerr << "fun returned " << std::boolalpha << fut1.get() << '\n';
17:     std::cerr << "exceptionalFun did not return " << fut2.get() << '\n';
18: }
19: catch (...)
20: {
21:     std::cerr << "caught exception thrown by exceptionalFun\n";
22: }
Now the threads immediately start; although the second thread's exception may already have been thrown around line 14, it doesn't terminate the program. The first thread's return value is made available in line 16; the exception thrown by the second thread is simply caught by main's try-block (line 19).
The function template async
has several overloaded versions:

A basic version expects a function and the arguments to forward to that function, returning a
std::future
holding the function's return value or exception
thrown by the function:

template <typename Function, class ...Args>
std::future<
    typename std::result_of< Function(Args ...) >::type
>
std::async(Function &&fun, Args &&...args);
Another version additionally expects, as its first argument, a value (or a combination, using the
bit_or
operator, of values) of the std::launch
enumeration:

template <class Function, class ...Args>
std::future<typename std::result_of<Function(Args ...)>::type>
std::async(std::launch policy, Function &&fun, Args &&...args);
In addition to the (optional) std::launch
values, the second
argument may also be the address of a member function. In that case the
(required) third argument is an object (or a pointer to an object) of that
member function's class. Any remaining arguments are passed to the member
function (see also the remarks below).
When calling async
all arguments except for the std::launch
argument must be references, pointers or move-constructible objects:

if an lvalue argument is passed to the async
function template then
copy construction is used to construct a copy of the argument which is
then forwarded to the thread-launcher;

if an anonymous (rvalue) argument is passed to the async
function template
then move construction is used to forward the anonymous object to the
thread launcher.

The function passed to async
may not start immediately: its execution may
be deferred until its results are requested (e.g., when the future's get
member is
called).
Because of the default std::launch::deferred | std::launch::async
argument
used by the basic async
call it is likely that the function which is
passed to async
doesn't immediately start. The launch::deferred
policy
allows the implementor to defer its execution until the program explicitly
asks for the function's results. Consider the following program:
 1: void fun()
 2: {
 3:     std::cerr << " hello from fun\n";
 4: }
 5:
 6: std::future<void> asyncCall(char const *label)
 7: {
 8:     std::cerr << label << " async call starts\n";
 9:     auto ret = std::async(fun);
10:     std::cerr << label << " async call ends\n";
11:     return ret;
12: }
13:
14: int main()
15: {
16:     asyncCall("First");
17:     asyncCall("Second");
18: }
Although async
is called in line 9, the program's output may not show
fun's
output line when it is run. This is a result of the (default) use
of launch::deferred
: the system simply defers fun's
execution until
requested, which doesn't happen. But the future
object that's returned by
async
has a member wait
. Once wait
returns the shared state must be
available. In other words: fun
must have finished. Here is what happens
when after line 9 the line ret.wait()
is inserted:
First async call starts
 hello from fun
First async call ends
Second async call starts
 hello from fun
Second async call ends

Actually, evaluation of
fun
can be requested at the point where we
need its results, maybe even after calling asyncCall
, as shown in the next
example:
1: int main()
2: {
3:     auto ret1 = asyncCall("First");
4:     auto ret2 = asyncCall("Second");
5:
6:     ret1.get();
7:     ret2.get();
8: }
Here the ret1
and ret2 std::future
objects are created, but their
fun
functions aren't evaluated yet. Evaluation occurs at lines 6 and 7,
resulting in the following output:
First async call starts
First async call ends
Second async call starts
Second async call ends
 hello from fun
 hello from fun
The std::async
function template is used to start a thread, making its
results available to the calling thread. On the other hand, we may only be
able to prepare (package) a task (a thread), but may have to leave the
completion of the task to another thread. Scenarios like this are realized
through objects of the class std::packaged_task
, which is the topic of the
next section.
std::packaged_task
allows a program to
`package' a function or functor and pass the package to a thread for further
processing. The processing thread then calls the packaged function, passing it
its arguments (if any). After completing the function the packaged_task's
future is ready, allowing the program to retrieve the results produced by
the function. Thus, functions and the results of function calls can be
transferred between threads.
Before using the class template packaged_task
the <future>
header file
must be included.
Before describing the class's interface, let's first look at an example to get
an idea about how a packaged_task
can be used. Remember that the essence
of packaged_task
is that part of your program prepares (packages) a task
for another thread to complete, and that the program at some point needs the
result of the completed task.
To clarify what's happening here, let's first look at a real-life
analogy. Every now and then I make an appointment with my garage to have my
car serviced. The `package' in this case are the details about my car: its
make and type determine the kind of actions my garage performs when servicing
it. My neighbor also has a car, which also needs to be serviced every now and
then. This also results in a `package' for the garage. At the appropriate time
my neighbor and I take our cars to the garage (i.e., the packages are passed
to another thread). The garage services the cars (i.e., calls the functions
stored in the packaged_tasks
[note that the tasks differ, depending on the
types of the cars]), and performs some actions that are associated with it
(e.g., registering that my or my neighbor's car has been serviced, or
ordering replacement parts). In the meantime my neighbor and I go about our
own business (the program continues while a separate thread runs as well). But
by the end of the day we'd like to use our cars again (i.e., get the results
associated with the packaged_task
). A common result in this example is the
garage's bill, which we have to pay (the program obtains the
packaged_task's
results).
Here is a little C++ program illustrating the use of a packaged_task
(assuming the required headers and using namespace std
have been
specified):
 1: mutex carDetailsMutex;
 2: condition_variable condition;
 3: string carDetails;
 4: packaged_task<size_t (std::string const &)> serviceTask;
 5:
 6: size_t volkswagen(string const &type)
 7: {
 8:     cout << "performing maintenance by the book for a " << type << '\n';
 9:     return type.size() * 75;            // the size of the bill
10: }
11:
12: size_t peugeot(string const &type)
13: {
14:     cout << "performing quick and dirty maintenance for a " << type << '\n';
15:     return type.size() * 50;            // the size of the bill
16: }
17:
18: void garage()
19: {
20:     while (true)
21:     {
22:         unique_lock<mutex> lk(carDetailsMutex);
23:         while (carDetails.empty())
24:             condition.wait(lk);
25:
26:         cout << "servicing a " << carDetails << '\n';
27:         serviceTask(carDetails);
28:         carDetails.clear();
29:     }
30: }
31:
32: int main()
33: {
34:     thread(garage).detach();
35:
36:     while (true)
37:     {
38:         string car;
39:         if (not getline(cin, car) || car.empty())
40:             break;
41:         {
42:             lock_guard<mutex> lk(carDetailsMutex);
43:             carDetails = car;
44:         }
45:         serviceTask = packaged_task<size_t (string const &)>(
46:                         car[0] == 'v' ? volkswagen : peugeot
47:                       );
48:         auto bill = serviceTask.get_future();
49:         condition.notify_one();
50:         cout << "Bill for servicing a " << car <<
51:                 ": EUR " << bill.get() << '\n';
52:     }
53: }
At line 4 the program defines a packaged_task: serviceTask
is initialized with a
function (or functor) expecting a string
, returning a size_t
;
The functions volkswagen
and
peugeot
(lines 6 through 16) represent the tasks to perform when cars of the provided
types come in for service; they return the bill.
Lines 18 through 30 define the function garage, defining the actions
performed by the garage when cars come in for service. These actions
are performed by a separate detached thread, starting in line 34. In a
continuous loop it waits until it obtains a lock on the
carDetailsMutex
and carDetails
is no longer empty. Then, at
line 27, it passes carDetails
to the packaged_task
serviceTask
. By itself this is not identical to calling the
packaged_task's
function, but eventually its function will be
called. At this point the packaged_task
receives its function's
arguments, which it eventually will forward to its configured
function. Finally, at line 28 it clears carDetails
, thus preparing
itself for the next request.
In main
:

at line 34 a detached thread running garage
is started;

once the details of a car have been read (lines 38 through 44), a
packaged_task
, provided with
the right servicing function, is constructed
next (line 45);

then the results of the packaged task, a future
, are
retrieved (line 48). Although at this point the future
might not be
ready, the future
object itself is, and it is simply
used as the bill;

the bill's value is requested by calling bill.get()
in line 51. If, by this time, the car is still being serviced, the
bill isn't ready yet, and bill.get()
blocks until it is, after which
the bill for servicing the car is shown.
Having seen an example of using a packaged_task
,
let's have a look at its interface. Note that the class packaged_task
is a
is a
class template: its template type parameter specifies the prototype of a
function or function object implementing the task performed by the
packaged_task
object.
Constructors and destructor:
packaged_task() noexcept
:The default constructor constructs a packaged_task object which is not associated with a function or shared state;
explicit packaged_task(ReturnType (Args...) &&function)
:A packaged_task is constructed for a function or functor expecting arguments of types Args..., and returning a value of type ReturnType. The packaged_task class template specifies ReturnType (Args...) as its template type parameter. The constructed object contains a shared state, and a (move constructed) copy of function.

Optionally an Allocator may be specified as second template type parameter, in which case the first two arguments are std::allocator_arg_t, Allocator const &alloc. The type std::allocator_arg_t is a type introduced to disambiguate constructor selections, and can simply be specified as std::allocator_arg_t(). 

This constructor may throw a std::bad_alloc exception or exceptions thrown by function's copy or move constructors.
packaged_task(packaged_task &&tmp) noexcept
:The move constructor moves any existing shared state from tmp to the newly constructed object, removing the shared state from tmp.
~packaged_task()
:The object's shared state (if any) is abandoned.
Member functions:
future<ReturnType> get_future()
:A std::future object sharing the current object's shared state is returned. A future_error exception is thrown upon error, containing
future_already_retrieved if get_future was already called on a packaged_task object containing the same shared state as the current object;
no_state if the current object has no shared state.
Note: any futures that share the object's shared state may access the result returned by the object's task.
void make_ready_at_thread_exit(Args... args)
:Calls void operator()(Args... args)
(see below) when the current
thread exits, once all objects of thread storage duration associated
with the current thread have been destroyed.
packaged_task &operator=(packaged_task &&tmp)
:The move assignment operator first releases the current object's
shared state (if available), after which the current object and
tmp
are swapped;
void operator()(Args ...args)
:The args arguments are forwarded to the current object's stored task. When the stored task returns, its return value is stored in the current object's shared state. Otherwise any exception thrown by the task is stored in the object's shared state. Following this the object's shared state is made ready, and any threads blocked in a function waiting for the object's shared state to become ready are unblocked. A future_error exception is thrown upon error, containing
promise_already_satisfied if the shared state has already been made ready;
no_state if the current object does not have any shared state.
Calling this member synchronizes with calling any member function of a (shared_)future object that provides access to the packaged_task's results.
void reset()
:Abandons any available shared state, initializing the current object to packaged_task(std::move(funct)), where funct is the object's stored task. This member may throw the following exceptions:
bad_alloc if memory for the new shared state could not be allocated;
any exception thrown by the move constructor of the task stored in the shared state;
future_error with a no_state error condition if the current object contains no shared state.
void swap(packaged_task &other) noexcept
:The shared states and stored tasks of the current object and other are swapped.
bool valid() const noexcept
:Returns true if the current object contains a shared state, otherwise false is returned;
The following non-member (free) function operating on packaged_task
objects is available:
void swap(packaged_task<ReturnType(Args...)> &lhs,
packaged_task<ReturnType(Args...)> &rhs) noexcept
Calls lhs.swap(rhs).
Like std::packaged_task
and std::async
, the class template
std::promise
can be used to obtain the results from a
separate thread.
Before using the class template promise
the <future>
header file
must be included.
A promise
is useful to obtain the results from another thread without
further synchronization requirements. Consider the following program:
void compute(int *ret)
{
    *ret = 9;
}

int main()
{
    int ret = 0;
    std::thread(compute, &ret).detach();
    cout << ret << '\n';
}
Chances are that this program shows the value 0: the cout
statement is
already executed before the detached thread has had a chance to complete its
work. In this example that problem can easily be solved by using a
non-detached thread, and using the thread's join
member, but when multiple
threads are used that requires named threads and as many join
calls. Instead, using a promise
might be preferred:
void compute1(promise<int> &ref)
{
    ref.set_value(9);
}

int main()
{
    std::promise<int> p;
    std::thread(compute1, ref(p)).detach();

    cout << p.get_future().get() << '\n';
}
In this example again a detached thread is used, but its results are kept
for future reference in a promise
object, instead of directly being
assigned to a final destination variable. The promise
object contains a
future
object holding the computed value. The future's get
member
blocks until the future has been made ready, at which point the result becomes
available. By then the detached thread may or may not yet have been
completed. If it already completed its work then get
immediately returns,
otherwise there will be a slight delay.
Promises can be useful when implementing a multi threaded version of some
algorithm without having to use additional synchronization statements. As an
example consider matrix multiplications. Each element of the resulting
product matrix is computed as the inner product of two vectors: the inner
product of a row of the left-hand matrix operand and a column of the
right-hand matrix operand becomes element [row][column]
of the resulting
matrix. Since each element of the resulting matrix can be computed
independently of the other elements, a multi-threaded implementation is well
possible. In the following example the function innerProduct
(lines 4..11)
leaves its result in a promise
object:
     1: int m1[2][2] = {{1, 2}, {3, 4}};
     2: int m2[2][2] = {{3, 4}, {5, 6}};
     3: 
     4: void innerProduct(promise<int> &ref, int row, int col)
     5: {
     6:     int sum = 0;
     7:     for (int idx = 0; idx != 2; ++idx)
     8:         sum += m1[row][idx] * m2[idx][col];
     9: 
    10:     ref.set_value(sum);
    11: }
    12: 
    13: int main()
    14: {
    15:     promise<int> result[2][2];
    16: 
    17:     for (int row = 0; row != 2; ++row)
    18:     {
    19:         for (int col = 0; col != 2; ++col)
    20:             thread(innerProduct, ref(result[row][col]), row, col).detach();
    21:     }
    22: 
    23:     for (int row = 0; row != 2; ++row)
    24:     {
    25:         for (int col = 0; col != 2; ++col)
    26:             cout << setw(3) << result[row][col].get_future().get();
    27:         cout << '\n';
    28:     }
    29: }
Each inner product is computed by a separate (anonymous and detached)
thread (lines 17..21), which starts as soon as the run-time system allows it
to start. By the time the threads have finished the resulting inner products
can be retrieved from the promises' futures. Since futures' get
members
block until their results are actually available, the resulting matrix can
simply be displayed by calling those members in sequence (lines 23..28).
So, a promise
allows us to use a thread to compute a value (or
exception, see below), which value may then be collected by another thread at
some future point in time. The promise remains available, and as a consequence
further synchronization of the threads and the program starting the threads is
not necessary. When the promise object contains an exception, rather than a
value, its future's get
member rethrows the stored exception.
Here is the class promise's
interface. Note that the class promise
is
a class template: its template type parameter ReturnType
specifies the
template type parameter of the std::future
that can be retrieved from it.
Constructors and destructor:
promise()
:The default constructor constructs a promise object containing a shared
state. The shared state may be returned by the member get_future
(see below), but that future has not yet been made ready;
promise(promise &&tmp) noexcept
:The move constructor constructs a promise object, transferring the
ownership of tmp's shared state to the newly constructed object. After the
object has been constructed, tmp no longer contains a shared state;
~promise()
:The object's shared state (if any) is abandoned;
Member functions:
std::future<ReturnType> get_future()
:A std::future object sharing the current object's shared state is
returned. A future_error exception is thrown upon error, containing
future_already_retrieved if get_future was already called on a promise
object containing the same shared state as the current object; no_state
if the current object has no shared state. Note: any futures that share the
object's shared state may access the result returned by the object's task;
promise &operator=(promise &&rhs) noexcept
:The move assignment operator first releases the current object's
shared state (if available), after which the current object and
rhs are swapped;
void promise<void>::set_value()
:See below, at the last set_value
member's description;
void set_value(ReturnType &&value)
:See below, at the last set_value
member's description;
void set_value(ReturnType const &value)
:See the next member function's description;
void set_value(ReturnType &value)
:The argument (value) is atomically stored in the shared state, which is
then also made ready. A future_error exception is thrown upon error,
containing promise_already_satisfied if the shared state has already been
made ready; no_state if the current object does not have any shared
state. Alternatively, any exception thrown by value's move or copy
constructor may be thrown;
void set_exception(std::exception_ptr obj)
:The exception pointer obj (cf. section 20.13.1) is atomically stored in the
shared state, making that state ready. A future_error exception is thrown
upon error, containing promise_already_satisfied if the shared state has
already been made ready; no_state if the current object does not have any
shared state;
void set_exception_at_thread_exit(exception_ptr ptr)
:The exception pointer ptr is stored in the shared state without
immediately making that state ready. The state becomes ready when the
current thread exits, once all objects of thread storage duration which are
associated with the ending thread have been destroyed. A future_error
exception is thrown upon error, containing promise_already_satisfied if the
shared state has already been made ready; no_state if the current object
does not have any shared state;
void set_value_at_thread_exit()
:See below, at the last set_value_at_thread_exit
member's
description;
void set_value_at_thread_exit(ReturnType &&value)
:See below, at the last set_value_at_thread_exit
member's
description;
void set_value_at_thread_exit(ReturnType const &value)
:See the next set_value_at_thread_exit
member's
description;
void set_value_at_thread_exit(ReturnType &value)
:Stores value in the shared state without immediately making that state
ready. The state becomes ready when the current thread exits, once all
objects of thread storage duration which are associated with the ending
thread have been destroyed. A future_error exception is thrown upon error,
containing promise_already_satisfied if the shared state has already been
made ready; no_state if the current object does not have any shared state;
void swap(promise& other) noexcept
:The shared states (if any) of the current object and other
are
exchanged.
The following non-member (free) function operating on promise
objects is
available:
void swap(promise<ReturnType> &lhs, promise<ReturnType> &rhs)
noexcept
:Calls lhs.swap(rhs)
Std::promise's member set_exception does not expect a std::exception
argument, but an object of the class std::exception_ptr. In this section we
take a closer look at the class exception_ptr.
Before using the class exception_ptr
the <future>
header file
must be included.
An exception_ptr's default constructor initializes it to a
null-pointer. In the following code snippet the variable isNull is set to
true:

    std::exception_ptr obj;
    bool isNull = obj == nullptr && obj == 0;

The class
exception_ptr
provides copy and move constructors as well as
copy and move assignment operators.
Two exception_ptr
objects can be compared for equality. They are equal
if they refer to the same exception. Move assignment transfers the exception
referred to by the right-hand side operand to the left-hand side operand, and
turns the right-hand side operand into a null pointer.
There is no published method directly retrieving the exception to which an
exception_ptr
object refers. However, there are some free functions
constructing or handling exception_ptr
objects:
std::exception_ptr std::current_exception() noexcept
:An exception_ptr object is returned referring to the currently handled
exception (or a copy of the currently handled exception, or a default
constructed exception_ptr object if no current exception is
available). This function can also be called when a default exception
catcher is used. E.g., assuming that obj refers to an available
std::promise object, then the following code snippet assigns the exception
caught by the default catch clause to obj:

    ...
    catch (...)
    {
        obj.set_exception(std::current_exception());
    }

The exception referred to by current_exception does not have to be an
object of the class std::exception. Any type of object or value thrown as
an exception is retrieved as an exception_ptr by current_exception. The
exception referred to by an exception_ptr object remains valid for at least
as long as there exists an exception_ptr object that refers to it. When
calling current_exception twice in a row the two returned exception_ptr
objects may or may not refer to the same exception object.
std::exception_ptr make_exception_ptr(Type value) noexcept
:This function template constructs an exception_ptr from a value of any
type which is passed as its argument. Type does not necessarily have to be
a std::exception. The constructed exception_ptr could, e.g., be assigned to
a std::promise. When the promise's future's get member is subsequently
called (possibly from within another thread) the exception will be
thrown. Here are some examples, showing how values of different types can
be passed as arguments to make_exception_ptr, and showing how the
eventually constructed exception_ptr is assigned to obj, which is assumed
to be a std::promise object:

    auto ptr = make_exception_ptr(exception());
    ptr = make_exception_ptr(string("hello world"));
    ptr = make_exception_ptr(12);

    obj.set_exception(ptr);
void std::rethrow_exception(exception_ptr obj)
:The exception to which obj refers is thrown. Note: obj cannot be a
nullptr.
The next example program distributes its work over a pool of worker
threads, handing out the work as packaged_tasks.
Like in the multi-threaded quicksort example a worker pool is used. However,
in this example the workers in fact do not know what their task is. In the
current example the tasks happen to be identical, but different tasks might
as well have been used, without the need to update the workers.
The program uses a class Task containing a command specification
(d_command) and a task specification (d_task) (cf. Figure 23; the sources
of the program are found in the C++ Annotations'
yo/threading/examples/multicompile directory).
Like before main
first starts its workforce as detached threads. Following
this, the compilation jobs are performed by yet another detached
thread. Eventually, the results of the compilation jobs are handled by
results
:
    int main()
    {
        workforce();            // start the worker threads
        thread(jobs).detach();  // start working on the jobs
        results();              // handle the results
    }
The jobs
-thread reads the names of the files to compile from the standard
input stream, and passes them on to dispatch
for further handling by the
workers. Lines 7 and 8 are executed at the end, and are a safeguard against
empty input. The safeguard is discussed below, at newResult's
description.
    1: void jobs()
    2: {
    3:     string line;
    4:     while (getline(cin, line) && dispatch(line))
    5:         ;
    6: 
    7:     g_ready = true;
    8:     g_resultCond.notify_one();
    9: }
Next the dispatcher. It ignores empty lines. Also, if a compilation has
failed by the time the dispatcher is called, processing stops (lines
6-7). Otherwise, the dispatcher waits for an available worker, prepares a
new task, and notifies a worker to handle it:
     1: bool dispatch(string const &line)
     2: {
     3:     if (line.empty())
     4:         return true;
     5: 
     6:     if (g_done.load())
     7:         return false;
     8: 
     9:     g_workforce.wait();
    10:     newTask(line);
    11:     g_worker.notify_all();
    12: 
    13:     return true;
    14: }
The function newTask
prepares the program for the next task. First a
Task
object is created. Task
contains the name of the file to compile,
and a packaged_task
. It encapsulates all activities that are associated
with a packaged_task
. Here is its (in-class) definition:
     1: typedef packaged_task<Result (string const &fname)> PackagedTask;
     2: 
     3: class Task
     4: {
     5:     string d_command;
     6:     PackagedTask d_task;
     7: 
     8:     public:
     9:         Task() = default;
    10: 
    11:         Task(string const &command, PackagedTask &&tmp)
    12:         :
    13:             d_command(command),
    14:             d_task(move(tmp))
    15:         {}
    16: 
    17:         void operator()()
    18:         {
    19:             d_task(d_command);
    20:         }
    21: 
    22:         shared_future<Result> result()
    23:         {
    24:             return d_task.get_future().share();
    25:         }
    26: };
Note (lines 22-25) that result
returns a shared_future
. Since the
dispatcher runs in a different thread than the one processing the results, the
futures
created by the dispatcher must be shared with the futures
required by the function processing the results. Hence the shared_futures
returned by Task::result.
Once a Task
object has been constructed its shared_future
is pushed on
the result queue. Although the actual results aren't available by this time,
the result
function (see below) is notified that something has been pushed
on its queue. Additionally, the Task
itself is pushed on the task queue,
where a worker may retrieve it:
    void pushResultQ(shared_future<Result> const &sharedResult)
    {
        lock_guard<mutex> lk(g_resultQMutex);
        g_resultQ.push(sharedResult);
        g_ready = true;
        g_resultCond.notify_one();
    }
The workers have a simple task: retrieve the next task from the task queue's front, then perform that task. Whatever happens inside the tasks themselves is of no concern: the worker thread eventually ends when the program ends:
    void worker()
    {
        Task task;
        while (true)
        {
            g_worker.wait();
            g_taskQ.popFront(task);
            task();
            g_workforce.notify_all();
        }
    }
This completes the description of how tasks are handled. The task itself
remains to be described. In the current example program source files are being
compiled. The compilation command is passed to the constructor of a
CmdFork
object, which starts the compiler as a child process. The result
of the compilation is retrieved via its childExit
(returning the
compiler's exit code) and childOutput
(returning any textual output
produced by the compiler) members. If compilation fails, the exit value won't
be zero. In this case no further compilation tasks will be initiated (lines 11
and 12; the implementation of the class CmdFork
is available from the
C++ Annotations' yo/threading/examples/cmdfork
directory). Here is the
function compile
:
     1: Result compile(string const &line)
     2: {
     3:     string command("/usr/bin/g++ -Wall --std=c++11 -c " + line);
     4: 
     5:     CmdFork cmdFork(command);
     6:     cmdFork.fork();
     7: 
     8:     Result ret {cmdFork.childExit() == 0,
     9:                 line + "\n" + cmdFork.childOutput()};
    10: 
    11:     if (not ret.ok)
    12:         g_done = true;
    13: 
    14:     return ret;
    15: }
The results function continues for as long as there are results. Once
results are available they are displayed. Whenever a compilation has failed
the Result's ok member is false and results ends:
    void results()
    {
        while (newResult())
        {
            Result const &result = g_resultQ.front().get();
            cerr << result.display;
            if (not result.ok)
                return;
            g_resultQ.pop();
        }
    }
The function newResult controls results' while-loop. It returns
true when there are results available in the result queue. When no
filenames are presented at the standard input stream or when no result has
as yet been pushed on the result queue newResult waits. It also waits when
there are no results available in the result queue, but at least one worker
thread is busy. Whenever a result is pushed on the result queue, and also
once the input stream has been processed, newResult is notified. Following
the notification newResult returns, and results either ends or shows the
results appearing at the result queue's front:
    bool newResult()
    {
        unique_lock<mutex> lk(g_resultQMutex);

        while (
            (
                g_resultQ.empty() &&
                g_workforce.size() != g_sizeofWorkforce
            )
            || not g_ready.load()
        )
            g_resultCond.wait(lk);

        return not g_resultQ.empty();
    }