Concurrency

This section discusses issues surrounding the proper compilation of multithreaded applications which use the Standard C++ library. This information is GCC-specific since the C++ standard does not address how multithreaded applications are compiled and linked.

Prerequisites

All normal disclaimers aside, multithreaded C++ applications are only supported when libstdc++ and all user code were built with compilers which report (via gcc/g++ -v) the same thread model and that model is not single. As long as your final application is actually single-threaded, it should be safe to mix user code built with a thread model of single with a libstdc++ and other C++ libraries built with another thread model useful on the platform. Other mixes may or may not work but are not considered supported. (Thus, if you distribute a shared C++ library in binary form only, it may be best to compile it with a GCC configured with --enable-threads for maximal interchangeability and usefulness with a user population that may have built GCC with either --enable-threads or --disable-threads.)

When you link a multithreaded application, you will probably need to add a library or flag to g++. This area of GCC is not standardized across ports. Some ports support a special flag (whose spelling is not standardized either) that adds all required macros to a compilation and makes the link-library additions and/or replacements at link time. If any such flag is required, you must provide it for all compilations, not just for linking. The documentation is weak. Here is a quick summary to show how ad hoc this is: On Solaris, both -pthreads and -threads (with subtly different meanings) are honored. On GNU/Linux x86, -pthread is honored. On FreeBSD, -pthread is honored. Some other ports use other switches. AFAIK, none of this is properly documented anywhere other than in "gcc -dumpspecs" (look at the lib and cpp entries).

Thread Safety

In the terms of the 2011 C++ standard a thread-safe program is one which does not perform any conflicting non-atomic operations on memory locations and so does not contain any data races. The standard places requirements on the library to ensure that no data races are caused by the library itself or by programs which use the library correctly (as described below). The C++11 memory model and library requirements are a more formal version of the SGI STL definition of thread safety, which the library used prior to the 2011 standard.

The library strives to be thread-safe when all of the following conditions are met:

  • The system's libc is itself thread-safe,

  • The compiler in use reports a thread model other than 'single'. This can be tested via output from gcc -v. Multi-thread capable versions of gcc output something like this:

    %gcc -v
    Using built-in specs.
    ...
    Thread model: posix
    gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)
    

    Look for "Thread model" lines that aren't equal to "single".

  • Requisite command-line flags are used for atomic operations and threading. Examples of this include -pthread and -march=native, although specifics vary depending on the host environment. See Machine Dependent Options.

  • An implementation of atomicity.h functions exists for the architecture in question. See the internals documentation for more details.

The user code must guard against concurrent function calls which access any particular library object's state when one or more of those accesses modifies the state. An object will be modified by invoking a non-const member function on it or passing it as a non-const argument to a library function. An object will not be modified by invoking a const member function on it or passing it to a function as a pointer- or reference-to-const. Typically, the application programmer may infer what object locks must be held based on the objects referenced in a function call and whether the objects are accessed as const or non-const. Without getting into great detail, here is an example which requires user-level locks:

     library_class_a shared_object_a;

     void thread_main () {
       library_class_b *object_b = new library_class_b;
       shared_object_a.add_b (object_b);   // must hold lock for shared_object_a
       shared_object_a.mutate ();          // must hold lock for shared_object_a
     }

     // Multiple copies of thread_main() are started in independent threads.
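
The locking pattern above can be made concrete with a standard mutex. This is only a minimal sketch: std::vector<int> stands in for library_class_a, and the names shared_values, shared_values_mutex, append_values and run_appenders are hypothetical, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for shared_object_a: a library object visible to several threads.
std::vector<int> shared_values;
// Hypothetical user-level lock that protects shared_values.
std::mutex shared_values_mutex;

// Thread body: every modifying call is made while holding the lock.
void append_values(std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) {
    // push_back is a non-const member function, so the lock must be held.
    std::lock_guard<std::mutex> guard(shared_values_mutex);
    shared_values.push_back(static_cast<int>(i));
  }
}

// Starts `threads` copies of append_values and returns the final size.
std::size_t run_appenders(std::size_t threads, std::size_t count) {
  {
    std::lock_guard<std::mutex> guard(shared_values_mutex);
    shared_values.clear();
  }
  std::vector<std::thread> pool;
  for (std::size_t t = 0; t < threads; ++t)
    pool.emplace_back(append_values, count);
  for (auto& worker : pool)
    worker.join();
  // All workers have been joined, so no lock is needed for this read.
  assert(shared_values.size() == threads * count);
  return shared_values.size();
}
```

Calling run_appenders(4, 1000) starts four threads that each append 1000 elements. Without the lock_guard in append_values the concurrent push_back calls would conflict and the program would contain a data race.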

Under the assumption that object_a and object_b are never exposed to another thread, here is an example that does not require any user-level locks:

     void thread_main () {
       library_class_a object_a;
       library_class_b *object_b = new library_class_b;
       object_a.add_b (object_b);
       object_a.mutate ();
     } 

All library types are safe to use in a multithreaded program if objects are not shared between threads or as long as each thread carefully locks out access by any other thread while it modifies any object visible to another thread. Unless otherwise documented, the only exceptions to these rules are atomic operations on the types in <atomic> and lock/unlock operations on the standard mutex types in <mutex>. These atomic operations allow concurrent accesses to the same object without introducing data races.
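
As an illustration of the <atomic> exception, the sketch below lets several threads modify one shared counter with no user-level lock. The names hits, record_hits and count_hits are hypothetical and exist only for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Shared counter; atomic operations on it may run concurrently.
std::atomic<int> hits{0};

// Thread body: each increment is a single atomic read-modify-write.
void record_hits(int count) {
  for (int i = 0; i < count; ++i)
    hits.fetch_add(1, std::memory_order_relaxed);
}

// Starts `threads` copies of record_hits and returns the final count.
int count_hits(int threads, int count) {
  hits.store(0);
  std::vector<std::thread> pool;
  for (int t = 0; t < threads; ++t)
    pool.emplace_back(record_hits, count);
  for (auto& worker : pool)
    worker.join();
  // join() orders every increment before this load.
  const int total = hits.load();
  assert(total == threads * count);
  return total;
}
```

If hits were a plain int, the same code would contain a data race and would need a mutex around each increment.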

The following member functions of standard containers can be considered to be const for the purposes of avoiding data races: begin, end, rbegin, rend, front, back, data, find, lower_bound, upper_bound, equal_range, at and, except in associative or unordered associative containers, operator[]. In other words, although they are non-const so that they can return mutable iterators, those member functions will not modify the container. Accessing an iterator might cause a non-modifying access to the container the iterator refers to (for example incrementing a list iterator must access the pointers between nodes, which are part of the container and so conflict with other accesses to the container).
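
A short sketch of what this rule permits: several threads call the non-const begin() and end() overloads on one shared std::vector, and nothing modifies the vector while they run. The function name readers_agree is hypothetical.

```cpp
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sums `values` from several threads at once and reports whether every
// thread computed the same total as a single-threaded pass.
bool readers_agree(std::vector<int>& values, std::size_t threads) {
  std::vector<long> sums(threads, 0);
  std::vector<std::thread> pool;
  for (std::size_t t = 0; t < threads; ++t) {
    pool.emplace_back([&values, &sums, t] {
      // `values` is non-const here, so these are the non-const begin()/end()
      // overloads; they still do not modify the container.
      // Each thread writes only its own element of `sums`.
      sums[t] = std::accumulate(values.begin(), values.end(), 0L);
    });
  }
  for (auto& worker : pool)
    worker.join();

  const long expected = std::accumulate(values.begin(), values.end(), 0L);
  for (long sum : sums)
    if (sum != expected)
      return false;
  return true;
}
```

The sketch stops being race-free as soon as any thread calls a genuinely modifying member such as push_back or clear while the readers are running. In that case user-level locking is required again.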

Programs which follow the rules above will not encounter data races in library code, even when using library types which share state between distinct objects. In the example below the shared_ptr objects share a reference count, but because the code does not perform any non-const operations on the globally-visible object, the library ensures that the reference count updates are atomic and do not introduce data races:

    std::shared_ptr<int> global_sp;

    void thread_main() {
      auto local_sp = global_sp;  // OK, copy constructor's parameter is reference-to-const

      int i = *global_sp;         // OK, operator* is const
      int j = *local_sp;          // OK, does not operate on global_sp

      // *global_sp = 2;          // NOT OK, modifies int visible to other threads      
      // *local_sp = 2;           // NOT OK, modifies int visible to other threads      

      // global_sp.reset();       // NOT OK, reset is non-const
      local_sp.reset();           // OK, does not operate on global_sp
    }

    int main() {
      global_sp.reset(new int(1));
      std::thread t1(thread_main);
      std::thread t2(thread_main);
      t1.join();
      t2.join();
    }
      

For further details of the C++11 memory model see Hans-J. Boehm's Threads and memory model for C++ pages, particularly the introduction and FAQ.

Atomics

IO

This gets a bit tricky. Please read carefully, and bear with me.

Structure

A wrapper type called __basic_file provides our abstraction layer for the std::filebuf classes. Nearly all decisions dealing with actual input and output must be made in __basic_file.

A generic locking mechanism is somewhat in place at the filebuf layer, but is not used in the current code. Providing locking at any higher level is akin to providing locking within containers, and is not done for the same reasons (see the links above).

Defaults

The __basic_file type is simply a collection of small wrappers around the C stdio layer (again, see the link under Structure). We do no locking ourselves, but simply pass through to calls to fopen, fwrite, and so forth.

So, for 3.0, the question of "is multithreading safe for I/O" must be answered with, "is your platform's C library threadsafe for I/O?" Some are by default, some are not; many offer multiple implementations of the C library with varying tradeoffs of threadsafety and efficiency. You, the programmer, are always required to take care with multiple threads.

(As an example, the POSIX standard requires that C stdio FILE* operations are atomic. POSIX-conforming C libraries (e.g., on Solaris and GNU/Linux) have an internal mutex to serialize operations on FILE*s. However, you still need to not do stupid things like calling fclose(fs) in one thread followed by an access of fs in another.)

So, if your platform's C library is threadsafe, then your fstream I/O operations will be threadsafe at the lowest level. For higher-level operations, such as manipulating the data contained in the stream formatting classes (e.g., setting up callbacks inside an std::ofstream), you need to guard such accesses like any other critical shared resource.
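
A minimal sketch of guarding the higher-level stream state: the formatting flags set by std::hex belong to the stream object, so the change of flags and the write that depends on them are done under one user-level lock. The names log_mutex, log_hex and log_from_threads are hypothetical, and a std::ostringstream stands in for a file stream so that the result can be inspected.

```cpp
#include <iomanip>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical user-level lock protecting the shared stream object.
std::mutex log_mutex;

// Writes `value` in hexadecimal followed by a newline. The formatting
// state is changed and restored while the lock is held.
void log_hex(std::ostream& out, int value) {
  std::lock_guard<std::mutex> guard(log_mutex);
  out << std::hex << value << std::dec << '\n';
}

// Has `threads` threads log the value 255 to one shared stream and
// returns everything that was written.
std::string log_from_threads(int threads) {
  std::ostringstream out;
  std::vector<std::thread> pool;
  for (int t = 0; t < threads; ++t)
    pool.emplace_back([&out] { log_hex(out, 255); });
  for (auto& worker : pool)
    worker.join();
  return out.str();
}
```

Every caller that touches the stream has to go through the same mutex. A thread that writes to the stream directly bypasses the protection.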

Future

A second choice may be available for I/O implementations: libio. This is disabled by default, and in fact will not currently work due to other issues. It will be revisited, however.

The libio code is a subset of the guts of the GNU libc (glibc) I/O implementation. When libio is in use, the __basic_file type is basically derived from FILE. (The real situation is more complex than that... it's derived from an internal type used to implement FILE. See libio/libioP.h to see scary things done with vtbls.) The result is that there is no "layer" of C stdio to go through; the filebuf makes calls directly into the same functions used to implement fread, fwrite, and so forth, using internal data structures. (And when I say "makes calls directly," I mean the function is literally replaced by a jump into an internal function. Fast but frightening. *grin*)

Also, the libio internal locks are used. This requires pulling in large chunks of glibc, such as a pthreads implementation, and is one of the issues preventing widespread use of libio as the libstdc++ cstdio implementation.

But we plan to make this work, at least as an option if not a future default. Platforms running a copy of glibc with a recent-enough version will see calls from libstdc++ directly into the glibc already installed. For other platforms, a copy of the libio subsection will be built and included in libstdc++.

Alternatives

Don't forget that other cstdio implementations are possible. You could easily write one to perform your own forms of locking, to solve your "interesting" problems.

Containers

This section discusses issues surrounding the design of multithreaded applications which use Standard C++ containers. All information in this section is current as of the gcc 3.0 release and all later point releases. Although earlier gcc releases had a different approach to threading configuration and proper compilation, the basic code design rules presented here were similar. For information on all other aspects of multithreading as it relates to libstdc++, including details on the proper compilation of threaded code (and compatibility between threaded and non-threaded code), see Chapter 17.

Two excellent pages to read when working with the Standard C++ containers and threads are SGI's http://www.sgi.com/tech/stl/thread_safety.html and SGI's http://www.sgi.com/tech/stl/Allocators.html.

However, please ignore all discussions about the user-level configuration of the lock implementation inside the STL container-memory allocator on those pages. For the sake of this discussion, libstdc++ configures the SGI STL implementation, not you. This is quite different from how gcc pre-3.0 worked. In particular, past advice was for people using g++ to explicitly define _PTHREADS or other macros or port-specific compilation options on the command line to get a thread-safe STL. This is no longer required for any port and should no longer be done unless you really know what you are doing and assume all responsibility.

Since the container implementation of libstdc++ uses the SGI code, we use the same definition of thread safety as SGI when discussing design. A key point that beginners may miss is the fourth major paragraph of the first page mentioned above (For most clients...), which points out that locking must nearly always be done outside the container, by client code (that'd be you, not us). There is a notable exception to this rule: allocators called while a container or element is constructed use an internal lock obtained and released solely within libstdc++ code (in fact, this is the reason STL requires any knowledge of the thread configuration).

For implementing a container which does its own locking, it is trivial to provide a wrapper class which obtains the lock (as SGI suggests), performs the container operation, and then releases the lock. This could be templatized to a certain extent, on the underlying container and/or a locking mechanism. Trying to provide a catch-all general template solution would probably be more trouble than it's worth.
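
One possible shape for such a wrapper is sketched below. It is not something libstdc++ provides: locked_container, with_lock and fill_from_threads are hypothetical names, and the design assumes that callers never let a reference or iterator into the container escape from the callable they pass in.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical wrapper: owns a container and a mutex, and only exposes
// the container to callables that run while the mutex is held.
template <typename Container>
class locked_container {
public:
  // Obtains the lock, performs the operation, then releases the lock.
  template <typename Op>
  auto with_lock(Op op) -> decltype(op(std::declval<Container&>())) {
    std::lock_guard<std::mutex> guard(mutex_);
    return op(container_);
  }

private:
  std::mutex mutex_;
  Container container_;
};

// Fills one wrapped vector from several threads and returns its size.
std::size_t fill_from_threads(std::size_t threads, std::size_t count) {
  locked_container<std::vector<int>> shared;
  std::vector<std::thread> pool;
  for (std::size_t t = 0; t < threads; ++t) {
    pool.emplace_back([&shared, count] {
      for (std::size_t i = 0; i < count; ++i)
        shared.with_lock([i](std::vector<int>& v) {
          v.push_back(static_cast<int>(i));
        });
    });
  }
  for (auto& worker : pool)
    worker.join();
  const std::size_t size =
      shared.with_lock([](std::vector<int>& v) { return v.size(); });
  assert(size == threads * count);
  return size;
}
```

Passing a callable rather than wrapping each member function keeps compound operations (for example, check then insert) inside a single critical section, which per-call locking cannot guarantee.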

The library implementation may be configured to use the high-speed caching memory allocator, which complicates thread safety issues. For all details about how to globally override this at application run-time see here. Also useful are details on allocator options and capabilities.