This document is broken down into sections and sub-sections. To understand a specific section, you need to understand all of its parent sections as well as any prerequisites that it lists. For example, if section Fruits/Apples/Granny Smith has prerequisites Vegetables/Peas and Fish listed, you'll need to have read ...

- Fruits (just that section, not its sub-sections)
- Fruits/Apples (just that section, not its sub-sections)
- Vegetables/Peas (that section including ALL of its sub-sections)
- Fish (that section including ALL of its sub-sections)

This is essentially a tree where each section is a node. To understand a node, you need to understand ...
The following document is my attempt at charting out the various pieces of the modern C++ landscape, focusing on the 80% of features that get used most of the time rather than the 20% of highly esoteric / confusing features. It isn't comprehensive and some of the information may not be entirely correct / may be missing large portions.
The key points of similarity to remember:
The key point of dissimilarity to remember:
🔍SEE ALSO🔍
The following are a base set of language constructs required for understanding the rest of the document.
The general purpose integral type is int
.
Variables use the format modifiers type name initializer
.
int a = 0;
int b (0); // parenthesis
int c {0}; // curly braces
C++ provides a bewildering number of ways to initialize a variable, each with its own set of edge cases. For best results, stick to the curly braces.
Functions use the format modifiers return-type name(param-type1 arg-name1, param-type2 arg-name2, ...) modifiers { body }
.
int myFunction(int a) {
return x + a;
}
C++ functions don't necessarily have to be methods (members of a class).
Classes use either struct
or class
.
struct
makes all members of the class public by default, while class
makes them all private by default. Members need to be grouped together by visibility, where a visibility (e.g. private
) is a label within the class.
class MyClass {
int myFunction(int a) {
return x + a;
}
private: // everything under this label is private
int x {0};
};
Source code often comes in pairs: A header file usually contains declarations (e.g. just the function's signature / prototype) while a C++ file usually contains definitions (e.g. the function implementation).
// MyCode.hpp (header file w/ declarations)
int myFunction(int a);
// MyCode.cpp (source file w/ definitions)
#include "MyCode.hpp"
int myFunction(int a) {
return x + a;
}
This isn't required. Source files may contain declarations and / or header files may contain definitions, but the split is typically done for a variety of reasons: faster compile times, sharing the same object across multiple source files, compiling when there are cyclical references, etc..
⚠️NOTE️️️⚠️
The above points aren't entirely correct or complete. They're generalizations that help set up a base for the explanations in the rest of the document.
The following is an example C++ program that prints "hello world" to stdout.
// hello.cpp file
#include <iostream>
int main() {
std::cout << "hello world\n";
return 0;
}
The ...

- #include <iostream> pulls in a library that lets you interface with stdout, stderr, and stdin.
- int main() { ... } is the entry point of the program.
- std::cout << ... is what prints to stdout.
- return 0 returns from the main() function, ending the program with an exit code of 0.

Pretty much any modern C++ compiler will compile the above code. The output below uses the GNU C++ compiler to compile the example, then runs the executable.
$ g++ hello.cpp
$ ./a.out
hello world
↩PREREQUISITES↩
Several C++ compilers exist, the most popular of which are the GNU C++ compiler and LLVM clang. C++ compilers generally follow the same set of steps to go from C++ code to an executable.
The C++ language has a lot of legacy baggage, edge cases, and ambiguous behaviour. Regardless of the compiler chosen, at least some of the following warning options should be enabled:
- -Wall - Warns about questionable but easily avoidable constructs.
- -Wextra - Warns about other questionable constructs not covered by -Wall.
- -Wpedantic - Warns about ISO conformance.
- -Weverything - Turns on all warnings (Clang only).

Most compilers support some or all of the flags above.
⚠️NOTE️️️⚠️
A good online tool to try things in is cppinsights, which breaks down C++ code and allows you some visibility into what the compiler is doing / what the compiler sees.
↩PREREQUISITES↩
For each source code file that gets compiled, the compiler needs to know that the entities (variables, functions, classes, etc..) accessed within that file actually exist. The scope at which the compiler keeps track of these entities is per source code file. For example, imagine a source code file that defines a function named myFunction
(definition). There are 5 other source code files that call myFunction
at some point. Each of those 5 other files is required to tell the compiler what myFunction
is (declaration) before it can invoke it.
One way to handle this scenario is to put myFunction
's declaration in each source code file that calls it.
OtherClass myFunction(int a);
The problem with doing this is that ...
myFunction
(e.g. myFunction
requires OtherClass
, which may require even more entities).

The preferred way to handle this scenario is to put myFunction's declaration into a header file. Then, any file that needs to know about myFunction can use the #include directive.
// MyFunction.hpp
#include "OtherClass.hpp"
OtherClass myFunction(int a);
// UsageFile1.cpp
#include "MyFunction.hpp"
myFunction(44);
If an entity is declared once already by an #include
, it shouldn't be declared again. For example, imagine that the file Main.cpp
includes ParentA.hpp
and ParentB.hpp
. Both ParentA.hpp
and ParentB.hpp
then go on to include Child.hpp
....
The problem the above example scenario creates is that Child.hpp
gets #include
'd twice, meaning that everything in it is declared twice. To mitigate this problem, an include guard is typically provided in each header file.
// MyFunction.hpp
#ifndef MY_FUNCTION_H // include guard
#define MY_FUNCTION_H
#include "OtherClass.hpp"
OtherClass myFunction(int a);
#endif
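The effect of the guard can be sketched in a single file by pasting the guarded contents twice, which is roughly what the compiler ends up seeing when Child.hpp is reached through both ParentA.hpp and ParentB.hpp. The names CHILD_HPP, Child, and childValue below are made up for illustration.

```cpp
// First "inclusion" of the hypothetical Child.hpp: CHILD_HPP isn't defined yet,
// so the body is kept and the macro gets defined.
#ifndef CHILD_HPP
#define CHILD_HPP
struct Child {
    int value {7};
};
#endif

// Second "inclusion": CHILD_HPP is already defined, so the preprocessor skips
// the body. Without the guard this would be a redefinition error.
#ifndef CHILD_HPP
#define CHILD_HPP
struct Child {
    int value {7};
};
#endif

// Hypothetical helper that uses the single surviving definition of Child.
int childValue() {
    Child c {};
    return c.value;
}
```

Removing the #ifndef / #define / #endif lines from the sketch should make the compiler complain that Child is defined twice.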
⚠️NOTE️️️⚠️
#ifndef, #define, and #endif are preprocessor directives that aren't covered here. Look them up online if you need to.
You may notice that sometimes #include
puts quotes around the files and sometimes angle brackets. Use quotes when the files are in the same directory structure, angle brackets when the files are coming from some external library.
#include <vector> // library header
#include "OtherClass.hpp" // local header
↩PREREQUISITES↩
There are many different compilers, IDEs, build systems, and dependency managers for C++.
Common compilers:
Common IDEs:
Common build systems:
⚠️NOTE️️️⚠️
CMake isn't a build system itself, but a tool that generates the configuration needed for build systems. The idea is that, since C++ code can be compiled on many different platforms and build systems, this high-level tool can be used to generate the configuration for those build systems. For example, building on Linux is commonly done using Make while on Windows it's commonly done through Microsoft Visual Studio IDE project files. CMake can configure both using the same CMake script.
Common dependency managers:
Of the tools above, the best mixture I've found so far is to use ...
apt install clang-12
)snap install --classic code
)apt install cmake
)

There are basic guides / tutorials for each of these tools available online. With the C++ extensions (5, 6, and 7), vscode (3) works similarly to a professional IDE. It will parse a CMake configuration (3) to figure out how the code should be built as well as to provide C++ intellisense / auto-complete / formatting / debugging / etc.. support. Conan (4) integrates with CMake, so intellisense and builds through vscode automatically include the libraries.
⚠️NOTE️️️⚠️
Make sure to turn off the C++ extension's intellisense support or else it'll interfere with clangd's superior intellisense support. You can do this by adding the following to your .vscode/settings.json file
...
{
"C_Cpp.intelliSenseEngine": "Disabled",
"C_Cpp.autocomplete": "Disabled", // So you don't get autocomplete from both extensions.
"C_Cpp.errorSquiggles": "Disabled", // So you don't get error squiggles from both extensions (clangd's seem to be more reliable anyway).
}
⚠️NOTE️️️⚠️
Make sure that you don't have other C++ extensions installed. I'd initially installed a Makefile plugin into vscode that was tripping up the CMake plugin and breaking my intellisense.
Assuming you have all the software above installed, this cookie cutter template can be used to set up a simple project structure that you can open directly in vscode. The template primes the project by ...
src/main
).src/test
).

⚠️NOTE️️️⚠️
I keep reading that globs aren't recommended in CMake. If you don't use globs, you'll have to go in and manually add in each source file into the CMake configuration.
⚠️NOTE️️️⚠️
Recall that ...
Conan changes ARE NOT automatically picked up. You need to re-run conan (from ./build
-- see the cookie cutter template post hook) to pick up any library changes.
↩PREREQUISITES↩
The following subsection loosely details core C++ language features. It isn't comprehensive and some of the information may not be entirely correct / may be missing large portions.
The following is a list of operators available in C++. Some operators are obvious, while others are explained in other sections.
Bitwise Logical Operators
name | example | note |
---|---|---|
Bitwise AND (&) | 0b1011 & 0b0110 | |
Bitwise OR (\|) | 0b1011 \| 0b0110 | |
Bitwise XOR (^) | 0b1011 ^ 0b0110 | |
Bitwise NOT (~) | ~0b1011 | |
Bitwise left-shift (<<) | 0b1011 << 2 | |
Bitwise right-shift (>>) | 0b1011 >> 2 | Results on signed may be different than unsigned. |

Boolean Logical Operators
name | example | note |
---|---|---|
Logical AND (&&) | true && true | |
Logical OR (\|\|) | true \|\| false | |
Logical NOT (!) | !true | |

Arithmetic Operators
name | example | note |
---|---|---|
Unary Plus (+) | +10 | |
Unary Minus (-) | -10 | |
Addition (+) | 1 + 2 | |
Subtraction (-) | 2 - 1 | |
Multiplication (*) | 2 * 3 | |
Division (/) | 6 / 2 | |
Modulo (%) | 6 % 4 | |

There are implicit rules for how fundamental types get promoted. The general rule of thumb is that the result of the operator is promoted to the operand with the "greater" type. For example, if an int
is added to a float
, the result will be a float
.
These rules are similar to those in other languages (e.g. Java and Python).
⚠️NOTE️️️⚠️
If confused, use type deduction via the auto keyword: auto x {5 + y}, then check to see what the type of x is in the IDE or using typeid.
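The promotion rules can also be checked at compile-time rather than by eye. The following is a small sketch using decltype and std::is_same_v from the type_traits header (assumes C++17 or later). The helper addMixed is a made-up name.

```cpp
#include <type_traits>

// decltype reports the type the compiler picks for an expression
// without evaluating it.
static_assert(std::is_same_v<decltype(1 + 2.0f), float>);    // int + float -> float
static_assert(std::is_same_v<decltype(1.0f + 2.0), double>); // float + double -> double
static_assert(std::is_same_v<decltype(1 + 2L), long>);       // int + long -> long

// Types smaller than int are promoted to int before the arithmetic happens.
static_assert(std::is_same_v<decltype(short{1} + short{2}), int>);

// Hypothetical helper: the int operand is converted to float before the addition.
float addMixed(int i, float f) {
    auto result {i + f};   // deduced as float
    return result;
}
```

If any of the static_assert lines were wrong, the file wouldn't compile, which makes this a cheap way to test an assumption about a mixed-type expression.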
Assignment Operators
name | example | note |
---|---|---|
Assignment (=) | x = 5 | |
Assignment Bitwise AND (&=) | x &= 0b0110 | |
Assignment Bitwise OR (\|=) | x \|= 0b0110 | |
Assignment Bitwise XOR (^=) | x ^= 0b0110 | |
Assignment Bitwise left-shift (<<=) | x <<= 2 | |
Assignment Bitwise right-shift (>>=) | x >>= 2 | Result on signed may be different than unsigned. |
Assignment Addition (+=) | x += 2 | |
Assignment Subtraction (-=) | x -= 1 | |
Assignment Multiplication (*=) | x *= 3 | |
Assignment Division (/=) | x /= 2 | |
Assignment Modulo (%=) | x %= 4 | |
Increment (++) | x++ | Applicable BEFORE or AFTER the operand: ++x returns the value AFTER modification, x++ returns the value BEFORE modification. |
Decrement (--) | x-- | Applicable BEFORE or AFTER the operand: --x returns the value AFTER modification, x-- returns the value BEFORE modification. |

All assignment operators work similarly to those in Java, including the increment and decrement operators. In addition to modifying the variable passed as the operand, these operators also return a result, meaning that it's okay to use an increment / decrement operator within some larger expression. Some other languages (e.g. Go) avoid the confusion this causes by making increment / decrement a statement that can't be used in an expression. Not so in C++.
int x {3};
int y {(x++) + 2};
// at this point, x is 4, y is 5
int a {3};
int b {(++a) + 2};
// at this point, a is 4, b is 6
⚠️NOTE️️️⚠️
You probably shouldn't do this because it gets confusing. Also, incrementing/decrementing the same variable more than once in the same expression isn't defined behaviour: The order of incrementing/decrementing can change based on whatever the compiler thinks is best, meaning that the results won't be consistent across different platforms / compilers / compiler options / etc...
Comparison Operator
name | example | note |
---|---|---|
Equal To (==) | 5 == 7 | |
Not Equal To (!=) | 5 != 7 | |
Less Than (<) | 5 < 7 | |
Less Than Or Equal To (<=) | 5 <= 7 | |
Greater Than (>) | 5 > 7 | |
Greater Than Or Equal To (>=) | 5 >= 7 | |
Three-way Comparison (<=>) | 5 <=> 7 | Returns a special ordering type, not boolean (discussed in spaceship operator section). |

In addition, the ternary conditional operator is a pseudo operator that takes in 3 operands similar to those found in other high-level languages: CONDITION ? EXPRESSION_IF_TRUE : EXPRESSION_IF_FALSE
. It's essentially a shorthand if-else block.
int x {n % 7 == 1 ? 1000 : -1000};
// equiv to...
if (n % 7 == 1) {
x = 1000;
} else {
x = -1000;
}
Member Access Operators
name | example | note |
---|---|---|
Subscript ([]) | x[0] | |
Indirection (*) | *x | Doesn't conflict with arithmetic multiplication operator because this is a unary operator. |
Address Of (&) | &x | |
Member Of Object (.) | x.member | |
Member Of Pointer (->) | x->member | |

These operators are used in scenarios that deal with accessing the members of an object (e.g. element in an array, field of a class) or dealing with memory addresses / pointers. The subscript and member of object operators are similar to their counterparts in other high-level languages (e.g. Java, Python, C#, etc..). The others are unique to languages with support for lower-level programming like C++. Their usage is detailed in other sections.
Dynamic Object Operators
name | example | note |
---|---|---|
Create Dynamic Object (new) | new int | |
Create Dynamic Array (new[]) | new int[50] | |
Destroy Dynamic Object (delete) | delete x | |
Destroy Dynamic Array (delete[]) | delete[] x | |

⚠️NOTE️️️⚠️
If you already know about dynamic objects and arrays and constructors/destructors, make sure you delete an array using delete[]
. It makes sure to call the destructor for each element of the array.
Size Operator
name | example | note |
---|---|---|
Size (sizeof) | sizeof x | |

This operator gets the size of an object in bytes. Note that an object's byte size may not be indicative of the data it holds, because it may include padding required by the platform (e.g. an object requiring 5 bytes may get expanded to 8 bytes because the platform requires 8 byte boundary alignments).
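A rough sketch of the padding effect, assuming C++17 or later. Padded, rawSize, and paddedSize are made-up names, and the exact numbers are platform-dependent, so the code only checks the relationship between the two sizes rather than specific values.

```cpp
#include <cstddef>

// Hypothetical struct: a char followed by an int. Most platforms insert padding
// after c so that i lands on an address suitable for an int.
struct Padded {
    char c;
    int i;
};

// Sum of the members' own sizes, ignoring any padding.
constexpr std::size_t rawSize() {
    return sizeof(char) + sizeof(int);
}

// Size the compiler actually reserves for the struct, padding included.
constexpr std::size_t paddedSize() {
    return sizeof(Padded);
}

// The struct can never be smaller than its members, but it may be larger.
static_assert(paddedSize() >= rawSize());
```

On a typical desktop platform with a 4 byte int, rawSize() would be 5 while paddedSize() would likely be 8, but neither number is guaranteed.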
Other Operators
C++ provides a set of other operators such as the ...
,
).()
)._
)

While it isn't worth going into them in detail here, the language explicitly lists them as operators because they're overload-able (i.e. operator overloading). Overloading these operators is heavily discouraged since doing so causes confusion.
⚠️NOTE️️️⚠️
The book mentions the comma operator specifically. It doesn't look like this is used for much and the book recommends against using it for anything (e.g. operator overloading) due to the confusion it causes. This gives off similar vibes to Python's tuple syntax, where you can pass an unenclosed tuple as a subscript to something. When I was learning Python, that also came off as very confusing.
x = obj['column name', 100]
↩PREREQUISITES↩
C++ variable declarations have the following form: modifiers type name initializer
.
type (required) - Type of variable.
name (required) - Name of variable.
initializer (optional) - Initial value to assign (object initialization).
There are multiple ways to initialize a variable, each with their own advantages and disadvantages.
int x {a + b}
).int x = 5
).int x = { a + b }
).

⚠️NOTE️️️⚠️
The above is an over-simplification. The ways to initialize are vast and complex. See here for a full accounting and here for an hour long talk about the edge cases.
It seems like the safest bet is to always use brace initialization where possible. Just use the braces as if they were parentheses or braces in Java (specific to the context). The others have surprising behaviour (e.g. they won't warn about narrowing conversions).
modifiers (optional) - Markers controlling the behaviour / properties of a variable (e.g. const, volatile, constexpr, inline, ...).
int a; // no initializer -- garbage possibly contained at memory location
int b {}; // empty initializer -- zeros out the memory for the int
int c {0}; // assign to constant 0
int d {c}; // assign to value in c
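The narrowing behaviour mentioned earlier in this section is the main practical difference between the initializer forms. The following is a minimal sketch of it. The function names are made up, and the commented-out line is the one a conforming compiler should reject.

```cpp
// Hypothetical helper showing what a narrowing conversion does.
int truncated() {
    int viaEquals = 3.9;    // compiles: fractional part silently dropped, leaving 3
    int viaParens (3.9);    // compiles: same silent truncation, leaving 3
    // int viaBraces {3.9}; // rejected: braces don't allow narrowing conversions
    return viaEquals + viaParens;
}

// Hypothetical helper showing that empty braces zero out the variable.
int zeroed() {
    int b {};               // value-initialized to 0
    return b;
}
```

Some compilers will still warn about the first two lines, but they aren't required to stop the build the way they are for the braced form.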
In C++, variables that are fields (assigned to a class) are called member variables. This section deals with non-member variables (e.g. scoped somewhere other than a class -- global, inside a function, etc..).
↩PREREQUISITES↩
The following sections list out core C++ types and their analogs. These include numeric types, character types, and string types.
C++'s core integer types are as follows...
short int
int
long int
long long int
The above integer types come in two forms: signed and unsigned. The range of ...
By default, the integer types above are signed. Signed-ness can be explicitly stated by prefixing either signed or unsigned to the type, but if the type is signed the prefix is usually omitted.
signed | unsigned |
---|---|
short int / signed short int | unsigned short int |
int / signed int | unsigned int |
long int / signed long int | unsigned long int |
long long int / signed long long int | unsigned long long int |

Integer types short int, long int, and long long int can optionally omit the int keyword.
signed | unsigned |
---|---|
short / signed short | unsigned short |
int / signed int | unsigned int |
long / signed long | unsigned long |
long long / signed long long | unsigned long long |

The only guarantees for core integer types are that ...
short
>= range of int
).

All other specifics are platform-dependent. Specifically, ...
🔍SEE ALSO🔍
std::endian
)

Integer ranges, although platform-specific, are queryable in the climits header.
type | min | max |
---|---|---|
signed short | SHRT_MIN | SHRT_MAX |
signed int | INT_MIN | INT_MAX |
signed long | LONG_MIN | LONG_MAX |
signed long long | LLONG_MIN | LLONG_MAX |
unsigned short | 0 | USHRT_MAX |
unsigned int | 0 | UINT_MAX |
unsigned long | 0 | ULONG_MAX |
unsigned long long | 0 | ULLONG_MAX |

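As a quick sketch of querying these ranges (assumes C++17 or later), the following reads a couple of the climits macros and cross-checks them against std::numeric_limits from the limits header. The function names are made up for illustration.

```cpp
#include <climits>
#include <limits>

// Largest value an int can hold on this platform, read from the climits macro.
constexpr int largestInt() {
    return INT_MAX;
}

// Number of value bits in an unsigned long long on this platform.
constexpr int bitsInUnsignedLongLong() {
    return std::numeric_limits<unsigned long long>::digits;
}

// The macro and std::numeric_limits report the same platform-specific value.
static_assert(INT_MAX == std::numeric_limits<int>::max());

// long long is required to be at least 64 bits wide.
static_assert(ULLONG_MAX >= 18446744073709551615ULL);
```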
By default, literals are represented using base 10. Literals may be presented in different bases via a prefix.
base | literal prefix | example |
---|---|---|
2 (binary) | 0b | 0b1111 |
8 (octal) | 0 | 017 |
10 (decimal) | | 15 |
16 (hex) | 0x | 0xF |

Integer literals are targeted to specific integer types by their suffix.
type | literal suffix | example |
---|---|---|
signed short | | |
signed int | | |
signed long | L | 2L |
signed long long | LL | 2LL |
unsigned short | | |
unsigned int | U | 2U |
unsigned long | UL | 2UL |
unsigned long long | ULL | 2ULL |

⚠️NOTE️️️⚠️
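Notice that int, short, and unsigned short don't have explicit suffixes. If no suffix is present, a decimal literal is an int, provided the value fits in an int (otherwise it becomes the next larger signed type that can hold it). To get it to a short, the easiest way is to cast it: static_cast<short>(2).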
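The prefixes and suffixes can be sanity checked at compile-time. This is a small sketch (assumes C++17 or later) where the value 15 is written in each base and the type each suffix produces is confirmed via decltype. The helper asShort is a made-up name.

```cpp
#include <type_traits>

// The same value written in binary, octal, and hex.
static_assert(0b1111 == 15 && 017 == 15 && 0xF == 15);

// The suffix picks the literal's type.
static_assert(std::is_same_v<decltype(2), int>);
static_assert(std::is_same_v<decltype(2U), unsigned int>);
static_assert(std::is_same_v<decltype(2L), long>);
static_assert(std::is_same_v<decltype(2ULL), unsigned long long>);

// There's no suffix for short, so a cast is used instead.
static_assert(std::is_same_v<decltype(static_cast<short>(2)), short>);

// Hypothetical helper returning a short built from an int literal.
short asShort() {
    return static_cast<short>(2);
}
```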
🔍SEE ALSO🔍
Integer types with standardized bit lengths are defined in the cstdint header.
signed | unsigned | description |
---|---|---|
intmax_t | uintmax_t | widest possible bit length |
int8_t | uint8_t | exactly 8 bits |
int16_t | uint16_t | exactly 16 bits |
int32_t | uint32_t | exactly 32 bits |
int64_t | uint64_t | exactly 64 bits |
int_least8_t | uint_least8_t | 8 bits or greater |
int_least16_t | uint_least16_t | 16 bits or greater |
int_least32_t | uint_least32_t | 32 bits or greater |
int_least64_t | uint_least64_t | 64 bits or greater |
int_fast8_t | uint_fast8_t | 8 bits or greater |
int_fast16_t | uint_fast16_t | 16 bits or greater |
int_fast32_t | uint_fast32_t | 32 bits or greater |
int_fast64_t | uint_fast64_t | 64 bits or greater |
intptr_t | uintptr_t | wide enough to hold a void * |
 | size_t | wide enough to hold the maximum number of bytes of something in memory |

The minimum and maximum extents of each type are defined in {TYPE}_MIN and {TYPE}_MAX, where {TYPE} doesn't include the _t suffix. For example, the maximum value a uint64_t can be is UINT64_MAX.
⚠️NOTE️️️⚠️
Not all types are guaranteed to be present (e.g. 64-bit types may be missing if the platform can't support them). Unsigned types don't have a minimum extent defined because the minimum of any unsigned integer type is always 0 (e.g. uint64_t can't go any lower than 0).
To expand any integer literal to a ...

- intmax_t, use the macro INTMAX_C(...).
- uintmax_t, use the macro UINTMAX_C(...).
- int{N}_t, use the macro INT{N}_C(...) (where {N} is the bit length).
- uint{N}_t, use the macro UINT{N}_C(...) (where {N} is the bit length).

⚠️NOTE️️️⚠️
There is no macro SIZE_C(...) for size_t. Best to just initialize a size_t from one of the other types' literals and hope the compiler warns about any narrowing conversions that might happen.
⚠️NOTE️️️⚠️
What's the point of the above? You don't know what internal integer type each standardized type maps to. For example, uint64_t
may map to unsigned long long
, which means when you want to assign a literal to a variable of that type you need to add a ULL
suffix...
uint64_t test {9999999999999999999ULL}
The macros above make it so that you don't need to know the underlying mapping...
uint64_t test {UINT64_C(9999999999999999999)}
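Putting the literal macros and the extent macros together, a minimal sketch might look like the following (assumes C++17 or later and a platform that provides the 64-bit and 32-bit types). The names big, largestU64, and smallestI32 are made up.

```cpp
#include <cstdint>

// UINT64_C appends whatever suffix the platform needs for a uint64_t literal.
constexpr std::uint64_t big {UINT64_C(9999999999999999999)};

// Extents come from {TYPE}_MAX / {TYPE}_MIN, with no _t in the macro name.
constexpr std::uint64_t largestU64() {
    return UINT64_MAX;
}

constexpr std::int32_t smallestI32() {
    return INT32_MIN;
}

// The literal above fits, since it's below the largest uint64_t value.
static_assert(big < UINT64_MAX);
```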
⚠️NOTE️️️⚠️
See also std::numeric_limits in the limits header. This seems to also provide platform-specific definitions that are queryable via functions.
C++'s core floating point types are as follows...
type | description | literal suffix | example |
---|---|---|---|
float | single precision | f | 123.0f |
double | double precision | | 123.0 |
long double | extended precision | L | 123.0L |

The specifics of each type are platform-dependent. The only guarantee is that each type has to hold at least the same range as the type before it (e.g. double
's range should cover float
's range). Other than that, ...
Floating point characteristics, although platform-specific, are queryable in the cfloat header.
type | min | max | min exponent | max exponent | mantissa digits | radix | epsilon |
---|---|---|---|---|---|---|---|
float | FLT_MIN | FLT_MAX | FLT_MIN_EXP | FLT_MAX_EXP | FLT_MANT_DIG | FLT_RADIX | FLT_EPSILON |
double | DBL_MIN | DBL_MAX | DBL_MIN_EXP | DBL_MAX_EXP | DBL_MANT_DIG | FLT_RADIX | DBL_EPSILON |
long double | LDBL_MIN | LDBL_MAX | LDBL_MIN_EXP | LDBL_MAX_EXP | LDBL_MANT_DIG | FLT_RADIX | LDBL_EPSILON |

FLT_RADIX is shared by all three types. There are no DBL_RADIX or LDBL_RADIX macros.

⚠️NOTE️️️⚠️
Mantissa digits is the number of digits (of the base specified in radix) in the floating point type's mantissa.
Epsilon is the difference between 1 and the next representable floating point number after 1.
Min is the smallest positive normalized value, not the most negative value.
⚠️NOTE️️️⚠️
The sizeof operator should NOT be used to infer limits / characteristics of a floating point type. For example, a sizeof(long double) of 16 doesn't necessarily mean that the type is a quadruple precision float (128-bit). Rather, it's likely that the floating point type has less precision but the platform requires padding.
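To make the meaning of epsilon concrete, here's a small sketch that steps from 1.0f to the next representable float using std::nextafter from the cmath header and compares the gap against FLT_EPSILON. The function names are made up, and the sketch assumes float arithmetic is evaluated at float precision (FLT_EVAL_METHOD of 0), which is the usual case on modern desktop platforms.

```cpp
#include <cfloat>
#include <cmath>

// Hypothetical helper: steps from 1.0f to the next representable float above it
// and checks that the gap is exactly FLT_EPSILON.
bool epsilonIsGapAboveOne() {
    float next {std::nextafter(1.0f, 2.0f)};
    return (next - 1.0f) == FLT_EPSILON;
}

// Hypothetical helper: adding epsilon to 1.0f produces a different value,
// because 1 + epsilon is itself representable.
bool epsilonChangesOne() {
    float sum {1.0f + FLT_EPSILON};
    return sum != 1.0f;
}
```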
The rounding behaviour of all floating point types is queryable via FLT_ROUNDS
, where a ...
The floating point evaluation behaviour is queryable via FLT_EVAL_METHOD
, where a ...
⚠️NOTE️️️⚠️
Unsure about the last point. How's the last point any different than -1?
⚠️NOTE️️️⚠️
I see online that FLT_DIG
, DBL_DIG
, LDBL_DIG
, and DECIMAL_DIG
define the number of "decimal digits" that can be converted to floating point and back without a loss in precision. I'm assuming that just means the max number of digits that can be represented in a float where exp is 1?
⚠️NOTE️️️⚠️
See also std::numeric_limits
in the limits header. This seems to also provide platform-specific definitions that are queryable via functions..
↩PREREQUISITES↩
Core C++ strings are represented as an array of characters, where that array ends with a null character to signify its end. This is in contrast to other major platforms that typically structure strings as a size integer along with the array (no null terminator).
Individual characters all map to integer types, where literals are defined by wrapping the character in single quotes. Even though they're integers, the signed-ness of each of the types below isn't guaranteed.
type | bits | literal prefix | example | description |
---|---|---|---|---|
char | >= 8 | | 'T' | >= 8-bit wide character (smallest unit of memory -- 1 byte) |
char8_t | 8 | u8 | u8'T' | 8-bit wide character (e.g. UTF-8) |
char16_t | 16 | u | u'T' | 16-bit wide character (e.g. UTF-16) |
char32_t | 32 | U | U'T' | 32-bit wide character (e.g. UTF-32) |
wchar_t | | L | L'T' | at least as wide as char |

Note that char and wchar_t don't have predefined bit lengths. They are platform-dependent. The bit length for...

- char is defined in CHAR_BIT of climits and must be at least 8 bits.
- wchar_t must be equal to or greater than that of char.

⚠️NOTE️️️⚠️
char literals can also be integers, but the signed-ness of the plain char type is left up to the platform. It can specifically be made to be signed / unsigned by prefixing it as such: signed char / unsigned char.
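Since both the bit length and the signed-ness of char are up to the platform, they're best queried rather than assumed. The following is a minimal sketch (assumes C++17 or later) with made-up function names.

```cpp
#include <climits>
#include <type_traits>

// char, signed char, and unsigned char are three distinct types, even though
// plain char behaves like one of the other two on any given platform.
static_assert(!std::is_same_v<char, signed char>);
static_assert(!std::is_same_v<char, unsigned char>);

// Hypothetical helper: reports whether plain char is signed on this platform.
constexpr bool plainCharIsSigned() {
    return std::is_signed_v<char>;
}

// Hypothetical helper: number of bits in a char (guaranteed to be at least 8).
constexpr int bitsPerChar() {
    return CHAR_BIT;
}
```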
String literals are wrapped in double quotes instead of single quotes, where they get transformed into an array terminated by a null character.
type | literal prefix | example | description |
---|---|---|---|
char * | | "hello" | unknown encoding (platform specific?) |
wchar_t * | L | L"hello" | unknown encoding (platform specific?) |
char16_t * | u | u"hello" | encoded as UTF-16 |
char32_t * | U | U"hello" | encoded as UTF-32 |
char8_t * | u8 | u8"hello" | encoded as UTF-8 |

Typical escaping rules apply to string literals. Unescaped (raw) string literals are allowed by adding an R at the end of the literal prefix, which makes it so that the ...
These delimiter characters are characters that aren't encountered in the contents of the string itself. For example, in u8R"|(hello)|", the delimiter is | and, along with the parentheses that wrap the contents, isn't included in the resulting UTF-8 string.
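A short sketch of the difference between an escaped and a raw literal, using | as the delimiter. The raw form is R"delimiter(contents)delimiter", where the parentheses are required. The variable and function names below are made up.

```cpp
#include <cstring>

// Regular literal: \n is an escape sequence, so it becomes ONE newline character.
constexpr const char *escaped {"a\nb"};

// Raw literal with | as the delimiter: nothing between |( and )| is escaped,
// so the backslash and the n stay as two separate characters.
constexpr const char *raw {R"|(a\nb)|"};

// Hypothetical helpers that measure each string up to its null terminator.
std::size_t escapedLength() {
    return std::strlen(escaped);   // a, newline, b
}

std::size_t rawLength() {
    return std::strlen(raw);       // a, backslash, n, b
}
```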
void
is a type that represents an empty set of values. Since it can't hold a value, C++ won't allow you to declare an object of type void. However, you can use it to declare that a function ...
void return).
void parameter list).
↩PREREQUISITES↩
C++ allows for the creation of arrays of constant length (size of the array must be known at compile-time). Elements of an array are guaranteed to be contiguous in memory.
- int x[100] - Creates an array of 100 ints where those 100 ints are junk values (data previously at that memory location is not zeroed out).
- int x[] { 5, 5, 5 } - Creates an array of 3 ints where each of those ints have been initialized to 5 (braced initialization).
- int x[] = { 5, 5, 5 } - Same as above (assignment does not do any extra work).
- int x[3] {} - Creates an array of 3 ints where each of those ints are 0 (memory zeroed out -- braced initialization).
- int x[3] = {} - Same as above (assignment does not do any extra work).
- int x[n] - Disallowed by C++ if n isn't a constant. These types of arrays are allowed in C (called variable length arrays / VLA), but not in C++ because C++ has collection classes that allow for sizes not known at compile-time.

Accessing arrays is done similarly to how it is in most other languages, by subscripting (e.g. x[0] = 5). The only difference is that array access isn't bounds-checked and array length information isn't automatically maintained at run-time. For example, if an array has 100 elements, C++ won't stop you from trying to access element 250 -- out-of-bounds array access is undefined behaviour.
One way to think of an array is as a pointer to a contiguous block of elements of the array type. In fact, if an array type gets used where it isn't expected, that array type automatically decays to a pointer type.
int test(int *x) {
return x[0] + x[1];
}
void run() {
int x[3] = { 1, 2, 3 };
int y { test(x) };
}
⚠️NOTE️️️⚠️
My understanding is that arrays are typically passed to functions as pointers + array length. This is because the array length information is only available at compile-time, meaning that if you have a function that takes in an array, how would it know the size of the array it's working with when it runs (it isn't the one who declared it). It looks like a function parameter can be an array type of fixed size, but apparently that doesn't mean anything? The compiler doesn't enforce that a caller use an array of that fixed size, and using sizeof on the array will produce a warning saying that it's decaying into a pointer.
size_t test(int x[10]) {
return sizeof(x); // compiler warning that this is returning sizeof(int *)
}
int main() {
int x[3] = { 1, 2, 3 };
size_t y { test(x) }; // compiler doesn't complain that test() expects int[10] but this is int[3]
std::cout << y;
return 0;
}
Be careful when using the sizeof
operator on an array. If the type is the original array type, sizeof
will return the number of bytes taken up by the elements of that array (known at compile-time). However, if the type has decayed to a pointer type, sizeof
will return the number of bytes to hold on to a pointer.
int x[3];
int *y {x}; // equiv to setting to &(x[0]);
std::cout << sizeof x; // should be the size of 3 ints
std::cout << sizeof y; // should be the size of a pointer
Similarly, range-based for loops won't work if the type has decayed to a pointer type because the array size of that pointer isn't known at compile-time.
int x[3] = {1,2,3};
int *y {x};
for (int i {0}; i < 3; i++) { // OK
std::cout << y[i] << std::endl;
}
for (int v : x) { // OK
std::cout << v << std::endl;
}
for (int v : y) { // ERROR
std::cout << v << std::endl;
}
You may be tempted to use sizeof(array) / sizeof(type) to determine the number of elements within an array. It's a better idea to use std::size(array) instead (found in the iterator header) because it should have logic to work around any platform-specific behaviours that might cause inconsistent results / unexpected behaviour (speculation). It also fails to compile if the array has decayed to a pointer, instead of silently returning a wrong count.
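One way to keep the array length around when passing an array to a function is to take it by reference, so it never decays to a pointer. The following is a sketch of that idea alongside std::size. The names sum and countOfLocalArray are made up, and this is just one option rather than the standard way of passing arrays.

```cpp
#include <cstddef>
#include <iterator>

// Taking the array by reference keeps its length as part of the type, so the
// array doesn't decay to a pointer. N is deduced by the compiler.
template <std::size_t N>
int sum(const int (&values)[N]) {
    int total {0};
    for (int v : values) {   // range-based for works: the size is still known
        total += v;
    }
    return total;
}

// Hypothetical helper: element count of a local array via std::size.
std::size_t countOfLocalArray() {
    int x[3] {1, 2, 3};
    return std::size(x);     // same count as sizeof(x) / sizeof(x[0])
}
```

Calling sum with an int * instead of an actual array shouldn't compile, because there's no N for the compiler to deduce.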
↩PREREQUISITES↩
C++ provides types that reference a memory address, called pointers. Variables of these types can point to different memory addresses / objects.
Adding an asterisk (*) to the end of any type makes it a pointer type (e.g. int *
is a type that can contain a pointer to an int
). A pointer to any object can be retrieved using the address-of unary operator (&). Similarly, the value in any pointer can be retrieved using the dereference unary operator (*).
int w {5};
int *x { &w }; // x points to w
int *y { &w }; // y points to w
int z { *x }; // z is a copy of whatever x points to, which is w, which means it gets set to 5
*x = 7; // w is set to 7 through x
int **a { &x }; // a points to x, which points to w (a pointer to a pointer to an int)
As shown in the example above, it's perfectly valid to use the dereference operator on the left-side of the equals. It defines where the result of the right side should go.
int w {5};
int *x { &w }; // x points to w
int **y { &x }; // y points to x, which points to w
**y = 7; // y dereferenced twice and set to 7 -- w should now be 7
The notation is confusing because asterisk (*) has different meanings. In the context of a ...
⚠️NOTE️️️⚠️
See also: member-of-pointer operator.
In addition, a pointer can optionally be set to nothing via the nullptr literal. nullptr is actually of type std::nullptr_t, but the compiler will implicitly convert it to other pointer types when required.
int *y { nullptr }; // implicit conversion
if (y == nullptr) {
// report error
}
⚠️NOTE️️️⚠️
Pointers implicitly convert to boolean. Wherever a boolean is expected, a pointer is implicitly converted as if by ptr != nullptr
. That means in if / while / for conditions, you can just use the pointer as-is without explicitly writing out a condition.
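A minimal sketch of the pointer-to-boolean conversion described in the note above. The function name value_or_default is hypothetical and only exists for illustration.

```cpp
// Returns the pointed-to value, or the fallback when the pointer is null.
int value_or_default(const int *ptr, int fallback) {
    if (ptr) {            // same as: if (ptr != nullptr)
        return *ptr;
    }
    return fallback;
}
```

Calling value_or_default(nullptr, -1) takes the fallback branch, while passing the address of an int returns that int.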
⚠️NOTE️️️⚠️
How is this different than the NULL macro? I guess because it's a distinct type, you can have a function overload that takes in a param of type std::nullptr_t
? But why would you ever want to do that?
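One possible answer to the question above, as a sketch: because nullptr has its own type, overload resolution can tell it apart from the integer 0. The describe overloads below are hypothetical names. NULL is left out on purpose, since what it expands to (and therefore which overload it picks) depends on the implementation.

```cpp
#include <cstddef>  // std::nullptr_t
#include <string>

std::string describe(int)            { return "int"; }
std::string describe(int *)          { return "int *"; }
std::string describe(std::nullptr_t) { return "nullptr_t"; }

// The literal 0 is an exact match for int.
std::string describe_zero()    { return describe(0); }
// nullptr is an exact match for std::nullptr_t.
std::string describe_nullptr() { return describe(nullptr); }
// A null pointer stored in an int * variable is still an int *.
std::string describe_null_variable() {
    int *p { nullptr };
    return describe(p);
}
```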
↩PREREQUISITES↩
Certain arithmetic operators are allowed on pointers, called pointer arithmetic. Adding or subtracting an integer on a pointer moves that pointer by that many elements, where each element is the number of bytes that makes up its underlying type (e.g. adding 1 to an int * moves it forward by sizeof(int) bytes).
int x[] {1, 5, 7};
int *ptrA { &(x[1]) }; // points to idx 1 of x (5)
int *ptrB { ptrA + 1 }; // points to idx 2 of x (7)
This is similar to array access via the subscript operator. In fact, both arrays and pointers can be accessed in the same way using the subscript operator and pointer arithmetic.
int x[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
int *y {x};
*(y+1) = 99; // same as x[1] = 99
x[2] = 101; // same as *(y+2) = 101;
⚠️NOTE️️️⚠️
An array guarantees that its elements appear contiguously and in order within memory (I think?), so if the pointer is from a decayed array, using pointer arithmetic to access its elements is perfectly fine.
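To make the "moves by the size of the underlying type" point concrete, here is a small sketch that measures the distance between two neighbouring array elements, once in elements and once in bytes. The function names are hypothetical.

```cpp
#include <cstddef>  // std::ptrdiff_t

// Distance between neighbours measured in elements (pointer subtraction).
std::ptrdiff_t elements_between_neighbours() {
    int values[] {1, 5, 7};
    int *first { &values[0] };
    int *second { first + 1 };   // advances by one int, not by one byte
    return second - first;
}

// Distance between the same neighbours measured in bytes.
std::ptrdiff_t bytes_between_neighbours() {
    int values[] {1, 5, 7};
    int *first { &values[0] };
    int *second { first + 1 };
    // Viewing the addresses as char pointers makes the subtraction count bytes.
    return reinterpret_cast<char *>(second) - reinterpret_cast<char *>(first);
}
```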
A pointer to the void type means that the type being pointed to is unknown. Since the type is unknown, dereferencing a void pointer isn't possible. In other words, it isn't possible to read or write to the data pointed to by a void * because the underlying type is void / unknown.
int x[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
void *y { x };
*y = 2; // fails
Since the underlying type of the pointer is unknown, pointer arithmetic isn't allowed either.
int x[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
void *y { x };
y = y + 2; // fails
⚠️NOTE️️️⚠️
If you have a void *
and you want to do raw memory manipulation at that address, use a std::byte *
instead. Why not just use char *
instead? Is a char
guaranteed to be 1 byte (I think it is)? According to this, it's because certain assumptions about char
s may not hold with bytes? I don't know. Just remember std::byte *
if you're working with raw data.
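A rough sketch of what working with raw data through std::byte * might look like, assuming C++17 (std::byte and std::to_integer live in the cstddef header). The function name sum_bytes is hypothetical.

```cpp
#include <cstddef>  // std::byte, std::size_t, std::to_integer

// Sums the raw bytes of a memory region, whatever type actually lives there.
unsigned sum_bytes(const void *data, std::size_t len) {
    // A void * can't be dereferenced, so view the region as bytes first.
    const std::byte *bytes { static_cast<const std::byte *>(data) };
    unsigned total {0};
    for (std::size_t i {0}; i < len; ++i) {
        // std::byte isn't an arithmetic type; convert explicitly before adding.
        total += std::to_integer<unsigned>(bytes[i]);
    }
    return total;
}
```

Unlike char, std::byte doesn't take part in arithmetic directly, which is why the explicit std::to_integer call is needed.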
↩PREREQUISITES↩
A pointer to a function means the type being pointed to is a function with some specific structure. All functions have a type associated with them, defined by their return type, parameter types, and owning class if the function is a method.
// type is: int (int, int)
int add(int a, int b) {
return a + b;
}
To declare a function pointer to a free function or a static class method, write out the function type (return type and parameter list without names) but place the pointer name preceded by an asterisk (*) wrapped in parenthesis where the function name would be. Invoke it just like you would any other function.
int (*func_ptr)(int, int) {}; // unset pointer to a function of structure int (int, int)
int add(int a, int b) {
return a + b;
}
int multiply(int a, int b) {
return a * b;
}
func_ptr = &add; // point func_ptr to address of add()
func_ptr(1, 2); // invoke
To declare a function pointer to a non-static class method (member function), the class type needs to be included before the asterisk (*) using the scope resolution operator (::).
int (MyClass::*func_ptr)(int, int) {}; // unset pointer to a function of structure int (int, int) in or inherited from MyClass
MyClass x {};
func_ptr = &MyClass::multiply; // point to: int MyClass::multiply(int, int)
(x.*func_ptr)(2, 3); // provide x as the MyClass instance when invoking
Unlike normal functions, functors cannot be assigned to raw function pointers. A functor's version of a function pointer is a pointer to the function-call operator overload (method).
int (MyFunctor::*func_ptr)(int, int) {}; // unset pointer to a function of structure int (int, int) in or inherited from MyFunctor
MyFunctor x {};
func_ptr = &MyFunctor::operator();
(x.*func_ptr)(2, 3); // provide x as the MyFunctor instance when invoking
Alternatively, to support both functions and functors, the parameter expecting a function pointer should be changed to a std::function
or the code doing the invocation should be changed to use the std::invoke
wrapper. These wrappers abstract away the differences between pointers to functions and functors.
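The following sketch shows both wrappers side by side, assuming C++17 for std::invoke. The names Multiplier, Calculator, apply_op, and invoke_member are hypothetical and only serve the example.

```cpp
#include <functional>  // std::function, std::invoke

int add(int a, int b) { return a + b; }    // free function

struct Multiplier {                        // functor
    int operator()(int a, int b) const { return a * b; }
};

struct Calculator {                        // class with a non-static method
    int subtract(int a, int b) const { return a - b; }
};

// Accepts anything callable as int(int, int): function pointers and functors alike.
int apply_op(const std::function<int(int, int)> &op, int a, int b) {
    return op(a, b);
}

// std::invoke takes the instance as the first argument when given a
// pointer to a non-static member function.
int invoke_member() {
    Calculator calc {};
    return std::invoke(&Calculator::subtract, calc, 10, 4);
}
```

With this in place, apply_op(add, 2, 3) and apply_op(Multiplier{}, 2, 3) go through the same parameter type.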
↩PREREQUISITES↩
C++ provides a more sanitized version of pointers called references. A reference type is declared by adding an ampersand (&) after the type rather than an asterisk (*), and it implicitly takes the address of whatever is passed into it when it's created.
int w {5};
int *x { &w }; // x points to w
int &y { w }; // y references w (note address-of operator not used here)
The main difference between pointer types and reference types is that a reference type doesn't need to explicitly dereference to access the object pointed to. The object pointed to by the reference type is accessed as if it were the object itself.
*x = 10; // x explicitly dereferenced to w and set to 10
y = 15; // y implicitly dereferenced to w and set to 15
As shown in the example above, assignment to a reference type is assignment on the underlying object being referenced. As such, making a reference point to a different object after it's been created (referred to as reseating) isn't possible.
Similarly, it's not possible to have a reference to a reference.
int &&z { y }; // fail -- this isn't a reference to a reference (&& declares an rvalue reference, which can't bind to y)
⚠️NOTE️️️⚠️
The way to think of references is documented here. Don't consider a reference as an object the same way a pointer is an object. In the compiler's eyes, a reference doesn't store anything like a pointer does (stores a memory address). It's just a "reference" to an object -- the object itself has storage, but the reference to that object doesn't.
In that sense, it's impossible to have ...
const
reference like you have a const
pointer or an array of references.... the same way that you can have with a pointer.
↩PREREQUISITES↩
An rvalue reference is similar to a reference except that it tells the compiler that it's working with an rvalue. Rvalue references are declared by adding two ampersands (&&) after the type rather than just one.
// Function return type is an rvalue reference
MyObject && gimmie_an_rvalueref(int x) {
...
}
A variable of type rvalue reference is actually an lvalue to an rvalue reference. As such, passing a variable of type rvalue reference as a function argument will treat it as if it were an lvalue.
⚠️NOTE️️️⚠️
Confused? Recall from the expression categories section that, if it has a name (named variable or function), it's probably an lvalue.
void my_func(MyObject & x) {
std::cout << "NO RREF";
}
void my_func(MyObject && xRref) {
std::cout << "YES RREF";
}
MyObject &&a { gimmie_an_rvalueref(42) }; // a has a name, meaning it's an lvalue to an rvalue reference
my_func(a); // calls "NO RREF" version
If you need to pass a variable of type rvalue reference as a function argument, the typical approach is to either never store it as a variable or to use std::forward
to ensure the object remains an rvalue reference.
my_func(gimmie_an_rvalueref(42)); // calls "YES RREF" version
my_func(std::forward<MyObject &&>(a)); // calls "YES RREF" version
// NOTE: you MUST specify the full type in std::forward's template parameter -- automatic type inference not supported
⚠️NOTE️️️⚠️
See here.
Rvalue references are typically used for moving objects (not copying, but actually moving the guts of one object into another). This is done through something called a move constructor, which is explained in another section.
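As a sketch of the overload behaviour discussed in this section, the example below swaps in std::move (from the utility header, not mentioned above) as another way to turn a named rvalue reference back into an rvalue. The helper names pass_named and pass_moved are hypothetical, and my_func returns a string here instead of printing so the result can be inspected.

```cpp
#include <string>
#include <utility>  // std::move

struct MyObject {};

std::string my_func(MyObject &)  { return "NO RREF"; }
std::string my_func(MyObject &&) { return "YES RREF"; }

std::string pass_named(MyObject &&a) {
    return my_func(a);              // a has a name, so it's treated as an lvalue
}

std::string pass_moved(MyObject &&a) {
    return my_func(std::move(a));   // cast back to an rvalue before passing along
}
```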
🔍SEE ALSO🔍
↩PREREQUISITES↩
sizeof
is a unary operator that returns the size of its operand in bytes as a size_t
type. If the operand is a ...
data type or a variable, it'll return the number of bytes needed to hold that type. For example, ...
  sizeof (char) is guaranteed to be 1.
  sizeof (char &) is guaranteed to be 1.
  sizeof (char *) is platform dependent, typically either 4 or 8.
  char const * x { "hi" }; sizeof x is the same as sizeof (char *) (see above).
an expression such as an array or a string literal, it'll return the number of bytes needed to hold it. For example, ...
  sizeof "hi" is 3 (added 1 for the null terminator at the end).
  int a[] { 5, 5, 4 }; sizeof a is platform dependent, typically either 12 or 24.
  sizeof (int[3]) is platform dependent, typically either 12 or 24.
  int x[n]; sizeof x is invalid C++ when n isn't a constant (variable length arrays allocated on stack are not allowed in C++).
In other words, sizeof
returns the size of things known at compile-time. If a variable is passed in, it outputs the size of the data type. For example, if the data type is a struct of type MyStruct
, it'll return the number of bytes used to store a MyStruct
. However, if the data type is a pointer to MyStruct
, it'll return the number of bytes to hold that pointer. That is, you can't use it to get the size of something like a dynamically allocated array of integers.
In certain cases, the compiler may add padding to objects (e.g. byte boundary alignments or performance reasons), meaning that the size returned by sizeof
for an object shouldn't be used to make inferences about the characteristics of that object. For example, a long double
may get reported as being 16 bytes, but that doesn't necessarily mean that a long double
is a 128-bit quad floating point. It could be that only 12 of those bytes are used to represent the floating point number while the remainder is just padding for alignment reasons.
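The points above about compile-time sizes and padding can be sketched as follows. Padded and the function names are hypothetical, and the exact sizes depend on the platform, so the example only relies on relationships that always hold.

```cpp
#include <cstddef>  // std::size_t

struct Padded {
    char tag;    // 1 byte
    int value;   // often aligned, so the compiler may insert padding before it
};

// sizeof on an automatic array covers the whole array.
std::size_t element_count() {
    int values[] {5, 5, 4};
    return sizeof(values) / sizeof(values[0]);
}

// sizeof on a pointer covers only the pointer, not what it points to.
std::size_t pointer_size_only() {
    int *dynamic { new int[100] };
    std::size_t size { sizeof(dynamic) };   // size of an int *, not of 100 ints
    delete[] dynamic;
    return size;
}
```

On many platforms sizeof(Padded) comes out larger than sizeof(char) + sizeof(int), which is the padding at work.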
⚠️NOTE️️️⚠️
As shown in the examples above, the sizeof
a C++ reference is the same as the raw size. For example, sizeof (char) == sizeof (char &)
.
⚠️NOTE️️️⚠️
The last example is valid in C99 (called a VLA -- variable length array) but not C++. The reason is C++ has std::vector, which gives you basically the same thing as a variable length array.
In C, where VLAs are allowed, doing a sizeof
on a VLA is evaluated at run-time rather than at compile-time.
⚠️NOTE️️️⚠️
Remember that sizeof
is a unary operator, similar to how the negative sign is a unary operator that negates whatever is to the right of it. People usually structure its usage in code as if it were a function (e.g. sizeof(x)
vs sizeof x
). This sometimes causes confusion for people coming from other languages.
↩PREREQUISITES↩
The using
keyword is used to give synonyms to types. Other than having a new name, a type alias is the exact same as the originating type.
using IntegerButWithNewName = int;
int x {42};
IntegerButWithNewName y {42}; // same as: int y {42};
IntegerButWithNewName z {x + y}; // same as: int z {x + y};
int func(float x);
int func(short x);
int func(int x);
int func(IntegerButWithNewName x); // NOT ALLOWED -- this overload is same as the overload above
⚠️NOTE️️️⚠️
To allow for use-cases such as the function overloading case in the example above, the cleanest solution is to wrap the type in a class.
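A minimal sketch of that wrapping approach. The wrapper name Meters is hypothetical. Because it is a distinct type rather than an alias, it can take part in overloading.

```cpp
#include <string>

using IntegerButWithNewName = int;   // alias: still just int

struct Meters {                      // wrapper: a genuinely different type
    int value;
};

std::string func(int)    { return "int"; }
std::string func(Meters) { return "Meters"; }
```

Passing an IntegerButWithNewName selects the int overload, while passing a Meters selects its own overload.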
The benefit of type aliasing is that it helps shorten type names, which can be especially useful when using a template.
using BasicGraph = DirectedGraph::Graph<std::string, std::map<std::string, std::string>, std::string, std::map<std::string, std::string>>;
BasicGraph removeLimbs(const BasicGraph &g);
🔍SEE ALSO🔍
↩PREREQUISITES↩
For types, any part of that type can be made unmodifiable by adding a const
immediately after it.
int a {5}; // a is changeable -- set to 5
int const x {a}; // x is unchangeable -- set to 5 (value in a)
int * const y {&a}; // y is an unchangeable pointer to a changeable int -- set to a (points to a)
int const * const z {&x}; // z is an unchangeable pointer to an unchangeable int -- set to x (points to x)
The simplest way to interpret const
-ness of a type is to read it from right-to-left.
One caveat to the above is that a type beginning with const
is the same as the first part of that type having const
applied on it.
const int x {5}; // same as int const x {5}
All of the examples above were for fundamental types. Appending a const
on a class type works exactly the same way: None of its fields are modifiable ever, even by its own methods.
struct MyStruct {
int x {5};
};
MyStruct const inst {};
inst.x = 5; // compiler error
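The sketch below extends the idea to methods, under the assumption that the const suffix modifier listed in the functions section is what marks a method as safe to call on a const object. Counter and read_const are hypothetical names.

```cpp
struct Counter {
    int count {5};
    int get() const { return count; }   // promises not to modify the object
    void increment() { count += 1; }    // may modify the object
};

int read_const() {
    Counter const fixed {};
    // fixed.increment();   // compiler error: non-const method on a const object
    return fixed.get();     // ok: get() is marked const
}
```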
↩PREREQUISITES↩
⚠️NOTE️️️⚠️
Unlike in Java, the volatile
keyword in C++ is not used for thread-safety.
Adding the keyword volatile
before a type makes it immune to compiler optimizations such as operation re-ordering and removal. Mutations and accesses, no matter how irrelevant they may seem, are kept in-place and in-order by the compiler.
int f(int a) {
int x {a};
x = 6;
int y {x};
x = y;
return x; // at this point, x is always 6
}
A compiler might be able to deduce that the function above always returns 6, and as such may replace the operations it performs with simply just returning 6. Adding volatile
to the type of the variable prevents this from happening.
int f(int a) {
volatile int x {a}; // marked as volatile
x = 6;
int y {x};
x = y;
return x;
}
Using volatile
is important when working with embedded devices, where platform-specific memory locations often need to be accessed in a specific order / at specific intervals in seemingly useless ways (e.g. kicking a watchdog by writing 0 to a memory location but never reading that memory location).
If a variable has been deprecated, adding a [[deprecated]]
attribute will allow the compiler to generate a warning if it sees it being used.
[[deprecated("Warning -- this is going away in the next release")]]
int my_variable;
↩PREREQUISITES↩
An implicit type conversion is when an object of a certain type is converted (cast) automatically, without code explicitly changing the object to a different type. For example, long x {1}
implicitly converts the int
literal in the initializer to a long
.
int x {5};
long y {x}; // int to long
The most common types of implicit conversions are ...
pointer conversions (e.g. int * to void *).
numeric conversions (e.g. int to float).
boolean conversions (e.g. 0 to false).
Depending on the operation performed or how an object is initialized, an implicit conversion may behave in a way that's specific to that platform and/or compiler implementation.
source type | destination type | behaviour |
---|---|---|
Integer | Floating Point | Nearest representable value if it can't be represented exactly, undefined behaviour if it's out of the destination's range. |
Floating Point | Integer | Fractional part is discarded (truncates toward zero), undefined behaviour if the truncated value can't fit in destination. |
Integer | Integer | Signed destination and value can't fit, implementation-specific behaviour (wraps around as of C++20). Unsigned destination and value can't fit, truncates higher-order bits. |
Floating Point | Floating Point | Nearest representable value if it can't be represented exactly, undefined behaviour if it's out of the destination's range. |
Any Numeric | Boolean | 0 converts to false , otherwise true . |
Any Pointer | Boolean | nullptr converts to false , otherwise true . |
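Two rows of the table can be sketched in code. The function names are hypothetical, and the wrap-around value assumes an 8-bit unsigned char. Equals initialization is used on purpose, since braced initialization would reject these narrowing conversions.

```cpp
// Floating point to integer: the fractional part is discarded.
int truncate_to_int(double value) {
    int result = value;   // implicit conversion
    return result;
}

// Integer to a smaller unsigned integer: higher-order bits are dropped.
unsigned char wrap_to_byte(int value) {
    unsigned char result = value;   // implicit conversion
    return result;
}
```

For example, truncate_to_int(-3.9) gives -3 rather than -4, and wrap_to_byte(300) gives 44 (300 modulo 256) when unsigned char is 8 bits wide.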
⚠️NOTE️️️⚠️
The book recommends to always use braced initialization because when you do, the compiler produces warnings about types not fitting. However, those warnings don't seem to cover everything, at least that's the impression I get from what I've tried.
↩PREREQUISITES↩
An explicit type conversion is the opposite of an implicit type conversion. It's when an object of a certain type is explicitly converted (cast) to another type in code.
long x {5L};
int y {static_cast<int>(x)}; // long to int
Explicit type conversions come in two forms: named conversions and C-style casts.
Named conversions should be preferred over C-style casts. Any C-style cast can be performed through a named conversion.
Named conversion functions are a set of (seemingly templated) functions to convert an object's type. These functions provide safety mechanisms that aren't available in older ways of casting.
const_cast
removes the const
modifier from an object's type.
void func(const MyType &t) {
MyType &moddable_t { const_cast<MyType &>(t) };
}
Performing this type of conversion should only be done in extreme situations since it breaks contracts.
static_cast
forces the reverse of an implicit conversion.
int a[] {1,2,3,4};
int *b { a }; // ok, implicit conversion (decay to pointer)
void *c { b }; // ok, implicit conversion
int *d { c }; // error, can't go in reverse
int *e { static_cast<int *>(c) }; // ok
In the above example, an int *
implicitly converts to void *
, but not the reverse. A static_cast
makes going in reverse possible. However, that doesn't mean it's always safe to do. For example, int
reads may need to be aligned to 4 byte boundaries on certain platforms. If the void *
was arbitrary data (e.g. coming in over a network), it might cause a crash to just treat it as an int *
and start reading.
⚠️NOTE️️️⚠️
Why does an int *
implicitly convert to a void *
? Recall that void *
just means "pointer to something unknown", which is something the language is okay automatically / implicitly converting.
reinterpret_cast
forces a reinterpretation of an object into an entirely different type.
int a[] {1,2,3,4};
int *b { a }; // ok, implicit conversion (decay to pointer)
short *c { b }; // error, you can't convert from an int* to a short* (not even with a static_cast, because it isn't the reverse of an implicit conversion)
short *d { reinterpret_cast<short *>(b) }; // ok
narrow_cast
is similar to static_cast
for numerics, except it ensures that no information loss occurred.
uint32_t a { 70000 }; // ok
uint16_t b { static_cast<uint16_t>(a) }; // ok, but since uint16_t has a max of 65535, this object is mangled
uint16_t c { narrow_cast<uint16_t>(a) }; // runtime exception, narrow_cast sees that the object will be mangled
⚠️NOTE️️️⚠️
Is this part of the standard? The book seems to give the code for narrow_cast
and looking online it looks like people have their own implementations?
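Since narrow_cast appears to be something people implement themselves, here is one possible sketch of such an implementation. It is not the book's code or any particular library's. It converts, converts back, and throws if the round trip or the sign changed. The helper fits_in_u16 is hypothetical and only exists to show the failure case.

```cpp
#include <cstdint>
#include <stdexcept>

template <typename To, typename From>
To narrow_cast(From value) {
    const To converted { static_cast<To>(value) };
    // The value must survive a round trip back to the source type ...
    const bool round_trips { static_cast<From>(converted) == value };
    // ... and must not have flipped sign (e.g. -1 becoming a large unsigned value).
    const bool same_sign { (converted < To{}) == (value < From{}) };
    if (!round_trips || !same_sign) {
        throw std::runtime_error{"narrow_cast: value changed during conversion"};
    }
    return converted;
}

// Reports whether a 32-bit value survives narrowing to 16 bits.
bool fits_in_u16(std::uint32_t value) {
    try {
        narrow_cast<std::uint16_t>(value);
        return true;
    } catch (const std::runtime_error &) {
        return false;
    }
}
```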
🔍SEE ALSO🔍
any_cast (for an "any" container)
clock_cast (for converting times between different types of clocks)
duration_cast (for converting between different types of durations)
C-style casts are similar to casts seen in Java. The type is bracketed before whatever is being evaluated.
int x { (int) 9999999999L };
The problem with C-style casting is that it doesn't provide the same safety mechanisms as named conversions do (e.g. a C-style cast can inadvertently strip the const
-ness). Named conversions provide these safety mechanisms and as such should be preferred over C-style casts. Any C-style cast can be performed using a named conversion.
↩PREREQUISITES↩
In C++, an object is a region of memory that has a type and a value (e.g. a class instance, an integer, a pointer to an integer, etc.). Contrary to other more high-level languages (e.g. Java), C++ objects aren't exclusive to classes (e.g. a boolean is an object).
An object's life cycle passes through the following stages:
The storage duration of an object starts from when its memory is allocated and ends when that memory is deallocated. An object's lifetime, on the other hand, starts when its constructor completes (meaning the constructor finishes) and ends when its destructor is invoked (meaning when the destructor starts).
Since C++ doesn't have a garbage collector performing cleanup like other high-level languages, it's the user's responsibility to manage object lifetimes. The user is responsible for knowing when objects should be destroyed and ensuring that objects are only accessed within their lifetime.
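The lifetime rules above can be observed with a small sketch that records when constructors and destructors run. Tracer, events, and run_block are hypothetical names used only for illustration.

```cpp
#include <string>
#include <utility>  // std::move
#include <vector>

std::vector<std::string> events;   // log of what happened, in order

struct Tracer {
    std::string name;
    explicit Tracer(std::string n) : name{std::move(n)} {
        events.push_back(name + " constructed");   // lifetime starts after this completes
    }
    ~Tracer() {
        events.push_back(name + " destroyed");     // lifetime ends when this starts
    }
};

void run_block() {
    Tracer outer {"outer"};
    {
        Tracer inner {"inner"};
    }   // inner's lifetime ends at the end of its block
    events.push_back("end of function body");
}       // outer's lifetime ends here
```

After one call to run_block, the log shows inner being destroyed before the function body finishes and outer being destroyed last.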
The typical storage durations supported by C++ are...
By default, an object declared within a function is said to be an automatic object. Automatic objects have automatic storage durations: start at the beginning of the block and finish at the end of the block. When the keyword static
(or extern
in some cases) is added to the declaration, the storage duration of the object changes.
At global scope, if an object is declared as static
or extern
, storage duration of the object spans the entire duration of the program. The difference between the two is essentially just visibility:
static makes it so it's accessible to only the translation unit it's declared in.
extern makes it so it's accessible to other translation units as well as the translation unit it's declared in.
static int a { 0 }; // static variable
extern int b { 1 }; // static variable (accessible outside translation unit)
At function scope, the storage duration of objects declared as static
starts at the first invocation of that function and ends when the program exits.
int f1() {
static int z {0}; // static variable
z += 1;
return z;
}
At class level, the storage duration of a member (field or method) declared as static
is essentially the same as if it were declared at global scope (they aren't bound to an individual instance of the class the same way a normal field or method is). The only differences are that the static member is accessed on the class itself using the scope resolution operator (::) and that static members that are fields must be initialized at global scope.
class X {
public:
static int m; // static member (field initialized at end)
static int f1() { // static member (method)
m += 1;
return m;
}
};
int X::m = 0; // initialize static member (the definition repeats the type)
If the thread_local
modifier is added before static
(or extern
), each thread gets its own copy of the object. That is, the storage duration essentially gets changed to when the thread starts and ends.
thread_local static
can be shortened to just thread_local
(it's assumed to be static).
static int a {0};
thread_local static int b {1};
thread_local extern int c {2};
An object can be created in an ad-hoc manner, such that its storage duration is entirely controlled by the user. The operator ...
new allocates a new object and calls its constructor.
delete calls the destructor of some object and deallocates it.
Both keywords work with pointers: new returns a pointer while delete requires a pointer. To create a new object, use new followed by the type.
int * ptr { new int };
*ptr = 0;
delete ptr;
Objects may be initialized directly within the new
invocation just as if it were an automatic object initialization. The only caveat is that equals initialization and brace-plus-equals initialization won't work because the equal sign is already being used during new
(speculation -- it doesn't work but I don't know the exact reason). As such, braced initialization is the best way to initialize a dynamic object.
int * ptr { new int {0} }; // initialize to 0
delete ptr;
The same process can be used to create an array of objects. Unlike automatic object arrays, dynamic arrays don't have a constant size array length restriction. However, the return value of new
will decay from an array type to a pointer type.
When deleting a dynamic object array, square brackets need to be appended to delete
operator: delete[]
. Doing so ensures that the destructor for each object in the array gets invoked before deallocation.
int * ptr { new int[len] }; // len is some non-constant positive integer, decayed to pointer type because array length can be non-constant.
delete[] ptr;
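To illustrate that delete[] runs the destructor of every element, the sketch below counts destructor calls. Widget, destroyed_count, and destroy_array are hypothetical names.

```cpp
#include <cstddef>  // std::size_t

int destroyed_count {0};   // incremented by every Widget destructor

struct Widget {
    ~Widget() { destroyed_count += 1; }
};

// Allocates len widgets dynamically, deletes them, and reports how many
// destructors ran.
int destroy_array(std::size_t len) {
    destroyed_count = 0;
    Widget *widgets { new Widget[len] };   // len doesn't need to be a constant
    delete[] widgets;                      // runs ~Widget() once per element
    return destroyed_count;
}
```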
Braced initialization may be used when declaring dynamic arrays so long as the size of the array is at least the size of the initialization list.
int * ptr1 { new int[10] {1,2,3} }; // initialize the first 3 elems of a 10 elem array
int * ptr2 { new int[2] {1,2,3} }; // compiler error (size too small for initializer list)
int * ptr3 { new int[n] {1,2,3} }; // okay -- so long as n >= 3 (fails at run-time otherwise)
delete[] ptr1;
delete[] ptr2;
delete[] ptr3;
By default, dynamic objects are stored on a block of memory called the heap, also sometimes referred to as the free store.
⚠️NOTE️️️⚠️
See operator overloading section to see how the new
and delete
operators may be overridden to customize where and how a specific type gets stored.
The new
and delete
operators may also be overridden globally rather than per-type. See the new header.
↩PREREQUISITES↩
C++ function declarations and definitions have the following form: prefix-modifiers return-type name(parameters) suffix-modifiers
return-type (required) - Type returned by function.
name (required) - Name of function.
parameters (required) - Parameter list of function.
prefix-modifiers (optional) - Markers controlling the behaviour / properties of a function (e.g. static, virtual, constexpr, [[noreturn]], inline, ...).
suffix-modifiers (optional) - Markers controlling the behaviour / properties of a function (e.g. noexcept, const, final, override, volatile, ...).
int add(int x, int y) {
return x + y;
}
In C++, functions that are ...
This section deals with free functions.
⚠️NOTE️️️⚠️
Some of the modifiers listed above are for member functions, not free functions.
Function overloading is when there are multiple functions with the same name in the same scope. For free functions, each function overload must have a unique set of parameters (overloads can't differ by return type alone).
bool test(int a) { return a != 0; }
bool test(double a) { return a != 0.0; }
bool test(int a, int b) { return a != b; }
When an overloaded function is called, the compiler will try to match argument types against parameter types to figure out which overloaded function to call. If no exact match can be found, the compiler attempts to obtain a correct set of types through a set of conversions.
int num { 1 };
test(num); // calls the first overload in the code above: bool test(int a);
⚠️NOTE️️️⚠️
See argument matching section.
When a function is called but the argument types don't match the parameter list types, the compiler attempts to obtain a correct set of types through a set of conversions on the arguments. For example, if a parameter expects a reference to a constant object but what gets passed into the argument is an object, the argument is automatically converted to a constant object and its reference is used.
bool test(const int &obj) { ... }
int x {};
test(x); // x is turned into a "const int" and passed in as a reference
For floating point and integral types, the compiler will widen or narrow the argument if the exact type isn't found.
bool test(int32_t a) {
std::cout << a;
return a != 0;
}
float x {1.5};
test(x); // automatic narrowing
Similarly, the compiler will convert between signed and unsigned integral types if the exact integral type isn't found.
bool test(uint32_t a) {
std::cout << a;
return a != 0;
}
int64_t x {10};
test(x); // automatic narrowing and change to unsigned
When function overloads are involved, the candidate with the arguments matching most closely is the one chosen.
⚠️NOTE️️️⚠️
The exact rules here seem hard to definitively pin down. If you have two overloads of a function, one accepting int16 and the other int64, it'll fail when you try to call it with int8, claiming that the call is ambiguous. The best thing to do is to just ask the compiler to either warn on implicit conversion (-Wconversion
) flag or on narrowing implicit conversion (-Wnarrowing
/ -Wno-narrowing
). These flags may not be included under -Wall
.
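As a partial illustration of "closest match wins", the sketch below relies on one rule that is well defined: promotions (short to int, float to double) are preferred over other conversions. The overload name which and the helper functions are hypothetical. When no candidate is clearly closest, an explicit static_cast at the call site removes the ambiguity.

```cpp
#include <string>

std::string which(int)    { return "int"; }
std::string which(double) { return "double"; }

// short promotes to int, which beats converting short to double.
std::string call_with_short() { short s {1};    return which(s); }

// float promotes to double, which beats converting float to int.
std::string call_with_float() { float f {1.5f}; return which(f); }

// A long matches neither overload better than the other, so calling
// which(n) directly would be ambiguous. The cast picks one explicitly.
std::string call_with_cast()  { long n {1};     return which(static_cast<int>(n)); }
```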
↩PREREQUISITES↩
The entry-point to any C++ program is the main
function, which can take one of three possible forms:
int main()
No arguments.
int main(int argc, char* argv[])
Command-line arguments, where argv
is an array of size argc
containing the null-terminated command-line arguments. On most modern platforms, the first argument is the path of the executable.
int main(int argc, char* argv[], EXTRA_PLATFORM_SPECIFIC_PARAMS)
Same as the above except extra arguments are supplied that are platform-specific.
All three forms return an integer known as an exit code. On most modern day platforms, an exit code of 0 means success. If the code doesn't return an exit code, 0 is assumed.
#include <iostream>
int main(int argc, char* argv[]) {
std::cout << "hello world!" << ' ' << argv[0];
return 0;
}
⚠️NOTE️️️⚠️
Should argv
be const char * const *
? In that you shouldn't be able to change the strings or the string pointer at each array index.
↩PREREQUISITES↩
A variadic function is one that takes in a variable number of arguments, sometimes called varargs in other languages. A function can be made variadic by placing ...
as the final parameter. The arguments for this final parameter are called the variadic arguments.
The variadic arguments for a function are accessible through functionality provided by the cstdarg header.
#include <cstdarg>
#include <cstddef>
float avg(size_t n, ...) {
va_list args;
va_start(args, n);
float sum {0};
for (size_t i {0}; i < n; i++) {
sum += va_arg(args, double); // float arguments are promoted to double when passed as variadic arguments
}
va_end(args);
return sum / n;
}
va_list - Access point to variadic arguments.
va_start - Initializes access to variadic arguments (requires the va_list variable and the name of the last parameter before the variadic arguments).
va_arg - Gets the next variadic argument (requires the va_list variable and the expected type).
va_end - Tears down access to the variadic arguments (requires the va_list variable).
In addition, va_copy can be used to copy one va_list to another. The source will need to be initialized before the copy (via va_start). Once va_copy returns, the copy will already be initialized (no need for va_start) but will need to be torn down before the function exits (via va_end).
#include <cstdarg>
#include <cstddef>
float add_and_mult(size_t n, ...) {
va_list args;
va_list args2;
va_start(args, n);
va_copy(args2, args); // 1st param is dst, 2nd param is src
float res {0};
for (size_t i {0}; i < n; i++) {
res += va_arg(args, double); // float arguments arrive as double
}
va_end(args);
for (size_t i {0}; i < n; i++) {
res *= va_arg(args2, double);
}
va_end(args2);
return res;
}
⚠️NOTE️️️⚠️
The book recommends against using variadic functions due to confusing usage and having to explicitly know the count and types of the variadic arguments beforehand (can become a security problem if screwed up). It recommends using variadic templates for functions instead.
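For comparison, here is a rough sketch of what the avg function might look like as a variadic template. This is one possible shape, not the book's code. The helper name sum_all is hypothetical. The argument count and types are known to the compiler, so no count parameter or type guessing is needed.

```cpp
// Base case: no arguments left to add.
inline double sum_all() { return 0.0; }

// Recursive case: add the first argument to the sum of the rest.
template <typename First, typename... Rest>
double sum_all(First first, Rest... rest) {
    return static_cast<double>(first) + sum_all(rest...);
}

template <typename... Args>
double avg(Args... args) {
    static_assert(sizeof...(Args) > 0, "avg needs at least one argument");
    return sum_all(args...) / sizeof...(Args);   // sizeof... is the argument count
}
```

A call such as avg(1, 2) works with mixed or non-floating-point argument types, which the va_arg version can't do safely.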
In certain cases, it'll be impossible for a function to throw an exception. Either the function (and the functions it calls into) never throws an exception or the conditions imposed by the function make it impossible for any exception to be thrown. In such cases, a function may be marked with the noexcept
keyword. This keyword allows the compiler to perform certain optimizations that it otherwise wouldn't have been able to, but it doesn't necessarily mean that the compiler will check to ensure an exception can't be thrown.
int add(int a, int b) noexcept {
return a + b;
}
⚠️NOTE️️️⚠️
The book mentions this is documented in "Item 16 of Effective Modern C++ by Scott Meyers". It goes on to say that, unless specified otherwise, the compiler assumes move constructors / move-assignment operators can throw an exception if they try to allocate memory but the system doesn't have any. This prevents it from making certain optimizations.
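The noexcept keyword also works as an operator that reports, at compile time, whether an expression is declared as non-throwing. The sketch below uses it to show that the check is based purely on the declaration. The function risky_add is hypothetical. If an exception does escape a function marked noexcept, the program is terminated via std::terminate instead of the exception propagating.

```cpp
int add(int a, int b) noexcept {
    return a + b;
}

int risky_add(int a, int b) {   // same body, but not marked noexcept
    return a + b;
}

// The noexcept operator only looks at the declaration, not at the body.
static_assert(noexcept(add(1, 2)), "add is declared noexcept");
static_assert(!noexcept(risky_add(1, 2)), "risky_add isn't declared noexcept");
```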
If a function has no possibility of ever gracefully returning to the caller, adding a [[noreturn]]
attribute will allow the compiler to make certain optimizations and provide / remove relevant warnings around that function.
[[noreturn]] int add(int a, int b) {
throw "error";
}
If a function returns something and it's of vital importance that the return value should be used by the invoker, adding a [[nodiscard]]
attribute will allow the compiler to generate a warning when the return value is ignored.
[[nodiscard]] Result perform(int a) {
// perform some computation
if (result < 0) {
return ERROR_CODE;
}
return SUCCESS_CODE;
}
If a function's parameter isn't used but its inclusion in the parameter list is intentional, adding a [[maybe_unused]]
attribute will allow the compiler to suppress any warnings that might otherwise show up about it being unused.
int add(int a, int b, [[maybe_unused]] int c) {
return a + b;
}
If a function has been deprecated, adding a [[deprecated]]
attribute will allow the compiler to generate a warning if it's being used.
[[deprecated("Warning -- this is going away in the next release")]]
int add(int a, int b) {
return a + b;
}
C++ enumerations are declared using enum class
.
enum class MyEnum {
OptionA,
OptionB,
OptionC
};
MyEnum x {MyEnum::OptionC};
switch (x) {
case MyEnum::OptionA:
...
break;
case MyEnum::OptionB:
...
break;
case MyEnum::OptionC:
...
break;
default:
break;
}
An enumeration may be brought into scope via using
to remove the need to prefix with the enumeration's name (requires C++20).
switch (x) {
using enum MyEnum;
case OptionA:
...
break;
case OptionB:
...
break;
case OptionC:
...
break;
default:
break;
}
⚠️NOTE️️️⚠️
It's possible to remove the class
from enum class
, which heavily loosens type-safety and scope. By removing class
, the options implicitly convert to integers and you don't need the scope resolution operator (the options are accessible in the same scope that the enum itself is declared in).
enum MyEnum { // no class keyword
OptionA,
OptionB,
OptionC
};
MyEnum x {OptionC}; // this is okay -- don't have to use MyEnum::OptionC
int y {OptionC}; // this is okay -- options are integers
You should prefer enum class
.
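With enum class, getting at the integer behind an option requires an explicit cast. The sketch below illustrates that; the Color type, its values, and the helper functions are hypothetical, and the explicit underlying type (std::uint8_t) is an optional extra that isn't covered above.

```cpp
#include <cstdint>

// Scoped enumeration with an explicitly chosen underlying type and values.
enum class Color : std::uint8_t {
    Red = 1,
    Green = 2,
    Blue = 4
};

// Scoped enumerations never convert to integers implicitly; a cast is required.
constexpr int to_int(Color c) {
    return static_cast<int>(c);
}

// Converting back from an integer also needs a cast.
constexpr Color from_int(int raw) {
    return static_cast<Color>(raw);
}
```

Something like int y {Color::Blue}; would be rejected by the compiler, which is the type-safety that a plain enum gives up.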
↩PREREQUISITES↩
C++ classes are declared using either the struct
keyword or class
keyword. When ...
struct
is used, the default visibility of class members is public.
class
is used, the default visibility of class members is private.
Public and private visibility are the same as in most other languages: private members aren't accessible outside the class while public members are. In C++ nomenclature, ...
class MyStruct {
private:
int count;
bool flag;
public:
char name[256];
void add() {
count += 1;
flag = false;
}
};
C++ classes that contain only data are called plain-old-data classes (POD), and they're typically created using the struct
keyword so that their members are all accessible by default.
struct MyStruct {
int count;
char name[256];
bool flag;
};
⚠️NOTE️️️⚠️
C++ guarantees that a class's fields will be sequentially stored in memory, but they may be padded / aligned based on the platform. Be aware when using the sizeof operator.
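A small sketch of what padding means for sizeof follows. The struct name and its fields are hypothetical, and the byte counts in the comments are typical rather than guaranteed, which is why the checks only assert what holds on any platform.

```cpp
#include <cstddef>

struct Padded {
    char tag;    // 1 byte, usually followed by padding so that 'count' is aligned
    int count;   // typically 4 bytes
    char flag;   // 1 byte, usually followed by trailing padding
};

// Sum of the individual field sizes, ignoring any padding.
constexpr std::size_t raw_size {sizeof(char) + sizeof(int) + sizeof(char)};

// The struct is at least as large as its fields combined (often larger).
static_assert(sizeof(Padded) >= raw_size);

// For a simple struct like this one, fields appear in declaration order.
static_assert(offsetof(Padded, tag) < offsetof(Padded, count));
static_assert(offsetof(Padded, count) < offsetof(Padded, flag));
```

On many common platforms sizeof(Padded) comes out to 12 rather than 6, but that figure is platform-dependent and shouldn't be hard-coded.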
↩PREREQUISITES↩
Non-static methods of a class have access to an implicit pointer called this
, which allows for accessing that instance's members. As long as the class member doesn't conflict with any parameter name of the method invoked, the usage of that name will implicitly reference the this
pointer.
The member-of-pointer operator (->) allows for dereferencing a pointer and accessing a member on the result in a more concise form.
class MyStruct {
private:
int count;
bool flag;
public:
void f1(int count) {
this->count = count; // same as (*this).count = count
flag = false; // no parameter is named flag, so this implicitly means this->flag
}
void f2(int count, bool flag) {
this->count = count; // same as (*this).count = count
this->flag = flag; // same as (*this).flag = flag
}
};
↩PREREQUISITES↩
For fields of a class, a const
before the type has the same meaning as a const
variable at global scope: It's unmodifiable.
For methods of a class, a const
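after the parameter list indicates that the class's fields won't be modified (read-only). This is a deep check rather than a shallow check: fields that are themselves objects are also treated as const, so their fields can't be modified and only their const methods can be invoked. Objects that are merely pointed to by a pointer field aren't covered by the check.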
struct Inner {
int x {5};
int y {6};
void change(int n) {
x = n;
}
};
struct X {
int a {0};
Inner inner;
void test1() const {
a = 5; // NOT okay -- no mutation allowed
}
void test2() const {
inner.x = 15; // NOT okay -- no mutation allowed, even though this is deeper down
}
void test3() const {
inner.change(15); // NOT okay -- method being invoked must be const (otherwise mutation might happen)
}
};
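The example above only shows what is rejected. As a complementary sketch of what does compile, the hypothetical Counter class below has one const and one non-const method, and a free function that only has read-only access to it.

```cpp
class Counter {
public:
    // Non-const method: allowed to modify fields.
    void increment() {
        count += 1;
    }
    // Const method: may only read fields, so it's callable on const objects.
    int value() const {
        return count;
    }
private:
    int count {0};
};

// Takes a reference to const, so only const methods may be invoked on it.
int read_only(const Counter &c) {
    // c.increment(); // would NOT compile -- increment() isn't const
    return c.value();
}
```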
↩PREREQUISITES↩
For fields of a class, a volatile
before the type has the same meaning as a volatile
variable at global scope: The compiler won't optimize its access.
For methods of a class, a volatile
after the parameter list indicates that all fields should be treated as volatile
(access won't be optimized away or re-ordered). This is a deep check rather than a shallow check, meaning that any method invoked on a field from within such a method must itself be marked volatile
.
⚠️NOTE️️️⚠️
Another way to think of this is that the volatile
on a method makes it treat the instance of the class as if the variable that was declaring it were volatile
-- meaning all of its members are treated as volatile
recursively down the object tree.
struct Inner {
int x {5};
int y {6};
void change(int n) volatile {
x = n;
x = n;
x = n;
}
};
struct X {
int a {0};
int b {0};
Inner inner;
void test() volatile {
a = b;
b = a;
inner.change(15);
}
};
If a class has been deprecated, adding a [[deprecated]]
attribute will allow the compiler to generate a warning if it sees it being used.
struct [[deprecated("Warning -- this is going away in the next release")]] MyStruct {
int count;
};
For fields of a class, a static
before the type indicates that the field is independent of any instances of the class type: a static field refers to the same memory across all instances.
For methods of a class, a static
before the return type indicates that the function is independent of any instances of the class type, meaning that the only class fields that a static
method can access are static
fields.
static
methods and fields are accessed using the scope resolution (::) operator, where the scope is the class itself.
struct X {
inline static int a {1}; // inline (C++17) is needed to initialize a non-const static field inside the class
int b {0};
static void double_it() {
a *= 2;
}
};
X::double_it(); // call using scoped resolution
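To make the "same memory across all instances" point concrete, here is a small sketch with a hypothetical Tally class: each instance has its own copy of one field while sharing the other.

```cpp
struct Tally {
    inline static int total {0}; // one shared copy for the whole class
    int own {0};                 // one copy per instance

    void add(int n) {
        own += n;    // affects only this instance
        total += n;  // affects the shared field
    }
    // Static method: no 'this' pointer, so only static fields are reachable.
    static int grand_total() {
        return total;
    }
};
```

After calling add(2) on one instance and add(3) on another, each instance's own field holds its own number, while Tally::grand_total() reports 5.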
⚠️NOTE️️️⚠️
Be careful, static
has a different meaning for functions than it does for methods.
🔍SEE ALSO🔍
↩PREREQUISITES↩
C++ classes are allowed one or more constructors that initialize the object. Similar to Java, each constructor should have the same name as the class itself, no return type, and a unique parameter list.
class MyStruct {
private:
int count;
bool flag;
public:
MyStruct() {
count = 0;
flag = false;
}
MyStruct(int initialCount, bool initialFlag) {
this->count = initialCount;
this->flag = initialFlag;
}
};
The second constructor above uses the member-of-pointer operator (->) to access the this
pointer. Non-static methods of a class have access to an implicit pointer called this
, which allows for accessing that instance's members. The member-of-pointer operator allows for dereferencing a pointer and accessing a member on the result in a more concise form.
this->count = 0; // same as (*this).count = 0
If a class offers constructors, the least error-prone way to invoke it is to use braced initialization: MyStruct x { 5, true }
. The reason is that C++ has so many object initialization foot-guns that, while simpler methods may work (e.g. MyStruct x(5, true)
), those methods may end up being interpreted by the compiler as something else that's entirely different (e.g. function declaration).
⚠️NOTE️️️⚠️
This ambiguity is often referred to as the "most vexing parse" problem.
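Classes that don't have any constructors declared get an implicit zero-arg constructor. That implicit constructor doesn't zero out fields of fundamental type (e.g. int): declaring a local variable without an initializer leaves such fields holding indeterminate values, while initializing with empty braces zeroes them. If the class is a POD, a braced initialization that is ...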
struct MyStruct {
int count;
char name[256];
bool flag;
};
MyStruct a; // fields are left uninitialized (indeterminate values) when a is a local variable
MyStruct b {}; // fields are zeroed out
MyStruct c {5, "steve", true}; // initialized to supplied arguments
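A compilable sketch of the braced forms, using a hypothetical Record struct and helper functions. It sticks to the forms whose results are guaranteed; a local declared as Record r; with no braces would leave the fields indeterminate, so reading them isn't shown.

```cpp
#include <cstring>

struct Record {
    int count;
    char name[16];
    bool flag;
};

// Empty braces: every field is zeroed.
Record make_empty() {
    Record r {};
    return r;
}

// Braces with arguments: fields are filled in declaration order.
Record make_steve() {
    Record r {5, "steve", true};
    return r;
}
```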
⚠️NOTE️️️⚠️
See here for more information. The = operator won't result in a copy or anything like that (meaning performance won't suffer).
If a class does explicitly declare constructors, the implicit zero-arg constructor won't be generated. If desired, a zero-arg constructor may be declared with the default behaviour of the implicit zero arg constructor by adding = default
instead of a method body.
class MyStruct {
private:
int count;
bool flag;
public:
MyStruct() = default;
MyStruct(int initialCount, bool initialFlag) {
this->count = initialCount;
this->flag = initialFlag;
}
};
A field may be initialized to a value either through default member initialization or the member initializer list. For default member initializations, the initialization is done directly in the field's declaration.
struct MyStruct {
int count {5};
char name[256] {"steve"};
bool flag {true};
};
In contrast, a member initializer list is a comma separated list of braced initializations for the fields of a class. It's specified just before a constructor's body.
struct MyStruct {
int count;
bool flag;
MyStruct(): count{0}, flag{false} {
}
};
Each item in the comma separated list is called a member initializer.
⚠️NOTE️️️⚠️
How is this better than default member initialization, where initialization is done directly after the field declaration? According to this, it's more-or-less the same?
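One situation where the two aren't interchangeable, sketched below with a hypothetical Span class: a const field can't be assigned inside the constructor body, and a default member initializer can't see the constructor's arguments, so a value that depends on those arguments has to come through the member initializer list.

```cpp
class Span {
public:
    // 'low' and 'high' are const: they can't be assigned in the constructor body,
    // so the member initializer list is where they receive their values.
    Span(int a, int b) : low{a < b ? a : b}, high{a < b ? b : a} {
    }
    int width() const {
        return (high - low) * scale;
    }
private:
    const int low;
    const int high;
    int scale {1}; // default member initializer: used because the list above omits 'scale'
};
```

The two mechanisms can be mixed, as here: fields named in the member initializer list take their values from it, and the rest fall back to their default member initializers.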
↩PREREQUISITES↩
C++ classes are allowed an explicit cleanup function called a destructor (e.g. closing an open file handle, zeroing out memory for security purposes, etc..). A destructor is declared similarly to a constructor, the only differences being ...
class MyStruct {
private:
int count {5};
bool flag {true};
public:
~MyStruct() {
// do some cleanup here
}
};
Destructors must never be called directly by the user. Treat any destructor as if it were marked with noexcept
. That is, an exception should never be thrown in a destructor. When an exception gets thrown, the call stack unwinds. As each function exits, the destructors for automatic variables of that function get invoked. Another exception getting thrown while one is already in flight means two exceptions would be in flight, which isn't supported.
If a destructor isn't declared, an empty one is implicitly generated.
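To see when destructors actually run, the sketch below appends to a log from each destructor. The names Tracer, destruction_log, and run_scope are hypothetical; the global log exists only to make the order observable.

```cpp
#include <string>

// Records the order in which destructors run (illustration only).
inline std::string destruction_log {};

struct Tracer {
    char label;
    ~Tracer() {
        destruction_log += label; // real cleanup work would go here
    }
};

void run_scope() {
    Tracer a {'a'};
    Tracer b {'b'};
    // On leaving this function, b is destroyed first and then a:
    // automatic variables are destroyed in reverse order of construction.
}
```

Note that nothing in run_scope calls a destructor explicitly; leaving the scope is what triggers them.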
⚠️NOTE️️️⚠️
When inheritance is involved, it's almost always a good idea to make the destructor a virtual function.
🔍SEE ALSO🔍
↩PREREQUISITES↩
There are two built-in mechanisms for copying in C++: the copy constructor and copy assignment.
A copy constructor is a constructor that has a single parameter, a reference to a const
object of the same type. By default, classes are implicitly provided with a default copy constructor if one hasn't been explicitly declared by the user. The copy semantics of this default copy constructor is to copy each field individually, called a member-wise copy.
Member-wise copying may not be the correct way to copy in certain cases, in which case a copy constructor should be explicitly provided with the correct copy semantics.
class MyStruct {
...
MyStruct(const MyStruct &orig) {
this->db = DatabaseConnection {orig.db.host, orig.db.port}; // make a new db connection instead of using orig's
this->max = orig.max;
}
};
MyStruct x {host, port};
MyStruct y {x}; // both x and y are independent and equal, but y has its own DatabaseConnection
Similarly, copy assignment is a method invoked when the assignment operator is used, called an operator overload. Unlike copy constructors, copy assignment is required to clean up any resources in the destination object prior to copying. By default, classes are implicitly provided with a copy assignment method if one hasn't been explicitly declared by the user. The copy semantics of this default method is to assign each field individually, called a member-wise copy.
class MyStruct {
...
MyStruct& operator=(const MyStruct &orig) {
if (this != &orig) { // skip if assigning to self
this->db.close(); // close existing db connection
this->db = DatabaseConnection {orig.db.host, orig.db.port}; // make a new db connection
this->max = orig.max;
}
return *this; // return self so that assignments can be chained (a = b = c)
}
};
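The snippets above depend on a DatabaseConnection type that isn't shown. As a self-contained sketch of the same copy semantics, the hypothetical Buffer class below owns a heap array, so a member-wise copy (which would only copy the pointer) would be wrong for it.

```cpp
#include <cstddef>

class Buffer {
public:
    Buffer(std::size_t n, int fill) : size{n}, data{new int[n]} {
        for (std::size_t i {0}; i < size; ++i) { data[i] = fill; }
    }
    // Copy constructor: allocate a fresh array instead of sharing orig's.
    Buffer(const Buffer &orig) : size{orig.size}, data{new int[orig.size]} {
        for (std::size_t i {0}; i < size; ++i) { data[i] = orig.data[i]; }
    }
    // Copy assignment: build the copy, release the old array, then take the copy.
    Buffer& operator=(const Buffer &orig) {
        if (this != &orig) { // skip if assigning to self
            int *fresh {new int[orig.size]};
            for (std::size_t i {0}; i < orig.size; ++i) { fresh[i] = orig.data[i]; }
            delete[] data;
            data = fresh;
            size = orig.size;
        }
        return *this;
    }
    ~Buffer() { delete[] data; }
    int& at(std::size_t i) { return data[i]; }
    std::size_t length() const { return size; }
private:
    std::size_t size;
    int *data;
};
```

After Buffer b {a};, writing to b.at(0) leaves a untouched because each object owns its own array. With the implicit member-wise copy, both objects would point at the same array and both destructors would try to free it.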
To prevent an object from being copied or assigned, add = delete
after both signatures instead of specifying a body. This is important if the object holds on to an uncopyable resource such as a lock.
class MyStruct {
...
MyStruct(const MyStruct &orig) = delete;
MyStruct& operator=(const MyStruct &orig) = delete;
};
⚠️NOTE️️️⚠️
If using the defaults, the book recommends explicitly declaring the methods but adding = default
after both signatures instead of specifying a body. The reason is that the default is almost always wrong, so if you tack this on it makes it explicit to others that you intended this.
class MyStruct {
...
MyStruct(const MyStruct &orig) = default;
MyStruct& operator=(const MyStruct &orig) = default;
};
ALSO, the rules for when the compiler generates default move/copy/destructor methods are intricate (e.g. declaring a destructor or either copy method stops the move methods from being implicitly generated). The book recommends that if you're using the defaults, always set them to = default
(or do = delete
to disallow them).
class MyStruct {
...
// copy
MyStruct(const MyStruct &orig) = default;
MyStruct& operator=(const MyStruct &orig) = default;
// move
MyStruct(MyStruct &&orig) = default;
MyStruct& operator=(MyStruct &&orig) = default;
// destructor
~MyStruct() = default;
};
↩PREREQUISITES↩
There are two built-in mechanisms for moving in C++: the move constructor and move assignment. Moving is different from copying in that moving actually guts the insides (data) of one object and transfers it into another, leaving that object in an invalid (moved-from) state: it may still be destroyed or assigned to, but its contents shouldn't be relied on. If the scenario allows for it, moving is oftentimes more efficient than copying.
A move constructor is a constructor that has a single parameter, an rvalue reference to an object of the same type. Classes are implicitly provided with a default move constructor only if the user hasn't explicitly declared a move constructor, move assignment, copy constructor, copy assignment, or destructor. The move semantics of this default move constructor is to move each field individually, called a member-wise move. For fields of fundamental type (e.g. int or raw pointers), moving is the same as copying: the source field isn't reset.
class MyStruct {
...
MyStruct(MyStruct &&orig) noexcept {
this->str_ptr = orig.str_ptr;
this->max = orig.max;
orig.str_ptr = nullptr; // mark orig object as invalid
orig.max = -1; // mark orig object as invalid
}
};
MyStruct a {};
MyStruct c { std::move(a) }; // std::move returns MyStruct && type, which calls MyStruct's move constructor
// a is in an invalid state
⚠️NOTE️️️⚠️
Don't std::move
into a variable and pass that variable to the constructor. The reason is that a named variable is always an lvalue, even if its type is an rvalue reference, meaning that the copy constructor will get invoked instead of the move constructor.
In the example above, the move constructor has noexcept
set to indicate that it will never throw an exception. Move constructors that can throw exceptions are problematic to use. If a move constructor throws an exception partway through, the source object will likely be left in an inconsistent state, meaning the program will likely be in an inconsistent state. As such, code that has to be able to recover from a failure (e.g. std::vector when it grows its storage) will prefer to copy instead if it sees that the move constructor can throw an exception.
Similarly to the move constructor, move assignment is a method invoked when the assignment operator is used, called an operator overload. It has the same parameter list and it shouldn't throw exceptions either (noexcept
), the only difference is that it returns a reference to itself at the end.
class MyStruct {
...
MyStruct& operator=(MyStruct &&orig) noexcept {
if (this != &orig) { // skip if assigning to self
this->str_ptr = orig.str_ptr;
this->max = orig.max;
orig.str_ptr = nullptr; // mark orig object as invalid
orig.max = -1; // mark orig object as invalid
}
return *this; // return self so that assignments can be chained (a = b = c)
}
};
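Putting the move constructor and move assignment together, the sketch below uses a hypothetical OwnedArray class that owns a heap array. Unlike the fragments above, it also releases the destination's existing array during move assignment, which matters whenever the class owns the resource it points to.

```cpp
#include <cstddef>
#include <utility>

class OwnedArray {
public:
    OwnedArray(std::size_t n) : size{n}, data{new int[n]{}} {}
    // Move constructor: steal orig's array, then leave orig empty.
    OwnedArray(OwnedArray &&orig) noexcept : size{orig.size}, data{orig.data} {
        orig.size = 0;
        orig.data = nullptr;
    }
    // Move assignment: free our own array first, then steal orig's.
    OwnedArray& operator=(OwnedArray &&orig) noexcept {
        if (this != &orig) { // skip if assigning to self
            delete[] data;
            size = orig.size;
            data = orig.data;
            orig.size = 0;
            orig.data = nullptr;
        }
        return *this;
    }
    ~OwnedArray() { delete[] data; } // deleting nullptr is harmless
    std::size_t length() const { return size; }
    const int* raw() const { return data; }
private:
    std::size_t size;
    int *data;
};
```

No elements are copied by either operation: after OwnedArray b {std::move(a)};, b holds the very same array that a used to hold, and a is left empty. Because the move methods are declared, the compiler doesn't generate copy methods for this class, so it's move-only.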
⚠️NOTE️️️⚠️
The rules for when the compiler generates default move/copy/destructor methods are intricate (e.g. declaring a destructor or either copy method stops the move methods from being implicitly generated). The book recommends that if you're using the defaults, always set them to = default
(or do = delete
to disallow them).
class MyStruct {
...
// copy
MyStruct(const MyStruct &orig) = default;
MyStruct& operator=(const MyStruct &orig) = default;
// move
MyStruct(MyStruct &&orig) = default;
MyStruct& operator=(MyStruct &&orig) = default;
// destructor
~MyStruct() = default;
};
↩PREREQUISITES↩
The compiler may automatically generate default implementations for some member functions (e.g. default constructor), called special member functions. However, under certain conditions, it may choose to omit generating them. If the compiler chooses to not generate a default implementation where one was expected, it's possible to force the compiler to generate that function by explicitly declaring it but replacing the function body with = default
.
struct MyClass {
MyClass() = default; // forcefully generate default constructor
};
⚠️NOTE️️️⚠️
Reasons why a compiler may decide to skip generating a function: it doesn't think it's needed, it doesn't think the behaviour will be correct, ...?
↩PREREQUISITES↩
The compiler may automatically generate default implementations for some member functions (e.g. default constructor), called special member functions. There are two ways to turn off these automatically generated member functions. The first way is to declare the function but make it privately scoped so that nothing outside can access it.
struct MyClass {
...
private:
MyClass() { } // default constructor is private
};
The second way is to explicitly declare the function but mark it as deleted by appending = delete
in place of the function body.
struct MyClass {
MyClass() = delete; // default constructor is forcefully deleted
};
⚠️NOTE️️️⚠️
The 2nd way is the more "modern" way to do it.
↩PREREQUISITES↩
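In C++, a class inherits another class by, just after its name, appending a colon (:) followed by the name of the parent class. When the child is declared with the class keyword, the inheritance is private by default (outside code can't treat the child as its parent), so the public keyword is typically placed before the parent's name. When the child is declared with the struct keyword, the inheritance is public by default.
class MyChild : public MyParent {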
};
Like in most other object oriented languages, a child class...
MyChild c {};
MyParent p {c}; // MyChild inherits from MyParent, meaning that it's assignable to MyParent (only the MyParent portion of c gets copied)
To be able to override a method in a child class the same way as it's done in other languages (e.g. Java), the base class must have the virtual
keyword prepended on the method, making it a virtual method. Similarly, any method that overrides a virtual method should have the override
keyword appended just after the parameter list.
⚠️NOTE️️️⚠️
override
isn't strictly required, but it's a hint that the compiler can use to prevent you from making a mistake (e.g. it sees override
but what's being overridden isn't virtual
). It's similar to Java's @Override
annotation.
struct MyParent {
virtual int virt_method() { ... }
int non_virt_method() { ... }
};
struct MyChild : MyParent {
virtual int virt_method() override { ... }
};
If the base class and child class have the exact same non-virtual method, which method gets called depends on the type of the variable.
struct MyParent {
virtual int virt_method() { ... }
int non_virt_method(int a) { ... }
};
struct MyChild : MyParent {
virtual int virt_method() override { ... }
int non_virt_method(int a) { ... }
};
MyChild c {};
MyChild &cref {c};
MyParent &pref {c};
cref.non_virt_method(0); // calls MyChild::non_virt_method()
pref.non_virt_method(0); // calls MyParent::non_virt_method() even though object is a MyChild
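The same contrast as a compilable sketch, with hypothetical Animal and Dog classes standing in for MyParent and MyChild. Both helper functions only see a reference to the base class, which is the situation where the virtual / non-virtual difference shows up.

```cpp
#include <string>

struct Animal {
    virtual std::string speak() const { return "..."; }
    std::string kind() const { return "animal"; }
    virtual ~Animal() = default;
};

struct Dog : Animal {
    std::string speak() const override { return "woof"; } // overrides the virtual method
    std::string kind() const { return "dog"; }            // same signature, but NOT virtual
};

// Virtual method: resolved by the type of the object at runtime.
std::string speak_through_base(const Animal &a) {
    return a.speak();
}

// Non-virtual method: resolved by the type of the variable at compile time.
std::string kind_through_base(const Animal &a) {
    return a.kind();
}
```

Passing a Dog to speak_through_base produces "woof", while passing the same Dog to kind_through_base produces "animal".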
To prevent a method from being overridable at all, add the final
keyword just after the parameter list.
struct MyParent {
virtual int methodA() final { ... }
};
struct MyChild : MyParent {
virtual int methodA() { ... } // ERROR HERE -- not allowed
};
Similarly, to prevent the entire class itself from being inheritable, add the final
keyword just after the name.
struct MyParent final {
virtual int methodA() { ... }
};
C++ chains constructor and destructor invocations as expected. The one caveat is that the destructor, if not a virtual method, will use the method resolution mechanism described above: if the type of the variable doesn't match the object (variable type is the base class but object is a child class), the wrong destructor gets invoked, resulting in the object potentially not cleaning up resources (e.g. closing file handles).
struct MyParent {
virtual int v1() { ... };
~MyParent() { ... };
};
struct MyChild : MyParent {
virtual int v1() { ... };
~MyChild() { ... };
};
MyParent *c {new MyChild{}};
delete c; // calls MyParent's destructor instead of MyChild's destructor
When inheritance is involved, it's almost always a good idea to enforce a virtual destructor. Since not having a virtual destructor sometimes makes sense (e.g. user determined that it's safe to omit it and as such omitted it to improve performance), the compiler won't produce a warning if it isn't virtual.
struct MyParent {
virtual int v1() { ... };
virtual ~MyParent() { ... };
};
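A sketch of the chaining that a virtual destructor gives you when deleting through a pointer to the base class. The names Base, Derived, cleanup_log, and destroy_through_base are hypothetical; the log exists only to make the order observable.

```cpp
#include <string>

// Records which destructors ran, and in what order (illustration only).
inline std::string cleanup_log {};

struct Base {
    virtual ~Base() { cleanup_log += "Base;"; }
};

struct Derived : Base {
    ~Derived() override { cleanup_log += "Derived;"; }
};

void destroy_through_base() {
    Base *p {new Derived{}};
    delete p; // virtual destructor: Derived's destructor runs first, then Base's
}
```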
↩PREREQUISITES↩
Interfaces and abstract classes are supported in C++, but not in the same way as other high-level languages. The C++ approach to interfaces is to explicitly mark certain methods as requiring an implementation. This is done by appending = 0
to the declaration of a virtual method.
struct MyParent {
virtual int virt_method() = 0;
int non_virt_method() { ... } // can't be marked with = 0 -- only virtual methods can
};
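A method that is both a virtual method and requires an implementation is called a pure virtual method. A class that contains only pure virtual methods is called a pure virtual class. Any class with at least one pure virtual method is abstract: it can't be instantiated directly, only through a child class that implements those methods.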
struct MyParent {
virtual int v1() = 0;
virtual int v2() = 0;
virtual ~MyParent() {}; // also okay to do "virtual ~MyParent() = default"
};
As shown in the example above, a pure virtual class should have a virtual destructor. While not required, failing to do so means that the wrong destructor may get invoked if the type of the variable doesn't match the object (variable type is the base class but object is not), resulting in class resources being left open (e.g. file handles).
⚠️NOTE️️️⚠️
See inheritance section for a more thorough explanation.
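As a sketch of how such an interface gets used, the hypothetical Shape interface below has two implementations, and a function written purely against the interface.

```cpp
// Interface: only pure virtual methods, plus a virtual destructor.
struct Shape {
    virtual double area() const = 0;
    virtual ~Shape() = default;
};

class Square : public Shape {
public:
    Square(double s) : side{s} {}
    double area() const override { return side * side; }
private:
    double side;
};

class Rect : public Shape {
public:
    Rect(double w, double h) : width{w}, height{h} {}
    double area() const override { return width * height; }
private:
    double width;
    double height;
};

// Works with any implementation of the interface.
double total_area(const Shape &a, const Shape &b) {
    return a.area() + b.area();
}
```

Trying to create a Shape directly (e.g. Shape s {};) is rejected by the compiler because area() has no implementation.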
↩PREREQUISITES↩
C++ classes support operator overloading.
Operators are overload-able in two ways. To overload an operator the first way, introduce a method but instead of naming it, add the operator
keyword followed by the operator being overloaded. The parameters and return type of the method need to match whatever types the operator is intended to deal with.
struct MyClass {
...
// MyClass + int -- notice whitespace between 'operator' keyword and operator -- this is okay.
MyClass operator +(int rhs) const {
MyClass ret { this->value + rhs };
return ret;
}
// MyClass + MyClass
MyClass operator+(const MyClass &rhs) const {
MyClass ret { this->value + rhs.value };
return ret;
}
// MyClass += MyClass
MyClass& operator+=(const MyClass &rhs) {
this->value += rhs.value;
return *this;
}
};
To overload an operator the second way, introduce a function (not a method) using the operator
keyword followed by the operator being overloaded. In the examples above, the left-hand side was the this
pointer. When using this second way, a left-hand side needs to be explicitly provided as the first parameter while the right-hand side is the second argument.
// MyClass + int
MyClass operator+(const MyClass &lhs, int rhs) {
MyClass ret { lhs.value + rhs };
return ret;
}
// MyClass + MyClass
MyClass operator+(const MyClass &lhs, const MyClass &rhs) {
MyClass ret { lhs.value + rhs.value };
return ret;
}
// MyClass += MyClass
MyClass & operator+=(MyClass &lhs, const MyClass &rhs) {
lhs.value += rhs.value;
return lhs;
}
⚠️NOTE️️️⚠️
Evidently the two ways described above aren't equivalent. The second way has some added benefits. See here.
Note how the const
keyword is added to the method in cases where the operator shouldn't modify itself. Similarly, when the argument for a parameter shouldn't be changed, const
is used on that parameter. const
-ness depends on the scenario. For example, the second operator+
requires two references to const
types.
MyClass operator+(const MyClass &lhs, const MyClass &rhs) {
MyClass ret { lhs.value + rhs.value };
return ret;
}
Those const
s ensure that the operands aren't changed in the method. Imagine that you're performing x = y + z
. It doesn't make sense for y
or z
to get modified.
The signature could have just as well been modified to be the types themselves rather than const
references, in which case both the left-hand side and right-hand side would get copied on invocation of the method (modifications to copies don't matter).
MyClass operator+(MyClass lhs, MyClass rhs) {
MyClass ret { lhs.value + rhs.value };
return ret;
}
Certain operators can have default
implementations in which the compiler generates the body of the function. In simple cases, the default
implementation is good enough.
struct MyClass {
...
bool operator==(const MyClass&) const = default;
};
⚠️NOTE️️️⚠️
See here for a list of operators and their signatures (still incomplete).
There's also the option to create operators that allow for implicit type casting and explicit type casting. See the type casting section for more information.
↩PREREQUISITES↩
⚠️NOTE️️️⚠️
The book claims that, when overloading / using the three-way comparison operator, you need to include the compare header from the C++ standard library: #include <compare>
. I didn't find this to be the case when I used G++12.
The three-way comparison operator (<=>), also called the spaceship operator, is a streamlined way of providing ordering comparison operators for a class. Typically, if a class is orderable / sortable, it should provide operator overloads for the following operators: less than (<), less than or equal (<=), greater than (>), and greater than or equal (>=).
The three-way comparison operator provides all 4 of the above operators through a single operator overload, and may additionally provide equality (==) and inequality (!=) operators in certain conditions.
⚠️NOTE️️️⚠️
Those conditions (and the reasons for them) are explained further on in this section.
Rather than just returning a bool
, the three-way comparison operator's return type breaks down comparisons into 3 categories based on the idea of equality and equivalence:
Strong ordering (std::strong_ordering
) - A == B
means that the A
and B
are indistinguishable. One may be substituted for the other without any side-effects.
Strong ordering is sometimes also referred to as total ordering.
std::strong_ordering::less // a < b
std::strong_ordering::equal // a == b
std::strong_ordering::equivalent // same as equal
std::strong_ordering::greater // a > b
struct Time {
int hour;
int minute;
std::strong_ordering operator<=>(const Time& rhs) const {
if (hour < rhs.hour) {
return std::strong_ordering::less;
} else if (hour > rhs.hour) {
return std::strong_ordering::greater;
} else {
if (minute < rhs.minute) {
return std::strong_ordering::less;
} else if (minute > rhs.minute) {
return std::strong_ordering::greater;
} else {
return std::strong_ordering::equal;
}
}
}
bool operator==(const Time& other) const = default;
};
Weak ordering (std::weak_ordering
) - A == B
means that the A
and B
aren't guaranteed to be indistinguishable. They are equivalent but not equal (one can't be substituted for the other). For example, the strings "hello world"
and "HELLO WORLD"
are equivalent if you ignore case, but one can't necessarily be substituted for the other.
std::weak_ordering::less // a < b
std::weak_ordering::equivalent // a is equivalent to b (this is NOT equality, it's equivalence)
std::weak_ordering::greater // a > b
struct Rectangle {
int length;
int width;
std::weak_ordering operator<=>(const Rectangle& rhs) const {
int lhs_area { length * width };
int rhs_area { rhs.length * rhs.width };
if (lhs_area < rhs_area) {
return std::weak_ordering::less;
} else if (lhs_area > rhs_area) {
return std::weak_ordering::greater;
} else {
return std::weak_ordering::equivalent;
}
}
bool operator==(const Rectangle& other) const = default;
};
Partial ordering (std::partial_ordering
) -- same as std::weak_ordering
, but with the caveat that objects may not be comparable at all. For example, the floating point number 5.5
is not comparable at all to NaN
: 5.5 == NaN
, 5.5 < NaN
, and 5.5 > NaN
are all false.
std::partial_ordering::less // a < b
std::partial_ordering::equivalent // a is equivalent to b (this is NOT equality, it's equivalence)
std::partial_ordering::greater // a > b
std::partial_ordering::unordered // a was not comparable to b
struct FloatRectangle {
float length;
float width;
std::partial_ordering operator<=>(const FloatRectangle& rhs) const {
float lhs_area { length * width };
float rhs_area { rhs.length * rhs.width };
if (lhs_area < rhs_area) {
return std::partial_ordering::less;
} else if (lhs_area > rhs_area) {
return std::partial_ordering::greater;
} else if (lhs_area == rhs_area) { // if both sides are nan, this still evaluates to false
return std::partial_ordering::equivalent;
} else {
return std::partial_ordering::unordered;
}
}
bool operator==(const FloatRectangle& other) const = default;
};
All of the examples above provide a default implementation for the equality operator (==). That's because, when the three-way comparison operator overload has a non-default implementation, it doesn't provide support for the equality and inequality operators. The user has to provide those operator overloads.
However, when the three-way overload has a default implementation, the equality and inequality operators are automatically provided. A default three-way overload implementation first compares parent classes (left to right, in order they were declared -- C++ has multiple inheritance), then compares each member variable (in order they were declared).
struct Time {
int hour;
int minute;
std::strong_ordering operator<=>(const Time& rhs) const = default;
};
⚠️NOTE️️️⚠️
See here for reasoning on why equality and inequality are missing for a non-default implementation.
⚠️NOTE️️️⚠️
If you know what constexpr
and noexcept
are, the book is saying that compiler generated three-way comparison operators implicitly have this set.
⚠️NOTE️️️⚠️
If you know what the auto
keyword does:
For default implementations, sometimes people set the return type as auto
. This is because the default implementation's ordering type is only going to be as "weak" as its weakest parent class / member variable, and you don't know the weakness of everything beforehand.
So for example, if your class has a bunch of member variables with std::strong_ordering
and a single float member variable (std::partial_ordering
), the return type of the spaceship operator will be std::partial_ordering
.
⚠️NOTE️️️⚠️
You can use the three-way comparison operator directly in code if you want ...
if ((a <=> b) == std::strong_ordering::equal) {
// do something
} else {
// do something else
}
↩PREREQUISITES↩
C++ classes support both implicit type conversions and explicit type conversions via operator overloading. Implicit type conversions are represented as operator overload methods where the name of the operator being overloaded is the destination type and the return type is omitted.
struct MyClass {
...
operator int() const {
return this->value / 42;
}
};
...
MyClass cls {};
int x {cls}; // triggers operator overload method
Explicit type conversions are enabled the same way as implicit type conversions, except the overload method is preceded by the explicit
keyword. The explicit
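keyword makes it so that the conversion never happens silently (e.g. when passing the object to a function that expects an int): it has to be requested, typically via a static_cast.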
struct MyClass {
...
explicit operator int() const {
return this->value / 42;
}
};
...
MyClass cls {};
int x {static_cast<int>(cls)}; // static_cast required to trigger operator overload method
⚠️NOTE️️️⚠️
The book recommends preferring explicit over implicit because implicit is a source for confusion.
Do these still qualify as operator overloads? Return types should be there.
↩PREREQUISITES↩
In addition to following the same function overloading rules as free functions, a member function may be overloaded based on whether the this
pointer is to a volatile
and / or const
object.
class MyClass {
public:
int get_data() {
std::cout << "non-const non-volatile\n";
counter += 1;
return counter;
}
int get_data() const {
std::cout << "const non-volatile\n";
return counter;
}
int get_data() volatile {
std::cout << "non-const volatile\n";
counter += 1;
return counter;
}
int get_data() volatile const {
std::cout << "const volatile\n";
return counter;
}
private:
int counter;
};
MyClass c1{};
c1.get_data(); // prints "non-const non-volatile"
const MyClass c2{};
c2.get_data(); // prints "const non-volatile"
volatile MyClass c3{};
c3.get_data(); // prints "non-const volatile"
const volatile MyClass c4{};
c4.get_data(); // prints "const volatile"
↩PREREQUISITES↩
In addition to following the same function overloading rules as free functions, a member function may be overloaded based on whether the this
reference is an l-value or r-value. To target ...
The benefit of reference overloading is being able to define a version of the function with efficient move semantics when the object is transient.
// THIS EXAMPLE WAS LIFTED FROM https://docs.microsoft.com/en-us/cpp/cpp/function-overloading?view=msvc-170#ref-qualifiers
class MyClass {
public:
MyClass() {/*expensive initialization*/}
std::vector<int> get_data() & {
std::cout << "lvalue\n";
return _data;
}
std::vector<int> get_data() && {
std::cout << "rvalue\n";
return std::move(_data);
}
private:
std::vector<int> _data;
};
MyClass c {};
auto v {c.get_data()}; // get a copy. prints "lvalue".
auto v2 {MyClass{}.get_data()}; // get the original. prints "rvalue"
↩PREREQUISITES↩
A functor, also called a function object, is a class that you can invoke as if it were a function because it has an operator overload for function-call.
struct MyFunctor {
int operator()(int y) const { return -y + x; }
private:
int x {5};
};
MyFunctor inst{};
inst(15); // computes -15 + 5
Functors are useful because they allow for state (via fields) and parameterization (via constructor arguments) but still retain a function-like syntax.
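As a minimal sketch of the state-plus-parameterization point above, the hypothetical functor `GreaterThan` (a name invented for this illustration) takes its threshold through the constructor and is then handed to `std::count_if`, which invokes it like a function. The helper `count_above` and the `functor_demo` function are also made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical functor: the threshold is state, supplied through the constructor.
struct GreaterThan {
    explicit GreaterThan(int threshold) : threshold_{threshold} {}
    bool operator()(int value) const { return value > threshold_; }
private:
    int threshold_;
};

// Hands a configured functor to a standard algorithm, which invokes it like a function.
inline std::ptrdiff_t count_above(const std::vector<int>& values, int limit) {
    return std::count_if(values.begin(), values.end(), GreaterThan{limit});
}

inline void functor_demo() {
    const std::vector<int> values {1, 5, 10, 15};
    assert(count_above(values, 5) == 2);  // 10 and 15 are above 5
    assert(count_above(values, 0) == 4);  // every value is above 0
}
```

The same `GreaterThan` type can be reused with different thresholds, which is something a plain function could only do with an extra parameter or a global.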
⚠️NOTE️️️⚠️
Unlike normal functions, functors cannot be assigned to function pointers. See section on function pointers.
↩PREREQUISITES↩
A friend is a function or class that can access the non-public members of some other class that it wasn't declared in.
For friend functions, the class to be accessed needs to declare the function's prototype (function declaration) before implementations of a friend function (function definition) can exist. The prototype is included in the class just like any other member function, but the friend
prefix modifier is tacked on.
class MyClass {
public:
friend int addAndNegate(MyClass& obj, int n); // prototype
private:
int x {0};
};
int addAndNegate(MyClass& obj, int n) { // implementation -- friend of MyClass
return -(n + obj.x);
}
// test
MyClass obj{};
int t = addAndNegate(obj,5);
For friend classes, the class to be accessed needs to specify which outside class is able to access it using friend class
.
class MyClass {
public:
friend class MyFriend; // state that MyFriend can access MyClass's non-public members
private:
int x {0};
};
class MyFriend {
public:
int addAndNegate(MyClass& obj, int n) { // function in MyFriend accessing non-public members of MyClass
return -(n + obj.x);
}
};
// test
MyFriend obj_friend{};
MyClass obj{};
int t = obj_friend.addAndNegate(obj,5);
⚠️NOTE️️️⚠️
The class
in friend class
may be omitted if MyFriend
was already declared before MyClass
. Adding the word class
is a forward declaration -- it tells the compiler to just believe that it exists even though it may not have come across it yet.
Friend functions and friend classes may also target templated types.
class MyClass {
public:
template<typename T> friend int addAndNegate(MyClass& obj, T n); // every addAndNegate(MyClass&, T) will be a friend
private:
int x {0};
};
⚠️NOTE️️️⚠️
This is how C++ provides its version of Java's Object.toString()
. For each class that you want to be able to print as a string, you implement a friend function that overloads the left-shift operator (<<) for std::ostream
, making it usable in something like std::cout
.
// MyClass must also declare this function as a friend so that it can read obj.x
std::ostream& operator<<(std::ostream &os, const MyClass &obj) {
os << obj.x << "\n";
return os;
}
It seems like a convoluted way to do it.
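A self-contained sketch of the stream-output pattern from the note above may help. The class `Point` and the helper `point_to_string` are hypothetical names chosen for this illustration; the friend declaration inside the class is what lets the free `operator<<` read the private fields.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

class Point {  // hypothetical class
public:
    Point(int x, int y) : x_{x}, y_{y} {}
    // Grants the free function below access to x_ and y_.
    friend std::ostream& operator<<(std::ostream& os, const Point& p);
private:
    int x_;
    int y_;
};

std::ostream& operator<<(std::ostream& os, const Point& p) {
    os << "(" << p.x_ << ", " << p.y_ << ")";
    return os;  // returning the stream allows chaining: os << a << b
}

// Any std::ostream works, so a string stream gives a toString()-like helper.
inline std::string point_to_string(const Point& p) {
    std::ostringstream out;
    out << p;
    return out.str();
}

inline void print_demo() {
    assert(point_to_string(Point{1, 2}) == "(1, 2)");
}
```

Because the operator targets `std::ostream`, the same overload also works with `std::cout` and file streams.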
↩PREREQUISITES↩
C++ provides a way for users to define their own literals through the use of operator overloading, called user-defined literals. User-defined literals wrap built-in literals and perform some operation to convert them to either another type or another value. It's identified by a unique suffix that starts with an underscore (e.g. _km
).
The operator overload is identified by two quotes followed by the suffix.
Distance operator"" _km (long double n) {
return Distance {n * 1000.0};
}
Distance operator"" _mi (long double n) {
return Distance {n * 1609.34};
}
Distance d { 1.2_km + 4.0_mi };
As stated above, user-defined literals must wrap an existing built-in literal type.
type | definition |
---|---|
integral | return_type operator"" identifier (unsigned long long int) |
floating point | return_type operator"" identifier (long double) |
character | return_type operator"" identifier (char) |
wide character | return_type operator"" identifier (wchar_t) |
utf-8 character | return_type operator"" identifier (char8_t) |
utf-16 character | return_type operator"" identifier (char16_t) |
utf-32 character | return_type operator"" identifier (char32_t) |
character string | return_type operator"" identifier (const char *, size_t) |
wide character string | return_type operator"" identifier (const wchar_t *, size_t) |
utf-8 string | return_type operator"" identifier (const char8_t *, size_t) |
utf-16 string | return_type operator"" identifier (const char16_t *, size_t) |
utf-32 string | return_type operator"" identifier (const char32_t *, size_t) |
raw | return_type operator"" identifier (const char *) |
Note that, for ...
The last definition in the table above, raw, will get a character string of any numeric literal used.
const char * operator"" _as_str (const char * n) {
std::cout << "input str: " << n;
return n;
}
123.5e+12_as_str; // outputs "input str: 123.5e+12"
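The user-defined literal examples in this section leave out the type they produce, so here is a compilable sketch under stated assumptions: `Distance` is a hypothetical struct that stores meters, and the suffixes `_km` and `_m` are invented for the illustration. The operators are written without a space between `""` and the suffix because the spaced form is deprecated as of C++23.

```cpp
#include <cassert>

struct Distance {  // hypothetical type: a length stored in meters
    long double meters;
};

constexpr Distance operator+(Distance a, Distance b) {
    return Distance{a.meters + b.meters};
}

// Floating point literals (e.g. 1.5_km) must be taken as long double.
constexpr Distance operator""_km(long double n) {
    return Distance{n * 1000.0L};
}

// Integral literals (e.g. 250_m) must be taken as unsigned long long.
constexpr Distance operator""_m(unsigned long long n) {
    return Distance{static_cast<long double>(n)};
}

static_assert((1.5_km + 250_m).meters == 1750.0L);

inline void literal_demo() {
    const Distance d { 2.0_km + 500_m };
    assert(d.meters == 2500.0L);
}
```

Note that `2_km` would not compile here, since only a floating point overload exists for `_km`; each literal kind needs its own overload.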
The C++ standard library makes use of user-defined literals in various places, but its identifiers don't require an underscore (_) prefix.
std::chrono::duration d { 2h + 15ms }
std::complex<double> { (1.0 + 2.0i) * (3.0 + 4.0i) }
std::string str { "hello"s + "world"s }
🔍SEE ALSO🔍
↩PREREQUISITES↩
Lambdas are unnamed functors (not functions) that are expressed in a succinct form. Lambdas in C++ work similarly to lambdas in other high-level languages. They capture copies of / references to objects from the outer scope such that they can be used for whatever processing the functor performs.
For example, consider the following functor.
// define
struct MyFunctor {
constexpr MyFunctor(int x) : x{x} {}
constexpr int operator()(int a) const {
return a + x;
}
private:
int x;
};
// instantiate
MyFunctor f1{ 5 };
// invoke
f1(42);
The functor above can be written much more succinctly as a lambda.
// define and instantiate
auto f2 { [x=5] (int a) -> int { return a + x; } };
// invoke
f2(42);
The general syntax of a lambda is as follows: [capture-list] (parameter-list) modifiers -> return-type { body }
. The subsections below detail this general syntax.
⚠️NOTE️️️⚠️
Be aware that, by default, the function-call operator in the lambda version is const
and will automatically be made into a constexpr
if it satisfies all the requirements of constexpr
. This is discussed more in the modifiers subsections.
🔍SEE ALSO🔍
↩PREREQUISITES↩
[capture-list]
is a required part of [capture-list] (parameter-list) modifiers -> return-type { body }
that defines and sets member variables inside the functor. It's a comma separated list where each element in the list is a variable to capture as a member variable.
There are 3 different ways to capture member variables.
Copy a variable from the outer scope.
To create a copy of an individual variable into the functor, put the variable's name in the capture list.
int x {5};
int y {6};
// explicitly copy x and y from outer scope
auto f { [x, y] (int z) -> int { return x + y + z; } };
One way to avoid listing out individual variable names is to put =
as the first element of the capture list. When =
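is present, any outer-scope variable that the body uses but the capture list doesn't name is automatically copied in as a member variable. Explicit copy captures can't be combined with it ([=, x] doesn't compile), though explicit reference captures such as [=, &x] can.
int x {5};
int y {6};
// implicitly copy both x and y from outer scope
auto f { [=] (int z) -> int { return x + y + z; } };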
If used within an enclosing class, the this
pointer can be captured.
auto f { [this] (int z) -> int { return z + this->x; } }; // capture this as a pointer
auto f { [*this] (int z) -> int { return z + this->x; } }; // capture a COPY OF *this and pass it in as a pointer
⚠️NOTE️️️⚠️
It's mentioned that prior to C++20, automatic copy capturing ([=]
) would pull in this
. That feature has been deprecated.
Reference a variable from the outer scope.
To create a reference to an individual variable in the functor, put the variable's name in the capture list preceded by an ampersand (&).
int x {5};
int y {6};
// explicitly reference x and y from outer scope
auto f { [&x, &y] (int z) -> int { return x + y + z; } };
One way to avoid listing out individual variable names is to put &
as the first element of the capture list. When &
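is present, any outer-scope variable that the body uses but the capture list doesn't name is automatically captured by reference. Explicit reference captures can't be combined with it ([&, &x] doesn't compile), though explicit copy captures such as [&, x] can.
int x {5};
int y {6};
// implicitly reference both x and y from outer scope
auto f { [&] (int z) -> int { return x + y + z; } };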
Initialize a variable using an expression.
When a variable name is followed by =
and an expression, the expression is evaluated and captured.
int x {5};
int y {6};
auto f { [mod_x=x/2, mod_y=y/2] (int z) -> int { return mod_x + mod_y + z; } };
This is especially useful for capturing an object by moving it (as opposed to copying it or referencing it).
auto f { [o=std::move(my_obj)] (int z) -> int { return o.do_something(z); } };
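To make the difference between the three capture styles concrete, the following sketch changes the outer variable after the lambdas are created and then moves a `std::unique_ptr` into a third lambda. The function name `capture_demo` and the variable names are made up for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

inline int capture_demo() {
    int counter {10};
    auto by_copy { [counter] () -> int { return counter; } };   // snapshot of counter (10)
    auto by_ref  { [&counter] () -> int { return counter; } };  // alias of counter
    counter = 20;  // visible through by_ref, invisible to by_copy

    auto ptr { std::make_unique<int>(5) };
    // Init capture: ownership of the int moves into the lambda's member p.
    auto by_move { [p = std::move(ptr)] () -> int { return *p; } };
    assert(ptr == nullptr);  // a moved-from unique_ptr is guaranteed to be empty

    return by_copy() + by_ref() + by_move();  // 10 + 20 + 5
}
```

A `std::unique_ptr` can't be copied, so the init capture with `std::move` is the only one of the three styles that lets the lambda own it.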
(parameter-list)
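is a part of [capture-list] (parameter-list) modifiers -> return-type { body }
that defines the parameter list of the functor's function-call operator. It can be left out when the lambda takes no parameters and specifies no modifiers or return type (e.g. [] { return 5; }).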
auto f1 { [] (int x, int y) -> int { return x + y; } };
auto f2 { [] (int x, int y = 99) -> int { return x + y; } }; // default args
auto f3 { [] (auto x, auto y) -> int { return static_cast<int>(x + y); } }; // templated params (compiler deduces types based on usage)
Lambda parameter lists are defined similarly to standard function parameter lists. It's common for a lambda's parameter list to use template parameters via auto
as is done in f3
of the example above. The reason for using auto
is that the lambda can still work even if you don't know / can't predict the exact types of the arguments passed in (e.g. you know the arguments will be integral types, but you don't know exactly which exact integral types).
⚠️NOTE️️️⚠️
auto
is a placeholder for a template parameter, and as such type deduction rules come into play. If you aren't careful, you'll end up with strange or incorrect behaviour. For example, in certain cases the compiler may decide to create a local copy for an argument that gets passed in where you may be expecting a reference.
🔍SEE ALSO🔍
↩PREREQUISITES↩
return-type
is an optional part of [capture-list] (parameter-list) modifiers -> return-type { body }
that defines the return type of the functor's function-call operator. The syntax of using an arrow and defining the type after the parameter list is called the trailing return syntax.
auto f1 { [] (int a, int b) -> int { return a+b; } };
auto f2 { [] (MyObject* v) -> const MyObject& { return v[5]; } };
🔍SEE ALSO🔍
When a lambda doesn't provide a return type, the return type is implicitly auto
. The compiler uses standard template parameter type deduction rules to determine what the return type should be.
auto f3 { [] (int a, int b) { return a+b; } }; // return deduced to int
auto f4 { [] (MyObject* v) { return v[5]; } }; // return deduced to MyObject
In f4
, even though v[5]
is an lvalue that refers to an existing MyObject
, type deduction rules evaluate the return type to MyObject
(not a reference). That means f4
returns a copy of the object at v[5]
rather than a reference to the real thing. Type deduction rules such as the one here may end up causing subtle bugs if you aren't careful.
Another option is to explicitly return decltype(auto)
, which preserves the exact type of the returned expression, including whether it's a reference.
auto f5 { [] (MyObject* v) -> decltype(auto) { return v[5]; } }; // return deduced to MyObject&
⚠️NOTE️️️⚠️
When unsure, it's best to explicitly declare the return type or use decltype(auto)
.
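The deduced return types can be checked at compile time. In this sketch, `MyObject` is a stand-in struct invented for the illustration, and `return_type_demo` is a hypothetical function that compares a lambda with an omitted return type against one that uses `decltype(auto)`.

```cpp
#include <cassert>
#include <type_traits>

struct MyObject { int value; };  // stand-in type for this sketch

inline int return_type_demo() {
    auto by_auto     { [] (MyObject* v)                   { return v[0]; } };
    auto by_decltype { [] (MyObject* v) -> decltype(auto) { return v[0]; } };

    // Omitted return type: deduced like a by-value template parameter, so a copy.
    static_assert(std::is_same_v<decltype(by_auto(nullptr)), MyObject>);
    // decltype(auto): v[0] is an lvalue expression, so the return type is a reference.
    static_assert(std::is_same_v<decltype(by_decltype(nullptr)), MyObject&>);

    MyObject objs[1] { {7} };
    MyObject copy { by_auto(objs) };
    copy.value = 99;               // changes the copy only
    by_decltype(objs).value += 1;  // changes the original through the reference
    assert(copy.value == 99);
    return objs[0].value;          // 8
}
```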
↩PREREQUISITES↩
modifiers
is an optional part of [capture-list] (parameter-list) modifiers -> return-type { body }
that lists the modifiers of the functor's function-call operator. Except for the following special cases, modifiers work the same way that they do for normal functions.
The function-call operator is set to be a constexpr
function if it meets the requirements of being a constant expression. You can force it to be a constant expression by adding constexpr
as one of the modifiers.
auto f1 { [] (int x, int y) constexpr -> int { return x + y; } }; // force as constant expression
The function-call operator is set to be a const
function. You can force this off by adding mutable
as one of the modifiers.
auto f2 { [] (int x, int y) mutable -> int { return x + y; } }; // make non-const
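The `mutable` example above has no captures, so the modifier has nothing to act on. As a hedged sketch, the hypothetical `mutable_demo` function below captures a counter by copy and shows that `mutable` lets the body modify the lambda's own copy while the outer variable stays untouched. The `add` lambda illustrates the `constexpr` modifier by being usable in a `static_assert`.

```cpp
#include <cassert>

inline int mutable_demo() {
    int start {0};
    // Without 'mutable', ++start would not compile because operator() is const.
    auto next { [start] () mutable -> int { return ++start; } };
    next();                      // internal copy becomes 1
    next();                      // internal copy becomes 2
    const int third { next() };  // internal copy becomes 3
    assert(start == 0);          // the outer variable was never modified
    return third;
}

// A constexpr lambda can be evaluated at compile time.
constexpr auto add { [] (int x, int y) constexpr -> int { return x + y; } };
static_assert(add(2, 3) == 5);
```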
↩PREREQUISITES↩
A lambda may have template parameters added by injecting template parameters in angle brackets between [capture-list]
and (parameter-list)
of the lambda declaration [capture-list] (parameter-list) modifiers -> return-type { body }
.
auto f1 { [] <typename T>(T x, T y) -> T { return x + y; } };
This is useful in cases where you need to be more explicit with the types of parameters / return types (auto
is too loose). The most obvious case is with containers, where you likely want the underlying container type listed.
// f2 and f3 will do the same thing when passed a std::vector, but f3 is much more explicit.
auto f2 { [] (auto& v) -> const auto& { return v[7]; } };
auto f3 { [] <typename T>(std::vector<T>& v) -> const T& { return v[7]; } };
⚠️NOTE️️️⚠️
Concepts may be used both with auto
and with explicit template parameters (e.g. T
). See the section on concepts to see how they're applied in both cases.
↩PREREQUISITES↩
Templates are loosely similar to generics in other high-level languages such as Java. A template defines a class or function where some of the types and code are unknown, called template parameters. Each template parameter in a template either maps to a ...
int
).5
).5.5f
).MyEnum::Value
).&MyClass::MyStaticField
).&MyClass::MyStaticMember
).std::nullptr_t
value available at compile-time (e.g. nullptr
).
Templates are created using the template
keyword, where the template parameters are a comma separated list sandwiched within angle brackets. When the user makes use of a template, its template parameters get substituted with what the user specified.
template <typename X, typename Y, typename Z, int N>
struct MyClass {
X perform(Y &var1, Z &var2) {
return (var1 + var2) * N;
}
};
As shown above, each template parameter for a ...
typename
. The keyword class
may be used instead of typename
. The meaning is exactly the same (typename
should be preferred).
To use a template, use it just as you would a non-template but provide substitutions (template instantiation). To instantiate a class template, use the class as if it were a normal class but immediately after the class name add in a comma separated list of template parameter substitutions sandwiched within angle brackets. These substitutions should be in the same order as the template parameters.
MyClass<float, int, int, 2> obj {}; // X = float, Y = int, Z = int, N = 2
int a {5};
int b {3};
float x {obj.perform(a, b)}; // arguments must be lvalues because perform() takes non-const references
Declaring templated functions is done in the same manner as templated classes, and using templated functions is done similarly to templated classes: Use the function as if it were a normal function but immediately after the function name add in a comma separated list of substitutions sandwiched within angle brackets.
// declare
template <typename X, typename Y, typename Z, int N>
X perform(Y &var1, Z &var2) {
return (var1 + var2) * N;
}
// use
int a {5};
int b {3};
float x {perform<float, int, int, 2>(a, b)}; // arguments must be lvalues because perform() takes non-const references
It's possible to leave out substitutions during usage when the compiler can deduce them from the function arguments you pass in. Template parameters that can't be deduced from the arguments, such as one used only for the return type (X in the example here), must still be supplied explicitly.
// declare
template <typename X, typename Y, typename Z>
X perform(Y &var1, Z &var2) {
return var1 + var2;
}
// use
int a {5};
int b {3};
float x {perform<float>(a, b)}; // X supplied explicitly, Y and Z deduced by compiler
It's possible to supply a default substitution for a template parameter by appending it with =
followed by the substitution, called default template argument.
template <typename X, typename Y = long, typename Z = long>
X perform(Y &var1, Z &var2) {
return var1 + var2;
}
⚠️NOTE️️️⚠️
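You would think that once a default is supplied, all other template parameters after it need a default as well. That rule only applies to class templates. Function templates are exempt because the compiler may be able to deduce the later template parameters from the function arguments.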
Similarly, it's possible to use templates with type aliasing to create shorthand names where only some of the template parameters need to be set, called alias templates.
// declare
template <typename Y, typename Z>
using MyClassPartialTemplate = MyClass<float, Y, Z, 42>;
// use
MyClass<float, int, int, 42> x{};
MyClassPartialTemplate<int, int> y{}; // same type as previous line
Normally, C++ code is split into two files: a header file that contains declarations (e.g. function signatures) and a C++ file that contains definitions (e.g. function signatures with their bodies). When accessing C++ code that isn't local, typically only the declarations of that non-local code need to be included. The linker binds those non-local declarations to their definitions when it comes time to build the executable.
Templates work differently from Java generics in that the C++ compiler generates new code for each unique set of substitutions it sees used (template instantiation). Doing so produces more code than if there was only one copy, but also allows for performance optimizations unique to that specific set of substitutions. Also, because each usage of a template may result in newly generated code, that usage typically needs access to both the declaration and definition. The simplest way to handle this is to put the entirety of the template (both definition and declaration) into a header, which gets included into the same file as the usage.
↩PREREQUISITES↩
Universal references allow for collapsing together multiple function overloads where the only differences between overloads are the same parameters being overloaded as both lvalue references and rvalue references. In the following non-templated code, the only difference between the overloads is that one takes a lvalue reference and the other takes a rvalue reference (and moves it).
void test(int & x) {
if (x % 2 == 0) {
vector.push_back(x); // calls push_back(int &x)
}
}
void test(int && x) {
if (x % 2 == 0) {
vector.push_back(std::move(x)); // calls push_back(int &&x)
}
}
int main() {
int val {5};
test(5); // calls test(int && x)
test(val); // calls test(int & x)
return 0;
}
By templating the code above and forcing the compiler to deduce the parameter type through usage, the compiler can expand out the function overloads on its own.
template<typename T>
void test(T && x) {
if (x % 2 == 0) {
vector.push_back(std::forward<T>(x)); // forward to push_back(int &x) OR push_back(int &&x) based on the reference type
}
}
int main() {
int val {5};
test(5); // calls test(int && x)
test(val); // calls test(int & x)
return 0;
}
In the example above, the parameter x
is a universal reference. A universal reference has two ampersands (&&) as if it were a rvalue reference, but since the top-level type is a template parameter (T
in this case) it's considered a universal reference. std::forward<T>()
is used to maintain the rvalue-ness / lvalue-ness of the argument as it's passed forward into other functions.
⚠️NOTE️️️⚠️
Not using std::forward<T>()
will cause the argument to always be passed along as an lvalue, even if the caller supplied an rvalue. You must use std::forward<T>()
to maintain the type of reference.
For a parameter to be a universal reference, it must follow the pattern NAME &&
where NAME
is the template parameter.
Adding const
, volatile
, or modifying it in any other way will make it go back to being an rvalue reference rather than a universal reference.
Using NAME
as an argument of another type will make it get interpreted as an rvalue reference rather than a universal reference.
template<typename T>
void test(const T&& param) { ... } // BAD: && means rvalue reference (because of const)
template<typename T>
void test(MyClass<T> && param) { ... } // BAD: && means rvalue reference (because it's wrapped in a concrete type)
template<typename T>
void test(T&& param) { ... } // OK: && means universal reference
More examples of universal references in different contexts:
// CONTEXT: Multiple universal references of different types.
template<typename T, typename U>
void test(T && x, U && y) {
if (x % 2 == 0) {
vector.push_back(std::forward<U>(y));
}
}
// CONTEXT: Universal reference of a member function where the class itself is templated.
template<typename UNRELATED_PARAM>
struct MyClass {
...
template<typename T, typename U>
void test(T && x, U && y) {
if (x % 2 == 0) {
vector.push_back(std::forward<U>(y));
}
}
...
};
⚠️NOTE️️️⚠️
The reason why universal references work is that the compiler is deducing the correct type for the template parameter based on how it's used. If it gets passed a lvalue reference, it'll invoke the lvalue version. If it gets passed a rvalue reference, it'll invoke the rvalue version.
Internally, the compiler uses a technique called "reference collapsing" to get this to work, which temporarily / internally allows certain unallowable C++ constructs (references to references are disallowed). See here for more information.
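The forwarding behaviour described in this section can be sketched with a type where copying and moving are observably different. The helper `append` and the function `forwarding_demo` are hypothetical names for this illustration; `std::string` is used in place of `int` so that a move actually hands over a buffer.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: appends to the vector, preserving the value category of the argument.
template <typename T>
void append(std::vector<std::string>& out, T&& value) {
    out.push_back(std::forward<T>(value));
}

inline bool forwarding_demo() {
    std::vector<std::string> out {};
    std::string kept {"kept"};
    std::string given {"given away"};

    append(out, kept);              // lvalue: T = std::string&, push_back copies
    append(out, std::move(given));  // rvalue: T = std::string, push_back moves

    assert(kept == "kept");         // the lvalue argument was copied, not moved from
    return out.size() == 2 && out[0] == "kept" && out[1] == "given away";
}
```

The sketch deliberately doesn't inspect `given` afterwards, since the contents of a moved-from string are unspecified.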
↩PREREQUISITES↩
auto
may be used as shorthand for template parameters. If a parameter has a type of auto
, that auto
assumes the place of a unique template parameter (e.g. T
).
void func(auto p); // template<typename T> void func(T p);
void func(auto & p); // template<typename T> void func(T & p);
void func(auto * p); // template<typename T> void func(T * p);
void func(const auto & p); // template<typename T> void func(const T & p);
void func(const auto * p); // template<typename T> void func(const T * p);
void func(auto && p); // template<typename T> void func(T && p);
Likewise, if a return type has a type of auto
, the compiler deduces the type from the function's return statements using the same rules it uses for a template parameter.
auto func(int p); // deduced as if by: template<typename T> T func(int p);
auto
is typically also used for variable declarations. One important aspect of auto
for variable declarations to be aware of: Braced-plus-equals initialization produces an std::initializer_list<T>
rather than just T
. Since C++17, braced initialization without the equals sign produces T when there's a single element and fails to compile when there's more than one.
int x = 5; // x is int of 5
int x (5); // x is int of 5
int x {5}; // x is int of 5
int x = {5}; // x is int of 5
// ... vs ...
auto x = 5; // x is int of 5
auto x (5); // x is int of 5
auto x {5}; // x is int of 5 (was std::initializer_list<int> before C++17)
auto x = {5}; // x is std::initializer_list<int>
⚠️NOTE️️️⚠️
This seems to mesh with how certain classes work. For example, to create a std::vector<int>
, you can pass in an std::initializer_list<int>
via its constructor to prime it with a set of values. That std::initializer_list<int>
is typically created using the curly brace syntax.
std::vector<int> v ( {1, 2, 3, 4, 5} );
However, when you use auto
as the return type of a function OR auto
for parameters in a lambda, the curly-brace to std::initializer_list<T>
conversion discussed above doesn't happen. The compiler will fail to deduce the type if you supply a list in curly braces.
⚠️NOTE️️️⚠️
Later sections discuss template deduction and decltype(auto)
, both of which are important to know about when using template parameters. decltype(auto)
can be used for variable declarations as well.
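What `auto` deduces for each initialization form can be verified with `static_assert`. This is a small sketch, assuming a C++17 or later compiler, where `auto x {5};` with a single element deduces `int` and only the `= {...}` form yields `std::initializer_list<int>`. The function name `auto_brace_demo` is made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <type_traits>

inline std::size_t auto_brace_demo() {
    auto a = 5;    // int
    auto b (5);    // int
    auto c {5};    // int (C++17 and later)
    auto d = {5};  // std::initializer_list<int>

    static_assert(std::is_same_v<decltype(a), int>);
    static_assert(std::is_same_v<decltype(b), int>);
    static_assert(std::is_same_v<decltype(c), int>);
    static_assert(std::is_same_v<decltype(d), std::initializer_list<int>>);

    assert(a + b + c == 15);
    return d.size();  // the list holds a single element
}
```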
🔍SEE ALSO🔍
decltype(...)
usage)
decltype(auto)
usage)
↩PREREQUISITES↩
To automatically derive the type of a variable or expression so that it can be passed in as a template parameter, use decltype()
. This is useful in scenarios where it's difficult or impossible to determine the exact type for a template parameter (e.g. functions, functors, template parameters).
// declare
template <typename FUNC_TYPE>
void perform(FUNC_TYPE func) { // taken by value: a lambda is an object, not a function pointer
func(55);
}
// use
auto my_lambda = [](int x) { std::cout << x; };
perform<decltype(my_lambda)>(my_lambda);
decltype()
can take in either an entity (as shown above) or an expression.
// declare
template <typename N>
void perform(N n) {
std::cout << n;
}
// use
MyClass myClass{};
perform<decltype(myClass.numVar + 1L)>(myClass.numVar + 1L); // N set to whatever type "myClass.numVar + 1L" evaluates to
⚠️NOTE️️️⚠️
The book mentions that, if you're going to use decltype()
, don't wrap a variable name in an extra set of parentheses. The reason is that decltype()
treats a parenthesized name as an lvalue expression rather than as the name of a variable, so it produces a reference to the variable's type.
int x { 5 };
decltype(x) // will be an int
decltype((x)) // will be an int &
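The parentheses rule can be confirmed at compile time. In this sketch (the function name `decltype_demo` is invented for the illustration), the `static_assert` lines check each form, and the last few lines show the practical consequence: a variable declared with `decltype((x))` is a reference to `x`.

```cpp
#include <cassert>
#include <type_traits>

inline int decltype_demo() {
    int x {5};
    static_assert(std::is_same_v<decltype(x), int>);        // variable name: its declared type
    static_assert(std::is_same_v<decltype((x)), int&>);     // parenthesized: lvalue expression, so int&
    static_assert(std::is_same_v<decltype(x + 1L), long>);  // int + long evaluates to long

    decltype((x)) alias = x;  // alias is an int& bound to x
    alias = 9;                // writes through to x
    assert(&alias == &x);
    return x;
}
```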
C++ templates allow for template parameters to be deduced based on usage.
template<typename T>
bool test(T x) {
return x % 2 == 0;
}
test(5); // same as test<int>(5)
test(5ULL); // same as test<unsigned long long>(5ULL)
The following subsections detail type deduction rules for templates as well as edge cases and workarounds.
template<typename T>
bool test(T p) {
return p % 2 == 0;
}
What the type T
gets deduced to depends on what p
is specified as and what type gets passed into p
as an argument.
int a { 5 };
const int * aPtr { &a };
// Scenario #1: p is just "T" by itself
template<typename T>
bool test1(T p) {
return *p % 2 == 0;
}
test1(aPtr);
// Scenario #2: p is "T *"
template<typename T>
bool test2(T * p) {
return *p % 2 == 0;
}
test2(aPtr);
// Scenario #3: p is "const T *"
template<typename T>
bool test3(const T * p) {
return *p % 2 == 0;
}
test3(aPtr);
The idea with C++'s type deduction is that it tries to do the right thing through pattern matching. In the example above, T
was deduced to be the correct type in each of the scenarios.
Scenario #1: T=const int *.
Scenario #2: T=const int.
Scenario #3: T=int.
Pattern matching attempts to deduce template parameter T based on...
how T is used for function parameter p,
what expression e is passed as the argument to p.
template<typename T>
void func(??? p) { // ??? can be T, T&, const T, const T&, ...
...
}
func(e); // Given the expression e, func()'s parameter p, what will T be?
For value types, pointer types, lvalue reference types, and rvalue reference types, the rules are as follows:
If e and p are both values, const / volatile will never transfer over to T because a copy of e is being passed in.
If e and p are both pointers, const / volatile will transfer over to T if not already set on p.
If e and p are both references, const / volatile will transfer over to T if not already set on p.
If e is a value but p is a reference, e gets passed into the function as a reference (const / volatile are maintained on e's reference, see rule where both e and p are references).
If e is a reference but p is a value, e gets passed into the function as a copy of the value it references (const / volatile are removed from e's copy, see rule where both e and p are values).
expression | p=T | p=const T | p=T& | p=const T& | p=T* | p=const T* |
---|---|---|---|---|---|---|
e=int | T=int | T=int | T=int | T=int | | |
e=const int | T=int | T=int | T=const int | T=int | | |
e=int& | T=int | T=int | T=int | T=int | | |
e=const int& | T=int | T=int | T=const int | T=int | | |
e=int* | | | | | T=int | T=int |
e=const int* | | | | | T=const int | T=int |
⚠️NOTE️️️⚠️
volatile isn't included in the above matrix to keep things simple. It behaves just like const.
⚠️NOTE️️️⚠️
The rules above work for return types exactly the same way that they do for parameter types: e
ends up being the expression being returned by the function and p
is the function's return type.
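The deduction rules for values, references, and pointers can be checked mechanically. The sketch below is one way to do it, not the only one: each hypothetical function template returns a `Deduced<T>` tag (a helper type invented for this illustration) so that `decltype` can reveal what `T` was deduced to for a given argument.

```cpp
#include <type_traits>

// Hypothetical tag type that records the deduced template parameter.
template <typename T> struct Deduced { using type = T; };

template <typename T> Deduced<T> by_value(T)             { return {}; }
template <typename T> Deduced<T> by_const_value(const T) { return {}; }
template <typename T> Deduced<T> by_ref(T&)              { return {}; }
template <typename T> Deduced<T> by_const_ref(const T&)  { return {}; }
template <typename T> Deduced<T> by_ptr(T*)              { return {}; }
template <typename T> Deduced<T> by_const_ptr(const T*)  { return {}; }

constexpr bool deduction_demo() {
    [[maybe_unused]] const int ci {6};
    [[maybe_unused]] const int* cp {&ci};

    // e=const int
    static_assert(std::is_same_v<decltype(by_value(ci))::type, int>);           // const dropped: copy
    static_assert(std::is_same_v<decltype(by_const_value(ci))::type, int>);     // const already on p
    static_assert(std::is_same_v<decltype(by_ref(ci))::type, const int>);       // const moves into T
    static_assert(std::is_same_v<decltype(by_const_ref(ci))::type, int>);       // const already on p

    // e=const int*
    static_assert(std::is_same_v<decltype(by_value(cp))::type, const int*>);    // whole pointer type
    static_assert(std::is_same_v<decltype(by_ptr(cp))::type, const int>);       // const moves into T
    static_assert(std::is_same_v<decltype(by_const_ptr(cp))::type, int>);       // const already on p
    return true;
}
```

Everything is verified at compile time, so a mistaken expectation shows up as a compiler error rather than a wrong runtime value.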
For universal references, the rules are more complicated. p
gets reinterpreted based on whether e
is a lvalue reference or rvalue reference:
If e is a lvalue reference, both p and T will be interpreted as lvalue reference to the core type.
If e is a rvalue reference, p is interpreted as an rvalue reference and T is the reference-less version of p.
expression | p=T&& |
---|---|
e=int& | T=int& (p interpreted as int&) |
e=const int& | T=const int& (p interpreted as const int&) |
e=int&& | T=int (p interpreted as int&&) |
e=const int&& | T=const int (p interpreted as const int&&) |
⚠️NOTE️️️⚠️
What the above is saying is that, if e ends up being an rvalue reference, it uses the basic rules explained just previous to this universal references explainer. Recall that parameters that are universal references borrow the rvalue reference syntax of double ampersand (&&) -- double ampersands are universal references if the type is used in a parameter and left as-is (no const
/volatile
/etc..).
The types for T
and p
look invalid in lvalue cases but there's some special logic going on under the hood in terms of "reference collapsing" and doing things internally that would be explicitly illegal to do in code. For example, normally, if p=int&
then T=int
. But that isn't the case with universal references: p=int&
(interpreted) but then T=int&
as well.
⚠️NOTE️️️⚠️
A quick-and-dirty way to determine what a type gets deduced to is to use typeid()
in combination with querying type traits.
template<typename T>
void test(T p) { // T or T& or const T or const T& or ...
using P = decltype(p);
using T_ref_removed = typename std::remove_reference<T>::type;
using P_ref_removed = typename std::remove_reference<P>::type;
using T_ref_and_cv_removed = typename std::remove_cv<T_ref_removed>::type;
using P_ref_and_cv_removed = typename std::remove_cv<P_ref_removed>::type;
// is_const/is_volatile must have ref removed for test to work: https://en.cppreference.com/w/cpp/types/is_const
std::cout
<< "p: "
<< (std::is_const<P_ref_removed>::value ? "[const]" : "")
<< (std::is_volatile<P_ref_removed>::value ? "[volatile]" : "")
<< (std::is_lvalue_reference<P>::value ? "[&]" : "")
<< (std::is_rvalue_reference<P>::value ? "[&&]" : "")
<< typeid(P_ref_and_cv_removed).name()
<< " / "
<< "T:"
<< (std::is_const<T_ref_removed>::value ? "[const]" : "")
<< (std::is_volatile<T_ref_removed>::value ? "[volatile]" : "")
<< (std::is_lvalue_reference<T>::value ? "[&]" : "")
<< (std::is_rvalue_reference<T>::value ? "[&&]" : "")
<< typeid(T_ref_and_cv_removed).name()
<< std::endl;
}
typeid()
by itself has a couple of issues:
In certain cases, it won't output specifics of the type (see https://stackoverflow.com/q/37412265). I've tried to work around this by using type traits in the code above.
The names are mangled in G++ and clang (MSVC produces full type names). To de-mangle, you can use a command-line tool (that comes with most Linux g++/clang setups) called "c++filt". For example, if typeid().name()
outputs "PKi", ...
user@localhost$ c++filt -t PKi
int const*
⚠️NOTE️️️⚠️
The book mentions a couple of niche cases to do with decaying of types.
When e
is a raw array (e.g. e=int[13]
) and p
is a reference type (e.g. p=T&
, p=const T&
, p=T&&
, ...), p
doesn't decay to a pointer (it doesn't become p=int* &
). Instead, an actual reference to the array (including its size) gets passed in, meaning that it's possible to get the array's size via sizeof()
. This isn't possible if it decayed to a pointer.
The book recommends using std::array
instead of relying on this.
When e
is a function and p
is a reference type, p
doesn't decay to a function pointer. It ends up being a reference to the actual function.
The book mentions that, in practice, the non-decaying of functions rarely makes a difference to the code.
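The array case can be illustrated with a short sketch. The function names `bytes_by_reference`, `bytes_by_value`, and `element_count`, along with the `sample` array, are hypothetical and exist only for this example.

```cpp
#include <cstddef>

// p is a reference: T is deduced as the array type itself (e.g. int[13]), no decay.
template <typename T>
constexpr std::size_t bytes_by_reference(const T& arr) { return sizeof(arr); }

// p is a value: the array decays, so T is deduced as a pointer type.
template <typename T>
constexpr std::size_t bytes_by_value(T arr) { return sizeof(arr); }

// The array's length can also be deduced directly as a value template parameter.
template <typename T, std::size_t N>
constexpr std::size_t element_count(const T (&)[N]) { return N; }

inline constexpr int sample[13] {};
static_assert(bytes_by_reference(sample) == 13 * sizeof(int));  // size of the whole array
static_assert(bytes_by_value(sample) == sizeof(const int*));    // size of a pointer only
static_assert(element_count(sample) == 13);
```

The standard library's `std::size()` works on raw arrays through the same kind of reference-to-array parameter as `element_count`.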
Type deduction for auto
works almost exactly the same as template parameter type deduction. If a parameter type has auto
, that auto
assumes the place of a unique template parameter (e.g. T
). What auto
gets deduced to follows the same rules -- it takes into account the expression passed in as the argument for the parameter and how the parameter is specified (e.g. if it's const
, a reference, a pointer, etc..).
void func(auto p); // template<typename T> void func(T p);
void func(auto & p); // template<typename T> void func(T & p);
void func(auto * p); // template<typename T> void func(T * p);
void func(const auto & p); // template<typename T> void func(const T & p);
void func(const auto * p); // template<typename T> void func(const T * p);
void func(auto && p); // template<typename T> void func(T && p);
auto func(int p); // deduced as if by: template<typename T> T func(int p);
This extends to variable declarations that use auto
. The rules are essentially the same:
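auto assumes the role of the template parameter,
what auto gets deduced to depends on the expression being assigned and how the variable is specified (e.g. if it's const, a reference, a pointer, etc..).
const auto p = 5;
// Imagine p is a parameter in a function and 5 is the argument being passed into it:
//
// template<typename T>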
// void func(const T p) {
// ...
// }
// func(5);
The rules are the same even for variables typed as auto &&
. When a variable type is auto &&
, it's interpreted as a universal reference rather than a rvalue reference. It only becomes a rvalue reference if it's initialized with an rvalue. Otherwise, it's a lvalue reference.
int x { 22 };
auto && p1 = 52; // p1 is rvalue reference
auto && p2 = x; // p2 is lvalue reference
↩PREREQUISITES↩
In certain cases, a variable declaration / return statement needs to replicate the exact type of whatever expression is being assigned to it. This is possible with decltype(auto)
.
// funcA()'s return type is the exact same as f_ptr()'s return type.
template<typename F>
decltype(auto) funcA(F * f_ptr, int index) {
return f_ptr(index);
}
// x's type is the exact same as f()'s return type.
decltype(auto) x = f(a1, a2, a3, a4);
This is needed because, with normal type deduction rules, the deduction of T
changes based on how the overall type is specified (e.g. T
, const T
, T&
, T*
, etc..) and the type of the expression that gets assigned to it.
⚠️NOTE️️️⚠️
See previous section for a refresher on type deduction rules.
template<typename F, typename T>
T funcA(F * f_ptr, int index) {
return f_ptr(index);
}
// What does T get deduced as here? Impossible to know because the signature of "f_ptr()" isn't known beforehand. But,
// if "f_ptr(index)" returns a reference, type deduction rules say that T will end up stripping off the reference. So,
// for example, if "f_ptr(index)" returns "MyObject &", this function will end up returning a COPY of that object
// rather than the reference itself.
If f_ptr()
returns a copy and your return type is T
, everything is okay.
template<typename T, typename F>
T test(F * f_ptr, int index) {
return f_ptr(index); // f_ptr() returns a COPY and you return a COPY
}
If f_ptr()
returns a reference and your return type is T&
, everything is okay.
template<typename T, typename F>
T& test(F * f_ptr, int index) {
return f_ptr(index); // f_ptr() returns a REFERENCE and you return a REFERENCE
}
If f_ptr()
returns a reference but your return type is T
, it's inefficient code.
template<typename T, typename F>
T test(F * f_ptr, int index) {
return f_ptr(index); // f_ptr() returns a REFERENCE and you return a COPY of the referenced object -- it would have been fine
// to return just the reference itself
}
If f_ptr()
returns a copy but your return type is T&
, it's faulty code.
template<typename T, typename F>
T& test(F * f_ptr, int index) {
return f_ptr(index); // f_ptr() returns a COPY and you return a REFERENCE to that local copy -- copy is destroyed once
// this function exits meaning that the reference will be pointing to junk.
}
If you don't know whether f_ptr()
will return a reference or a copy (you just want to mirror back whatever its return type is), use decltype(auto)
.
template<typename F>
decltype(auto) test(F * f_ptr, int index) {
return f_ptr(index); // returns the exact type of f_ptr()
}
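To make the difference concrete, the sketch below compares decltype(auto) against plain auto for the same return statement. It assumes a C++14 (or later) compiler. The names stored_value, get_copy, get_ref, mirror, and strip are hypothetical and only exist for this illustration.

```cpp
#include <type_traits>

int stored_value = 5;  // hypothetical global that the sample functions operate on

int  get_copy(int) { return stored_value; }  // returns a copy
int& get_ref(int)  { return stored_value; }  // returns a reference

// decltype(auto): mirrors back whatever f_ptr() returns.
template<typename F>
decltype(auto) mirror(F* f_ptr, int index) {
    return f_ptr(index);
}

// Plain auto: follows by-value deduction, so a returned reference becomes a copy.
template<typename F>
auto strip(F* f_ptr, int index) {
    return f_ptr(index);
}

static_assert(std::is_same<decltype(mirror(get_copy, 0)), int>::value, "copy stays a copy");
static_assert(std::is_same<decltype(mirror(get_ref, 0)), int&>::value, "reference stays a reference");
static_assert(std::is_same<decltype(strip(get_ref, 0)), int>::value, "auto drops the reference");
```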
⚠️NOTE️️️⚠️
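The book mentions that, if you're going to use decltype(auto), don't wrap the returned expression in parentheses. For a plain variable name, decltype() yields the variable's declared type, but for a parenthesized name it yields an lvalue reference to that type. In the second function below, that means returning a reference to a local variable that no longer exists once the function exits.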
// Example from the book
decltype(auto) f1() {
int x = 0;
return x; // decltype(x) is int, so f1 returns int
}
decltype(auto) f2() {
int x = 0;
return (x); // decltype((x)) is int&, so f2 returns int&
}
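The same effect can be observed without a dangling reference by returning a variable that outlives the function. This is a minimal sketch assuming C++14 or later. The names counter, counter_value, and counter_ref are hypothetical.

```cpp
#include <type_traits>

int counter = 0;  // hypothetical global, so returning a reference to it is safe

decltype(auto) counter_value() { return counter; }    // decltype(counter)   -> int
decltype(auto) counter_ref()   { return (counter); }  // decltype((counter)) -> int&

static_assert(std::is_same<decltype(counter_value()), int>::value, "no parentheses: returns int");
static_assert(std::is_same<decltype(counter_ref()), int&>::value, "parentheses: returns int&");
```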
⚠️NOTE️️️⚠️
The book mentions that, before decltype(auto)
, you needed to use the trailing return type syntax to get similar behaviour when the return type depended on the parameter types.
template<typename T>
auto f1(T x) -> decltype(x + 5) {
return x + 5;
}
You can't do something similar with the original return type syntax because the compiler doesn't know what's in the parameter list -- it hasn't parsed that part yet.
template<typename T>
decltype(x + 5) f1(T x) { // THIS WON'T WORK: x used in decltype() before it's encountered in parameter list
return x + 5;
}
A variadic function is one that takes in a variable number of arguments, sometimes called varargs in other languages. A template can be made variadic by placing a final template parameter with ...
preceding the name, where this template parameter is referred to as a parameter pack.
One common use-case for parameter packs is invoking functions where the parameter list isn't known beforehand.
template <typename X, typename... R>
X create(R... args) {
return X {args...};
}
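A possible usage of create() is sketched below. The Point struct is a hypothetical aggregate, and constexpr has been added to create() only so that the result can be checked at compile time.

```cpp
struct Point {  // hypothetical aggregate used for illustration
    int x;
    int y;
};

template <typename X, typename... R>
constexpr X create(R... args) {
    return X{args...};  // the pack expands to: X{arg0, arg1, ...}
}

// X is given explicitly, R... is deduced from the arguments as <int, int>.
static_assert(create<Point>(3, 4).x == 3, "first argument initializes x");
static_assert(create<Point>(3, 4).y == 4, "second argument initializes y");
```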
Another less common use-case is specifying the base classes to inherit from (multiple inheritance).
template <typename... R>
struct X : R... {
    X(const R&... args) : R(args)... { // member initializer list calls the constructor of each base class
    }
};
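As a rough sketch of how inheriting from a parameter pack might be used, the example below mixes two unrelated structs into one type. Named, Sized, Combined, and box are hypothetical names, and the constructor is marked constexpr only so the result can be checked at compile time.

```cpp
struct Named { const char* name; };  // hypothetical base class
struct Sized { int size; };          // hypothetical base class

template <typename... Bases>
struct Combined : Bases... {
    // One constructor argument per base class, each copied into its base.
    constexpr Combined(const Bases&... args) : Bases(args)... {}
};

constexpr Combined<Named, Sized> box{Named{"box"}, Sized{3}};

static_assert(box.size == 3, "member inherited from Sized");
static_assert(box.name[0] == 'b', "member inherited from Named");
```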
Another less common use-case is to repeatedly apply some operator or function.
template<typename T>
T sum(T t) {
return t;
}
template<typename T, typename... R>
T sum(const T& first, R... rest) {
return sum(first) + sum(rest...);
}
Alternatively, rather than using recursion to exhaustively apply a binary operator, a fold expression may be applied to the parameter pack. A fold expression applies a binary operator to the contents of a parameter pack and returns the final result.
The syntax for fold expressions is the binary operator sandwiched in between ... and the parameter pack's name, all encapsulated within a pair of parentheses. Which side of the operator the ... appears on defines whether the fold expression will be left associative or right associative.
template<typename... R>
auto test(R... args) {
    auto l_ass_res = (... - args); // ((((a-b)-c)-d)-...)
    auto r_ass_res = (args - ...); // (...-(w-(x-(y-z))))
    return l_ass_res + r_ass_res;
}
⚠️NOTE️️️⚠️
Just a heads up that, depending on the operator, associativity matters. For example ((5-4)-3)
is not equal to (5-(4-3))
.
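The effect of associativity can be checked with concrete numbers. This is a minimal sketch assuming a C++17 (or later) compiler. The function names fold_left_minus and fold_right_minus are hypothetical.

```cpp
template<typename... R>
constexpr auto fold_left_minus(R... args) {
    return (... - args);  // left fold: ((a - b) - c)
}

template<typename... R>
constexpr auto fold_right_minus(R... args) {
    return (args - ...);  // right fold: (a - (b - c))
}

static_assert(fold_left_minus(10, 4, 3) == 3, "(10 - 4) - 3 == 3");
static_assert(fold_right_minus(10, 4, 3) == 9, "10 - (4 - 3) == 9");
```

Note that these folds require at least one argument. For most operators, subtraction included, folding over an empty parameter pack doesn't compile.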
To get the size of a parameter pack, add ...
after the sizeof
operator.
template <typename... R>
std::size_t calculate_size(R... args) {
return sizeof...(args);
}
Parameter packs are used internally within C++'s implementation of analogues to Python's tuples and zip: std::pair
, std::tuple
, and std::views::zip (C++23)
.
⚠️NOTE️️️⚠️
Examples adapted from here.
Given a specific set of substitutions for the template parameters of a template, a template specialization is code that overrides the template generated code. Oftentimes template specializations are introduced because they're more memory or computationally efficient than the standard template generated code. The classic example is a template that holds on to an array. Most C++ implementations represent a bool
as a single byte, however it's more compact to store an array of bool
s as a set of bits.
Declare a template specialization with the template
keyword but without any template parameters (empty angle brackets). The class or function that follows should list out substitutions after its name and the code within it should be real (non-templated).
// template
template<typename T>
T sum(T a, T b) {
return a + b;
}
// template specialization for bool: bitwise or
template<>
bool sum<bool>(bool a, bool b) {
return a | b;
}
Template specialization doesn't have to substitute all template parameters. When a template specialization only provides substitutes for some of its template parameters, leaving other template parameters as-is or partially refined, it's called a partial template specialization.
// template
template<typename R, typename T>
struct MyClass {
R sum(T a, T b) {
return a + b;
}
};
// partial template specialization for pointers of unknown type: always return false
template<typename X>
struct MyClass<bool, X*> {
bool sum(X * a, X* b) {
return false;
}
};
⚠️NOTE️️️⚠️
Partial template specializations for functions aren't supported (yet?). See here.
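A common workaround, sketched below, is to have the function template forward to a class template, since class templates can be partially specialized. The names SumImpl, sum_as, and sample are hypothetical, and everything is marked constexpr only so that the selected specialization can be checked at compile time.

```cpp
// Primary class template: holds the general implementation.
template<typename R, typename T>
struct SumImpl {
    static constexpr R apply(T a, T b) { return a + b; }
};

// Partial specialization: R is fixed to bool, T is refined to "pointer to some X".
template<typename X>
struct SumImpl<bool, X*> {
    static constexpr bool apply(X*, X*) { return false; }
};

// The function template itself is never specialized, it only forwards.
template<typename R, typename T>
constexpr R sum_as(T a, T b) {
    return SumImpl<R, T>::apply(a, b);
}

constexpr int sample = 0;  // hypothetical object to take the address of

static_assert(sum_as<int>(2, 3) == 5, "primary template is used");
static_assert(!sum_as<bool>(&sample, &sample), "partial specialization is used");
```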
In certain cases, the compiler is able to deduce the types for a specialization from its usage, meaning explicitly listing substitutions after the name may not be required.
// first example without explicitly listing out substitutions
template<>
bool sum(bool a, bool b) { // type removed after name: "sum<bool>" to just "sum"
return a | b;
}
↩PREREQUISITES↩
Similar to classes and functions, type aliasing can be templated. A template
declaration is needed before the type alias itself.
template<typename T>
using V = std::vector<T>;
// usage
V<int> my_vec { 1, 2, 3 };
In certain cases, when using a templated type within a templated type alias, the keywords typename
and template
may be required within the type alias declaration itself.
struct Option {
template<typename T>
using Vector = std::vector<T>;
};
template<typename O, typename T>
using Vector = typename O::template Vector<T>;
// usage
Vector<Option, int> v { 1, 2, 3 };
The rules for this are complex, but essentially in certain cases the compiler can't decide how to parse a templated type alias and the keywords typename
and template
act as disambiguation. The compiler will usually generate an error telling you that typename
is needed, but may not warn for template
and essentially interpret it as something other than what the programmer intended.
⚠️NOTE️️️⚠️
For a full breakdown, see here.
A callable's (e.g. function, functor, lambda) type encompasses multiple other types. For example, the following function's type is int(long, short)
...
int my_func(long lval, short sval) {
return 5;
}
But, that type is composed of 3 other types:
int, which is the return type.
long, which is the first parameter type.
short, which is the second parameter type.
The subsections below describe how to unpack a callable's type such that you can extract out the types it's composed of. Each callable type has a slightly different way of unpacking types.
↩PREREQUISITES↩
A template can be used to unpack / extract the types that make up a function's declaration. The process works by first creating an unimplemented templated class with a single template parameter.
template<typename Fn>
struct func_types; // unimplemented
When a function type gets passed into func_types<Fn>
's template parameter, the C++ compiler looks for a template specialization that more closely matches that function type. The example below is for a function type with two parameters: R(P1, P2)
, where R
, P1
, and P2
are template parameters. It provides an implementation for the templated class that simply assigns R
, P1
, and P2
to type aliases nested within. Any function type with two parameters will match it (e.g. int(long, short)
, void*(char, bool)
, etc...).
template<typename R, typename P1, typename P2>
struct func_types<R(P1, P2)> {
using return_t = R;
using param1_t = P1;
using param2_t = P2;
};
To use the func_types<Fn>
example above, either set Fn
to a function type or extract the type of an existing function using decltype()
.
int my_func(long lval, short sval) {
return 5;
}
int main() {
using types = func_types<decltype(my_func)>; // equiv to func_types<int(long, short)>
std::cout << (std::is_same<types::return_t, int>::value ? "true" : "false") << std::endl; // prints "true"
std::cout << (std::is_same<types::param1_t, long>::value ? "true" : "false") << std::endl; // prints "true"
std::cout << (std::is_same<types::param2_t, short>::value ? "true" : "false") << std::endl; // prints "true"
return 0;
}
The func_types<>
example above is specifically for two-parameter function types. It won't work for function types with 0, 1, 3, or any other number of parameters. A separate template specialization is needed for each of those.
To support an arbitrary number of parameters, you need to use a template parameter pack to represent an arbitrary list of types and recursion to home in on a specific index within that list.
template <std::size_t N, typename T0, typename ... Ts>
struct recurse_to_type {
using type = typename recurse_to_type<N-1u, Ts...>::type;
};
template <typename T0, typename ... Ts>
struct recurse_to_type<0u, T0, Ts...> { // template specialization for when N=0
using type = T0;
};
recurse_to_type<N, T0, Ts...> takes in an index N, a starting type T0, and a parameter pack of remaining types Ts. Each recursion step drops T0, pulls the first type out of Ts (that type becomes T0 in the next step), and decrements N by 1. Once N reaches 0, the template specialization recurse_to_type<0, T0, Ts...> sets the current T0 as the type alias.
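The recursion can be traced by hand for a small list of types. The sketch below repeats the two templates so that it compiles on its own, then checks a few indices.

```cpp
#include <cstddef>
#include <type_traits>

template <std::size_t N, typename T0, typename... Ts>
struct recurse_to_type {
    using type = typename recurse_to_type<N - 1u, Ts...>::type;
};

template <typename T0, typename... Ts>
struct recurse_to_type<0u, T0, Ts...> {  // template specialization for when N=0
    using type = T0;
};

// Expansion of recurse_to_type<2, char, int, long>:
//   recurse_to_type<2, char, int, long>  (primary: drop char, N becomes 1)
//   recurse_to_type<1, int, long>        (primary: drop int, N becomes 0)
//   recurse_to_type<0, long>             (specialization: type = long)
static_assert(std::is_same<recurse_to_type<0, char, int, long>::type, char>::value, "index 0");
static_assert(std::is_same<recurse_to_type<1, char, int, long>::type, int>::value, "index 1");
static_assert(std::is_same<recurse_to_type<2, char, int, long>::type, long>::value, "index 2");
```

An index past the end of the list (e.g. 3 in this example) fails to compile, because the recursion runs out of types before N reaches 0.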
Given a function type, pulling out the type of one of its parameters simply involves using recurse_to_type
with the list of parameters (represented as a template parameter pack) and the index of that parameter N
.
template <std::size_t N, typename Fn>
struct param_type;
template <std::size_t N, typename R, typename ... Ps>
struct param_type<N, R(Ps...)> {
using type = typename recurse_to_type<N, Ps...>::type;
};
Pulling out the total number of parameters is done by passing the parameter pack to the sizeof... operator.
template <typename Fn>
struct param_cnt;
template <typename R, typename ... Ps>
struct param_cnt<R(Ps...)> {
static const std::size_t cnt { sizeof...(Ps) }; // probably should be constexpr
};
🔍SEE ALSO🔍
constexpr
Pulling out a return type is done by simply referring to the template parameter holding it.
template <typename Fn>
struct ret_type;
template <typename R, typename ... Ps>
struct ret_type<R(Ps...)> {
using type = R;
};
Use the templates above similarly to the initial two parameter function type example.
int my_func(long lval, short sval) {
return 5;
}
int main() {
using r = ret_type<decltype(my_func)>::type;
using p0 = param_type<0, decltype(my_func)>::type;
using p1 = param_type<1, decltype(my_func)>::type;
std::cout << (std::is_same<r, int>::value ? "true" : "false") << std::endl; // prints "true"
std::cout << (std::is_same<p0, long>::value ? "true" : "false") << std::endl; // prints "true"
std::cout << (std::is_same<p1, short>::value ? "true" : "false") << std::endl; // prints "true"
std::cout << param_cnt<decltype(my_func)>::cnt << std::endl; // prints "2"
return 0;
}
⚠️NOTE️️️⚠️
There's a much simpler way to do all of this if you're already familiar with std::tuple
from the C++ standard library, documented here. Basically, wrap the parameter types within a std::tuple
's type, and then use std::tuple
's type access functions to pull out individual types within that tuple type / number of types nested in that tuple type.
template<typename Fn>
struct func_types; // unimplemented
template<typename R, typename... Ps>
struct func_types<R(Ps...)> {
using ret_t = R;
using params_as_tuple_t = std::tuple<Ps...>;
template<std::size_t N>
using param_t = typename std::tuple_element<N, params_as_tuple_t>::type;
static constexpr std::size_t param_cnt { std::tuple_size<params_as_tuple_t>::value };
};
int my_func(long lval, short sval) {
return 5;
}
int main() {
using types = func_types<decltype(my_func)>;
std::cout << (std::is_same<types::ret_t, int>::value ? "true" : "false") << std::endl; // prints "true"
std::cout << (std::is_same<types::param_t<0u>, long>::value ? "true" : "false") << std::endl; // prints "true"
std::cout << (std::is_same<types::param_t<1u>, short>::value ? "true" : "false") << std::endl; // prints "true"
std::cout << types::param_cnt;
return 0;
}
There's also Boost's type traits library, which provides a simple function_traits<>
template that pulls out the types and other type-related information from a function type: function_traits<decltype(my_func)>::result_type, function_traits<decltype(my_func)>::argN_type, function_traits<decltype(my_func)>::arity
, etc...
↩PREREQUISITES↩
Type extraction for functors is similar to type extraction for functions. The main difference is that, with functors, the type extraction is being performed on the function call operator of the functor. As such, two things change.
The template specialization for a functor has to take in an extra template parameter that maps to the functor's type (class type).
// for function
template <typename R, typename ... Ps>
struct my_struct<R(Ps...)> { /* fill me in */ };
// for functor's function call operator
template<typename O, typename R, typename... Ps>
struct my_struct<R (O::*)(Ps...)> { /* fill me in */ };
In the example above, the functor takes in the extra template parameter O
and the template parameter breakdown is R (O::*)(Ps...)
, which is mapping O
to the type of the functor.
Additionally, the type being passed to the template parameter needs to be the type of the function call operator, not the type of the functor itself.
// for function
using types = my_struct<decltype(my_func)>;
// for functor
using types = my_struct<decltype(&my_functor::operator())>;
In the example above, the template parameter isn't the type of the functor but the type of its function call operator: decltype(&my_functor::operator())
.
The following is an adaptation of the type extraction code presented in the previous function unpacking section. The template specialization's template parameters have been modified to work with functors instead of functions (R(O::*)(Ps...)
vs R(Ps...)
).
// recurse_to_type is the same for functors as it was for functions
template <std::size_t N, typename T0, typename ... Ts>
struct recurse_to_type {
using type = typename recurse_to_type<N-1u, Ts...>::type;
};
template <typename T0, typename ... Ts>
struct recurse_to_type<0u, T0, Ts...> {
using type = T0;
};
// param_type but with template specialization for functors
template <std::size_t N, typename Fn>
struct param_type;
template <std::size_t N, typename O, typename R, typename... Ps>
struct param_type<N, R (O::*)(Ps...)> {
using type = typename recurse_to_type<N, Ps...>::type;
};
// param_cnt but with template specialization for functors
template <typename Fn>
struct param_cnt;
template<typename O, typename R, typename... Ps>
struct param_cnt<R (O::*)(Ps...)> {
static const /*constexpr*/ std::size_t cnt { sizeof...(Ps) };
};
// ret_type but with template specialization for functors
template <typename Fn>
struct ret_type;
template<typename O, typename R, typename... Ps>
struct ret_type<R (O::*)(Ps...)> {
using type = R;
};
//
// usage
//
struct my_functor {
int operator() (long lval, short sval) {
return 5;
}
};
int main() {
using functor_func_call_type = decltype(&my_functor::operator());
using r = ret_type<functor_func_call_type>::type;
using p0 = param_type<0, functor_func_call_type>::type;
using p1 = param_type<1, functor_func_call_type>::type;
std::cout << (std::is_same<r, int>::value ? "true" : "false") << std::endl; // prints "true"
std::cout << (std::is_same<p0, long>::value ? "true" : "false") << std::endl; // prints "true"
std::cout << (std::is_same<p1, short>::value ? "true" : "false") << std::endl; // prints "true"
std::cout << param_cnt<functor_func_call_type>::cnt << std::endl; // prints "2"
return 0;
}
The problem with the modifications above is that they only work if a functor's function call operator isn't const
or volatile
.
// RECALL: lambdas are essentially shorthand for functors.
[](long l, short s) -> int { return 5; }; // FAIL: lambda's function call operator is const by default
[](long l, short s) mutable -> int { return 5; }; // OKAY: a mutable lambda's function call operator isn't const
struct f { int operator() (long l, short s) { return 5; } }; // OKAY: function call operator isn't const or volatile
struct f { int operator() (long l, short s) const { return 5; } }; // FAIL: function call operator is const
struct f { int operator() (long l, short s) volatile { return 5; } }; // FAIL: function call operator is volatile
struct f { int operator() (long l, short s) const volatile{ return 5; } }; // FAIL: function call operator is const & volatile
To support the failing cases shown above, even more template specializations are needed. For example, adding const
and volatile
support to ret_type
's template specializations yields a total of 4 template specializations.
template <typename Fn>
struct ret_type;
template<typename O, typename R, typename... Ps> // for functor whose function call operator is not const and not volatile
struct ret_type<R (O::*)(Ps...)> {
using type = R;
};
template<typename O, typename R, typename... Ps> // for functor whose function call operator is const but not volatile
struct ret_type<R (O::*)(Ps...) const> {
using type = R;
};
template<typename O, typename R, typename... Ps> // for functor whose function call operator is volatile but not const
struct ret_type<R (O::*)(Ps...) volatile> {
using type = R;
};
template<typename O, typename R, typename... Ps> // for functor whose function call operator is both const and volatile
struct ret_type<R (O::*)(Ps...) const volatile> {
using type = R;
};
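With all four specializations in place, the previously failing cases resolve. The sketch below repeats the specializations so that it compiles on its own. The names const_functor, plain_lambda, and mutable_lambda are hypothetical.

```cpp
#include <type_traits>

template <typename Fn>
struct ret_type;
template <typename O, typename R, typename... Ps>
struct ret_type<R (O::*)(Ps...)> { using type = R; };
template <typename O, typename R, typename... Ps>
struct ret_type<R (O::*)(Ps...) const> { using type = R; };
template <typename O, typename R, typename... Ps>
struct ret_type<R (O::*)(Ps...) volatile> { using type = R; };
template <typename O, typename R, typename... Ps>
struct ret_type<R (O::*)(Ps...) const volatile> { using type = R; };

struct const_functor {
    double operator()(int) const { return 0.0; }
};
auto plain_lambda = [](long, short) -> int { return 5; };               // call operator is const
auto mutable_lambda = [](long, short) mutable -> char { return 'x'; };  // call operator isn't const

using const_call_t = decltype(&const_functor::operator());
using lambda_call_t = decltype(&decltype(plain_lambda)::operator());
using mutable_call_t = decltype(&decltype(mutable_lambda)::operator());

static_assert(std::is_same<ret_type<const_call_t>::type, double>::value, "const functor");
static_assert(std::is_same<ret_type<lambda_call_t>::type, int>::value, "lambda (const by default)");
static_assert(std::is_same<ret_type<mutable_call_t>::type, char>::value, "mutable lambda");
```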
⚠️NOTE️️️⚠️
If you're aware of the type traits library, you might be tempted to use std::remove_cv<T>::type
. That won't actually remove the const
/ volatile
off of a function. See here. The most you can do is build your own version of std::remove_cv
for this usecase, as is done in the link. The full version of what's in the link is below.
template<typename T>
struct remove_cv_seq;
template<typename O, typename R, typename... Ps>
struct remove_cv_seq<R (O::*)(Ps...) const> {
using type = R (O::*)(Ps...);
};
template<typename O, typename R, typename... Ps>
struct remove_cv_seq<R (O::*)(Ps...) volatile> {
using type = R (O::*)(Ps...);
};
template<typename O, typename R, typename... Ps>
struct remove_cv_seq<R (O::*)(Ps...) const volatile> {
using type = R (O::*)(Ps...);
};
template<typename O, typename R, typename... Ps>
struct remove_cv_seq<R (O::*)(Ps...)> {
using type = R (O::*)(Ps...);
};
These are only required for functors and lambdas, not free functions: only non-static member functions can be declared const or volatile.
⚠️NOTE️️️⚠️
The following adapts the std::tuple
approach documented here to work with functors as well as functions.
template<typename Fn>
struct func_types; // unimplemented
// template specialization for functors
template<typename O, typename R, typename... Ps>
struct func_types<R (O::*)(Ps...)> {
using ret_t = R;
using params_as_tuple_t = std::tuple<Ps...>;
template<std::size_t N>
using param_t = typename std::tuple_element<N, params_as_tuple_t>::type;
static constexpr std::size_t param_cnt { std::tuple_size<params_as_tuple_t>{} };
};
// manipulators to remove cv-type off functor's function call operator
template<typename T>
struct remove_cv_seq;
template<typename O, typename R, typename... Ps>
struct remove_cv_seq<R (O::*)(Ps...) const> {
using type = R (O::*)(Ps...);
};
template<typename O, typename R, typename... Ps>
struct remove_cv_seq<R (O::*)(Ps...) volatile> {
using type = R (O::*)(Ps...);
};
template<typename O, typename R, typename... Ps>
struct remove_cv_seq<R (O::*)(Ps...) const volatile> {
using type = R (O::*)(Ps...);
};
template<typename O, typename R, typename... Ps>
struct remove_cv_seq<R (O::*)(Ps...)> {
using type = R (O::*)(Ps...);
};
struct my_functor {
int operator() (long lval, short sval) const volatile {
return 5;
}
};
int main() {
using functor_call_op_t = decltype(&my_functor::operator());
using functor_call_op_t_without_cv = remove_cv_seq<functor_call_op_t>::type;
using types = func_types<functor_call_op_t_without_cv>;
std::cout << (std::is_same<types::ret_t, int>::value ? "true" : "false") << std::endl; // prints "true"
std::cout << (std::is_same<types::param_t<0u>, long>::value ? "true" : "false") << std::endl; // prints "true"
std::cout << (std::is_same<types::param_t<1u>, short>::value ? "true" : "false") << std::endl; // prints "true"
std::cout << types::param_cnt;
return 0;
}
↩PREREQUISITES↩
⚠️NOTE️️️⚠️
This text below makes use of the type traits library in the C++ standard libraries. Type traits are templates that provide information about types (e.g if a type is signed or unsigned, if it has a copy constructor, etc...) and manipulate types. The link below jumps to that section. At this point, it's safe to read the type traits section because it doesn't require any background knowledge other than how to make use of templates, which is something you should already be aware of at this point.
🔍SEE ALSO🔍
Compile-time expressions can unify the usage of the template specializations discussed in the previous sections (unpacking functions and functors). The example function below picks the appropriate template specialization based on the type traits of the template parameter passed in. If the type is a function type, it's passed to the template specialization as-is. Otherwise, it's assumed to be a functor and the type of its function call operator is passed instead.
The function never actually gets invoked. Its sole purpose is to encapsulate and run compile-time expressions to determine type information.
// this is specifically targeting the ret_type<> template created above, you'd
// have one of these functions for param_type<> and param_cnt<> as well.
template<typename T>
struct unified_ret_type {
private:
static auto _fake() {
if constexpr (!std::is_function<T>::value) {
return ret_type<decltype(&T::operator())> {};
} else {
return ret_type<T> {};
}
}
public:
using type = typename decltype(unified_ret_type<T>::_fake())::type;
};
You can directly pass in a function type or a functor type directly into unified_ret_type<T>
's template parameter. If a functor type is passed in, unified_ret_type<T>
makes sure to forward its function call operator type.
using r = unified_ret_type<my_functor>::type; // pass in functor type
using r = unified_ret_type<decltype(my_func)>::type; // pass in function type
// RECALL: my_func is an actual function, it's put in decltype() to pull out its type
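For comparison, the sketch below shows a different way to get a similar unified interface without compile-time expressions. This is not the approach described above. It's an alternative in which the primary template assumes it has been given a functor and inherits from the specialization for that functor's function call operator. The name callable_ret is hypothetical, and only the non-const and const cases are covered.

```cpp
#include <type_traits>

// Primary template: assume Fn is a functor, forward to its call operator's type.
template <typename Fn>
struct callable_ret : callable_ret<decltype(&Fn::operator())> {};

// Specialization for function types.
template <typename R, typename... Ps>
struct callable_ret<R(Ps...)> { using type = R; };

// Specializations for function call operators (non-const and const).
template <typename O, typename R, typename... Ps>
struct callable_ret<R (O::*)(Ps...)> { using type = R; };
template <typename O, typename R, typename... Ps>
struct callable_ret<R (O::*)(Ps...) const> { using type = R; };

int my_func(long, short) { return 5; }
struct my_functor {
    int operator()(long, short) const { return 5; }
};

static_assert(std::is_same<callable_ret<decltype(my_func)>::type, int>::value, "function type");
static_assert(std::is_same<callable_ret<my_functor>::type, int>::value, "functor type");
```

Because partial specializations are preferred over the primary template whenever they match, function types and function call operator types never reach the inheriting primary template.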
⚠️NOTE️️️⚠️
The following adapts the std::tuple
approach documented here to work with functors as well as functions.
// manipulators to remove cv-type off functor's function call operator AND free functions
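// NOTE: the cv-qualified specializations for plain function types are valid C++, but a free function can't be declared const or volatile, so they never match the type of an actual free function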
template<typename T>
struct remove_cv_seq;
template<typename R, typename... Ps>
struct remove_cv_seq<R (Ps...) const> {
using type = R (Ps...);
};
template<typename R, typename... Ps>
struct remove_cv_seq<R (Ps...) volatile> {
using type = R (Ps...);
};
template<typename R, typename... Ps>
struct remove_cv_seq<R (Ps...) const volatile> {
using type = R (Ps...);
};
template<typename R, typename... Ps>
struct remove_cv_seq<R (Ps...)> {
using type = R (Ps...);
};
template<typename O, typename R, typename... Ps>
struct remove_cv_seq<R (O::*)(Ps...) const> {
using type = R (O::*)(Ps...);
};
template<typename O, typename R, typename... Ps>
struct remove_cv_seq<R (O::*)(Ps...) volatile> {
using type = R (O::*)(Ps...);
};
template<typename O, typename R, typename... Ps>
struct remove_cv_seq<R (O::*)(Ps...) const volatile> {
using type = R (O::*)(Ps...);
};
template<typename O, typename R, typename... Ps>
struct remove_cv_seq<R (O::*)(Ps...)> {
using type = R (O::*)(Ps...);
};
// function type breakers
//
template<typename Fn>
struct func_types; // unimplemented
// template specialization for functions
template<typename R, typename... Ps>
struct func_types<R(Ps...)> {
using ret_t = R;
using params_as_tuple_t = std::tuple<Ps...>;
template<std::size_t N>
using param_t = typename std::tuple_element<N, params_as_tuple_t>::type;
static constexpr std::size_t param_cnt { std::tuple_size<params_as_tuple_t>{} };
};
// template specialization for functors
template<typename O, typename R, typename... Ps>
struct func_types<R (O::*)(Ps...)> {
using ret_t = R;
using params_as_tuple_t = std::tuple<Ps...>;
template<std::size_t N>
using param_t = typename std::tuple_element<N, params_as_tuple_t>::type;
static constexpr std::size_t param_cnt { std::tuple_size<params_as_tuple_t>{} };
};
// unify template specializations
template<typename T>
struct unified_func_types {
private:
static auto _fake() {
if constexpr (!std::is_function<T>::value) {
using functor_call_op_t = decltype(&T::operator());
using functor_call_op_t_without_cv = typename remove_cv_seq<functor_call_op_t>::type;
return func_types<functor_call_op_t_without_cv> {};
} else {
using function_without_cv = typename remove_cv_seq<T>::type;
return func_types<function_without_cv> {};
}
}
public:
using types = decltype(unified_func_types<T>::_fake());
};
int my_func(long lval, short sval) {
return 5;
}
struct my_functor {
int operator() (long lval, short sval) {
return 5;
}
};
int main() {
// using types = unified_func_types<decltype(my_func)>::types;
using types = unified_func_types<my_functor>::types;
std::cout << (std::is_same<types::ret_t, int>::value ? "true" : "false") << std::endl; // prints "true"
std::cout << (std::is_same<types::param_t<0u>, long>::value ? "true" : "false") << std::endl; // prints "true"
std::cout << (std::is_same<types::param_t<1u>, short>::value ? "true" : "false") << std::endl; // prints "true"
std::cout << types::param_cnt;
return 0;
}
↩PREREQUISITES↩
⚠️NOTE️️️⚠️
This section makes use of the type traits library in the C++ standard libraries. Type traits are templates that provide information about types (e.g if a type is signed or unsigned, if it has a copy constructor, etc...). The link below jumps to that section. At this point, it's safe to read the type traits section because it doesn't require any background knowledge other than how to make use of templates, which is something you should already be aware of at this point.
🔍SEE ALSO🔍
In certain cases, a set of types substituted in for a template's parameters won't produce working code.
// declare
template <typename X, typename Y, typename Z>
X perform(Y& var1, Z& var2) {
return var1 + var2;
}
In the example above, Y
and Z
need to be types that support the plus operator (+) on each other (e.g. int
and short
) and the result must be of type X
(or convertible to X
). If types substituted for X
, Y
, and Z
don't satisfy those conditions, the compiler gives back cryptic compilation errors.
Concepts may be used within a template to produce more straightforward compilation errors for bad type substitutions: A concept is a predicate, evaluated at compile-time (not runtime), that ensures a set of substituted types support specific type traits (e.g. supports plus operator, has a specific member function, has a copy constructor, etc..). The compiler gives back easier to understand compilation errors when the predicate fails.
Concepts themselves are templates where the concept
keyword is used followed by its name and a compile-time evaluated expression that returns a bool
. For example, the concept below uses the type traits library to ensure that T
has both a default constructor and a copy constructor.
template <typename T>
concept DefaultAndCopy = std::is_default_constructible<T>::value && std::is_copy_constructible<T>::value;
A concept's expression can invoke other concepts. For example, the concept below makes use of the example concept above and includes an additional type traits check to ensure that T
also has a move constructor.
template <typename T>
concept DefaultAndCopyAndMove = DefaultAndCopy<T> && std::is_move_constructible<T>::value;
The C++ standard library includes a set of commonly used concepts. These concepts perform checks similar to the checks provided by the type traits library.
// equiv to DefaultAndCopyAndMove but written using the concepts library.
template <typename T>
concept DefaultAndCopyAndMove = std::default_initializable<T> && std::copy_constructible<T> && std::move_constructible<T>;
In cases where neither the type traits library nor the concepts library has the check you need, a special requires
clause (formally called a requires expression) can be used to directly specify exactly what a type needs to support. This requires
clause has a parameter list (exactly as if it were a function), and within its body is a list of expressions that must be supported by the types.
template <typename T1, typename T2, typename TR>
concept MyConcept =
requires(const T1* t1, const T2& t2) { // param list may also contain non-template types like int, float, ...
{ (*t1) + t2 } -> std::same_as<TR>;
{ (*t1) * t2 } -> std::same_as<TR>;
{ std::hash<T1>{}(*t1) } -> std::convertible_to<std::size_t>;
{ std::hash<T2>{}(t2) } -> std::convertible_to<std::size_t>;
}
&& std::is_default_constructible<T1>::value
&& std::is_default_constructible<T2>::value;
The requires
clause in the example above pretends as if it's a function taking a pointer to a const
(T1
) and an lvalue reference to a const
(T2
).
When T1 is dereferenced and either added to or multiplied by T2, it must produce a type of TR.
When T1 is dereferenced and passed to std::hash, it must produce a type that's convertible to size_t.
When T2 is passed to std::hash, it must produce a type that's convertible to size_t.
⚠️NOTE️️️⚠️
Note the syntax. Each statement in the body of requires
is an expression that must result in a type that passes its concept check.
To apply a concept to a template, add a requires
clause with a concept expression right after the template's parameter list (just before the declaration being templated). The concept expression is the exact same as the expressions used to define concepts: It's evaluated at compile-time, can reference type traits, can reference other concepts, can have a special parameter list requires
clause, and must return a bool
.
// templated function perform() using the concept "MyConcept" declared above.
template <typename T1, typename T2, typename TR>
requires MyConcept<T1, T2, TR>
TR perform(T1 t1, T2 t2) { /* ... implementation ... */ }
// templated function perform() embedding the rules for that same concept.
template <typename T1, typename T2, typename TR>
requires requires(const T1* t1, const T2& t2) {
{ (*t1) + t2 } -> std::same_as<TR>;
{ (*t1) * t2 } -> std::same_as<TR>;
{ std::hash<T1>{}(*t1) } -> std::convertible_to<std::size_t>;
{ std::hash<T2>{}(t2) } -> std::convertible_to<std::size_t>;
}
&& std::is_default_constructible<T1>::value
&& std::is_default_constructible<T2>::value // no semicolon here: the requires clause is part of the function's declaration
TR perform(T1 t1, T2 t2) {
/* ... implementation ... */
}
⚠️NOTE️️️⚠️
The requires requires
above is valid. The first requires
is saying that this template is performing checks, and the second requires
is the special parameter list requires
clause that lists out what operations T1
, T2
, and TR
must support.
If a concept only checks a single type, it's possible to use it just by substituting its name in place of the typename
/ class
for the template parameter that requires it (as opposed to using requires
shown above).
// concept
template <typename T>
concept SingleTypeConcept = requires(T a, T b) {
{ a + b } -> std::same_as<T>;
{ a * b } -> std::same_as<T>;
};
// usage of concept
template <SingleTypeConcept X> // this line is updated -- "typename X" replaced with "SingleTypeConcept X"
X add_and_multiply(X &var1, X &var2) {
X x { var1 + var2 };
return x * var2;
}
For function templates specifically, rather than parameterizing using template
, a common shorthand is to use auto
for the return type / parameter types being templated. The compiler automatically infers the correct types based on usage. Each auto
parameter / return type can have a concept applied to it by placing that concept's name just before auto
. For example, the usage of SingleTypeConcept
in the example above can be rewritten as follows.
// usage of concept
SingleTypeConcept auto add_and_multiply(
SingleTypeConcept auto &var1,
SingleTypeConcept auto &var2
) {
auto x { var1 + var2 };
return x * var2;
}
⚠️NOTE️️️⚠️
Be careful when making use of concepts like this. When const auto
is involved, it'll break compilation.
std::integral const auto f() {
return 0;
}
The above function won't compile because the concept's name must sit directly before auto, with nothing in between the two. You need to move the const either in front of the concept or after auto (const auto is the exact same as auto const, having const as the left-most thing is just a syntactical convenience).
std::integral auto const f() {
return 0;
}
⚠️NOTE️️️⚠️
Does this work with decltype(auto)
as well?
🔍SEE ALSO🔍
decltype(auto)
description)↩PREREQUISITES↩
Concepts can be used to overload based on type traits instead of actual types. In the example below, there are two overloads for the function multiply()
. The overload that gets used depends on if the type supports the multiplication operator (*) or the addition operation (+).
// Use this multiply() if the type supports the * operator
template<typename T>
requires requires(T a, T b) {
{ a * b } -> std::same_as<T>;
}
T multiply(T a, T b) {
return a * b;
}
// Use this multiply() if the type supports the + operator
template<typename T>
requires requires(T a, T b) {
{ a + b } -> std::same_as<T>;
}
T multiply(T a, T b) {
T ret {0}; // start from 0 so that adding a exactly b times produces a * b
for (T i {0}; i < b; i++) {
ret = ret + a;
}
return ret;
}
If the type supports both operators (such as int), the compiler will complain that it can't decide which overload to use. In such cases, you can add a non-template function overload for the type in question (regular function overload with the concrete type).
int multiply(int a, int b) {
return a * b;
}
Overload resolution always prefers a non-template function over a function template when both match the arguments equally well.
⚠️NOTE️️️⚠️
See here for more information.
Concepts can be used for template specializations similar to function overloading: A template specialization's template parameter can be set to a concept rather than an actual type. In the example below, a concept is used for a template specialization. The template specialization that gets used depends on if the type supports the multiplication operator (*) or the addition operation (+).
// Concepts to use
template<typename T>
concept AddableTypeConcept = requires(T a, T b) {
{ a + b } -> std::same_as<T>;
};
template<typename T>
concept MultiplyableTypeConcept = requires(T a, T b) {
{ a * b } -> std::same_as<T>;
};
// The base template
template<typename T>
T multiply(T a, T b) {
return -1;
}
// The template specializations
template<MultiplyableTypeConcept T>
T multiply(T a, T b) {
return a * b;
}
template<AddableTypeConcept T>
T multiply(T a, T b) {
T ret {0}; // start from 0 so that adding a exactly b times produces a * b
for (T i {0}; i < b; i++) {
ret = ret + a;
}
return ret;
}
If the type supports both operators (such as int
), the compiler will complain that it can't decide which template specialization to use.
⚠️NOTE️️️⚠️
Recall that, with concept function overloading, you can add a non-template function overload (overload using concrete types in the parameters) and the compiler will always default to that if there's ambiguity in which function template it should use. The same doesn't apply with template specializations: explicit specializations don't take part in overload resolution, so the compiler still has to pick between the templates first, and that choice is still ambiguous. I added the following template specialization and the compiler still complained about ambiguity when I did multiply(3, 5):
template<>
int multiply<int>(int a, int b) {
return a * b;
}
The following subsections detail some common concepts that are either useful or used heavily within the C++ standard library.
An ordered type is a type that can be compared. It typically overloads the 6 common relational operators: equals, not equals, less than, less than or equals, greater than, and greater than or equals. The example below provides a concept that ensures the type provides all of these relational operators.
template<typename T>
concept Ordering =
requires(T a, T b) {
{ a == b } -> std::convertible_to<bool>;
{ a != b } -> std::convertible_to<bool>;
{ a <= b } -> std::convertible_to<bool>;
{ a < b } -> std::convertible_to<bool>;
{ a > b } -> std::convertible_to<bool>;
{ a >= b } -> std::convertible_to<bool>;
};
⚠️NOTE️️️⚠️
You typically won't have to write this out by hand. The C++ standard library has the concepts std::three_way_comparable
and std::three_way_comparable_with
. The former makes sure that a type allows relational comparisons against the same type (same as the example above) while the latter allows relational comparisons against different types (e.g. comparing an int
against a long
).
Both concepts are related to the spaceship operator.
↩PREREQUISITES↩
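A semi-regular type is a common idea in C++, commonly referred to in documentation on C++ and the C++ standard library. A type is considered a semi-regular type if it has a ...
- default constructor (is a default constructible type)
- copy constructor (is a copy constructible type)
- copy assignment operator (is a copy assignable type)
- move constructor (is a move constructible type)
- move assignment operator (is a move assignable type)
- destructor (is a destructible type)
- usable std::swap(T, T) (is a swappable type)
template<typename T>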
concept SemiRegular =
std::is_default_constructible<T>::value &&
std::is_copy_constructible<T>::value &&
std::is_copy_assignable<T>::value &&
std::is_move_constructible<T>::value &&
std::is_move_assignable<T>::value &&
std::is_destructible<T>::value &&
std::is_swappable<T>::value;
⚠️NOTE️️️⚠️
This concept is built out using functionality provided by the type traits library. Even then, you don't need to write it yourself as the C++ standard library already provides the std::semiregular
concept.
↩PREREQUISITES↩
A regular type is a common idea in C++, commonly referred to in documentation on C++ and the C++ standard library. A type is considered regular type if it supports all the traits of a semi-regular type and it supports both the equality operator (==) and inequality operator (!=).
template<typename T>
concept Regular =
SemiRegular<T> &&
requires(T a, T b) {
{ a == b } -> std::convertible_to<bool>;
{ a != b } -> std::convertible_to<bool>;
};
⚠️NOTE️️️⚠️
You don't need to use this as the C++ standard library already provides the std::regular
concept.
⚠️NOTE️️️⚠️
The book and online documentation claim that regular types should behave similarly to built-in types like int.
One of the most basic use-cases for concepts is to require that a type be one of a set of known types (e.g. require that the type be either short
, int
, or long
). In the example below, a clever use of templates is used to test if the two types are equal, then a concept makes use of those templates to see if a type is contained in some larger set.
#include <iostream>
// templates
template<typename T, typename U>
struct is_same {
static constexpr bool value = false;
};
template<typename T>
struct is_same<T, T> {
static constexpr bool value = true;
};
// concept for a type that's one of a known set of integral types (short, int, or long)
template<typename T>
concept integral_check = is_same<T, short>::value || is_same<T, int>::value || is_same<T, long>::value;
// usage
template<integral_check T>
long square(T num) {
return num * num;
}
int main() {
std::cout << square(2) << std::endl;
std::cout << square(2L) << std::endl;
return 0;
}
⚠️NOTE️️️⚠️
In most cases, you shouldn't have to write out templates like is_same<>
yourself. The C++ standard library provides the type_traits
header library which contains std::is_same<>
and several other type checks. The C++ standard library also provides a set of pre-built concepts that check that a type has specific type traits. For example, std::is_same<>
is exposed as the concept std::same_as<>
.
Likewise, the C++ standard library provides a more elaborate version of integral_check<>
as std::integral<>
.
↩PREREQUISITES↩
To test a callable's parameter count within a concept, templates can be used to extract the parameter count. In the example below, the concept checks that a callable has exactly 1 parameter.
#include <cstddef>
#include <iostream>
// template(s) to extract parameter count
template <typename F>
struct argCnt;
template <typename R, typename ... As>
struct argCnt<R(*)(As...)> { static constexpr std::size_t cnt = sizeof...(As); }; // function pointers (what Fn deduces to in call(square_int))
template <typename R, typename ... As>
struct argCnt<R(As...)> { static constexpr std::size_t cnt = sizeof...(As); }; // plain function types (e.g. decltype(square_int))
// concept for a callable that has exactly 1 parameter
template<typename Fn>
concept MySpecialFunction = argCnt<Fn>::cnt == 1;
// usage
template<MySpecialFunction Fn>
decltype(auto) call(Fn fn) {
return fn(2);
}
int square_int(int num) {
return num * num;
}
long square_long(long num) {
return num * num;
}
int main() {
std::cout << call(square_int) << std::endl;
std::cout << call(square_long) << std::endl;
return 0;
}
⚠️NOTE️️️⚠️
This section comes from my question on stackoverflow. Everything was tested on g++12.1 using C++20 standard. Newer versions of C++ or the g++ compiler might have better stuff to handle these types of requirements.
↩PREREQUISITES↩
To test a callable's parameter type within a concept, templates can be used to extract the parameter type. In the example below, the concept checks that a callable's first parameter has a type conforming to the concept std::integral
.
#include <concepts>
#include <cstddef>
#include <iostream>
#include <type_traits>
// template(s) to extract parameter types
template <std::size_t N, typename T0, typename ... Ts>
struct typeN { using type = typename typeN<N-1U, Ts...>::type; };
template <typename T0, typename ... Ts>
struct typeN<0U, T0, Ts...> { using type = T0; };
template <std::size_t, typename F>
struct argN;
template <std::size_t N, typename R, typename ... As>
struct argN<N, R(*)(As...)> { using type = typename typeN<N, As...>::type; }; // needed for std::integral<>
template <std::size_t N, typename R, typename ... As>
struct argN<N, R(As...)> { using type = typename typeN<N, As...>::type; }; // needed for std::is_integral_v<>
// concept for a function whose first parameter's type is an integral type
template<typename Fn>
concept MySpecialFunction = std::integral<typename argN<0U, Fn>::type>;
// usage
template<MySpecialFunction Fn>
decltype(auto) call(Fn fn) {
return fn(2);
}
int square_int(int num) {
return num * num;
}
long square_long(long num) {
return num * num;
}
int main() {
std::cout << call(square_int) << std::endl;
std::cout << call(square_long) << std::endl;
return 0;
}
// type trait checks using static_assert() -- not necessary
static_assert( std::is_integral_v<typename argN<0U, decltype(square_int)>::type> );
static_assert( std::is_integral_v<typename argN<0U, decltype(square_long)>::type> );
⚠️NOTE️️️⚠️
This section comes from my question on stackoverflow. Everything was tested on g++12.1 using C++20 standard. Newer versions of C++ or the g++ compiler might have better stuff to handle these types of requirements.
⚠️NOTE️️️⚠️
Instead of using the templates shown above, one other solution is to use Boost's type traits library: function_traits<my_func>::argN_type
⚠️NOTE️️️⚠️
Cleverly using templates as shown above is the most robust way to check a parameter's type. But, if your requirements aren't overly complex, there may be simpler ways.
SCENARIO 1: Testing for a known concrete type
In this scenario, the requirement is that a callable's parameter type be a concrete type that's known beforehand (e.g. int
). The concept for the callable itself can simply use a parameter list requires
clause.
// concept for a function that takes in a single argument of type int
template <typename Fn>
concept MySpecialFunction = requires(Fn f, int t) {
{ f(t) } -> std::same_as<int>;
};
SCENARIO 2: Testing for a set of known concrete types
In this scenario, the requirement is that a callable's parameter be one of a set of concrete types that's known beforehand (e.g. int
or long
). The concept for the callable can be exploded out into several sub-concepts: Each sub-concept checks that the callable's parameter type matches a specific concrete type, then those sub-concepts combine to form the full concept.
// concept that combines the two sub-concepts: checks for a function has a single parameter of type int or long
template<typename Fn>
concept MySpecialFunction1 = requires(Fn f, int i) { // sub-concept1: func that has a single parameter of type int
{ f(i) } -> std::same_as<decltype(i)>;
};
template<typename Fn>
concept MySpecialFunction2 = requires(Fn f, long l) { // sub-concept2: func that has a single parameter of type long
{ f(l) } -> std::same_as<decltype(l)>;
};
template<typename Fn>
concept MySpecialFunction = MySpecialFunction1<Fn> || MySpecialFunction2<Fn>; // final concept: func that has a single parameter of type int or long
// usage
template<MySpecialFunction Fn>
decltype(auto) call(Fn f) {
return f(2);
}
int square_int(int num) {
return num * num;
}
long square_long(long num) {
return num * num;
}
int main() {
std::cout << call(square_int) << std::endl;
std::cout << call(square_long) << std::endl;
return 0;
}
The problem with exploding out to sub-concepts is that the number of sub-concepts can get very large. For example, if the callable should have 4 parameters and each of those parameters should be of type int
, long
, short
, or void*
, that's 256 different sub-concepts to list out.
// sub-concepts for function that takes in 4 params:
// param1: int|long|short|void*
// param2: int|long|short|void*
// param3: int|long|short|void*
// param4: int|long|short|void*
//
// 4^4=256 sub-concepts required, not really feasible to code something like this out
template<typename Fn>
concept MySpecialFunction1 = requires(Fn f, int p1, int p2, int p3, int p4) {
{ f(p1, p2, p3, p4) } -> std::same_as<long>;
};
template<typename Fn>
concept MySpecialFunction2 = requires(Fn f, int p1, int p2, int p3, long p4) {
{ f(p1, p2, p3, p4) } -> std::same_as<long>;
};
template<typename Fn>
concept MySpecialFunction3 = requires(Fn f, int p1, int p2, int p3, short p4) {
{ f(p1, p2, p3, p4) } -> std::same_as<long>;
};
template<typename Fn>
concept MySpecialFunction4 = requires(Fn f, int p1, int p2, int p3, void* p4) {
{ f(p1, p2, p3, p4) } -> std::same_as<long>;
};
template<typename Fn>
concept MySpecialFunction5 = requires(Fn f, int p1, int p2, long p3, int p4) {
{ f(p1, p2, p3, p4) } -> std::same_as<long>;
};
template<typename Fn>
concept MySpecialFunction6 = requires(Fn f, int p1, int p2, long p3, long p4) {
{ f(p1, p2, p3, p4) } -> std::same_as<long>;
};
...
template<typename Fn>
concept MySpecialFunction256 = requires(Fn f, void* p1, void* p2, void* p3, void* p4) {
{ f(p1, p2, p3, p4) } -> std::same_as<long>;
};
// combine sub-concepts together into final concept
template<typename Fn>
concept MySpecialFunction =
MySpecialFunction1<Fn>
|| MySpecialFunction2<Fn>
|| MySpecialFunction3<Fn>
|| MySpecialFunction4<Fn>
|| MySpecialFunction5<Fn>
|| MySpecialFunction6<Fn>
|| ...
|| MySpecialFunction256<Fn>;
// usage
template<typename Fn>
requires MySpecialFunction<Fn>
decltype(auto) call(Fn fn) {
return fn(1, 2, 3, 4);
}
long multiply(int num1, long num2, short num3, long num4) {
return num1 * num2 * num3 * num4;
}
int main() {
std::cout << call(multiply) << std::endl;
return 0;
}
One potential workaround to the sub-concept explosion problem shown in the example above is to use a parameter list requires
clause: Each of the 4 parameter types gets fed into the top-level concept as a template parameter and requirements are individually tested on each of those template parameters.
// function that takes in 4 params:
// param1: int|long|short|void*
// param2: int|long|short|void*
// param3: int|long|short|void*
// param4: int|long|short|void*
template<typename Fn, typename P1, typename P2, typename P3, typename P4>
concept MySpecialFunction =
(std::same_as<P1, int> || std::same_as<P1, long> || std::same_as<P1, short> || std::same_as<P1, void*>)
&& (std::same_as<P2, int> || std::same_as<P2, long> || std::same_as<P2, short> || std::same_as<P2, void*>)
&& (std::same_as<P3, int> || std::same_as<P3, long> || std::same_as<P3, short> || std::same_as<P3, void*>)
&& (std::same_as<P4, int> || std::same_as<P4, long> || std::same_as<P4, short> || std::same_as<P4, void*>)
&& requires(Fn f, P1 p1, P2 p2, P3 p3, P4 p4) {
{ f(p1, p2, p3, p4) } -> std::same_as<long>;
};
Doing this removes the sub-concept explosion problem, but it introduces a new problem of the compiler losing the ability to infer template parameters from usage. In the example below, the concept for the callable is concise, but usages of call()
now need to explicitly specify what each template argument is because the C++ compiler is no longer able to infer them on its own.
// function that takes in 4 params:
// param1: int|long|short|void*
// param2: int|long|short|void*
// param3: int|long|short|void*
// param4: int|long|short|void*
template<typename Fn, typename P1, typename P2, typename P3, typename P4>
concept MySpecialFunction =
(std::same_as<P1, int> || std::same_as<P1, long> || std::same_as<P1, short> || std::same_as<P1, void*>)
&& (std::same_as<P2, int> || std::same_as<P2, long> || std::same_as<P2, short> || std::same_as<P2, void*>)
&& (std::same_as<P3, int> || std::same_as<P3, long> || std::same_as<P3, short> || std::same_as<P3, void*>)
&& (std::same_as<P4, int> || std::same_as<P4, long> || std::same_as<P4, short> || std::same_as<P4, void*>)
&& requires(Fn f, P1 p1, P2 p2, P3 p3, P4 p4) {
{ f(p1, p2, p3, p4) } -> std::same_as<long>;
};
// usage
template<typename Fn, typename P1, typename P2, typename P3, typename P4>
requires MySpecialFunction<Fn, P1, P2, P3, P4>
decltype(auto) call(Fn fn) {
return fn(1, 2, 3, 4);
}
long multiply(int num1, long num2, short num3, long num4) {
return num1 * num2 * num3 * num4;
}
int main() {
// std::cout << call(multiply) << std::endl; // <--- WON'T COMPILE because template parameters can't be inferred by the compiler
std::cout << call<decltype(multiply), int, long, short, long>(multiply) << std::endl; // <--- WILL COMPILE because template parameters explicitly listed,
return 0;
}
↩PREREQUISITES↩
To test a callable's return type within a concept, templates can be used to extract the type. In the example below, the concept checks that a callable has an integral return type.
#include <concepts>
#include <iostream>
// template(s) to extract return types
template <typename F>
struct returnType;
template <typename R, typename ... As>
struct returnType<R(*)(As...)> { using type = R; };
template <typename R, typename ... As>
struct returnType<R(As...)> { using type = R; };
// concept for a function whose return type is an integral type
template<typename Fn>
concept MySpecialFunction =
std::integral<typename returnType<Fn>::type>;
// usage
template<MySpecialFunction Fn>
decltype(auto) call(Fn fn) {
return fn(2);
}
int square_int(int num) {
return num * num;
}
long square_long(long num) {
return num * num;
}
int main() {
std::cout << call(square_int) << std::endl;
std::cout << call(square_long) << std::endl;
return 0;
}
⚠️NOTE️️️⚠️
This section comes from my question on stackoverflow. Everything was tested on g++12.1 using C++20 standard. Newer versions of C++ or the g++ compiler might have better stuff to handle these types of requirements.
⚠️NOTE️️️⚠️
Instead of using the templates shown above, one other solution is to use Boost's type traits library: function_traits<my_func>::result_type
Somewhat related as well from the C++ standard library: std::invoke_result (its predecessor, std::result_of, was deprecated in C++17 and removed in C++20).
⚠️NOTE️️️⚠️
Cleverly using templates as shown above is the most robust way to check a callable's return type. But, if your requirements aren't overly complex, it may be feasible to use simpler checks such as those discussed in the parameter types section before this section. For example, if the scenario allows for it, a concept check can be reduced to just a set of parameter list requires
clauses being logically or'd together.
// concept for a function whose return type is an integral type
template<typename Fn>
concept MySpecialFunction =
requires(Fn f, int i) {
{ f(i) } -> std::same_as<int>;
}
|| requires(Fn f, long l) {
{ f(l) } -> std::same_as<long>;
};
// usage
template<MySpecialFunction Fn>
decltype(auto) call(Fn fn) {
return fn(2);
}
int square_int(int num) {
return num * num;
}
long square_long(long num) {
return num * num;
}
int main() {
std::cout << call(square_int) << std::endl;
std::cout << call(square_long) << std::endl;
return 0;
}
↩PREREQUISITES↩
A coroutine is a function that can suspend its own execution and have it be continued at a later time. Similar to async functions in Javascript, C++ coroutines can work with promise objects (objects that do work asynchronously). A function can be made into a coroutine by using any of the following:
- co_await - suspend execution waiting for a promise to finish.
- co_yield - suspend execution and optionally return a value.
- co_return - complete execution and optionally return a value.
The return type of a coroutine must provide a "promise type", a C++ class that has a specific structure and specific set of functionality that the compiler calls to determine and control the coroutine's state.
⚠️NOTE️️️⚠️
This is deeply convoluted and requires a lot more digging and documentation, possibly in its own section instead of sub-section under the Function header.
#include <iostream>
#include <cstdlib>
#include <coroutine>
struct Resumable {
struct promise_type; // forward declaration
Resumable(std::coroutine_handle<promise_type> coro) : coro(coro) {}
~Resumable() {
coro.destroy();
}
void destroy() { coro.destroy(); }
void resume() { coro.resume(); }
private:
std::coroutine_handle<promise_type> coro;
};
struct Resumable::promise_type {
auto get_return_object() { return Resumable(std::coroutine_handle<Resumable::promise_type>::from_promise(*this)); }
auto initial_suspend() { return std::suspend_never(); }
auto final_suspend() noexcept { return std::suspend_always(); } // stay suspended at the end so that ~Resumable() is the one place the coroutine gets destroyed
auto yield_value(int value) {
current_value = value;
return std::suspend_always{};
}
void return_void() { }
void unhandled_exception() { }
int current_value;
};
Resumable range(int start, int end) {
while (start < end) {
co_yield start;
std::cout << start << '\n';
start++;
}
co_return;
}
int main() {
auto x {range(0, 10)};
x.resume(); // prints 0
x.resume(); // prints 1
x.resume(); // prints 2
}
⚠️NOTE️️️⚠️
It's said that the coroutine state is kept on the heap, resulting in C++ coroutines being a performance hog. Maybe it's possible to use a custom allocator to work around performance problems?
awaitable::await_suspend() is where you launch some async operation that eventually calls h.resume() to continue the coroutine.
#include <coroutine>
#include <iostream>
#include <stdexcept>
#include <thread>
auto switch_to_new_thread(std::jthread& out) {
struct awaitable {
std::jthread* p_out;
bool await_ready() { return false; }
void await_suspend(std::coroutine_handle<> h) {
std::jthread& out = *p_out;
if (out.joinable())
throw std::runtime_error("Output jthread parameter not empty");
out = std::jthread([h] { h.resume(); });
// Potential undefined behaviour: accessing potentially destroyed *this
// std::cout << "New thread ID: " << p_out->get_id() << '\n';
std::cout << "New thread ID: " << out.get_id() << '\n'; // this is OK
}
void await_resume() {}
};
return awaitable{&out};
}
struct task{
struct promise_type {
task get_return_object() { return {}; }
std::suspend_never initial_suspend() { return {}; }
std::suspend_never final_suspend() noexcept { return {}; }
void return_void() {}
void unhandled_exception() {}
};
};
task resuming_on_new_thread(std::jthread& out) {
std::cout << "Coroutine started on thread: " << std::this_thread::get_id() << '\n';
co_await switch_to_new_thread(out);
// awaiter destroyed here
std::cout << "Coroutine resumed on thread: " << std::this_thread::get_id() << '\n';
}
int main() {
std::jthread out;
resuming_on_new_thread(out);
}
C++ unions are a set of variables that share the same underlying memory. Each union takes up only as much memory as its largest member needs (plus any padding required for alignment).
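union MyUnion {
char raw[100];
short num_int;
double num_dbl;
}; // semicolon required after the closing brace
MyUnion x;
// set all bytes of raw to 0
for (std::size_t i {0}; i < sizeof(x.raw); i++) {
x.raw[i] = 0;
}
// since all members of the union start at the same memory location, these
// will likely both be 0 (unless short or double has a byte size of over
// 100). Strictly speaking, reading a member other than the one most recently
// written to is undefined behaviour in C++, even though compilers commonly allow it.
int a = x.num_int;
int b = x.num_dbl;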
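The layout claims above can be checked without reading an inactive member. The sketch below is only an illustration: members_share_address and store_then_load are made-up helper names, and the exact size of the union depends on the platform's alignment rules.

```cpp
union MyUnion {
    char raw[100];
    short num_short;
    double num_dbl;
};

// The union is at least as large as its largest member and at least as aligned as its strictest one.
static_assert(sizeof(MyUnion) >= sizeof(char[100]));
static_assert(alignof(MyUnion) >= alignof(double));

// Every member starts at the same address as the union itself.
bool members_share_address() {
    MyUnion u {};
    return static_cast<void*>(&u.raw) == static_cast<void*>(&u.num_short)
        && static_cast<void*>(&u.raw) == static_cast<void*>(&u.num_dbl);
}

// Writing a member makes it the active one; reading that same member back is well-defined.
double store_then_load(double value) {
    MyUnion u {};
    u.num_dbl = value;
    return u.num_dbl;
}
```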
🔍SEE ALSO🔍
↩PREREQUISITES↩
Namespaces are C++'s way of organizing code into a logical hierarchy / avoiding naming conflicts, similar to packages in Java or Python. Unlike packages, namespaces don't use the filesystem to define their logical hierarchy. Instead, the hierarchy is specified directly in code using namespace
blocks.
namespace FirstLevel {
namespace MiddleLevel {
namespace LastLevel {
struct MyStruct {
int count;
bool flag;
};
}
}
}
The nesting in the example above is avoidable via the scope operator (::).
namespace FirstLevel::MiddleLevel::LastLevel {
struct MyStruct {
int count;
bool flag;
};
}
To use the symbols within a namespace, either include them directly or bring all symbols within the namespace to the forefront via the using
keyword (similar to Java's import
or Python's from
/ import
).
// Use namespace directly.
FirstLevel::MiddleLevel::LastLevel::MyStruct x{};
// Bring all symbols within a namespace to the forefront.
using namespace FirstLevel::MiddleLevel::LastLevel;
MyStruct y{};
// Bring a single symbol within a namespace to the forefront.
using FirstLevel::MiddleLevel::LastLevel::MyStruct;
MyStruct z{};
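Both forms of using can also be scoped to a single function rather than the whole file, which limits how far the imported names leak. A small self-contained sketch (the via_* function names are made up):

```cpp
namespace FirstLevel::MiddleLevel::LastLevel {
    struct MyStruct {
        int count;
        bool flag;
    };
}

// Fully qualified name, nothing imported.
int via_full_name() {
    FirstLevel::MiddleLevel::LastLevel::MyStruct x {1, true};
    return x.count;
}

// using namespace: every symbol in the namespace is visible until the end of this function.
int via_using_directive() {
    using namespace FirstLevel::MiddleLevel::LastLevel;
    MyStruct y {2, false};
    return y.count;
}

// using with a single symbol: only MyStruct is visible until the end of this function.
int via_using_declaration() {
    using FirstLevel::MiddleLevel::LastLevel::MyStruct;
    MyStruct z {3, true};
    return z.count;
}
```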
A special type of namespace, called an unnamed namespace, limits the visibility of the code to the containing translation unit. That means you can't reference an unnamed namespace in some other translation unit: It behaves as if you gave the namespace a unique name and never referenced that namespace outside of the translation unit.
// A.cpp
namespace {
void help() {
// ... code removed ...
}
}
// B.cpp (a separate translation unit)
void help() { // won't conflict with the help() in A.cpp when the two are linked together
// ... code removed ...
}
↩PREREQUISITES↩
Modifiers on a variable or function declaration are used to control how the linker behaves. Specifically, the modifiers can ask the linker to automatically ...
- merge definitions that appear in multiple translation units (inline)
- find the definition in another translation unit (extern)
- keep the function / variable private to its translation unit (static).
A static function or variable is one that's only visible to other code in the same translation unit. The linker will make sure that the function doesn't intermingle with other translation units.
Static functions/variables have the static
modifier applied.
static int add(int a, int b) {
return a + b;
}
⚠️NOTE️️️⚠️
This is only for non-members (not belonging to a class).
The meaning of static
changes when the function or variable belongs to a class. When applied to a member function (method), it means that the function isn't bound to any instance of the class -- it can't access fields belonging to an instance.
An inline function or variable is one that may be defined in multiple different translation units. The linker will make sure all translation units use a single instance of that function/variable even though it may have been defined multiple times.
Inline functions/variables have the inline
modifier applied.
inline int add(int a, int b) {
return a + b;
}
⚠️NOTE️️️⚠️
See this. Typically, the compiler applies inline
automatically based on what it sees, meaning that it isn't something that should be added by the programmer in most cases. The only exception to that seems to be templates? See some of the other answers in the linked stack overflow question.
⚠️NOTE️️️⚠️
The original intent of inline
was to indicate to the compiler that embedding a copy of the function for an invocation was preferred over a function call. The reason being that in certain cases the code would be faster if it were embedded rather than having it branch into a function call.
An external function or variable is one that's usable within the translation unit but isn't defined there. The linker will sort out where the function is when the time comes.
External linkage functions/variables have the extern
modifier applied.
extern int add(int a, int b);
⚠️NOTE️️️⚠️
Sounds similar to forward declaration but across different translation units?
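The three modifiers side by side, squeezed into a single translation unit so that it compiles on its own. This is only a sketch: twice, shared_counter, and bump are made-up names, and in a real project the definition of shared_counter would usually live in a different .cpp file.

```cpp
// static: internal linkage, only code in this translation unit can see add().
static int add(int a, int b) {
    return a + b;
}

// inline: the same definition may appear in several translation units (typically via a header).
inline int twice(int a) {
    return a * 2;
}

// extern: declaration only, the linker finds the definition elsewhere.
extern int shared_counter;

int bump() {
    return ++shared_counter;
}

// The definition that satisfies the extern declaration above.
int shared_counter {41};
```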
C++ flow control structures are similar to those in other high-level languages (e.g. Java), with the exception that ...
⚠️NOTE️️️⚠️
An important caveat about loops in C++ from cppreference.com:
As part of the C++ forward progress guarantee, the behaviour is undefined if a loop that has no observable behaviour (does not make calls to I/O functions, access volatile objects, or perform atomic or synchronization operations) does not terminate. Compilers are permitted to remove such loops.
If statements follow a similar structure to if statements in Java. The only major difference is that an initializer statement is allowed before the condition in the initial if
.
if (int r {rand()}; r % 2 == 0) {
std::cout << r << " even";
} else if (r % 5 == 0) {
std::cout << r << " div by 5";
} else {
std::cout << r << " odd";
}
In the example above, an initializer statement has been added that sets a variable to a random number. That variable is only accessible inside the different branches of the if statement.
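A deterministic sketch of the same scoping rule, using a made-up function named describe so the behaviour doesn't depend on rand().

```cpp
#include <string>

std::string describe(int r) {
    if (int rem {r % 10}; rem == 0) {
        return "ends in 0";
    } else if (rem == 5) {
        return "ends in 5";  // rem is still in scope in the later branches
    } else {
        return "ends in " + std::to_string(rem);
    }
    // rem is no longer accessible past the end of the if statement
}
```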
Switch statements follow a similar structure to switch statements in Java. The only major difference is that an initializer statement is allowed before the condition.
switch (int r {rand()}; r % 2) {
case 0:
std::cout << r << " even";
break;
case 1:
std::cout << r << " odd";
break;
default:
std::cout << "this should never happen";
break;
}
To indicate to the compiler that a fallthrough case is intended behaviour, use the [[fallthrough]]
attribute.
switch (int r {rand()}; r % 2) {
case 0:
std::cout << r << " even";
[[fallthrough]]; // goes on its own empty statement, right before the next case label
case 1:
std::cout << r << " odd";
break;
default:
std::cout << "this should never happen";
break;
}
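A small sketch where falling through is actually the desired behaviour: each case does its own step and then continues into the steps below it. steps_from is a made-up function.

```cpp
// Hypothetical helper: counts how many steps run when starting from a given level.
int steps_from(int level) {
    int steps {0};
    switch (level) {
    case 0:
        steps++;           // step for level 0
        [[fallthrough]];   // deliberately continue into case 1
    case 1:
        steps++;           // step for level 1
        [[fallthrough]];   // deliberately continue into case 2
    case 2:
        steps++;           // step for level 2
        break;
    default:
        break;             // unknown level, nothing to do
    }
    return steps;
}
```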
↩PREREQUISITES↩
For loops follow a similar structure to for loops in Java.
for (int i {0}; i < 10; i++) {
std::cout << i;
}
Similarly, an analog to Java's for-each loop exists called range-based for loops. The only major difference is that an initializer statement is allowed (optional) before the range declaration.
for (int r {rand()}; int val : array) {
std::cout << (r + val) << ' ';
}
For-each loops, sometimes also called range-based for loops, are translated differently based on the type that's being looped over. Specifically, if the type being looped over is an ...
array:
// FOR-EACH LOOP
for (int val : array) {
std::cout << val << ' ';
}
//
// TRANSLATION OF THE ABOVE LOOP AS A STANDARD FOR LOOP
//
{
auto && __range = array;
auto __begin = array;
auto __end = array + std::size(array);
for ( ; __begin != __end; ++__begin){
int val = *__begin;
std::cout << val << ' ';
}
}
an object that has the member functions begin()
and end()
:
// FOR-EACH LOOP
for (int val : obj) {
std::cout << val << ' ';
}
//
// TRANSLATION OF THE ABOVE LOOP AS A STANDARD FOR LOOP
//
{
auto && __range = obj;
auto __begin = obj.begin();
auto __end = obj.end();
for ( ; __begin != __end; ++__begin){
int val = *__begin;
std::cout << val << ' ';
}
}
🔍SEE ALSO🔍
an object for which free functions begin(T)
and end(T)
are found through argument-dependent lookup:
// FOR-EACH LOOP
for (int val : obj) {
std::cout << val << ' ';
}
//
// TRANSLATION OF THE ABOVE LOOP AS A STANDARD FOR LOOP
//
{
auto && __range = obj;
auto __begin = begin(obj); // found via argument-dependent lookup, not necessarily std::begin
auto __end = end(obj);
for ( ; __begin != __end; ++__begin){
int val = *__begin;
std::cout << val << ' ';
}
}
🔍SEE ALSO🔍
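To make the member function case concrete, here is a small sketch of a user-defined type that works with a for-each loop. The class name `SmallBuffer` and the function `sum` are hypothetical, chosen only for illustration. Raw pointers are used as the iterators because they already support `!=`, `++`, and `*`.

```cpp
#include <cstddef>

// Hypothetical fixed-size container exposing begin() / end() member functions.
class SmallBuffer {
public:
    int* begin() { return data_; }
    int* end() { return data_ + size_; }
private:
    static constexpr std::size_t size_ {4};
    int data_[size_] {10, 20, 30, 40};
};

// Sums the elements using a for-each loop.
int sum(SmallBuffer& buffer) {
    int total {0};
    for (int val : buffer) { // uses buffer.begin() and buffer.end()
        total += val;
    }
    return total;
}
```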
While and do-while loops follow a similar structure to their counterparts in Java.
int r {rand() % 5};
while (r > 0) {
std::cout << r << " ";
r--;
}
int r {rand() % 5};
do {
std::cout << r << " ";
r--;
} while (r > 0); // semicolon required at the end
⚠️NOTE️️️⚠️
Unlike other control structures, these loops cannot have initializer statements.
Unlike most other high-level languages (e.g. Java), C++ allows the use of goto statements. However, note that goto statements are generally considered bad practice and should somehow be refactored to higher-level constructs (e.g. loops, if statements, etc..).
retry:
int r {rand()};
if (r % 2 == 0) {
goto retry;
}
std::cout << r << " odd";
Conditional branching operations in flow control statements may have the [[likely]]
and [[unlikely]]
attributes applied to hint at how likely it is that execution will take that path. This allows for better optimization by the compiler (based on your assumptions).
switch (exit_code) {
case 0:
// happy path
break;
case 1:
// recognized error path
break;
[[unlikely]] default:
// unrecognized error path
break;
}
if (is_valid(email)) [[likely]] {
// happy path
} else {
// error path
}
while (i > 0) [[unlikely]] {
// do something
}
⚠️NOTE️️️⚠️
I read something online saying you shouldn't use both [[likely]]
and [[unlikely]]
on the same switch/if/while/etc...
↩PREREQUISITES↩
C++ attributes are similar to annotations in Java, providing information to the user / compiler about the code that they're applied to. Unlike Java, C++ compilers are free to pick and choose which attributes they support and how they support them. There is no guarantee what action a compiler will take, if any, when it sees an attribute (e.g. compiler warnings).
An attribute is applied by nesting it in double square brackets (e.g. [[noreturn]]
) and placing it as a modifier on the function.
[[noreturn]] void fail() {
throw std::runtime_error { "Failed" };
}
Common attributes:
attribute | description |
---|---|
[[deprecated("msg")]] |
Indicates that a function is deprecated. Message is optional. |
[[noreturn]] |
Indicates that a function doesn't return. |
[[fallthrough]] |
Indicates that a switch case was explicitly designed to fall through to the next case (no break / return / etc.. intended). |
[[nodiscard]] |
Indicates that a function's result should be used somehow (produce compiler warning). |
[[maybe_unused]] |
Indicates that an entity (e.g. variable, parameter, function) might intentionally go unused (avoid compile warning). |
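The sketch below applies three of the attributes from the table. The function names (`parse_digit`, `old_parse_digit`, `sum_digits`) are hypothetical. Whether a warning actually shows up depends on the compiler and its flags.

```cpp
#include <stdexcept>

// [[nodiscard]]: compilers typically warn if a caller ignores the result.
[[nodiscard]] int parse_digit(char c) {
    if (c < '0' || c > '9') {
        throw std::invalid_argument { "not a digit" };
    }
    return c - '0';
}

// [[deprecated]]: compilers typically warn at each call site.
[[deprecated("use parse_digit instead")]] int old_parse_digit(char c) {
    return parse_digit(c);
}

// [[maybe_unused]]: suppresses unused warnings for the parameter 'verbose'.
int sum_digits(char a, char b, [[maybe_unused]] bool verbose) {
    return parse_digit(a) + parse_digit(b);
}
```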
🔍SEE ALSO🔍
↩PREREQUISITES↩
Compile-time evaluations allow code to be executed at compile-time rather than run-time. For example, compile-time code could be used to compute a constant used by your code, to test features of the compiler / platform (e.g. endian-ness, bytes in an int
, etc..), or to test type traits (e.g. does it support the +
operator).
The subsections below detail the various mechanisms for writing compile-time code and their restrictions.
↩PREREQUISITES↩
Compile-time evaluations allow code to be executed at compile-time rather than run-time. For example, rather than hardcoding a magic number, a piece of code can run during compilation that calculates that magic number and automatically hardcodes it in behind the scenes. Calculating a magic number like this results in cleaner and more understandable code because the developer can see how the magic number is derived and can even tweak the code that calculates it.
consteval int sqr(int n) {
return n * n;
}
int y { sqr(17) }; // compiles exactly the same as initializing y directly to 289
Compile-time evaluations are enabled through the following keywords: consteval
, constinit
, and constexpr
.
consteval
- A function with the consteval
specifier is referred to as an immediate function. An immediate function's invocation always gets executed during compilation, where the result of that function is swapped in for its invocation in the compiled code. This is essentially the same as using a compile-time constant, but the compile-time constant is generated through code.
consteval int sqr(int n) {
return n * n;
}
int main() {
int x {sqr(7)}; // during compilation, sqr(7) is replaced with the result of sqr(7)
std::cout << x << std::endl;
}
A consteval
function's body and invocations must only reference other functions and variables that are available at compile-time (e.g. it can call another consteval
function). Similarly, arguments being passed to it during invocation must be available at compile-time as well.
consteval int sqr(int n) {
return n * n;
}
int main() {
int divide_val { 2 };
int x {sqr(7 / divide_val)}; // ERROR: even though it looks like it should be, var 'divide_val' is not guaranteed to be known at compile-time
std::cout << x << std::endl;
}
constinit
- A variable with the constinit
specifier is one that has its initializer executed during compilation, where the result of that execution becomes the variable's initial value when the program starts. It works similarly to consteval
in that its initializer must only reference other functions and variables that are available at compile-time.
constinit int x {5+4/3};
int main() {
std::cout << x << std::endl;
return 0;
}
The difference between a const
variable and constinit
variable is that the former only guarantees the variable is unmodifiable. It doesn't actually guarantee that the expression within is evaluated at compile-time.
constinit int x {5+4/3};
int main() {
std::cout << x << std::endl;
x = 77; // THIS IS OKAY because x is constinit but not const
std::cout << x << std::endl;
return 0;
}
constinit
can only be applied to variables with static storage duration.
⚠️NOTE️️️⚠️
It can also be applied to thread-local storage duration, if you know what that is.
constexpr
- A function or variable with the constexpr
specifier, referred to as constant expression, acts as if it were a consteval
/ constinit
as long as it only references other objects (functions / variables) that are available at compile-time and takes in arguments that are available at compile-time. Otherwise, it acts as if it were a normal function, meaning nothing gets executed at compile-time and the code is compiled as-is.
constexpr int x {5+4/3};
constexpr int sqr(int n) {
return n * n;
}
int sqr_runtime_only(int n) {
return n * n;
}
int main() {
int z {sqr(7) + x}; // z's initializer can be executed and swapped with a constant at compile-time
std::cout << z << std::endl;
int a {sqr_runtime_only(7) + x}; // a's initializer executed at run-time
std::cout << a << std::endl;
}
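One way to see the dual nature of a constant expression function is to use it somewhere that demands a compile-time constant and somewhere that can't supply one. The sketch below assumes a hypothetical function `cube`. The `static_assert` and the array size force compile-time evaluation, while `cube_of_input` calls the same function with a value only known at run-time.

```cpp
#include <array>

// Hypothetical constexpr function: usable at compile-time and at run-time.
constexpr int cube(int n) {
    return n * n * n;
}

// Compile-time use: the result must be a constant, so it is computed by the compiler.
static_assert(cube(3) == 27, "cube(3) should be 27");
std::array<int, cube(2)> buffer {}; // array size requires a constant expression

// Run-time use: the argument is not known until the program runs.
int cube_of_input(int runtime_value) {
    return cube(runtime_value);
}
```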
⚠️NOTE️️️⚠️
There's a special function in the C++ standard library called std::is_constant_evaluated()
that you can use in a constexpr
function to determine if / ensure that the code is being executed at compile-time or at run-time. This is useful if you want the code to do something different when evaluated at compile-time vs run-time (e.g. use a look-up table if evaluated at run-time vs do the full calculation if evaluated at compile-time).
Here's the example from the book...
constexpr double power(double b, int x) {
if (std::is_constant_evaluated() && !(b == 0.0 && x < 0)) {
if (x == 0)
return 1.0;
double r = 1.0, p = x > 0 ? b : 1.0 / b;
auto u = unsigned(x > 0 ? x : -x);
while (u != 0) {
if (u & 1) r *= p;
u /= 2;
p *= p;
}
return r;
} else {
return std::pow(b, double(x));
}
}
Technically, std::is_constant_evaluated()
can be used anywhere. If you use it ...
consteval
, it will always evaluate to true.
constexpr
, it may evaluate to true or false depending on where it was called.
The restrictions on constinit
/ consteval
/ constexpr
are vast. At a high-level, the only allowed inputs and outputs are literal types:
std::nullptr_t
, etc..
Also, exception handling, static
variables, and thread_local
variables are not allowed.
⚠️NOTE️️️⚠️
The rules here are vast and complicated. The above might not be entirely correct, may be missing some conditions, or may not cover certain aspects. In the type_traits header, there's a type trait called std::is_literal_type
that can be used to test if a type is a literal type, although it was deprecated in C++17 and removed in C++20.
⚠️NOTE️️️⚠️
Type information is queryable at compile-time through the type_traits header. Information about numeric types is queryable at compile-time using the limits (std::numeric_limits), climits, and cfloat headers.
Those are what you would commonly use in if constexpr
blocks. They help with building portable software.
↩PREREQUISITES↩
The keyword constexpr
may also be used within an if-else to conditionally compile code. When constexpr
appears immediately after if
, the conditional expression within is treated similarly to a constexpr
variable's initializer. Assuming the conditional expression is evaluated at compile-time, the chosen if-else path is the only one that gets compiled. All other paths of the if-else skip compilation.
if constexpr (y == sizeof(int)) {
// constant expression y is same as the number of bytes for an int, so compile this block
...
} else if constexpr (y == sizeof(long)) {
// constant expression y is same as the number of bytes for a long, so compile this block
...
} else {
// constant expression y is NOT same as the number of bytes for an int or long, so compile this block
...
}
//
// What use is a constexpr if-else? Use-cases include ...
//
// * omitting parts of a program from compilation (e.g. demonstration software).
// * working around compiler-specific / platform-specific inconsistencies (e.g. only include code if `int`'s max value is above some threshold).
// * performing specific actions based on the types chosen for template parameters (e.g. include code path 1 if pointer, otherwise code path 2).
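The last use-case (choosing a code path based on a template parameter's type) can be sketched as follows. The function `describe` is a hypothetical example. Only the branch matching the type is compiled for each instantiation, which is why `std::to_string(value)` doesn't cause an error when `T` is a pointer.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper: converts a value to text, choosing the code path
// at compile-time based on the template parameter's type.
template<typename T>
std::string describe(T value) {
    if constexpr (std::is_pointer_v<T>) {
        // only compiled when T is a pointer type
        return value == nullptr ? "null pointer" : "pointer";
    } else if constexpr (std::is_integral_v<T>) {
        // only compiled when T is an integral type
        return "integer " + std::to_string(value);
    } else {
        return "something else";
    }
}
```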
⚠️NOTE️️️⚠️
This seems to replace the need for C preprocessor macros #define
/ #ifdef
/ etc...
⚠️NOTE️️️⚠️
C++23 adds a related form, if consteval
, which picks a branch depending on whether the surrounding code is being evaluated at compile-time.
↩PREREQUISITES↩
static_assert
is used to perform some compile-time test that results in a bool
and stops compilation with an error if it evaluates to false
, optionally with a message.
static_assert(sizeof(int) == 4, "STOPPING COMPILATION: This code was written with the assumption that an int is 4 bytes");
↩PREREQUISITES↩
C++ exceptions work similarly to exceptions in other languages, except that there is no finally
block. The idea behind this is that resources should be bound to an object's lifetime (destructor). As the call stack unwinds and the automatic objects that each function owns are destroyed, the destructors of those objects should be cleaning up any resources that would have been cleaned up by the finally
block. This concept is referred to as resource acquisition is initialization (RAII).
⚠️NOTE️️️⚠️
What does binding a resource to an object's lifetime look like? For example, wrap the dynamically allocated object in a class where allocation happens in the constructor / deallocation happens in the destructor. An automatic object of that class type will clean up properly when the function exits.
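A minimal sketch of the idea, assuming a hypothetical `ScopedCounter` class that stands in for a real resource (a file handle, a lock, allocated memory). The counter tracks how many "resources" are open. The destructor runs while the stack unwinds, so the count drops back to zero even though `risky_work` exits through an exception.

```cpp
#include <stdexcept>

// Hypothetical resource wrapper: acquires in the constructor, releases in the destructor.
class ScopedCounter {
public:
    explicit ScopedCounter(int& counter) : counter_ {counter} { ++counter_; }
    ~ScopedCounter() { --counter_; } // runs even when an exception unwinds the stack
    ScopedCounter(const ScopedCounter&) = delete;
    ScopedCounter& operator=(const ScopedCounter&) = delete;
private:
    int& counter_;
};

// Throws while a ScopedCounter is alive; the destructor still releases it.
void risky_work(int& open_resources) {
    ScopedCounter guard {open_resources};
    throw std::runtime_error { "something failed" };
}

// Returns the number of resources still open after the exception was handled.
int run() {
    int open_resources {0};
    try {
        risky_work(open_resources);
    } catch (const std::runtime_error&) {
        // guard's destructor has already run by the time control gets here
    }
    return open_resources;
}
```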
To throw an exception, use the throw
keyword followed by the object to throw. Most object types are throwable, but thrown objects are typically limited to types either in or derived from those in the stdexcept header.
void no_negatives_check(int x) {
if (x < 0) {
throw std::runtime_error { "no negatives" };
}
}
Similar to Java and Python, C++ provides a standard set of exceptions in stdexcept complete with a hierarchy.
To catch an exception potentially being thrown, wrap code in a try-catch block. Typical inheritance rules apply when catching an exception. For example, catching a std::runtime_error
type will also catch anything that extends from it as well (e.g. std::overflow_error
).
try {
no_negatives_check(55);
no_negatives_check(0);
no_negatives_check(-1); // will throw an exception
} catch (const std::runtime_error &e) {
// do something
}
To catch any exception regardless of type, use ...
.
try {
no_negatives_check(-1); // will throw an exception
} catch (...) {
// do something, note the exception object is not accessible here
}
Multiple catches may exist in the same try-catch block.
try {
no_negatives_check(-1); // will throw an exception
} catch (const std::range_error &e) {
// do something
} catch (const std::runtime_error &e) {
// do something -- this block will get chosen
} catch (const std::exception &e) {
// do something
} catch (...) {
// do something, note the exception object is not accessible here
}
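The same ordering rule applies to user-defined exception types. The sketch below assumes a hypothetical `ParseError` class deriving from `std::runtime_error`. Catch blocks are tried from top to bottom, so the more specific type is listed first.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical exception type deriving from std::runtime_error.
class ParseError : public std::runtime_error {
public:
    explicit ParseError(const std::string& what) : std::runtime_error {what} {}
};

int parse_positive(int x) {
    if (x < 0) {
        throw ParseError { "negative input" };
    }
    return x;
}

// Returns a label describing which catch block handled the exception.
std::string classify(int x) {
    try {
        parse_positive(x);
        return "ok";
    } catch (const ParseError& e) {
        return std::string { "ParseError: " } + e.what();
    } catch (const std::runtime_error&) {
        return "runtime_error"; // not reached for ParseError; the more specific block above wins
    }
}
```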
↩PREREQUISITES↩
Structured binding declaration is a C++ language feature similar to Python's unpacking of lists and tuples. Given an array or a class, the values contained within are unpackable to individual variables.
// array example
int x[] = {1,2};
auto [a, b] = x; // a is a copy of x[0], b is a copy of x[1]
auto &[c, d] = x; // c is a REFERENCE to x[0], d is a REFERENCE to x[1]
// class example
struct MyStruct {
int count;
bool flag;
};
MyStruct y {5,true};
auto [i, j] = y; // i is a copy of y.count, j is a copy of y.flag
auto &[k, l] = y; // k is a REFERENCE to y.count, l is a REFERENCE to y.flag
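Structured bindings also work with tuple-like standard library types such as `std::pair`, which makes them handy for functions returning two results and for iterating over maps. The functions `div_mod` and `total_stock` below are hypothetical examples.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical helper returning two results at once.
std::pair<int, int> div_mod(int a, int b) {
    return { a / b, a % b };
}

// Sums the values of a map; each entry is a std::pair of key and value.
int total_stock(const std::map<std::string, int>& stock) {
    int total {0};
    for (const auto& [name, count] : stock) { // name / count bind to each entry's key / value
        total += count;
    }
    return total;
}
```

A caller would unpack the first function's result with something like `auto [quotient, remainder] = div_mod(17, 5);`.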
Value categories are a classification of expressions in C++. At their core, these categories are used for determining when objects get moved vs copied, where a move means that the guts of the object are scooped out and transferred to another object.
This is explicitly categorizing expressions, not objects, variables or types. Each expression is categorized as either an lvalue, xvalue, or prvalue.
A prvalue is an expression that generates some transient result, where that result is typically either used for assignment or passed into a function invocation by moving it.
int a { 0 }; // move -- 0 is being generated and MOVED into a (the expression 0 is a prvalue)
// ^
// |
// prvalue
int b { a }; // copy -- a already exists and it's being COPIED into b (the expression a is NOT a prvalue)
In essence, the way to think of a prvalue is that it's an expression that meets the following 3 conditions ...
can't have the address-of operator used on it.
MyStruct* a {&MyStruct(true)}; // error -- right-hand expression is transient, not a var that you can get the address of
int* b {&(5)}; // error -- right-hand expression is a literal, not a var that you can get the address of
int* c {&get_int()}; // error -- right-hand expression is the return val of function, not a var that you can get the address of
can have its guts be scooped out and moved into something else.
x = 55 + y; // expression 55 + y is evaluated and the result is MOVED into x (its guts are scooped out and moved into x)
doesn't persist once the expression has been executed.
x = 55 + y; // expression 55 + y is a prvalue -- doesn't persist after this line (its not something you can access)
x = c; // expression c is NOT a prvalue -- DOES persist after this line (it IS something you can keep accessing)
⚠️NOTE️️️⚠️
The name prvalue is short for pure right value. It's called that because prvalue expressions are usually found on the right side of an assignment.
An lvalue is an expression that is the opposite of a prvalue. An lvalue expression CAN use the address-of operator (opposite of point 1 above), it CANNOT have guts scooped out and moved into something else (opposite of point 2 above), and it DOES persist (opposite of point 3 above). The typical example of an lvalue is an expression that's solely a variable name or function name.
x = y; // both x and y are lvalue
x = 0; // x is an lvalue while 0 is a prvalue
The key takeaway with lvalues is that you might be able to copy over its contents to something else, but you can't scoop out its guts and move it over to something different. Doing so would make whatever that lvalue points to no longer usable.
⚠️NOTE️️️⚠️
The name lvalue is short for left value. It's called that because lvalue expressions are usually found on the left side of an assignment.
An xvalue is an expression which can have the address-of operator used on it but also can be moved. The general idea with an xvalue expression is that the object it represents is nearing the end of its lifetime and as such moving its guts is fine. There are a very limited number of cases where this happens or is required.
MyObject a {};
MyObject c { std::move(a) }; // std::move returns MyObject && type, which calls MyObject's move constructor
// a has been moved from; its contents should no longer be relied on
⚠️NOTE️️️⚠️
The example above is using features that haven't been introduced yet (std::move
, rvalue references, move constructor). Just ignore it if you don't know those pieces yet. They're explained in other sections.
This is in contrast to lvalue expressions, which the address-of operator is usable on but CANNOT be moved. If the address-of operator works on it, regardless of if it's moveable (xvalue) or not (lvalue), it's called a glvalue.
Similarly, if it's an expression that can be moved (gutted), it's called an rvalue regardless of if the address-of operator can be used on it or not.
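One way to observe these categories is through overload resolution. The sketch below uses rvalue references, which are explained in other sections, and a hypothetical overloaded function `category`. An lvalue expression selects the `const std::string&` overload, while both prvalues and xvalues (together, rvalues) select the `std::string&&` overload.

```cpp
#include <string>
#include <utility>

// Overloads that report which kind of expression they were called with.
std::string category(const std::string&) { return "lvalue"; }
std::string category(std::string&&) { return "rvalue"; }

std::string from_variable() {
    std::string s { "abc" };
    return category(s); // s is an lvalue expression
}

std::string from_temporary() {
    return category(std::string { "abc" }); // a temporary is a prvalue (an rvalue)
}

std::string from_move() {
    std::string s { "abc" };
    return category(std::move(s)); // std::move(s) is an xvalue (also an rvalue)
}
```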
⚠️NOTE️️️⚠️
See here for what I used to clarify what's going on here.
⚠️NOTE️️️⚠️
Source is this website.
C++ modules change how C++ source code files interface with each other. Normally, a C++ source / header file would use #include <...>
directives to pull in other source code files that it needs access to. Those outside source code files provided things like preprocessor macros, function declarations, class declarations, global variable constants, forward declarations, templates, etc...
Instead of dealing with source code files directly, C++ modules allow for independently "compiling" source code files and importing them for use into different source code files, similar to how a Java source code file imports compiled Java class files for use. Modules reduce some of the complexities of using header files but certain functionality is also gone. Specifically, preprocessor macros and preprocessor directives defined inside a module aren't carried over to the code that imports it.
To create a module from a single file, add export module
followed by the name of the module in the beginning of the file. Then, prefix export
to any function, enumeration, class, etc.. that the module should expose.
export module my_module;
export int add(int a, int b) {
return a + b;
}
export int multiply(int a, int b) {
return a * b;
}
To make use of a module in some other source code, use import
followed by the module's name.
import my_module;
int main() {
return add(1, 2);
}
Similar to how non-module C++ source code is broken up into a source file containing definitions and accompanying header file containing declarations, a module may also be broken up into separate definition and declaration files. The declarations go in a file with export module
at the top (as shown above) and the definitions go in a file with just module
. Definition files aren't allowed to use export
at all.
// my_module.cpp
export module my_module;
export int add(int a, int b);
export int multiply(int a, int b);
// my_module_impl.cpp
module my_module; // no "export" in module declaration, meaning export not allowed anywhere else in this file
int add(int a, int b) {
return a + b;
}
int multiply(int a, int b) {
return a * b;
}
Modules may be broken up into several pieces using module partitions, with each piece in its own file, using colons (:).
// my_module_addition.cpp
export module my_module:addition;
export int add(int a, int b) {
return a + b;
}
// my_module_multiplication.cpp
export module my_module:multiplication;
export int multiply(int a, int b) {
return a * b;
}
// my_module.cpp
export module my_module;
export import :addition; // export everything under my_module:addition partition
export import :multiplication; // export everything under my_module:multiplication partition
Module partitions may be made non-exportable as well, similar to the definition / declaration example earlier. The parent would need to re-declare anything it wants to explicitly export.
// my_module_addition.cpp
module my_module:addition; // no "export", so this partition exports nothing by itself
int add(int a, int b) {
return a + b;
}
// my_module_multiplication.cpp
module my_module:multiplication; // no "export", so this partition exports nothing by itself
int multiply(int a, int b) {
return a * b;
}
// my_module.cpp
export module my_module;
import :addition;
import :multiplication;
export int add(int a, int b); // explicitly export this function (imported from my_module:addition partition)
export int multiply(int a, int b); // explicitly export this function (imported from my_module:multiplication partition)
Note that there can only ever be 1 parent for a partition. All partitions are a part of their parent module, not modules themselves. The parent module must import all of its partitions using either import
or export import
as shown in the examples above. No module can directly import a partition that doesn't belong to it.
One way to work around these restrictions is to simply make the partitions their own modules. The most common way to do this is to replace the colons (:) in each partition name with a dot (.), making sure to use the full name in the import lines (because the pieces being imported are no longer partitions of the parent module).
// my_module_addition.cpp
export module my_module.addition;
export int add(int a, int b) {
return a + b;
}
// my_module_multiplication.cpp
export module my_module.multiplication;
export int multiply(int a, int b) {
return a * b;
}
// my_module.cpp
export module my_module;
export import my_module.addition; // export everything under my_module.addition (FULL NAME USED)
export import my_module.multiplication; // export everything under my_module.multiplication (FULL NAME USED)
⚠️NOTE️️️⚠️
Last I recall using this, each compiler required a special flag to turn on modules. Just because your code uses modules doesn't mean the internal C++ libraries (e.g. standard template library, cstdint
, etc..) are going to expose things as modules. You still have to include those using the #include <...>
directives (maybe -- I think I remember there being some roundabout way of getting modules to work).
The preprocessor is a component of the C++ compiler. Before the programming statements in a source code file are compiled, the preprocessor goes over the file looking for preprocessor directives. Preprocessor directives either...
The first case (text manipulation) is primarily what the preprocessor is used for. Unlike normal C++ programming statements, preprocessor directives start with the pound sign (#) and shouldn't include a semicolon (;) at the end.
To include one file in another file, use #include
. Local files should be wrapped in quotes while files coming from libraries should be wrapped in angled brackets.
#include <vector> // library header
#include "OtherClass.hpp" // local header
To replace strings in a file with another string, use #define
.
#define INITIAL_VALUE 500
int x {INITIAL_VALUE};
int y {INITIAL_VALUE};
To replace strings in a file with a parameterized replacement, use #define
with parentheses.
#define ADDED_VALUE(x, y) ((x) + (y) - 15) // parenthesized so the expansion is safe inside larger expressions
int x {ADDED_VALUE(1, 7)};
int y {ADDED_VALUE(5, 3)};
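Because parameterized macros are plain text substitution, operator precedence at the point of use can change what the expansion computes. The sketch below contrasts two hypothetical macros, `ADD_UNSAFE` and `ADD_SAFE`, to show why macro bodies and arguments are usually wrapped in parentheses.

```cpp
// Unparenthesized macro: plain text substitution can change operator precedence.
#define ADD_UNSAFE(x, y) x + y
// Parenthesized macro: each argument and the whole body are wrapped.
#define ADD_SAFE(x, y) ((x) + (y))

int unsafe_result() {
    return 2 * ADD_UNSAFE(1, 3); // expands to 2 * 1 + 3
}

int safe_result() {
    return 2 * ADD_SAFE(1, 3); // expands to 2 * ((1) + (3))
}
```

Here `unsafe_result()` evaluates to 5 while `safe_result()` evaluates to 8.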
To stop replacing a string, use #undef
.
#define INITIAL_VALUE 500
int x {INITIAL_VALUE};
#undef INITIAL_VALUE
#define INITIAL_VALUE 8
int y {INITIAL_VALUE};
To conditionally include / ignore portions of a file, use an #ifdef
/ #else
/ #endif
block.
#ifdef INITIAL_VALUE
int x {INITIAL_VALUE};
#else
int x {ADDED_VALUE(1, 7)};
#endif
Similarly, #ifndef
may be used to conditionally include / ignore portions of a file (#ifndef
-- note the n, if NOT defined).
#ifndef INITIAL_VALUE
int x {ADDED_VALUE(1, 7)};
#else
int x {INITIAL_VALUE};
#endif
Conditional inclusion preprocessor directives come in an alternate form that allows for more flexible conditions: #if
/ #elif
/#else
/ #endif
block.
#if !defined INITIAL_VALUE
int x {1};
#elif INITIAL_VALUE > 50
int x {INITIAL_VALUE - 50};
#else
int x {INITIAL_VALUE};
#endif
⚠️NOTE️️️⚠️
Compiler / compilation options may be controlled through #pragma
s. I've left #pragma
s out of the document because they're specific to the compiler and platform.
High-level languages are typically very consistent. For example, except for a handful of small things, Java's runtime and core libraries are consistent across different platforms (e.g. Windows vs Linux), architectures (e.g. ARM vs x86), and compilers (e.g. OpenJDK vs Eclipse compiler). C++ has much less consistency than those other high-level languages because it has to support more platforms and architectures. In addition, having less consistency sometimes allows for more aggressive optimization during compilation.
Inconsistencies come in three different types:
type | valid | documented |
---|---|---|
Implementation-defined behaviour | YES | YES |
Unspecified behaviour | YES | NO |
Undefined behaviour | MAYBE | NO |
Implementation-defined behaviour is behaviour that varies between implementations, where that behaviour is valid (e.g. no hard crash) and documented. The obvious example is with numeric data types: short
, int
, float
, etc.. will each have a different minimum and maximum across different platforms:
short
is from SHRT_MIN
to SHRT_MAX
.
int
is from INT_MIN
to INT_MAX
.
⚠️NOTE️️️⚠️
Someone posted this as a comprehensive list of implementation-defined behaviour.
Unspecified behaviour is behaviour that varies between implementations, where that behaviour is valid (e.g. no hard crash) but not documented. The obvious example is the order in which operands are evaluated in an expression. For example, consider the following statement ...
int x {bird_func() / cow_func()};
The results of bird_func()
and cow_func()
may be gotten in any order prior to performing the division. There is no requirement as to which one gets invoked first. The division itself is performed with the correct operands in the correct spots, but which function gets called first is up to the compiler.
// option1 -- bird_func() evaluated first
int a {bird_func()};
int b {cow_func()};
int x {a / b};
// option2 -- cow_func() evaluated first
int b {cow_func()};
int a {bird_func()};
int x {a / b};
Another example is the memory representation of core types (e.g. integral types). The platform's memory layout could be either big-endian, little-endian, or some other uncommon memory layout.
int x {5};
// big endian: 00 00 00 05 (e.g. ARM when configured as big-endian)
// little endian: 05 00 00 00 (e.g. x86)
The above doesn't matter unless you're trying to read raw contents of core types (e.g. for serializing classes to disk).
⚠️NOTE️️️⚠️
This may not be the case anymore in C++20 because the C++ standard library has std::endian
, which tells you the endian-ness of scalar types.
⚠️NOTE️️️⚠️
I wasn't able to find a comprehensive list of what the C++ spec considers as unspecified behaviour.
↩PREREQUISITES↩
⚠️NOTE️️️⚠️
According to documentation online: "Compilers are not required to diagnose or do anything meaningful when undefined behaviour is present. Correct C++ programs are free of undefined behaviour". Not exactly sure how to fix some scenarios to be "free" of undefined behaviour. Specifically, there are a lot of cases where signed integer overflow (described below) happens, but that's undefined behaviour. I read online that the way to handle these cases is to test at the beginning of the function if overflow is possible and bail out if it is, but there's no built-in C++ mechanism to do that.
The statement and the examples below were lifted from here.
Undefined behaviour is behaviour that is unrestricted and not documented. The compiler may do anything for code producing undefined behaviour. For example, code producing undefined behaviour could end up ...
None of the examples have to be consistent. For example, it could produce a hard crash some of the time and the intended results the rest of the time.
Signed integer overflow
Although signed integers are guaranteed to be two's complement (as of C++20), what happens when a signed integer overflows is still undefined behaviour. In many cases, the compiler will treat signed integer operations as if overflowing isn't possible. For example, consider the following function ...
bool test(int x) {
return x < x + 1;
}
What may happen: The compiler may optimize the return expression to always return true
. Had signed integer overflow NOT been undefined behaviour, the function would return true
except in the case where x == INT_MAX
: When x == INT_MAX
, the expression x + 1
would roll over to INT_MIN
, leading x < x + 1
to evaluate to false
.
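One way to stay clear of this undefined behaviour is to test whether the operation would overflow before performing it. The sketch below is a hand-rolled check, assuming a hypothetical function `checked_add` that reports failure through `std::optional`. The comparisons are arranged so that the checks themselves never overflow.

```cpp
#include <limits>
#include <optional>

// Hypothetical helper: adds two ints, returning std::nullopt instead of overflowing.
std::optional<int> checked_add(int a, int b) {
    constexpr int max { std::numeric_limits<int>::max() };
    constexpr int min { std::numeric_limits<int>::min() };
    if (b > 0 && a > max - b) {
        return std::nullopt; // a + b would exceed INT_MAX
    }
    if (b < 0 && a < min - b) {
        return std::nullopt; // a + b would go below INT_MIN
    }
    return a + b;
}
```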
Array out of bounds access
Array out of bounds access typically ends up touching memory past the array's boundaries.
int data[5] {1,2,3,4,5};
data[65535] = 15;
What may happen: Out of bounds data access will result in either...
Uninitialized scalar
Uninitialized scalars are scalars that are read before being written to.
int x;
std::cout << x; // what's in x?
What may happen: Uninitialized scalars contain junk data (e.g. whatever was in the memory before).
Invalid scalar
When a scalar gets reinterpreted as something else (e.g. a byte array) and its contents are manipulated, reading from that original scalar is undefined behaviour.
// EXAMPLE FROM https://en.cppreference.com/w/cpp/language/ub
int f() {
bool b {true};
unsigned char* p {reinterpret_cast<unsigned char*>(&b)};
*p = 10;
// reading from b is now UB
return b == 0;
}
What may happen: modifications on the reinterpretation are treated as if they never happened.
Null pointer dereference
// EXAMPLE FROM https://en.cppreference.com/w/cpp/language/ub
int foo(int* p) {
int x {*p};
if(!p) return x; // Either UB above or this branch is never taken
else return 0;
}
int bar() {
int* p {nullptr};
return *p; // Unconditional UB
}
What may happen: Trying to read or write to a dereferenced nullptr
will typically cause a crash, although nothing guarantees it.
Side-effect free infinite loops
A side-effect free infinite loop is a loop that goes on forever but doesn't change anything outside of its own scope (e.g. no global variable is changed, nothing is printed to standard out, etc..).
// EXAMPLE FROM https://en.cppreference.com/w/cpp/language/ub
while (1) {
if (((a*a*a) == ((b*b*b)+(c*c*c)))) return 1;
a++;
if (a>MAX) { a=1; b++; }
if (b>MAX) { b=1; c++; }
if (c>MAX) { c=1;}
}
What may happen: Side-effect free infinite loops are removed entirely.
⚠️NOTE️️️⚠️
I wasn't able to find a comprehensive list of what the C++ spec considers as undefined behaviour. The above examples were taken from cppreference.
↩PREREQUISITES↩
By default, C++ comes packaged with the C++ standard library. You can think of this as C++'s version of core Java packages: collection classes in java.util
, thread classes in java.util.concurrent
, IO classes in java.io
, etc... In addition, several third-party C++ libraries exist that provide commonly needed functionality. You can think of these as C++'s version of common Java libraries: Guava, Apache Commons, etc...
Common third-party C++ libraries:
The subsections below detail important functionality across the many C++ libraries in existence. If the functionality being documented is for a third party library, it'll be signalled in some way (e.g. namespace / header files / comments used in sample code will make it apparent).
The C++ standard library includes a set of templated classes that detect the traits of a type at compile-time. This is useful in cases where template parameters need to be restricted.
template<typename T>
T test(T t) {
static_assert(std::is_integral<T>::value, "Must be integral");
return t + 1;
}
test(4); // OK
test(4ULL); // OK
test(4.09); // FAIL -- 4.09 is a floating point number, not an integral number
The value
field is true
/ false
depending on if the type passes the check. A shortcut in later versions of C++ is to append _v
to the name of the class performing the check rather than explicitly querying the value
field (e.g. std::is_integral<T>::value
vs std::is_integral_v<T>
).
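Beyond `static_assert`, the same checks can select code at compile-time. The following is a minimal sketch, assuming C++17 (`if constexpr` and the `_v` shortcuts). The function name `describe` is made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical helper: the branch is chosen at compile-time from the traits of T.
template <typename T>
std::string describe(T) {
    if constexpr (std::is_integral_v<T>) {
        return std::is_signed_v<T> ? "signed integral" : "unsigned integral";
    } else if constexpr (std::is_floating_point_v<T>) {
        return "floating point";
    } else {
        return "something else";
    }
}

// Usage: each call instantiates only the branch that matches its argument type.
inline void describe_example() {
    assert(describe(4) == "signed integral");
    assert(describe(4ULL) == "unsigned integral");
    assert(describe(4.09) == "floating point");
    assert(describe("text") == "something else");
}
```

Unlike the `static_assert` version, a non-matching type here doesn't fail compilation. It just falls through to the last branch.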
List of useful checks:
std::is_signed
- ensures a type is signed.std::is_unsigned
- ensures a type is unsigned.std::is_integral
- ensures a type is an integer (e.g. short int
, int
, unsigned long long int
, etc..)std::is_pod
- ensures a type is a POD.std::is_fundamental
- ensures a type is a fundamental type.std::is_abstract
- ensures a type is an abstract class (has at least one pure virtual function).std::is_copy_constructible
- ensures type has a copy constructor.std::is_copy_assignable
- ensures type has copy assignment.std::is_move_constructible
- ensures type has a move constructor.std::is_nothrow_move_constructible
- ensures type has a move constructor that never throws an exception (noexcept
).std::is_move_assignable
- ensures type has move assignment.🔍SEE ALSO🔍
Type traits may also be manipulated at compile-time via a set of templated classes.
template<typename T>
auto test(T t) {
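using R = typename std::make_unsigned<T>::type; // R is the same type as T but unsigned (if it isn't already)
R x { static_cast<R>(t + 1) }; // explicit cast: brace initialization rejects the narrowing signed-to-unsigned conversion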
return x;
}
The type
field contains the resulting type. A shortcut in later versions of C++ is to append _t
to the name of the class doing the manipulation rather than explicitly querying the type
field (e.g. std::make_unsigned<T>::type
vs std::make_unsigned_t<T>
).
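A few of the conversions listed below can be checked at compile-time by comparing the result against the expected type with `std::is_same_v`. This is a minimal sketch, assuming C++17. The helper name `to_unsigned` is hypothetical.

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// Hypothetical helper: converts an integral value to the unsigned version of its type.
template <typename T>
std::make_unsigned_t<T> to_unsigned(T t) {
    static_assert(std::is_integral_v<T>, "Must be integral");
    return static_cast<std::make_unsigned_t<T>>(t);
}

// Each transformation produces a type, so it can be verified with std::is_same_v.
static_assert(std::is_same_v<std::make_unsigned_t<int>, unsigned int>);
static_assert(std::is_same_v<std::remove_cv_t<const volatile long>, long>);
static_assert(std::is_same_v<std::remove_pointer_t<int*>, int>);
static_assert(std::is_same_v<std::add_lvalue_reference_t<int>, int&>);

inline void to_unsigned_example() {
    auto u = to_unsigned(-1); // -1 wraps around to the largest unsigned int
    static_assert(std::is_same_v<decltype(u), unsigned int>);
    assert(u == std::numeric_limits<unsigned int>::max());
}
```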
List of useful conversions:
std::remove_cv
- remove const
and / or volatile
.std::remove_const
- remove const
.std::remove_volatile
- remove volatile
.std::remove_pointer
- make into non-pointer type (removes a *
from the type).std::remove_reference
- make into non-reference type.std::add_cv
- add const
and / or volatile
.std::add_const
- add const
.std::add_volatile
- add volatile
.std::add_pointer
- make into pointer type (adds a *
to the type).std::add_lvalue_reference
- make into a lvalue reference type.std::add_rvalue_reference
- make into a rvalue reference type.std::make_signed
- make into an equivalent version of the same type that's signed.std::make_unsigned
- make into an equivalent version of the same type that's unsigned.⚠️NOTE️️️⚠️
Are constant expressions used to write these checks and transformations? Maybe.
🔍SEE ALSO🔍
Allocators allow for customizing how objects are allocated and deallocated. Some library APIs allow you to provide a custom allocator rather than using the typical new
/new[]
and delete
/delete[]
operators. In certain scenarios where performance is important (e.g. gaming, simulations, high-frequency trading, etc..), custom allocators are often used. A custom allocator could increase performance by ...
By default, libraries that support custom allocators will default to std::allocator
, which just wraps the new
/new[]
and delete
/delete[]
operators.
To implement a custom allocator, you need to create a templated class with a single template parameter representing the type being allocated / deallocated. The class must have the following traits...
value_type
corresponding to the template parameter.template <typename T>
struct MyAllocator {
using value_type = T; // 3
MyAllocator() noexcept{ } // 1
template <typename U>
MyAllocator(const MyAllocator<U>&) noexcept { } // 2 (why is this here? https://docs.microsoft.com/en-us/cpp/standard-library/allocator-class?view=msvc-170#allocator)
T* allocate(size_t n) { // 4
auto ret = operator new(sizeof(T) * n);
std::cout << "allocated!" << ret << "\n";
return static_cast<T*>(ret);
}
void deallocate(T* p, size_t n) { //5
std::cout << "deallocated!" << p << "\n";
operator delete(p);
}
};
template <typename T1, typename T2>
bool operator==(const MyAllocator<T1>&, const MyAllocator<T2>&) { // 6 (why is this here? https://stackoverflow.com/a/30654267)
return true; // always because this class retains no state
}
template <typename T1, typename T2>
bool operator!=(const MyAllocator<T1>&, const MyAllocator<T2>&) { // 6 (why is this here? https://stackoverflow.com/a/30654267)
return false; // always because this class retains no state
}
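To see an allocator of this shape in use, the sketch below plugs one into `std::vector` and `std::list`. It's a self-contained variant of the class above, not the exact same code: the name `CountingAllocator` and the global counter are assumptions made so the effect can be observed without printing.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>
#include <vector>

// Illustrative global tally of blocks currently handed out by the allocator.
inline std::size_t g_live_allocations = 0;

template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() noexcept = default;

    // Converting constructor: lets a container turn CountingAllocator<int> into
    // CountingAllocator<SomeInternalNodeType> for its own bookkeeping structures.
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++g_live_allocations;
        return static_cast<T*>(::operator new(sizeof(T) * n));
    }

    void deallocate(T* p, std::size_t) noexcept {
        --g_live_allocations;
        ::operator delete(p);
    }
};

// Stateless allocators always compare equal: memory from one can be freed by another.
template <typename T1, typename T2>
bool operator==(const CountingAllocator<T1>&, const CountingAllocator<T2>&) noexcept { return true; }
template <typename T1, typename T2>
bool operator!=(const CountingAllocator<T1>&, const CountingAllocator<T2>&) noexcept { return false; }

inline void counting_allocator_example() {
    {
        std::vector<int, CountingAllocator<int>> v {1, 2, 3};
        std::list<int, CountingAllocator<int>> l {1, 2, 3}; // allocates list nodes, not ints
        assert(v.size() == 3 && l.size() == 3);
        assert(g_live_allocations > 0);
    }
    assert(g_live_allocations == 0); // everything was handed back on destruction
}
```

The `std::list` case is the one that exercises the converting constructor, since a list allocates node objects rather than bare `int`s.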
⚠️NOTE️️️⚠️
I don't fully understand what the copy constructor and the operator overloads are for. The copy constructor seems to be for cases where you pass in an allocator to some container class (e.g. vector
) but that container class needs to allocate more than just the type you're interested in. For example, the allocator may be for creating int
s (e.g. template parameter T
= int
) but the container class you're storing those int
s in may have bookkeeping structures that it wraps each int
in (e.g. each int
is wrapped as a Node
object which also contains some extra pointers). This copy constructor "repurposes" the allocator, allowing you to pass in MyAllocator<int>
but have it repurposed to MyAllocator<SomeOtherTypeHere>
.
But if you're copying the guts of one allocator into another but both keep on allocating and deallocating, won't they trip up over each other?
I haven't been able to find answers online as to what's going on here. The book just seems to hand wave it away.
⚠️NOTE️️️⚠️
Why not just do operator overloading for the new operator? The answer seems to be that allocators are simpler to deal with and handle wider scenarios (such as the concatenating example in the note above).
↩PREREQUISITES↩
🔍SEE ALSO🔍
Smart pointers are classes that wrap pointers to dynamically allocated objects. These wrappers provide some level of automated pointer management / memory management through the use of move semantics, copy semantics, and RAII.
The subsections below document some common smart pointers and their usages.
(non-moveable, non-copyable)
A scoped pointer wraps a pointer to an existing dynamic object / dynamic array and invokes delete
/ delete[]
on it once it exits the current scope. It explicitly turns off class copy semantics and move semantics, meaning that copying a scoped pointer or moving it isn't allowed.
if (x == 123L) {
boost::scoped_ptr<int> ptr { new int {5} };
x += *ptr; // like a real pointer, use indirection operator / member of pointer operator
} // ptr destroyed via delete operator at the end of if block (RAII)
Scoped pointers come in two flavours:
scoped_ptr
: pointer to a dynamic object.scoped_array
: pointer to a dynamic array.boost::scoped_ptr<int> ptr { new int {5} };
boost::scoped_array<int> arr { new int[4] {1, 1, 1, 1} };
Although the official move semantics of a scoped pointer are to deny moves, it does provide a ...
swap()
function that lets you swap the dynamic objects between two scoped pointers.reset()
function that destroys the current dynamic object and sets it to the argument passed in, if any.In addition, it's possible to have an unset scoped pointer (nullptr
). An unset scoped pointer won't attempt to destroy an object when it goes out of scope.
↩PREREQUISITES↩
(moveable, non-copyable)
A unique pointer supports all the same features as a scoped pointer, except that it also supports moving: the ownership of the pointer that a unique pointer holds is transferable to another unique pointer via move semantics (e.g. move assignment operator, move constructor). Like a scoped pointer, a unique pointer doesn't support copying.
if (x == 123L) {
std::unique_ptr<int> ptr { new int {5} };
x += *ptr;
std::unique_ptr<int> ptr2 { std::move(ptr) }; // ALLOWED: move ptr into ptr2 -- ptr is empty (nullptr) after this point
x -= *ptr2;
}
Unlike scoped pointers, a unique pointer uses the same class for both dynamic objects and dynamic arrays.
std::unique_ptr<int> ptr { new int {5} };
std::unique_ptr<int[]> arr { new int[4] {1, 1, 1, 1} };
⚠️NOTE️️️⚠️
Look at the template parameters in the example above. It's important that you add []
into the template parameter when you're dealing with arrays so the destroy dynamic array operator (delete[]
) gets used. If the destroy dynamic object operator (delete
) is used for an array, it's undefined behaviour. Likewise, don't add []
into the template parameter if you aren't dealing with arrays.
The templated function std::make_unique()
(added in C++14) was provided to create unique pointers because the normal way (shown in the example above) had a subtle edge case: when a new expression appeared alongside other function arguments, an exception thrown while evaluating those other arguments could leak the newly created object. C++17 tightened the evaluation order rules and closed that leak, so using std::make_unique()
is no longer strictly necessary, but it's still available and commonly used because it avoids writing new by hand.
// following two are equivalent
std::unique_ptr<int> ptr { new int {5} };
std::unique_ptr<int> ptr = std::make_unique<int>(5); // make_unique automatically calls new
Unique pointers don't support custom allocators: You pass the pointer you want to track directly into the constructor or create it via std::make_unique()
. But, unique pointers do support taking in a function-like object (e.g. function, functor, lambda) to invoke instead of using delete
/ delete[]
on the tracked pointer. This is useful in cases where the pointer is being tracked ...
new
/ new[]
(e.g. FILE *
created using fopen()
).For example, a unique pointer that points to a memory mapped file region shouldn't call delete[]
when it goes out of scope because it isn't actually pointing to a dynamic array. Instead, it should invoke the relevant function(s) that release a memory mapped file region.
auto custom_deleter = [](int* x) {
std::cout << "Deleting an int at " << x;
delete x;
};
std::unique_ptr<int, decltype(custom_deleter)> ptr{ new int {5}, custom_deleter };
// NOTE: Types have to match -- if the unique pointer is for an "int", the custom deleter should take in an "int *"
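As a sketch of the "not created via new" case, the following tracks memory obtained from `std::malloc()`, which has to be released with `std::free()` rather than `delete`. The names `FreeDeleter`, `MallocIntPtr`, and `make_malloc_int` are made up for illustration, and the counter exists only so the deleter's behaviour can be observed.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>
#include <utility>

// Illustrative counter of how many times the custom deleter has run.
inline int g_free_calls = 0;

// Functor deleter: releases memory with std::free instead of delete.
struct FreeDeleter {
    void operator()(void* p) const noexcept {
        ++g_free_calls;
        std::free(p);
    }
};

using MallocIntPtr = std::unique_ptr<int, FreeDeleter>;

// Hypothetical factory: allocates an int with malloc and hands ownership to a unique pointer.
inline MallocIntPtr make_malloc_int(int value) {
    auto* raw = static_cast<int*>(std::malloc(sizeof(int)));
    if (raw != nullptr) {
        *raw = value;
    }
    return MallocIntPtr { raw };
}

inline void malloc_deleter_example() {
    const int before = g_free_calls;
    {
        MallocIntPtr p { make_malloc_int(5) };
        assert(p && *p == 5);
        MallocIntPtr q { std::move(p) };        // ownership moves, the deleter does not run
        assert(!p && g_free_calls == before);
    }                                           // q goes out of scope: FreeDeleter runs once
    assert(g_free_calls == before + 1);
}
```

Using a functor type instead of a lambda keeps the pointer type easy to spell (`MallocIntPtr`), and the deleter doesn't need to be passed to the constructor because it's default constructible.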
↩PREREQUISITES↩
(moveable, copyable)
A shared pointer tracks the number of copies it has in existence and only destroys the dynamic object it points to once the number of copies reaches 0 (reference counting). The reference count increments when a new copy is made (e.g. copy constructor) and decrements when a copy is destroyed (e.g. goes out of scope). Moves don't modify the reference count.
std::shared_ptr<int> ptr { new int {5} }; // ref count 1
if (x == 123L) {
std::shared_ptr<int> ptrCopy { ptr }; // ref count 2
x += *ptrCopy;
} // ref count back to 1 because ptrCopy destroyed here
x -= *ptr;
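The reference count can be inspected through `use_count()`, which makes the copy vs move behaviour described above easy to check. A minimal sketch:

```cpp
#include <cassert>
#include <memory>
#include <utility>

inline void shared_ptr_count_example() {
    auto ptr = std::make_shared<int>(5);
    assert(ptr.use_count() == 1);
    {
        std::shared_ptr<int> copy { ptr };              // copy: count goes up
        assert(ptr.use_count() == 2);
        std::shared_ptr<int> moved { std::move(copy) }; // move: count unchanged
        assert(ptr.use_count() == 2);
        assert(copy == nullptr);                        // the moved-from pointer is empty
    }                                                   // moved destroyed: count goes down
    assert(ptr.use_count() == 1);
}
```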
⚠️NOTE️️️⚠️
There's a version of shared pointer in Boost and one in the C++ standard library. The Boost version is legacy.
The construction of shared pointers is similar to the construction of unique pointers. You can either call the constructor directly or you can use the templated function std::make_shared()
. Whereas std::make_unique()
was a legacy creation mechanism for unique pointers, std::make_shared()
is the preferred creation mechanism for shared pointers. That's because shared pointers require a "control block" that holds onto tracking information (e.g. reference counts), and using std::make_shared()
allocates that control block along with the dynamic object in one single allocation (better performance: one allocation vs two allocations).
std::shared_ptr<int> ptr { std::make_shared<int>(5) }; // allocates the control block and object together
std::shared_ptr<int> ptr { new int {5} };
std::shared_ptr<int[]> arr { new int[4] {1, 1, 1, 1} };
⚠️NOTE️️️⚠️
The book says that sometimes you want to avoid std::make_shared
because you might need the control block even if the dynamic object goes away (mentions weak pointers). This isn't possible if the control block and the dynamic object are allocated as one, because if they're allocated as one then you can't individually delete them (you can only delete both things at once).
Like unique pointer, shared pointer's constructor may take in a custom deleter. In addition, it may also take in a custom allocator. The custom allocator has nothing to do with the underlying pointer -- it gets used for allocating and deallocating the control block.
auto object_deleter = [](int* x) {
std::cout << "Deleting an int at " << x;
delete x;
};
auto control_block_allocator { std::allocator<void>{}} ;
std::shared_ptr<int> ptr{ new int {5}, object_deleter, control_block_allocator };
⚠️NOTE️️️⚠️
There are no template parameters for the deleter or allocator. You just pass them in as the last two constructor arguments and it should just work. The book is saying that they were left out for "complicated reason".
It isn't possible to use a custom allocator with std::make_shared()
. It forces the use of new
/ new[]
and delete
/ delete[]
for single combined allocation of the control block and the dynamic object. To perform a single combined allocation and use a custom allocator for that allocation, you need to use std::allocate_shared()
instead.
auto my_allocator { std::allocator<void>{}} ;
std::shared_ptr<int> ptr{ std::allocate_shared<int>(my_allocator, 5) };
⚠️NOTE️️️⚠️
There is no template parameter for the allocator. You just pass it in as the first function argument and it should just work.
There is no custom deleter with std::allocate_shared
because the deletion happens via the allocator. Both the control block and the dynamic object are being allocated and deallocated together.
In certain cases, a class may want to return shared pointers to itself (a shared pointer to its this
pointer). A class can't ...
To work around these problems, C++ provides the std::enable_shared_from_this
base class that you can inherit from.
struct MyClass : public std::enable_shared_from_this<MyClass> {
std::shared_ptr<MyClass> getSharedPointer() {
return shared_from_this(); // special function provided by the base class
}
};
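One thing worth knowing is that `shared_from_this()` only works when the object is already owned by a shared pointer. The sketch below shows both cases, assuming C++17 or later (where the failing case is specified to throw `std::bad_weak_ptr`). The class name `Session` is hypothetical.

```cpp
#include <cassert>
#include <memory>

struct Session : std::enable_shared_from_this<Session> {
    std::shared_ptr<Session> self() { return shared_from_this(); }
};

inline void shared_from_this_example() {
    // Owned by a shared pointer: self() shares the existing control block.
    auto owner = std::make_shared<Session>();
    auto alias = owner->self();
    assert(alias == owner);
    assert(owner.use_count() == 2);

    // Not owned by any shared pointer: there is no control block to share.
    Session on_stack;
    bool threw = false;
    try {
        (void)on_stack.self();
    } catch (const std::bad_weak_ptr&) {
        threw = true;
    }
    assert(threw);
}
```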
⚠️NOTE️️️⚠️
See here for more information.
↩PREREQUISITES↩
A weak pointer is essentially a shared pointer that doesn't increment or decrement the reference count. At any moment, it can generate an actual shared pointer via its lock()
method, thereby increasing the reference count.
std::shared_ptr<int> sp1 { new int {5} }; // ref count = 1
std::weak_ptr<int> wp{ sp1 }; // ref count = 1
std::shared_ptr<int> sp2 { wp.lock() }; // ref count = 2
std::shared_ptr<int> sp3 { wp.lock() }; // ref count = 3
If the shared pointer reference count has already reached 0 when lock()
is invoked, the returned shared pointer will be empty.
std::weak_ptr<int> wp {}; // unset weak pointer
{
std::shared_ptr<int> sp { new int {5} }; // create a new shared pointer
wp = sp; // assign the shared pointer to the weak pointer
} // scope ends, shared pointer destroyed (reference count drops from 1 to 0, meaning object is deleted)
std::shared_ptr<int> sp { wp.lock() };
bool isEmpty = (sp == nullptr); // isEmpty will be true
⚠️NOTE️️️⚠️
There's a version of weak pointer in Boost and one in the C++ standard library. The Boost version is legacy. Each weak pointer version is tied to the shared pointer from its library. For example, if you're using the weak pointer in Boost, you need to use it with the shared pointer from Boost.
The typical use-cases for weak pointers are ...
For the cyclical references point above, what it means is that shared pointers forming a cycle will never reach a reference count of 0.
In the example above...
Nothing else holds these shared pointers. They all reference each other, meaning that none of the shared pointer reference counts will ever reach 0 (memory leak).
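The following sketch contrasts a cycle of owning links with a cycle of weak links. The `Node` type and its member names are assumptions made for illustration. The strong cycle is broken by hand at the end so the sketch itself doesn't leak.

```cpp
#include <cassert>
#include <memory>

// Illustrative counter of destroyed nodes.
inline int g_nodes_destroyed = 0;

struct Node {
    std::shared_ptr<Node> strong_peer; // owning link: participates in the reference count
    std::weak_ptr<Node> weak_peer;     // non-owning link: doesn't affect the reference count
    ~Node() { ++g_nodes_destroyed; }
};

inline void cycle_example() {
    const int before = g_nodes_destroyed;
    {
        // Cycle of owning links: a -> b -> a.
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->strong_peer = b;
        b->strong_peer = a;
        std::weak_ptr<Node> watch_a { a };

        a.reset();                     // drop our handle to the first node...
        assert(!watch_a.expired());    // ...but it stays alive: b->strong_peer still owns it

        b->strong_peer.reset();        // break the cycle by hand
        assert(watch_a.expired());     // now the first node is destroyed
    }
    assert(g_nodes_destroyed == before + 2);
    {
        // Same shape using weak links: no node keeps the other alive.
        auto c = std::make_shared<Node>();
        auto d = std::make_shared<Node>();
        c->weak_peer = d;
        d->weak_peer = c;
        assert(c.use_count() == 1 && d.use_count() == 1);
    }                                  // both nodes destroyed here without any manual work
    assert(g_nodes_destroyed == before + 4);
}
```

Had both `a` and `b` gone out of scope without the manual `reset()`, each node would still hold a reference to the other and neither destructor would ever run.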
↩PREREQUISITES↩
An intrusive pointer invokes the free function intrusive_ptr_add_ref()
whenever an intrusive pointer starts pointing at an object (e.g. construction, copy) and intrusive_ptr_release()
whenever it stops pointing at that object (e.g. destruction, reassignment). Both functions take in a single argument: the pointer being tracked.
size_t cnt {};
void intrusive_ptr_add_ref(int* ptr) {
cnt++;
}
void intrusive_ptr_release(int* ptr) {
cnt--;
}
boost::intrusive_ptr<int> a { new int{5} }; // after this, cnt will be 1
{
boost::intrusive_ptr<int> b { new int{6} }; // after this, cnt will be 2
boost::intrusive_ptr<int> c { a }; // after this, cnt will be 3
} // at the end of this scope, cnt will be 1 again
Whereas a shared pointer keeps a count of how many copies of itself are live (reference count), an intrusive pointer is typically used for keeping count of how many of some specific pointer type are live. In the example above it's tracking int
, but you can track more types by simply overloading intrusive_ptr_add_ref()
and intrusive_ptr_release()
.
⚠️NOTE️️️⚠️
The book mentions that this is useful in cases where the OS or framework requires some cleanup operation once the last instance of some type goes away (e.g. the old school Windows component object model).
There are utility classes that wrap one or more other objects, such as optionals or tuples. They either provide some type of extra functionality or provide abstractions that make code easier to handle and reason about.
The subsections below document some common wrapper classes and their usages.
std::optional
is a wrapper that either holds on to an object or is empty (similar to Java or Python's optional class).
std::optional<int> take(int x) {
if (x < 0) {
return std::nullopt; // nullopt = empty optional
}
return std::optional<int> { x * x };
}
As shown in the example above, the typical usecase for optional is to have a function return an empty optional on failure.
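On the calling side, the optional has to be checked before the object inside it is used. A minimal sketch of the common ways to do that, where `parse_digit` is a hypothetical function in the same spirit as the example above:

```cpp
#include <cassert>
#include <optional>

// Hypothetical function: returns an empty optional on failure.
inline std::optional<int> parse_digit(char c) {
    if (c < '0' || c > '9') {
        return std::nullopt;
    }
    return c - '0';
}

inline void optional_example() {
    auto ok = parse_digit('7');
    assert(ok.has_value() && *ok == 7);  // check first, then dereference like a pointer

    auto bad = parse_digit('x');
    assert(!bad);                        // an optional converts to false when empty
    assert(bad.value_or(-1) == -1);      // fall back to a default when empty

    bool threw = false;
    try {
        (void)bad.value();               // value() on an empty optional throws
    } catch (const std::bad_optional_access&) {
        threw = true;
    }
    assert(threw);
}
```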
⚠️NOTE️️️⚠️
Other strategies for reporting failure are throwing an exception and returning an error code along with the object.
In addition to the optional provided by the C++ standard library, Boost provides its own version of optional, boost::optional
, as well as an optional-like boolean type called tribool: boost::logic::tribool
. A tribool has a third state in addition to true and false, called indeterminate. Boolean operations where one of the operands is a boolean value and the other is indeterminate will always result in false (tribools convert to booleans via implicit conversion).
🔍SEE ALSO🔍
boost::logic::tribool tb { boost::logic::indeterminate };
bool x {tb == true}; // false
bool y {tb == false}; // false
bool z {!tb}; // false
The typical usecase for tribool is for operations that take a long time to complete. A tribool may be set as indeterminate while the operation is running, then be set to true (success) or false (failure) once the operation completes.
std::tuple
is a templated class that holds on to an arbitrary number of elements of arbitrary types. The number of elements and types of elements must be known at compile-time, and any code accessing those elements must know which element it's accessing at compile-time.
std::tuple<int, long, MyClass> my_tuple{ 1, 500L, MyClass{} };
auto& x { std::get<0>(my_tuple) };
auto& y { std::get<1>(my_tuple) };
auto& z { std::get<2>(my_tuple) };
// OR you can use structured binding
auto& [x, y, z] = my_tuple;
Note how the elements are being accessed using std::get()
, which requires the index being accessed to be passed in as a template parameter. Elements your code accesses must be known at compile-time, meaning you can't evaluate some expression at run-time to determine which index to access like you can with tuples in other high-level languages (e.g. Python).
If all the types in a tuple are different, the type itself can be passed into std::get()
.
std::tuple<int, long, MyClass> my_tuple{ 1, 500L, MyClass{} };
auto& x { std::get<int>(my_tuple) };
auto& y { std::get<long>(my_tuple) };
auto& z { std::get<MyClass>(my_tuple) };
⚠️NOTE️️️⚠️
I guess the way to think about tuples is that they're short-hand for PODs. Declaring a tuple is like creating a custom POD where each element of the tuple is a member variable of the POD.
Pairs are special cases of tuples where they're restricted to exactly two elements. Accessing the elements in a pair is done through the first
and second
member variables.
std::pair<int, long> my_pair{ 1, 500L };
auto& x {my_pair.first};
auto& y {my_pair.second};
// OR you can use structured binding
auto& [x, y] = my_pair;
Boost also provides a version of pair, boost::compressed_pair
, except that it's slightly more efficient when either of the template parameters is an empty class.
struct EmptyClass {};
std::pair<int, EmptyClass> p {5, EmptyClass{} };
boost::compressed_pair<int, EmptyClass> cp {5, EmptyClass{} }; // this one consumes less memory
⚠️NOTE️️️⚠️
There's a helper function called std::make_tuple()
/ std::make_pair()
which makes tuples / pairs but has problems when the type is a reference. Be aware of that if you decide to use it. See here.
std::any
is a wrapper that can hold on to an object of unknown type (a type that isn't known at compile-time).
std::any wrapper {};
wrapper.emplace<MyClass>(arg1, arg2);
auto v1 = std::any_cast<MyClass>(wrapper); // ok
auto v2 = std::any_cast<BadType>(wrapper); // throws std::bad_any_cast
To place an object into the wrapper, use emplace()
. This creates a new object and places it into the wrapper, destroying the object previously held. The type of the object is passed in as a template parameter argument while the function arguments are used to initialize that object (e.g. passed directly to constructor).
Accessing the object within the wrapper is done via std::any_cast()
. The function argument is the wrapper itself and the template parameter argument is the type of object you think is being held. If the object is of a different type, the function throws std::bad_any_cast
instead of returning the object.
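The sketch below walks through the same operations with `std::string` standing in for the held type. It also shows the pointer form of `std::any_cast()`, which returns `nullptr` on a type mismatch instead of throwing.

```cpp
#include <any>
#include <cassert>
#include <string>
#include <typeinfo>

inline void any_example() {
    std::any wrapper {};
    assert(!wrapper.has_value());

    wrapper.emplace<std::string>(3, 'x');  // constructs std::string(3, 'x') inside the wrapper
    assert(std::any_cast<std::string>(wrapper) == "xxx");

    // Pointer form: pass a pointer to the wrapper, get nullptr back on a mismatch.
    assert(std::any_cast<int>(&wrapper) == nullptr);
    assert(std::any_cast<std::string>(&wrapper) != nullptr);

    bool threw = false;
    try {
        (void)std::any_cast<int>(wrapper);  // value form throws on a mismatch
    } catch (const std::bad_any_cast&) {
        threw = true;
    }
    assert(threw);

    wrapper = 42;                           // replaces the string with an int
    assert(wrapper.type() == typeid(int));
}
```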
⚠️NOTE️️️⚠️
The closest Java analog I could think of is the base class hierarchy where all Java objects have to derive from java.lang.Object
. You can accept a type of java.lang.Object
and cast it at runtime to the correct type (or one of its ancestors). The C++ any class provides similar functionality to that.
Boost also provides a version of this wrapper, boost::any
.
↩PREREQUISITES↩
std::variant
is a type-restricted form of std::any
. Whereas with std::any
you can hold on to an object of any type, with std::variant
you can hold on to an object of one of several predefined types.
std::variant<int, float, MyClass> wrapper {}; // may hold on to either int, float, or MyClass
wrapper.emplace<MyClass>(arg1, arg2);
auto v1 = std::get<MyClass>(wrapper); // ok
auto v2 = std::get_if<MyClass>(&wrapper); // ok -- takes a pointer to the variant, returns a pointer to the held MyClass
auto v3 = std::get_if<int>(&wrapper); // returns nullptr
auto v4 = std::get<int>(wrapper); // throws std::bad_variant_access
auto which_type = wrapper.index(); // returns 2
To determine which of the allowed types is currently held, use index()
.
To access data, use std::get()
where the template parameter argument is the type you're interested in (similar to how data access is done in std::tuple
). If the variant isn't holding an object of the type trying to be extracted, std::get()
will throw std::bad_variant_access
. To avoid an exception, use std::get_if()
-- it takes a pointer to the variant and will return nullptr
rather than throw an exception.
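A minimal sketch of the access functions side by side, with `std::string` standing in for a class type. `std::holds_alternative()` isn't mentioned above but is the standard library's way to test for a type without extracting anything.

```cpp
#include <cassert>
#include <string>
#include <variant>

inline void variant_access_example() {
    std::variant<int, float, std::string> wrapper {}; // starts off holding an int of 0
    assert(wrapper.index() == 0 && std::get<int>(wrapper) == 0);

    wrapper.emplace<std::string>("hello");
    assert(wrapper.index() == 2);
    assert(std::holds_alternative<std::string>(wrapper));

    // std::get_if takes a POINTER to the variant and returns a pointer to the held object.
    if (auto* s = std::get_if<std::string>(&wrapper)) {
        assert(*s == "hello");
    }
    assert(std::get_if<int>(&wrapper) == nullptr);    // wrong type: nullptr, no exception

    bool threw = false;
    try {
        (void)std::get<int>(wrapper);                 // wrong type: exception
    } catch (const std::bad_variant_access&) {
        threw = true;
    }
    assert(threw);
}
```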
Unlike std::any
, std::variant
cannot be left unset (it must hold on to an object). Initially, it creates and holds on to an object of the first type in its allowed types list. That means the first type in its allowed types list must be constructible with empty initializer arguments (e.g. default constructor). In the example above, the first allowed type is int
, meaning that the variant starts off by holding on to an int
created using an empty initializer (will have value of 0).
The easiest way to work around this problem is to set the first type to std::monostate
. This allows your variant to be unset. Trying to call std::get()
on an unset variant throws std::bad_variant_access
.
std::variant<std::monostate, int, float, MyClass> wrapper {}; // may hold on to nothing, int, float, or MyClass
auto which_type = wrapper.index(); // returns 0
auto v1 = std::get<MyClass>(wrapper); // throws std::bad_variant_access
If you have a set of single parameter functions with the same name (function overload), where those parameters cover all the types in a variant's allowed types list, you can use std::visit()
to automatically pull out the object contained in the variant and call the appropriate function overload with that object as the argument.
std::variant<int, float> wrapper { 0 }; // may hold on to either int, float
auto res1 { std::visit([](auto& x) { return 5 * x; }, wrapper) }; // call into a generic lambda / functor
auto res2 { std::visit([](auto &x) { return my_function(x); }, wrapper) }; // call into an overloaded free function (via a generic lambda / functor)
In addition to the usage above, std::visit()
is typically used in two other ways. The first is to pass in a functor with function call operator overloads for each type in the variant.
struct MyFunctor {
void operator()(int& x) { std::cout << "int" << x; }
void operator()(float& x) { std::cout << "float" << x; }
void operator()(auto& x) { std::cout << "OTHER"; }
};
std::variant<int, float> wrapper { 0 };
std::visit(MyFunctor{}, wrapper);
The second is to use the "overload trick". The "overload trick" is a templated class that combines multiple callable units into a single functor, where that single functor encompasses all of the overloads of those callable units.
template<class... Ts>
struct overloaded : Ts... {
using Ts::operator()...;
};
template<class... Ts>
overloaded(Ts...) -> overloaded<Ts...>; // line not needed in C++20...
std::variant<int, float> wrapper { 0 };
std::visit(
overloaded {
[](int& x) { std::cout << "int" << x; },
[](float& x) { std::cout << "float" << x; },
[](auto &) { std::cout << "OTHER"; }
},
wrapper
);
⚠️NOTE️️️⚠️
std::visit()
can take in multiple std::variant
objects. The callable unit passed in needs to have overloads for the cartesian product of the variant types (param at index n covers the types of the variant at index n). For example, calling std::visit()
with std::variant<int, float>
and std::variant<char, long>
requires that the callable unit have overloads for (int, char), (int, long), (float, char), and (float, long).
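A sketch of that two-variant case, using the "overload trick" to supply all four overloads. The function name `describe_pair` is made up for illustration, and every overload returns the same type because `std::visit()` requires a single return type.

```cpp
#include <cassert>
#include <string>
#include <variant>

template <class... Ts>
struct overloaded : Ts... {
    using Ts::operator()...;
};
template <class... Ts>
overloaded(Ts...) -> overloaded<Ts...>;

// Hypothetical helper: one overload per (first variant type, second variant type) combination.
inline std::string describe_pair(const std::variant<int, float>& a,
                                 const std::variant<char, long>& b) {
    return std::visit(
        overloaded {
            [](int, char)   { return std::string { "int,char" }; },
            [](int, long)   { return std::string { "int,long" }; },
            [](float, char) { return std::string { "float,char" }; },
            [](float, long) { return std::string { "float,long" }; }
        },
        a, b
    );
}

inline void multi_visit_example() {
    std::variant<int, float> a { 1 };
    std::variant<char, long> b { 'c' };
    assert(describe_pair(a, b) == "int,char");

    a = 2.5f;
    b = 3L;
    assert(describe_pair(a, b) == "float,long");
}
```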
Boost also provides a version of this wrapper, boost::variant
.
🔍SEE ALSO🔍
std::function
is a standardized wrapper for function-like objects.
void print_num(int i){
std::cout << i << '\n';
}
struct PrintNum {
void operator()(int i) const {
std::cout << i << '\n';
}
};
std::function<void(int)> f1 { print_num };
std::function<void(int)> f2 { PrintNum{} };
🔍SEE ALSO🔍
The typical use-case for std::function
is to provide a function with a unified way to accept all function-like objects (e.g. functors and function pointers) as a parameter. The alternative would be to explicitly provide an overload for each function-like object type.
void call_func_with_42(std::function<void(int)> func) {
func(42);
}
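To make the "unified way" concrete, the sketch below passes a free function, a functor, and a capturing lambda through the same `std::function` parameter. The names `add_one`, `Doubler`, and `apply_to_42` are hypothetical, and the callables return a value (rather than print) so the result can be checked.

```cpp
#include <cassert>
#include <functional>
#include <vector>

inline int add_one(int i) { return i + 1; }

struct Doubler {
    int operator()(int i) const { return i * 2; }
};

// A single parameter type accepts every kind of function-like object.
inline int apply_to_42(const std::function<int(int)>& func) {
    return func(42);
}

inline void function_example() {
    int offset { 10 };
    std::vector<std::function<int(int)>> callbacks {
        add_one,                                // function pointer
        Doubler{},                              // functor
        [offset](int i) { return i - offset; }  // lambda with a capture
    };
    assert(apply_to_42(callbacks[0]) == 43);
    assert(apply_to_42(callbacks[1]) == 84);
    assert(apply_to_42(callbacks[2]) == 32);

    std::function<int(int)> empty {};
    assert(!empty);  // an unset std::function converts to false (calling it would throw std::bad_function_call)
}
```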
std::bind()
takes in a function-like object and hardcodes some of its parameters via a new proxy functor.
int add_numbers(int i, int j, bool print) {
int res { i + j };
if (print) {
std::cout << (i + j) << '\n';
}
return res;
}
auto add_to_47 = std::bind(
add_numbers,
std::placeholders::_1,
47,
std::placeholders::_2
); // add_to_47() proxies add_numbers(), but hardcodes its 2nd param to 47.
add_to_47(3, true); // prints "50"
The first parameter of std::bind()
is the function-like object to proxy, while subsequent parameters define parameter mappings. Specifically, if the argument for a subsequent parameter is ...
std::placeholders::_N
, the proxy functor's N
th parameter is forwarded to the original function-like object's parameter at that position (e.g. the 2nd parameter of add_to_47()
is mapped to the 3rd parameter of add_numbers()
).47
is mapped to the 2nd parameter of add_numbers()
).⚠️NOTE️️️⚠️
Testing on godbolt, the proxy functor generated by std::bind
requires at least as many parameters as the maximum placeholder value. So for example, if you only used std::placeholders::_1
and std::placeholders::_5
, the proxy functor still requires that you supply something for the 2nd, 3rd, and 4th parameters of the proxy functor even though they aren't mapped to anything. You can also supply arguments for a 6th, 7th, 8th, 9th, ... parameter, even though they aren't mapped using std::placeholders
, and it'll be fine. You just can't supply fewer than 5.
For situations where all hardcoded parameters are at the front (contiguous), std::bind_front()
is a more terse alternative. The first parameter of std::bind_front()
is the function-like object to proxy, while subsequent parameters list out the hardcoded mappings for parameters 1, 2, 3, etc.. All remaining parameters will implicitly become std::placeholders::_N
s that auto-increment from 1.
int add_numbers(int i, int j, bool print) {
int res { i + j };
if (print) {
std::cout << (i + j) << '\n';
}
return res;
}
auto add_to_47 = std::bind_front(add_numbers, 47); // add_to_47() proxies add_numbers(), but hardcodes its 1st param to 47.
add_to_47(3, true); // prints "50"
⚠️NOTE️️️⚠️
Why not just use a lambda instead of std::bind()
/ std::bind_front()
? Probably because this is more terse. With a lambda, you likely will need auto &&
for each parameter, decltype(auto)
for the return, and std::forward
s for each argument being pushed into the original function-like object.
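For comparison, here is a sketch of the same proxy written both ways. When the parameter types are known up front, the lambda doesn't need the forwarding machinery mentioned above. The `add_numbers` here is a variant of the earlier one: its third parameter is a hypothetical `negate` flag instead of `print`, so the result can be checked without output.

```cpp
#include <cassert>
#include <functional>

inline int add_numbers(int i, int j, bool negate) {
    int res { i + j };
    return negate ? -res : res;
}

inline void bind_vs_lambda_example() {
    using namespace std::placeholders;

    // Proxy via std::bind: 2nd parameter hardcoded to 47.
    auto bound = std::bind(add_numbers, _1, 47, _2);

    // Equivalent proxy via a lambda with explicit parameter types.
    auto lambda = [](int i, bool negate) { return add_numbers(i, 47, negate); };

    assert(bound(3, false) == 50);
    assert(lambda(3, false) == 50);
    assert(bound(3, true) == -50);
    assert(lambda(3, true) == -50);

    // std::bind proxies silently drop surplus arguments; the lambda would reject this call.
    assert(bound(3, false, "ignored") == 50);
}
```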
↩PREREQUISITES↩
🔍SEE ALSO🔍
std::reference_wrapper
is a wrapper that holds a reference to an object. This is important because, in C++, you can't have a reference to a reference like you can have a pointer to a pointer. References, from a usage perspective, are treated as if they're the direct object themselves.
int ** x { ... }; // OK: x is a pointer to a pointer to an integer
int && y { ... }; // BAD: y is an rvalue reference, NOT a reference to a reference
std::reference_wrapper<std::reference_wrapper<int>> z{ ... }; // OK: z is a reference wrapper to a reference wrapper
To create a std::reference_wrapper
, use std::ref()
. To access the value referenced to by a reference wrapper, use get()
.
const int a { 5 };
const std::reference_wrapper<const int> aWrapped{ std::ref(a) };
const int b { aWrapped.get() };
const int c { aWrapped }; // This can also work because std::reference_wrapper provides implicit conversion
std::reference_wrapper
s are especially useful for containers. Normally, containers won't allow you to store references. The only options are to store either full objects or pointers to objects.
std::vector<int> vec1 {}; // OK: stores ints
std::vector<int *> vec2 {}; // OK: stores int pointers
std::vector<int &> vec3 {}; // BAD: not allowed
While storing pointers seems like a good alternative to storing references, certain container types may not work as expected with pointers. For example, unordered associative containers like std::unordered_set
will use the default pointer template specializations for std::hash
and std::equal_to
, meaning that the container determines equality by inspecting the pointer rather than the object it points to.
int a {0}; int b {1}; int c {1}; int d {1}; int e {1};
// The following outputs 1 0
std::unordered_set<int> vec2 { a, b, c, d, e };
for (auto e : vec2) {
std::cout << e << ' ';
}
std::cout << std::endl;
// The following outputs 1 1 1 1 0
std::unordered_set<int *> vec1 { &a, &b, &c, &d, &e };
for (auto e : vec1) {
std::cout << *e << ' ';
}
std::cout << std::endl;
Using std::reference_wrapper
allows for the template specializations of the object itself to be used. The only caveat is that the template specializations need to be declared directly in the container type.
// The following outputs 1 0
std::unordered_set<
std::reference_wrapper<int>, // type to store
std::hash<int>, // std::hash specialization to use
std::equal_to<int> // std::equal_to specialization to use
> vec3 { a, b, c };
for (auto e : vec3) {
std::cout << e.get() << ' ';
}
std::cout << std::endl;
// HOW CAN THE ABOVE WORK if the stored type is std::reference_wrapper<int> but hash/equal_to accept only int? Recall that
// std::reference_wrapper<int> can implicitly convert to its underlying type. For example, the following is equivalent ...
int & x { (*vec3.begin()).get() };
int & y { *vec3.begin() }; // references same object as X
// As such, when std::hash<int> / std::equal_to<int> get invoked, they get passed in a std::reference_wrapper<int> which
// implicitly converts to int.
Common pattern for different container types:
std::vector<std::reference_wrapper<T>>
std::unordered_set<std::reference_wrapper<K>, std::hash<K>, std::equal_to<K>>
std::unordered_map<std::reference_wrapper<K>, V, std::hash<K>, std::equal_to<K>>
std::set<std::reference_wrapper<K>, std::less<K>>
std::map<std::reference_wrapper<K>, V, std::less<K>>
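As a rough sketch of the ordered-set pattern, the following stores references to the ints of an existing vector while comparing by the referenced values. The function name `distinct_values` is made up for illustration.

```cpp
#include <functional>
#include <set>
#include <vector>

// Collects the distinct values found in `source`.
// The set stores references (not copies), but std::less<int> compares the
// referenced ints because std::reference_wrapper<int> implicitly converts to int &.
std::vector<int> distinct_values(std::vector<int> & source) {
    std::set<std::reference_wrapper<int>, std::less<int>> refs {};
    for (int & item : source) {
        refs.insert(std::ref(item)); // rejected if an equivalent value is already referenced
    }
    std::vector<int> out {};
    for (const std::reference_wrapper<int> & ref : refs) {
        out.push_back(ref.get());
    }
    return out;
}
```

The referenced objects must outlive the container, and they shouldn't be modified in a way that changes their ordering while the container refers to them.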
⚠️NOTE️️️⚠️
Another option is to go ahead and use pointers, but rather than specifying std::hash<K>
/ std::equal_to<K>
/ std::less<K>
in the template parameters, create a custom functor that accesses data on the object being pointed to.
struct custom_less_functor {
constexpr bool operator()(const MyObject* lhs, const MyObject* rhs) const {
// order by val1 first, then by val2 (ordered containers require a strict weak ordering)
return lhs->val1 < rhs->val1 || (lhs->val1 == rhs->val1 && lhs->val2 < rhs->val2);
}
};
std::set<MyObject*, custom_less_functor> s { ... };
↩PREREQUISITES↩
std::invoke()
is an implementation of a concept in the C++ standard library known as "INVOKE". The concept defines how generic library functions treat arguments that are callbacks, where those callbacks may not be callable units at all. At its simplest, std::invoke()
will simply invoke whatever callable unit you give it with whatever arguments you give to it.
auto x { [](int a, int b) { return a + b; }};
auto res { std::invoke(x, 3, 2) }; // equivalent to calling x directly: "x(3,2)"
When passed a member function, the first argument is expected to be this
while the remaining arguments are the member function's arguments. It doesn't matter if this
is a reference, a pointer, or a smart pointer. It'll work regardless.
struct X {
int run(int a, int b) { return a + b; }
};
X x{};
auto res1 { std::invoke(&X::run, x, 3, 2) }; // OKAY - this is a reference
auto res2 { std::invoke(&X::run, std::ref(x), 3, 2) }; // OKAY - this is a reference_wrapper
auto res3 { std::invoke(&X::run, &x, 3, 2) }; // OKAY - this is a pointer
When passed a pointer to a data member, a reference to the data itself is returned. Just like with member functions, the first argument is expected to be this
.
struct X {
int val { 55 };
};
X x{};
auto res1 { std::invoke(&X::val, x) }; // OKAY - this is a reference
auto res2 { std::invoke(&X::val, std::ref(x)) }; // OKAY - this is a reference_wrapper
auto res3 { std::invoke(&X::val, &x) }; // OKAY - this is a pointer
⚠️NOTE️️️⚠️
Be careful with data members that are invokable (e.g. a member variable that's of type std::function
). It will not get invoked, and sometimes this leads to things being silently ignored. See here.
Here's a good blurb going over why std::invoke()
exists.
The type traits library has std::invoke_result<...>::type
, which gives the return type of some "INVOKE" expression.
struct X {
int run(int a, int b) { return a + b; }
};
X x{};
// make sure type_traits header is included or else std::invoke_result won't work
using ret_t = std::invoke_result<int (X::*)(int, int), std::reference_wrapper<X>, int, int>::type;
ret_t r { std::invoke(&X::run, std::ref(x), 3, 2) };
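To illustrate why a generic function would go through std::invoke() rather than calling its callback directly, here's a minimal sketch. The names `Person` and `project` are invented for this example. The same function accepts a lambda, a pointer to a member function, or a pointer to a data member.

```cpp
#include <functional>
#include <string>
#include <type_traits>
#include <vector>

struct Person {
    std::string name;
    int age;
    int age_next_year() const { return age + 1; }
};

// Applies `proj` to every element via std::invoke, so `proj` may be a lambda,
// a pointer to a member function, or a pointer to a data member.
template <typename T, typename Proj>
auto project(const std::vector<T> & items, Proj proj) {
    // std::invoke on a data member pointer yields a reference, so decay it to a value type
    using result_t = std::decay_t<std::invoke_result_t<Proj &, const T &>>;
    std::vector<result_t> out {};
    out.reserve(items.size());
    for (const T & item : items) {
        out.push_back(std::invoke(proj, item));
    }
    return out;
}
```

Calling `proj(item)` directly would only compile for the lambda case. Going through std::invoke() is what lets `&Person::age` and `&Person::age_next_year` work as well.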
↩PREREQUISITES↩
C++'s version of Java collections are commonly referred to as containers. C++ containers come in 3 major types:
Sequence containers - Objects organized in a sequence, where they're (at least) accessible from one end to the other.
C++ container | nearest Java equivalent |
---|---|
std::array | standard Java array |
std::vector | ArrayList |
std::deque | Deque (an interface -- the C++ class is an implementation) |
std::list | LinkedList (doubly-linked list) |
std::forward_list | LinkedList (singly-linked list) |
Associative containers - Objects organized by key and potentially a value (keys sorted, requiring a comparison function).
C++ container | nearest Java equivalent |
---|---|
std::set | TreeSet |
std::map | TreeMap |
std::multiset | TreeBag (Apache Commons Collections) |
std::multimap | TreeMultimap (Guava) |
Unordered associative containers - Objects organized by key and potentially a value (keys unsorted).
C++ container | nearest Java equivalent |
---|---|
std::unordered_set | HashSet |
std::unordered_map | HashMap |
std::unordered_multiset | HashBag (Apache Commons Collections) |
std::unordered_multimap | ArrayListValuedHashMap (Apache Commons Collections) |
🔍SEE ALSO🔍
Containers support iterators via their begin()
and end()
functions. Looping over a container using a for-each loop calls those functions, but the order in which containers are iterated over depends on the container. For example ...
- std::vector maintains insertion order.
- std::map iterates in sort order.
- std::unordered_map iterates in some unknown order (unordered).
std::vector<MyObject> container { /* items here */ };
for (MyObject &obj : container) {
// do something with value here
}
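As a small illustration of iteration order depending on the container, the sketch below (function name is hypothetical) inserts keys into a std::map in one order and reads them back in iteration order, which is sorted-key order rather than insertion order.

```cpp
#include <map>
#include <string>
#include <vector>

// Returns the keys of a std::map in iteration order. For std::map that is
// sorted-key order, regardless of the order the keys were inserted in.
std::vector<std::string> keys_in_iteration_order(const std::vector<std::string> & insertion_order) {
    std::map<std::string, int> m {};
    int position { 0 };
    for (const std::string & key : insertion_order) {
        m.emplace(key, position++);
    }
    std::vector<std::string> out {};
    for (const auto & entry : m) { // for-each calls begin() / end() behind the scenes
        out.push_back(entry.first);
    }
    return out;
}
```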
Most (not all) containers allow using user-supplied allocators via a template argument.
std::vector<MyObject, CustomAllocator> container { };
The subsections below detail some of the containers mentioned above. Other major libraries provide more specialized containers (boost, abseil, etc..), but those containers aren't detailed here.
Sequential containers organize objects as a sequence, where they're (at least) accessible from one end to the other. The container may or may not be dynamically sized (able to grow and shrink), and the underlying data structure differs from one sequential container to the next.
The subsections below detail the various sequential containers that are provided by the C++ standard library.
std::array
is a container that's more-or-less a wrapper around a normal C++ array. Like normal C++ arrays, it ...
One caveat to this container is that it does not allocate a dynamic object array. That means, just like normal C++ arrays created on the stack, the number of elements must be known at compile-time.
int my_arr1[55] {}; // int array of size 55
std::array<int, 55> my_arr2 {}; // std::array of ints, with size 55
for (auto &obj : my_arr2) {
// do something with value here
}
std::array
provides copy semantics and move semantics. However, because the underlying array is a local object, both moving and copying end up recreating that underlying array. This means that copying and moving may potentially be expensive.
std::array<int, 55> my_arr3 { std::move(my_arr2) }; // move my_arr2 into my_arr3
To read elements, use the subscript operator ([]) or at()
. The main difference between the two is that the latter has bounds checking. Alternatively, ...
- front() may be used as shorthand to get the first element.
- back() may be used as shorthand to get the last element.
- std::get() may be used to read a random element so long as the index being read is known at compile-time (does bounds checking at compile-time).
int w { my_arr2[20] };
int x { my_arr2.at(20) };
int y { my_arr2.at(1000) }; // throws std::out_of_range
int z { std::get<20>(my_arr2) };
int a { my_arr2.front() }; // WARNING: undefined behaviour if len is 0
int b { my_arr2.back() }; // WARNING: undefined behaviour if len is 0
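The difference between the checked and unchecked accessors can be sketched as below. The helper `try_read` is made up for this example: it leans on at() throwing std::out_of_range and turns that into an empty std::optional.

```cpp
#include <array>
#include <cstddef>
#include <optional>
#include <stdexcept>

// Reads an element with bounds checking, converting the std::out_of_range
// exception thrown by at() into an empty std::optional.
template <std::size_t N>
std::optional<int> try_read(const std::array<int, N> & arr, std::size_t idx) {
    try {
        return arr.at(idx);
    } catch (const std::out_of_range &) {
        return std::nullopt;
    }
}
```

The subscript operator performs no such check, so an out-of-range index there is undefined behaviour rather than an exception.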
To replace elements, use any of the same functions used for reading elements (std::get() included, so long as the index is known at compile-time). They return a reference, which means assigning something to them will assign into the container.
my_arr2[20] = 123;
my_arr2.at(20) = 123;
my_arr2.front() = 123;
my_arr2.back() = 123;
auto & ref = my_arr2[20];
ref = 123;
To get the size, use size()
.
std::size_t len { my_arr2.size() }; // size() returns an unsigned type (std::size_t)
⚠️NOTE️️️⚠️
size()
and max_size()
are equivalent for std::array
, but not for other containers that can grow / shrink.
To gain access to the underlying array being wrapped, use data()
.
int * backing_arr = my_arr2.data();
// NOTE: each of the below is equivalent to the above, but the one above should be preferred
// because the ones below have undefined behaviour (or, for at(), throw) if the array length is 0.
int * backing_arr = &my_arr2[0];
int * backing_arr = &my_arr2.at(0);
int * backing_arr = &my_arr2.front();
To iterate over the elements, use begin()
and end()
.
// RECALL: for-each loop will implicitly call begin() and end()
for (auto &obj : my_arr2) {
// do something with value here
}
⚠️NOTE️️️⚠️
There's also ..
⚠️NOTE️️️⚠️
WARNING: The book is saying that there is no hard requirement for a container to return copies vs references. Most of the time a container returns references, but in special cases it may return a copy of some object. For example, vector<bool>
has a template specialization that returns a proxy object rather than a direct reference (std::vector<bool>::reference
).
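A minimal sketch of the proxy behaviour described above. The function name `write_through_proxy` is made up, and the static_asserts only restate what the standard library guarantees about the return type of operator[].

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// For most element types, operator[] returns a true reference ...
static_assert(std::is_same_v<decltype(std::declval<std::vector<int> &>()[0]), int &>);
// ... but std::vector<bool> returns a proxy object by value instead of bool &.
static_assert(std::is_same_v<decltype(std::declval<std::vector<bool> &>()[0]), std::vector<bool>::reference>);
static_assert(!std::is_same_v<std::vector<bool>::reference, bool &>);

// `auto` deduces the proxy type here, so writing through what looks like a
// copy still changes the element stored in the container.
bool write_through_proxy() {
    std::vector<bool> flags { false };
    auto elem { flags[0] }; // std::vector<bool>::reference, not bool
    elem = true;
    return flags[0];
}
```

With any other element type, `auto elem { container[0] };` would have produced an independent copy, which is why generic code that assumes copy semantics can be surprised by std::vector<bool>.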
std::vector
is a container that holds on to its elements sequentially and contiguously in memory (array), but it can dynamically size itself (e.g. expand the internal array if not enough room is available to add a new element). It has most of the same functions as std::array
, in addition to some others.
To create an std::vector
primed with a sequence of values known at compile-time, use typical braced initialization.
std::vector<int> my_vec1 { 5, 5, 5, 5, 5, 5, 5, 5 };
To create an std::vector
without priming it directly to a sequence of values, you can't use braced initialization or brace-plus-equals initialization. You must use parentheses.
std::vector<int> my_vec2 (8, 5); // same as initializing to above (8 copies of 5)
std::vector<int> my_vec3 (c); // copy another std::vector<int>
std::vector<int> my_vec4 (c.begin(), c.begin() + 10); // copy first 10 elems from another container
⚠️NOTE️️️⚠️
The rules for initialization are complex. In this case, there's a constructor that takes in an std::initializer_list
. That means braced initialization / brace-plus-equals initialization will in most cases call that constructor, where that initializer list gets populated with whatever is in the braces. To avoid that, the easiest thing you can do is fall back to using the legacy way of calling constructors (parenthesis).
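The difference between the two styles is easiest to see by passing the same arguments both ways. This is only a sketch, and the function name is invented for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Returns the sizes produced by the two initialization styles for the same arguments.
std::pair<std::size_t, std::size_t> sizes_braces_vs_parens() {
    std::vector<int> braces { 8, 5 }; // initializer_list constructor: the two elements 8 and 5
    std::vector<int> parens (8, 5);   // count/value constructor: eight copies of 5
    return { braces.size(), parens.size() };
}
```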
std::vector
provides copy semantics and move semantics. Because elements are dynamic objects, moving one std::vector
into another is fast because it's simply passing off a pointer / reference. Copying can potentially be expensive.
std::vector<int> my_vec5 { std::move(my_vec1) }; // move my_vec1 into my_vec5
Similarly, because std::vector
's elements are created as dynamic objects, you have the option of supplying a custom allocator.
CustomAllocator allocator {};
std::vector<int, CustomAllocator> my_vec6 (allocator);
To read elements, the same read functions for std::array
are available here.
int w { my_vec1[5] };
int x { my_vec1.at(5) };
int y { my_vec1.at(1000) }; // throws std::out_of_range
int a { my_vec1.front() }; // WARNING: undefined behaviour if len is 0
int b { my_vec1.back() }; // WARNING: undefined behaviour if len is 0
⚠️NOTE️️️⚠️
std::get()
does not work here. It needs the size to be part of the type (as with std::array), while the size of a std::vector is only known at run-time.
To replace elements, the same write functions for std::array
are available here. Those functions are the same functions used for reading elements. They return a reference, which means assigning something to them will assign into the container.
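my_vec1[5] = 123; // index must be less than size(), otherwise undefined behaviour
my_vec1.at(5) = 123; // index must be less than size(), otherwise throws std::out_of_range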
my_vec1.front() = 123;
my_vec1.back() = 123;
To add elements, the following functions are available:
- insert() will insert an element just before some iterator position (object copied / moved).
- emplace() will insert an element just before some iterator position by creating it directly (no copying / moving).
- push_back() will append an element (object copied / moved).
- emplace_back() will append an element by creating it directly (no copying / moving).
auto it1 = my_vec1.begin() + 3;
my_vec1.insert(it1, 77); // WARNING: it1 invalid after this call
auto it2 = my_vec1.begin() + 3;
my_vec1.emplace(it2, 77); // WARNING: it2 invalid after this call
my_vec1.push_back(123);
my_vec1.emplace_back(123);
⚠️NOTE️️️⚠️
emplace()
/ emplace_back()
don't copy or move because you pass in initialization arguments directly into the functions. Internally, they use template parameter packs to forward arguments for object creation (e.g. constructor arguments, initializer list, etc..).
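To make the "no copying / moving" claim concrete, here's a hedged sketch using a made-up type (`Tracked`) that counts how often its copy / move constructors run. Capacity is reserved up-front so that reallocation doesn't add to the count.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical type that counts how many times it gets copied or moved.
struct Tracked {
    static inline int copies_or_moves { 0 };
    int id;
    std::string label;
    Tracked(int id_, std::string label_) : id { id_ }, label { std::move(label_) } {}
    Tracked(const Tracked & other) : id { other.id }, label { other.label } { ++copies_or_moves; }
    Tracked(Tracked && other) noexcept : id { other.id }, label { std::move(other.label) } { ++copies_or_moves; }
};

int count_transfers_push_back() {
    Tracked::copies_or_moves = 0;
    std::vector<Tracked> v {};
    v.reserve(1);
    v.push_back(Tracked { 1, "a" }); // a temporary is built, then moved into the vector
    return Tracked::copies_or_moves;
}

int count_transfers_emplace_back() {
    Tracked::copies_or_moves = 0;
    std::vector<Tracked> v {};
    v.reserve(1);
    v.emplace_back(1, "a"); // arguments are forwarded to the constructor, object built in place
    return Tracked::copies_or_moves;
}
```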
To delete either a single element or a range of elements, use erase() and pass into it either an iterator at some position or an iterator range.
auto it1 {my_vec1.begin() + 3};
my_vec1.erase(it1); // WARNING: it1 invalid after this call
auto it2 {my_vec1.begin()};
auto it3 {my_vec1.begin() + 10};
my_vec1.erase(it2, it3); // WARNING: it2/it3 invalid after this call
To delete the last element, use pop_back(). Unlike back(), it doesn't return the element, so read the element with back() first if its value is needed.
int c { my_vec1.back() }; // read the last element
my_vec1.pop_back(); // REMOVES the last element (returns nothing)
To delete all elements, use clear()
.
my_vec1.clear();
To delete all elements and re-assign to a list of new elements, use assign()
.
my_vec1.assign(5, 10); // 5 copies of 10
my_vec1.assign({5,5,5,5,5}); // 5 copies of 5 (taken from an initializer list)
my_vec1.assign(c.begin(), c.begin() + 5); // starting 5 elements of another container
To get the number of elements, the same functions for std::array
are available here.
auto is_empty1 { my_vec1.size() == 0 };
auto is_empty2 { my_vec1.empty() };
Internally, std::vector
grows in chunks. For example, if the underlying array has a size of 5 and all of those 5 elements are occupied, when you add in a 6th element the underlying array resizes to have a capacity larger than 6 (e.g. 10). This way, you can continue adding in a few more elements without another resize happening right away (more efficient).
To get the current capacity, use capacity()
.
float usage { static_cast<float>(my_vec1.size()) / static_cast<float>(my_vec1.capacity()) }; // cast first to avoid integer division
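If you ...
- know how many elements are coming and want to allocate room for them up-front, use reserve().
- want to ask the container to release unused capacity, use shrink_to_fit().
my_vec1.reserve(1000);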
my_vec1.shrink_to_fit();
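The effect of reserve() on growth can be observed by watching whether the backing array moves while elements are appended. The helper below is only a sketch with an invented name, not a benchmark.

```cpp
#include <cstddef>
#include <vector>

// Appends `count` ints and reports how many times the backing array was recreated.
// When `reserve_first` is true, capacity for all elements is reserved up-front.
std::size_t count_reallocations(std::size_t count, bool reserve_first) {
    std::vector<int> v {};
    if (reserve_first) {
        v.reserve(count);
    }
    std::size_t reallocations { 0 };
    const int * previous { v.data() };
    for (std::size_t i { 0 }; i < count; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.data() != previous) { // backing array moved, so a resize happened
            ++reallocations;
            previous = v.data();
        }
    }
    return reallocations;
}
```

With capacity reserved, no reallocation happens while appending. Without it, the exact number depends on the growth factor used by the standard library implementation.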
Similar to std::array
, you can access the underlying array for an std::vector
. However, the returned array may become invalid as soon as you start performing operations on the owning std::vector
(e.g. it may get recreated due to shrinkage/growth).
To gain access to the underlying array being wrapped, use data()
.
int * backing_arr = my_vec1.data();
// NOTE: each of the below is equivalent to the above, but the one above should be preferred
// because the ones below have undefined behaviour (or, for at(), throw) if the array length is 0.
int * backing_arr = &my_vec1[0];
int * backing_arr = &my_vec1.at(0);
int * backing_arr = &my_vec1.front();
To iterate over the elements, use begin()
and end()
.
// RECALL: for-each loop will implicitly call begin() and end()
for (auto &obj : my_vec1) {
// do something with value here
}
⚠️NOTE️️️⚠️
There's also ..
↩PREREQUISITES↩
std::deque
is a container that holds on to its elements sequentially but not contiguously in memory (not a single array). It can dynamically size itself (e.g. allocate more storage if not enough room is available to add a new element) just like std::vector
and it supports most of the same functions as std::vector
. The most prominent functions it doesn't support:
- data() because there is no single underlying array with this container.
- capacity().
- reserve().
Because of the internal data structure used by this container, the functions it adds for working with the front (push_front(), emplace_front(), pop_front()) are efficient.
To create an std::deque
primed with a sequence of values known at compile-time, use typical braced initialization.
std::deque<int> d1 { 5, 5, 5, 5, 5, 5, 5, 5 };
To create an std::deque
without priming it directly to a sequence of values, you can't use braced initialization or brace-plus-equals initialization. You must use parentheses.
std::deque<int> d2 (8, 5); // same as initializing to above (8 copies of 5)
std::deque<int> d3 (c); // copy another std::deque<int>
std::deque<int> d4 (c.begin(), c.begin() + 10); // copy first 10 elems from another container
std::deque
provides copy semantics and move semantics. Because elements are dynamic objects, moving one std::deque
into another is fast because it's simply passing off a pointer / reference. Copying can potentially be expensive.
std::deque<int> d5 { std::move(d1) }; // move d1 into d5
Similarly, because std::deque
's elements are created as dynamic objects, you have the option of supplying a custom allocator.
CustomAllocator allocator {};
std::deque<int, CustomAllocator> d6 (allocator);
To read elements, the same read functions for std::vector
are available here.
int w { d1[5] };
int x { d1.at(5) };
int y { d1.at(1000) }; // throws std::out_of_range
int a { d1.front() }; // WARNING: undefined behaviour if len is 0
int b { d1.back() }; // WARNING: undefined behaviour if len is 0
To replace elements, the same write functions for std::vector
are available here. Those functions are the same functions used for reading elements. They return a reference, which means assigning something to them will assign into the container.
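d1[5] = 123; // index must be less than size(), otherwise undefined behaviour
d1.at(5) = 123; // index must be less than size(), otherwise throws std::out_of_range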
d1.front() = 123;
d1.back() = 123;
To add elements, the same add functions for std::vector
are available here in addition to...
- emplace_front() - similar to emplace_back() but adds to the front.
- push_front() - similar to push_back() but adds to the front.
auto it1 = d1.begin() + 3;
d1.insert(it1, 77); // WARNING: it1 invalid after this call
auto it2 = d1.begin() + 3;
d1.emplace(it2, 77); // WARNING: it2 invalid after this call
d1.push_back(123);
d1.push_front(123); // similar to push_back, but adds to front
d1.emplace_back(123);
d1.emplace_front(123); // similar to emplace_back, but adds to front
⚠️NOTE️️️⚠️
Recall that "emplace" functions don't copy or move. They're templated functions. You pass in object initialization arguments directly into the functions and it uses a template parameter pack to forward those arguments for object creation directly within the function (e.g. constructor arguments, initializer list, etc..).
To delete elements, the same delete functions for std::vector
are available here in addition to ...
- pop_front() - similar to pop_back() but removes from the front.
// DELETE at idx
auto it1 {d1.begin() + 3};
d1.erase(it1); // WARNING: it1 invalid after this call
// DELETE between idx range
auto it2 {d1.begin()};
auto it3 {d1.begin() + 10};
d1.erase(it2, it3); // WARNING: it2/it3 invalid after this call
// DELETE front or back (neither function returns the removed element, so read it first)
int last { d1.back() };
d1.pop_back();
int first { d1.front() };
d1.pop_front(); // similar to pop_back, but removes from the FRONT
// DELETE all
d1.clear();
// DELETE all and RE-ASSIGN
d1.assign(5, 10); // 5 copies of 10
d1.assign({5,5,5,5,5}); // 5 copies of 5 (taken from an initializer list)
d1.assign(c.begin(), c.begin() + 5); // starting 5 elements of another container
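Since adding and removing at both ends is what sets std::deque apart, a typical use is a FIFO queue. Here's a minimal sketch (function name is made up) where values enter at the back and leave from the front.

```cpp
#include <deque>
#include <vector>

// Uses a std::deque as a FIFO queue: values enter at the back and leave from the front.
std::vector<int> drain_in_fifo_order(const std::vector<int> & arrivals) {
    std::deque<int> queue {};
    for (int value : arrivals) {
        queue.push_back(value);
    }
    std::vector<int> served {};
    while (!queue.empty()) {
        served.push_back(queue.front()); // read first: pop_front() returns nothing
        queue.pop_front();
    }
    return served;
}
```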
To get the number of elements, the same functions for std::vector
are available here.
auto is_empty1 { d1.size() == 0 };
auto is_empty2 { d1.empty() };
To release unused memory that's been reserved by the container, use shrink_to_fit()
. The capacity()
and reserve()
functions found in std::vector
are not present in this container.
To iterate over the elements, use begin()
and end()
.
// RECALL: for-each loop will implicitly call begin() and end()
for (auto &obj : d1) {
// do something with value here
}
⚠️NOTE️️️⚠️
There's also ..
↩PREREQUISITES↩
std::list
is a container that holds on to its elements sequentially. It's implemented as a doubly-linked list, meaning its size is dynamic but it isn't stored contiguously in memory (not an array).
std::list
supports a similar set of functions as std::deque
except for the fact that random element access isn't allowed (the functions for it don't exist -- random access is inefficient with linked lists). You can only access elements by walking either forward or backward. In addition, it provides several built-in helper functions such as sorting and de-duplication.
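As a quick taste of the built-in helpers, sorting followed by de-duplication might look like the sketch below. The function name is invented for illustration.

```cpp
#include <list>

// Sorts the list in place and then drops duplicates using std::list's own helpers.
// unique() only removes *consecutive* duplicates, hence the sort() first.
std::list<int> sorted_unique(std::list<int> values) {
    values.sort();
    values.unique();
    return values;
}
```

The member sort() is needed because std::sort() requires random access iterators, which std::list doesn't provide.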
⚠️NOTE️️️⚠️
List of helper functions is documented near the end.
⚠️NOTE️️️⚠️
There's also std::forward_list
which is a singly-linked list. Its functionality is very similar to this class, but since it only supports walking forward, some of the functions listed here are missing.
To create a std::list
primed with a sequence of values known at compile-time, use typical braced initialization.
std::list<int> l1 { 5, 5, 5, 5, 5, 5, 5, 5 };
To create a std::list
without priming it directly to a sequence of values, you can't use braced initialization or brace-plus-equals initialization. You must use parentheses.
std::list<int> l2 (8, 5); // same as initializing to above (8 copies of 5)
std::list<int> l3 (c); // copy another std::list<int>
std::list<int> l4 (c.begin(), c.begin() + 10); // copy first 10 elems from another (random access) container
std::list
provides copy semantics and move semantics. Because elements are dynamic objects, moving one std::list
into another is fast because it's simply passing off a pointer / reference. Copying can potentially be expensive.
std::list<int> l5 { std::move(l1) }; // move l1 into l5
Similarly, because std::list
's elements are created as dynamic objects, you have the option of supplying a custom allocator.
CustomAllocator allocator {};
std::list<int, CustomAllocator> l6 (allocator);
To read elements, use the iterator functions begin()
and / or end()
to walk the sequence. Alternatively, the front()
and back()
functions give direct access to the first and last elements respectively.
// WARNING: undefined behaviour if len is < 3
auto it = l1.begin();
int a { *it };
it++;
int b { *it };
it++;
int c { *it };
int d { l1.front() }; // WARNING: undefined behaviour if len is 0
int e { l1.back() }; // WARNING: undefined behaviour if len is 0
To replace elements, the same iterator functions begin()
and / or end()
need to be used to walk the sequence to the point of replacement.
// WARNING: undefined behaviour if len is < 3
auto it = std::next(l1.begin(), 2); // list iterators don't support +, so walk with std::next (from <iterator>)
*it = 55;
To add elements, the same add functions for std::deque
are available here.
auto it1 = std::next(l1.begin(), 3); // list iterators don't support +
l1.insert(it1, 77);
auto it2 = std::next(l1.begin(), 3);
l1.emplace(it2, 77);
l1.push_back(123);
l1.push_front(123);
l1.emplace_back(123);
l1.emplace_front(123);
⚠️NOTE️️️⚠️
Adding elements to a std::list does not invalidate any iterators or references. Removing elements only invalidates the iterators and references to the removed elements.
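A small sketch of what iterator validity means in practice for std::list (function name is made up): an iterator taken before several insertions still refers to the same element afterwards.

```cpp
#include <iterator>
#include <list>

// Keeps an iterator to the middle element, then adds elements around it.
// Returns the value read through the iterator afterwards.
int read_after_inserts() {
    std::list<int> values { 10, 20, 30 };
    auto it { std::next(values.begin()) }; // points at 20
    values.push_front(5);
    values.push_back(35);
    values.insert(it, 15); // inserts just before 20
    return *it;            // still 20: std::list insertion doesn't invalidate iterators
}
```

Doing the same with a std::vector would be a bug, since any of those insertions may recreate the backing array.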
⚠️NOTE️️️⚠️
Recall that "emplace" functions don't copy or move. They're templated functions. You pass in object initialization arguments directly into the functions and it uses a template parameter pack to forward those arguments for object creation directly within the function (e.g. constructor arguments, initializer list, etc..).
To delete elements, the same delete functions for std::deque
are available here.
// DELETE at idx
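auto it1 {std::next(l1.begin(), 3)}; // list iterators don't support +
l1.erase(it1); // WARNING: it1 invalid after this call
// DELETE between idx range
auto it2 {l1.begin()};
auto it3 {std::next(l1.begin(), 10)};
l1.erase(it2, it3); // WARNING: iterators to the erased elements (e.g. it2) invalid after this call
// DELETE front or back (neither function returns the removed element, so read it first)
int last { l1.back() };
l1.pop_back();
int first { l1.front() };
l1.pop_front(); // similar to pop_back, but removes from the FRONT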
// DELETE all
l1.clear();
// DELETE all and RE-ASSIGN
l1.assign(5, 10); // 5 copies of 10
l1.assign({5,5,5,5,5}); // 5 copies of 5 (taken from an initializer list)
l1.assign(c.begin(), c.begin() + 5); // starting 5 elements of another container
To get the number of elements, the same functions for std::deque
are available here.
auto is_empty1 { l1.size() == 0 };
auto is_empty2 { l1.empty() };
To iterate over the elements, use begin()
and end()
.
// RECALL: for-each loop will implicitly call begin() and end()
for (auto &obj : l1) {
// do something with value here
}
⚠️NOTE️️️⚠️
There's also ..
std::list
has several helper functions built-in.
- merge() merges another sorted std::list into this sorted std::list, leaving the other one empty.
- splice() transfers a range of elements from one std::list to another.
- remove() searches for and removes all matching elements in a std::list.
- remove_if() removes all elements in a std::list that match some predicate.
- reverse() reverses a std::list (in-place reversal).
- sort() sorts a std::list based on a comparator (in-place sort).
- unique() removes consecutive duplicate elements.
Ordered associative containers organize objects by key and potentially a value. Keys are sorted into a specific order, meaning that a comparison function is required. The underlying data structure used by ordered associative containers is typically a red-black tree.
⚠️NOTE️️️⚠️
The spec doesn't require red-black trees. It only sets complexity requirements (e.g. logarithmic lookup and insertion). From what I've read, red-black trees are how they're typically implemented.
The subsections below detail the various ordered associative containers that are provided by the C++ standard library.
↩PREREQUISITES↩
std::set
is a container that holds on to unique elements in sorted order, where that order is defined by a comparator.
To create a std::set
primed with a sequence of values known at compile-time, use typical braced initialization. Since only unique values are allowed, duplicates will be ignored. By default, the comparator std::less
is used, which uses the less than operator (<) to order two objects.
std::set<int> s1 { 1, 1, 2, 3, 4, 5 };
To create a std::set
without priming it directly to a sequence of values, you can't use braced initialization or brace-plus-equals initialization. You must use parentheses.
std::set<int> s2 (c.begin(), c.begin() + 10); // copy first 10 elems from another container
// CUSTOM COMPARATOR
auto comparator = [] (const int & lhs, const int & rhs) -> bool { return lhs < rhs; };
std::set<int, decltype(comparator)> s3 ({ 1, 1, 2, 3, 4, 5 }, comparator);
std::set<int, std::greater<int>> s4 ({ 1, 1, 2, 3, 4, 5 }, std::greater<int> {}); // std::greater is a template, so name the specialization
std::set
provides copy semantics and move semantics. Because elements are dynamic objects, moving one std::set
into another is fast because it's simply passing off a pointer / reference. Copying can potentially be expensive.
std::set<int> s5 { std::move(s1) }; // move s1 into s5
Similarly, because std::set
's elements are created as dynamic objects, you have the option of supplying a custom allocator.
CustomAllocator allocator {};
std::set<int, std::less<int>, CustomAllocator> s6 (std::less<int> {}, allocator);
To check whether an element exists, the following functions are available:
- find() returns an iterator primed at the position of the found element (returns end() iterator position if not found).
- contains() returns bool (true if it exists, false otherwise). Available since C++20.
- count() returns integer (1 if it exists, 0 otherwise).
bool found1 { s1.find(3) != s1.end() };
bool found2 { s1.contains(3) };
bool found3 { s1.count(3) == 1 };
To find the first element that is greater than or equal (>=) to some value, use lower_bound()
. Similarly, to find the first element that's greater than (>) some value, use upper_bound()
.
auto it1 { s1.lower_bound(4) }; // get iterator to elem 4 if 4 exists, otherwise get iterator to the elem just after 4 (could be s1.end() if no such elem)
auto it2 { s1.upper_bound(4) }; // get iterator to elem just after 4 (could be s1.end() if no such elem)
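Combining the two bounds gives a range query. The sketch below (function name is hypothetical) collects every element within a closed range.

```cpp
#include <set>
#include <vector>

// Collects every element in the closed range [low, high] from a sorted set.
std::vector<int> elements_between(const std::set<int> & s, int low, int high) {
    if (low > high) {
        return {}; // guard: otherwise the two iterators below could end up in the wrong order
    }
    auto first { s.lower_bound(low) }; // first element >= low
    auto last { s.upper_bound(high) }; // first element > high
    return std::vector<int>(first, last);
}
```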
To add an element, the following functions are available:
- insert() either copies or moves into the container (depending on whether the argument passed in is an rvalue).
- emplace() adds an element by creating it directly (no copying / moving).
- emplace_hint() like the above but also takes in an iterator primed to some position as a hint.
s1.insert(6);
s1.emplace(6);
s1.emplace_hint(s1.begin(), 6); // iterator should be near to where the value is
⚠️NOTE️️️⚠️
emplace()
/ emplace_hint()
don't copy or move because you pass in initialization arguments directly into the functions. Internally, they use template parameter packs to forward arguments for object creation (e.g. constructor arguments, initializer list, etc..).
To delete an element at some specific iterator position, use either extract() or erase(). The difference is that extract() will return the element (wrapped in a node handle) while erase() won't.
auto it1 { s1.begin() };
auto node { s1.extract(it1) }; // WARNING: it1 invalid after this point
int res { node.value() }; // the extracted element lives inside the node handle
auto it2 { s1.begin() };
s1.erase(it2); // WARNING: it2 invalid after this point
// DELETE between idx range
auto it3 {s1.begin()};
auto it4 {std::next(s1.begin(), 10)}; // set iterators don't support +
s1.erase(it3, it4); // WARNING: iterators to the erased elements (e.g. it3) invalid after this call
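One place where extract() earns its keep is moving an element between two sets without copying it. The following is a sketch with an invented function name: the node handle returned by extract() is handed straight to the other set's insert().

```cpp
#include <set>
#include <string>
#include <utility>

// Moves the smallest element of `from` into `to` without copying the string:
// extract() detaches the node and insert() re-attaches it to the other set.
// Returns false if `from` is empty or `to` already holds an equivalent element.
bool move_smallest(std::set<std::string> & from, std::set<std::string> & to) {
    if (from.empty()) {
        return false;
    }
    auto node { from.extract(from.begin()) };   // node handle owns the element
    auto result { to.insert(std::move(node)) }; // reports whether the node was accepted
    if (!result.inserted) {
        from.insert(std::move(result.node));    // `to` already had it: put it back
    }
    return result.inserted;
}
```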
To delete all elements, use clear()
.
s1.clear();
To get the number of elements, use size()
. Similarly, use empty()
to check if empty.
auto is_empty1 { s1.size() == 0 };
auto is_empty2 { s1.empty() };
To iterate over the elements, use begin()
and end()
.
// RECALL: for-each loop will implicitly call begin() and end()
for (auto &obj : s1) {
// do something with value here
}
⚠️NOTE️️️⚠️
There's also ..
↩PREREQUISITES↩
std::multiset
is a container that, like a std::set
, holds on to elements in sorted order where that order is defined by a comparator. Unlike std::set
, it can hold on to multiple instances of the same element (elements aren't unique). Rather than using the equality operator (==) to find duplicates, std::multiset
uses the sorting comparator: two objects a and b are considered equivalent if neither compares less than the other: !comp(a, b) && !comp(b, a)
.
⚠️NOTE️️️⚠️
Definition is from cppreference.
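To see comparator-based equivalence in action, consider the following sketch. The comparator `ByLength` and the function name are made up for this example. Strings are ordered by length only, so two different strings of the same length count as the same element as far as the container is concerned.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical comparator: orders strings by length only.
struct ByLength {
    bool operator()(const std::string & lhs, const std::string & rhs) const {
        return lhs.size() < rhs.size();
    }
};

// Counts the stored strings that are *equivalent* to `probe` under ByLength,
// i.e. that have the same length, even though they may differ under ==.
std::size_t count_same_length(const std::multiset<std::string, ByLength> & words, const std::string & probe) {
    return words.count(probe);
}
```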
To create a std::multiset
, the same std::set
constructors apply. The only major difference is that, if you're initializing the values, any duplicate values are kept.
// prime
std::multiset<int> s1 { 1, 1, 2, 3, 4, 5 }; // DUPLICATES RETAINED
// copy range
std::multiset<int> s2 (c.begin(), c.begin() + 10); // copy first 10 elems from another container
// custom comparator
auto comparator = [] (const int & lhs, const int & rhs) -> bool { return lhs < rhs; };
std::multiset<int, decltype(comparator)> s3 ({ 1, 1, 2, 3, 4, 5 }, comparator);
std::multiset<int, std::greater<int>> s4 ({ 1, 1, 2, 3, 4, 5 }, std::greater<int> {});
// copy/move
std::multiset<int> s5 { std::move(s1) }; // move s1 into s5
// custom allocator
CustomAllocator allocator {};
std::multiset<int, std::less<int>, CustomAllocator> s6 (std::less<int> {}, allocator);
Functions are more or less the same as those in std::set
. The only major difference is that count()
returns the number of instances for an element.
// find
bool found1 { s1.find(1) != s1.end() };
bool found2 { s1.contains(1) };
// get number of instances
bool is_two_instances { s1.count(1) == 2 };
// get lower/upper bound
auto it1 { s1.lower_bound(4) }; // get iterator to elem 4 if 4 exists, otherwise get iterator to the elem just after 4 (could be s1.end() if no such elem)
auto it2 { s1.upper_bound(4) }; // get iterator to elem just after 4 (could be s1.end() if no such elem)
// add
s1.insert(6);
s1.emplace(6);
s1.emplace_hint(s1.begin(), 6); // iterator should be near to where the value is
// remove
auto it1 { s1.begin() };
auto node { s1.extract(it1) }; // WARNING: it1 invalid after this point
int res { node.value() }; // the extracted element lives inside the node handle
auto it2 { s1.begin() };
s1.erase(it2); // WARNING: it2 invalid after this point
// remove range
auto it3 {s1.begin()};
auto it4 {std::next(s1.begin(), 10)}; // multiset iterators don't support +
s1.erase(it3, it4); // WARNING: iterators to the erased elements (e.g. it3) invalid after this call
// remove all
s1.clear();
// get size
auto is_empty1 { s1.size() == 0 };
auto is_empty2 { s1.empty() };
// iterate
for (auto &obj : s1) { // RECALL: for-each loop will implicitly call begin() and end()
// do something with value here
}
⚠️NOTE️️️⚠️
There's also ..
↩PREREQUISITES↩
std::map
is a container similar to std::set
, with the major difference being that each element in a std::map
has a secondary value associated with it: key to value. Only the key is used for ordering, uniqueness, and lookup. The value just tags along.
To create a std::map
, the same std::set
constructors apply. The only major differences are that ...
- elements are std::pair<K,V> (K is key type, V is value type).
// prime
std::map<int, float> s0 {
std::pair<int, float> { 1, 99.0f },
std::pair<int, float> { 1, -99.0f }, // WARNING: this is the 2nd instance of the key 1; which value ends up stored for the key is unspecified
std::pair<int, float> { 2, -100.0f },
std::pair<int, float> { 4, 123.0f },
std::pair<int, float> { 5, 4.0f }
};
std::map<int, float> s1 {
{ 1, 99.0f },
{ 1, -99.0f }, // WARNING: this is the 2nd instance of the key 1; which value ends up stored for the key is unspecified
{ 2, -100.0f },
{ 4, 123.0f },
{ 5, 4.0f }
};
// copy range
std::map<int, float> s2 (c.begin(), c.begin() + 10); // copy first 10 key-value pairs from another container
// custom comparator
auto comparator = [] (const int & lhs, const int & rhs) -> bool { return lhs < rhs; };
std::map<int, float, decltype(comparator)> s3 ({ { 1, 99.0f }, { 2, 3.0f }, ... }, comparator);
std::map<int, float, std::greater<int>> s4 ({ { 1, 99.0f }, { 2, 3.0f }, ... }, std::greater<int> {});
// copy/move
std::map<int, float> s5 { std::move(s1) }; // move s1 into s5
// custom allocator
CustomAllocator allocator {};
std::map<int, float, decltype(std::less), CustomAllocator> s6 (std::less, allocator);
To check if an element exists, the following functions are available:
find() returns an iterator primed at the position of the found element (returns end() iterator position if not found).
contains() returns bool (true if it exists, false otherwise). Available since C++20.
count() returns integer (0 is false, 1 is true).
bool found1 { s1.find(3) != s1.end() };
bool found2 { s1.contains(3) };
bool found3 { s1.count(3) == 1 };
Dereferencing an iterator will give back both the key and value as a std::pair<const K,V>
. The key is const because changing it in place would break the ordering.
auto it { s1.find(3) };
std::pair<int, float> found_pair { *it };
To find the first key-value pair where the key is greater than or equal (>=) to some other key, use lower_bound()
. Similarly, to find the first key-value pair where the key is greater than (>) some other key, use upper_bound()
.
auto it1 { s1.lower_bound(4) }; // get iterator to elem 4 if 4 exists, otherwise get iterator to the elem just after 4 (could be s1.end() if no such elem)
auto it2 { s1.upper_bound(4) }; // get iterator to elem just after 4 (could be s1.end() if no such elem)
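As a rough sketch of how the two bounds behave for keys that do and don't exist, the following walks through a few lookups. The function name `bounds_demo` is made up for illustration, and the asserts document the results expected for this particular map.

```cpp
#include <cassert>
#include <map>

// Hypothetical helper for this sketch: exercises lower_bound() / upper_bound().
void bounds_demo() {
    std::map<int, float> m { {1, 99.0f}, {2, -100.0f}, {4, 123.0f}, {5, 4.0f} };

    // Key 4 exists: lower_bound() lands on it, upper_bound() lands just past it.
    assert(m.lower_bound(4)->first == 4);
    assert(m.upper_bound(4)->first == 5);

    // Key 3 doesn't exist: both land on the next larger key.
    assert(m.lower_bound(3)->first == 4);
    assert(m.upper_bound(3)->first == 4);

    // Nothing is greater than 5, so upper_bound() returns end().
    assert(m.upper_bound(5) == m.end());
}
```

Note that the half-open range from lower_bound(k) to upper_bound(k) is empty exactly when k isn't in the map.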
To add a key-value pair (not replace an existing value), the following functions are available:
insert() either copies or moves into the container (depending on if the reference passed in is an rvalue reference).
emplace() adds an element by creating it directly (no copying / moving).
try_emplace() like emplace() but takes the key as its first argument and has special move semantics (does not move template parameter pack rvalue arguments if insertion doesn't happen -- see docs for more info).
emplace_hint() like emplace() but also takes in an iterator primed to some position as a hint.
// NOTE: Each func except emplace_hint() returns a bool (true for insertion, false for already exists) + an iterator to the key-value pair (existing one if not added)
// NOTE: emplace_hint() returns only the iterator
s1.insert({6, 122.0f}); // insert() takes a single key-value pair, not a separate key and value
s1.insert(std::pair<int, float> {6, 122.0f});
s1.emplace(6, 122.0f);
s1.emplace(std::pair<int, float> {6, 122.0f});
s1.try_emplace(6, 122.0f); // key first, then the arguments used to construct the value
s1.emplace_hint(s1.begin(), 6, 122.0f); // iterator should be near to where the value is
s1.emplace_hint(s1.begin(), std::pair<int, float> {6, 122.0f});
To add a key-value pair or replace the value of an already existing key, use insert_or_assign()
. It either copies or moves the value into the container (depending on if the reference passed in is an rvalue reference), overwriting the existing value if the key is already present.
// NOTE: returns a bool (true for insertion, false for assignment) + an iterator to the key-value pair
s1.insert_or_assign(6, 122.0f); // takes the key and the value as separate arguments (there's no std::pair overload)
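The difference between the "add only" functions and insert_or_assign() is easiest to see side by side. The following is a minimal sketch (the function name `insert_vs_assign_demo` is made up) that applies each one to a key that already exists.

```cpp
#include <cassert>
#include <map>

// Hypothetical helper for this sketch: compares insert(), try_emplace(), and insert_or_assign().
void insert_vs_assign_demo() {
    std::map<int, float> m { {6, 1.0f} };

    // insert() and try_emplace() leave an existing key's value untouched.
    auto [it1, inserted1] = m.insert({6, 2.0f});
    assert(!inserted1 && it1->second == 1.0f);
    auto [it2, inserted2] = m.try_emplace(6, 3.0f);
    assert(!inserted2 && it2->second == 1.0f);

    // insert_or_assign() overwrites the value and reports that no insertion happened.
    auto [it3, inserted3] = m.insert_or_assign(6, 4.0f);
    assert(!inserted3 && it3->second == 4.0f);

    // For a key that isn't present yet, it inserts like the others would.
    auto [it4, inserted4] = m.insert_or_assign(7, 5.0f);
    assert(inserted4 && it4->second == 5.0f && m.size() == 2);
}
```

The structured bindings unpack the returned std::pair of iterator and bool, which requires C++17.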
To delete an element at some specific iterator position, use either extract()
or erase()
. The difference is that extract()
will return a node handle that owns the removed key-value pair while erase()
won't.
auto it1 { s1.begin() };
auto node { s1.extract(it1) }; // node.key() / node.mapped() access the pair; WARNING: it1 invalid after this point
auto it2 { s1.begin() };
s1.erase(it2); // WARNING: it2 invalid after this point
// DELETE between iterator range
auto it3 {s1.begin()};
auto it4 {std::next(s1.begin(), 10)}; // map iterators don't support +, use std::next from <iterator> (assumes at least 10 elements)
s1.erase(it3, it4); // WARNING: it3 invalid after this call (it4 itself wasn't erased, so it stays valid)
To delete all elements, use clear()
.
s1.clear();
To get the number of elements, use size()
. Similarly, use empty()
to check if empty.
auto is_empty1 { s1.size() == 0 };
auto is_empty2 { s1.empty() };
To iterate over the elements, use begin()
and end()
.
// RECALL: for-each loop will implicitly call begin() and end()
for (auto &obj : s1) {
// do something with value here
}
⚠️NOTE️️️⚠️
There's also ..
↩PREREQUISITES↩
std::multimap
is a container that's a combination of std::multiset
and std::map
. That is, it's a std::map
but it allows for many key-value pairs with the same key (keys aren't unique). Similar to std::multiset
, std::multimap
uses the sorting comparator to find duplicates: two objects a and b are considered equivalent if neither compares less than the other: !comp(a, b) && !comp(b, a)
.
⚠️NOTE️️️⚠️
Definition is from cppreference.
To create a std::multimap
, the same std::map
constructors apply. The only major difference is that, if you're initializing the values, any key-value pairs with duplicate keys are kept.
// prime
std::multimap<int, float> s0 {
std::pair<int, float> { 1, 99.0f },
std::pair<int, float> { 1, -99.0f }, // NOTE: this is the 2nd instance of the key 1, both key-value pairs are kept
std::pair<int, float> { 2, -100.0f },
std::pair<int, float> { 4, 123.0f },
std::pair<int, float> { 5, 4.0f }
};
std::multimap<int, float> s1 {
{ 1, 99.0f },
{ 1, -99.0f }, // NOTE: this is the 2nd instance of the key 1, both key-value pairs are kept
{ 2, -100.0f },
{ 4, 123.0f },
{ 5, 4.0f }
};
// copy range
std::multimap<int, float> s2 (c.begin(), c.begin() + 10); // copy first 10 key-value pairs from another container (assumes c has random access iterators)
// custom comparator
auto comparator = [] (const int & lhs, const int & rhs) -> bool { return lhs < rhs; };
std::multimap<int, float, decltype(comparator)> s3 ({ { 1, 99.0f }, { 2, 3.0f }, ... }, comparator);
std::multimap<int, float, std::greater<int>> s4 ({ { 1, 99.0f }, { 2, 3.0f }, ... }, std::greater<int> {});
// copy/move
std::multimap<int, float> s5 { std::move(s1) }; // move s1 into s5
// custom allocator
CustomAllocator allocator {}; // must be an allocator for std::pair<const int, float>
std::multimap<int, float, std::less<int>, CustomAllocator> s6 (std::less<int> {}, allocator);
Functions are more or less the same as their std::map
counterparts. The only major differences are that...
count() returns the number of instances for a key.
insert_or_assign() is removed because it doesn't make sense to have it (you can have duplicate keys).
try_emplace() is removed because it doesn't make sense to have it (you can have duplicate keys).
// find
bool found1 { s1.find(3) != s1.end() };
bool found2 { s1.contains(3) };
// get
auto it { s1.find(3) }; // WARNING: if there's multiple instances of key, any of them could be returned here
std::pair<int, float> found_pair { *it };
// get number of instances
bool is_two_instances { s1.count(1) == 2 };
// get lower/upper bound
auto it1 { s1.lower_bound(4) }; // get iterator to elem 4 if 4 exists, otherwise get iterator to the elem just after 4 (could be s1.end() if no such elem)
auto it2 { s1.upper_bound(4) }; // get iterator to elem just after 4 (could be s1.end() if no such elem)
// add
s1.insert({6, 122.0f}); // insert() takes a single key-value pair, not a separate key and value
s1.insert(std::pair<int, float> {6, 122.0f});
s1.emplace(6, 122.0f);
s1.emplace(std::pair<int, float> {6, 122.0f});
s1.emplace_hint(s1.begin(), 6, 122.0f); // iterator should be near to where the value is
s1.emplace_hint(s1.begin(), std::pair<int, float> {6, 122.0f}); // iterator should be near to where the value is
// remove
auto it1 { s1.begin() };
auto node { s1.extract(it1) }; // returns a node handle (node.key() / node.mapped()); WARNING: it1 invalid after this point
auto it2 { s1.begin() };
s1.erase(it2); // WARNING: it2 invalid after this point
// remove range
auto it3 {s1.begin()};
auto it4 {std::next(s1.begin(), 10)}; // multimap iterators don't support +, use std::next from <iterator> (assumes at least 10 elements)
s1.erase(it3, it4); // WARNING: it3 invalid after this call (it4 itself wasn't erased, so it stays valid)
// remove all
s1.clear();
// get size
auto is_empty1 { s1.size() == 0 };
auto is_empty2 { s1.empty() };
// iterate
for (auto &obj : s1) {
// do something with value here
}
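Since find() may land on any one of several pairs sharing a key, a common way to visit all of them is equal_range(), which isn't covered above. The following is a small sketch under that assumption. The helper names `values_for_key` and `equal_range_demo` are made up for illustration.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical helper for this sketch: collects every value stored under `key`.
std::vector<float> values_for_key(const std::multimap<int, float>& m, int key) {
    std::vector<float> out {};
    auto [first, last] = m.equal_range(key); // [first, last) spans all pairs with that key
    for (auto it { first }; it != last; ++it) {
        out.push_back(it->second);
    }
    return out;
}

void equal_range_demo() {
    std::multimap<int, float> m { {1, 99.0f}, {1, -99.0f}, {2, -100.0f} };
    assert(m.count(1) == 2);
    assert(values_for_key(m, 1).size() == 2); // both values for key 1 are found
    assert(values_for_key(m, 3).empty());     // missing key gives an empty range
}
```

The same range can be obtained from lower_bound() and upper_bound() on the key. equal_range() just returns both in one call.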
⚠️NOTE️️️⚠️
There's also ..
Unordered associative containers organize objects by key and potentially a value. Keys are stored in an unordered fashion. The underlying data structure used by unordered associative containers is a hash table.
By default, unordered associative containers attempt to hash keys by calling template specializations of std::hash<T>
. Several pre-existing template specializations are provided by the C++ standard library (e.g. int
, long
, std::string
, etc..), but custom types need their own specialization to be written by the user.
std::hash<T>
implementations must be exposed as a functor that takes in the type in question and returns a std::size_t
.
template<>
struct std::hash<MyType> {
std::size_t operator()(MyType const& s) const noexcept {
std::size_t h1 { std::hash<std::string>{} (s.student_name) };
std::size_t h2 { std::hash<int>{} (s.student_age) };
return h1 ^ h2; // see boost::hash_combine
}
};
A common point of confusion is what you have to do to use std::hash
with a reference type. Note that the function call operator in the example above is taking a reference, meaning you always use the non-reference type as the type argument when invoking.
int x { 55 };
int & xRef { x };
std::size_t hash1 { std::hash<int>{} (x) };
std::size_t hash2 { std::hash<int>{} (xRef) };
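To show a std::hash specialization being used end to end, here's a minimal sketch with a made-up `Student` type (the type, its fields, and the function name `custom_hash_demo` are assumptions for illustration). The type also needs an equality operator, because the container's default equivalence function relies on it.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

// Hypothetical element type for this sketch.
struct Student {
    std::string name;
    int age;
    // The default equivalence function (std::equal_to) needs operator==.
    bool operator==(const Student& other) const {
        return name == other.name && age == other.age;
    }
};

template<>
struct std::hash<Student> {
    std::size_t operator()(const Student& s) const noexcept {
        std::size_t h1 { std::hash<std::string>{} (s.name) };
        std::size_t h2 { std::hash<int>{} (s.age) };
        return h1 ^ (h2 << 1); // simple mix; boost::hash_combine mixes more thoroughly
    }
};

void custom_hash_demo() {
    // No hasher is named here: std::hash<Student> is picked up as the default.
    std::unordered_set<Student> students {};
    students.insert(Student { "Ann", 20 });
    students.insert(Student { "Ann", 20 }); // duplicate, ignored
    students.insert(Student { "Bob", 21 });
    assert(students.size() == 2);
    assert(students.count(Student { "Bob", 21 }) == 1);
}
```

Two objects that compare equal must produce the same hash, which is why the hash and the equality operator use the same fields.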
The subsections below detail the various unordered associative containers that are provided by the C++ standard library.
std::unordered_set
is a container that's similar to std::set
. It holds on to unique elements but does so unordered (whereas std::set
has some sort order). It's implemented as a hash table, so rather than having to specify a comparator, you're required to specify a hash function.
Several pre-existing hash functions are provided by the C++ standard library via std::hash<T>
(T
being the type in question). If the user doesn't supply a hash function directly, the default is to use std::hash<T>
with the element type substituted in (compilation will fail if no std::hash<T>
implementation for that element type exists). For example, std::hash<int>
exists for integers and gets automatically plugged in when the element type of the container is int
.
⚠️NOTE️️️⚠️
Details on providing a custom std::hash<T>
implementation are in the parent section.
In addition to a hash function, a std::unordered_set
may have a custom equivalence function. By default, std::equal_to<T>
is used if none is supplied by the user, which uses the equality operator (==).
To create a std::unordered_set
primed with a sequence of values known at compile-time, use typical braced initialization. Since only unique values are allowed, duplicates will be ignored.
std::unordered_set<int> s1 { 1, 1, 2, 3, 4, 5 };
To create a std::unordered_set
without priming it directly to a sequence of values, you can't use braced initialization or brace-plus-equals initialization. You must use parentheses.
std::unordered_set<int> s2 (c.begin(), c.begin() + 10); // copy first 10 elems from another container (assumes c has random access iterators)
To create a std::unordered_set
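with a custom hash function, that custom hash function can be implemented in one of two ways:
A function-like object (e.g. a functor or lambda) that takes in an element and returns a size_t.
A template specialization of std::hash<T> for the element type (assuming one wasn't already supplied by the C++ standard library).
// function-like object
auto hasher = [] (const int & val) -> size_t { return static_cast<size_t>(val); };
std::unordered_set<int, decltype(hasher)> s3 ({ 1, 1, 2, 3, 4, 5 }, 16, hasher); // 2nd argument is the minimum initial bucket count, the hasher comes after it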
⚠️NOTE️️️⚠️
If you create a std::hash<T>
specialization for the element type, the std::unordered_set
picks it up automatically, because std::hash<T> is the default for the container's hasher template parameter. There's no need to specify the template parameter or pass the hasher in as an argument (as done above). It works so long as the specialization is visible (e.g. whatever file it's in has been #include
-ed) before the container type is first used.
std::unordered_set
provides copy semantics and move semantics. Because elements are dynamic objects, moving one std::unordered_set
into another is fast because it's simply passing off a pointer / reference. Copying can potentially be expensive.
std::unordered_set<int> s4 { std::move(s1) }; // move s1 into s4
Similarly, because std::unordered_set
's elements are created as dynamic objects, you have the option of supplying a custom allocator.
CustomAllocator allocator {};
std::unordered_set<int, std::hash<int>, std::equal_to<int>, CustomAllocator> s5 (allocator); // std::hash<int> and std::equal_to<int> are already types, no decltype needed
⚠️NOTE️️️⚠️
Similar to a Java HashMap
, a std::unordered_set
has concepts such as bucket count and load factor. It'll automatically add more buckets and rehash once the load factor reaches some point, all of which is tunable if you deem the performance of the defaults as not good. Alternatively, you can always trigger a rehash manually.
Those features aren't discussed here.
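For a quick feel of what that tuning looks like, here's a brief sketch using the bucket-related member functions of std::unordered_set. The function name `bucket_demo` and the specific numbers are arbitrary choices for illustration.

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical helper for this sketch: pre-sizes the table, then forces a rehash.
void bucket_demo() {
    std::unordered_set<int> s {};
    s.max_load_factor(0.5f); // allow at most 1 element per 2 buckets on average
    s.reserve(100);          // make room for at least 100 elements up front

    auto buckets_before { s.bucket_count() };
    for (int i { 0 }; i < 100; ++i) {
        s.insert(i);
    }
    assert(s.bucket_count() == buckets_before); // reserve() avoided any rehash
    assert(s.load_factor() <= s.max_load_factor());

    s.rehash(1000); // manually request at least 1000 buckets
    assert(s.bucket_count() >= 1000);
}
```

Calling reserve() before a bulk insert is usually the only tuning needed, since it avoids repeated rehashing as the container grows.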
std::unordered_set
supports most of the same functions as std::set
except for those that have to do with sorted ordering. For example, lower_bound()
and upper_bound()
are missing here because the elements here aren't ordered.
To check if a set contains an element, the following functions are available:
find() returns an iterator primed at the position of the found element (returns end() iterator position if not found).
contains() returns bool (true if it exists, false otherwise). Available since C++20.
count() returns integer (1 if it exists, 0 otherwise).
bool found1 { s1.find(3) != s1.end() };
bool found2 { s1.contains(3) };
bool found3 { s1.count(3) == 1 };
To add an element, the following functions are available:
insert() either copies or moves a value into the container (depending on if the reference passed in is an rvalue reference).
emplace() adds an element by creating it directly (no copying / moving).
emplace_hint() like the above but also takes in an iterator primed to some position as a hint.
s1.insert(6);
s1.emplace(6);
s1.emplace_hint(s1.begin(), 6); // the hint iterator is the first argument
⚠️NOTE️️️⚠️
emplace()
/ emplace_hint()
don't copy or move because you pass in initialization arguments directly into the functions. Internally, they use template parameter packs to forward arguments for object creation (e.g. constructor arguments, initializer list, etc..).
🔍SEE ALSO🔍
emplace()
function)emplace()
function)
To delete an element at some specific iterator position, use either extract()
or erase()
. The difference is that extract()
will return a node handle that owns the removed element while erase()
won't.
auto it1 { s1.begin() };
auto node { s1.extract(it1) }; // node.value() is the element; WARNING: it1 invalid after this point
auto it2 { s1.begin() };
s1.erase(it2); // WARNING: it2 invalid after this point
// DELETE between iterator range
auto it3 {s1.begin()};
auto it4 {std::next(s1.begin(), 10)}; // these iterators don't support +, use std::next from <iterator> (assumes at least 10 elements)
s1.erase(it3, it4); // WARNING: it3 invalid after this call (it4 itself wasn't erased, so it stays valid)
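What extract() hands back is a node handle rather than the bare element. As a rough sketch of what can be done with one, the following moves an element from one set to another without copying it. The function name `extract_demo` is made up for illustration.

```cpp
#include <cassert>
#include <unordered_set>
#include <utility>

// Hypothetical helper for this sketch: transfers an element between two sets via a node handle.
void extract_demo() {
    std::unordered_set<int> src { 1, 2, 3 };
    std::unordered_set<int> dst {};

    auto node { src.extract(2) };     // extract by key; an iterator also works
    assert(!node.empty());            // empty() is true when nothing was extracted
    assert(node.value() == 2);        // sets expose value(); maps expose key() / mapped()
    assert(src.size() == 2);

    dst.insert(std::move(node));      // hand the element over without copying it
    assert(dst.count(2) == 1);

    auto missing { src.extract(42) }; // key not present, so the node handle is empty
    assert(missing.empty());
}
```

Node handles were added in C++17. If a node handle is destroyed without being re-inserted, the element it owns is destroyed along with it.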
To delete all elements, use clear()
.
s1.clear();
To get the number of elements, use size()
. Similarly, use empty()
to check if empty.
auto is_empty1 { s1.size() == 0 };
auto is_empty2 { s1.empty() };
To iterate over the elements, use begin()
and end()
.
// RECALL: for-each loop will implicitly call begin() and end()
for (auto &obj : s1) {
// do something with value here
}
⚠️NOTE️️️⚠️
There's also ..
↩PREREQUISITES↩
std::unordered_multiset
is a container that, like a std::unordered_set
, holds on to unordered elements. Like std::unordered_set
, it requires a hashing function and an equivalence function (same defaults are used if not supplied). Unlike std::unordered_set
, it can hold on to multiple instances of the same element (elements aren't unique).
⚠️NOTE️️️⚠️
Details on providing a custom std::hash<T>
implementation are in the parent section.
Details on providing a custom hasher implementation specifically for the container are in the unordered set section.
To create a std::unordered_multiset
, the same std::unordered_set
constructors apply. The only major difference is that, if you're initializing the values, any duplicate values are kept.
// prime
std::unordered_multiset<int> s1 { 1, 1, 2, 3, 4, 5 }; // duplicates retained
// copy range
std::unordered_multiset<int> s2 (c.begin(), c.begin() + 10); // copy first 10 elems from another container (assumes c has random access iterators)
// custom hash function
auto hasher = [] (const int & val) -> size_t { return static_cast<size_t>(val); };
std::unordered_multiset<int, decltype(hasher)> s3 ({ 1, 1, 2, 3, 4, 5 }, 16, hasher); // 2nd argument is the minimum initial bucket count, the hasher comes after it
// copy/move
std::unordered_multiset<int> s4 { std::move(s1) }; // move s1 into s4
// custom allocator
CustomAllocator allocator {};
std::unordered_multiset<int, std::hash<int>, std::equal_to<int>, CustomAllocator> s5 (allocator);
Functions are more or less the same as those in std::unordered_set
. The only major difference is that count()
returns the number of instances for an element.
// find
bool found { s1.find(1) != s1.end() };
bool found { s1.contains(1) };
// get number of instances
bool is_two_instances { s1.count(1) == 2 };
// add
s1.insert(6);
s1.emplace(6);
s1.emplace_hint(s1.begin(), 6); // iterator should be near to where the value is
// remove
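auto it1 { s1.begin() };
auto node { s1.extract(it1) }; // returns a node handle, node.value() is the element; WARNING: it1 invalid after this point
auto it2 { s1.begin() };
s1.erase(it2); // WARNING: it2 invalid after this point
// remove range
auto it3 {s1.begin()};
auto it4 {std::next(s1.begin(), 10)}; // these iterators don't support +, use std::next from <iterator> (assumes at least 10 elements)
s1.erase(it3, it4); // WARNING: it3 invalid after this call (it4 itself wasn't erased, so it stays valid)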
// remove all
s1.clear();
// get size
auto is_empty1 { s1.size() == 0 };
auto is_empty2 { s1.empty() };
// iterate
for (auto &obj : s1) { // RECALL: for-each loop will implicitly call begin() and end()
// do something with value here
}
⚠️NOTE️️️⚠️
There's also ..
↩PREREQUISITES↩
std::unordered_map
is a container similar to std::unordered_set
, with the major difference being that each element in a std::unordered_map
has a secondary value associated with it: key to value. Only the key is used for uniqueness and lookup. The value just tags along.
⚠️NOTE️️️⚠️
Details on providing a custom std::hash<T>
implementation are in the parent section.
Details on providing a custom hasher implementation specifically for the container are in the unordered set section.
To create a std::unordered_map
, the same std::unordered_set
constructors apply. The only major differences are that ...
std::pair<K,V>
(K
is key type, V
is value type).
// prime
std::unordered_map<int, float> s0 {
std::pair<int, float> { 1, 99.0f },
std::pair<int, float> { 1, -99.0f }, // WARNING: this is the 2nd instance of the key 1, which of the two values ends up stored for the key is unspecified
std::pair<int, float> { 2, -100.0f },
std::pair<int, float> { 4, 123.0f },
std::pair<int, float> { 5, 4.0f }
};
std::unordered_map<int, float> s1 {
{ 1, 99.0f },
{ 1, -99.0f }, // WARNING: this is the 2nd instance of the key 1, which of the two values ends up stored for the key is unspecified
{ 2, -100.0f },
{ 4, 123.0f },
{ 5, 4.0f }
};
// copy range
std::unordered_map<int, float> s2 (c.begin(), c.begin() + 10); // copy first 10 key-value pairs from another container (assumes c has random access iterators)
// custom hash function
auto hasher = [] (const int & val) -> size_t { return static_cast<size_t>(val); };
std::unordered_map<int, float, decltype(hasher)> s3 ({ { 1, 99.0f }, { 2, 3.0f }, ... }, 16, hasher); // 2nd argument is the minimum initial bucket count, the hasher comes after it
// copy/move
std::unordered_map<int, float> s5 { std::move(s1) }; // move s1 into s5
// custom allocator
CustomAllocator allocator {}; // must be an allocator for std::pair<const int, float>
std::unordered_map<int, float, std::hash<int>, std::equal_to<int>, CustomAllocator> s6 (allocator);
To check if an element exists, the following functions are available:
find() returns an iterator primed at the position of the found element (returns end() iterator position if not found).
contains() returns bool (true if it exists, false otherwise). Available since C++20.
count() returns integer (0 is false, 1 is true).
bool found1 { s1.find(3) != s1.end() };
bool found2 { s1.contains(3) };
bool found3 { s1.count(3) == 1 };
Dereferencing an iterator will give back both the key and value as a std::pair<const K,V>
. The key is const because changing it in place would leave the pair in the wrong bucket.
auto it { s1.find(3) };
std::pair<int, float> found_pair { *it };
To add a key-value pair (not replace an existing value), the following functions are available:
insert() either copies or moves into the container (depending on if the reference passed in is an rvalue reference).
emplace() adds an element by creating it directly (no copying / moving).
try_emplace() like emplace() but takes the key as its first argument and has special move semantics (does not move template parameter pack rvalue arguments if insertion doesn't happen -- see docs for more info).
emplace_hint() like emplace() but also takes in an iterator primed to some position as a hint.
// NOTE: Each func except emplace_hint() returns a bool (true for insertion, false for already exists) + an iterator to the key-value pair (existing one if not added)
// NOTE: emplace_hint() returns only the iterator
s1.insert({6, 122.0f}); // insert() takes a single key-value pair, not a separate key and value
s1.insert(std::pair<int, float> {6, 122.0f});
s1.emplace(6, 122.0f);
s1.emplace(std::pair<int, float> {6, 122.0f});
s1.try_emplace(6, 122.0f); // key first, then the arguments used to construct the value
s1.emplace_hint(s1.begin(), 6, 122.0f); // iterator should be near to where the value is
s1.emplace_hint(s1.begin(), std::pair<int, float> {6, 122.0f});
To add a key-value pair or replace the value of an already existing key, use insert_or_assign()
. It either copies or moves the value into the container (depending on if the reference passed in is an rvalue reference), overwriting the existing value if the key is already present.
// NOTE: returns a bool (true for insertion, false for assignment) + an iterator to the key-value pair
s1.insert_or_assign(6, 122.0f); // takes the key and the value as separate arguments (there's no std::pair overload)
To delete an element at some specific iterator position, use either extract()
or erase()
. The difference is that extract()
will return a node handle that owns the removed key-value pair while erase()
won't.
auto it1 { s1.begin() };
auto node { s1.extract(it1) }; // node.key() / node.mapped() access the pair; WARNING: it1 invalid after this point
auto it2 { s1.begin() };
s1.erase(it2); // WARNING: it2 invalid after this point
// DELETE between iterator range
auto it3 {s1.begin()};
auto it4 {std::next(s1.begin(), 10)}; // these iterators don't support +, use std::next from <iterator> (assumes at least 10 elements)
s1.erase(it3, it4); // WARNING: it3 invalid after this call (it4 itself wasn't erased, so it stays valid)
To delete all elements, use clear()
.
s1.clear();
To get the number of elements, use size()
. Similarly, use empty()
to check if empty.
auto is_empty1 { s1.size() == 0 };
auto is_empty2 { s1.empty() };
To iterate over the elements, use begin()
and end()
.
// RECALL: for-each loop will implicitly call begin() and end()
for (auto &obj : s1) {
// do something with value here
}
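Because each element is a key-value pair, a structured binding in the loop header can name the key and the value directly. The following is a minimal sketch of that. The names `sum_values` and `iterate_demo` are made up for illustration.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical helper for this sketch: adds up every value in the map.
float sum_values(const std::unordered_map<int, float>& m) {
    float total { 0.0f };
    for (const auto& [key, value] : m) { // each element is a std::pair<const int, float>
        (void) key;                      // the key isn't needed for this sum
        total += value;
    }
    return total;
}

void iterate_demo() {
    std::unordered_map<int, float> m { {1, 1.5f}, {2, 2.5f}, {4, 4.0f} };
    assert(sum_values(m) == 8.0f); // iteration order is unspecified, but this sum doesn't depend on it
}
```

Structured bindings require C++17. Without them, the same fields are reachable through the pair's first and second members.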
⚠️NOTE️️️⚠️
There's also ..
↩PREREQUISITES↩
std::unordered_multimap
is a container that's a combination of std::unordered_multiset
and std::unordered_map
. That is, it's a std::unordered_map
but it allows for many key-value pairs with the same key (keys aren't unique). Similar to std::unordered_multiset
, std::unordered_multimap
requires a hashing function and an equivalence function (same defaults are used if not supplied).
⚠️NOTE️️️⚠️
Details on providing a custom std::hash<T>
implementation are in the parent section.
Details on providing a custom hasher implementation specifically for the container are in the unordered set section.
To create a std::unordered_multimap
, the same std::unordered_map
constructors apply. The only major difference is that, if you're initializing the values, any key-value pairs with duplicate keys are kept.
// prime
std::unordered_multimap<int, float> s0 {
std::pair<int, float> { 1, 99.0f },
std::pair<int, float> { 1, -99.0f }, // NOTE: this is the 2nd instance of the key 1, both key-value pairs are kept
std::pair<int, float> { 2, -100.0f },
std::pair<int, float> { 4, 123.0f },
std::pair<int, float> { 5, 4.0f }
};
std::unordered_multimap<int, float> s1 {
{ 1, 99.0f },
{ 1, -99.0f }, // NOTE: this is the 2nd instance of the key 1, both key-value pairs are kept
{ 2, -100.0f },
{ 4, 123.0f },
{ 5, 4.0f }
};
// copy range
std::unordered_multimap<int, float> s2 (c.begin(), c.begin() + 10); // copy first 10 key-value pairs from another container (assumes c has random access iterators)
// custom hash function
auto hasher = [] (const int & val) -> size_t { return static_cast<size_t>(val); };
std::unordered_multimap<int, float, decltype(hasher)> s3 ({ { 1, 99.0f }, { 2, 3.0f }, ... }, 16, hasher); // 2nd argument is the minimum initial bucket count, the hasher comes after it
// copy/move
std::unordered_multimap<int, float> s5 { std::move(s1) }; // move s1 into s5
// custom allocator
CustomAllocator allocator {}; // must be an allocator for std::pair<const int, float>
std::unordered_multimap<int, float, std::hash<int>, std::equal_to<int>, CustomAllocator> s6 (allocator);
Functions are more or less the same as their std::unordered_map
counterparts. The only major differences are that...
count() returns the number of instances for a key.
insert_or_assign() is removed because it doesn't make sense to have it (you can have duplicate keys).
try_emplace() is removed because it doesn't make sense to have it (you can have duplicate keys).
// find
bool found1 { s1.find(3) != s1.end() };
bool found2 { s1.contains(3) };
// get
auto it { s1.find(3) }; // WARNING: if there's multiple instances of key, any of them could be returned here
std::pair<int, float> found_pair { *it };
// get number of instances
bool is_two_instances { s1.count(1) == 2 };
// add
s1.insert({6, 122.0f}); // insert() takes a single key-value pair, not a separate key and value
s1.insert(std::pair<int, float> {6, 122.0f});
s1.emplace(6, 122.0f);
s1.emplace(std::pair<int, float> {6, 122.0f});
s1.emplace_hint(s1.begin(), 6, 122.0f); // iterator should be near to where the value is
s1.emplace_hint(s1.begin(), std::pair<int, float> {6, 122.0f}); // iterator should be near to where the value is
// remove
auto it1 { s1.begin() };
auto node { s1.extract(it1) }; // returns a node handle (node.key() / node.mapped()); WARNING: it1 invalid after this point
auto it2 { s1.begin() };
s1.erase(it2); // WARNING: it2 invalid after this point
// remove range
auto it3 {s1.begin()};
auto it4 {std::next(s1.begin(), 10)}; // these iterators don't support +, use std::next from <iterator> (assumes at least 10 elements)
s1.erase(it3, it4); // WARNING: it3 invalid after this call (it4 itself wasn't erased, so it stays valid)
// remove all
s1.clear();
// get size
auto is_empty1 { s1.size() == 0 };
auto is_empty2 { s1.empty() };
// iterate
for (auto &obj : s1) {
// do something with value here
}
⚠️NOTE️️️⚠️
There's also ..
↩PREREQUISITES↩
Container adapters are light-weight wrappers around sequential containers that expose them in a simplified way that matches a common data structure. For example, a std::deque
can be wrapped such that it's exposed as a queue. The caller of the queue doesn't have to know what type of sequential container is backing that queue.
The type of sequential container usable by a container adaptor depends on the functions it has. For example, some container adaptors require both front()
and back()
functions to be supported by the sequential container type.
std::stack
wraps a sequential container as if it were a stack abstract data type:
To create a std::stack
, pass in a reference to the sequential container to wrap: std::vector
, std::deque
, or std::list
. The container will either be copied or moved depending on whether the container reference is a normal (lvalue) reference or an rvalue reference. Alternatively, if you pass in no reference at all, an empty container of the type specified will get used.
⚠️NOTE️️️⚠️
The sequential container can technically be any class, so long as it supports the expected type traits (e.g. it's expected to have a function called push_back()
that has a single parameter of type ...).
// create by copying
std::vector<int> c1 { 5, 5, 5, 5, 5, 5, 5, 5 };
std::stack<int, decltype(c1)> s1 { c1 };
// create by moving
std::deque<int> c2 { 5, 5, 5, 5, 5, 5, 5, 5 };
std::stack<int, decltype(c2)> s2 { std::move(c2) };
// create into brand new
std::stack<int, std::list<int>> s3 {};
std::stack<int> s4 {}; // same as using std::deque<int> as the backing type
🔍SEE ALSO🔍
To add an item, use push()
. Internally, this invokes the wrapped container's push_back()
function.
s3.push(1);
s3.push(2);
s3.push(3);
To read the most recently added item, use top()
. Internally, this invokes the wrapped container's back()
function.
int a { s3.top() };
To remove the most recently added item, use pop()
. Internally, this invokes the wrapped container's pop_back()
function.
// NOTE: pop() returns nothing; read the item with top() beforehand if it's needed
s3.pop();
s3.pop();
s3.pop();
To get the size, use size()
. Internally, this invokes the wrapped container function with the same name.
auto is_empty { s3.size() == 0 };
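Putting those operations together, the following is a small sketch of the last-in, first-out behaviour. The function name `stack_demo` is made up for illustration.

```cpp
#include <cassert>
#include <stack>
#include <vector>

// Hypothetical helper for this sketch: pushes three items, then inspects and pops.
void stack_demo() {
    std::stack<int, std::vector<int>> s {};
    s.push(1);
    s.push(2);
    s.push(3);

    assert(s.top() == 3); // the most recently pushed item is on top
    s.pop();              // removes the 3 (pop() returns nothing)
    assert(s.top() == 2);
    assert(s.size() == 2);
    assert(!s.empty());
}
```

Calling top() or pop() on an empty stack is undefined behaviour, so check empty() first when the size isn't known.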
std::queue
wraps a sequential container as if it were a queue abstract data type:
To create a std::queue
, pass in a reference to the sequential container to wrap: std::deque
or std::list
. The container will either be copied or moved depending on whether the container reference is a normal (lvalue) reference or an rvalue reference. Alternatively, if you pass in no reference at all, an empty container of the type specified will get used.
⚠️NOTE️️️⚠️
The sequential container can technically be any class, so long as it supports the expected type traits (e.g. it's expected to have a function called push_back()
that has a single parameter of type ...).
// create by copying
std::deque<int> c1 { 5, 5, 5, 5, 5, 5, 5, 5 };
std::queue<int, decltype(c1)> q1 { c1 };
// create by moving
std::deque<int> c2 { 5, 5, 5, 5, 5, 5, 5, 5 };
std::queue<int, decltype(c2)> q2 { std::move(c2) };
// create into brand new
std::queue<int, std::list<int>> q3 {};
std::queue<int> q4 {}; // same as using std::deque<int> as the backing type
🔍SEE ALSO🔍
To add an item, use push()
. Internally, this invokes the wrapped container's push_back()
function.
q3.push(1);
q3.push(2);
q3.push(3);
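To read the oldest item (the one that pop() would remove next), use front()
. To read the most recently added item, use back()
. Internally, these invoke the wrapped container functions with the same name.
int a { q3.front() }; // oldest item
int b { q3.back() }; // most recently added item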
To remove the oldest item, use pop()
. Internally, this invokes the wrapped container's pop_front()
function.
// NOTE: pop() returns nothing; read the item with front() beforehand if it's needed
q3.pop();
q3.pop();
q3.pop();
To get the size, use size()
. Internally, this invokes the wrapped container function with the same name.
auto is_empty { q3.size() == 0 };
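As a quick illustration of the first-in, first-out behaviour, the following sketch pushes three items and checks which end each function looks at. The function name `queue_demo` is made up for illustration.

```cpp
#include <cassert>
#include <queue>

// Hypothetical helper for this sketch: pushes three items, then inspects and pops.
void queue_demo() {
    std::queue<int> q {}; // backed by std::deque<int> by default
    q.push(1);
    q.push(2);
    q.push(3);

    assert(q.front() == 1); // oldest item
    assert(q.back() == 3);  // most recently added item
    q.pop();                // removes the 1 (pop() returns nothing)
    assert(q.front() == 2);
    assert(q.size() == 2);
}
```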
std::priority_queue
wraps a sequential container as if it were a priority queue abstract data type: Regardless of what order elements are added in, the only element that can be read / removed is the element with the highest priority (priority is defined by a comparator).
⚠️NOTE️️️⚠️
The sequential container can technically be any class, so long as it supports the expected type traits (e.g. it's expected to have a function called push_back()
that has a single parameter of type ...).
To create a std::priority_queue
, pass in a comparator followed by a reference to the sequential container to wrap: std::vector
or std::deque
(std::list
isn't usable because the container must provide random access iterators). The container will either be copied or moved depending on whether the container reference is a normal (lvalue) reference or an rvalue reference. Alternatively, if you pass in no reference at all, an empty container of the type specified will get used.
// create by copying
std::vector<int> c1 { 5, 5, 5, 5, 5, 5, 5, 5 };
std::priority_queue<int, decltype(c1)> q1 { std::less<int> {}, c1 }; // the constructor takes the comparator first, then the container
// create by moving
std::deque<int> c2 { 5, 5, 5, 5, 5, 5, 5, 5 };
std::priority_queue<int, decltype(c2)> q2 { std::less<int> {}, std::move(c2) };
// create into brand new
std::priority_queue<int, std::deque<int>> q3 {};
std::priority_queue<int> q4 {}; // same as using std::vector<int> as the backing type
🔍SEE ALSO🔍
In the above examples the default comparator of std::less
is used, which uses the less than operator (<) to compare two objects for priority, meaning the largest element has the highest priority. To define a custom comparator, pass in that comparator's type as the 3rd template parameter argument and pass it into the constructor.
auto comparator = [] (const int & lhs, const int & rhs) -> bool { return lhs < rhs; };
std::priority_queue<int, std::deque<int>, decltype(comparator)> q5 { comparator };
std::priority_queue<int, std::deque<int>, std::greater<int>> q6 { std::greater<int> {} }; // smallest element has the highest priority
🔍SEE ALSO🔍
To add an item, use push()
.
q1.push(10);
q1.push(100);
q1.push(-5);
To read the most high priority element, use top()
.
int a { q1.top() };
To remove the most high priority element, use pop()
.
// NOTE: pop() returns nothing; read the element with top() beforehand if it's needed
q1.pop();
q1.pop();
q1.pop();
To get the size, use size()
.
auto is_empty { q1.size() == 0 };
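To make the effect of the comparator concrete, here's a short sketch that fills two priority queues with the same values, one using the default comparator and one using std::greater. The function name `priority_queue_demo` is made up for illustration.

```cpp
#include <cassert>
#include <functional>
#include <initializer_list>
#include <queue>
#include <vector>

// Hypothetical helper for this sketch: contrasts the default comparator with std::greater.
void priority_queue_demo() {
    std::priority_queue<int> max_first {}; // std::less: largest on top
    std::priority_queue<int, std::vector<int>, std::greater<int>> min_first {}; // smallest on top

    for (int v : { 10, 100, -5 }) {
        max_first.push(v);
        min_first.push(v);
    }

    assert(max_first.top() == 100);
    assert(min_first.top() == -5);

    max_first.pop(); // removes the 100
    assert(max_first.top() == 10);
}
```

The comparator answers "does lhs have lower priority than rhs?", which is why std::less puts the largest element on top.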
↩PREREQUISITES↩
Iterators in C++ are similar to iterators in Java. In Java, objects that...
can be iterated over implement the Iterable interface (e.g. ArrayList).
do the iterating implement the Iterator interface.
In C++, there is no requirement to extend from any base classes or interfaces. Instead, any type can act as an iterator so long as it supports a set of operators:
!= - test if the position of one iterator doesn't match the position of another iterator (e.g. my_iterator != end_iterator).
++ - move to the next item (e.g. my_iterator++).
* (dereference) - access the current item (e.g. int value {*my_iterator}).
⚠️NOTE️️️⚠️
Notice that the operators are more or less array / pointer behaviour. Given something like int *
pointing to the beginning of an array, ...
incrementing (++) moves it to the next element of the array via pointer arithmetic.
dereferencing (*) provides the value at the array element it points to.
inequality (!=) is a way to check if it hasn't gone past the last array element.
An iterator is basically a set of operators that walk elements in the same way as you would an array. A class can implement the operator overloads and behave the same way.
Similarly, any class can act as an iterable by implementing begin()
and end()
member functions:
begin()
- returns an iterator pointing to the first item.end()
- returns an iterator pointing to past-the-end (just after the last element).MyIterator it {collection.begin()};
while (it != collection.end()) {
MyObject value {*it};
// do something with value here
++it;
}
C++ iterables and iterators can be used together in range-based for loops.
for (MyObject value : collection) {
// the loop calls begin() / end() and dereferences the iterator for you
// do something with value here
}
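To make the operator requirements concrete, here is a minimal sketch of a hand-written iterator and iterable. CounterIterator, CounterRange, and collect are hypothetical names made up for illustration. The iterator only implements the three operators listed above, which is enough for a range-based for loop but not enough to satisfy the stricter standard library iterator concepts.

```cpp
#include <vector>

// Hypothetical iterator that walks the integers from `current` upward.
struct CounterIterator {
    int current;
    int operator*() const { return current; }                  // access the current item
    CounterIterator& operator++() { ++current; return *this; } // move to the next item
    bool operator!=(const CounterIterator& other) const { return current != other.current; }
};

// Hypothetical iterable: begin() / end() are all a range-based for loop needs.
struct CounterRange {
    int first;
    int last; // one past the final value
    CounterIterator begin() const { return CounterIterator { first }; }
    CounterIterator end() const { return CounterIterator { last }; }
};

// Walks the range with a range-based for loop and collects what it yields.
std::vector<int> collect(const CounterRange& range) {
    std::vector<int> out {};
    for (int value : range) { // expands to the begin() / != / ++ / * calls
        out.push_back(value);
    }
    return out;
}
```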
In total, 5 kinds of iterators are supported by C++. The kind of iterator described above is called an input iterator and it typically requires an equality operator overload (operator ==()
) in addition to inequality. Input iterators are the closest thing to a standard Java Iterator
-- read-only and forward-only. Other kinds of iterators require different operator overloads.
operation | input | output | forward | bidirectional | random access |
---|---|---|---|---|---|
++it and it++ (move forward) | ✓ | ✓ | ✓ | ✓ | ✓ |
--it and it-- (move backward) | | | | ✓ | ✓ |
it1 == it2 and it1 != it2 (test if at same position) | ✓ | | ✓ | ✓ | ✓ |
it1 < it2 (test if before) | | | | | ✓ |
it1 <= it2 (test if before or at) | | | | | ✓ |
it1 > it2 (test if after) | | | | | ✓ |
it1 >= it2 (test if after or at) | | | | | ✓ |
x = *it (dereference and get) | ✓ | | ✓ | ✓ | ✓ |
*it = x (dereference and set) | | ✓ | ✓ | ✓ | ✓ |
it1 += n and it1 + n (add integer) | | | | | ✓ |
it1 -= n and it1 - n (subtract integer) | | | | | ✓ |
it2 - it1 (subtract iterators to get positional difference) | | | | | ✓ |
it1 + it2 (add iterators) | | | | | |
⚠️NOTE️️️⚠️
Note that the adding of iterators is listed above but is not supported by any of the iterator types. It's there to make it explicit that adding together two iterators isn't a thing.
⚠️NOTE️️️⚠️
If you're dealing with the STL, there's also special iterator implementations that allow insertions rather than setting elements. See insert_iterator
, back_insert_iterator
, and front_insert_iterator
.
Each of the following concepts map to a type of iterator.
std::input_iterator
enforces input iterator type traits.std::output_iterator
enforces output iterator type traits.std::forward_iterator
enforces forward iterator type traits.std::bidirectional_iterator
enforces bidirectional iterator type traits.std::random_access_iterator
enforces random access iterator type traits.std::contiguous_iterator
enforces contiguous iterator type traits.The type traits of each iterator type were described in the parent section.
void print_range(std::random_access_iterator auto it) { // by value: a reference type wouldn't satisfy the concept
std::cout << it[3] << std::endl;
std::cout << it[1] << std::endl;
std::cout << it[2] << std::endl;
}
⚠️NOTE️️️⚠️
There are a handful of other iterator type traits that build on-top of each other to form a taxonomy of concepts for iterators. For example, at the bottom is ...
std::incrementable
/ std::weakly_incrementable
- enforces it++
and ++it
type traits.std::input_or_output_iterator
- enforces *it
type traits and enforces std::weakly_incrementable
.I don't think these are useful enough to document.
When writing generic code that makes use of iterators, directly using the iterator may lead to poor performance. For example, if you want an iterator to move up 100 spaces, you can't do it += 100
because it
may not be a random access iterator. The safest thing to do for generic code would be to perform it++
100 times, but that means you miss out on any performance gains of doing it += 100
if it
were a random access iterator.
Several helper functions exist to help with examples like the one above. These helper functions choose the most performant way of doing something based on the type traits of the iterator (e.g. if it's a bidirectional iterator vs a random access iterator).
std::advance()
- move forward / backward an iterator by some amount.
std::advance(it, 100); // move forward 100 spaces
std::advance(it, -100); // move backward 100 spaces
std::next()
- move forward by some amount.
auto it_a { std::next(it) }; // returns an iterator 1 space forward (it itself is unchanged)
auto it_b { std::next(it, 100) }; // returns an iterator 100 spaces forward
std::prev()
- move backward by some amount.
auto it_c { std::prev(it) }; // returns an iterator 1 space backward (it itself is unchanged)
auto it_d { std::prev(it, 100) }; // returns an iterator 100 spaces backward
std::distance()
- get distance between two iterators.
auto d { std::distance(it1, it2) }; // number of increments from it1 to it2 (it2 should be at or after it1)
std::iter_swap()
- given two iterators, swap the elements at their current position.
std::iter_swap(it1, it2);
// NOTE: it1 and it2 don't have to point to the same underlying container or
// underlying type -- as long as the types are assignable to each other, it'll work.
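The sketch below shows how these helpers keep generic code efficient. middle_element is a hypothetical function written for this example. The same template works for std::list (where the helpers step one element at a time) and std::vector (where they can jump directly).

```cpp
#include <iterator>
#include <list>
#include <vector>

// Returns the element at the midpoint of [first, last).
// Assumes the range is non-empty and the iterators are at least forward iterators.
template <typename It>
auto middle_element(It first, It last) {
    auto n { std::distance(first, last) }; // O(1) for random access iterators, O(n) otherwise
    return *std::next(first, n / 2);       // std::next returns a moved copy; first is unchanged
}
```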
Iterator adapters are light-weight wrappers that either simplify operations or provide some functionality under the type traits of an iterator. For example, an iterator exists that can wrap a container as an iterator, where writes to that iterator will translate to inserts into the container.
↩PREREQUISITES↩
An iterator adaptor that wraps a container and exposes it as an output iterator, where writes are translated to inserts on the container. It comes in 3 flavours:
std::back_insert_iterator
- invokes the container's push_back()
function.
std::vector<int> container {};
std::back_insert_iterator it { container };
*it = 1;
it++;
*it = 2;
it++;
*it = 3;
std::front_insert_iterator
- invokes the container's push_front()
function.
std::deque<int> container {};
std::front_insert_iterator it { container };
*it = 3;
it++;
*it = 2;
it++;
*it = 1;
std::insert_iterator
- invokes the container's insert()
function.
std::deque<int> container { 1, 2, 3, 4, 5, 6, 7};
std::insert_iterator it { container, container.begin() + 2 }; // start inserting at 2 elements down
*it = 100;
it++;
*it = 101;
it++;
*it = 102;
⚠️NOTE️️️⚠️
According to the book, these adapters ignore the increment operator because it isn't required for what's being done (insertion function calls to the wrapped container). Ignored parts are there because type traits for output iterator require them to be there.
⚠️NOTE️️️⚠️
There are helper functions available for creating these. The function names are similar to the iterator adapter names: replace the insert_iterator
part with inserter
(e.g. std::back_inserter()
).
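A short sketch of the helper functions paired with std::copy follows. The function names append_copy and prepend_copy are hypothetical. Each one starts with a destination that already holds -1 so the difference between appending and prepending is visible in the result.

```cpp
#include <algorithm>
#include <deque>
#include <iterator>
#include <vector>

// Appends src to a vector that already holds -1.
std::vector<int> append_copy(const std::vector<int>& src) {
    std::vector<int> dst { -1 };
    std::copy(src.begin(), src.end(), std::back_inserter(dst)); // push_back() per element
    return dst;
}

// Prepends src to a deque that already holds -1 (note the order ends up reversed).
std::deque<int> prepend_copy(const std::vector<int>& src) {
    std::deque<int> dst { -1 };
    std::copy(src.begin(), src.end(), std::front_inserter(dst)); // push_front() per element
    return dst;
}
```

Because std::front_inserter pushes each element in front of the previous one, the copied elements appear in reverse order.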
🔍SEE ALSO🔍
An iterator adaptor that wraps another iterator but modifies the dereferencing operator such that it returns an rvalue -- it forcefully moves the element. The typical use case for this is moving items from one container to another (as opposed to copying).
std::vector<MyMovableObject> container1 {};
container1.emplace_back(1, "morning");
container1.emplace_back(2, "midday");
container1.emplace_back(3, "evening");
// MOVE each item in container1 to container 2 (move semantics / move constructor)
std::vector<MyMovableObject> container2(
std::move_iterator{ container1.begin() },
std::move_iterator{ container1.end() }
);
⚠️NOTE️️️⚠️
There is a helper function for creating this: std::make_move_iterator()
.
An iterator adaptor that wraps another iterator but exposes it in reverse order -- last element to first element.
std::vector<int> container1 { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
std::vector<int> container2(
std::reverse_iterator{ container1.end() },
std::reverse_iterator{ container1.begin() }
);
One important thing to note is that the first iterator being wrapped must not be before the 2nd iterator being wrapped. If it were, it'd be undefined behaviour.
std::vector<int> container2( // THIS IS UNDEFINED BEHAVIOUR
std::reverse_iterator{ container1.begin() },
std::reverse_iterator{ container1.begin() + 2 }
);
⚠️NOTE️️️⚠️
There is a helper function for creating this: std::make_reverse_iterator()
.
⚠️NOTE️️️⚠️
Most collections expose rbegin()
/ rend()
functions that automatically give back a reverse iterator.
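As a small illustration, the two hypothetical functions below build a reversed copy of a vector, once through rbegin() / rend() and once through std::make_reverse_iterator(). Under the assumption that the container is a std::vector, both should produce the same result.

```cpp
#include <iterator>
#include <vector>

// Reversed copy using the container's own reverse iterators.
std::vector<int> reversed_copy(const std::vector<int>& src) {
    return std::vector<int>(src.rbegin(), src.rend());
}

// Reversed copy using the helper function: end() becomes the start, begin() becomes the end.
std::vector<int> reversed_copy_explicit(const std::vector<int>& src) {
    return std::vector<int>(
        std::make_reverse_iterator(src.end()),
        std::make_reverse_iterator(src.begin())
    );
}
```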
↩PREREQUISITES↩
The C++ standard library provides functionality similar to Java streams, called ranges. Like Java streams, ranges enable functional programming in that a range can be fed into a chain of higher-level operations that manipulate the stream of elements within, lazily if possible.
// The code below prints the numbers 2 and 4. The Java streams equivalent is provided on the right-hand side.
//
// C++ vs JAVA
std::vector<int> v{ 0, 1, 2 }; // var v = new ArrayList<Integer>();
// v.add(0);
// v.add(1);
// v.add(2);
//
auto range { // var range =
v // v.stream()
| std::views::transform([](int x) { return x * 2; }) // .map(e -> e * 2)
| std::views::filter([](int x) { return x != 0; }) // .filter(e -> e != 0);
}; //
for (int e : range) { // range.forEach(e -> { System.out.println(e); } );
std::cout << e << std::endl; //
} //
As shown in the example above, ranges work similarly to Java streams. Operations are chained together using the pipe operator (|), where those operations are applied from left-to-right.
⚠️NOTE️️️⚠️
WARNING: Once v
above gets destroyed (e.g. goes out of scope), range
becomes invalid. v
is referenced by range
, it isn't copied / moved into range
. See the subsection on owning views to workaround this problem.
⚠️NOTE️️️⚠️
Unlike Java streams, the current implementation of ranges (C++20) are missing some major functionality:
std::vector<int> {0, 1, 2} | std::views::transform([](int x) { return x * 2; })
and std::vector<int> {0, 1, 2} | std::views::filter([](int x) { return x != 0; })
don't have the same type)forEach()
in Java streams)
Any object that has a particular set of type traits is a range. Those type traits map closely to iterator type traits: A range must have an implementation for std::begin(R)
and std::end(R)
functions, and usage patterns similar to that of the type of iterator it maps to:
One major difference between iterators and ranges is that a range's end()
function doesn't necessarily have to return the same type as its begin()
function. It can instead return a sentinel type that marks the end of the range. If a range does return the same type for both begin()
and end()
, it's referred to as a common range. Containers in the C++ standard library are all common ranges. However, once a container gets piped into an operation, it may end up not being a common range.
⚠️NOTE️️️⚠️
See std::views::common()
below. It wraps a view and makes it so that begin()
and end()
have a common return type.
container | range type |
---|---|
std::unordered_set | forward range |
std::unordered_map | forward range |
std::unordered_multiset | forward range |
std::unordered_multimap | forward range |
std::forward_list | forward range |
std::set | bidirectional range |
std::map | bidirectional range |
std::multiset | bidirectional range |
std::multimap | bidirectional range |
std::list | bidirectional range |
std::deque | random access range |
std::array | contiguous range |
std::vector | contiguous range |
⚠️NOTE️️️⚠️
std::string
and other types of string variants, while not containers, are contiguous ranges.
When an operation such as transformation or filtering is applied on a range, it's applied through a view. A view is a special type of range that typically doesn't own any data and typically isn't mutable / is state-less. As such, a view typically has constant-time copy, move, and assignment.
view | example | description |
---|---|---|
std::views::filter | v \| std::views::filter([](int x) { return x != 0; }) | keep elements that pass predicate |
std::views::transform | v \| std::views::transform([](int x) { return x * 2; }) | modify elements |
std::views::take | v \| std::views::take(5) | keep first n elements |
std::views::take_while | v \| std::views::take_while([](int x) { return x != 0; }) | keep elements until predicate fails |
std::views::drop | v \| std::views::drop(5) | skip first n elements |
std::views::drop_while | v \| std::views::drop_while([](int x) { return x == 0; }) | skip elements until predicate fails |
std::views::join | v \| std::views::join | flatten a range of ranges (2D) into a range (1D) |
std::views::join_with | v \| std::views::join_with(-1) | flatten a range of ranges (2D) into a range (1D) with delimiters in between (C++23) |
std::views::split | v \| std::views::split(-1) | split a range into a range of ranges using a delimiter |
std::views::lazy_split | v \| std::views::lazy_split(-1) | split a range into a range of ranges using a delimiter (lazily) |
std::views::counted | std::views::counted(v.begin(), 5) | keep a sub-range of a range |
std::views::common | std::views::common(v) | convert to a common view (begin() and end() have same type) |
std::views::reverse | v \| std::views::reverse | reverse a view |
std::views::elements | v \| std::views::elements<1> | transform tuples to their nth item |
std::views::keys | v \| std::views::keys | transform pairs to their 1st item |
std::views::values | v \| std::views::values | transform pairs to their 2nd item |
std::views::zip | std::views::zip(v1, v2, v3) | zip multiple ranges together (similar to Python's zip(), C++23) |
In addition to performing operations on another range's elements, a view may originate elements itself.
view | example | description |
---|---|---|
std::views::empty | std::views::empty<int> | empty range of some type |
std::views::single | std::views::single(5) | range of a single element |
std::views::iota | std::views::iota(1, 5) | range of incrementing values (bounded, end value excluded) |
⚠️NOTE️️️⚠️
The tables above aren't exhaustive.
At its core, a range must satisfy the concept std::ranges::range, which only asks that the type have an implementation for the functions std::ranges::begin(R) and std::ranges::end(R). Three concepts refine it:
- std::ranges::sized_range: A range type that has an implementation for std::ranges::size(R), which returns the number of elements within the range.
- std::ranges::borrowed_range: A range type that provides a template specialization for std::ranges::enable_borrowed_range<R>, which signals that the range type guarantees that the iterators it returns aren't bound to the lifetime of the range. Borrowed ranges commonly generate elements on-the-fly.
- std::ranges::view: A range type with constant-time copy/move/assignment operations that provides a template specialization for std::ranges::enable_view<R>, which signals that the range type is a view. Views are commonly used to transform elements from another range or generate elements on the fly.
The following templates provide access to the types used by a range.
- std::ranges::iterator_t<R>: iterator type of range R
- std::ranges::sentinel_t<R>: sentinel type of range R (type returned by std::ranges::end(R), which may be different from the type returned by std::ranges::begin(R))
- std::ranges::range_size_t<R>: size type of range R (type returned by std::ranges::size(R), if implemented)
- std::ranges::range_difference_t<R>: type returned by differencing two iterators of range R (resolves to std::iter_difference_t<std::ranges::iterator_t<R>>)
- std::ranges::range_reference_t<R>: type returned by dereferencing an iterator of range R (type returned by *(std::ranges::begin(R)))
- std::ranges::range_rvalue_reference_t<R>: type returned by dereferencing an iterator of range R but as an r-value reference (type returned by std::move(*(std::ranges::begin(R))))
- std::ranges::range_value_t<R>: type returned by dereferencing an iterator of range R but with the reference, const, and volatile removed (e.g. if std::ranges::range_reference_t<R> is const int&, std::ranges::range_value_t<R> is int)
void print_range(std::ranges::range auto &&range) {
using ELEM_REF = std::ranges::range_reference_t<decltype(range)>;
for (ELEM_REF v : range) {
std::cout << v << std::endl;
}
}
The following concepts detail the features supported by a range's iterator type. These concepts loosely map to the concepts for iterators.
std::ranges::input_range
maps to std::input_iterator
std::ranges::output_range
maps to std::output_iterator
std::ranges::forward_range
maps to std::forward_iterator
std::ranges::bidirectional_range
maps to std::bidirectional_iterator
std::ranges::random_access_range
maps to std::random_access_iterator
std::ranges::contiguous_range
maps to std::contiguous_iterator
void print_range(std::ranges::random_access_range auto &&range) {
auto it { std::ranges::begin(range) };
std::cout << it[3] << std::endl;
std::cout << it[1] << std::endl;
std::cout << it[2] << std::endl;
}
An owning view moves the range it's operating on into itself rather than reference that range. Doing this avoids the problem of a view referencing a destroyed range, which usually happens when a function returns a view but the range that view is referencing goes out of scope.
// This function is faulty because the returned view REFERENCES vec but
// vec gets destroyed when the function exits. The view references a
// destroyed object.
auto faulty_code() {
std::vector<int> vec{ 1, 2, 3 };
return vec
| std::views::transform([](int i) { return i * 2; })
| std::views::filter([](int i) { return i != 0; });
}
To create an owning view, use std::move()
on the original range.
// By using std::move() on the range, the view becomes an owning view.
auto good_code() {
std::vector<int> vec{ 1, 2, 3 };
return std::move(vec)
| std::views::transform([](int i) { return i * 2; })
| std::views::filter([](int i) { return i != 0; });
}
⚠️NOTE️️️⚠️
This started to be supported in version 12 of g++.
A range's type depends on the type of the underlying container or generator (e.g. std::set
), element type of the range (e.g. int
), and the list of view manipulations applied to that range. Each change ends up changing the underlying type of the range.
std::vector<int> vec{ 1, 2, 3 };
// THE CODE BELOW PRINTS "same!"
// ----------------------------
// decltype(v1) == decltype(v2) because both use the same underlying container type, element type,
// and have the exact same list of views applied WITH the exact same functor object.
auto functor { [](int i) { return i * 2; } };
auto v1 { vec | std::views::transform(functor) };
auto v2 { vec | std::views::transform(functor) };
if constexpr (std::is_same_v<decltype(v1), decltype(v2)>) {
std::cout << "same!";
} else {
std::cout << "NOT same!";
}
// THE CODE BELOW PRINTS "NOT same!"
// ---------------------------------
// decltype(v3) != decltype(v4) because although the two types use the same underlying container
// type, element type, and have the exact same list of views applied, those views are DIFFERENT:
// each lambda expression produces its own unique functor class, so the two functors are technically
// two different classes, each with its own unique type. Those unique types are included in the
// types of v3 and v4 somewhere in a depth of template parameter chains.
auto v3 { vec | std::views::transform([](int i) { return i * 2; }) };
auto v4 { vec | std::views::transform([](int i) { return i * 2; }) };
if constexpr (std::is_same_v<decltype(v3), decltype(v4)>) {
std::cout << "same!";
} else {
std::cout << "NOT same!";
}
The lack of type erasure sometimes causes problems when doing certain types of view manipulations. For example, combining together two ranges with the same element type (flattening) via std::views::join
isn't possible unless the types of those ranges are exactly the same.
std::ranges::empty_view<int> y{};
std::ranges::single_view<int> x{5};
std::vector combined{ x , y }; // x and y of different types, vector's template parameter can't be deduced
auto joined { std::ranges::join_view(combined) };
for (auto x : joined) {
std::cout << x << std::endl;
}
To mitigate this, a third-party library called range-v3 provides ranges::any_view<T>
. ranges::any_view<T>
essentially "erases" the type of a range by wrapping it and unifying it to a specific type. The downside of this wrapping is that it has a performance impact as abstracting away the type information involves extra runtime code.
std::ranges::empty_view<int> y{};
std::ranges::single_view<int> x{5};
std::vector<ranges::any_view<int>> combined{
ranges::any_view<int> { x },
ranges::any_view<int> { y }
};
auto joined { std::ranges::join_view(combined) };
for (auto x : joined) {
std::cout << x << std::endl;
}
One important thing about ranges::any_view<T>
is that it takes an optional second template parameter which defines the capabilities of the range it's wrapping. By default, it's set to category::input
which supports capabilities of an input range, but it also supports ...
category::input
- satisfies std::ranges::input_range
conceptcategory::forward
- satisfies std::ranges::forward_range
conceptcategory::bidirectional
- satisfies std::ranges::bidirectional_range
conceptcategory::random_access
- satisfies std::ranges::random_access_range
conceptcategory::sized
- satisfies std::ranges::sized_range
concept⚠️NOTE️️️⚠️
There's also category::none
and category::mask
, not exactly sure what these are for.
std::ranges::empty_view<int> y{};
std::ranges::single_view<int> x{5};
std::vector<ranges::any_view<int, ranges::category::input>> combined{
ranges::any_view<int, ranges::category::input> { x },
ranges::any_view<int, ranges::category::input> { y }
};
auto joined { std::ranges::join_view(combined) };
for (auto x : joined) {
std::cout << x << std::endl;
}
Alternatively, in certain cases std::span<T>
also abstracts away type information.
std::ranges::empty_view<int> y{};
std::ranges::single_view<int> x{5};
std::vector<std::span<int>> combined{
std::span<int> { x },
std::span<int> { y }
};
auto joined { std::ranges::join_view(combined) };
for (auto x : joined) {
std::cout << x << std::endl;
}
⚠️NOTE️️️⚠️
C++20 / C++23 has nothing in the standard library for this except for std::span<T>
, and AFAIK type erasure isn't what it was intended for. You should use range-v3. Future versions of C++ might provide something.
To write a custom view, create a class that inherits from std::ranges::view_interface
and provides a begin()
function, an end()
function, and a default constructor (if generating values) and / or a constructor that takes in a range (if manipulating values).
struct FakeGeneratingView : public std::ranges::view_interface<FakeGeneratingView> {
auto begin() const { return values; }
auto end() const { return values + 3; } // one past the last element
private:
int values[3] = { 0, 1, 2 };
};
// USE THE VIEW
for (auto x : FakeGeneratingView{}) {
std::cout << x << std::endl;
}
⚠️NOTE️️️⚠️
The above example class is feeding itself as a template parameter to std::ranges::view_interface
. This is a common C++ idiom referred to as the curiously recurring template pattern (CRTP) which allows for feeding the derived class back into a templated base class. Something to do with compile-time polymorphism.
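The following is a minimal sketch of CRTP outside of ranges, using hypothetical Greeter and English types. The base class template casts itself to the derived type, so the call to name() is resolved at compile time without any virtual functions. std::ranges::view_interface appears to use the same trick to build helper member functions on top of the derived class's begin() and end().

```cpp
#include <string>

// Base class template that receives the derived class as its template parameter.
template <typename Derived>
struct Greeter {
    std::string greet() const {
        // The cast is safe because Derived inherits from Greeter<Derived>.
        // name() is looked up on Derived at compile time (no virtual call).
        return "hello " + static_cast<const Derived&>(*this).name();
    }
};

// Derived class feeds itself back into the base class template.
struct English : Greeter<English> {
    std::string name() const { return "world"; }
};
```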
The following example is another custom view, but this time it takes in another range, manipulates its values, and supports the pipe operator.
template<std::ranges::input_range R>
requires std::ranges::view<R>
struct AddFiveView : public std::ranges::view_interface<AddFiveView<R>> {
AddFiveView() = delete;
constexpr AddFiveView(R&& r):
i(std::forward<R>(r)),
_begin(std::begin(i.range)),
_end(std::end(i.range)) {}
constexpr auto begin() const { return _begin; }
constexpr auto end() const { return _end; }
private:
struct F : decltype([](auto x) { return x + 5; }) {};
using R_RES = std::ranges::transform_view<R, F>;
struct Internal {
Internal(R&& r) : range(std::forward<R>(r) | std::views::transform(F())) {}
R_RES range;
};
Internal i; // what do I put as the template arg???
std::ranges::iterator_t<R_RES> _begin;
std::ranges::iterator_t<R_RES> _end;
};
struct AddFiveViewAdaptorClosure {
constexpr AddFiveViewAdaptorClosure() {}
template <std::ranges::viewable_range R>
constexpr auto operator()(R&& r) const {
return AddFiveView<R>(std::forward<R>(r));
}
} ;
struct AddFiveViewAdaptor {
template<std::ranges::viewable_range R>
constexpr auto operator () (R && r) {
return AddFiveView(std::forward<R>(r)) ;
}
constexpr auto operator () () {
return AddFiveViewAdaptorClosure();
}
};
template <std::ranges::viewable_range R>
constexpr auto operator | (R&& r, AddFiveViewAdaptorClosure const & a) {
return a(std::forward<R>(r)) ;
}
namespace CustomViews {
AddFiveViewAdaptorClosure AddFiveView;
}
// USE THE VIEW VIA THE PIPE OPERATOR
// Note the use of std::views::all -- this is required for some reason (maybe it normalizes some missing pieces)
std::vector<int> v{0,1,3};
for (auto x : v | std::views::all | CustomViews::AddFiveView) {
std::cout << x << std::endl;
}
↩PREREQUISITES↩
The C++ standard library comes with a set of algorithms that work via iterators. The use of iterators means that the algorithms aren't necessarily bound to containers, but can work on any iterable type. For example, std::vector
, std::string
, and std::filesystem::directory_iterator
are all iterable types.
std::vector<int> v {3, 2, 3, 4, 5, 6, 8, 7, 9};
auto pos { std::find(v.begin(), v.end(), 5) }; // find first instance of the integer 5
Algorithms are commonly exposed as callable units. Common algorithm functions are listed below. More elaborate functions and their usages are covered in the subsections below.
Function | Description |
---|---|
std::for_each() |
Walks over each element of a range |
std::transform() |
Transforms each element of a range |
std::count() |
Counts the number of times some element occurs |
std::fill() |
Fills range with a specific element |
std::copy() |
Copies each element of a range into another range |
std::move() |
Moves each element of a range into another range |
std::replace() |
Replaces elements within a range |
std::remove() |
Removes elements within a range |
std::reverse() |
Reverses the order of a range's elements |
std::find() |
Finds an element in a range |
std::search() |
Finds a sub-range within a range |
std::equal() |
Checks if two ranges are equal |
std::sort() |
Sorts a range |
⚠️NOTE️️️⚠️
Why use some of the algorithms here instead of those that come in the ranges library? The ranges library in C++20 doesn't have parallel algorithms (future versions of C++ may have this) and is missing some of these algorithms.
If the algorithm supports it, an execution policy may be specified via the first parameter. This policy requests a level of parallelism for the algorithm's execution.
- std::execution::seq: single-threaded.
- std::execution::unseq: single-threaded but vectorized (SIMD).
- std::execution::par: multi-threaded.
- std::execution::par_unseq: multi-threaded and vectorized (SIMD).
std::vector<int> v {3, 2, 3, 4, 5, 6, 8, 7, 9};
auto pos { std::find(std::execution::par_unseq, v.begin(), v.end(), 5) }; // find first instance of the integer 5, requested multi-threaded + vectorized
Function overloads needing an execution policy may require more elaborate iterator types (e.g. std::find()
's execution policy overloads require forward iterators instead of input iterators). This has to do with how multi-threaded / vectorized variants of an algorithm access data (e.g. typically need to hop around the data).
The subsections below list out various algorithm functions, their overloads, and usage examples. Different algorithm functions / overloads of the same function may use different type traits to perform the same task. For example, ...
std::equal_to<>()
overload.std::less<>()
overload.std::swap()
while others use assignment.⚠️NOTE️️️⚠️
I think the best thing you can do to avoid these type trait requirement issues is to maybe just ensure the type is "regular" / "semi-regular". Hopefully that'll make things just work with most algorithm functions.
🔍SEE ALSO🔍
These algorithms iterate over elements of a range to do something non-destructive.
Function | Description |
---|---|
std::for_each(it1, it2, f) |
Call f() on each element |
std::for_each_n(it1, n, f) |
Call f() on first n elements |
std::count(it1, it2, v) |
Count elements that are v |
std::count_if(it1, it2, p) |
Count elements where p() is true |
std::all_of(it1, it2, p) |
Check that all elements in it pass p(*it) == true |
std::none_of(it1, it2, p) |
Check that all elements in it pass p(*it) != true |
std::any_of(it1, it2, p) |
Check at least one element in it passes p(*it) == true |
std::vector<int> v {0,1,2,3,4,5};
std::for_each(v.begin(), v.end(), [](auto& v) { std::cout << v << ' '; }); // 0 1 2 3 4 5
std::for_each_n(v.begin(), 3, [](auto& v) { std::cout << v << ' '; }); // 0 1 2
auto c1 { std::count(v.begin(), v.end(), 5) }; // 1
auto c2 { std::count_if(v.begin(), v.end(), [](auto& v) { return v % 2 == 0; }) }; // 3
bool t1 { std::all_of(v.begin(), v.end(), [](auto& v) { return v % 2 == 0; }) }; // false
bool t2 { std::any_of(v.begin(), v.end(), [](auto& v) { return v % 2 == 0; }) }; // true
bool t3 { std::none_of(v.begin(), v.end(), [](auto& v) { return v % 2 == 0; }) }; // false
↩PREREQUISITES↩
These functions fill a range either with repeats of the same value or with generated values.
Function | Description |
---|---|
std::fill(it1, it2, v) |
Write out v |
std::fill_n(it1, n, v) |
Write out n copies of v |
std::generate(it1, it2, g) |
Write out g() 's result |
std::generate_n(it1, n, g) |
Write out g() 's result n times |
std::iota(it1, it2, v) |
Write sequentially increasing values starting from v |
std::vector<int> v1 {1,3,3,5};
std::fill(v1.begin(), v1.end(), 0); // v1 becomes {0,0,0,0}
std::vector<int> v2 {};
std::fill_n(std::back_insert_iterator { v2 }, 4, -3); // v2 becomes {-3,-3,-3,-3}
std::vector<int> v3 {1,2,3,4};
std::generate(v3.begin(), v3.end(), []() { return 65; }); // v3 becomes {65,65,65,65}
std::vector<int> v4 {};
std::generate_n(std::back_insert_iterator { v4 }, 4, []() { return 1; }); // v4 becomes {1,1,1,1}
std::vector<int> v5 {0,0,0,0};
std::iota(v5.begin(), v5.end(), 150); // v5 becomes {150,151,152,153}
// Note the use of std::back_insert_iterator in some of these examples. You need to use `std::back_insert_iterator`
// when you want to insert elements rather than overwrite them. It creates a phony never-ending output iterator that
// simply calls push_back() on the underlying container.
↩PREREQUISITES↩
These functions copy one range into another.
Function | Description |
---|---|
std::copy(itA1, itA2, itB1) |
Overwrite B with A |
std::copy_n(itA1, n, itB1) |
Overwrite B with first n elements of A |
std::copy_if(itA1, itA2, itB1, p) |
Overwrite B with A only for elements where p() is true |
std::copy_backward(itA1, itA2, itB2) |
Overwrite B with A (from last-to-first) |
std::vector<int> v1 {0,1,2,3,4,5};
std::vector<int> v2 {};
std::copy(v1.begin(), v1.end(), std::back_insert_iterator { v2 }); // v2 becomes {0,1,2,3,4,5}
std::vector<int> v3 {-1,-1,-1,-1,-1,-1};
std::copy(v1.begin(), v1.end(), v3.begin()); // v3 becomes {0,1,2,3,4,5}
std::vector<int> v4 {-1,-1,-1};
std::copy_n(v1.begin(), 3, v4.begin()); // v4 becomes {0,1,2}
std::vector<int> v5 {-1,-1,-1,-1,-1,-1};
std::copy_backward(v1.begin(), v1.end(), v5.end()); // v5 becomes {0,1,2,3,4,5}
// Note the use of std::back_insert_iterator in some of these examples. You need to use `std::back_insert_iterator`
// when you want to insert elements rather than overwrite them. It creates a phony never-ending output iterator that
// simply calls push_back() on the underlying container.
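The predicate variant isn't exercised in the examples above, so here is a small sketch of it. copy_evens is a hypothetical function that uses std::copy_if() with std::back_inserter() to keep only the even elements.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Copies only the even elements of src into a new vector.
std::vector<int> copy_evens(const std::vector<int>& src) {
    std::vector<int> dst {};
    std::copy_if(
        src.begin(), src.end(),
        std::back_inserter(dst),            // insert rather than overwrite
        [](int x) { return x % 2 == 0; }    // predicate: keep even values
    );
    return dst;
}
```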
These functions move elements between ranges.
🔍SEE ALSO🔍
Function | Description |
---|---|
std::move(itA1, itA2, itB1) |
Move A to B using move semantics |
std::move_backward(itA1, itA2, itB2) |
Move A to B using move semantics (last to first) |
std::swap_ranges(itA1, itA2, itB1) |
Swap between A and B using std::swap(*itA, *itB) |
std::vector<int> v1 {1,3,3,5};
std::vector<int> v2 {};
std::move(v1.begin(), v1.end(), std::back_insert_iterator { v2 }); // v2 becomes {1,3,3,5}
std::vector<int> v3 {1,3,3,5};
std::vector<int> v4 {0,0,0,0}; // destination must already have room, because elements get overwritten (not inserted)
std::move(v3.begin(), v3.end(), v4.begin()); // v4 becomes {1,3,3,5}
std::vector<int> v5 {1,3,3,5};
std::vector<int> v6 {0,0,0,0};
std::move_backward(v5.begin(), v5.end(), v6.end()); // v6 becomes {1,3,3,5}
std::vector<int> v7 {1,3,3,5};
std::vector<int> v8 {9,9,9,9};
std::swap_ranges(v7.begin(), v7.end(), v8.begin()); // v7 becomes {9,9,9,9} and v8 becomes {1,3,3,5}
std::vector<int> v9 {1,3,3,5};
std::swap_ranges(v9.begin() + 2, v9.end(), v9.begin()); // v9 becomes {3,5,1,3}
// Note the use of std::back_insert_iterator in some of these examples. You need to use `std::back_insert_iterator`
// when you want to insert elements rather than overwrite them. It creates a phony never-ending output iterator that
// simply calls push_back() on the underlying container.
⚠️NOTE️️️⚠️
If you're swapping within the same range via std::swap_ranges()
, the positions being swapped can't overlap. If they do, it's undefined behaviour.
These functions replace elements within a range.
Function | Description |
---|---|
std::replace(it1, it2, v_old, v_new) | Set all positions with v_old to v_new |
std::replace_if(it1, it2, p, v_new) | Set all positions where p() is true to v_new |
std::replace_copy(itA1, itA2, itB1, v_old, v_new) | std::replace(itA1, itA2, v_old, v_new) but result written to B |
std::replace_copy_if(itA1, itA2, itB1, p, v_new) | std::replace_if(itA1, itA2, p, v_new) but result written to B |
std::vector<int> v1 {1,3,3,5};
std::replace(v1.begin(), v1.end(), 3, 9); // v1 becomes {1,9,9,5}
std::replace_if(v1.begin(), v1.end(), [](auto &v) { return v == 9; }, 0); // v1 becomes {1,0,0,5}
std::vector<int> v2 {};
std::replace_copy(v1.begin(), v1.end(), std::back_insert_iterator { v2 }, 0, 4); // v2 becomes {1,4,4,5}
std::vector<int> v3 {};
std::replace_copy_if(v1.begin(), v1.end(), std::back_insert_iterator { v3 }, // v3 becomes {1,3,3,5}
[](auto &v) { return v == 0; }, 3);
// Note the use of std::back_insert_iterator in some of these examples. You need to use `std::back_insert_iterator`
// when you want to insert elements rather than overwrite them. It creates a phony never-ending output iterator that
// simply calls push_back() on the underlying container.
↩PREREQUISITES↩
⚠️NOTE️️️⚠️
Are these using std::swap(), std::move(), or assignment (=)?
These functions remove elements from a range.
Function | Description |
---|---|
std::remove(it1, it2, v_old) | Remove elements that are v_old |
std::remove_if(it1, it2, p) | Remove elements where p() is true |
std::remove_copy(itA1, itA2, itB1, v_old) | std::remove(itA1, itA2, v_old) but result written to B |
std::remove_copy_if(itA1, itA2, itB1, p) | std::remove_if(itA1, itA2, p) but result written to B |
Removing an element simply shuffles around elements accordingly and returns a new ending iterator. It won't resize the underlying container to end at that new ending iterator's position. That's the user's responsibility.
std::vector<int> v1 {1,3,3,5};
auto v1_new_end { std::remove(v1.begin(), v1.end(), 3) };
v1.erase(v1_new_end, v1.end()); // v1 becomes {1,5}
std::vector<int> v2 {1,3,3,5};
auto v2_new_end { std::remove_if(v2.begin(), v2.end(), [](auto &e) { return e == 3; }) };
v2.erase(v2_new_end, v2.end()); // v2 becomes {1,5}
std::vector<int> v3 {1,3,3,5};
std::vector<int> v3_removed {};
std::remove_copy(v3.begin(), v3.end(), std::back_insert_iterator { v3_removed }, 3); // v3_removed becomes {1,5}
std::vector<int> v4 {1,3,3,5};
std::vector<int> v4_removed {};
std::remove_copy_if(v4.begin(), v4.end(), std::back_insert_iterator { v4_removed },
[](auto &e) { return e == 3; }); // v4_removed becomes {1,5}
// Note the use of std::back_insert_iterator in some of these examples. You need to use `std::back_insert_iterator`
// when you want to insert elements rather than overwrite them. It creates a phony never-ending output iterator that
// simply calls push_back() on the underlying container.
⚠️NOTE️️️⚠️
C++20 offers new functions called std::erase() / std::erase_if(), which combine std::remove() / std::remove_if() with a call to the container's erase() function, ensuring that the container is properly resized.
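For code that can't rely on C++20, the remove-then-erase steps can be wrapped up by hand. The sketch below assumes a hypothetical helper named `erase_value` (not a standard function) that works on `std::vector<int>` only. It returns the number of removed elements, which is also what `std::erase()` reports for a vector.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: removes every element equal to value and shrinks the vector to fit.
// Returns how many elements were removed.
std::size_t erase_value(std::vector<int>& v, int value) {
    // Step 1: shuffle the elements being kept to the front and get the new logical end.
    auto new_end { std::remove(v.begin(), v.end(), value) };
    auto removed { static_cast<std::size_t>(std::distance(new_end, v.end())) };
    // Step 2: actually resize the container so it ends at the new logical end.
    v.erase(new_end, v.end());
    return removed;
}
```

Skipping step 2 leaves the vector at its original size, with leftover elements past the new end that have unspecified values.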
These functions replace multiple adjacent occurrences of an element within a range with a single occurrence (collapse adjacent duplicates), which is essentially just another form of removing elements.
Function | Description |
---|---|
std::unique(it1, it2) |
Collapse adjacent duplicates |
std::unique(it1, it2, bp) |
Collapse adjacent duplicates, using bp() to determine if two elements are duplicates |
std::unique_copy(itA1, itA2, itB1) |
std::unique(itA1, itA2) but result written to B |
std::unique_copy(itA1, itA2, itB1, bp) |
std::unique(itA1, itA2, bp) but result written to B |
std::vector<int> v1 {1,3,3,5};
auto v1_new_end { std::unique(v1.begin(), v1.end()) } ;
v1.erase(v1_new_end, v1.end()); // v1 becomes {1,3,5}
std::vector<int> v2 {1,3,3,5};
auto v2_new_end { std::unique(v2.begin(), v2.end(), [](auto &e1, auto &e2) { return e1 == e2; }) };
v2.erase(v2_new_end, v2.end()); // v2 becomes {1,3,5}
std::vector<int> v3 {1,3,3,5};
std::vector<int> v3_unique {};
std::unique_copy(v3.begin(), v3.end(), std::back_insert_iterator { v3_unique }); // v3_unique becomes {1,3,5}
std::vector<int> v4 {1,3,3,5};
std::vector<int> v4_unique {};
std::unique_copy(v4.begin(), v4.end(), std::back_insert_iterator { v4_unique },
[](auto &e1, auto &e2) { return e1 == e2; }); // v4_unique becomes {1,3,5}
// Note the use of std::back_insert_iterator in some of these examples. You need to use `std::back_insert_iterator`
// when you want to insert elements rather than overwrite them. It creates a phony never-ending output iterator that
// simply calls push_back() on the underlying container.
↩PREREQUISITES↩
⚠️NOTE️️️⚠️
Are these using std::swap(), std::move(), or assignment (=)?
These functions reverse a range's order.
Function | Description |
---|---|
std::reverse(it1, it2) |
Reverse |
std::reverse_copy(itA1, itA2, itB1) |
Reverse A and copy into B |
std::vector<int> v1 {1,3,3,5};
std::reverse(v1.begin(), v1.end()); // v1 becomes {5,3,3,1}
std::vector<int> v2 {1,3,3,5};
std::vector<int> v3 {0,0,0,0};
std::reverse_copy(v2.begin(), v2.end(), v3.begin()); // v3 becomes {5,3,3,1}
std::vector<int> v4 {1,3,3,5};
std::vector<int> v5 {};
std::reverse_copy(v4.begin(), v4.end(), std::back_insert_iterator { v5 }); // v5 becomes {5,3,3,1}
// Note the use of std::back_insert_iterator in some of these examples. You need to use `std::back_insert_iterator`
// when you want to insert elements rather than overwrite them. It creates a phony never-ending output iterator that
// simply calls push_back() on the underlying container.
These functions rotate a range.
Function | Description |
---|---|
std::rotate(it1, it_mid, it2) |
Rotate such that mid 's position becomes the new first element |
std::rotate_copy(itA1, itA_mid, itA2, itB1) |
Copy into B rotated A that has mid 's position as its first element |
std::vector<int> v1 {1,3,3,5};
std::rotate(v1.begin(), v1.begin() + 1, v1.end()); // v1 becomes {3,3,5,1}
std::vector<int> v2 {1,3,3,5};
std::vector<int> v3 {0,0,0,0};
std::rotate_copy(v2.begin(), v2.begin() + 1, v2.end(), v3.begin()); // v3 becomes {3,3,5,1}
std::vector<int> v4 {1,3,3,5};
std::vector<int> v5 {};
std::rotate_copy(v4.begin(), v4.begin() + 1, v4.end(), std::back_insert_iterator { v5 }); // v5 becomes {3,3,5,1}
// Note the use of std::back_insert_iterator in some of these examples. You need to use `std::back_insert_iterator`
// when you want to insert elements rather than overwrite them. It creates a phony never-ending output iterator that
// simply calls push_back() on the underlying container.
These functions shift the elements within a range. Shifting an element simply shuffles around elements accordingly and returns a new ending / beginning iterator. It won't resize the underlying container to end at that new ending iterator's position. That's the user's responsibility.
Function | Description |
---|---|
std::shift_left(it1, it2, n) |
Shift left by n |
std::shift_right(it1, it2, n) |
Shift right by n |
std::vector<int> v1 {1,3,3,5};
auto v1_new_end { std::shift_left(v1.begin(), v1.end(), 2) };
v1.erase(v1_new_end, v1.end()); // v1 becomes {3,5}
std::vector<int> v2 {1,3,3,5};
auto v2_new_begin { std::shift_right(v2.begin(), v2.end(), 2) };
v2.erase(v2.begin(), v2_new_begin); // v2 becomes {1,3} -- the elements before the new beginning are the leftovers
↩PREREQUISITES↩
⚠️NOTE️️️⚠️
Are these using std::swap(), std::move(), or assignment (=)?
These functions shuffle / sample based on some random number generator.
Function | Description |
---|---|
std::shuffle(it1, it2, rng) |
Shuffle using the uniform random number generator rng |
std::sample(itA1, itA2, itB1, n, rng) |
Sample n elements from A into B using the uniform random number generator rng |
std::random_device rd {};
std::mt19937 g { rd() };
std::vector<int> v1 {1,3,3,5};
std::shuffle(v1.begin(), v1.end(), g); // v1 ended up as {1,3,5,3} on one of the runs
std::vector<int> v2 {1,3,3,5};
std::vector<int> v3 {};
std::sample(v2.begin(), v2.end(), std::back_insert_iterator { v3 }, 2, g); // v3 ended up as {1,5} on one of the runs
std::vector<int> v4 {1,3,3,5};
std::vector<int> v5 {9,9};
std::sample(v4.begin(), v4.end(), v5.begin(), 2, g); // v5 ended up as {1,3} on one of the runs
// Note the use of std::back_insert_iterator in some of these examples. You need to use `std::back_insert_iterator`
// when you want to insert elements rather than overwrite them. It creates a phony never-ending output iterator that
// simply calls push_back() on the underlying container.
🔍SEE ALSO🔍
std::all_of() / std::any_of() / std::none_of()
⚠️NOTE️️️⚠️
Do these functions actually call into std::equal_to<>() to compare, or do they just use the equal-to operator (==) directly? cppreference doesn't seem to say.
These functions find a single element within a range.
Function | Description |
---|---|
std::find(it1, it2, v) | Find first v |
std::find_if(it1, it2, p) | Find first where p() determines a match |
std::find_if_not(it1, it2, p) | Find first where !p() determines a match |
std::find_first_of(itA1, itA2, itB1, itB2) | Find first element of A that matches any element of B |
std::find_first_of(itA1, itA2, itB1, itB2, bp) | Find first element of A that matches any element of B, using bp() to determine a match |
std::string s {"hello world!"};
auto s1_it { std::find(s.begin(), s.end(), 'o') }; // points to s.begin()+4
auto s2_it { std::find(s.begin(), s.end(), 'z') }; // points to s.end() (not found)
auto s3_it { std::find_if(s.begin(), s.end(), [](auto &ch) { return ch == 'o'; }) }; // points to s.begin()+4
auto s4_it { std::find_if(s.begin(), s.end(), [](auto &ch) { return ch == 'z'; }) }; // points to s.end() (not found)
auto s5_it { std::find_if_not(s.begin(), s.end(), [](auto &ch) { return ch == 'h'; }) }; // points to s.begin()+1
auto s6_it { std::find_if_not(s.begin(), s.end(), [](auto &ch) { return ch == 'z'; }) }; // points to s.begin()
std::string s_ {"dw!"};
auto s7_it { std::find_first_of(s.begin(), s.end(), s_.begin(), s_.end()) }; // points to s.begin()+6
auto s8_it { std::find_first_of(s.begin(), s.end(), s_.begin(), s_.end(),
[](auto &ch1, auto &ch2) { return ch1 == ch2; }) }; // points to s.begin()+6
These functions find consecutive elements in a range.
Function | Description |
---|---|
std::adjacent_find(it1, it2) | Find first where same element appears twice in a row |
std::adjacent_find(it1, it2, bp) | Find first where same element appears twice in a row, using bp() to determine a match |
std::search_n(it1, it2, n, v) | Find first where v occurs n times consecutively |
std::search_n(it1, it2, n, v, bp) | Find first where v occurs n times consecutively, using bp() to determine a match |
std::string s {"hello world AAA!"};
auto s1_it { std::adjacent_find(s.begin(), s.end()) }; // points to s.begin()+2
auto s2_it { std::adjacent_find(s.begin(), s.end(),
[](auto &ch1, auto &ch2) { return ch1 == ch2; }) }; // points to s.begin()+2
auto s3_it { std::search_n(s.begin(), s.end(), 3, 'A') }; // points to s.begin()+12
auto s4_it { std::search_n(s.begin(), s.end(), 3, 'A',
[](auto &ch1, auto &ch2) { return ch1 == ch2; }) }; // points to s.begin()+12
These functions find the first mismatch between two ranges.
Function | Description |
---|---|
std::mismatch(itA1, itA2, itB1, itB2) |
Find first mismatch between A and B |
std::mismatch(itA1, itA2, itB1, itB2, bp) |
Find first mismatch between A and B where !bp() determines a mismatch |
std::string s1 {"hello world!"};
std::string s2 {"hello moon!"};
auto [s1_it, s2_it] { std::mismatch(s1.begin(), s1.end(), s2.begin(), s2.end()) }; // s1_it=s1.begin()+6, s2_it=s2.begin()+6
auto [s3_it, s4_it] { std::mismatch(s1.begin(), s1.end(), s2.begin(), s2.end(),
    [](auto &ch1, auto &ch2) { return ch1 == ch2; }) }; // s3_it=s1.begin()+6, s4_it=s2.begin()+6
These functions find a sub-range within a larger range (e.g. find a substring).
Function | Description |
---|---|
std::search(itA1, itA2, itB1, itB2) |
Search for first occurrence of B within A |
std::search(itA1, itA2, itB1, itB2, bp) |
Search for first occurrence of B within A , using bp() to determine if elements are equal |
std::find_end(itA1, itA2, itB1, itB2) |
Search for last occurrence of B within A |
std::find_end(itA1, itA2, itB1, itB2, bp) |
Search for last occurrence of B within A , using bp() to determine if elements are equal |
std::string s {"hello world!"};
std::string s_ {"world"};
auto s1_it { std::search(s.begin(), s.end(), s_.begin(), s_.end()) }; // points to s.begin()+6
auto s2_it { std::search(s.begin(), s.end(), s_.begin(), s_.end(),
[](auto &ch1, auto &ch2) { return ch1 == ch2; }) }; // points to s.begin()+6
auto s3_it { std::find_end(s.begin(), s.end(), s_.begin(), s_.end()) }; // points to s.begin()+6
auto s4_it { std::find_end(s.begin(), s.end(), s_.begin(), s_.end(),
    [](auto &ch1, auto &ch2) { return ch1 == ch2; }) }; // points to s.begin()+6
⚠️NOTE️️️⚠️
std::search() in particular seems to have overloads for different kinds of "searchers": default_searcher, boyer_moore_searcher, boyer_moore_horspool_searcher, and possibly others provided by other libraries (e.g. boost). I don't know enough about this to know what the benefit of using one over the other is.
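The searcher overload of `std::search()` takes a searcher object instead of a second pair of iterators. The sketch below shows roughly how that looks with `std::boyer_moore_searcher` (C++17, from `<functional>`). `find_with_boyer_moore` is a hypothetical helper name made up for illustration. The usual explanation for the Boyer-Moore variants is that they preprocess the pattern up front so that they can skip ahead in the text, which tends to pay off for long texts, but treat that as a rule of thumb and measure before relying on it.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper: index of the first occurrence of needle in haystack, or
// std::string::npos if needle doesn't occur.
std::size_t find_with_boyer_moore(const std::string& haystack, const std::string& needle) {
    // The searcher does its preprocessing of the pattern once, at construction.
    std::boyer_moore_searcher searcher(needle.begin(), needle.end());
    // This overload of std::search() takes the searcher in place of the pattern's iterators.
    auto it { std::search(haystack.begin(), haystack.end(), searcher) };
    return it == haystack.end() ? std::string::npos
                                : static_cast<std::size_t>(it - haystack.begin());
}
```

The same searcher object can be reused across multiple `std::search()` calls, which is where the upfront preprocessing cost gets amortized.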
🔍SEE ALSO🔍
std::equal() : std::mismatch()
⚠️NOTE️️️⚠️
Do these functions actually call into std::equal_to<>() / std::less<>() to compare, or do they just use the equal-to operator (==) / less-than operator (<) directly? cppreference doesn't seem to say.
These functions compare ranges.
Function | Description |
---|---|
std::equal(itA1, itA2, itB1, itB2) |
Check if A and B are equal using == on elements |
std::equal(itA1, itA2, itB1, itB2, bp) |
Check if A and B are equal using bp() on elements |
std::lexicographical_compare(itA1, itA2, itB1, itB2) |
Check if A is less than B using < on elements |
std::lexicographical_compare(itA1, itA2, itB1, itB2, lt) |
Check if A is less than B using lt() on elements |
⚠️NOTE️️️⚠️
There's also std::lexicographical_compare_three_way(), which does lexicographical comparison but uses the spaceship operator to do so.
std::string s {"hello world!"};
std::string s_ {"world"};
bool eq1 { std::equal(s.begin()+6, s.end()-1, s_.begin(), s_.end()) }; // true
bool eq2 { std::equal(s.begin()+6, s.end()-1, s_.begin(), s_.end(),
[](auto& ch1, auto& ch2) { return ch1 == ch2; }) }; // true
bool lt1 { std::lexicographical_compare(s.begin(), s.end(), s_.begin(), s_.end()) }; // true
bool lt2 { std::lexicographical_compare(s.begin(), s.end(), s_.begin(), s_.end(),
[](auto& ch1, auto& ch2) { return ch1 < ch2; }) }; // true
↩PREREQUISITES↩
⚠️NOTE️️️⚠️
Are these using std::swap(), std::move(), or assignment (=)?
These functions partition a range into two groups based on a predicate.
Function | Description |
---|---|
std::partition(it1, it2, p) | Split into two partitions based on p() |
std::stable_partition(it1, it2, p) | Split into two partitions based on p(), maintaining relative order within each partition |
std::partition_copy(itA1, itA2, itB, itC, p) | Split A into two partitions based on p(), copying elements where p() is true into B and all other elements into C |
std::is_partitioned(it1, it2, p) | Check if partitioned into two based on p() |
std::partition_point(it1, it2, p) | Return the first element of the second partition (the first element where p() is false) |
std::vector<int> v1 {0,1,2,3,4,5};
auto v1_odd_it { std::partition(v1.begin(), v1.end(), [](auto &e) { return e%2 == 0; }) };
// v1 becomes e.g. {0,4,2,3,1,5} (order within each group is unspecified), v1_odd_it is iterator pointing to first element of the odd group
std::vector<int> v2 {0,1,2,3,4,5};
auto v2_odd_it { std::stable_partition(v2.begin(), v2.end(), [](auto &e) { return e%2 == 0; }) };
// v2 becomes {0,2,4,1,3,5}, v2_odd_it is iterator pointing to first element of the odd group
std::vector<int> v3 {0,1,2,3,4,5};
std::vector<int> v3_even {};
std::vector<int> v3_odd {};
std::partition_copy(v3.begin(), v3.end(),
std::back_insert_iterator { v3_even },
std::back_insert_iterator { v3_odd },
    [](auto &e) { return e%2 == 0; }); // v3_even becomes {0,2,4} and v3_odd becomes {1,3,5}
bool v3_partitioned { std::is_partitioned(v3.begin(), v3.end(), [](auto &e) { return e%2 == 0; }) }; // false
bool v1_partitioned { std::is_partitioned(v1.begin(), v1.end(), [](auto &e) { return e%2 == 0; }) }; // true
auto v1_odd_it2 { std::partition_point(v1.begin(), v1.end(), [](auto &e) { return e%2 == 0; }) };
// v1_odd_it2 returns iterator pointing to first element of the odd group
// Note the use of std::back_insert_iterator in some of these examples. You need to use `std::back_insert_iterator`
// when you want to insert elements rather than overwrite them. It creates a phony never-ending output iterator that
// simply calls push_back() on the underlying container.
⚠️NOTE️️️⚠️
You may be wondering what the point of partitioning is if you can only partition into two groups. You can partition into more than two groups by iteratively calling std::partition(). For example, imagine you need to partition into 4 groups: Call std::partition() with the predicate for the first group's criteria, which moves the first group's elements to the front and returns an iterator pointing to the element just after them. Then call std::partition() with the second group's criteria, but use the return value of the previous std::partition() as the starting iterator. Do the same thing for the third group. Whatever remains after that is the fourth group.
↩PREREQUISITES↩
⚠️NOTE️️️⚠️
Do these functions actually call into std::less<>() to compare, or do they just use the less-than operator (<) directly? cppreference doesn't seem to say.
For std::stable_sort(), it says it maintains the order of elements that are the same, but cppreference doesn't say how "sameness" is determined. Does it use the equal-to operator (==), std::equal_to(), or is it doing !(a < b) && !(b < a)?
How are they moving values around? Are these using std::swap(), std::move(), or assignment (=)?
🔍SEE ALSO🔍
std::sort() : std::lexicographical_compare()
These functions sort a range / check to see if a range is sorted.
Function | Description |
---|---|
std::sort(it1, it2) | Sort using < to compare |
std::sort(it1, it2, cmp) | Sort using cmp() to compare |
std::stable_sort(it1, it2) | Sort using < to compare, maintaining order of elements that are the same |
std::stable_sort(it1, it2, cmp) | Sort using cmp() to compare, maintaining order of elements that are the same |
std::is_sorted(it1, it2) | Check if sorted using < to compare |
std::is_sorted(it1, it2, cmp) | Check if sorted using cmp() to compare |
std::partial_sort(it1, itm, it2) | Rearrange so that [it1, itm) holds the smallest elements in sorted order, using < to compare (the rest end up in unspecified order) |
std::partial_sort(it1, itm, it2, cmp) | Rearrange so that [it1, itm) holds the smallest elements in sorted order, using cmp() to compare (the rest end up in unspecified order) |
std::partial_sort_copy(itA1, itA2, itB1, itB2) | Copy the smallest elements of A into B in sorted order (as many as fit in B), using < to compare |
std::partial_sort_copy(itA1, itA2, itB1, itB2, cmp) | Copy the smallest elements of A into B in sorted order (as many as fit in B), using cmp() to compare |
std::is_sorted_until(it1, it2) | Find the end of the sorted prefix (the first element that's out of order), using < to compare |
std::is_sorted_until(it1, it2, cmp) | Find the end of the sorted prefix (the first element that's out of order), using cmp() to compare |
std::nth_element(it1, itn, it2) | Place into itn the element that would be there if range were sorted, using < to compare |
std::nth_element(it1, itn, it2, cmp) | Place into itn the element that would be there if range were sorted, using cmp() to compare |
std::vector<int> v1 {0,5,1,2,3,4};
std::sort(v1.begin(), v1.end()); // v1 becomes {0,1,2,3,4,5}
std::vector<int> v2 {0,5,1,2,3,4};
std::sort(v2.begin(), v2.end(), [](auto &a, auto &b) { return a < b; }); // v2 becomes {0,1,2,3,4,5}
std::vector<int> v3 {0,5,1,2,3,4};
std::stable_sort(v3.begin(), v3.end()); // v3 becomes {0,1,2,3,4,5}
bool s1 { std::is_sorted(v1.begin(), v1.end()) }; // true
std::vector<int> v4 {0,5,1,2,3,4};
std::partial_sort(v4.begin(), v4.begin()+3, v4.end()); // v4 becomes e.g. {0,1,2,5,3,4} (order of the last 3 elements is unspecified)
std::vector<int> v5 {0,5,1,2,3,4};
std::vector<int> v6 {0,0,0};
std::partial_sort_copy(v5.begin(), v5.begin()+3, v6.begin(), v6.end()); // v6 becomes {0,1,5}
auto v6_it1 { std::is_sorted_until(v6.begin(), v6.end()) }; // returns v6.end(), because all of v6 is sorted
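The examples above don't cover `std::nth_element()`. Below is a minimal sketch of it, assuming a hypothetical helper named `nth_smallest` (not a standard function). It works on a copy so that the caller's vector isn't reordered.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the element that would sit at index n if v were sorted.
// v is taken by value so the caller's vector is left untouched. Assumes n < v.size().
int nth_smallest(std::vector<int> v, std::size_t n) {
    auto nth { v.begin() + static_cast<std::ptrdiff_t>(n) };
    // Afterwards, *nth is what a full sort would have put there, everything before it is no
    // larger, and everything after it is no smaller (but neither side is itself sorted).
    std::nth_element(v.begin(), nth, v.end());
    return *nth;
}
```

This is a common way to pull out a median or percentile without paying for a full sort.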
These functions merge two sorted ranges together (result is also sorted).
Function | Description |
---|---|
std::merge(itA1, itA2, itB1, itB2, itC1) | Merge sorted range A and sorted range B into sorted C, using < to compare |
std::merge(itA1, itA2, itB1, itB2, itC1, cmp) | Merge sorted range A and sorted range B into sorted C, using cmp() to compare |
std::inplace_merge(it1, itm, it2) | Merge together two sorted sub-ranges [it1, itm) and [itm, it2), using < to compare |
std::inplace_merge(it1, itm, it2, cmp) | Merge together two sorted sub-ranges [it1, itm) and [itm, it2), using cmp() to compare |
std::includes(itA1, itA2, itB1, itB2) | Check if all elements of sorted range B are in sorted range A, using < to compare |
std::includes(itA1, itA2, itB1, itB2, cmp) | Check if all elements of sorted range B are in sorted range A, using cmp() to compare |
std::vector<int> v1 {0,2,4};
std::vector<int> v2 {1,3,5};
std::vector<int> v3 {};
std::merge(v1.begin(), v1.end(), v2.begin(), v2.end(), std::back_insert_iterator { v3 }); // v3 becomes {0,1,2,3,4,5}
std::vector<int> v7 {0,2,4,1,3,5};
std::inplace_merge(v7.begin(), v7.begin()+3, v7.end()); // v7 becomes {0,1,2,3,4,5}
std::vector<int> v9 {0,1,2,3,4,5};
std::vector<int> v10 {1,3,5};
bool sorted1 { std::includes(v9.begin(), v9.end(), v10.begin(), v10.end()) }; // true
bool sorted2 { std::includes(v9.begin(), v9.end(), v10.begin(), v10.end(),
[](auto &a, auto &b) { return a < b; }) }; // true
// Note the use of std::back_insert_iterator in some of these examples. You need to use `std::back_insert_iterator`
// when you want to insert elements rather than overwrite them. It creates a phony never-ending output iterator that
// simply calls push_back() on the underlying container.
These functions perform binary search on a sorted range.
Function | Description |
---|---|
std::binary_search(it1, it2, v) | Check if v is in the sorted range, using < to compare |
std::binary_search(it1, it2, v, cmp) | Check if v is in the sorted range, using cmp() to compare |
std::lower_bound(it1, it2, v) | Find first element not less than v (the left-most v, if present) in the sorted range, using < to compare |
std::lower_bound(it1, it2, v, cmp) | Find first element not less than v (the left-most v, if present) in the sorted range, using cmp() to compare |
std::upper_bound(it1, it2, v) | Find first element greater than v (just past the right-most v, if present) in the sorted range, using < to compare |
std::upper_bound(it1, it2, v, cmp) | Find first element greater than v (just past the right-most v, if present) in the sorted range, using cmp() to compare |
std::equal_range(it1, it2, v) | Both std::lower_bound(it1, it2, v) and std::upper_bound(it1, it2, v) returned as std::pair<> |
std::equal_range(it1, it2, v, cmp) | Both std::lower_bound(it1, it2, v, cmp) and std::upper_bound(it1, it2, v, cmp) returned as std::pair<> |
std::vector<int> v1 {0,1,2,3,3,4,5};
bool found1 { std::binary_search(v1.begin(), v1.end(), 3) }; // true
auto it1 { std::lower_bound(v1.begin(), v1.end(), 3) }; // returns iterator pointing to first 3 entry
auto it3 { std::upper_bound(v1.begin(), v1.end(), 3) }; // returns iterator pointing just past the last 3 entry (to the 4)
auto [it5, it6] { std::equal_range(v1.begin(), v1.end(), 3) }; // returns pair of iterators: first 3 entry / just past the last 3 entry
auto [it7, it8] { std::equal_range(v1.begin(), v1.end(), 3,
    [](auto &a, auto &b) { return a < b; }) }; // returns pair of iterators: first 3 entry / just past the last 3 entry
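Because `std::equal_range()` gives back a half-open range (first match, one past the last match), the distance between the two iterators is the number of matches. A small sketch, where `count_in_sorted` is a hypothetical helper name made up for illustration:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: counts how many times value appears in an already sorted vector.
std::ptrdiff_t count_in_sorted(const std::vector<int>& sorted, int value) {
    // first points at the left-most match, past_last points just after the right-most match.
    auto [first, past_last] { std::equal_range(sorted.begin(), sorted.end(), value) };
    // If value isn't present, both iterators point at the same position and the count is 0.
    return past_last - first;
}
```

For `{0,1,2,3,3,4,5}`, counting 3 should give 2 and counting 9 should give 0.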
↩PREREQUISITES↩
⚠️NOTE️️️⚠️
Do these functions actually call into std::less<>() to compare, or do they just use the less-than operator (<) directly? cppreference doesn't seem to say.
How are they moving values around? Are these using std::swap(), std::move(), or assignment (=)?
These functions are set operations, but they only operate on sorted ranges.
⚠️NOTE️️️⚠️
It's better to think of these operations as a bag rather than a set, since a sorted range can have duplicates and these operations work just fine if they see duplicates.
Function | Description |
---|---|
std::set_difference(itA1, itA2, itB1, itB2, itC1) |
Difference sorted A and sorted B into sorted C , using < to compare |
std::set_difference(itA1, itA2, itB1, itB2, itC1, cmp) |
Difference sorted A and sorted B into sorted C , using cmp() to compare |
std::set_symmetric_difference(itA1, itA2, itB1, itB2, itC1) |
Sym difference sorted A and sorted B into sorted C , using < to compare |
std::set_symmetric_difference(itA1, itA2, itB1, itB2, itC1, cmp) |
Sym difference sorted A and sorted B into sorted C , using cmp() to compare |
std::set_intersection(itA1, itA2, itB1, itB2, itC1) |
Intersect sorted A and sorted B into sorted C , using < to compare |
std::set_intersection(itA1, itA2, itB1, itB2, itC1, cmp) |
Intersect sorted A and sorted B into sorted C , using cmp() to compare |
std::set_union(itA1, itA2, itB1, itB2, itC1) |
Union sorted A and sorted B into sorted C , using < to compare |
std::set_union(itA1, itA2, itB1, itB2, itC1, cmp) |
Union sorted A and sorted B into sorted C , using cmp() to compare |
⚠️NOTE️️️⚠️
There's also std::includes(itA1, itA2, itB1, itB2), which checks to see if all the elements in sorted range B are in sorted range A. In other words, if B is a subsequence of A (not substring, but subsequence).
std::vector<int> v1 {0,1,2,3,3,4,5};
std::vector<int> v2 {3,4,5,5,6,7,8};
std::vector<int> v3 {};
std::vector<int> v4 {};
std::set_difference(v1.begin(), v1.end(), v2.begin(), v2.end(), std::back_insert_iterator { v3 } ); // v3 becomes {0,1,2,3}
std::set_difference(v2.begin(), v2.end(), v1.begin(), v1.end(), std::back_insert_iterator { v4 } ); // v4 becomes {5,6,7,8}
std::vector<int> v5 {};
std::vector<int> v6 {};
std::set_symmetric_difference(v1.begin(), v1.end(), v2.begin(), v2.end(), std::back_insert_iterator { v5 } ); // v5 becomes {0,1,2,3,5,6,7,8}
std::set_symmetric_difference(v2.begin(), v2.end(), v1.begin(), v1.end(), std::back_insert_iterator { v6 } ); // v6 becomes {0,1,2,3,5,6,7,8}
std::vector<int> v7 {};
std::vector<int> v8 {};
std::set_intersection(v1.begin(), v1.end(), v2.begin(), v2.end(), std::back_insert_iterator { v7 } ); // v7 becomes {3,4,5}
std::set_intersection(v2.begin(), v2.end(), v1.begin(), v1.end(), std::back_insert_iterator { v8 } ); // v8 becomes {3,4,5}
std::vector<int> v9 {};
std::vector<int> v10 {};
std::set_union(v1.begin(), v1.end(), v2.begin(), v2.end(), std::back_insert_iterator { v9 } ); // v9 becomes {0,1,2,3,3,4,5,5,6,7,8}
std::set_union(v2.begin(), v2.end(), v1.begin(), v1.end(), std::back_insert_iterator { v10 } ); // v10 becomes {0,1,2,3,3,4,5,5,6,7,8}
std::vector<int> v11 {};
std::set_union(v2.begin(), v2.end(), v1.begin(), v1.end(), std::back_insert_iterator { v11 },
    [](auto &a, auto &b) { return a < b;} ); // v11 becomes {0,1,2,3,3,4,5,5,6,7,8}
// Note the use of std::back_insert_iterator in some of these examples. You need to use `std::back_insert_iterator`
// when you want to insert elements rather than overwrite them. It creates a phony never-ending output iterator that
// simply calls push_back() on the underlying container.
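The argument order of `std::includes()` is easy to get backwards: the first range is the larger one being searched and the second range is the candidate subsequence. A small sketch, where `is_sorted_subset` is a hypothetical wrapper made up for illustration that names the two roles explicitly:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: true if every element of candidate (sorted) also appears in
// superset (sorted). Duplicates count individually, as with the other bag-style operations.
bool is_sorted_subset(const std::vector<int>& candidate, const std::vector<int>& superset) {
    // Note the order: the range being searched comes first, the candidate comes second.
    return std::includes(superset.begin(), superset.end(), candidate.begin(), candidate.end());
}
```

Because duplicates count individually, `{3,3}` is not considered a subset of `{0,3,4}`, which only contains a single 3.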
⚠️NOTE️️️⚠️
Do these functions actually call into std::less<>() to compare, or do they just use the less-than operator (<) directly? cppreference doesn't seem to say.
These functions find the min / max element within a range.
Function | Description |
---|---|
std::min_element(it1, it2) | Get min element, using < to compare |
std::min_element(it1, it2, cmp) | Get min element, using cmp() to compare |
std::max_element(it1, it2) | Get max element, using < to compare |
std::max_element(it1, it2, cmp) | Get max element, using cmp() to compare |
std::minmax_element(it1, it2) | Get min and max element, using < to compare |
std::minmax_element(it1, it2, cmp) | Get min and max element, using cmp() to compare |
std::vector<int> v1 {0,5,1,4,2,3};
auto v1_it1 { std::min_element(v1.begin(), v1.end()) }; // returns iterator to 0
auto v1_it2 { std::max_element(v1.begin(), v1.end()) }; // returns iterator to 5
auto [v1_it3, v1_it4] { std::minmax_element(v1.begin(), v1.end()) }; // returns iterators to 0 and 5
auto [v1_it5, v1_it6] { std::minmax_element(v1.begin(), v1.end(),
[](auto &a, auto &b) { return a < b;}) }; // returns iterators to 0 and 5
These functions compare two elements to determine which one's the min / max.
Function | Description |
---|---|
std::min(a, b) | Get smaller between a and b, using < to compare |
std::min(a, b, cmp) | Get smaller between a and b, using cmp() to compare |
std::max(a, b) | Get larger between a and b, using < to compare |
std::max(a, b, cmp) | Get larger between a and b, using cmp() to compare |
std::minmax(a, b) | Combine min(a,b) and max(a,b) |
std::minmax(a, b, cmp) | Combine min(a,b,cmp) and max(a,b,cmp) |
std::clamp(v, lo, hi) | Return v clamped to the range [lo, hi], using < to compare |
std::clamp(v, lo, hi, cmp) | Return v clamped to the range [lo, hi], using cmp() to compare |
int res1 { std::min(0, 5) }; // returns 0
int res2 { std::max(0, 5) }; // returns 5
auto [res3, res4] { std::minmax({0, 5}) }; // returns 0 and 5 (initializer_list overload, so the pair holds copies)
int res5 { std::clamp(3, 6, 9) }; // returns 6
int res6 { std::clamp(3, 6, 9, [](auto &a, auto &b) { return a < b;}) }; // returns 6
// NOTE: The overloads for std::min() / std::max() / std::minmax() are such that each either ...
//  * takes in and returns const lvalue references.
//  * takes in an initializer_list and returns a copy.
// A call like std::min(0, 5) invokes the reference version (the temporaries bind to the const references),
// so copy the result right away rather than holding on to the returned reference. To invoke the initializer
// list version, use curly braces. Like std::min({4,1,2,3,3}). The reference overloads can only take 2 items,
// while the initializer list overloads can take more than two. std::clamp() only has the reference version.
// Maybe this'll change in future versions of C++ (as of writing is C++20).
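The initializer list overloads can be sketched as follows. Both function names here (`smallest_of_five` and `bounds_of`) are hypothetical helpers made up for illustration.

```cpp
#include <algorithm>
#include <initializer_list>
#include <utility>

// Hypothetical helper: smallest of five ints. The curly braces build an initializer_list,
// so the overload that returns a copy is the one that gets picked.
int smallest_of_five(int a, int b, int c, int d, int e) {
    return std::min({a, b, c, d, e});
}

// Hypothetical helper: {smallest, largest} of a non-empty list, returned by value.
std::pair<int, int> bounds_of(std::initializer_list<int> values) {
    // The initializer_list overload of std::minmax() returns a pair of copies, not references.
    return std::minmax(values);
}
```

Since both helpers return copies, there's no reference into a temporary for the caller to accidentally hold on to.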
⚠️NOTE️️️⚠️
Do these functions actually call into std::less<>() to compare, or do they just use the less-than operator (<) directly? cppreference doesn't seem to say.
These functions shuffle around an existing range such that it acts as a binary heap (a max heap, when using the default comparison).
Function | Description |
---|---|
std::make_heap(it1, it2) | Convert range such that it becomes a heap, using < to compare |
std::make_heap(it1, it2, cmp) | Convert range such that it becomes a heap, using cmp() to compare |
std::sort_heap(it1, it2) | Convert heap to sorted range (ascending order), using < to compare |
std::sort_heap(it1, it2, cmp) | Convert heap to sorted range (ascending order), using cmp() to compare |
std::is_heap(it1, it2) | Check if range is heap, using < to compare |
std::is_heap(it1, it2, cmp) | Check if range is heap, using cmp() to compare |
std::is_heap_until(it1, it2) | Determine until which position range is heap, using < to compare |
std::is_heap_until(it1, it2, cmp) | Determine until which position range is heap, using cmp() to compare |
std::push_heap(it1, it2) | Push element into heap, using < to compare |
std::push_heap(it1, it2, cmp) | Push element into heap, using cmp() to compare |
std::pop_heap(it1, it2) | Pop largest element out of heap, using < to compare |
std::pop_heap(it1, it2, cmp) | Pop largest element out of heap, using cmp() to compare |
std::vector<int> v1 {0,5,1,4,2,3};
std::make_heap(v1.begin(), v1.end()); // v1 becomes {5,4,3,0,2,1}
std::is_heap(v1.begin(), v1.end()); // returns true
std::sort_heap(v1.begin(), v1.end()); // v1 becomes {0,1,2,3,4,5}
std::is_heap(v1.begin(), v1.end()); // returns false
std::make_heap(v1.begin(), v1.end()); // v1 becomes {5,4,2,3,1,0} -- exact layout isn't specified, this is what common implementations produce
std::pop_heap(v1.begin(), v1.end()); // v1 becomes {4,3,2,0,1,5} -- 5 got popped off heap (moved to last element), heap's range is now v1.begin() to v1.end()-1
auto it1 { std::is_heap_until(v1.begin(), v1.end()) }; // returns iterator pointing to v1.end()-1 (where popped element is)
bool h1 { std::is_heap(v1.begin(), v1.end()-1) }; // returns true
bool h2 { std::is_heap(v1.begin(), v1.end()) }; // returns false
v1[5] = 8;
std::push_heap(v1.begin(), v1.end()); // v1 becomes {8,3,4,0,1,2} -- 8 got pushed onto heap (moved from last element), heap's range is now v1.begin() to v1.end()
auto it2 { std::is_heap_until(v1.begin(), v1.end()) }; // returns iterator pointing to v1.end()
bool h3 { std::is_heap(v1.begin(), v1.end()-1) }; // returns true -- any prefix of a heap is itself a heap
bool h4 { std::is_heap(v1.begin(), v1.end()) }; // returns true
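The push / pop dance above is easier to follow when wrapped into helpers. Below is a minimal sketch of using a std::vector as a priority queue. The names `heap_insert` and `heap_extract_max` are hypothetical, and both assume the vector already satisfies the heap property.

```cpp
#include <algorithm>
#include <vector>

// Append the value, then let push_heap sift it up into place.
void heap_insert(std::vector<int>& heap, int value) {
    heap.push_back(value);
    std::push_heap(heap.begin(), heap.end());
}

// pop_heap moves the largest element to the back; pop_back then removes it.
// Precondition: heap is non-empty.
int heap_extract_max(std::vector<int>& heap) {
    std::pop_heap(heap.begin(), heap.end());
    int largest { heap.back() };
    heap.pop_back();
    return largest;
}
```

This is roughly what std::priority_queue does on your behalf, so prefer that adapter unless you need direct access to the underlying range.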
⚠️NOTE️️️⚠️
Do these functions actually call into the std::less<>()
to compare, or is it just using the less than operator (<) directly? cppreference doesn't seem to say?
These functions take an existing sorted range and shuffle its elements around to list out all possible permutations.
Function | Description |
---|---|
std::next_permutation(it1, it2) | Reorders to next permutation using < to compare, returning false on overflow |
std::next_permutation(it1, it2, cmp) | Reorders to next permutation using cmp() to compare, returning false on overflow |
std::prev_permutation(it1, it2) | Reorders to previous permutation using < to compare, returning false on overflow |
std::prev_permutation(it1, it2, cmp) | Reorders to previous permutation using cmp() to compare, returning false on overflow |
⚠️NOTE️️️⚠️
There's also std::is_permutation(), which checks to see if a range is a permutation of another range. Neither range needs to be sorted.
std::vector<int> v1 {1,2,3};
std::next_permutation(v1.begin(), v1.end()); // v1 becomes {1,3,2}, returns true
std::next_permutation(v1.begin(), v1.end()); // v1 becomes {2,1,3}, returns true
std::next_permutation(v1.begin(), v1.end()); // v1 becomes {2,3,1}, returns true
std::next_permutation(v1.begin(), v1.end()); // v1 becomes {3,1,2}, returns true
std::next_permutation(v1.begin(), v1.end()); // v1 becomes {3,2,1}, returns true
std::next_permutation(v1.begin(), v1.end()); // v1 becomes {1,2,3}, returns false
std::prev_permutation(v1.begin(), v1.end(), [](auto &a, auto &b) { return a < b;}); // v1 becomes {3,2,1}, returns false
std::prev_permutation(v1.begin(), v1.end(), [](auto &a, auto &b) { return a < b;}); // v1 becomes {3,1,2}, returns true
std::prev_permutation(v1.begin(), v1.end(), [](auto &a, auto &b) { return a < b;}); // v1 becomes {2,3,1}, returns true
std::prev_permutation(v1.begin(), v1.end(), [](auto &a, auto &b) { return a < b;}); // v1 becomes {2,1,3}, returns true
std::prev_permutation(v1.begin(), v1.end(), [](auto &a, auto &b) { return a < b;}); // v1 becomes {1,3,2}, returns true
std::prev_permutation(v1.begin(), v1.end(), [](auto &a, auto &b) { return a < b;}); // v1 becomes {1,2,3}, returns true
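The boolean return value is what makes these functions usable as a loop condition. Here's a minimal sketch that collects every permutation of a range; `all_permutations` is a made-up helper name.

```cpp
#include <algorithm>
#include <vector>

// Collect every permutation of the input. Sorting first guarantees the loop
// starts at the lexicographically smallest arrangement, so none are skipped.
std::vector<std::vector<int>> all_permutations(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    std::vector<std::vector<int>> result {};
    do {
        result.push_back(values);
    } while (std::next_permutation(values.begin(), values.end()));
    return result;
}
```

The do-while form matters: a plain while loop would skip the starting arrangement. If the input contains duplicates, only the distinct arrangements get visited.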
↩PREREQUISITES↩
⚠️NOTE️️️⚠️
Do these functions actually call into std::plus<>() / std::multiplies<>(), or are they just using the plus operator (+) / multiply operator (*) directly? cppreference doesn't seem to say?
This section covers the common parallel primitives: map, reduce, and prefix sum (scan).
These functions map values.
Function | Description |
---|---|
std::transform(itA1, itA2, itB1, m) | Transform A via m() into B |
std::transform(itA1, itA2, itB1, itC, bt) | Transform A and B such that each index is transformed via bt() and put into C |
// Transform to other range
std::vector<int> v1 {1,2,3};
std::vector<int> v2 {};
std::transform(v1.begin(), v1.end(), std::back_insert_iterator { v2 }, [](auto &v) { return v*2; }); // v2 becomes {2,4,6}
// Transforms in-place (destination is the source)
std::vector<int> v3 {1,2,3};
std::transform(v3.begin(), v3.end(), v3.begin(), [](auto &v) { return v*2; }); // v3 becomes {2,4,6}
// Transforms two ranges in-place (destination is the first source)
std::transform(v1.begin(), v1.end(), v2.begin(), v1.begin(), [](auto &e1, auto &e2) { return e1+e2; }); // v1 becomes {3,6,9}
// Note the use of std::back_insert_iterator in some of these examples. You need to use `std::back_insert_iterator`
// when you want to insert elements rather than overwrite them. It creates a phony never-ending output iterator that
// simply calls push_back() on the underlying container.
These functions reduce values into a single element.
Function | Description |
---|---|
std::reduce(it1, it2) | Reduce using + with an initial value that's a value-initialized instance of the range's element type (e.g. 0 for int) |
std::reduce(it1, it2, v) | Reduce using + with an initial value of v |
std::reduce(it1, it2, v, f) | Reduce using f() with an initial value of v |
⚠️NOTE️️️⚠️
There's also std::accumulate(), which is like std::reduce() but guarantees the operation goes from left-to-right (left-fold). std::reduce() doesn't guarantee which elements get operated on first, meaning multiple runs of std::reduce() with the same inputs may return different results if the operation isn't both associative and commutative.
std::vector<int> v1 {1,2,3};
int ret1 { std::reduce(v1.begin(), v1.end()) }; // returns 6
int ret2 { std::reduce(v1.begin(), v1.end(), 0) }; // returns 6
int ret3 { std::reduce(v1.begin(), v1.end(), 0, [](const auto &e1, const auto &e2) { return e1+e2; }) }; // returns 6 (const refs because reduce may pass in temporaries)
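To see why the ordering guarantee matters, here's a minimal sketch contrasting the two. Subtraction is order-sensitive, so it goes through std::accumulate(), while integer addition is safe for std::reduce(). The helper names `subtract_all` and `sum_all` are made up for illustration.

```cpp
#include <functional>
#include <numeric>
#include <vector>

// Subtraction is neither associative nor commutative, so it needs the
// guaranteed left-to-right order of std::accumulate: ((init - a) - b) - c.
int subtract_all(const std::vector<int>& values, int init) {
    return std::accumulate(values.begin(), values.end(), init, std::minus<>{});
}

// Addition on ints is associative and commutative, so std::reduce is free to
// regroup the operands without changing the result.
int sum_all(const std::vector<int>& values) {
    return std::reduce(values.begin(), values.end(), 0);
}
```

Passing std::minus<>{} to std::reduce() would compile, but the result wouldn't be something you could rely on.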
These functions apply prefix sum (scan) to values, writing the running result of each step into a destination range.
Function | Description |
---|---|
std::inclusive_scan(itA1, itA2, itB1) | Inclusive scan A to B using + with no initial value (first output is A's first element) |
std::inclusive_scan(itA1, itA2, itB1, f) | Inclusive scan A to B using f() with no initial value |
std::inclusive_scan(itA1, itA2, itB1, f, v) | Inclusive scan A to B using f() with initial value of v |
std::exclusive_scan(itA1, itA2, itB1, v) | Exclusive scan A to B using + with initial value of v (first output is v) |
std::exclusive_scan(itA1, itA2, itB1, v, f) | Exclusive scan A to B using f() with initial value of v |
⚠️NOTE️️️⚠️
There's also std::partial_sum(), which is like std::inclusive_scan() but guarantees the operation goes from left-to-right (left-fold). The scan functions don't guarantee how the operations get grouped, meaning multiple runs of the same scan function with the same inputs may return different results if the operation isn't associative.
// THESE EXAMPLES ARE JUST FOR inclusive_scan(). exclusive_scan() IS USED SIMILARLY, BUT THE INITIAL VALUE IS REQUIRED AND GOES BEFORE THE FUNCTION
// Scan to other range
std::vector<int> v1 {1,2,3};
std::vector<int> v2 {};
std::inclusive_scan(v1.begin(), v1.end(), std::back_insert_iterator { v2 }); // v2 becomes {1,3,6}
// Scan in-place (destination is source)
std::vector<int> v3 {1,2,3};
std::inclusive_scan(v3.begin(), v3.end(), v3.begin()); // v3 becomes {1,3,6}
std::vector<int> v4 {1,2,3};
std::inclusive_scan(v4.begin(), v4.end(), v4.begin(),
[](const auto &e1, const auto &e2) { return e1+e2; }, 0); // v4 becomes {1,3,6}
// Note the use of std::back_insert_iterator in some of these examples. You need to use `std::back_insert_iterator`
// when you want to insert elements rather than overwrite them. It creates a phony never-ending output iterator that
// simply calls push_back() on the underlying container.
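The difference between the two scans is easiest to see side by side. Below is a minimal sketch; the helper names `running_totals_inclusive` and `running_totals_exclusive` are made up for illustration.

```cpp
#include <numeric>
#include <vector>

// Inclusive: output i includes input i. {1,2,3} becomes {1,3,6}.
std::vector<int> running_totals_inclusive(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    std::inclusive_scan(in.begin(), in.end(), out.begin());
    return out;
}

// Exclusive: output i covers only the inputs before i, starting from init.
// {1,2,3} with init 0 becomes {0,1,3}.
std::vector<int> running_totals_exclusive(const std::vector<int>& in, int init) {
    std::vector<int> out(in.size());
    std::exclusive_scan(in.begin(), in.end(), out.begin(), init);
    return out;
}
```

The exclusive form is handy for turning a list of sizes into a list of starting offsets.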
Function | Description |
---|---|
std::transform_reduce(itA1, itA2, v, r, m) | std::transform(itA1, itA2, m) followed by std::reduce(it1, it2, v, r) |
std::transform_reduce(itA1, itA2, itB1, v, r, m) | std::transform(itA1, itA2, itB1, m) followed by std::reduce(it1, it2, v, r) |
std::transform_reduce(itA1, itA2, itB1, v) | std::transform_reduce(itA1, itA2, itB1, v, r, m) with + for r (reduce) and * for m (map) |
⚠️NOTE️️️⚠️
There are also functions that are a combination of various functions above: std::transform_reduce(), std::transform_inclusive_scan(), and std::transform_exclusive_scan().
std::inner_product() is like std::transform_reduce() except it guarantees the operation goes from left-to-right (left-fold).
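Here's a minimal sketch of the one-range and two-range forms of std::transform_reduce(). The helper names `dot_product` and `sum_of_squares` are made up, and `dot_product` assumes the second range is at least as long as the first.

```cpp
#include <functional>
#include <numeric>
#include <vector>

// Two-range form with defaults: multiply pairwise, then add everything up.
// Precondition: b.size() >= a.size().
int dot_product(const std::vector<int>& a, const std::vector<int>& b) {
    return std::transform_reduce(a.begin(), a.end(), b.begin(), 0);
}

// One-range form: square each element (map), then add the squares (reduce).
int sum_of_squares(const std::vector<int>& values) {
    return std::transform_reduce(values.begin(), values.end(), 0,
                                 std::plus<>{},
                                 [](int v) { return v * v; });
}
```

Note the argument order in the one-range form: the reduce function comes before the map function.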
↩PREREQUISITES↩
A span is the generic version of a string view. Where std::string_view provides a view into a part of some other string, std::span provides a view into a part of some other sequence of objects. That sequence can be almost anything as long as it matches the concept std::ranges::contiguous_range, which requires that the sequence be contiguous in memory. For common containers, that means ...
* std::string / std::vector / std::array / C-style arrays (e.g. int arr[3] {1,2,3}) will work because they represent their elements contiguously in memory.
* std::list / std::deque won't work because they don't represent their elements contiguously in memory.
std::span comes in two flavours: static extent and dynamic extent. Creating either flavour of std::span from a contiguous sequence is fairly straightforward:
Static extent - The size of the span has to be known at compile-time, similar to std::array.
A static extent span needs 2 template parameters: the type of elements it holds and the number of elements it holds.
std::vector<int> v{1, 2, 3, 4, 5};
std::span<int, 5> ss1 {v}; // OKAY: starting at index 0 of v for length of 5
std::span<int, 3> ss2 {v}; // UNDEFINED BEHAVIOUR: span's length (3) doesn't match v's length (5)
std::span<int, 8> ss3 {v}; // UNDEFINED BEHAVIOUR: span's length (8) doesn't match v's length (5)
std::span<int, 5> ss5 {std::begin(v), std::end(v)}; // OKAY: span of length 5, starting at index 0 of v
std::span<int, 3> ss6 {std::begin(v) + 2, std::end(v)}; // OKAY: span of length 3, starting at index 2 of v
std::span<int, 3> ss7 {std::begin(v), std::end(v)}; // UNDEFINED BEHAVIOUR: span's length (3) doesn't match v's length (5)
std::span<int, 5> ss10 {std::begin(v), 5}; // OKAY: span of length 5, starting at index 0 of v
std::span<int, 3> ss11 {std::begin(v) + 2, 3}; // OKAY: span of length 3, starting at index 2 of v
std::span<int, 3> ss12 {std::begin(v), 5}; // UNDEFINED BEHAVIOUR: span's length (3) doesn't match the count passed in (5)
std::span<int, 8> ss13 {std::begin(v), 5}; // UNDEFINED BEHAVIOUR: span's length (8) doesn't match the count passed in (5)
⚠️NOTE️️️⚠️
The book says that a static extent span's size can't be 0 and if it is you'll get a compile-time error. When I try this in G++12, I don't get an error and size()
appropriately reports 0. Also, the documentation here says nothing about this.
Dynamic extent - The size of the span doesn't have to be known at compile-time.
A dynamic extent span needs only 1 template parameter: the type of elements the span holds.
std::vector<int> v{1, 2, 3, 4, 5};
std::span<int> ds1 {v}; // OKAY: starting at index 0 of v for length of 5
std::span<int> ds2 {std::begin(v), std::end(v)}; // OKAY: span of length 5, starting at index 0 of v
std::span<int> ds3 {std::begin(v) + 2, std::end(v)}; // OKAY: span of length 3, starting at index 2 of v
std::span<int> ds4 {std::begin(v), 5}; // OKAY: span of length 5, starting at index 0 of v
std::span<int> ds5 {std::begin(v) + 2, 3}; // OKAY: span of length 3, starting at index 2 of v
⚠️NOTE️️️⚠️
This warning is from the book, and seems important:
When you change the size of the underlying contiguous range, the contiguous range may be reallocated, and the std::span refers to stale data. Only a std::span with dynamic extent can have a resizable underlying contiguous range and can, therefore, be a victim of this subtle issue.
What this is saying is that a std::span holds on to a pointer to the data of the contiguous range (plus a size), not the contiguous range object itself. A contiguous range has a data() function that gives you a pointer to its data, and that pointer is invalidated when the range reallocates on modification. That's how you end up with a std::span that points to stale data.
A std::span over a resizable contiguous range (e.g. std::vector) won't update if that contiguous range resizes / reallocates its data to another piece of contiguous memory, so re-create the std::span after any such modification.
Copying a std::span has similar rules to creating a std::span from a contiguous sequence. Copying or converting a ...
static extent std::span to another static extent std::span is okay as long as the sizes are the same.
std::vector<int> v{1, 2, 3, 4, 5};
std::span<int, 5> ss1 {v};
std::span<int, 5> ss2 {ss1}; // OKAY: copied ss1
std::span<int, 8> ss3 {ss2}; // COMPILER ERROR: ss3's length (8) doesn't match ss2's length (5)
dynamic extent std::span to a static extent std::span is okay as long as the sizes are the same (undefined behaviour if they aren't).
std::vector<int> v{1, 2, 3, 4, 5};
std::span<int> ds1 {v};
std::span<int, 6> ss2 {ds1}; // UNDEFINED BEHAVIOUR: ss2's length (6) doesn't match ds1's length (5)
static extent std::span to a dynamic extent std::span is okay.
std::vector<int> v{1, 2, 3, 4, 5};
std::span<int, 5> ss1 {v};
std::span<int> ds2 {ss1}; // OKAY: copied ss1
To pull out a contiguous region of an std::span
as another std::span
, use subspan()
. First parameter is the offset and the second parameter is the count.
std::vector<int> v{1, 2, 3, 4, 5};
std::span<int> ds1 {v};
std::span<int> ds2 {ds1.subspan(1,3)};
If the destination is a static extent std::span
, be careful that it's sized appropriately (undefined behaviour if it isn't). One option may be to use auto
for the destination. Another option is to use a variant of subspan()
that takes in compile-time arguments as template parameters, thereby ensuring the destination type is correct.
std::vector<int> v{1, 2, 3, 4, 5};
std::span<int, 5> ss1 {v};
std::span<int, 7> ss2 {ss1.subspan(1,3)}; // UNDEFINED BEHAVIOUR: destination's size is 7 but subspan's size is 3
std::span<int, 3> ss3 {ss1.subspan<1,3>()}; // OKAY: templated version ensures destination is correct size
auto ss4 {ss1.subspan(1,3)}; // OKAY
Accessing elements and iterating over a std::span
is very similar to std::vector
. Most of the basic functions are the same.
// DATA ACCESS
int v1 { s[20] };
int v3 { s.front() }; // WARNING: undefined behaviour if len is 0
int v4 { s.back() }; // WARNING: undefined behaviour if len is 0
// SIZE
std::size_t len { s.size() }; // size() returns std::size_t, curly braces reject narrowing it to int
bool is_empty { s.empty() }; // checks if size() is 0
// UNDERLYING DATA ACCESS
auto ptr { s.data() }; // get pointer to the underlying contiguous sequence, similar to std::vector.data()
// RANGE-BASED FOR LOOP
for (auto& e : s) {
// do something here ...
}
// ITERATORS
auto it = s.begin(); // iterator start
auto itEnd = s.end(); // iterator end
auto rIt = s.rbegin(); // reverse iterator start
auto rItEnd = s.rend(); // reverse iterator end
Similar to Java's java.time
package, the C++ standard library offers several classes that represent various time-based constructs. This includes timestamps, durations, calendar representations, timezones, and various helper functions.
The subsections below document some common time-related classes and their usages.
Time points are classes that represent some point in time. They're typically created by clocks, which are classes that measure time. Each clock has a set of specifics:
property | description |
---|---|
Epoch | When does it start from? (e.g. since boot, app launch, Jan 1, 1970 00:00:00 UTC, etc..) |
Tick Period | How often does it update? (e.g. once per millisecond) |
Monotonicity | Could it go back in time? (e.g. time returned is before a previous time returned) |
Leap Seconds | Does it include leap seconds? |
⚠️NOTE️️️⚠️
Monotonicity is important. In certain cases the clock could go back in time (e.g. inaccurate clocks are a thing, leap seconds, updates from NTP server, etc..).
There are multiple types of clock. Each type of clock T has the following set of important members that you can use to obtain key details about it:
* T::period - Reports tick period of the clock in seconds (std::ratio).
* T::is_steady - Reports monotonicity of the clock (bool).
* T::now() - Get the current time (std::chrono::time_point).
std::cout << "Ticks per Second: " << std::chrono::system_clock::period::den << std::endl;
std::cout << "Monotonic: " << std::chrono::system_clock::is_steady << std::endl;
std::chrono::time_point<std::chrono::system_clock> time { std::chrono::system_clock::now() };
Note how the std::chrono::time_point type returned by the clock above is templated to the clock's type. The return type of a clock's now() typically won't be able to intermingle with one returned by another type of clock. To do that, you need to use std::chrono::clock_cast<DST>() first to convert it (the source clock is deduced from the argument).
auto sys_pt { std::chrono::system_clock::now() };
auto utc_pt { std::chrono::clock_cast<std::chrono::utc_clock>(sys_pt) };
Common types of clock are listed below.
std::chrono::system_clock
This clock is the system clock (e.g. wrist watch -- it tells you what time it is). As of C++20, it's specified to measure Unix time: its epoch is Jan 1, 1970 at midnight UTC and leap seconds aren't counted. Prior to C++20, both the epoch and leap second inclusions were unspecified.
auto now { std::chrono::system_clock::now() };
std::chrono::steady_clock
This clock is used to measure time intervals (e.g. stopwatch -- measure how long something takes). It's guaranteed to be monotonic, meaning each time you query it for the time it'll be greater than or equal to the result of your last query (e.g. next_time >= prev_time
). Epoch is unspecified and leap second inclusions are unspecified.
auto now { std::chrono::steady_clock::now() };
⚠️NOTE️️️⚠️
Would it make sense to include leap seconds here if this is a "stopwatch"? I don't think so, but nothing is mentioned about leap seconds when I look up the docs online.
std::chrono::high_resolution_clock
This clock is guaranteed to have the shortest possible tick period available (e.g. gaming, real-time systems, etc..). Its epoch and leap second inclusions are unspecified.
auto now { std::chrono::high_resolution_clock::now() };
⚠️NOTE️️️⚠️
Would it make sense to include leap seconds here if this is supposed to be used for high-precision timing? I don't think so, but nothing is mentioned about leap seconds when I look up the docs online.
std::chrono::utc_clock
(Coordinated Universal Time)
This clock is guaranteed to have an epoch of Jan 1, 1970 at midnight UTC and includes leap seconds.
auto now { std::chrono::utc_clock::now() };
auto ls_info { get_leap_second_info(now) };
std::cout << ls_info.elapsed; // leap seconds elapsed since epoch until time_point
std::cout << ls_info.is_leap_second; // did time_point fall on a leap second?
std::chrono::tai_clock
(International Atomic Time)
This clock is guaranteed to have an epoch of Dec 31, 1957 at 23:59:50 UTC and does not include leap seconds.
auto now { std::chrono::tai_clock::now() };
⚠️NOTE️️️⚠️
I'm not sure what the point of this clock is? Is it slowing down time / speeding up time based on the rate of "real" time vs atomic time?
std::chrono::gps_clock
(Global Positioning System)
This clock is guaranteed to have an epoch of Jan 6, 1980 at midnight UTC and does not include leap seconds.
auto now { std::chrono::gps_clock::now() };
⚠️NOTE️️️⚠️
I'm not sure what the point of this clock is? Is it slowing down time / speeding up time based on the rate of "real" time vs gps time? (e.g. GPS time is roughly ~38 microseconds faster per day)
std::chrono::file_clock
This clock is used for file times (alias for std::filesystem::file_time_type::clock
). Its epoch and leap second inclusions are unspecified.
auto now { std::chrono::file_clock::now() };
Boost has a similar set of clocks: boost::chrono::system_clock
, boost::chrono::steady_clock
, boost::chrono::high_resolution_clock
, boost::chrono::process_cpu_clock
, etc... But, Boost clocks can't intermingle with clocks from the C++ standard library (e.g. clock_cast()
won't work).
⚠️NOTE️️️⚠️
Timezone functionality didn't seem to be implemented in clang or G++ as of writing, meaning the code below that uses timezones may fail to compile there.
To convert a time point from the system clock to a date and time representation, use std::chrono::floor()
to cut out the relevant durations before using them to create the date and time objects...
// SOURCE: https://stackoverflow.com/a/15958113
using namespace std::chrono;
auto tp = system_clock::now();
auto tp_rounded = floor<days>(tp);
year_month_day ymd { tp_rounded };
hh_mm_ss time { floor<milliseconds>(tp - tp_rounded) };
The process is similar for converting a zoned time point (time point associated with a timezone) ...
// SOURCE: https://stackoverflow.com/a/15958113
using namespace std::chrono;
auto tp = zoned_time{current_zone(), system_clock::now()}.get_local_time();
auto tp_rounded = floor<days>(tp);
year_month_day ymd {tp_rounded};
hh_mm_ss time {floor<milliseconds>(tp - tp_rounded)};
To go the other way around (date and time objects to time point), use the std::chrono::local_days / std::chrono::sys_days aliases (they alias std::chrono::time_point).
using namespace std::literals::chrono_literals;
std::chrono::year_month_day date { std::chrono::January / 27d / 2022y };
std::chrono::hh_mm_ss time { 8h + 30min + 45s }; // minutes literal is min, not m
auto tp = std::chrono::local_days { date } + time.to_duration(); // or use sys_days for system time
↩PREREQUISITES↩
🔍SEE ALSO🔍
A duration is a class that represents some amount of time.
// use duration types
auto hour1 { std::chrono::hours(1) };
auto hour2 { std::chrono::minutes(60) };
auto hour3 { std::chrono::seconds(3600) };
auto hour4 { std::chrono::milliseconds(3600000) };
auto hour5 { std::chrono::microseconds(3600000000) };
auto hour6 { std::chrono::nanoseconds(3600000000000) };
// use user-defined literals
using namespace std::literals::chrono_literals; // import the literals
auto lit1 { 1h };
auto lit2 { 60min }; // minutes literal is min, not m
auto lit3 { 3600s };
auto lit4 { 3600000ms };
auto lit5 { 3600000000us };
auto lit6 { 3600000000000ns };
Subtracting two time points produces the duration in between them.
auto before_tp { std::chrono::steady_clock::now() };
auto after_tp { std::chrono::steady_clock::now() };
std::chrono::duration d { after_tp - before_tp };
Similarly, adding a duration to a time point moves it accordingly.
auto before_tp { std::chrono::steady_clock::now() };
auto hour { std::chrono::hours(1) };
auto after_tp { before_tp + hour }; // move up by 1 hr
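Time point subtraction is the basis of the usual stopwatch pattern. A minimal sketch follows; `time_call` is a made-up helper name, and the callable it receives is assumed to take no arguments.

```cpp
#include <chrono>

// Run a callable and report how long it took. steady_clock is used because
// it never goes backwards, so the difference can't be negative.
template <typename F>
std::chrono::steady_clock::duration time_call(F&& work) {
    auto before { std::chrono::steady_clock::now() };
    work();
    auto after { std::chrono::steady_clock::now() };
    return after - before;
}
```

The result is in the clock's native tick period. Convert it with std::chrono::duration_cast when a specific unit such as milliseconds is needed.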
A duration object has a tick period. For example, std::chrono::hours(1) and std::chrono::minutes(60) both represent exactly 1 hour (equality operator (==) returns true), but the former has a tick period of 1 hour and the latter has a tick period of 1 minute. To get the number of ticks in a duration, use count().
// NOTE if you're calling a method of a user-defined literal, you need a space before the dot
auto x {3600s .count()}; // x will be 3600
auto y {1h .count()}; // y will be 1
auto z {3600s == 1h}; // z will be true
⚠️NOTE️️️⚠️
I've read online that you shouldn't use count()
unless absolutely necessary because it breaks a lot of the abstraction / encapsulation that the library does.
std::chrono::duration_cast<DST>
may be used to convert one type of duration to another.
auto x { std::chrono::duration_cast<std::chrono::seconds>(1h) };
auto y { std::chrono::duration_cast<std::chrono::hours>(3600s) };
auto xTicks { x.count() }; // xTicks will be 3600
auto yTicks { y.count() }; // yTicks will be 1
auto z { std::chrono::duration_cast<std::chrono::hours>(3599s) };
auto zTicks { z.count() }; // zTicks will be 0 (ROUNDS DOWN TO 0 -- not enough seconds for 1 hour)
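std::chrono::duration_cast truncates toward zero, which isn't always the rounding you want. The standard library also provides std::chrono::floor, std::chrono::ceil, and std::chrono::round for durations. A minimal sketch comparing them (the `hours_*` helper names are made up):

```cpp
#include <chrono>

// duration_cast truncates toward zero.
long long hours_truncated(std::chrono::seconds s) {
    return std::chrono::duration_cast<std::chrono::hours>(s).count();
}

// floor rounds toward negative infinity.
long long hours_floored(std::chrono::seconds s) {
    return std::chrono::floor<std::chrono::hours>(s).count();
}

// ceil rounds toward positive infinity.
long long hours_ceiled(std::chrono::seconds s) {
    return std::chrono::ceil<std::chrono::hours>(s).count();
}

// round goes to the nearest value (ties go to the even value).
long long hours_rounded(std::chrono::seconds s) {
    return std::chrono::round<std::chrono::hours>(s).count();
}
```

The difference between truncating and flooring only shows up for negative durations: -1 second truncates to 0 hours but floors to -1 hours.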
Date and time functionality builds on durations and time points by providing things like calendar representations, time of day representations (e.g. 12-hour vs 24-hour), timezone conversions, etc..
Calendar classes represent some exact region (e.g. 5th, 1st, last, etc..) of a specific calendar granularity (day, month, year, weekday).
class | day | day of week | month | year | example |
---|---|---|---|---|---|
std::chrono::day | X | | | | 31 |
std::chrono::weekday | | X | | | Tuesday |
std::chrono::weekday_indexed | | X | | | 3rd Tuesday of unknown month |
std::chrono::weekday_last | | X | | | last Tuesday of unknown month |
std::chrono::month | | | X | | January |
std::chrono::month_day | X | | X | | December 25 |
std::chrono::month_day_last | X | | X | | last day of January |
std::chrono::month_weekday | | X | X | | 3rd Tuesday of January |
std::chrono::month_weekday_last | | X | X | | last Tuesday of January |
std::chrono::year | | | | X | 2022 |
std::chrono::year_month | | | X | X | January 2022 |
std::chrono::year_month_day | X | | X | X | January 26, 2022 |
std::chrono::year_month_day_last | X | | X | X | last day of January 2022 |
std::chrono::year_month_weekday | | X | X | X | 3rd Tuesday of January 2022 |
std::chrono::year_month_weekday_last | | X | X | X | last Tuesday of January 2022 |
// create jan 27 2022 as year_month_day
std::chrono::year y { 2022 };
std::chrono::month m { 1 };
std::chrono::day d { 27 };
std::chrono::year_month_day today { y, m, d };
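auto today_y { today.year() };
auto today_m { today.month() };
auto today_d { today.day() };
auto today_wd { std::chrono::weekday { std::chrono::sys_days { today } } }; // no weekday() accessor on year_month_day, go through sys_days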
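Adding or subtracting a duration to a calendar object adjusts it accordingly. std::chrono::year_month_day directly supports adding or subtracting months and years only. To add or subtract days, convert to a time point (std::chrono::sys_days) first, then convert back.
std::chrono::year_month_day today { y, m, d };
today += std::chrono::months { 1 }; // add 1 month to today
today = std::chrono::sys_days { today } + std::chrono::days { 5 }; // add 5 days to today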
Calendar objects can be tested to see if they're valid or invalid (e.g. 32nd day of a month is not valid).
std::chrono::day day31 { 31 };
day31.ok(); // returns true
std::chrono::day day32 { day31 + std::chrono::days(1) };
day32.ok(); // returns false
std::chrono::year y { 2022 };
std::chrono::month m { 1 };
std::chrono::day d { 32 };
std::chrono::year_month_day today { y, m, d };
today.ok(); // returns false
If the calendar class captures a full date (e.g. year_month_day
, year_month_weekday
, etc..), it's convertible to a time point via std::chrono::local_days
/ std::chrono::sys_days
.
std::chrono::year y { 2022 };
std::chrono::month m { 1 };
std::chrono::day d { 27 };
std::chrono::year_month_day today { y, m, d };
std::chrono::local_days today_tp { today };
⚠️NOTE️️️⚠️
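Which should you use? std::chrono::sys_days is shorthand for std::chrono::time_point<std::chrono::system_clock, std::chrono::days>, which is the time point type for the system clock at day precision. std::chrono::local_days expands to the same thing but for std::chrono::local_t, a pseudo-clock that isn't listed with the other clocks because it has no now(). It tags a time point as local time in a timezone that hasn't been specified yet. Use sys_days when the date is meant as system time, and local_days when it's wall-clock time that will later be paired with a timezone.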
Similarly, a time point is convertible to a full date calendar class.
auto sys_tp { std::chrono::system_clock::now() };
auto sys_tp_rounded { std::chrono::floor<std::chrono::days>(sys_tp) }; // round tp down to days
std::chrono::year_month_day ymd{ sys_tp_rounded };
⚠️NOTE️️️⚠️
What type is sys_tp_rounded? It's std::chrono::sys_days, which is shorthand for std::chrono::time_point<std::chrono::system_clock, std::chrono::days>. The std::chrono::year_month_day constructor also accepts std::chrono::local_days. No clock generates that type: it comes out of timezone conversions such as std::chrono::zoned_time's get_local_time().
If two calendar objects both capture a full date but are of different types, you can still compare them by first converting them to time points.
// as year_month_day
std::chrono::year y { 2022 };
std::chrono::month m { 1 };
std::chrono::day d { 27 };
std::chrono::year_month_day today1 { y, m, d };
// as year_month_weekday
std::chrono::weekday thurs { 4 };
std::chrono::weekday_indexed _4th_thurs { thurs, 4 };
std::chrono::year_month_weekday today2 { y, m, _4th_thurs };
// convert both to time point
std::chrono::local_days today1_tp {today1};
std::chrono::local_days today2_tp {today2};
// compare -- they both represent the same date so they should be equal
auto sameDay { today1_tp == today2_tp }; // returns true
Calendar objects can be created more intuitively via a set of operator overloads, constants, and user-defined literals.
using namespace std::literals::chrono_literals;
using namespace std::chrono;
year_month_day today1 { January / 27d / 2022y };
year_month_weekday today2 { 2022y / January / Thursday[4] };
year_month_weekday_last today3 { 2022y / January / Thursday[last] };
// all 3 of the above represent the same date
The std::chrono::hh_mm_ss
class is a container for the time that's elapsed since midnight (also known as time of day).
auto tp { std::chrono::system_clock::now() };
auto tp_rounded { std::chrono::floor<std::chrono::days>(tp) };
std::chrono::year_month_day ymd{ tp_rounded };
auto time_duration { std::chrono::floor<std::chrono::milliseconds>(tp - tp_rounded) };
std::chrono::hh_mm_ss time{ time_duration };
// the variables below are of duration type
auto h { time.hours() };
auto m { time.minutes() };
auto s { time.seconds() };
auto ms { time.subseconds() };
Several time-related helper functions are provided to deal with 12-hour vs 24-hour time.
auto hour_of_day { time.hours() }; // using object from example above
auto am { std::chrono::is_am(hour_of_day) };
auto pm { std::chrono::is_pm(hour_of_day) };
auto hour_of_day_12 { std::chrono::make12(hour_of_day) }; // as 12-hour format
auto hour_of_day_24 { std::chrono::make24(hour_of_day_12, pm) }; // back to 24-hour format
⚠️NOTE️️️⚠️
Timezone functionality didn't seem to be implemented in clang or G++ as of writing, meaning the code below may fail to compile there. It does seem to be implemented in MSVC though.
Timezones are accessible through a timezone database.
const auto& my_tzdb = std::chrono::get_tzdb(); // returns a reference (bind to it, don't copy), also get_tzdb_list()
const std::chrono::time_zone* la_tz { my_tzdb.locate_zone("America/Los_Angeles") };
const std::chrono::time_zone* local_tz { my_tzdb.current_zone() };
You can apply a timezone to a time point, then convert it to the appropriate date and time objects.
// From https://stackoverflow.com/a/15958113
auto tp { std::chrono::system_clock::now() };
auto ztp { std::chrono::zoned_time {local_tz, tp}.get_local_time() };
auto ztp_rounded { std::chrono::floor<std::chrono::days>(ztp) };
std::chrono::year_month_day ymd { ztp_rounded };
std::chrono::hh_mm_ss time { std::chrono::floor<std::chrono::milliseconds>(ztp - ztp_rounded) };
Similarly, you can construct a zoned time point from date and time details.
using namespace std::literals::chrono_literals;
std::chrono::year_month_day date { std::chrono::January / 27d / 2022y };
std::chrono::hh_mm_ss time { 8h + 30min + 45s }; // minutes literal is min, not m
auto tp { std::chrono::local_days { date } + time.to_duration() }; // local_days is local time, use sys_days for system clock
std::chrono::zoned_time ztp { local_tz, tp }; // attach the timezone (local_tz from the example above)
↩PREREQUISITES↩
Most time-related types can be written to / parsed from a text string:
* Written via output streams and std::format().
* Parsed via input streams (std::chrono::from_stream() and std::chrono::parse()).
The subsections below detail text output and text parsing in closer detail.
↩PREREQUISITES↩
Most time-related types provide overloads for both std::formatter
and output streams.
std::cout << std::chrono::hours(5) << std::endl; // prints "5h"
std::string s1 { std::format("{}", std::chrono::hours(5)) }; // s1 is "5h"
The examples below show the standard output formats for common types. The examples target streams, but those same objects passed into std::format("{}", obj)
will produce the same output.
// Durations
// ---------
std::cout << 5ns; // prints "5ns"
std::cout << 5ms; // prints "5ms"
std::cout << 5us; // prints "5µs"
std::cout << 5s; // prints "5s"
std::cout << 5min; // prints "5min"
std::cout << 5h; // prints "5h"
std::cout << std::chrono::hours(5); // prints "5h"
std::cout << std::chrono::days(5); // prints "5d"
std::cout << std::chrono::weeks(5); // prints "5[604800]s"
std::cout << std::chrono::months(5); // prints "5[2629746]s"
std::cout << std::chrono::years(5); // prints "5[31556952]s"
// Time points from different clock types
// --------------------------------------
// NOTE: I haven't gotten anything other than system_clock to work. The program does
// a hard crash if I try to use anything else (or a compiler error because I
// guess it lacks support for outputting) because godbolt currently has a bug
// in it where icu.dll is missing or causing an issue or something.
std::cout << std::chrono::system_clock::now(); // prints "2022-08-26 13:41:08.4774688"
std::cout << std::chrono::steady_clock::now(); // COMPILER ERROR -- no stream output is defined for steady_clock time points
std::cout << std::chrono::file_clock::now(); // MSVC compiler generated program that hard crashed
std::cout << std::chrono::gps_clock::now(); // MSVC compiler generated program that hard crashed
std::cout << std::chrono::tai_clock::now(); // MSVC compiler generated program that hard crashed
std::cout << std::chrono::utc_clock::now(); // MSVC compiler generated program that hard crashed
// Time points that go through truncation
// --------------------------------------
auto now = std::chrono::system_clock::now();
std::cout << now; // prints "2022-08-26 14:07:30.2313305"
std::cout << std::chrono::floor<std::chrono::microseconds>(now); // prints "2022-08-26 14:07:30.231330"
std::cout << std::chrono::floor<std::chrono::milliseconds>(now); // prints "2022-08-26 14:07:30.231"
std::cout << std::chrono::floor<std::chrono::seconds>(now); // prints "2022-08-26 14:07:30"
std::cout << std::chrono::floor<std::chrono::minutes>(now); // prints "2022-08-26 14:07:00"
std::cout << std::chrono::floor<std::chrono::hours>(now); // prints "2022-08-26 14:00:00"
std::cout << std::chrono::floor<std::chrono::days>(now); // prints "2022-08-26"
std::cout << std::chrono::floor<std::chrono::weeks>(now); // prints "2022-08-25"
// Calendar dates
// --------------
using namespace std::chrono_literals;
using namespace std::chrono;
std::cout << std::chrono::year_month_day(2021y, January, 1d); // prints "2021-01-01"
std::cout << std::chrono::year_month_day_last(2021y, std::chrono::month_day_last(March)); // prints "2021/Mar/last"
std::cout << std::chrono::year_month_weekday(2021y, August, Thursday[3]); // prints "2021/Aug/Thu[3]"
std::cout << std::chrono::year_month_weekday_last(2021y, August, std::chrono::weekday_last(Monday)); // prints "2021/Aug/Mon[last]"
std::cout << 1d /*std::chrono::day(1)*/; // prints "01"
std::cout << January /*std::chrono::month(1)*/; // prints "Jan"
std::cout << 2021y /*std::chrono::year(2021)*/; // prints "2021"
std::cout << Friday /*std::chrono::weekday(5)*/; // prints "Fri"
std::cout << std::chrono::year_month(2021y, January); // prints "2021/Jan"
std::cout << std::chrono::month_day(October, 22d); // prints "Oct/22"
std::cout << std::chrono::month_day_last(October); // prints "Oct/last"
std::cout << std::chrono::month_weekday(October, Monday[3]); // prints "Oct/Mon[3]"
std::cout << std::chrono::month_weekday_last(October, std::chrono::weekday_last(Sunday)); // prints "Oct/Sun[last]"
As mentioned above, if you're using std::format("{}", obj)
, the outputs will be the same as those of streams. However, if using std::format()
with specifiers, the output will be formatted based on those specifiers.
auto now { std::chrono::system_clock::now() };
auto s1 { std::format("{0:%B} -- {0:%y}", now) }; // "August -- 22"
Specifier | Description | Example |
---|---|---|
%c |
Locale’s Date and time representation | Mon Aug 9 22:58:04 2021 |
%x |
Locale’s Date representation | 09/08/21 |
%F |
year-month-day | 2021-08-09 |
%D |
month/day/year | 08/09/21 |
%Y |
Year | 2021 |
%y |
Year without century | 21 |
%C |
Century as two digits | 20 |
%b /%h |
Abbreviated month name | Aug |
%B |
Month name | August |
%m |
Month | 08 |
%W |
Week of the year (00 until 53, week starts Monday) | 32 |
%U |
Week of the year (00 until 53, week starts Sunday) | 32 |
%a |
Abbreviated weekday name | Mon |
%A |
Weekday name | Monday |
%w |
Weekday as number (Sunday (0) until Saturday (6)) | 1 |
%u |
Weekday as number (Monday (1) until Sunday (7)) | 1 |
%e |
Day (leading space if necessary) | 9 |
%d |
Day with two digits | 09 |
%c |
Date and time representation | Mon Aug 9 22:58:04 2021 |
%X |
Time representation | 22:58:04 |
%r |
12-hour clock time | 10:58:04 PM |
%T |
hours:minutes:seconds | 22:58:04.435 |
%R |
hours:minutes | 22:58 |
%H |
24-hour clock | 22 |
%I |
12-hour clock | 10 |
%p |
AM or PM (12-hour clock) | PM |
%M |
Minute | 58 |
%S |
seconds.subseconds | 04.453 |
%Z |
Time zone abbreviation | CEST |
%z |
Offset (hours and minutes) from UTC | +0200 |
%j |
Day of the year (starting with 001) | 221 |
%q |
Unit suffix according to the time’s duration | ms |
⚠️NOTE️️️⚠️
The table and examples above were taken directly from the book, which says they were adapted from cppreference.
↩PREREQUISITES↩
Most time-related types can be parsed with std::chrono::from_stream(), std::chrono::parse(), or both. Both functions parse in essentially the same way, but the former takes the stream as an argument while the latter returns a manipulator that is applied to the stream with the extraction operator (>>). Parsing happens using the same specifiers used by std::format() to output text. In this case, only the specifier itself is required, not the curly braces and the part before the colon (e.g. %F vs {:%F}).
// from_stream() usage
std::chrono::system_clock::time_point tp {};
std::istringstream is {"2021-08-11 21:49:35"};
std::chrono::from_stream(is, "%F %T", tp); // parses is into tp
// parse() usage
std::chrono::system_clock::time_point tp {};
std::istringstream is {"2021-08-11 21:49:35"};
is >> std::chrono::parse("%F %T", tp); // parses is into tp
⚠️NOTE️️️⚠️
The specifier %q
is not supported for inputs as it's impossible to know what the unit suffix is beforehand.
⚠️NOTE️️️⚠️
It turns out that, if you pass in an invalid specifier for whatever type it is you're trying to parse to, it doesn't barf by default. Instead, a failed parse sets the stream's failbit and leaves the target object unmodified, so you need to check fail() afterwards (or explicitly tell the stream to throw an exception via exceptions()). For example, parsing "Aug 2021" using "%b %Y" into std::chrono::year doesn't work because std::chrono::year can only contain a year.
Time point types of various clocks provide overloads for both std::chrono::from_stream()
and std::chrono::parse()
. The examples below parse text representations of time points using different specifiers. The examples target std::chrono::parse()
, but those same objects and specifiers can be used with std::chrono::from_stream()
.
std::chrono::system_clock::time_point tp {};
std::istringstream is {"2021-08-11 21:49:35"}; is >> std::chrono::parse("%F %T", tp); // tp is "2021-08-11 21:49:35.0000000"
std::istringstream is {"2021-08-11 21:49:35"}; is >> std::chrono::parse("%Y-%m-%d %H:%M:%S", tp); // tp is "2021-08-11 21:49:35.0000000"
std::istringstream is {"2021"}; is >> std::chrono::parse("%Y", tp); // parse FAILS (failbit is set): a year alone doesn't identify a time point, so tp is left unmodified
Durations provide overloads for std::chrono::from_stream() (which std::chrono::parse() also relies on). The examples below parse text representations of durations using different specifiers.
std::chrono::seconds d {}; std::istringstream is {"5"}; std::chrono::from_stream(is, "%S", d); // 5s
std::chrono::seconds d {}; std::istringstream is {"5"}; std::chrono::from_stream(is, "%M", d); // 300s
std::chrono::seconds d {}; std::istringstream is {"5"}; std::chrono::from_stream(is, "%H", d); // 18000s
std::chrono::seconds d {}; std::istringstream is {"5"}; std::chrono::from_stream(is, "%e", d); // FAILED: 0s -- %e is a day of the month (a calendar field), not an amount of time
std::chrono::hours d {}; std::istringstream is {"5"}; std::chrono::from_stream(is, "%e", d); // FAILED: 0h -- same reason as above
std::chrono::days d {}; std::istringstream is {"5"}; std::chrono::from_stream(is, "%e", d); // FAILED: 0d -- same reason as above
std::chrono::days d {}; std::istringstream is {"Aug 2021"}; std::chrono::from_stream(is, "%b %Y", d); // FAILED: 0d -- a month name and a year are calendar fields, not an amount of time
std::chrono::years d {}; std::istringstream is {"10"}; std::chrono::from_stream(is, "%Y", d); // FAILED: 0 years -- %Y is a calendar year, not a count of years
⚠️NOTE️️️⚠️
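Either the version of MSVC I'm using is bugged, or has things unimplemented, or (most likely) these specifiers just aren't valid for durations: %e, %b, and %Y describe calendar fields rather than amounts of time, so parsing them into a duration fails.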
Individual calendar types (e.g. std::chrono::day, std::chrono::year, std::chrono::year_month) provide overloads for std::chrono::from_stream(). The examples below parse text representations of calendar fields using different specifiers.
std::chrono::day x {}; std::istringstream is {"5"}; std::chrono::from_stream(is, "%e", x); // 05
std::chrono::year x {}; std::istringstream is {"2010"}; std::chrono::from_stream(is, "%Y", x); // 2010
std::chrono::day x {}; std::istringstream is {"Aug 2021"}; std::chrono::from_stream(is, "%b %Y", x); // FAILED -- std::chrono::day is a day of the month, and the input doesn't contain one
std::chrono::year_month x {}; std::istringstream is {"Aug 2021"}; std::chrono::from_stream(is, "%b %Y", x); // 2021/Aug
⚠️NOTE️️️⚠️
Why didn't the 3rd example above work? std::chrono::day holds a day of the month, not a count of days, and the input contains no day of the month. One idea is to parse it as std::chrono::year_month (as is done in the 4th example), then finagle it into a duration. Subtracting two std::chrono::year_month values only gives a count of months though, as shown below.
std::chrono::year_month d1 { std::chrono::year { 2021 }, std::chrono::August };
std::chrono::year_month d2 { d1 - std::chrono::months(1) };
auto dur { d1 - d2 }; // 1[2629746]s -- NOT the number of seconds in August: std::chrono::months is a fixed average month length (30.436875 days), while August has 31 days
Both the C++ standard library and third-party libraries (e.g. Boost) provide several pieces of functionality that make working with numbers easier: math constants and functions, random number generation, bounds-checked numeric type casting, etc.
The subsections below document some common number-related classes and their usages.
There are several options for random number generation. For ...
- pseudo-random numbers, use std::mt19937_64, an implementation of Mersenne Twister.
- non-deterministic random numbers, use std::random_device, which tries to use an unpredictable hardware source (but may not).

The classes are functors, where each invocation generates a random integer.
std::mt19937_64 mt_rand{ 12345 }; // seed value of 12345
std::cout << mt_rand() << std::endl;
std::random_device secure_rand {}; // doesn't take a seed
std::cout << secure_rand() << std::endl;
A raw generator returns uniformly distributed integers over its full output range. To get values within a specific range, or values that follow a specific distribution, pass the generator to one of the distribution wrappers.
std::mt19937_64 rng{ 12345 };
std::uniform_int_distribution<int> uniform_dist{ 0, 10 };
auto value { uniform_dist(rng) };
- std::uniform_int_distribution (integers within a closed range, each equally likely)
- std::uniform_real_distribution (like the above but for floating point types)
- std::normal_distribution (a normal distribution with a configurable mean and standard deviation)

Boost provides a set of distributions as well.
⚠️NOTE️️️⚠️
Are there friendly wrappers here? What if I want the random number generator to give me a float, bool, or an alphanumeric string instead of an int?
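One possible answer to the question above, sketched with hypothetical helper names (random_real, random_bool, and random_alnum are not standard functions): the standard distributions cover floats and bools directly, and a string can be built by repeatedly picking a random index into an alphabet.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Hypothetical helper: random floating point value in the range [lo, hi).
double random_real(std::mt19937_64& rng, double lo, double hi) {
    std::uniform_real_distribution<double> dist { lo, hi };
    return dist(rng);
}

// Hypothetical helper: returns true with probability p_true.
bool random_bool(std::mt19937_64& rng, double p_true = 0.5) {
    std::bernoulli_distribution dist { p_true };
    return dist(rng);
}

// Hypothetical helper: string of len random alphanumeric characters.
std::string random_alnum(std::mt19937_64& rng, std::size_t len) {
    static constexpr char alphabet[] =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    // sizeof includes the trailing '\0', so the last valid index is sizeof - 2.
    std::uniform_int_distribution<std::size_t> pick { 0, sizeof(alphabet) - 2 };
    std::string out;
    out.reserve(len);
    for (std::size_t i { 0 }; i < len; ++i) {
        out.push_back(alphabet[pick(rng)]);
    }
    return out;
}
```

Each helper takes the generator by reference so that successive calls keep advancing the same random sequence.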
Recall that C++'s numeric types are wishy-washy (e.g. there is no guarantee as to how large an int
is, just that it must be greater than or equal to short
). The std::numeric_limits
class allows you to get compile-time information about a numeric type, such as signed-ness, min, max, etc.
auto a { std::numeric_limits<float>::is_integer }; // false
auto b { std::numeric_limits<uint16_t>::is_integer }; // true
auto c { std::numeric_limits<uint16_t>::has_infinity }; // false
- std::numeric_limits<T>::is_signed - if the type is signed.
- std::numeric_limits<T>::is_integer - if the type is an integral.
- std::numeric_limits<T>::has_infinity - if the type supports infinity (e.g. floats do).
- std::numeric_limits<T>::has_quiet_NaN - if the type can be set to not-a-number (e.g. IEEE floats can be set to not a number).
- std::numeric_limits<T>::round_style - rounding mode for a type.
- std::numeric_limits<T>::is_iec559 - if the type is an IEEE float.
- std::numeric_limits<T>::lowest() - lowest finite value (the most negative value for signed types).
- std::numeric_limits<T>::max() - maximum finite value.
- std::numeric_limits<T>::min() - minimum value for integer types, but the smallest positive normalized value for floating point types (different from ::lowest()).
- std::numeric_limits<T>::quiet_NaN() - get a not-a-number value.

Boost's Integer library also provides additional functionality for determining information about numerics (e.g. which one is the fastest for the platform you're on).
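The difference between min() and lowest() is easy to trip over, so here is a small compile-time sketch of it. The third check assumes IEEE floating point, where the value range is symmetric around zero.

```cpp
#include <limits>

// For integer types, min() and lowest() are the same value.
static_assert(std::numeric_limits<int>::min() == std::numeric_limits<int>::lowest());

// For floating point types, min() is the smallest POSITIVE normalized value ...
static_assert(std::numeric_limits<double>::min() > 0.0);

// ... while lowest() is the most negative finite value (assumes IEEE doubles,
// where that is exactly -max()).
static_assert(std::numeric_limits<double>::lowest() == -std::numeric_limits<double>::max());
```

When looking for the bottom of a type's range in generic code, lowest() is the safer choice because it means the same thing for integers and floats.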
🔍SEE ALSO🔍
Typically, the named conversion function static_cast
is used for converting from one numeric type to another (e.g. double
to int
). In most cases, static_cast
is fine to use, however certain scenarios require a more customizable form of conversion (e.g. don't allow overflow). More customizable forms of numeric conversions are possible via the class boost::numeric::converter
.
To use boost::numeric::converter
, two template parameters are required:
- T - (REQUIRED) output numeric type for the conversion.
- S - (REQUIRED) input numeric type for the conversion.

Either use its static convert() function or invoke an instance of the class (it's a functor) to perform a conversion.
int x { boost::numeric::converter<int, double>::convert(1.234) };
int y { boost::numeric::converter<int, double>()(1.234) }; // same thing as above, but invoking a temporary instance as a functor
Several other optional template parameters control how the numeric conversion happens. For example, what to do on overflow (e.g. throw exception), how to round a float (e.g. round down), etc. The most important thing to remember is that the default overflow configuration is to throw an exception -- either boost::numeric::positive_overflow
or boost::numeric::negative_overflow
.
⚠️NOTE️️️⚠️
See here for all template parameters.
If the default conversion options are desirable, then an analog to static_cast
called boost::numeric_cast
may be used instead of boost::numeric::converter
.
int z { boost::numeric_cast<int>(1.234) }; // same thing as the examples above
↩PREREQUISITES↩
The appropriate way to convert to string is std::format()
. For quick-and-dirty conversions of numeric built-in types to std::string
/ std::wstring
, use std::to_string()
/ std::to_wstring()
. Other string types such as std::u8string
aren't supported.
auto s { std::to_string(10) }; // int
auto s { std::to_string(10U) }; // unsigned int
auto s { std::to_string(10L) }; // long
auto s { std::to_string(10UL) }; // unsigned long
auto s { std::to_string(10LL) }; // long long
auto s { std::to_string(10ULL) }; // unsigned long long
auto s { std::to_string(1.0f) }; // float
auto s { std::to_string(1.0) }; // double
auto s { std::to_string(1.0L) }; // long double
For quick-and-dirty conversions of std::string
/ std::wstring
to built-in numeric types, use any of the std::sto*()
functions. For integer types, it also takes in a base (defaults to base 10).
auto num { std::stoi(my_string) }; // int
// THERE IS NO sto*() FOR unsigned int -- use stoul() and cast to an unsigned int
auto num { std::stol(my_string) }; // long
auto num { std::stoul(my_string) }; // unsigned long
auto num { std::stoll(my_string) }; // long long
auto num { std::stoull(my_string) }; // unsigned long long
auto num { std::stof(my_string) }; // float
auto num { std::stod(my_string) }; // double
auto num { std::stold(my_string) }; // long double
⚠️NOTE️️️⚠️
There is no equivalent or overloads for string specializations like std::u8string
? How is someone supposed to convert those? The answer seems to be to use a third-party library (ICU might provide some functionality for this). It seems as if the C++20 standard still doesn't have full support for text encoding. Third-party libraries are required.
⚠️NOTE️️️⚠️
sto*() functions also take an optional pointer to a std::size_t as their second parameter. When parsing finishes, the pointed-to value is set to the index within the input string just after the number (the number of characters consumed). By default this parameter is nullptr, which means don't set it.
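A minimal sketch of the position and base parameters described above. ParsedInt and parse_leading_int are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type: the parsed number plus whatever text followed it.
struct ParsedInt {
    int value;
    std::string rest;
};

// Hypothetical helper: parse the integer at the start of text in the given base.
// std::stoi() throws std::invalid_argument if no number is present and
// std::out_of_range if the number doesn't fit into an int.
ParsedInt parse_leading_int(const std::string& text, int base = 10) {
    std::size_t consumed { 0 };
    const int value { std::stoi(text, &consumed, base) }; // consumed = index just after the number
    return { value, text.substr(consumed) };
}
```

For example, parse_leading_int("42px") yields the value 42 with "px" left over, and parse_leading_int("ff rest", 16) yields 255 with " rest" left over.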
Several common math functions are provided directly within the C++ standard library.
function(s) | description |
---|---|
std::midpoint(a, b) |
same as (a + (b-a)/2) , which is the midpoint between two points |
std::lerp(a, b, t) |
same as (a + t*(b-a)) , which is the linear interpolation between two points |
std::abs(x) |
absolute value |
std::min(x, y) |
minimum of two values |
std::max(x, y) |
maximum of two values |
std::isfinite(x) |
check if finite (e.g. not floating point infinity or not-a-number) |
std::isinf(x) |
check if infinite (e.g. floating point infinity) |
std::pow(x,y) |
power of (x to the power of y) |
std::sqrt(x) |
square root |
std::cbrt(x) |
cube root |
std::sin(x) |
trigonometry functions |
std::cos(x) |
trigonometry functions |
std::tan(x) |
trigonometry functions |
std::asin(x) |
trigonometry functions |
std::acos(x) |
trigonometry functions |
std::sinh(x) |
hyperbolic functions |
std::cosh(x) |
hyperbolic functions |
std::tanh(x) |
hyperbolic functions |
std::asinh(x) |
hyperbolic functions |
std::acosh(x) |
hyperbolic functions |
std::atanh(x) |
hyperbolic functions |
std::ceil(x) |
round up to the nearest integer |
std::floor(x) |
round down to the nearest integer |
std::round(x) |
round to the nearest integer (halfway cases round away from zero) |
std::div(x, y) |
divides and gives both the quotient AND remainder |
std::fmod(x, y) |
modulo for floating point (result has the same sign as x) |
std::remainder(x, y) |
IEEE remainder of x / y, where the quotient is rounded to the nearest integer (so the result may be negative even when x and y are positive) |
std::log(x) |
natural (base e) logarithm |
std::log10(x) |
base 10 logarithm |
std::log2(x) |
base 2 logarithm |
⚠️NOTE️️️⚠️
Built-in functions like std::midpoint()
are preferred over rolling it by hand because they properly formulate things to work around integer overflow issues.
The C++ standard library also provides several math constants.
constant | description |
---|---|
std::numbers::pi |
Archimedes' constant (π) |
std::numbers::e |
Euler's number (base of the natural logarithm) |
std::numbers::sqrt2 |
square root of 2 |
std::numbers::phi |
golden ratio |
⚠️NOTE️️️⚠️
The functions / constants listed above are the useful ones. There are more. There are constants in boost::math::double_constants as well, including degree (number of radians per degree) and radian (number of degrees per radian), neither of which is provided by std::numbers.
By default, the type of these constants is double. However, they can be re-targeted to either float, double, or long double via their templated *_v variants (e.g. std::numbers::pi_v<long double>).
std::cout << std::numbers::pi << std::endl;
std::cout << std::numbers::pi_v<double> << std::endl;
std::cout << std::numbers::pi_v<float> << std::endl;
In addition, there's support for complex numbers via std::complex
, which implements various common complex number operations via operator overloading and free functions.
std::complex<double> a{1.0, 33.71};
auto aReal { std::real(a) }; // get real part
auto aImaginary { std::imag(a) }; // get imaginary part
⚠️NOTE️️️⚠️
This seems like such a niche thing that I don't think it's worth fleshing it out.
The C++ standard library provides several functions to aid with bit manipulation tasks.
To determine the endian-ness of the platform (e.g. x86 is little-endian, while ARM can run in either mode but is usually little-endian), use std::endian.
- std::endian::little - constant for little-endian.
- std::endian::big - constant for big-endian.
- std::endian::native - the endian-ness of the current platform.

if constexpr (std::endian::native == std::endian::little) {
std::cout << "little" << std::endl;
} else if constexpr (std::endian::native == std::endian::big) {
std::cout << "big" << std::endl;
} else {
std::cout << "mixed (some types are little while others are big)" << std::endl;
}
⚠️NOTE️️️⚠️
The book mentions that the std::endian
system covers all possible edge cases, such as the case where some types are big-endian but others are little-endian (tested for in the example above). The other edge case it mentions is where all types are exactly 1 byte in size, in which case the platform has no endian-ness (std::endian::little == std::endian::native == std::endian::big
).
The following bit-manipulation operations all require an unsigned integer type as input (e.g. unsigned short, unsigned int, etc.). The recommended way to reinterpret a value as an unsigned integer is std::bit_cast<DST_TYPE>(v), which is similar to a reinterpret_cast except it has certain benefits (e.g. it can be used inside of constexpr / consteval / constinit). The source and destination types must be the same size and trivially copyable.
⚠️NOTE️️️⚠️
The following uses left-most/right-most bit instead of most-significant/least-significant bit.
std::has_single_bit(v) - Check if only a single bit of v is 1 (v is a power of 2).
std::cout << (std::has_single_bit(2u) ? "yes" : "no") << std::endl; // "yes"
std::cout << (std::has_single_bit(3u) ? "yes" : "no") << std::endl; // "no"
std::bit_floor(v)
- Return v
with all bits set to 0 other than left-most 1. (get largest power of 2 that's <= v
).
std::cout << std::bit_floor(3u) << std::endl; // "2"
std::cout << std::bit_floor(4u) << std::endl; // "4"
std::bit_ceil(v)
- Return v
if std::has_single_bit(v) == true
, otherwise return std::bit_floor(v << 1)
(get smallest power of 2 that's >= v
).
std::cout << std::bit_ceil(3u) << std::endl; // "4"
std::cout << std::bit_ceil(4u) << std::endl; // "4"
std::bit_width(v) - Return minimum number of bits needed to store v (calculate 1 + floor(log2(v)), or 0 if v is 0).
std::cout << std::bit_width(3u) << std::endl; // "2"
std::cout << std::bit_width(4u) << std::endl; // "3"
std::rotl(v, s)
- Return v
rotated left by s
.
std::cout << std::rotl(3u, 1) << std::endl; // "6"
std::cout << std::rotl(4u, 1) << std::endl; // "8"
std::rotr(v, s)
- Return v
rotated right by s
.
std::cout << std::rotr(3u, 1) << std::endl; // "2147483649"
std::cout << std::rotr(4u, 1) << std::endl; // "2"
std::countl_zero(v)
- Count number of consecutive 0s from the left-most position.
std::cout << std::countl_zero(3u) << std::endl; // "30"
std::cout << std::countl_zero(4u) << std::endl; // "29"
std::countl_one(v)
- Count number of consecutive 1s from the left-most position.
std::cout << std::countl_one(3u) << std::endl; // "0"
std::cout << std::countl_one(4u) << std::endl; // "0"
std::countr_zero(v)
- Count number of consecutive 0s from the right-most position.
std::cout << std::countr_zero(3u) << std::endl; // "0"
std::cout << std::countr_zero(4u) << std::endl; // "2"
std::countr_one(v)
- Count number of consecutive 1s from the right-most position.
std::cout << std::countr_one(3u) << std::endl; // "2"
std::cout << std::countr_one(4u) << std::endl; // "0"
std::popcount(v)
- Count the total number of 1s.
std::cout << std::popcount(3u) << std::endl; // "2"
std::cout << std::popcount(4u) << std::endl; // "1"
↩PREREQUISITES↩
std::bitset
is a pseudo-container that wraps a fixed-size sequence of bits. It's similar to an std::array<bool, N>
or bool [N]
, but optimized for space and provides functions more appropriate for working with bits.
To create a std::bitset
from an integral type, set the number of bits to capture as the template parameter. The constructor can optionally take in the integral value to initialize to (if not present, all bits are initialized to 0).
std::bitset<4> b1 {}; // 4 bits, 0000
std::bitset<4> b2 {0b1011}; // 4 bits, 1011
🔍SEE ALSO🔍
To create a std::bitset
that's potentially larger than the largest available integral type, pass in an std::string
of ones and zeros. Alternatively, you can use custom characters for both ones and zeros by passing those characters into the constructor (the character for zero comes first, then the character for one).
std::string str3 { "1011" };
std::bitset<4> b3 { str3 };
std::string str4 { "TFTT" };
std::bitset<4> b4 { str4, 0, str4.size(), 'F', 'T' }; // 'F' means 0 and 'T' means 1, so 1011
⚠️NOTE️️️⚠️
When working with std::bitset
's functions, bits are represented as a bool type. The value false is for 0 / true is for 1.
Operator overloads are available for all bitwise operators and their assignment operator equivalents.
auto b5 { b1 & b2 }; //AND
auto b6 { b1 | b2 }; // OR
auto b7 { b1 ^ b2 }; // XOR
auto b8 { ~b1 }; // NOT
auto b9 { b1 << 2 }; // shift-left
auto b10 { b1 >> 2 }; // shift-right
b1 &= b2;
b1 |= b2;
b1 ^= b2;
b1 <<= 2;
b1 >>= 2;
⚠️NOTE️️️⚠️
If you're going to be using bitwise operators on two std::bitsets, they must be of the same size (the operators are only defined for operands with the same template argument).
To read an individual bit as a bool, use either the subscript operator ([]) or test()
. The difference is that test()
provides bounds checking.
bool a = b1[1];
bool b = b1.test(1);
bool c = b1.test(999); // throws std::out_of_range
To replace an individual bit as a bool, use either the subscript operator ([]) or set()
. The difference is that set()
provides bounds checking.
b1[1] = true;
b1.set(1, true);
b1.set(999, true); // throws std::out_of_range
To set a single bit to 0, use reset()
.
b1.reset(1);
To set all bits to 1, use set()
without any arguments. Similarly, to set all bits to 0, use reset()
without any arguments.
b1.set(); // sets all bits to true
b1.reset(); // sets all bits to false
To flip a single bit, use flip()
. Don't specify an argument to flip all bits.
b1.flip(1);
b1.flip(); // flips all bits
To get the number of times 1 occurs in the sequence, use count()
.
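std::size_t d { b1.count() }; // count() returns std::size_t, so braced initialization into an int would be a narrowing error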
To test the sequence if ...
- all bits are 1, use all().
- no bits are 1, use none().
- at least one bit is 1, use any().

bool e { b1.all() };
bool f { b1.none() };
bool g { b1.any() };
To get the size, use size()
.
std::size_t len { b1.size() }; // size() returns std::size_t
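Putting several of the functions above together, the sketch below uses a std::bitset as a small set of flags. The bit positions and the describe helper are hypothetical and only meant as an illustration.

```cpp
#include <bitset>
#include <cstddef>
#include <string>

// Hypothetical bit positions for a set of file permissions.
constexpr std::size_t read_bit { 0 };
constexpr std::size_t write_bit { 1 };
constexpr std::size_t execute_bit { 2 };

// Hypothetical helper: render permissions in "rwx" style, using '-' for unset bits.
std::string describe(const std::bitset<3>& perms) {
    std::string out("---");
    if (perms.test(read_bit))    { out[0] = 'r'; }
    if (perms.test(write_bit))   { out[1] = 'w'; }
    if (perms.test(execute_bit)) { out[2] = 'x'; }
    return out;
}
```

Starting from an empty std::bitset<3>, setting read_bit and execute_bit makes describe() return "r-x", and calling flip() afterwards turns that into "-w-".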
Comparing signed and unsigned integral types may lead to unexpected results. This happens because the C++ compiler performs an implicit conversion to get the two operands of the comparison to have matching types. In the example below, neg < pos
implicitly converts neg
to pos
's type (int
to unsigned int
) so that the comparison operation can take place, meaning neg
's value of -1
gets converted to UINT_MAX
.
int neg { -1 };
unsigned int pos { 1 };
std::cout << (neg < pos ? "true" : "false") << std::endl; // outputs "false"
std::cout << static_cast<unsigned int>(neg) << std::endl; // outputs "4294967295"
⚠️NOTE️️️⚠️
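Why does it change to UINT_MAX? Because converting a negative value to an unsigned type wraps around modulo 2^N (N being the number of bits), which matches the two's complement bit pattern used by integers. Out of scope to describe in this document.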
🔍SEE ALSO🔍
*_MIN / *_MAX constants)
To provide proper comparisons between signed and unsigned types, the C++ standard library provides several functions:
Function | Operator |
---|---|
std::cmp_equal |
== |
std::cmp_not_equal |
!= |
std::cmp_less |
< |
std::cmp_less_equal |
<= |
std::cmp_greater |
> |
std::cmp_greater_equal |
>= |
int neg { -1 };
unsigned int pos { 1 };
std::cout << (neg < pos ? "true" : "false") << std::endl; // outputs "false"
std::cout << (std::cmp_less(neg, pos) ? "true" : "false") << std::endl; // outputs "true"
In addition to null-terminated character strings (e.g. const char* s = "hello world"), the C++ standard library provides higher-level character string abstractions. These higher-level abstractions provide more type safety, protect against common problems like buffer overflows, and generally make working with strings easier.
The subsections below document some common string-related classes and their usages.
⚠️NOTE️️️⚠️
As of C++20, there is very little support for things like locale and character encodings. If you need that type of functionality, check out the ICU library.
🔍SEE ALSO🔍
std::basic_string
is used as a wrapper for representing character strings. It's different from null-terminated character strings in that strings are resizable and manipulatable similarly to how they are in other high-level languages (e.g. Java or Python). Unlike other high-level languages, a C++ string is not immutable (its characters can change).
A std::basic_string supports all of the same functionality as a std::vector in addition to more. It takes 3 template parameters:
- CharT - type of character.
- Traits - type supporting a specific set of fields and methods for working with CharT (defaults to std::char_traits<CharT>).
- Allocator - type of custom allocator (defaults to std::allocator<CharT>).

Several template specializations are provided out-of-the-box for std::basic_string, one for each character type. In almost all cases, you'll want to use these template specializations rather than using std::basic_string directly. The most common case for using std::basic_string directly is the need for a custom allocator, which isn't possible with the provided template specializations.
class | character type |
---|---|
std::string |
char |
std::wstring |
wchar_t |
std::u8string |
char8_t |
std::u16string |
char16_t |
std::u32string |
char32_t |
⚠️NOTE️️️⚠️
The text and examples below use std::string
, but they should work for the other template specializations as well. Make sure to use the correct literal for raw character string types (e.g. u8"example"
for char8_t
).
To create a std::string
primed with a sequence of characters known at compile-time, use typical braced initialization.
std::string s1 { 'h', 'e', 'l', 'l', 'o' };
To create a std::string
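without priming it directly to a sequence of characters, avoid braced initialization and brace-plus-equals initialization, because braces prefer the initializer-list constructor and may treat the arguments as a list of characters. Use parentheses instead.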
std::string s2("hello"); // create from null-terminated string
std::string s3("hello", 2); // create from first 2 chars of null-terminated string
std::string s4(10, 'a'); // create by repeating a 10 times
std::string s5(s1); // create by copying s1
std::string s6(s1, 3); // create by copying substring of s1 from index 3 until end
std::string s7(s1, 3, 2); // create by copying 2 char long substring of s1 starting at index 3
std::string s8(s1.begin() + 3, s1.begin() + 5); // create by copying substring of s1 from index 3 up to (but not including) index 5
std::string s9(std::move(s1)); // create by moving s1 into s9
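The reason parentheses matter is easiest to see with the (count, character) constructor. In the sketch below, the two hypothetical functions differ only in the kind of brackets used, yet produce different strings.

```cpp
#include <string>

// Parentheses select the (count, character) constructor: ten 'a' characters.
std::string repeated_with_parens() {
    return std::string(10, 'a');
}

// Braces prefer the initializer-list constructor: a 2 character string made of
// the character with code 10 (a newline in ASCII) followed by 'a'.
std::string listed_with_braces() {
    return std::string { 10, 'a' };
}
```

The first function returns a string of size 10, while the second returns a string of size 2.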
To append to a string, use either the addition operator (+), the assignment addition operator (+=), push_back()
, or append()
.
std::string s10 { s1 + s2 };
std::string s11 { s1 + "boop" };
std::string s12 { s1 };
s12 += s2;
s12.push_back('x'); // single character only
s12.append({ 'x', 'y', 'z' }); // append compile-time list
s12.append("xyz"); // append null-terminated string
s12.append(s1); // append s1
s12.append(s1, 3, 2); // append 2 char substring of s1 starting at index 3
s12.append(s1.begin() + 3, s1.begin() + 5); // append substring of s1 from index 3 to 5
s12.append(10, 'a'); // append a 10 times
To insert at a specific position of a string, use insert()
.
s1.insert(3, 5, 'X'); // insert X 5 times at index 3
s1.insert(3, "xyz"); // insert "xyz" at index 3
s1.insert(3, s2); // insert s2 at index 3
s1.insert(s1.begin() + 3, 5, 'X'); // insert X 5 times at iterator position
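s1.insert(s1.begin() + 3, { 'x', 'y', 'z' }); // insert character list at iterator position (no overload takes an iterator plus a null-terminated string)
s1.insert(s1.begin() + 3, s2.begin(), s2.end()); // insert s2 at iterator position (no overload takes an iterator plus a std::string)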
To see if a string has a specific prefix or suffix, use starts_with()
and ends_with()
.
s1.starts_with('h');
s1.starts_with("he");
s1.ends_with(s2);
To see if a string contains a specific substring, use contains() (added in C++23).
auto found { s1.contains("ell") };
auto found { s1.contains(s2) };
To find the position of a substring within a string, use find()
or rfind()
. The latter finds in reverse (from end to beginning). If the substring wasn't found, std::string::npos
is returned.
// find
auto pos1 { s1.find("llo") }; // find index within s1 going FORWARD from index 0, or npos if not found
auto pos2 { s1.find("llo", 2) }; // find index within s1 going FORWARD from index 2, or npos if not found
auto pos3 { s1.find('l', 2) }; // find index within s1 going FORWARD from index 2, or npos if not found
// rfind
auto pos4 { s1.rfind("llo") }; // find index within s1 going BACKWARD from last index, or npos if not found
auto pos5 { s1.rfind("llo", 5) }; // find index within s1 going BACKWARD from index 5, or npos if not found
auto pos6 { s1.rfind('l', 5) }; // find index within s1 going BACKWARD from index 5, or npos if not found
To get a substring of a string, use substr()
.
std::string s13 { s1.substr(3, 2) }; // create by copying 2 char long substring of s1 starting at index 3
std::string s14(s1, 3, 2); // same as the substr() above
To delete a specific position or range of a string, use either pop_back()
, clear()
, or erase()
.
s1.pop_back(); // remove element from end
s1.clear(); // reset to empty string
s1.erase(1, 3); // remove 3 characters starting from index 1
s1.erase(s1.begin() + 1, s1.begin() + 5); // remove characters from index 1 up to (but not including) index 5
s1.erase(s1.begin() + 1); // remove character at index 1
To replace a part of the string, use replace()
, which takes in some position / range and another string to replace it with.
s1.replace(3, 2, "hello"); // replace 2 char substring of s1 starting at index 3 with hello
s1.replace(3, 2, "hello", 3, 2); // replace 2 char substring of s1 starting at index 3 with 2 char substring at index 3 of "hello"
s1.replace(3, 2, 10, 'a'); // replace 2 char substring of s1 starting at index 3 with a 10 times
s1.replace(s1.begin() + 1, s1.begin() + 5, "hello"); // replace characters at index 1 to 4 (end iterator is exclusive) with hello
s1.replace(s1.begin() + 1, s1.begin() + 5, {'h', 'e', 'l', 'l', 'o'}); // replace characters at index 1 to 4 with hello
s1.replace(s1.begin() + 1, s1.begin() + 5, 10, 'a'); // replace characters at index 1 to 4 with a 10 times
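std::string has no built-in "replace every occurrence" function, but find() and replace() combine into one. The following is a minimal sketch under that assumption; replace_all is a hypothetical helper name.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replaces every occurrence of from with to, in place.
void replace_all(std::string& s, const std::string& from, const std::string& to) {
    if (from.empty()) {
        return; // nothing sensible to search for
    }
    std::size_t pos { s.find(from) };
    while (pos != std::string::npos) {
        s.replace(pos, from.size(), to);     // position + length overload of replace()
        pos = s.find(from, pos + to.size()); // skip past the text that was just inserted
    }
}
```

Skipping past the inserted text keeps the loop from re-matching inside its own replacement (e.g. replacing "-" with "--").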
⚠️NOTE️️️⚠️
Need more elaborate string algorithms? Check out Boost's string functions.
To get the number of characters in a std::string, use either size() or length(). To test whether it has no characters at all, use empty().
bool empty { s1.empty() }; // check if empty
auto len1 { s1.size() };
auto len2 { s1.length() }; // same as size()
To test if two strings have the exact same sequence of characters, use the equality operator (==) and inequality operator (!=).
bool equal { s1 == s2 };
bool not_equal { s1 != s2 };
To test if a string is lexicographically less than or greater than another, use the less than operator (<) or the greater than operator (>).
bool less_than { s1 < s2 };
bool greater_than { s1 > s2 };
⚠️NOTE️️️⚠️
Don't depend on this to sort alphabetically because it isn't portable. Lexicographically doesn't mean alphabetical, it just means that it compares by the symbol (character in this case). The comparisons depend on the encoding of the string. According to the book, for US-ASCII (most common), it means A < Z < a < z
.
To access individual characters within a std::string, use the subscript operator ([]), at(), front(), or back(). The behaviour of these functions is similar to their std::vector equivalents.
// WARNING: front() and back() have undefined behaviour if size is 0.
char first_char { s1.front() };
char last_char { s1.back() };
// WARNING: at() does bounds checking while the subscript operator does not.
char x { s1.at(2) };
char w { s1[2] };
char y { s1.at(1000) }; // throws std::out_of_range
char z { s1[1000] }; // out of bounds -- undefined behaviour
To access individual characters within std::string via a random access iterator, use the typical begin() and end() functions (and their variants).
* begin() / end()
* cbegin() / cend() - returns characters as const
* rbegin() / rend() - returns characters in reverse
* crbegin() / crend() - returns characters in reverse and as const
To access the underlying character data of a std::string
, use either data()
or c_str()
. Both return a null-terminated string, but the latter is const
.
char * data1 { s1.data() };
const char * data2 { s1.c_str() };
↩PREREQUISITES↩
std::basic_string_view is a lightweight, non-owning view over a contiguous range of characters, such as all or part of a std::basic_string or a string literal. Similar to std::basic_string, std::basic_string_view has several out-of-the-box template specializations for each character type.
class | character type |
---|---|
std::string_view | char |
std::wstring_view | wchar_t |
std::u8string_view | char8_t |
std::u16string_view | char16_t |
std::u32string_view | char32_t |
std::basic_string_view
works by holding a pointer to (and the length of) the underlying character data without owning it, meaning that it's efficient but unsafe. Specifically, it has the potential to dangle: if the underlying string gets destroyed (or reallocates its buffer), the view pointing to it will be pointing at bad data.
std::basic_string_view
(and its specializations) support most of the same functions as std::basic_string
(and its specializations).
⚠️NOTE️️️⚠️
The text and examples below use std::string_view
, but they should work for the other template specializations as well.
std::string_view sv1 { s1.data(), 4 }; // view of first 4 characters of s1 (pointer + length)
std::string_view sv2 { s1 }; // view of s1
std::string_view sv3 {}; // view of an empty string
std::string_view sv4 { "hello" }; // view of the constant C-string hello
↩PREREQUISITES↩
⚠️NOTE️️️⚠️
This is from a C++ library called fmt which formats strings (similar to Python format strings). It's been included into the C++ standard library as of C++20, which is the version this section references.
std::format()
is a string formatting function that provides functionality very similar to Python's string formatting. Unlike older formatting systems like sprintf()
, it's ...
std::string
) rather than null-terminated character strings (char *
).
⚠️NOTE️️️⚠️
The examples below all use std::format(), which returns a string. The variant ...
* std::format_to() writes the output to an output iterator.
* std::format_to_n() writes at most n characters of the output to an output iterator.
There's also std::formatted_size(), which returns the length of what the formatted string would be.
std::format("Hello {}, it's {} degrees outside.", "steven", 42); // Hello steven, it's 42 degrees outside
std::format("Here's a number: {0}. Here's that same number again: {0}.", 42); // Here's a number: 42. Here's that same number again: 42.
std::format("{0:x>10} {0:x<10}", 42); // xxxxxxxx42 42xxxxxxxx
The formatting of a parameter is controlled by what's inside of the curly brackets for that parameter. At a minimum, it's empty (e.g. 1st example above). If it should target a specific argument, it needs to take in the index of that argument (e.g. 2nd example above). Then, any output options for a specific parameter are specified by inserting a colon followed by those options (e.g. 3rd example above).
Examples of the most common formatting options are provided below.
Padding and alignment
std::format("{:10}", 42); // "        42" (numbers are right-aligned by default)
std::format("{:<10}", 42); // "42        "
std::format("{:>10}", 42); // "        42"
std::format("{:^10}", 42); // "    42    "
std::format("{:x^10}", 42); // "xxxx42xxxx"
Number signs (e.g. should plus sign be put on a positive integer)
std::format("{0:},{0:+},{0:-},{0: }", 1); // "1,+1,1, 1"
std::format("{0:},{0:+},{0:-},{0: }", -1); // "-1,-1,-1,-1"
Number precision (e.g. where to truncate floating point)
std::format("{:.5f}", 3.14); // "3.14000"
std::format("{:0>10.5f}", 3.14); // "0003.14000"
Numeric encoding (e.g. decimal, hex, octal)
std::format("{:d}", 10); // "10"
std::format("{:x}", 10); // "a"
std::format("{:#x}", 10); // "0xa"
std::format("{:#X}", 10); // "0XA" (the X type uppercases the prefix as well as the digits)
std::format("{:#04X}", 10); // "0X0A"
std::format("{:o}", 10); // "12"
std::format("{:#o}", 10); // "012"
std::format("{:b}", 10); // "1010"
std::format("{:#b}", 10); // "0b1010"
std::format
provides support for many common parameter types: numbers (e.g. integral and floating point), pointers, single characters, character strings (e.g. null-terminated strings, C++ strings, and C++ string views), dates, times, durations, timezones, etc. To add support for a new type, that type needs a std::formatter
template specialization (note the "er" at the end -- formatter, not format).
The simplest approach to implement a template specialization for a custom type is to inherit from an existing template specialization. Formatting options are typically ignored in this case.
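template <>
struct std::formatter<Person> : std::formatter<std::string> {
    // format() should be const: the library may invoke it through a const formatter
    auto format(const Person& s, std::format_context& ctx) const {
        return std::format_to(ctx.out(), "{} {}", s.firstName, s.lastName);
    }
};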
Person p { "steve", "smith" };
std::format("Hello {}, the temperature today is {}!", p, 42);
To support formatting options, an extra function needs to be implemented (parse()
) which sets member variables based on the options it sees.
// For a better example, see https://www.modernescpp.com/index.php/extend-std-format-in-c-20-for-user-defined-types
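template <>
struct std::formatter<Person> {
    int space_count { 1 }; // default used when no option is given (e.g. "{}")
    // parse() must be constexpr because format strings are checked at compile time
    constexpr auto parse(std::format_parse_context& ctx) {
        auto it { ctx.begin() };
        if (it != ctx.end() && *it != '}') {
            space_count = 0;
            // accumulate decimal digits by hand (std::stoi isn't usable in constexpr)
            while (it != ctx.end() && *it >= '0' && *it <= '9') {
                space_count = space_count * 10 + (*it - '0');
                ++it;
            }
        }
        if (it != ctx.end() && *it != '}') {
            throw std::format_error("expected a number of spaces");
        }
        return it; // must point at the closing '}' (or the end)
    }
    auto format(const Person& s, std::format_context& ctx) const {
        std::string spacer(space_count, ' ');
        return std::format_to(ctx.out(), "{}{}{}", s.firstName, spacer, s.lastName);
    }
};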
Person p { "steve", "smith" };
std::format("Hello {:1}, the temperature today is {}!", p, 42);
↩PREREQUISITES↩
std::basic_regex
is a templated class for regular expression functionality. Similar to std::basic_string
, std::basic_regex
has several out-of-the-box template specializations for specific character types (not all character types).
class | character type |
---|---|
std::regex | char |
std::wregex | wchar_t |
⚠️NOTE️️️⚠️
What about other character types (e.g. char8_t
)? They're not supported because encoding support in C++ is not really there as of C++20. So what encoding is used here? Possibly platform-specific, possibly ASCII. It's probably stated somewhere but I have yet to find out what it is. On most major platforms, it's probably safe to assume that basic printable ASCII characters are encoded as they would be in ASCII.
⚠️NOTE️️️⚠️
The text and examples below use std::regex
, but they should work for the other template specializations as well.
🔍SEE ALSO🔍
To create a std::regex
, prime it with a specific regex pattern and optionally regex flags. Unless the pattern string is presented as an initializer list argument, you can't use braced initialization or brace-plus-equals initialization. You must use parentheses.
std::regex pattern1 { '\\', 'd', '+' }; // initializer list of pattern.
std::regex pattern2(R"(\d+)"); // raw string literal: R"( ... )"
std::regex pattern3(R"(\d+)", std::regex_constants::ECMAScript); // same as above
std::regex pattern4(R"(\d+)", std::regex_constants::ECMAScript | std::regex_constants::icase);
To get the regex flags, use flags()
. To get the number of groups (sub-expressions), use mark_count()
.
auto flags { pattern1.flags() };
auto group_count { pattern1.mark_count() };
To search a string for a pattern, use either std::regex_match()
or std::regex_search()
. Both have the same set of parameters, but the former requires the entire string to match the pattern while the latter searches the string for a substring that matches the pattern. Parameter number ...
1. is the string to search, either a std::string or a const char *.
2. is the std::match_results where match results go (templated class).
   * For std::string, use template specialization std::smatch.
   * For std::wstring, use template specialization std::wsmatch.
   * For const char *, use template specialization std::cmatch.
   * For const wchar_t *, use template specialization std::wcmatch.
3. is the std::regex pattern.
4. is the set of match flags (optional, defaults to std::regex_constants::match_default).
⚠️NOTE️️️⚠️
There are lots more specializations for parameter 2. See here for more information.
std::string s1 { "hello steven" };
std::regex pattern5(R"(hello (.*))");
std::smatch result;
bool matched { std::regex_match(s1, result, pattern5, std::regex_constants::match_default) };
// matched contains true if the pattern matched the string
// result contains information about the match (e.g. what parts of the string matched which sub-expressions) -- see cppreference for more info
⚠️NOTE️️️⚠️
If using std::regex_search()
, you can continue searching the string by extracting the end position of the search from the match result and running the search again from that position.
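The repeated-search idea in the note above can be sketched as follows. find_all is a hypothetical helper name. The sketch uses the iterator-pair overload of std::regex_search() so that each search can start where the previous match ended.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every substring of text that matches pattern.
std::vector<std::string> find_all(const std::string& text, const std::regex& pattern) {
    std::vector<std::string> found {};
    std::smatch result;
    auto begin { text.cbegin() };
    while (std::regex_search(begin, text.cend(), result, pattern)) {
        if (result.length(0) == 0) {
            break; // an empty match would never advance, so stop instead of looping forever
        }
        found.push_back(result[0].str()); // result[0] is the entire match
        begin = result.suffix().first;    // continue right after this match
    }
    return found;
}
```

The standard library also provides std::sregex_iterator for the same job, which is usually the more idiomatic choice.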
To search a string for a pattern and replace it, use std::regex_replace()
. Similar to std::regex_match()
and std::regex_search()
, this also has a flag argument (same type and default) that defines how replacement happens (e.g. $1
to replace with capture group 1).
std::regex pattern6(R"(hello (.*))");
std::string res = std::regex_replace("hello steven", pattern6, "goodbye $1");
// res should be "goodbye steven"
↩PREREQUISITES↩
Similar to Java's InputStream
and OutputStream
interfaces (and surrounding utilities and packages), the C++ standard library offers several stream classes and interfaces. Similar to std::basic_string
, a set of templated classes are provided for streams.
* std::basic_ostream is the C++ counterpart of Java's OutputStream.
* std::basic_istream is the C++ counterpart of Java's InputStream.
* std::basic_iostream is a combination of the above two.
Each of the classes above requires two template parameters: element type of the stream (e.g. is it streaming chars, ints, a custom type, etc.) and a class that describes the element type's traits (e.g. similar to the std::basic_string's character traits type). Template specializations are provided for some commonly used element types (e.g. char and wchar_t).
base stream type | element type | specialized stream type |
---|---|---|
std::basic_ostream | char | std::ostream |
std::basic_istream | char | std::istream |
std::basic_iostream | char | std::iostream |
std::basic_ostream | wchar_t | std::wostream |
std::basic_istream | wchar_t | std::wistream |
std::basic_iostream | wchar_t | std::wiostream |
You typically won't need to implement your own stream types. The C++ standard library provides stream implementations for common use-cases such as reading/writing to the console and files. The subsections below document these implementations, while the remainder of this section discusses the stream API.
⚠️NOTE️️️⚠️
The rest of this section talks about the general functionality of streams using std::cout
for an output stream / std::cin
for an input stream. These are for writing to / reading from the console, which is documented further in one of the subsections. For now just assume they exist.
🔍SEE ALSO🔍
To read and write text, operator overloads are provided called formatted operations: The left-shift operator (<<) is for writing while the right-shift operator (>>) is for reading. Each operator overload takes in the type to write/read and returns a reference back to the stream itself, allowing for chaining.
std::cout << 5 << ' ' << "hello world";
int x {};
int y {};
std::cin >> x >> y;
By default, the C++ standard library provides operator overloads for most built-in types (e.g. int, long, etc.) as well as some higher-level types within the C++ standard library (e.g. std::string, std::complex, etc.). To provide support for custom types, simply overload the operators for that type.
struct MyType {
int intValue;
long longValue;
};
std::ostream& operator<<(std::ostream& s, const MyType& val) {
    return s << val.intValue << ' ' << val.longValue; // separator needed so the two values can be read back
}
std::istream& operator>>(std::istream& s, MyType& val) {
s >> val.intValue;
s >> val.longValue;
return s;
}
Special objects called manipulators may be used to modify how a stream interprets formatted operations.
* std::ws skips over all whitespace in the input.
* std::flush flushes any buffered output.
* std::ends writes a null character ('\0').
* std::endl writes a newline character and flushes.
* std::boolalpha tells the stream to write/read booleans as text rather than 0/1.
* std::noboolalpha tells the stream to write/read booleans as 0/1 rather than text.
* std::oct tells the stream to write/read integrals as octal.
* std::dec tells the stream to write/read integrals as decimal.
* std::hex tells the stream to write/read integrals as hexadecimal.
* std::setprecision(p) tells the stream to write floating point at a specific precision (significant digits by default, digits after the decimal point when combined with std::fixed or std::scientific).
* std::fixed tells the stream to write floating point in fixed notation.
* std::scientific tells the stream to write floating point in scientific notation.
std::cin >> std::ws >> x; // skip over whitespace and read into variable
std::cout << "hello" << std::flush; // writes string and forces buffer to flush
std::cout << "hello" << std::ends; // writes string followed by null character
std::cout << "hello" << std::endl; // writes string followed by new-line character AND forces buffer to flush
std::cout << std::boolalpha << true; // writes true
std::cout << std::noboolalpha << true; // writes 1
std::cin >> std::boolalpha >> b_var; // reads true/false into boolean variable
std::cin >> std::noboolalpha >> b_var; // reads 0/1 into boolean variable
std::cout << std::oct << 10 << std::dec << 10 << std::hex << 10; // writes 10 as octal, decimal, and hex
std::cin >> std::oct >> i_var1 >> std::dec >> i_var2 >> std::hex >> i_var3; // reads integral as octal, decimal, and hex
std::cout << std::setprecision(2) << 3.14159; // writes 3.1 (2 significant digits in the default notation)
std::cout << std::fixed << 0.1; // writes 0.100000
std::cout << std::scientific << 0.1; // writes 1.000000e-01
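Manipulators work on any output stream, not just std::cout. The sketch below uses them with a std::ostringstream (a string-backed stream, covered in the string streams section) to build a fixed-precision number as a string. to_fixed is a hypothetical helper name.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: renders value with exactly `decimals` digits after the decimal point.
std::string to_fixed(double value, int decimals) {
    std::ostringstream out {};
    // With std::fixed, std::setprecision() counts digits AFTER the decimal point.
    out << std::fixed << std::setprecision(decimals) << value;
    return out.str();
}
```

Most manipulators (std::fixed, std::setprecision(), std::hex, etc.) stay in effect on the stream until changed, which is one reason to apply them to a short-lived local stream rather than to std::cout.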
🔍SEE ALSO🔍
At any point, a stream may end or enter into a bad state. A set of member functions can be used to query the state.
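* good() returns true if the stream is in a good state.
* eof() returns true if the stream has ended.
* fail() returns true if the last operation failed (but the stream may still be usable).
* bad() returns true if the stream is in an unrecoverable state.
⚠️NOTE️️️⚠️
At any point, you can call clear() to reset the state to good. Why would you ever want to do this? One common reason is recovering from a failed formatted read: if the input held letters where a number was expected, the stream stays in the failed state until clear() is called, after which the offending input can be skipped and reading can continue.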
In addition, exceptions()
can be used to make the stream throw an exception if it enters into one (or more) of the states listed above.
std::cin.exceptions(std::istream::badbit | std::istream::failbit); // exception if bad/fail, but not good/eof
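Streams provide a type conversion for bool that gives back the result of !fail()
(true unless an operation failed or the stream is in an unrecoverable state), allowing for shorthand testing of the stream state. The conversion is explicit, but conditions such as if and while apply it automatically.
// keep reading characters until a read fails (stream breaks or eof)
char ch {};
while (std::cin >> ch) { // test the stream AFTER the read so a failed read is never processed
    process(ch);
}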
To read non-text data, a set of member functions referred to as unformatted operations are available.
To read non-text data, use get()
, peek()
, getline()
, read()
, readsome()
, and ignore()
. gcount()
may be used to determine exactly how many characters were read in one of these functions (e.g. may have terminated early because it hit end-of-file or a new-line character).
char ch {};
char arr[100] {};
ch = std::cin.peek(); // read single character WITHOUT moving forward in the stream
ch = std::cin.get(); // read single character
std::cin.get(ch); // read single character
std::cin.get(arr, 100); // read up to 99 characters OR until \n (\n is LEFT in the stream, arr is null-terminated)
std::cin.get(arr, 100, ';'); // read up to 99 characters OR until ; (; is LEFT in the stream, arr is null-terminated)
std::cin.getline(arr, 100); // read up to 99 characters OR until \n (\n DISCARDED)
std::cin.getline(arr, 100, ';'); // read up to 99 characters OR until ; (; DISCARDED)
std::cin.read(arr, 100); // read 100 characters (no null terminator is added)
std::cin.readsome(arr, 100); // read up to 100 characters, limited to however many are "immediately available"
std::cin.ignore(); // skip single char
std::cin.ignore(5); // skip 5 chars
std::cin.ignore(5, '\n'); // skip up to 5 chars, stopping if \n is encountered (stops AFTER skipping \n)
auto count { std::cin.gcount() };
⚠️NOTE️️️⚠️
readsome()
is a little more dicey in that how it works is implementation specific.
To write non-text data, use put()
and write()
. For buffered streams, the buffer may be explicitly flushed by flush()
.
std::cout.put('x'); // write single character
std::cout.write("hello", 5); // write 5 characters
std::cout.flush();
To get and move the position of the underlying stream, use tell*()
and seek*()
respectively. The suffix depends on the type of stream:
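* tellg() / seekg() for input streams (the "get" position).
* tellp() / seekp() for output streams (the "put" position).
* Streams that support both input and output expose both pairs. There is no suffix-less tell() / seek().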
// NOTE: not supported on all stream types
auto pos { std::cin.tellg() };
String streams are similar to Java's ByteArrayInputStream
/ StringReader
and ByteArrayOutputStream
/ StringWriter
. The underlying types for string streams are ...
* std::basic_istringstream for input string stream.
* std::basic_ostringstream for output string stream.
* std::basic_stringstream for both input and output string stream.
The above types are templated classes, where the template parameters specify element type, element traits, and a custom allocator. The following template specializations are provided out-of-the-box...
type | element type |
---|---|
std::ostringstream | char |
std::wostringstream | wchar_t |
std::istringstream | char |
std::wistringstream | wchar_t |
std::stringstream | char |
std::wstringstream | wchar_t |
⚠️NOTE️️️⚠️
The text and examples below use std::ostringstream
/ std::istringstream
, but they should work for the other template specializations as well. Make sure to use the correct literal for raw character string types (e.g. L"example"
for wchar_t
).
For output string streams, in addition to all of the normal output stream functionality, ...
* str() returns a copy of the internal buffer as a std::string.
* view() returns a view to the internal buffer as a std::string_view.
std::ostringstream out {};
out << 3 << "hello!" << std::endl;
std::string output { out.str() };
std::string_view view { out.view() };
Input string streams have the same two methods, but they're hardly used because the main point of input string streams is to parse data out of the stream.
std::istringstream in { "1 9.555555" };
int x;
double y;
in >> x >> y;
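Input string streams are also handy for breaking a string into fields. The sketch below relies on the free function std::getline(), which reads into a std::string up to a delimiter. It isn't covered above, so treat its use here as an assumption about what fits your case. split is a hypothetical helper name.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits line into the fields separated by delim.
std::vector<std::string> split(const std::string& line, char delim) {
    std::vector<std::string> fields {};
    std::istringstream in { line };
    std::string field {};
    while (std::getline(in, field, delim)) { // delim is consumed and discarded
        fields.push_back(field);
    }
    return fields;
}
```

Empty fields in the middle are kept (e.g. "a,b,,c" gives four fields), while a single trailing delimiter doesn't produce a trailing empty field.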
File streams are similar to Java's FileInputStream
and FileOutputStream
. The underlying types for file streams are ...
* std::basic_ifstream for input file streams.
* std::basic_ofstream for output file streams.
* std::basic_fstream for both input and output file streams.
The above types are templated classes, where the template parameters specify element type and element traits. The following template specializations are provided out-of-the-box...
type | element type |
---|---|
std::ofstream | char |
std::wofstream | wchar_t |
std::ifstream | char |
std::wifstream | wchar_t |
std::fstream | char |
std::wfstream | wchar_t |
⚠️NOTE️️️⚠️
The text and examples below use std::ofstream
/ std::ifstream
, but they should work for the other template specializations as well. Make sure to use the correct literal for raw character string types (e.g. L"example"
for wchar_t
).
To access a file, either pass that file's path to the constructor or to open()
along with the set of file access flags. Those flags are ...
* std::ios::in - open for reading, file must exist.
* std::ios::out - open for writing, file created if it doesn't exist.
* std::ios::app - file created if it doesn't exist AND writes go to the end of the file.
* std::ios::trunc - file contents discarded (only valid together with std::ios::out).
* std::ios::binary - if set, no implicit text manipulations are performed on the file (e.g. replacing \n with \r\n or vice-versa).
std::fstream f1 { "/path/to/file.txt", std::ios::in | std::ios::out }; // open for reading AND writing, file must exist
std::fstream f2 {};
f2.open("/path/to/file.txt", std::ios::in | std::ios::out); // same open operation as f1
To close a file, use close()
or let the stream object be destroyed (e.g. by going out of scope), since its destructor closes the file.
f1.close();
To check if the stream has a file open, use is_open()
.
bool open { f1.is_open() };
To read and write, the standard stream mechanisms are available: formatted operations and unformatted operations.
// write
f1 << 3 << "hello!" << std::endl; // write
// read (move the read position back to the start of the file first)
f1.seekg(0);
int x;
double y;
f1 >> x >> y;
To get and set the position, the standard stream mechanisms are available: seek*()
and tell*()
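f1.seekg(500); // move the read position (use seekp() for the write position)
auto pos { f1.tellg() }; // get the read position (use tellp() for the write position)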
To handle IO errors, the standard stream mechanisms are available: exceptions()
to throw exceptions or explicitly check flags (e.g. invoke good()
).
f1.exceptions(std::istream::badbit | std::istream::failbit); // exception if bad/fail, but not good/eof
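Putting the pieces together, a write-then-read round trip might look like the sketch below. write_text and read_text are hypothetical helper names. The streams are opened through std::filesystem::path, which the file stream constructors accept as of C++17.

```cpp
#include <filesystem>
#include <fstream>
#include <string>

// Hypothetical helper: replaces the contents of the file at path with text.
bool write_text(const std::filesystem::path& path, const std::string& text) {
    std::ofstream out { path, std::ios::out | std::ios::trunc };
    if (!out.is_open()) {
        return false; // couldn't create / open the file
    }
    out << text;
    out.flush();
    return out.good(); // file is closed when out goes out of scope
}

// Hypothetical helper: reads the whole file at path, or returns "" if it can't be opened.
std::string read_text(const std::filesystem::path& path) {
    std::ifstream in { path, std::ios::in };
    std::string text {};
    char ch {};
    while (in.get(ch)) { // unformatted read, one character at a time
        text += ch;
    }
    return text;
}
```

Reading character by character keeps the example to the operations described above. It is not the fastest way to load a large file.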
For console access, global streams provide access to standard input, standard output, and standard error. Global streams are presented to the user as global variables.
channel | element type | global variable |
---|---|---|
standard in | char | std::cin |
standard out | char | std::cout |
standard error | char | std::cerr |
standard in | wchar_t | std::wcin |
standard out | wchar_t | std::wcout |
standard error | wchar_t | std::wcerr |
⚠️NOTE️️️⚠️
The text and examples below use std::cout
/ std::cin
/ std::cerr
, but they should work for the other template specializations as well. Make sure to use the correct literal for raw character string types (e.g. L"example"
for wchar_t
).
To read and write, the standard stream mechanisms are available: formatted operations and unformatted operations.
// write
std::cout << 3 << "hello!" << std::endl; // write
// read
int x;
double y;
std::cin >> x >> y;
To handle IO errors, the standard stream mechanisms are available: exceptions()
to throw exceptions or explicitly check flags (e.g. invoke good()
).
std::cin.exceptions(std::istream::badbit | std::istream::failbit); // exception if bad/fail, but not good/eof
The std::filesystem
namespace contains functionality related to file systems. The functionality provided in this namespace is similar to filesystem libraries that come with other high-level languages (e.g. Java or Python).
The subsections below describe basic types and common functionality. Note that most filesystem functions come in two forms:
* One throws an exception on error.
* The other takes in a std::error_code object and fills in its members (error information).
std::filesystem::path p1 { std::filesystem::current_path() }; // on error, throws exception
std::error_code ec {};
std::filesystem::path p2 { std::filesystem::current_path(ec) }; // on error, populates ec
For brevity, the examples in the subsection typically only show one of the two overloads being used.
⚠️NOTE️️️⚠️
There's also boost::filesystem
, which is what std::filesystem
is based off of.
The type std::filesystem::path
is the abstraction used to represent paths. Common ways to initialize a path object are to supply a string, string view, or an input iterator sequence of characters. Multiple character types are supported: char
, wchar_t
, char8_t
, char16_t
, ..., where the characters are re-encoded to the type used by the native file system.
std::filesystem::path p1 { "/home/user/Downloads/my_file.txt" };
std::filesystem::path p2 { "Downloads/my_file.txt" };
std::filesystem::path p3 { str.begin(), str.end() }; // str is some existing std::string holding a path
⚠️NOTE️️️⚠️
Also, as of C++20, there is no built-in character set encoding/decoding functionality, so how exactly is it converting character set encodings (e.g. utf-8 characters to whatever the platform is expecting for its filesystem)? char
is for the platform's encoding - so maybe just use that and ignore everything else?
To fix path separators such that they're for the current platform, use make_preferred()
.
std::filesystem::path windows_path{"a\\b\\c"};
std::filesystem::path posix_path{"a/b/c"};
windows_path.make_preferred(); // if on Linux, will change windows_path to be "a/b/c"
Given a path, to ...
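* append to it, use append(), the slash operator (/), or the slash assignment operator (/=).
* get its parent path, use parent_path().
* get its filename, use filename().
* remove its filename, use remove_filename().
* replace its filename, use replace_filename().
std::filesystem::path p1 { "/home/user" };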
std::filesystem::path p2 { p1 / "file.txt" }; // p2 is initialized to "/home/user/file.txt"
p1 /= "file.txt"; // p1 becomes "/home/user/file.txt"
std::filesystem::path p3 { "/home/user/file.txt" };
std::filesystem::path p4 { p3.parent_path() }; // p4 is initialized to "/home/user"
std::filesystem::path p5 { p3.filename() }; // p5 is initialized to "file.txt"
p3.remove_filename(); // p3 becomes "/home/user/" (the trailing separator is kept)
p3.replace_filename("other"); // p3 becomes "/home/user/other"
To test if a path ...
* is absolute or relative, use is_absolute() / is_relative().
* has a filename, use has_filename().
* has a parent path, use has_parent_path().
std::filesystem::path p1 { "/home/user" };
bool is_abs { p1.is_absolute() }; // true
bool is_rel { p1.is_relative() }; // false
bool has_filename { p1.has_filename() }; // true
bool has_parent_path { p1.has_parent_path() }; // true
For file extensions, to ...
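* get the extension, use extension().
* get the stem (filename without the extension), use stem().
* test for an extension or stem, use has_extension() / has_stem().
* replace the extension, use replace_extension().
std::filesystem::path p1 { "/home/user/file.txt" };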
bool has_ext { p1.has_extension() }; // true
bool has_stem { p1.has_stem() }; // true
std::filesystem::path p1_stem { p1.stem() }; // p1_stem is initialized to "file"
std::filesystem::path p1_ext { p1.extension() }; // p1_ext is initialized to ".txt" (the dot is included)
p1.replace_extension(".png"); // p1 becomes "/home/user/file.png"
p1.replace_extension("jpg"); // p1 becomes "/home/user/file.jpg"
p1.replace_extension("."); // p1 becomes "/home/user/file."
p1.replace_extension(""); // p1 becomes "/home/user/file"
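The decomposition functions compose well when deriving one path from another. The sketch below builds a sibling path such as /home/user/file.bak.txt from /home/user/file.txt. backup_path is a hypothetical helper name. The sketch also uses path's operator+=, which concatenates without inserting a separator (unlike /=).

```cpp
#include <filesystem>

// Hypothetical helper: returns the path of a ".bak" sibling of original,
// e.g. "/home/user/file.txt" -> "/home/user/file.bak.txt".
std::filesystem::path backup_path(const std::filesystem::path& original) {
    std::filesystem::path name { original.stem() }; // "file"
    name += ".bak";                                 // += concatenates, no separator added
    name += original.extension();                   // ".txt" (extension() includes the dot)
    return original.parent_path() / name;           // "/home/user" / "file.bak.txt"
}
```

All of this is purely lexical: nothing touches the actual filesystem, so the paths don't have to exist.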
↩PREREQUISITES↩
To get the current path, use std::filesystem::current_path()
.
std::filesystem::path p { std::filesystem::current_path() }; // on error, throws exception
// NOTE: std::error_code equivalent exists that doesn't throw exceptions
To relativize a path, use std::filesystem::relative()
.
std::filesystem::path p { "/home/user/hello/foo.txt" };
std::filesystem::path p_base { "/home/user" };
std::filesystem::path p_rel { std::filesystem::relative(p, p_base) }; // p_rel is initialized to "hello/foo.txt"
// NOTE: std::error_code equivalent exists that doesn't throw exceptions
⚠️NOTE️️️⚠️
If you don't supply a base, it defaults the base to std::filesystem::current_path().
According to cppreference, this resolves symbolic links and normalizes both paths before relativizing (it behaves like std::filesystem::weakly_canonical() followed by lexically_relative()), so the paths don't have to exist. There's also std::filesystem::proximate(), which is looser: where relative() would return an empty path because no relative path can be formed, proximate() returns the original path instead.
To convert a relative path to an absolute path, use std::filesystem::absolute()
. This is as if std::filesystem::current_path()
were prepended to a relative path.
std::filesystem::path p { "hello/foo.txt" };
std::filesystem::path p_abs { std::filesystem::absolute(p) }; // on error, throws exception
// NOTE: std::error_code equivalent exists that doesn't throw exceptions
⚠️NOTE️️️⚠️
There's no option to set the base path, so what's the point of this? Why not just append the paths together and normalize it (remove ..
and .
)?
To convert a path to an absolute path that also processes out .
and ..
in the path (e.g. if there's a ..
, it'll knock back a path element automatically), use std::filesystem::canonical()
.
std::filesystem::path p { ".././hello/foo.txt" };
std::filesystem::path p_abs1 { std::filesystem::absolute(p) }; // on error, throws exception
std::filesystem::path p_abs2 { std::filesystem::canonical(p) }; // on error, throws exception
// if the current path is /home/user, ...
// p_abs1 will be /home/user/.././hello/foo.txt
// p_abs2 will be /home/hello/foo.txt
⚠️NOTE️️️⚠️
According to cppreference, this also resolves symbolic links, which means the path you submit has to exist (otherwise an error is reported). There's also std::filesystem::weakly_canonical()
which doesn't require the path to exist: it resolves the leading path elements that do exist and appends the rest, normalized, without checking them.
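When the path may not exist and symbolic links don't matter, the "append the paths together and normalize" approach can be done purely lexically with the path member function lexically_normal(). The sketch below is a minimal illustration; join_normalized is a hypothetical helper name.

```cpp
#include <filesystem>

// Hypothetical helper: joins base and relative, then removes "." and ".."
// elements lexically (no filesystem access, no symlink resolution).
std::filesystem::path join_normalized(const std::filesystem::path& base,
                                      const std::filesystem::path& relative) {
    return (base / relative).lexically_normal();
}
```

With a base of /home/user and a relative path of .././hello/foo.txt, this should produce /home/hello/foo.txt. If the second argument is itself absolute, the slash operator discards the base, so the result is just the normalized second argument.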
↩PREREQUISITES↩
To check if a path exists, use std::filesystem::exists()
.
std::filesystem::path p1 { "/home/user/file.txt" };
bool exists1 { std::filesystem::exists(p1) }; // on error, throws exception
// NOTE: std::error_code equivalent exists that doesn't throw exception
To run a ...
* stat-like command on a path, use std::filesystem::status().
* lstat-like command on a path, use std::filesystem::symlink_status().
⚠️NOTE️️️⚠️
The difference is that, for the latter, symlinks aren't followed. The status is for the symlink itself, not what it points to.
std::filesystem::path p1 { "/home/user/file.txt" };
std::filesystem::file_status stat1 { std::filesystem::status(p1) }; // on error, throws exception
// NOTE: std::error_code equivalent exists that doesn't throw exception
if (stat1.type() == std::filesystem::file_type::none) { /* error or not evaluated yet? */ }
if (stat1.type() == std::filesystem::file_type::not_found) { /* path not found */ }
if (stat1.type() == std::filesystem::file_type::regular) { /* regular file */ }
if (stat1.type() == std::filesystem::file_type::directory) { /* directory */ }
if (stat1.type() == std::filesystem::file_type::symlink) { /* symlink */ }
if (stat1.type() == std::filesystem::file_type::block) { /* block special file */ }
if (stat1.type() == std::filesystem::file_type::character) { /* character special file */ }
if (stat1.type() == std::filesystem::file_type::fifo) { /* fifo / pipe */ }
if (stat1.type() == std::filesystem::file_type::socket) { /* socket */ }
// implementations may define additional file_type values for other types, such as "NTFS junctions"
if (stat1.type() == std::filesystem::file_type::unknown) { /* file exists but type unknown */ }
std::filesystem::perms perms { stat1.permissions() }; // standard linux file permissions: three octal digits (e.g. 0777)
std::cout << ((perms & std::filesystem::perms::owner_read) != std::filesystem::perms::none ? "r" : "-")
<< ((perms & std::filesystem::perms::owner_write) != std::filesystem::perms::none ? "w" : "-")
<< ((perms & std::filesystem::perms::owner_exec) != std::filesystem::perms::none ? "x" : "-")
<< ((perms & std::filesystem::perms::group_read) != std::filesystem::perms::none ? "r" : "-")
<< ((perms & std::filesystem::perms::group_write) != std::filesystem::perms::none ? "w" : "-")
<< ((perms & std::filesystem::perms::group_exec) != std::filesystem::perms::none ? "x" : "-")
<< ((perms & std::filesystem::perms::others_read) != std::filesystem::perms::none ? "r" : "-")
<< ((perms & std::filesystem::perms::others_write) != std::filesystem::perms::none ? "w" : "-")
<< ((perms & std::filesystem::perms::others_exec) != std::filesystem::perms::none ? "x" : "-")
<< std::endl;
To directly test a path's type, a set of helper functions is available (e.g. std::filesystem::is_directory()
).
std::filesystem::path p1 { "/home/user/file.txt" };
if (std::filesystem::is_block_file(p1)) { /* block special file */ }
if (std::filesystem::is_character_file(p1)) { /* character special file */ }
if (std::filesystem::is_directory(p1)) { /* directory */ }
if (std::filesystem::is_fifo(p1)) { /* fifo / pipe */ }
if (std::filesystem::is_regular_file(p1)) { /* regular file */ }
if (std::filesystem::is_socket(p1)) { /* socket */ }
if (std::filesystem::is_symlink(p1)) { /* symlink */ }
if (std::filesystem::is_other(p1)) { /* equiv to exists(p1) && !is_regular_file(p1) && !is_directory(p1) && !is_symlink(p1) */ }
if (std::filesystem::status_known(std::filesystem::status(p1))) { /* takes a file_status (not a path): true if the type isn't std::filesystem::file_type::none, std::filesystem::file_type::unknown still counts as known */ }
// NOTE: std::error_code equivalents exist that don't throw exceptions
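The notes above mention non-throwing overloads that take a std::error_code. The following is a minimal sketch of how one might be used, assuming a hypothetical helper named describe_path (not part of the standard library). The not-found check comes first because some implementations also set the error code when a path doesn't exist.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical helper: classifies a path without throwing exceptions.
std::string describe_path(const std::filesystem::path& p) {
    std::error_code ec {};
    std::filesystem::file_status st { std::filesystem::status(p, ec) };
    if (st.type() == std::filesystem::file_type::not_found) { return "not found"; }
    if (ec) { return "error: " + ec.message(); }
    if (std::filesystem::is_directory(st)) { return "directory"; }
    if (std::filesystem::is_regular_file(st)) { return "regular file"; }
    return "other";
}
```

Passing the file_status to the is_*() helpers avoids querying the file system a second time.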
To set a path's permissions, use std::filesystem::permissions()
.
std::filesystem::path p1 { "/home/user/file.txt" };
std::filesystem::perms prms { std::filesystem::perms::owner_all | std::filesystem::perms::group_all | std::filesystem::perms::others_read };
std::filesystem::permissions(p1, prms); // on error, throws exception
// NOTE: There's a 3rd parameter called opts that controls how permissions are
// replaced, you almost always want to keep this as the default (replace).
// NOTE: std::error_code equivalent exists that doesn't throw exception
To ...
get the size of a file, use std::filesystem::file_size()
.
get or set the last modified time of a path, use std::filesystem::last_write_time()
.
std::filesystem::path p1 { "/home/user/file.txt" };
std::uintmax_t sz { std::filesystem::file_size(p1) }; // on error, throws exception
std::filesystem::file_time_type time { std::filesystem::last_write_time(p1) }; // on error, throws exception
std::filesystem::last_write_time(p1, time); // on error, throws exception
// NOTE: file_time_type is an alias to std::chrono::time_point<std::chrono::file_clock>
// NOTE: std::error_code equivalents exist that don't throw exceptions
🔍SEE ALSO🔍
To either truncate a file or expand it by filling it with zero'd out bytes (file holes), use std::filesystem::resize_file()
.
std::filesystem::path p1 { "/home/user/file.txt" };
std::filesystem::resize_file(p1, 64u*1024u); // on error, throws exception
// NOTE: std::error_code equivalents exist that don't throw exceptions
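To make the grow / truncate behaviour concrete, here is a small sketch. The helper name write_then_resize and the fs alias are assumptions made for illustration. It writes a few bytes, resizes the file, and reports the resulting size.

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>

namespace fs = std::filesystem; // local alias, for brevity only

// Hypothetical helper: (re)creates a 5 byte file, resizes it, and returns the new size.
std::uintmax_t write_then_resize(const fs::path& p, std::uintmax_t new_size) {
    {
        std::ofstream out { p, std::ios::binary | std::ios::trunc };
        out << "hello";
    } // stream closed here, so the bytes are on disk before resizing
    fs::resize_file(p, new_size); // grows (zero-filled) or truncates
    return fs::file_size(p);
}
```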
To get the usage statistics for the disk that a path is for, use std::filesystem::space()
.
std::filesystem::path p1 { "/home" };
std::filesystem::space_info si { std::filesystem::space(p1) }; // on error, throws exception
std::cout << "disk capacity: " << si.capacity << " "
<< "disk free: " << si.free << " "
<< "disk available: " << si.available << std::endl;
// NOTE: std::error_code equivalents exist that don't throw exceptions
↩PREREQUISITES↩
To copy a file, use std::filesystem::copy_file()
.
Optionally, a std::filesystem::copy_options
object may be passed in which defines how the copy occurs. By default, it's set to std::filesystem::copy_options::none
(report an error if the destination already exists), but other options include skipping the copy if the destination exists, overwriting it, or overwriting it only if the source is newer.
std::filesystem::path p_from { "/home/user/file.txt" };
std::filesystem::path p_to { "/home/user/COPIED_file.txt" };
bool copied { std::filesystem::copy_file(p_from, p_to) };
// NOTE: std::error_code equivalents exist that don't throw exceptions
To copy a directory or a file, use std::filesystem::copy()
.
Optionally, a std::filesystem::copy_options
object may be passed in which defines how the copy occurs. By default, it's set to std::filesystem::copy_options::none
(default behaviour). If the path being copied is a directory, the default behaviour copies only the files directly inside that directory, skipping its subdirectories. To copy the subdirectories and their contents as well, pass std::filesystem::copy_options::recursive
.
std::filesystem::path p_from { "/home/user1" };
std::filesystem::path p_to { "/home/user2" };
std::filesystem::copy(p_from, p_to);
// NOTE: std::error_code equivalents exist that don't throw exceptions
🔍SEE ALSO🔍
↩PREREQUISITES↩
To move / rename a file or directory, use std::filesystem::rename()
.
If the source is a directory, the destination can be non-existent (in which case it'll get created) or an empty directory (in which case it'll get deleted first and recreated).
std::filesystem::path p_from { "/home/user/file.txt" };
std::filesystem::path p_to { "/home/user/MOVED_file.txt" };
std::filesystem::rename(p_from, p_to); // returns nothing; on error, throws exception
// NOTE: std::error_code equivalents exist that don't throw exceptions
⚠️NOTE️️️⚠️
Symlinks are not followed. If the source is a symlink, the symlink itself is renamed, not the target.
↩PREREQUISITES↩
To remove a file or empty directory, use std::filesystem::remove()
.
std::filesystem::path p1 { "/home/user/file.txt" };
bool removed { std::filesystem::remove(p1) };
// NOTE: std::error_code equivalents exist that don't throw exceptions
To recursively remove all files in a directory as well as the directory itself, use std::filesystem::remove_all()
.
std::filesystem::path p1 { "/home/user" };
std::uintmax_t num_items_removed { std::filesystem::remove_all(p1) };
// NOTE: std::error_code equivalents exist that don't throw exceptions
⚠️NOTE️️️⚠️
Symlinks are not followed. If the source is a symlink, the symlink itself is removed.
↩PREREQUISITES↩
🔍SEE ALSO🔍
To create a directory, use either std::filesystem::create_directory()
or std::filesystem::create_directories()
(plural). The latter will recursively create all missing directories in the chain.
std::filesystem::path p { "/home/user/a/b/c" };
bool created { std::filesystem::create_directories(p) };
// NOTE: std::error_code equivalents exist that don't throw exceptions
To get the directory suitable for creating temporary files, use std::filesystem::temp_directory_path()
.
std::filesystem::path temp_path { std::filesystem::temp_directory_path() };
std::cout << "Temp directory is " << temp_path << std::endl;
To iterate over a directory's children, use either std::filesystem::directory_iterator
or std::filesystem::recursive_directory_iterator
. Both classes are input iterators that give back std::filesystem::directory_entry
objects, which hold metadata information about a child path. The latter class will recursively iterate down all child paths.
std::filesystem::path p { "/home/user/a/b/c" };
for (const std::filesystem::directory_entry& e : std::filesystem::recursive_directory_iterator(p)) {
std::cout << e.path() << std::endl
<< e.file_size() << std::endl
<< e.last_write_time() << std::endl
<< e.hard_link_count() << std::endl
<< e.is_block_file() << std::endl
<< e.is_character_file() << std::endl
<< e.is_directory() << std::endl
<< e.is_fifo() << std::endl
<< e.is_regular_file() << std::endl
<< e.is_socket() << std::endl
<< e.is_symlink() << std::endl
<< e.is_other() << std::endl;
// std::cout << e << std::endl; // directory_entry also has a direct overload for output streams
}
🔍SEE ALSO🔍
↩PREREQUISITES↩
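To create a ...
symlink, use std::filesystem::create_symlink()
/ std::filesystem::create_directory_symlink()
.
hard link, use std::filesystem::create_hard_link()
.
std::filesystem::path p_from { "/home/user/file.txt" };
std::filesystem::path p_to { "/home/user/LINKED_file.txt" };
std::filesystem::create_symlink(p_from, p_to); // on error (e.g. p_to already exists), throws exception
std::filesystem::create_hard_link(p_from, p_to); // alternative to the line above, running both fails because p_to would already exist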
// NOTE: you should use create_directory_symlink() instead of create_symlink() when symlinking dirs
// NOTE: std::error_code equivalents exist that don't throw exceptions
To check if a path is a symlink, use std::filesystem::is_symlink()
.
std::filesystem::path p1 { "/home/user/LINKED_file.txt" };
bool is_sym { std::filesystem::is_symlink(p1) };
// NOTE: std::error_code equivalent exists that doesn't throw exceptions
To follow a symlink, use std::filesystem::read_symlink()
.
std::filesystem::path p_from { "/home/user/LINKED_file.txt" };
std::filesystem::path p_to { std::filesystem::read_symlink(p_from) };
// NOTE: std::error_code equivalent exists that doesn't throw exceptions
To copy a symlink (not copy its target, but copy the symlink itself), use std::filesystem::copy_symlink()
.
std::filesystem::path p_from { "/home/user/LINKED_file.txt" };
std::filesystem::path p_to { "/home/user/COPIED_LINKED_file.txt" };
std::filesystem::copy_symlink(p_from, p_to); // returns nothing; on error, throws
// NOTE: std::error_code equivalent exists that doesn't throw exceptions
🔍SEE ALSO🔍
To check if two paths point to the same place (e.g. symlink or hard link), use std::filesystem::equivalent()
.
std::filesystem::path p1 { "/home/user/file.txt" };
std::filesystem::path p2 { "/home/user/LINKED_file.txt" };
bool equiv { std::filesystem::equivalent(p1, p2) }; // on error, throws
// NOTE: std::error_code equivalent exists that doesn't throw exceptions
To count the number of hard links a file has, use std::filesystem::hard_link_count()
.
std::filesystem::path p2 { "/home/user/LINKED_file.txt" };
std::uintmax_t cnt { std::filesystem::hard_link_count(p2) };
// NOTE: std::error_code equivalent exists that doesn't throw exceptions
The C++ standard library provides functionality to assist with debugging. The subsections below detail some of the most common ones.
The function std::source_location::current()
determines where in the source code the program is executing. The function generates a std::source_location
object which specifies which source file and where in that file the invocation took place. The std::source_location
object provides the following member functions:
file_name()
returns the source file in which the invocation took place.
line()
returns the source line in which the invocation took place.
column()
returns the index within the source line in which the invocation took place.
function_name()
returns the name of the function in which the invocation took place.
//
// NOTE: log() works because the default for the location parameter is generated by the caller.
//
void log(std::string_view message, const std::source_location& location = std::source_location::current()) {
std::cout << location.file_name() << ':' << location.line() << ' ' << message;
}
int main() {
log("Hello world!"); // prints something like "main.cpp:19 Hello world!" (file name and line depend on where log() is called)
return 0;
}
⚠️NOTE️️️⚠️
It seems like this is replacing standard C++ preprocessor macros like __FILE__
and __LINE__
.
↩PREREQUISITES↩
In certain scenarios, it isn't easily possible to get access to the address of an object. The C++ standard library provides two functions that help with this.
std::addressof()
forcefully returns the address of an object. This is useful in cases where the object could have an operator overload for the address-of operator (&) that returns something other than the object's actual address.
int x;
std::cout << std::addressof(x) << std::endl; // prints "0x7ffc0504312c"
std::to_address()
converts a pointer-like object (e.g. a raw pointer or a smart pointer) to a raw pointer. This is useful because it abstracts out accessing the address of an object, even if that object is wrapped in something like a smart pointer.
int x { 5 };
int *xRawPtr { &x };
auto custom_deleter = [](int* x) { /* do nothing */ };
std::unique_ptr<int, decltype(custom_deleter)> xSmartPtr{ &x, custom_deleter };
std::cout << std::to_address(&x) << std::endl; // prints "0x7ffc36e943b4"
std::cout << std::to_address(xRawPtr) << std::endl; // prints "0x7ffc36e943b4"
std::cout << std::to_address(xSmartPtr) << std::endl; // prints "0x7ffc36e943b4"
⚠️NOTE️️️⚠️
See here for discussion on specific use-cases.
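For std::addressof(), the overloaded address-of operator case can be sketched as follows. The type Sneaky and the function name are hypothetical, made up purely to demonstrate the difference.

```cpp
#include <memory>

// Hypothetical type whose address-of operator doesn't return the object's address.
struct Sneaky {
    int value { 42 };
    Sneaky* operator&() { return nullptr; }
};

bool addressof_sees_through_overload() {
    Sneaky s {};
    Sneaky* via_operator { &s };                 // calls the overload, yields nullptr
    Sneaky* via_addressof { std::addressof(s) }; // bypasses the overload, yields the real address
    return via_operator == nullptr
        && via_addressof != nullptr
        && via_addressof->value == 42;
}
```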
The function std::is_constant_evaluated()
can be invoked inside of a constexpr
function to determine whether it's being invoked at compile-time or run-time. One of the typical use-cases of constexpr
functions is to increase performance by forcefully evaluating things at compile-time when possible. As such, when debugging performance issues, std::is_constant_evaluated()
can be used to ensure that the correct path is being taken for an invocation.
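Here's an example, adapted from the one in the book...
constexpr double power(double b, int x) {
if (std::is_constant_evaluated()) { // compile-time path: multiply by hand rather than rely on std::pow() (assumes x >= 0)
double r { 1.0 };
for (int i { 0 }; i < x; ++i) { r *= b; }
return r;
}
return std::pow(b, double(x)); // run-time path
}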
🔍SEE ALSO🔍
⚠️NOTE️️️⚠️
Technically, std::is_constant_evaluated()
can be used anywhere. If you use it ...
consteval
, it will always evaluate to true
constexpr
, it may evaluate to true or false depending on where it was called
preprocessor - A tool that takes in a C++ source file and performs basic manipulation on it to produce what's called a translation unit.
compiler - A tool that takes in a translation unit to produce an intermediary format called an object file.
linker - A tool that takes multiple object files to produce an executable. Linkers are also responsible for finding libraries used by the program and integrating them into the executable.
enumeration - A user-defined type that can be set to one of a set of possibilities.
enum class MyEnum {
OptionA,
OptionB,
OptionC
};
MyEnum x {MyEnum::OptionC};
class - A user-defined type that pairs together data and the functions that operate on that data.
class MyClass {
public:
MyClass(int x, long y) {
this->x = x;
this->y = y;
}
int add(int z) {
this->x += z;
return y + z;
}
private:
int x;
long y;
};
union - A user-defined type where all members share the same memory location (different representations of the same data).
union MyUnion {
int x;
long y;
};
plain-old-data class - A class that contains only data, not functions.
struct Podo {
int x;
long y;
};
member - Data or function belonging to a class.
member function - Function belonging to a class (class member that is a function).
struct C {
...
int add(int y) { return this->x + y; }
};
free function - Function not belonging to a class.
int negate(int x) { return -x; }
field - Variable belonging to a class (class member that is a variable).
struct C {
int x;
};
class invariant - When using some class, a class invariant is a feature of that class that is always true (never varies). For example, if a class is used to hold on to an IP and port combination, and it ensures that the port can never be 0, that's a class invariant.
fundamental type - C++ type that's built into the compiler itself rather than being declared through code. Examples include void
, bool
, int
, char
, etc..
user-defined type - A type that's defined by a user, typically derived from existing types. Examples include enumerations, classes, unions, etc..
object initialization - The process by which a C++ program initializes an object (e.g. an int
, array of int
s, object of a class type, etc..).
braced initialization - A form of object initialization where braces are used to set values (e.g. int x {1}
, MyStruct x{ 1, true }
, etc..). Braced initialization is often the least error-prone form of object initialization, where other forms may introduce ambiguity.
MyStruct x{int(a), int(b)}; // call the constructor taking in two ints
MyStruct x(int(a), int(b)); // possibly interpreted as function declaration -- equiv to MyStruct x(int a, int b)
float a{1}, b{2};
int c (a/b); // no compiler warning generated about narrowing (parentheses permit implicit narrowing conversions)
int d {a/b}; // compiler warning / error generated about narrowing
⚠️NOTE️️️⚠️
This is also called uniform initialization.
equals initialization - A form of object initialization where the equals sign is used (e.g. int x = 5
).
braces-plus-equals initialization - A form of object initialization where both the equals sign and braces are used for initialization (e.g. MyStruct x = { 1, true }
). This is mostly the same as braced initialization.
⚠️NOTE️️️⚠️
See here. Even though there's an equal sign (=), there is no copy semantics / move semantics.
constructor - A function used for initializing an object.
struct MyStruct : MyParent {
...
MyStruct() {
// do some setup here
}
};
destructor - A function used for cleanup when an object is destroyed.
struct MyStruct : MyParent {
...
~MyStruct() {
// do some cleanup here
}
};
See also: virtual destructor.
default constructor - A constructor that takes in no parameters.
pointer - A data type used to point to a different piece of memory (e.g. int *yPtr { &y }
).
reference - A data type used to point to a different piece of memory, but in a more sanitized / less confusing manner (e.g. int &yRef { y };
).
sizeof - A unary operator that returns the size of a type or object (known at compile-time).
int x {5};
size_t x_size {sizeof x};
address-of (&) - A unary operator used to obtain the memory address of an object (pointer) (e.g. int *ptr {&x}
).
dereference (*) - A unary operator used to obtain the object at some memory address (e.g. int x {*ptr}
).
member-of-pointer (->) - An operator that dereferences a pointer and accesses a member of the object pointed to (e.g. ptr->x
).
member-of-object (.) - An operator that accesses a member of an object (e.g. obj.x
).
pointer arithmetic - Adding or subtracting an integer to a pointer moves that pointer by that many elements, where each element spans the number of bytes that makes up the pointer's underlying type (e.g. uint32_t *ptrB = ptrA + 1
will set ptrB
to 4 bytes ahead of ptrA).
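The byte distance can be checked directly. In this sketch the function name bytes_moved_by_plus_one is hypothetical. The pointers are viewed as char pointers so that the subtraction counts bytes rather than elements.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: how many bytes does "+ 1" move a uint32_t pointer?
std::ptrdiff_t bytes_moved_by_plus_one() {
    std::uint32_t values[2] { 1u, 2u };
    std::uint32_t* ptrA { values };
    std::uint32_t* ptrB { ptrA + 1 }; // one element ahead
    return reinterpret_cast<char*>(ptrB) - reinterpret_cast<char*>(ptrA);
}
```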
reseating - The concept of a variable that points to something updating to point to something else. Pointers can be reseated, but references cannot.
int x {5};
int *p {&x};
int y {7};
p = &y; // reseat p
member initializer list - A comma separated list of object initializations for the fields of a class appearing just before a constructor's body.
struct MyStruct {
int count;
bool flag;
MyStruct(): count{0}, flag{false} {
}
};
default member initialization - The object initialization of a field directly where that field is declared.
struct MyClass {
...
int my_var {5};
};
object - A region of memory that has a type and a value (e.g. class, an integer, a pointer to an integer, etc..).
allocation - The act of reserving memory for an object.
deallocation - The act of releasing the memory used by an object.
storage duration - The duration between an object's allocation and deallocation.
lifetime - The duration between when an object's constructor completes (meaning the constructor finishes) and when its destructor is invoked (meaning when the destructor starts).
automatic object - An object that's declared within an enclosing code block. The storage duration of these objects start at the beginning of the block and finish at the end of the block.
int my_func(int x) {
int automatic_object {x + 5};
return automatic_object;
}
static object - An object that's declared using static
or extern
. The storage duration of these objects start at the beginning of the program and finish at the end of the program.
local static object - A static object but declared at function scope. The storage duration of these objects start at the first invocation of the function and finish at the end of the program.
int my_func() {
static int local_static_object {0};
local_static_object++;
return local_static_object;
}
static member - An object that's a member of a class but bound globally rather than on an instance of the class. A static field is essentially a static object that's accessible through the class itself (not an instance of the class). Similarly, a static method is essentially a global function that's accessed through the class (not an instance of the class).
struct MyClass {
...
inline static int my_var {5}; // inline (C++17) is needed to initialize a non-const static field where it's declared
};
thread local object - An object where each thread has access to its own copy. The storage duration of these objects start at the beginning of the thread and finish when the thread ends.
dynamically allocated object - An object that's allocated and deallocated at the user's behest, meaning that its storage duration is also controlled by the user.
int * x { new int {5} };
delete x;
internal linkage - A variable only visible to the translation unit it's in.
external linkage - A variable visible to the translation units that it's in as well as other translation units.
scope resolution (::) - An operator that's used to access static members (e.g. MyStruct::static_func()
).
extend - Another way of expressing class inheritance (e.g. B extends A is the same as saying B is a child of A).
exception - An exception operation accepts an object and unwinds the call stack until reaching a special region specifically intended to stop the unwinding for objects of that type, called a try-catch block. Exceptions are a way for code to signal that something unexpected / exceptional happened.
structured binding - A language feature that allows for unpacking an object's members / array's elements into a set of variables (e.g. auto [x, y] { two_elem_array }
).
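A short sketch of both forms of unpacking follows. The Point type and the function names are hypothetical, used only for illustration.

```cpp
// Hypothetical class with two public fields.
struct Point {
    int x;
    int y;
};

int sum_fields() {
    Point p { 3, 4 };
    auto [x, y] = p;     // unpacks the members of p (copies)
    return x + y;
}

int sum_array() {
    int arr[2] { 10, 20 };
    auto [a, b] = arr;   // unpacks the elements of arr (copies)
    return a + b;
}
```

Using auto& instead of auto binds the names to the original members / elements rather than to copies.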
copy semantics - The rules used for making copies of objects of some type. A copy, once made, should be the same as its source. A modification on the copy shouldn't modify the source as well.
member-wise copy - The default copy semantics for classes. Each individual field is copied.
copy constructor - A constructor with a single parameter that takes in a reference to an object of the same type (e.g. T(const T &) { ... }
). A copy constructor is used to specify the copy semantics for that class.
copy assignment - An assignment operator overload that copies one object into another (e.g. x = y
). Copy assignment requires that resources in the destination object be cleaned up prior to performing the copy.
RAII - Short for resource acquisition is initialization, the concept that the life cycle of some resource (e.g. open file, database object, etc..) is bound to an object's lifetime via its constructor and destructor.
Sometimes also referred to as constructor acquires destructor releases (CADRe).
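A minimal sketch of the idea, using a counter as a stand-in for a real resource such as a file or socket. The names open_handles, Handle, and handles_open_inside_scope are hypothetical.

```cpp
int open_handles { 0 }; // stand-in for a real resource, assumed for illustration

struct Handle {
    Handle() { ++open_handles; }   // constructor acquires
    ~Handle() { --open_handles; }  // destructor releases
    Handle(const Handle&) = delete;            // copying would release twice
    Handle& operator=(const Handle&) = delete;
};

int handles_open_inside_scope() {
    Handle h {};
    return open_handles; // h is still alive here
} // h's destructor runs on the way out, releasing the resource
```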
moved-from object - When an object is moved to another object, that object enters a special state where the only possible operation allowed on it is either destruction or re-assignment.
move constructor - A constructor with a single parameter that takes in an rvalue reference to an object of the same type (e.g. T(T &&) { ... }
). A move constructor is used to specify the move semantics for that class.
move assignment - An assignment operator overload that moves one object into another (e.g. x = std::move(y)
).
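Moves are easiest to observe with std::unique_ptr, since it guarantees that a moved-from pointer is left as nullptr. The function name below is hypothetical.

```cpp
#include <memory>
#include <utility>

bool move_leaves_source_empty() {
    std::unique_ptr<int> a { std::make_unique<int>(5) };
    std::unique_ptr<int> b { std::move(a) }; // move constructor: b takes over a's int
    std::unique_ptr<int> c {};
    c = std::move(b);                        // move assignment: c takes over b's int
    return a == nullptr && b == nullptr && c != nullptr && *c == 5;
}
```

Most other types leave their moved-from objects in a valid but unspecified state, so only destruction or re-assignment is safe to rely on.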
value categories - A classification hierarchy for C++ expressions. Any C++ expression falls into one of the following categories: lvalue, xvalue, or prvalue.
The intent of this hierarchy is to enable the moving of objects. In this case, moving doesn't mean copying. It means gutting out the contents of one object and moving it into another object.
prvalue - An expression that, once evaluated, is a transient / temporary object.
(x + 51) / n // this is a prvalue (the result is temporary, needing to go somewhere)
x // this is NOT a prvalue (the result of x is just x -- it's an existing object)
lvalue - An expression that, once evaluated, is an addressable object (NOT transient / NOT temporary / the address-of operator is usable on it).
(x + 51) / n // this is NOT an lvalue (the result is temporary, needing to go somewhere)
x // this is an lvalue (the result of x is just x -- it's an existing object)
xvalue - An expression that, similar to lvalue, is an addressable object. But, unlike lvalue, the object is marked as being near the end of its lifetime.
⚠️NOTE️️️⚠️
See the expression categories for more information.
variable length array - A feature of C99 that allows for declaring an automatic storage duration array whose length is determined at runtime (non-constant length). This feature is not available in C++ because C++ provides higher-level abstractions for collections of objects in its STL (speculation).
void test(int n) {
int x[n]; // okay in C99, but not in C++
}
rvalue reference - A data type that's more-or-less the same as a reference but conveys to the compiler that the data it's pointing to is an rvalue (e.g. MyType &&rref { std::move(y) }
).
virtual method - A method in a base class that is overridable by any class that inherits from that base class.
struct MyParent {
...
virtual int v2() {
return this->x + this->y;
}
};
pure virtual method - A virtual method that requires an implementation (no implementation has been provided by the base class that declares it). For a class to be instantiable, it cannot have any pure virtual methods (similar to an abstract class in Java).
struct MyParent {
...
virtual int v2() = 0;
...
};
pure virtual class - A class that only contains pure virtual methods.
struct MyParent {
virtual int v1() = 0;
virtual int v2() = 0;
virtual ~MyParent() {}; // also okay to do "virtual ~MyParent() = default"
};
virtual destructor - A destructor that's a virtual method.
struct MyStruct : MyParent {
...
virtual ~MyStruct() {
// do some cleanup here
}
};
vtable - A table of pointers to virtual functions, generated by the compiler. When a virtual function gets invoked (runtime) vtables are used to determine which method implementation to use.
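The runtime selection can be observed by calling a virtual function through a reference to the base class. Shape, Triangle, and sides_via_base are hypothetical names for this sketch.

```cpp
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const { return 0; }
};

struct Triangle : Shape {
    int sides() const override { return 3; }
};

// The static type of s is Shape, but the implementation that runs
// depends on the dynamic type of the object that s refers to.
int sides_via_base(const Shape& s) {
    return s.sides();
}
```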
template - A class or function where parts of the code are intended for substitution (by other code). At compile-time, a user supplies a set of substitutions for each usage of a template, customizing it for the specific use-case that user is dealing with.
template <typename X, typename Y, typename Z>
X add(Y y, Z z) {
return y + z;
}
template parameter - An identifier within the template. At compile time, any time a template is used its template parameters are substituted with code that the usage supplies.
A template parameter may be used multiple times throughout the template. At compile-time, each usage is substituted with the same piece of code.
// X, Y, Z, and N are template parameters
template <typename X, typename Y, typename Z, int N>
struct MyClass {
X perform(Y &var1, Z &var2) {
return (var1 + var2) * N;
}
};
template instantiation - The process of substituting the template parameters in a template with real code.
MyClass<float, int, int, 2> obj {}; // X = float, Y = int, Z = int, N = 2
int a {5}, b {3}; // perform() takes references, so literals can't be passed in directly
float x { obj.perform(a, b) };
named conversion - A set of language features / functions used for converting types (casting): const_cast
, static_cast
, reinterpret_cast
, and narrow_cast
.
concept - A compile-time check to ensure that the type substituted for a template parameter matches a set of requirements (e.g. the type supports certain operators).
// concept
template <typename T1, typename T2, typename TR>
concept MyConcept = std::is_default_constructible<T1>::value
&& std::is_default_constructible<T2>::value
&& requires(T1 a, T2 b) {
{ a + b } -> std::same_as<TR>;
{ a * b } -> std::same_as<TR>;
};
// usage of concept
template <typename T1, typename T2>
requires MyConcept<T1, T2, T1>
T1 add_and_multiply(T1 &var1, T2 &var2) {
return (var1 + var2) * var2;
}
compile-time - Used in reference to something that happens during the compilation process.
runtime - Used in reference to something that happens when the compiled program is running.
zero-arg - Short for zero argument. A function with zero parameters.
parameter pack - In the context of templates, a parameter pack is a single template parameter declaration that can take in zero or more substitutions (variadic).
template <typename X, typename... R>
X create(R... args) {
return X {args...};
}
variadic - A function that takes in a variable number of arguments, sometimes also called varargs.
float avg(size_t n, ...) {
va_list args;
va_start(args, n);
float sum {0};
for (size_t i {0}; i < n; i++) {
sum += va_arg(args, double); // float arguments are promoted to double when passed through ...
}
va_end(args);
return sum / n;
}
template specialization - Given a specific set of substitutions for the template parameters of a template, a template specialization is code that overrides the template generated code. Oftentimes template specializations are introduced because they're more memory or computationally efficient than the standard template generated code.
// template
template<typename T>
T sum(T a, T b) {
return a + b;
}
// template specialization for bool: bitwise or
template<>
bool sum<bool>(bool a, bool b) {
return a | b;
}
partial template specialization - A template specialization where not all of the template parameters have been removed.
// template
template<typename R, typename T>
struct MyClass {
R sum(T a, T b) {
return a + b;
}
};
// template specialization for pointers of unknown type: always return false
template<typename X>
struct MyClass<bool, X*> {
bool sum(X * a, X* b) {
return false;
}
};
partial template - A template with some of its template parameters set (not all).
// declare
template <typename Y, typename Z>
using MyClassPartialTemplate = MyClass<float, Y, Z, 42>;
// use
MyClass<float, int, int, 42> x{};
MyClassPartialTemplate<int, int> y{}; // same type as previous line
default template argument - The default substitute in use for a template parameter.
template <typename X, typename Y = long, typename Z = long>
X perform(Y &var1, Z &var2) {
return var1 + var2;
}
heap - An implementation-specific block of memory used for dynamic objects. Also called the free store.
implicit type conversion - When an object of a certain type is converted automatically, without code explicitly changing the object to a different type (e.g. long x {1}
implicitly converts the int
literal in the initializer to the long
type).
explicit type conversion - When an object of a certain type is explicitly converted to another type: casting and named conversions.
promotion rule - An implicit type conversion that may occur when an operator's operands are of differing integral and floating point types. For example, adding an integral type with a smaller integral type will cause the result to be of the same type as the larger type.
int x {5};
long y {5L};
auto z {x + y}; // z will be long
narrowing conversion - When an object of a certain type is truncated to a lesser type (e.g. int
to short
).
Narrowing conversions may be implicit during object initialization. To catch erroneous cases of narrowing, use braced initialization to force the compiler to generate a warning.
constant expression - A function that may be evaluated at compile-time (when its arguments are known at compile-time), such that at run-time any such invocation simply returns the result computed at compile-time. Constant expressions are represented as functions prefixed with constexpr
.
constexpr int test(int n) {
return n % 2;
}
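Whether a call was evaluated at compile-time can be checked with static_assert, which only accepts values known at compile-time. The names remainder_of_two and at_runtime are hypothetical, chosen for this sketch.

```cpp
constexpr int remainder_of_two(int n) {
    return n % 2;
}

// Compiles only because the call below is evaluated at compile-time.
static_assert(remainder_of_two(7) == 1);

// The same function also works with values only known at run-time.
int at_runtime(int n) {
    return remainder_of_two(n);
}
```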
immediate function - A function that gets evaluated at compile-time and must produce a compile-time constant. Immediate functions are represented as functions prefixed with consteval
.
consteval int test(const int n) {
return n % 2;
}
literal type - A type that's usable in a constant expression (for parameters and return), meaning that objects of this type can have a value that's knowable at compile-time (e.g. nullptr
).
volatile - A volatile variable's usage in code is immune to compiler optimizations such as operation re-ordering and removal. Mutations and accesses, no matter how irrelevant they may seem, are kept in-place and in-order by the compiler.
volatile int x {5};
x = 5;
x = 6;
x = 7;
type alias - A synonym (different name) for an existing type.
using BasicGraph = DirectedGraph::Graph<std::string, std::map<std::string, std::string>, std::string, std::map<std::string, std::string>>;
BasicGraph removeLimbs(const BasicGraph &g);
attribute - A tag applied to code that provides information to the user / compiler about whatever it is that it's applied to. Similar to Java annotations.
if (x == 0) [[likely]] {
return x + y;
} else {
report_error();
}
iterator - A type used to access elements within some sequence (e.g. array, class representing a list, class representing an infinite stream of int
s, etc..). An iterator requires a specific set of operators to be implemented, where those operators function similar to accessing memory using pointer arithmetic / arrays.
MyIterator it {collection.begin()};
while (it != collection.end()) {
MyObject value {*it};
// do something with value here
++it;
}
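To make "a specific set of operators" concrete, here is a minimal sketch of a sequence type with its own iterator. IntRange and sum_range are hypothetical names, and the iterator implements only the three operators a range-based for loop needs (it isn't a complete standard-conforming iterator):

```cpp
#include <cassert>

// Hypothetical sequence type: the integers in [first, last).
class IntRange {
public:
    class Iterator {
    public:
        explicit Iterator(int value) : value_{value} {}
        int operator*() const { return value_; }            // access current element
        Iterator &operator++() { ++value_; return *this; }  // move to next element
        bool operator!=(const Iterator &other) const { return value_ != other.value_; }
    private:
        int value_;
    };

    IntRange(int first, int last) : first_{first}, last_{last} {}
    Iterator begin() const { return Iterator{first_}; }
    Iterator end() const { return Iterator{last_}; }
private:
    int first_;
    int last_;
};

inline int sum_range(const IntRange &range) {
    int total {0};
    for (int value : range) { // uses begin(), end(), !=, ++, and *
        total += value;
    }
    return total;
}
```

The range-based for loop is shorthand for the explicit while loop shown earlier.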
Five types of iterators exist: input, output, forward, bidirectional, and random access.
modifier - Optional marker that alters a function. With functions, a modifier may be required to go either before the return type (prefix modifier) or after the parameter list (suffix modifier).
//                    modifier here
//                    vvvvvvvv
int add(int x, int y) noexcept {
return x + y;
}
fold expression - Exhaustively applies a binary operator to the contents of a parameter pack and returns the final result.
template<typename... R>
auto test(R... args) {
auto l_ass_res {(... - args)}; // ((((a-b)-c)-d)-...)
auto r_ass_res {(args - ...)}; // (...-(w-(x-(y-z))))
return l_ass_res + r_ass_res;
}
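A small worked example of how the two fold directions differ for a non-associative operator such as subtraction. The function names are hypothetical:

```cpp
#include <cassert>

template<typename... Ts>
constexpr auto left_fold_minus(Ts... args) {
    return (... - args);  // ((a - b) - c)
}

template<typename... Ts>
constexpr auto right_fold_minus(Ts... args) {
    return (args - ...);  // (a - (b - c))
}

inline void fold_demo() {
    assert(left_fold_minus(10, 3, 2) == 5);   // (10 - 3) - 2
    assert(right_fold_minus(10, 3, 2) == 9);  // 10 - (3 - 2)
}
```

Note that the parentheses around a fold expression are part of its syntax and can't be omitted.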
associativity - In the context of binary operators, associativity refers to the order in which an expression with a chain of the same binary operator is evaluated. The term ...
left associative means that the chain is evaluated left-to-right (left-most first, right-most last).
a ? b ? c ? d == (((a ? b) ? c) ? d)
right associative means that the chain is evaluated right-to-left (right-most first, left-most last).
a ? b ? c ? d == (a ? (b ? (c ? d)))
function pointer - A pointer to a function.
int add(int a, int b) {
return a + b;
}
int (*p)(int, int) {add};
p(1, 2); // invoke
functor - A class that you can invoke as if it were a function because it has an operator overload for the function-call operator.
struct MyFunctor {
int operator()(int y) const { return -y + x; }
private:
int x {5};
};
function call operator - The operator used for making function calls (parentheses). It may be overloaded on classes to turn them into functors.
int operator()(int y) const { return -y + x; }
lambda - Shorthand expression for an unnamed functor.
auto f = [] (int z) -> int { return -z; };
named capture - Pulling in objects from the outer scope into a lambda by explicitly listing their names in the capture clause, adding &
before each name if wanting to pull it in by reference rather than by copy.
auto f = [&x, &y] (int z) -> int { return x + y + z; }; // x and y from outer scope
default capture - Pulling in objects from the outer scope into a lambda automatically (based on their usage) by putting either an =
(for copying into lambda) or &
(for referencing into lambda) in the capture clause.
auto f = [=] (int z) -> int { return x + y + z; };
init capture - An initializer expression used as a lambda named capture.
auto f = [new_x=x/2, &y] (int z) -> int { return new_x + y + z; };
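The three capture styles can be compared side by side. The sketch below (all names are hypothetical) changes the outer variables after the lambdas are created, to show which lambdas hold copies and which hold references:

```cpp
#include <cassert>

inline void capture_demo() {
    int x {1};
    int y {2};

    auto by_copy = [x] (int z) -> int { return x + z; };          // named capture, by copy
    auto by_ref = [&y] (int z) -> int { return y + z; };          // named capture, by reference
    auto all_by_copy = [=] (int z) -> int { return x + y + z; };  // default capture, by copy
    auto with_init = [half_y = y / 2] (int z) -> int { return half_y + z; };  // init capture

    x = 10;
    y = 20;

    assert(by_copy(0) == 1);      // still sees the x that was copied
    assert(by_ref(0) == 20);      // sees the updated y
    assert(all_by_copy(0) == 3);  // copies of x and y taken at creation
    assert(with_init(0) == 1);    // 2 / 2, computed at creation
}
```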
callable object - An object that can be invoked: functor, or lambda. A function isn't considered an object and as such it doesn't qualify as a callable object. However, many documents online seem to include functions under the umbrella of callable object.
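In practice, code that accepts "something invocable" often doesn't distinguish between these cases. As a sketch, std::function can hold a functor, a lambda, or a pointer to a plain function. The names negate_fn, Negate, and call_with are hypothetical:

```cpp
#include <cassert>
#include <functional>

int negate_fn(int y) { return -y; }  // plain function

struct Negate {                      // functor
    int operator()(int y) const { return -y; }
};

inline int call_with(const std::function<int(int)> &callable, int value) {
    return callable(value);
}

inline void callable_demo() {
    auto negate_lambda = [] (int y) -> int { return -y; };  // lambda

    assert(call_with(negate_fn, 5) == -5);      // function decays to a function pointer
    assert(call_with(Negate{}, 5) == -5);
    assert(call_with(negate_lambda, 5) == -5);
}
```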
function overload - A function that has the same name as another function within the same scope, but a different parameter list.
bool test(int a) { return a != 0; }
bool test(double a) { return a != 0.0; }
operator overload - A function that gets invoked when a certain operator is used with some specific class. The function can be either a free function or a member function of the class the operator is intended for.
struct MyClass {
int operator()(int y) const { return -y + x; } // function call operator
...
};
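A sketch contrasting the two forms, using a hypothetical Point type: operator+= is overloaded as a member function, while operator+ and operator== are overloaded as free functions.

```cpp
#include <cassert>

struct Point {
    int x {0};
    int y {0};

    // Member function overload: the left operand is *this.
    Point &operator+=(const Point &other) {
        x += other.x;
        y += other.y;
        return *this;
    }
};

// Free function overloads: both operands are parameters.
inline Point operator+(Point lhs, const Point &rhs) {
    lhs += rhs;
    return lhs;
}

inline bool operator==(const Point &lhs, const Point &rhs) {
    return lhs.x == rhs.x && lhs.y == rhs.y;
}
```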
forward declaration - To use a function, class, variable, etc.. within some C++ code, only its declaration is needed, not its definition (implementation). The compiler will ensure that the usage points to the implementation when the time comes.
The compiler needs this to handle cyclical references. It can also significantly reduce build times.
class MyClassA; // forward declaration of MyClassA
class MyClassB; // forward declaration of MyClassB
int myFunction(MyClassA &objA, MyClassB &objB); // forward declaration of a function
// implement myFunction, using MyClassA and MyClassB before implementation is defined
int myFunction(MyClassA &objA, MyClassB &objB) {
...
}
// implement MyClassA, using MyClassB before implementation is defined (only a pointer or reference to it is allowed at this point)
class MyClassA {
...
private:
MyClassB *objB;
};
// implement MyClassB
class MyClassB {
...
private:
MyClassA objA;
};
user-defined literal - A literal suffix defined by a user, where when that suffix is applied to some literal, some computation is performed.
Distance d {42.0_km}; // the suffix _km converts the literal 42.0 to an instance of the Distance type
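One way the _km suffix above might be defined is sketched below. The Distance type and its meters field are assumptions made for illustration:

```cpp
#include <cassert>

// Hypothetical type: a distance stored in meters.
struct Distance {
    double meters;
};

// Literal suffix _km: converts a floating point literal in kilometers to a Distance.
constexpr Distance operator""_km(long double value) {
    return Distance{static_cast<double>(value) * 1000.0};
}
```

User-defined suffixes must start with an underscore. Suffixes without one are reserved for the standard library.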
module unit - A translation unit that contains a module declaration.
export module MyModule; // module declaration
export int add(int a, int b) {
return a + b;
}
three-way comparison operator - Given two objects a
and b
, the three-way comparison operator determines if a < b
, a == b
, or a > b
.
The symbol for the operator is an equal-sign sandwiched between angle brackets: a <=> b
. This operator is sometimes called the spaceship operator because it's said that the symbol for the operator looks like a spaceship.
smart pointer - A class that wraps a pointer to a dynamically allocated object. The class provides some level of automated pointer management / memory management through the use of move semantics, copy semantics, and RAII.
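The sketch below uses the standard library's std::unique_ptr and std::shared_ptr to show move semantics, copy semantics, and RAII at work. Tracker is a hypothetical class that just counts how many instances are alive:

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical class: increments a counter on creation, decrements on destruction.
struct Tracker {
    explicit Tracker(int &live_count) : live_count_{live_count} { ++live_count_; }
    ~Tracker() { --live_count_; }
private:
    int &live_count_;
};

inline void smart_pointer_demo() {
    int live {0};
    {
        std::unique_ptr<Tracker> owner {std::make_unique<Tracker>(live)};
        assert(live == 1);

        std::unique_ptr<Tracker> new_owner {std::move(owner)};  // move semantics
        assert(owner == nullptr);  // ownership was transferred
        assert(live == 1);
    }  // RAII: new_owner's destructor deletes the Tracker here
    assert(live == 0);

    std::shared_ptr<Tracker> first {std::make_shared<Tracker>(live)};
    std::shared_ptr<Tracker> second {first};  // copy semantics: shared ownership
    assert(first.use_count() == 2);
    first.reset();
    assert(live == 1);  // second still keeps the object alive
    second.reset();
    assert(live == 0);  // last owner gone, object deleted
}
```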
namespace - C++'s mechanism of organizing code into a logical hierarchy / avoiding naming conflicts, similar to packages in Java or Python.
unnamed namespace - A special namespace that limits the visibility of the code to the containing translation unit, meaning that code can't be referenced at all outside of the translation unit.
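A brief sketch of both kinds of namespace. All names here (geometry, detail, counted_area, and so on) are hypothetical:

```cpp
#include <cassert>

namespace geometry {      // named namespace
    namespace detail {    // nested namespace
        inline int square(int v) { return v * v; }
    }
    inline int area_of_square(int side) { return detail::square(side); }
}

namespace {               // unnamed namespace: visible only in this translation unit
    int helper_call_count {0};

    int counted_area(int side) {
        ++helper_call_count;
        return geometry::area_of_square(side);
    }
}
```

Code in other translation units can call geometry::area_of_square, but has no way to refer to counted_area or helper_call_count.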
universal reference - A function template parameter of the form T&& that causes the template to automatically create overloads based on whether the argument passed in for that parameter is an lvalue or an rvalue.
// TEMPLATE where the parameter x is a universal reference
template<typename T>
void test(T && x) {
if (x % 2 == 0) {
vector.push_back(std::forward<T>(x)); // forward based on the reference type
}
}
// When the type is an int, the above template expands to the following two overloads ...
void test(int & x) {
if (x % 2 == 0) {
vector.push_back(x); // calls push_back(int &x) / push_back(int x)
}
}
void test(int && x) {
if (x % 2 == 0) {
vector.push_back(std::move(x)); // calls push_back(int &&x)
}
}
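To observe the forwarding behavior directly, the sketch below routes arguments through a universal reference into a hypothetical Sink type that counts whether its copy overload or its move overload was selected:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sink with one overload per value category.
struct Sink {
    std::vector<std::string> items;
    int copies {0};
    int moves {0};

    void add(const std::string &s) { items.push_back(s); ++copies; }
    void add(std::string &&s) { items.push_back(std::move(s)); ++moves; }
};

template<typename T>
void add_to(Sink &sink, T &&value) {   // value is a universal reference
    sink.add(std::forward<T>(value));  // preserves lvalue-ness / rvalue-ness
}

inline void forwarding_demo() {
    Sink sink;
    std::string name {"hello"};

    add_to(sink, name);                  // lvalue: T is deduced as std::string &
    add_to(sink, std::string{"world"});  // rvalue: T is deduced as std::string

    assert(sink.copies == 1);
    assert(sink.moves == 1);
    assert(name == "hello");  // untouched, because it was copied rather than moved
}
```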
special member function - A member function for which the compiler automatically generates a default implementation if it's used but not explicitly implemented. Each of the following is considered a special member function: default constructor, copy constructor, move constructor, copy assignment operator, move assignment operator, and destructor.
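As a sketch, the hypothetical Person class below declares none of the special member functions, yet all six are usable because the compiler generates them:

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical class that declares none of the special member functions.
struct Person {
    std::string name;
    int age {0};
};

static_assert(std::is_default_constructible_v<Person>);
static_assert(std::is_copy_constructible_v<Person>);
static_assert(std::is_move_constructible_v<Person>);
static_assert(std::is_copy_assignable_v<Person>);
static_assert(std::is_move_assignable_v<Person>);
static_assert(std::is_destructible_v<Person>);

inline void special_members_demo() {
    Person a;                 // default constructor
    a.name = "Ann";
    a.age = 30;

    Person b {a};             // copy constructor
    Person c;
    c = a;                    // copy assignment operator
    Person d {std::move(a)};  // move constructor
    Person e;
    e = std::move(b);         // move assignment operator

    assert(c.name == "Ann" && c.age == 30);
    assert(d.name == "Ann" && d.age == 30);
    assert(e.name == "Ann" && e.age == 30);
}                             // destructors run here
```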
equivalence - Equivalence and equality are different mechanisms for comparing two objects for same-ness. When two objects A and B are ...
equal, A and B are indistinguishable (A can be substituted for B and vice-versa without any side effects).
equivalent, A and B may or may not be indistinguishable, but are considered to be the same under certain criteria.
For example, two strings "hello world" and "HELLO WORLD" are not equal, but would be considered equivalent if the criteria were that case is to be ignored.
⚠️NOTE️️️⚠️
The C++ standard library separates the idea of equality and equivalence by assuming the equality operator (==) only tests for equality.
Equality is tested using a == b
Equivalence is tested using !(a < b) && !(b < a) (or !(a > b) && !(b > a), or some other relation that doesn't use ==)
See here.
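The distinction shows up in ordered containers such as std::set, which decide whether two elements are "the same" using the comparator rather than ==. The sketch below assumes a hypothetical case-insensitive comparator, CaseInsensitiveLess:

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <set>
#include <string>

// Hypothetical "less than" that ignores case.
struct CaseInsensitiveLess {
    bool operator()(const std::string &a, const std::string &b) const {
        return std::lexicographical_compare(
            a.begin(), a.end(), b.begin(), b.end(),
            [] (unsigned char l, unsigned char r) { return std::tolower(l) < std::tolower(r); });
    }
};

// Equivalence expressed without ==: neither object orders before the other.
inline bool is_equivalent(const std::string &a, const std::string &b) {
    CaseInsensitiveLess is_less;
    return !is_less(a, b) && !is_less(b, a);
}

inline void equivalence_demo() {
    std::string lower {"hello world"};
    std::string upper {"HELLO WORLD"};

    assert(!(lower == upper));            // not equal
    assert(is_equivalent(lower, upper));  // but equivalent under the comparator

    std::set<std::string, CaseInsensitiveLess> words;
    words.insert(lower);
    words.insert(upper);                  // rejected: equivalent to an existing element
    assert(words.size() == 1);
}
```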
strong ordering - A form of ordering supported by the spaceship operator that, when A == B
, guarantees that A
can be substituted for B
(and vice-versa) without any side effects. The objects are considered indistinguishable, and therefore are considered both equal and equivalent.
When the spaceship operator determines that the type supports strong ordering, its return type is std::strong_ordering
.
std::strong_ordering::less
std::strong_ordering::equal
std::strong_ordering::equivalent // same as equal
std::strong_ordering::greater
weak ordering - A form of ordering supported by the spaceship operator that, when A == B
, doesn't assume that A
can be substituted for B
(and vice-versa). The objects are not guaranteed to be equal (indistinguishable from each other), but they are considered equivalent. An example of weak ordering is when an object encapsulating a text string does a case-insensitive test for equality (==). The strings "hello world"
and "HELLO WORLD"
will be considered the same, but one can't necessarily be substituted for the other.
When the spaceship operator determines that the type can only support weak ordering, its return type is std::weak_ordering
.
std::weak_ordering::less
std::weak_ordering::equivalent
std::weak_ordering::greater
partial ordering - A form of ordering supported by the spaceship operator that is essentially weak ordering but with the caveat that the objects being compared may not be comparable at all. For example, the floating point number 5.5
is not comparable at all to NaN
:
5.5 == NaN is false.
5.5 < NaN is false.
5.5 > NaN is false.
When the spaceship operator determines that the type can only support partial ordering, its return type is std::partial_ordering.
std::partial_ordering::less
std::partial_ordering::equivalent
std::partial_ordering::greater
std::partial_ordering::unordered