Near-duplicate features of C++
A large collection of features within the C++ programming language have very similar functionality. Much of this is due to features being inherited straight from C, then new native C++ features being added as alternatives.
Knowing which feature to use in what situation can be a continual learning struggle. Each team might use a different subset of features, which can cause needless confusion and stylistic variation. As new features get added to the C++ language or as awareness of existing features increases, it’s sometimes necessary to revisit old codebases and revise them to comply with modern practices.
Finally, although this state of affairs is an unchangeable fact about C++, other major programming languages have far less duplication. Don’t automatically assume that C++’s situation is desirable, necessary, or inevitable.
Abstract feature | Features in C | Features in C++ |
---|---|---|
File extension | | |
Single include | | (No new feature in C++) |
Null pointer value | | |
Minimum integer width | | (No new feature in C++) |
Operator token | | (No new feature in C++) |
Reference type | | |
Reference passing | | |
Type aliasing | | |
Type specifier | | |
Type conversion | | |
Variable initializer | | |
Nullary function | | |
Nullary constructor call | (Unavailable in C) | |
Optional argument | | |
Return type | | |
Compound data | | |
Field initialization | | |
Member hiding | | |
Namespacing | (Unavailable in C) | |
Generic code | | |
Type parameter | (Unavailable in C) | |
Variable function | | |
Compile-time evaluation | | |
Method definitions | (Unavailable in C) | |
Heap allocation | | |
Heap deallocation | | |
Character string | | |
Sequence of values | | |
Access into sequence | | |
Array filling | | |
Array copying | | |
C standard library header | | |
I/O library | | |
Random number generation | | |
Exception handling | | |
- File extension
-
Many pieces of C source code (but not all) can be compiled in C++ mode without modifications – so .c is a valid file extension for C++ files. Many pure-C++ header files still use a .h extension (same extension as C header files) instead of a C++-specific extension.
As for the C++-specific file extensions, generally .cpp and .cc can be found in the wild. The other alternatives are rare. Unlike other major languages, there is no standardization on file extensions. Surprisingly, C doesn’t suffer from this because C files are universally named as .c or .h.
- Single include
-
There are at least two ways to ensure that any given header file is included at most once. The standards-compliant but cumbersome and brittle way is to use an #ifndef + #define + body + #endif construct. It requires 3 lines of code, and the defined constant needs to be manually synchronized with the file name. The convenient and widely supported (but technically non-standard) way is to simply write #pragma once at the top. If the chosen C/C++ compiler doesn’t support this, it’s not hard to write a script that replaces every header file’s #pragma once with an auto-generated #ifndef guard.
- Null pointer value
-
In C, a null pointer is written as 0, and NULL is simply a macro constant that typically expands to 0 or ((void*)0). The nullptr keyword introduced in C++11 is much more type-safe and less ambiguous in overloads. Always use nullptr instead of the old NULL or 0.
(More info: C++ for Arduino: Why C++ programmers don’t use NULL?, mina86.com: 0 is ambiguous)
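The overload ambiguity mentioned above can be seen in a small sketch. The function names here (describe, with_zero, with_nullptr) are hypothetical and exist only for illustration:

```cpp
#include <string>

// Hypothetical overload set: one overload takes an integer, the other a pointer.
std::string describe(int) { return "int"; }
std::string describe(const char *) { return "pointer"; }

// The literal 0 is an int first, so the integer overload wins.
std::string with_zero() { return describe(0); }

// nullptr only converts to pointer types, so the pointer overload wins.
std::string with_nullptr() { return describe(nullptr); }
```

Calling describe(NULL) is left out on purpose: depending on how the implementation defines NULL, it may select the integer overload or fail to compile as ambiguous.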
- Minimum integer width
-
The C and C++ standards make a number of guarantees on the bit widths of basic integer types, such as: short and int are at least 16 bits, long is at least 32 bits, width(char) ≤ width(short) ≤ width(int), et cetera. C99 and C++11 introduce the stdint.h header, which defines explicitly sized types like int_least16_t. Because the simple int type is already guaranteed to have at least 16 bits, we might as well use it instead of the fancier type name.
- Operator token
-
The C language uses characters such as | and ~, and can support non-ASCII character sets. Some characters used in the language are absent from certain character sets, so alternate spellings of certain operators and tokens were added to the language. In C, these synonyms are activated by including iso646.h, whereas in C++ these synonyms are a mandatory part of the language. One consequence is that you cannot name a variable or function as and, or, not, etc. Other consequences are that the feature can lead to style disagreements or can be abused for code obfuscation.
(More info: cppreference.com: Alternative operator representations)
- Reference type
-
Example using a pointer:
    int x;
    int *ptr = &x;
    *ptr = 2;
Example using a reference:
    int x;
    int &ref = x;
    ref = 2;
Both pieces of example code above behave identically. Internally, the reference is typically implemented as a pointer. Some key differences are that a reference is never nullptr, a reference cannot be reseated to refer to a different object, and a reference cannot be indexed/subscripted – a pointer can do all three of these things, but the functionality is often unneeded. References are essentially restricted pointers, and don’t really add new features (except possibly for checking nullptr at the time of assignment instead of at the time of reading/writing the value). References do reduce the syntactic burden where you continually write * to dereference a pointer. References tend to be more useful and idiomatic in C++ than pointers, but pointers are still indispensable for some tasks.
- Reference passing
-
Passing a raw value by reference requires no symbol at the call site, whereas passing by pointer does. While this is convenient, it can easily hide the fact that another function can change the value of a variable even though no function has a pointer to the variable.
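A minimal sketch of the call-site difference described above; the helper names are hypothetical:

```cpp
// Hypothetical helpers that both add one to the caller's variable.
void increment_by_pointer(int *value) { *value += 1; }
void increment_by_reference(int &value) { value += 1; }

int demo_reference_passing() {
    int counter = 0;
    increment_by_pointer(&counter);   // The & hints that counter may be modified
    increment_by_reference(counter);  // Looks like pass-by-value, yet counter changes
    return counter;
}
```

Both calls modify counter, but only the pointer version shows this at the call site.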
- Type aliasing
-
C++11 introduces a new way to create type aliases. The new way uses a different keyword, and the ordering of the tokens is arguably more natural, especially for complex types such as arrays and functions.
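As a sketch of the two spellings, the following declares the same function-pointer and array types with typedef and with the C++11 using keyword. All alias and function names are made up for this example:

```cpp
#include <type_traits>

// Old style (C and C++): the alias name sits in the middle of the declarator.
typedef int (*OldCallback)(double);
typedef int OldRow[4];

// New style (C++11): alias name on the left, aliased type on the right.
using NewCallback = int (*)(double);
using NewRow = int[4];

// Both spellings name exactly the same types.
static_assert(std::is_same<OldCallback, NewCallback>::value, "same function pointer type");
static_assert(std::is_same<OldRow, NewRow>::value, "same array type");

// Hypothetical function used to show that the aliases are interchangeable.
int truncate_to_int(double x) { return static_cast<int>(x); }
NewCallback pick_callback() { return truncate_to_int; }
```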
- Type specifier
-
When using an enum/struct/union type in C, you need to include that keyword, unless the type was typedef’d. In C++ (also applies to class), the keyword can be omitted if the name is unambiguous, which is almost always the case.
(More info: Stack Overflow: Why write a “class” in the declaration of a class object?, Stack Overflow: Why does C need “struct” keyword and not C++?)
- Type conversion
-
C++ exploded the number of ways to convert between types. For primitive types, if the old C cast of (Type)val is valid, then the constructor notations (officially called function-style casts) of Type(val) (all C++ versions) and Type{val} (since C++11) are valid too. For example:
    bool a = (...);
    int b = (...);
    long c = (...);
    float d = (...);
    char e = char(a);  // From bool
    short f = short(b);  // From int
    typedef long long LL;  // Need this for multi-token types
    long long g = LL(c);  // Can't just write: long long(c)
    double h = double{d};  // Introduced in C++11
The various language-level cast operators cover conversions on integers, constness, primitive pointers, object pointers, etc.: static_cast, const_cast, reinterpret_cast, dynamic_cast.
For structs and classes, a unary constructor without the explicit designation can be used as an implicit cast:
    class Foo {
      public:
        Foo(int x) {}
        explicit Foo(char *y) {}
    };
    int a = (...);
    char *b = (...);
    Foo c = a;  // OK
    Foo d = b;  // Compile-time error
(More info: Bjarne Stroustrup’s C++ Style and Technique FAQ - What good is static_cast?)
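The four named cast operators listed above can be sketched side by side. The types and helper functions below are hypothetical, and each function shows only one typical use of its cast rather than a complete picture:

```cpp
#include <cstdint>

struct Shape { virtual ~Shape() {} };
struct Circle : Shape { int radius = 3; };
struct Square : Shape { int side = 4; };

// static_cast: ordinary value conversion, checked at compile time.
int to_int(double x) { return static_cast<int>(x); }

// const_cast: removes constness; only safe if the object is not really const.
void clear_first(const char *text) { const_cast<char *>(text)[0] = '\0'; }

// dynamic_cast: run-time checked downcast; yields nullptr on a type mismatch.
int radius_or_zero(Shape *shape) {
    Circle *circle = dynamic_cast<Circle *>(shape);
    return circle != nullptr ? circle->radius : 0;
}

// reinterpret_cast: converts a pointer to an integer and back again.
bool round_trips(int *p) {
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<int *>(bits) == p;
}
```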
- Variable initializer
-
A variable can be initialized in 3 possible ways, with different semantics with respect to which constructor is called, the assignment operator, and variable-length lists:
    Foo x = w;  // C style
    Foo y(w);  // C++ style
    Foo z{w};  // C++11 and above
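Two of the semantic differences can be sketched with std::vector and a narrowing conversion. This is an illustration with hypothetical function names, not an exhaustive list of the differences:

```cpp
#include <cstddef>
#include <vector>

// Parentheses select the (count, value) constructor: three elements, each 1.
std::size_t size_with_parentheses() {
    std::vector<int> v(3, 1);
    return v.size();
}

// Braces prefer the initializer-list constructor: two elements, 3 and 1.
std::size_t size_with_braces() {
    std::vector<int> v{3, 1};
    return v.size();
}

// Braces also reject narrowing conversions that the other two forms accept.
int narrowing_example() {
    double d = 2.9;
    int a = d;     // Accepted: silently truncates to 2
    int b(d);      // Accepted: silently truncates to 2
    // int c{d};   // Ill-formed: narrowing conversion from double to int
    return a + b;
}
```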
- Nullary function
-
In C++, these two constructs are synonyms, and the simpler form with () is preferred over (void). In C, the form with (void) means that the function must take no arguments, whereas the form with () has complicated semantics that can lead to subtle errors; hence the form with (void) is strongly recommended in C.
(More info: Stack Overflow: Is there a difference between foo(void) and foo() in C++ or C?, Stack Overflow: Is it better to use C void arguments “void foo(void)” or not “void foo()”?)
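For the C++ half of this statement, a short sketch with hypothetical function names can confirm at compile time that both spellings declare the same function type:

```cpp
#include <type_traits>

int answer_plain() { return 42; }     // Preferred spelling in C++
int answer_void(void) { return 42; }  // C-compatible spelling

// In C++ both declarations have the identical function type int().
static_assert(std::is_same<decltype(answer_plain), decltype(answer_void)>::value,
              "() and (void) declare the same type in C++");
static_assert(std::is_same<decltype(answer_plain), int()>::value,
              "the type is a function taking no arguments");
```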
- Nullary constructor call
-
When creating an object on the heap with new and calling a zero-argument constructor, there are 3 possible notations, with the last two being semantically equivalent:
    Foo *u = new Foo;
    Foo *v = new Foo();
    Foo *w = new Foo{};
When creating an object on the stack and calling a zero-argument constructor, the parentheses option is not available because that would declare a function prototype instead:
    Foo x;  // OK
    Foo y();  // Different meaning
    Foo z{};  // OK
Now consider these class definitions:
    // POD (plain old data) type
    class A { public: int i; };
    // Non-POD type, and compiler provides default constructor
    class B { public: int i; ~B() {} };
    // Explicit constructor without initialization
    class C { public: int i; C() {} };
    // Explicit constructor with initialization
    class D { public: int i; D() { i=1; } };
If we create an object of each type without parentheses/braces (e.g. A *p = new A;), then:
  - an object of type A will have i uninitialized.
  - an object of type B will have i uninitialized.
  - an object of type C will have the C() constructor called and i uninitialized.
  - an object of type D will have the D() constructor called and i initialized to 1.
Whereas if we create an object of each type with parentheses/braces (e.g. A *p = new A();):
  - an object of type A will have i value-initialized to 0.
  - an object of type B will have i value-initialized to 0.
  - an object of type C will have the C() constructor called and i uninitialized.
  - an object of type D will have the D() constructor called and i initialized to 1.
As we can see, the parentheses/braces are optional when the target type has a default constructor explicitly defined. Otherwise, the parentheses/braces will force value-initialization, which zeroes the fields.
(More info: Stack Overflow: Do the parentheses after the type name make a difference with new?)
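The cases that can be observed safely are sketched below. Plain and WithCtor are hypothetical stand-ins for types A and D above. The cases that leave i uninitialized are omitted on purpose, because reading an uninitialized int is undefined behavior:

```cpp
struct Plain { int i; };                            // Like A: no user-provided constructor
struct WithCtor { int i; WithCtor() { i = 1; } };   // Like D: constructor sets i

// new Plain() performs value-initialization, which zeroes i.
int value_with_parentheses() {
    Plain *p = new Plain();
    int result = p->i;
    delete p;
    return result;
}

// new Plain{} also leaves i equal to zero.
int value_with_braces() {
    Plain *p = new Plain{};
    int result = p->i;
    delete p;
    return result;
}

// With a user-provided constructor, the notation makes no difference.
int value_with_constructor() {
    WithCtor *p = new WithCtor;
    int result = p->i;
    delete p;
    return result;
}
```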
- Optional argument
-
A function can be declared with default argument values for optional parameters:
    int foo(int bar=0) { ... }
    print(foo());  // Equivalent to print(foo(0))
However, the above construct is a special case of the more general and powerful mechanism of function overloading:
    int foo(int bar) { ... }
    int foo() { return foo(0); }  // Must come after foo(int) is declared
    print(foo());  // Calls the zero-argument definition, which leads to foo(0)
By comparison, Python only has default arguments, and Java only has method overloading.
- Return type
-
The classic C syntax (also adopted in C++, C#, D, Java, etc.) places the return type in front of the function name:
    int main(...) { ... }
C++11 allows the keyword auto as a dummy return type, with the actual return type declared after the argument list and an arrow:
    auto main(...) -> int { ... }
The functional benefit of this style is that the trailing return syntax allows the return type to depend on the arguments.
The trailing style could aid readability. Perhaps because of this, many new languages like Scala, Go, Rust, Swift, etc. declare functions in this way.
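A minimal sketch of a return type that depends on the arguments, using a hypothetical function template named add:

```cpp
#include <type_traits>

// The return type is computed from the parameters a and b, which are only in
// scope after the parameter list, hence the trailing position.
template <typename A, typename B>
auto add(A a, B b) -> decltype(a + b) {
    return a + b;
}

// The deduced return type follows the usual arithmetic conversions.
static_assert(std::is_same<decltype(add(1, 2.5)), double>::value, "int + double is double");
static_assert(std::is_same<decltype(add(1, 2L)), long>::value, "int + long is long");
```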
- Compound data
-
structs and classes can both contain the same things (fields, constructors, methods, nested classes, etc.) and can have parent classes; they differ only in default visibility: members and base classes are public by default in a struct and private by default in a class. The cleanest approach is to use a struct if it contains only fields and no other members, and a class when constructors and methods are needed.
- Field initialization
-
The fields of a struct or class can be initialized in a few possible places:
    class Foo {
        int x = 0;
        int y;
        int z;
        Foo () : y(1) { z = 2; }
    };
The constructor’s initializer list (between the colon and opening brace) is mandatory for variables with a reference type or a type without a default constructor.
Note that Java suffers from three choices too, with two of them being syntactically identical to C++:
    class Bar {
        int x = 0;
        int y;
        int z;
        {  // Instance initializer block (rarely used)
            y = 1;
        }
        public Bar() { z = 2; }
    }
- Member hiding
-
Members outside of classes can be confined to the compilation unit by adding static to the declaration:
    static int counter = 0;
    static void func() { ... }
Members outside of classes can also be confined to the compilation unit by putting them inside an anonymous namespace:
    namespace {
        int counter = 0;
        void func() { ... }
    }
Members inside classes/structs are hidden with the private access modifier:
    class Test {
        private: static int counter;  // Defined once outside the class: int Test::counter = 0;
        private: static void func() { ... }
    };
- Namespacing
-
Global-ish variables and functions can be placed inside a namespace or as static members inside a class:
    namespace Alpha {
        int gamma;
        void delta();
    }
    class Beta {
      public:  // Class members are private by default
        static int gamma;
        static void delta();
    };
    // Same usage syntax
    print(Alpha::gamma);
    Alpha::delta();
    print(Beta::gamma);
    Beta::delta();
- Generic code
-
Some forms of generic code are expressible using C preprocessor macros:
#define MAX(x, y) ((x) >= (y) ? (x) : (y))
But C++ templates are far more type-safe and powerful:
template <typename T> T max(T x, T y) { return x >= y ? x : y; }
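One concrete way the macro is less safe can be sketched by counting evaluations. The template is renamed max_of here to avoid clashing with std::max, and next_value is a hypothetical helper with a visible side effect:

```cpp
// The macro pastes its arguments textually, so an argument with a side
// effect can be evaluated more than once.
#define MAX(x, y) ((x) >= (y) ? (x) : (y))

// A function template evaluates each argument exactly once.
template <typename T>
T max_of(T x, T y) { return x >= y ? x : y; }

// Hypothetical helper: increments the counter and returns the new value.
int next_value(int &counter) { return ++counter; }

// Counts how many times next_value() runs when passed to the macro.
int calls_through_macro() {
    int counter = 0;
    (void)MAX(next_value(counter), 0);
    return counter;
}

// Counts how many times next_value() runs when passed to the template.
int calls_through_template() {
    int counter = 0;
    (void)max_of(next_value(counter), 0);
    return counter;
}
```

With the macro, next_value() runs once for the comparison and again to produce the result. The template calls it once.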
- Type parameter
-
A template with type parameters can be specified with class (old style, discouraged) or typename (modern style).
- Variable function
-
Function pointers are one way to convey a variable function (this comes from C):
    int foo() { ... }
    int bar() { ... }
    int (*chosen)() = choice ? foo : bar;
    print(chosen());
Objects with virtual methods are another way to convey a variable function (and this is the only way in Java):
    class Base { public: virtual int doIt() = 0; virtual ~Base() {} };
    class Foo : public Base { public: virtual int doIt() { ... } };
    class Bar : public Base { public: virtual int doIt() { ... } };
    // Foo* and Bar* have no common type, so one branch is cast to Base*
    Base *chosen = choice ? static_cast<Base*>(new Foo()) : new Bar();
    print(chosen->doIt());
Lambda expressions (introduced in C++11) provide a new way to convey a variable function:
    auto foo = []() { return 0; };
    auto bar = []() { return 1; };
    int (*chosen)() = choice ? foo : bar;  // Captureless lambdas convert to function pointers
- Compile-time evaluation
-
C and C++ provide ways to define complicated expressions/functions such that the compiler evaluates their ultimate value at compile time (instead of at run time). The mechanism provided by C is macro functions, which is limiting and brittle. In C++, specializations of templates can be used to achieve compile-time evaluation of values, but this is cumbersome. C++11 adds the constexpr keyword and rules for what compile-time-evaluable functions are allowed to do.
- Method definitions
-
A class can have its methods defined in the class declaration, for example:
    class Foo {
        void bar() { ... do stuff ... }
        int qux() { ... do stuff ... }
    };
If the class is a template, then the method definitions normally have to be visible in the header file anyway, so the above format is the usual choice. Also, this format is mandatory in newer languages like Java, C#, etc. – there is no concept of a function prototype.
Otherwise, the class can be declared with a bunch of method prototypes that have no bodies:
    class Foo {
        void bar();
        int qux();
    };
Subsequently, the method definitions are placed in a .cpp file:
    #include "Foo.hpp"
    void Foo::bar() { ... do stuff ... }
    int Foo::qux() { ... do stuff ... }
The advantage of defining methods in the class is that it reduces duplication, which makes code reading and refactoring easier. The advantage of defining separately is that it allows separate compilation and parallel builds.
- Heap allocation
-
The idiomatic C++ way to allocate an object on the heap is to use the new operator:
    class Foo { ... };
    Foo *x = new Foo;
    Foo *y = new Foo[10];
An alternative way that allows lower level control is to use malloc() (from C) and manually call placement-new:
    Foo *x = (Foo*)malloc(sizeof(Foo));
    new (x) Foo;
    Foo *y = (Foo*)malloc(10 * sizeof(Foo));
    new (&y[0]) Foo;
    new (&y[1]) Foo;
    (... et cetera ...)
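A fuller sketch of the malloc() route, including cleanup, might look like the following. Tracker is a hypothetical type that counts its live instances so that the construction and destruction steps can be observed:

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

struct Tracker {
    static int alive;          // Number of currently constructed objects
    Tracker() { ++alive; }
    ~Tracker() { --alive; }
};
int Tracker::alive = 0;

// Allocates raw memory with malloc(), constructs n objects in it, then undoes
// both steps in reverse order. Returns the number of objects alive at the peak.
int construct_and_destroy(int n) {
    void *raw = std::malloc(static_cast<std::size_t>(n) * sizeof(Tracker));
    if (raw == nullptr) return -1;
    Tracker *items = static_cast<Tracker *>(raw);
    for (int i = 0; i < n; ++i) new (&items[i]) Tracker;   // Placement-new
    int peak = Tracker::alive;
    for (int i = n - 1; i >= 0; --i) items[i].~Tracker();  // Explicit destructor call
    std::free(raw);                                        // Pairs with malloc()
    return peak;
}
```

Because free() knows nothing about constructors, each object built with placement-new needs its destructor called by hand before the memory is released.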
- Heap deallocation
-
When a heap object is allocated with malloc() (both scalars and arrays), simply call free() on the pointer. If objects were constructed in that memory with placement-new, their destructors should be called explicitly before free().
When a single heap object is allocated with new, it must be released with delete. But an array of heap objects like ptr = new Type[n] must be released with delete[] ptr. The distinction between delete and delete[] must be carefully respected, or else undefined behavior occurs.
- Character string
-
Raw C strings are popular in C++ but cumbersome when it comes to memory allocation:
    #include <string.h>
    char buffer[100] = "Hello";  // Need to set size
    strcat(buffer, " world");  // Need to avoid overrun
C++ provides a string library that handles memory allocation under the hood:
    #include <string>
    std::string str("Hello");
    str += " world";
    const char *cstr = str.c_str();  // Easy conversion
- Sequence of values
-
C has arrays (supported natively in the language) and linked lists (supported manually through structs). C++ adds safer, more powerful, and more convenient implementations of the sequence ADT, primarily std::vector, std::array, and std::list (linked list).
- Access into sequence
-
An array is accessed by an integer index:
    int *a = (...);
    int index = 5;
    print(a[index]);
A vector can be accessed by index or iterator:
    std::vector<int> b = (...);
    print(b[index]);  // No bounds checking
    print(b.at(index));  // Bounds-checked
    std::vector<int>::iterator it = b.begin();
    ++it;
    print(*it);  // Same as b[1]
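What "bounds-checked" means in practice can be sketched as follows: at() throws std::out_of_range for an invalid index, whereas operator[] performs no check. The helper names are hypothetical:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if reading position index with at() raises std::out_of_range.
bool at_rejects(const std::vector<int> &values, std::size_t index) {
    try {
        (void)values.at(index);
        return false;
    } catch (const std::out_of_range &) {
        return true;
    }
}

// Walks the vector with iterators instead of indices.
int sum_with_iterators(const std::vector<int> &values) {
    int total = 0;
    for (std::vector<int>::const_iterator it = values.begin(); it != values.end(); ++it)
        total += *it;
    return total;
}
```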
- Array filling
-
C only has the memset() function to fill a block of memory with a repeated char-sized value. It is mainly useful for setting to zero, or occasionally to 0xFF. It cannot fill a multi-byte value or work with specific struct fields. However, this simplicity and narrow scope make it relatively easy to have an assembly-optimized implementation in the standard library.
The std::fill() function in C++ is essentially a loop that performs a value assignment on each element within a range. This means it works on types of any size, and also calls the appropriate assignment operator (with possible computations and side effects).
- Array copying
-
The C way to copy an array of values is to call the memcpy() or memmove() function. This is also appropriate in C++ for arrays of numbers and simple structs.
The C++ way to copy a sequence of values is to call the std::copy() or std::copy_backward() function. Choosing which function to use is only relevant if the input and output ranges overlap; otherwise std::copy() is fine. Compared to memcpy(), the function std::copy() also works on std::vector and other container types with iterators, and will properly call the (possibly overloaded) type assignment operator to set the destination values.
- C standard library header
-
Almost all C++ code depends on features of the C standard library (which are a part of C++). Including a C standard library header file can be done in one of two ways:
    #include <stdfoo.h>  // Old (compatible with C)
    #include <cstdfoo>  // New (pure C++)
These ways are almost equivalent except for the subtle matter of namespacing. The first way guarantees that members will be available in the global namespace, e.g. size_t and printf(). The second way guarantees that members will be available in the std namespace, e.g. std::size_t and std::printf(). (Preprocessor macros have no namespace and are always global.) This means it is technically a mistake to #include <cstdint> and use the type uint32_t, because the type needs the std:: prefix. However, most compilers make both the global name and the std-namespaced name available, which masks this subtle error.
- I/O library
-
The C way of doing I/O is through FILE* handles, fread() and fwrite(), and printf() and scanf() functions with format strings and variable-length arguments. Note that the stdio library covers I/O for the console, files, and strings.
The C++ way of doing I/O is through objects derived from the istream and ostream classes, calling instance methods, using the overloaded << and >> operators, and passing option objects into the overloaded operators. The functionality of C’s stdio is covered by multiple C++ headers such as iostream, fstream, sstream.
- Random number generation
-
The RNG library of C is small, making it easy but weak at the same time. There is only one global generator state. srand() has a rather small range for a seed. RAND_MAX is often defined as 2^15−1 or 2^31−1, which makes it painful to generate large numbers (such as uint64) or double-precision floating-point numbers.
The RNG library of C++ is simultaneously fancy and intimidating. Each RNG is a separate object, and can be chosen from multiple implementations – linear congruential, Mersenne Twister, hardware RNG, etc. To generate a random number, you have to first define a distribution – such as integers in the range [a, b] or a Boolean with probability p – then call the distribution with the generator, i.e. double val = dist(gen);.
- Exception handling
-
In C, the closest mechanism to modern exception handling is the pair of functions setjmp() and longjmp(). Otherwise, exceptional situations are conveyed through function return values, global status code variables/functions, or by signals.
The C++ exception mechanism with try, catch, and throw is used in many other languages. try blocks can be nested, and different catch blocks are used to catch different types of values that are thrown.
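Nested try blocks and type-based catch selection can be sketched as follows. The functions risky and classify are hypothetical and exist only for this illustration:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical function that throws a different type depending on its input.
void risky(int mode) {
    if (mode == 1) throw std::runtime_error("runtime failure");
    if (mode == 2) throw 42;  // Any copyable value can be thrown, even an int
}

std::string classify(int mode) {
    try {
        try {
            risky(mode);
            return "no exception";
        } catch (int code) {
            // Inner handler: only matches thrown int values.
            return "int " + std::to_string(code);
        }
    } catch (const std::exception &e) {
        // Outer handler: receives whatever the inner try block did not catch.
        return std::string("exception: ") + e.what();
    }
}
```

The std::runtime_error passes through the inner handler untouched because its type does not match int, and is then caught by the outer handler.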