
Thread is still running? C++11 Concurrency. Cpp11-on-multicore/sema.h at master · preshing/cpp11-on-multicore. Semaphores are Surprisingly Versatile. In multithreaded programming, it’s important to make threads wait.

Semaphores are Surprisingly Versatile

They must wait for exclusive access to a resource. They must wait when there’s no work available. One way to make threads wait – and put them to sleep inside the kernel, so that they no longer take any CPU time – is with a semaphore. I used to think semaphores were strange and old-fashioned. They were invented by Edsger Dijkstra back in the early 1960s, before anyone had done much multithreaded programming, or much programming at all, for that matter.
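As a rough illustration of the idea, a counting semaphore can be sketched with a C++11 mutex and condition variable. This is not the implementation from the linked sema.h, which wraps native OS semaphores; the class name `Semaphore` and its methods are assumptions made for this sketch.

```cpp
#include <condition_variable>
#include <mutex>

// Minimal counting semaphore (illustrative sketch, not the sema.h version).
class Semaphore {
public:
    explicit Semaphore(int initial = 0) : count_(initial) {}

    // Increment the count and wake one sleeping waiter, if any.
    void signal() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++count_;
        }
        cv_.notify_one();
    }

    // Sleep until the count is positive, then decrement it.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }

    // Non-blocking variant: returns false instead of sleeping.
    bool try_wait() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (count_ == 0) return false;
        --count_;
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int count_;
};
```

A thread that calls `wait()` while the count is zero blocks inside the condition variable and consumes no CPU time until another thread calls `signal()`.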

My opinion changed once I realized that, using only semaphores and atomic operations, it’s possible to implement all of the following primitives: CppCon 2015: Nicolas Guillemot & Sean Middleditch “Birth of Study Group 14..." Is there an alternative for sleep() in C? Beginner's Guide to Linkers. This article is intended to help C & C++ programmers understand the essentials of what the linker does.

Beginner's Guide to Linkers

I've explained this to a number of colleagues over the years, so I decided it was time to write it down so that it's more widely available (and so that I don't have to explain it again). [Updated March 2009 to include more information on the peculiarities of linking on Windows, plus some clarification on the one definition rule.] Applying the Signal-Slot Mechanism to Implement Efficient AI Conditions.

This article continues a series of tutorials which focuses on designing and implementing the AI for a simple simulation.

Applying the Signal-Slot Mechanism to Implement Efficient AI Conditions

In the last tutorial, you learned how to design a behavior tree for a virtual dog that responds to clicks. A good way to do this is by using AI conditions that suspend themselves and wait for an event from the input manager before reactivating themselves and terminating. There are many ways to implement event-driven conditions in practice, each with its own advantages. But one of the simplest approaches is to use the signal / slot mechanism. C++11 - Lambda Closures, the Definitive Guide. One of the most exciting features of C++11 is the ability to create lambda functions (sometimes referred to as closures).
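The signal / slot idea for event-driven conditions can be sketched in a few lines of C++11. The names `Signal` and `ClickCondition` are hypothetical and are not taken from the tutorial; the sketch only shows a condition that stays unsatisfied until the signal it is connected to fires.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical minimal signal: stores slots and calls each one on emit().
template <typename... Args>
class Signal {
public:
    using Slot = std::function<void(Args...)>;

    void connect(Slot slot) { slots_.push_back(std::move(slot)); }

    void emit(Args... args) const {
        for (const auto& slot : slots_) slot(args...);
    }

private:
    std::vector<Slot> slots_;
};

// Hypothetical AI condition: idle until a click event arrives.
struct ClickCondition {
    bool satisfied = false;
    void onClick(int /*x*/, int /*y*/) { satisfied = true; }
};
```

An input manager would own a `Signal<int, int>` and call `emit(x, y)` on each click; a condition connects with `clicked.connect([&cond](int x, int y) { cond.onClick(x, y); });` and does no polling in between.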

C++11 - Lambda Closures, the Definitive Guide

What does this mean? A lambda function is a function that you can write inline in your source code (usually to pass in to another function, similar to the idea of a functor or function pointer). With lambda, creating quick functions has become much easier, and this means that not only can you start using lambda when you'd previously have needed to write a separate named function, but you can start writing more code that relies on the ability to create quick-and-easy functions. In this article, I'll first explain why lambda is great--with some examples--and then I'll walk through all of the details of what you can do with lambda. Why Lambdas Rock: Imagine that you had an address book class, and you want to be able to provide a search function. A generic loop unroller based on template meta-programming. Loop unrolling (or unwinding) is a code transformation used by compilers to improve the utilization of functional units present in modern super-scalar CPUs.
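The address book search mentioned in the lambda excerpt might look roughly like this. `AddressBook`, `find_matching`, and `find_org_addresses` are illustrative names assumed for this sketch, not the article's own code.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical address book whose search accepts any callable predicate.
class AddressBook {
public:
    void add(const std::string& address) { addresses_.push_back(address); }

    template <typename Pred>
    std::vector<std::string> find_matching(Pred pred) const {
        std::vector<std::string> results;
        std::copy_if(addresses_.begin(), addresses_.end(),
                     std::back_inserter(results), pred);
        return results;
    }

private:
    std::vector<std::string> addresses_;
};

// The search criterion is written inline as a lambda at the call site.
std::vector<std::string> find_org_addresses(const AddressBook& book) {
    return book.find_matching([](const std::string& addr) {
        return addr.size() >= 4 &&
               addr.compare(addr.size() - 4, 4, ".org") == 0;
    });
}
```

Without lambdas, the `.org` test would need its own named function or functor class defined somewhere away from the call.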

A generic loop unroller based on template meta-programming

Indeed, processors have a pipelined architecture consisting of multiple stages (at least five). While the CPU is executing an instruction in one of the stages, it can simultaneously load and decode the next operation pointed to by the program counter. However, in the presence of branch instructions, the CPU needs to wait for the decode stage to know whether the branch has been taken, so that it can adjust the program counter and correctly load the next assembly instruction.

Over the years several architectural optimizations have been introduced to reduce the problem (e.g. branch prediction units); however, in specific situations the CPU can lose up to 20 cycles because of a branch instruction. For this reason it is very important to reduce the number of branches in the input code. And that's it. Software optimization resources. C++ and assembly. Windows, Linux, BSD, Mac OS X.
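One common way to build a template-based unroller is recursive instantiation, sketched below. This is an assumed approach and may differ from the linked article's; `Unroll` and `sum4` are hypothetical names.

```cpp
#include <cstddef>

// Unroll<N>::run(f) expands at compile time to f(0); f(1); ... f(N-1);
// so the generated code contains no loop counter test or branch.
template <std::size_t N>
struct Unroll {
    template <typename F>
    static void run(F&& f) {
        Unroll<N - 1>::run(f);
        f(N - 1);
    }
};

// Base case: nothing left to expand.
template <>
struct Unroll<0> {
    template <typename F>
    static void run(F&&) {}
};

// Example use: sum a fixed-size array with a fully unrolled body.
inline int sum4(const int (&values)[4]) {
    int total = 0;
    Unroll<4>::run([&](std::size_t i) { total += values[i]; });
    return total;
}
```

Whether this beats the compiler's own unrolling depends on the compiler and the loop body, so it is worth measuring before adopting it.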

Optimization manuals: This series of five manuals describes everything you need to know about optimizing code for x86 and x86-64 family microprocessors, including optimization advice for C++ and assembly language, details about the microarchitecture and instruction timings of most Intel, AMD and VIA processors, and details about different compilers and calling conventions.

Software optimization resources. C++ and assembly. Windows, Linux, BSD, Mac OS X

Operating systems covered: DOS, Windows, Linux, BSD, Mac OS X Intel based, 32 and 64 bits. Note that these manuals are not for beginners. CppCon2014/Back to the Basics! Essentials of Modern C++ Style - Herb Sutter - CppCon 2014.pdf at master · CppCon/CppCon2014. C++ Truths. C++ 11: Rvalue Reference - Perfect Fowarding. C C++ Game Development : Articles Discussions Tutorials.

C++ and Beyond 2011: Sean Gibb - C++ and Hardware, C++11, C++ Renaissance. Constant initialization. In one of my previous articles about compile-time computations we have seen how you can ‘abuse’ the new keyword constexpr in order to achieve some interesting effects.

Constant initialization

Now it is time to show some of the intended usages of the keyword. I could probably write about creating more compile-time constants, but you probably know it already, for instance from this proposal. This article is about constant initialization. In short, this is a new way global, static and thread-local objects (not necessarily constant) can be initialized without running into problems of initialization order or a data race. Locks Aren't Slow; Lock Contention Is. It’s true that locking is slow on some platforms, or when the lock is highly contended.
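A small sketch of constant initialization: a non-const global whose class has a `constexpr` constructor and which is given a constant argument is initialized statically, before any dynamic initialization runs. `Counter` and `g_counter` are hypothetical names used only for illustration.

```cpp
// A class with a constexpr constructor is eligible for constant initialization.
class Counter {
public:
    constexpr explicit Counter(int start) : value_(start) {}

    int next() { return value_++; }       // the object itself is mutable
    int value() const { return value_; }

private:
    int value_;
};

// Non-const global, constant-initialized because the constructor is constexpr
// and its argument is a constant expression. Initializers in other translation
// units can therefore never observe it in an uninitialized state.
Counter g_counter(100);
```

If the constructor were not `constexpr`, `g_counter` would be dynamically initialized, and its order relative to globals in other translation units would be unspecified.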

Locks Aren't Slow; Lock Contention Is

And when you’re developing a multithreaded application, it’s very common to find a huge performance bottleneck caused by a single lock. But that doesn’t mean all locks are slow. As I’ll show in this post, sometimes a locking strategy achieves excellent performance. Perhaps the most easily-overlooked source of this misconception: Not all programmers may be aware of the difference between a lightweight mutex and a “kernel mutex”. The World's Simplest Lock-Free Hash Table. A lock-free hash table is a double-edged sword.

The World's Simplest Lock-Free Hash Table

There are applications where it can provide a performance improvement that would be impossible to achieve otherwise. The downside is that it’s complicated. The first working lock-free hash table I heard about was written in Java by Dr. Cliff Click. He released the source code back in 2007 and gave a presentation about it at Google that same year. Luckily, six years has given me enough time to (mostly) catch up to Cliff on this subject.

Double-Checked Locking is Fixed In C++11. The double-checked locking pattern (DCLP) is a bit of a notorious case study in lock-free programming.

Double-Checked Locking is Fixed In C++11

Up until 2004, there was no safe way to implement it in Java. Resource Acquisition Is Initialization. Other names for this idiom include Constructor Acquires, Destructor Releases (CADRe) [6] and one particular style of use is called Scope-based Resource Management (SBRM).[7] This latter term is for the special case of automatic variables. RAII ties resources to object lifetime, which may not coincide with entry and exit of a scope. (Notably variables allocated on the free store have lifetimes unrelated to any given scope.) However, using RAII for automatic variables (SBRM) is the most common use case. Pointers - Reason why not to have a DELETE macro for C++ Multicore Storage Allocation.
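The RAII idiom described above can be sketched with a small file wrapper: the constructor acquires the resource and the destructor releases it. `File` is a hypothetical class written for this illustration.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// RAII wrapper around std::FILE*: acquire in the constructor,
// release in the destructor.
class File {
public:
    File(const std::string& path, const char* mode)
        : handle_(std::fopen(path.c_str(), mode)) {
        if (!handle_) throw std::runtime_error("cannot open " + path);
    }

    ~File() { std::fclose(handle_); }

    // Copying is disabled so the handle cannot be closed twice.
    File(const File&) = delete;
    File& operator=(const File&) = delete;

    std::FILE* get() const { return handle_; }

private:
    std::FILE* handle_;
};
```

Declared as an automatic variable (the SBRM case), a `File` is closed when its scope exits, including during exception unwinding. Allocated on the free store, it is closed only when the object is deleted.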

When multicore-enabling a C/C++ application, it's common to discover that malloc() (or new) is a bottleneck that limits the speedup your parallelized application can obtain. This article explains the four basic problems that a good parallel storage allocator solves: thread safety, overhead, contention, and memory drift. Thread Safety: Basic storage allocators are not thread safe, although recent efforts have started to remedy this problem for many concurrency platforms. In other words, improper behavior due to races on the storage allocator's internal data structures can result from two parallel threads attempting to allocate or deallocate at the same time. Figure 1. The simple solution to this problem is for applications to acquire a mutex (mutual exclusion) lock on the allocator before calling malloc() or free(), as illustrated below, which lets only one thread access the allocator's internal data structures at a time.
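The mutex-around-the-allocator approach can be sketched as follows, assuming an allocator that is not itself thread safe. The names `g_alloc_mutex`, `locked_malloc`, and `locked_free` are hypothetical.

```cpp
#include <cstddef>
#include <cstdlib>
#include <mutex>

// Hypothetical global lock guarding the allocator's internal data structures.
static std::mutex g_alloc_mutex;

// Only one thread at a time may be inside malloc().
void* locked_malloc(std::size_t size) {
    std::lock_guard<std::mutex> lock(g_alloc_mutex);
    return std::malloc(size);
}

// The same lock must also guard free(), since it touches the same structures.
void locked_free(void* ptr) {
    std::lock_guard<std::mutex> lock(g_alloc_mutex);
    std::free(ptr);
}
```

This serializes every allocation, so it buys thread safety at the price of the contention problem from the same list.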

Figure 2. Custom C++ allocators suitable for video games. Object Pooling for Generic C++ classes. Introduction The default memory allocator is not efficient when it comes to frequent new and delete operations. There are a number of general purpose allocators that replace the standard one. There are certain situations when you want to improve performance at the cost of using more memory.

This is especially true for small objects that are frequently created and destroyed in processing intensive applications. So, I decided to create a class that pools instances of classes, so that new and delete work on an array of already existing objects. Usage. SmallObject Allocator. Proposed Changes to Small-Object Allocator Table of Contents 1. Introduction: Let's start with a recap of the Small-Object Allocator's design. Each of its 4 layers has one purpose. There are two major known problems with my implementation - both of which are discussed in depth within this webpage. This zip file contains the source code and header files. C++ - Unnamed/anonymous namespaces vs. static functions. Lesson #4: Smart Pointers. One big change to modern C++ style that comes with C++11 is that you should never need to manually delete (or free) anymore, thanks to the new classes shared_ptr, unique_ptr and weak_ptr. Note that before C++11, C++ did have one smart pointer class – auto_ptr.

This was unsafe and is now deprecated, with unique_ptr replacing it. To use these classes, you'll need to #include <memory> (and also add using namespace std; or prefix with std::). unique_ptr simply holds a pointer, and ensures that the pointer is deleted on destruction. unique_ptr objects cannot be copied. It thus behaves very much like the now-deprecated auto_ptr behaved – the problem with auto_ptr was that it was aiming to do what unique_ptr does, but was unable to do so properly when it was defined, back before C++11 introduced features like move constructors, and thus it was unsafe.
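The ownership rules just described can be illustrated with a short sketch. `Widget`, `consume`, and `demo_unique_ptr` are hypothetical names and are not taken from the lesson.

```cpp
#include <memory>
#include <utility>

struct Widget {
    explicit Widget(int id_in) : id(id_in) {}
    int id;
};

// Takes ownership: the Widget is deleted when `w` goes out of scope.
int consume(std::unique_ptr<Widget> w) { return w->id; }

int demo_unique_ptr() {
    std::unique_ptr<Widget> a(new Widget(7));  // C++11 has no std::make_unique
    // std::unique_ptr<Widget> b = a;          // would not compile: no copying
    std::unique_ptr<Widget> b = std::move(a);  // ownership moves; `a` is empty
    if (a) return -1;                          // not reached: `a` holds nothing
    return consume(std::move(b));              // Widget is deleted inside consume
}
```

No `delete` appears anywhere: the single owner at each point is responsible for cleanup, and transferring ownership requires an explicit `std::move`.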

An example of how unique_ptr is used is as follows – say we have the following: C++ - Understanding the overhead of lambda functions in C++11. Lock-free Programming in C++ with Herb Sutter. C++: Custom memory allocation - General Programming. Fast memory allocations along with memory leak detection can have a big impact on games performance. Memory system – Part 1. Before we can really delve into the inner workings of the Molecule Engine’s memory system, we need to cover some base ground first.

Today, we’re taking a very thorough look at new, delete, and all their friends. There are some surprising subtleties involved, and judging from the interviews I conducted, sometimes even senior level staff mess up questions regarding the inner workings of new and delete. CodeXL - Powerful Debugging, Profiling & Analysis - AMD. AMD CodeXL is a comprehensive tool suite that enables developers to harness the benefits of AMD CPUs, GPUs and APUs. It includes powerful GPU debugging, comprehensive GPU and CPU profiling, static OpenCL™, OpenGL® and DirectX® kernel/shader analysis capabilities, and APU/CPU/GPU power profiling, enhancing accessibility for software developers to enter the era of heterogeneous computing.

AMD CodeXL is available both as a Visual Studio® extension and a standalone user interface application for Windows® and Linux®. [c++11] Fun With Functions - C++ Tutorials. Bind for fun and profit! Introduction A few days ago, I was messing with some old code of mine that uses the Boost random number library, looking at upgrading it to use the new random number features of C++11. Using the library requires the use of the boost::bind function, and this too can be replaced with the new C++11 std::bind. Unfortunately, I had not used either bind function in some time and had forgotten exactly how they work, and I had to write some simple code to remind myself. Delegates: C++11 vs. Impossibly Fast - A Quick and Dirty Comparison. History Jan 31, 2014 - Corrected code for Mikhail Semenov's approach.
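As a reminder of how `std::bind` works with the C++11 random number facilities mentioned in the bind tutorial excerpt, here is a minimal sketch. The helpers `make_die`, `scale`, and `make_tripler` are hypothetical and are not the tutorial's code.

```cpp
#include <functional>
#include <random>

// Bind a distribution to an engine so that each call yields one die roll.
// std::bind copies both arguments into the returned function object.
std::function<int()> make_die(unsigned seed) {
    std::mt19937 engine(seed);
    std::uniform_int_distribution<int> dist(1, 6);
    return std::bind(dist, engine);
}

int scale(int factor, int value) { return factor * value; }

// Fix the first argument and leave the second open via a placeholder.
std::function<int(int)> make_tripler() {
    using std::placeholders::_1;
    return std::bind(scale, 3, _1);
}
```

Calling `make_die(42)()` returns a value from 1 to 6, and `make_tripler()(5)` returns 15. In new code a lambda is often easier to read than `std::bind`, but the bound form mirrors the older `boost::bind` usage directly.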

Introduction. C++11 Observer Version 2. C++11 - C++ delegate implementation with member functions. Fatal bugs: A big issue is the fact that you take a copy of the object in the function in this line: std::function < void (Class, Params...) > f = func; Very, very fast event library - General Programming. Ok, I see dangling references happening in two ways: (1) not so bright library user doesn't realize that the connection set should be associated with an object and creates a global connection set somewhere. I know that a library doesn't necessarily need to protect users from themselves, but having the container in the public interface for a library is just asking for trouble, at least without clear documentation. (2) ConnectionSet only supports Add() and Clear(). Sign Up For Your Free Trial. Free Trial How do I sign up?

If you are eligible, your free trial will start when you sign up for Google Cloud Platform. To sign up, sign in or create a Google Account. You will also need a credit card or bank account details so we can verify your identity. You will not be charged or billed during your free trial. Performance Of A C++11 Signal System. Perfect forwarding and universal references in C++ PNG Filereader Implementation in C++: Using libpng. Thinking in C++ 2nd ed Volume 2. Home Page. How are entity systems cache-efficient? Quaternions and 3D Rotations.

Fastest container or algorithm for unique reusable ids in C++ Cache misses and usability in Entity Systems. What every programmer should know about memory, Part 1. Randy Gaul's Game Programming Blog. Data-Oriented Design. Writing Quick Code in C++, Quickly. Top 10 Things I Wish I Had Known About C++ The cost of dynamic (virtual calls) vs. static (CRTP) dispatch in C++ Virtual functions and performance - C++ Inheritance - A polymorphic collection of Curiously Recurring Template Pattern (CRTP) in C++?