Optimization

Designing for Performance - Rico Mariani's Performance Tidbits. I wrote this article back in July and it ended up being the basis of this video (scroll to where it says “Thinking about Performance” and choose a speed). I was going to have the article edited and published separately, but somehow that never happened, so here it is now... The content isn't terribly new, but it's kinda handy to have some of it in written form.

Designing for Performance

I’m a “performance guy”. That means I care deeply about how fast things are, and about keeping them small and tight. I think I’m a lucky performance guy because I actually get paid to do it – I work on the .NET Runtime. Some days being a performance guy isn’t much fun. I think this article is a reaction to hearing the phrase “Premature Optimization is the root of all evil” one time too many. Now, please don’t get me wrong. “This is never going to work. If things have gone really badly then a team with a poor performing solution is going to have to do some major rewriting, even complete rewriting. Honest.

Optimization in GCC. In this article, we explore the optimization levels provided by the GCC compiler toolchain, including the specific optimizations provided in each. We also identify optimizations that require explicit specifications, including some with architecture dependencies. This discussion focuses on the 3.2.2 version of gcc (released February 2003), but it also applies to the current release, 3.3.2. Let's first look at how GCC categorizes optimizations and how a developer can control which are used and, sometimes more important, which are not.
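As a rough illustration of how the chosen level can be observed from inside a program, here is a minimal sketch. It assumes a GCC-compatible compiler, which predefines `__OPTIMIZE__` in optimizing compilations and `__OPTIMIZE_SIZE__` when optimizing for size; the function name `optimization_mode_name` is hypothetical and not taken from the article, and whether the 3.2.2 release discussed here behaves identically has not been checked.

```c
/* Sketch only: report how this translation unit was compiled.
 * Assumes GCC's predefined macros __OPTIMIZE__ and __OPTIMIZE_SIZE__;
 * other compilers may not define them, in which case the last
 * branch is taken. The function name is hypothetical. */
const char *optimization_mode_name(void)
{
#if defined(__OPTIMIZE_SIZE__)
    /* Checked first, because a size-optimizing build also defines
     * __OPTIMIZE__. */
    return "optimizing for size";
#elif defined(__OPTIMIZE__)
    return "optimizing for speed";
#else
    return "not optimizing";
#endif
}
```

Calling this from `main` with `puts(optimization_mode_name())` and building the file once with `gcc -O0`, once with `gcc -O2`, and once with `gcc -Os` should select a different branch each time.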

A large variety of optimizations are provided by GCC. Most are categorized into one of three levels, but some are provided at multiple levels. Some optimizations reduce the size of the resulting machine code, while others try to create code that is faster, potentially increasing its size. For completeness, the default optimization level is zero, which provides no optimization at all.

    gcc -O1 -o test test.c
    gcc -fdefer-pop -o test test.c
    gcc -O2 -o test test.c

A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux (or, "Size Is Everything"). She studied it carefully for about 15 minutes. Finally, she spoke. "There's something written on here," she said, frowning, "but it's really teensy." [Dave Barry, "The Columnist's Caper"] If you're a programmer who's become fed up with software bloat, then may you find herein the perfect antidote.

This document explores methods for squeezing excess bytes out of simple programs. (Of course, the more practical purpose of this document is to describe a few of the inner workings of the ELF file format and the Linux operating system.) Please note that the information and examples given here are, for the most part, specific to ELF executables on a Linux platform running under an Intel-386 architecture.

Please also note that if you aren't a little bit familiar with assembly code, you may find parts of this document sort of hard to follow. In order to start, we need a program. So, here is our first version:

    /* tiny.c */
    int main(void) { return 42; }

    $ gcc -Wall tiny.c
    $ .

Efficiency - C++ ctors: What's the point of using initializer list in a .cpp file.

C++ - Does the restrict keyword provide significant anti-aliasing benefits in gcc / g++

Osx - What's the easiest/best way to profile a math intensive C++ application for speed on Mac OS X.

Optimizing C++/Optimization life cycle. The construction of an efficient application should adhere to the following development process: This development process follows two criteria:

Principle of diminishing returns. Optimizations that yield big results with little effort should be applied first, as this minimizes the time needed to reach the performance goals.

Principle of diminishing portability. It is better to apply optimizations that work on several platforms first, as they remain applicable when the platform changes and are easier for other programmers to understand.

In the rare case of software that must be used with several compilers and several operating systems but just one processor architecture, the stages 4.5 and 4.6 should be swapped. This stage sequence is not meant to be a one-way sequence, in which once one stage is reached, the preceding stage is no longer used. This book is about only three of the above stages:

Arrays, structs, and class instances are objects which, if not empty, contain sub-objects.