What's new in CPUs since the 80s and how does it affect programmers? This is a response to the following question from David Albert: My mental model of CPUs is stuck in the 1980s: basically boxes that do arithmetic, logic, bit twiddling and shifting, and loading and storing things in memory.
DICE. CLaSH.Tutorial. CλaSH (pronounced ‘clash’) is a functional hardware description language that borrows both its syntax and semantics from the functional programming language Haskell. The merits of using a functional language to describe hardware come from the fact that combinational circuits can be directly modeled as mathematical functions, and that functional languages lend themselves very well to describing and (de-)composing mathematical functions. The CλaSH compiler transforms these high-level descriptions to low-level synthesizable VHDL.
Busting 4 Modern Hardware Myths - Are Memory, HDDs, and SSDs Really Random Access? "It’s all a numbers game – the dirty little secret of scalable systems" Martin Thompson is a High Performance Computing Specialist with a real mission to teach programmers how to understand the innards of modern computing systems. He has many talks and classes (listed below) on caches, buffers, memory controllers, processor architectures, cache lines, etc. His view is that programmers do not place proper value on understanding how the underpinnings of our systems work. We gravitate to the shiny and trendy. His approach is not to teach people specific programming strategies, but to teach programmers to fish so they can feed themselves.
Fog Computing? Air Computing.
19 May 2014 by Yuri Sagalov
Yesterday, The Wall Street Journal published an informative article called Forget 'the Cloud'; 'the Fog' Is Tech's Future, where they brought up the notion that cloud computing is restricted by available bandwidth over 3G/4G and in the home, and (unfortunately) the United States ranks 35th in the world in terms of bandwidth per user.
When we incorporated the company four years ago (actually four years, one month, and eleven days ago), Weihan and I left our PhD and Masters studies to pursue an idea: We believed that although cloud computing is becoming quite popular, resource constraints like bandwidth will result in businesses and individuals seeking a more distributed approach to sharing data (particularly large data). During our earliest investor pitches, one of the questions that we continually came up against was "how did you come up with your company name?" which generally resulted in me launching into a short story:
- Yuri & the AeroFS team.
Observations from Uppsala.
The Mill is a new general-purpose high-performance processor design from Out-of-the-Box Computing. They claim to beat typical high-end out-of-order (OOO) designs like the Intel Haswell generation by crazy factors, such as being 2.3x faster while using 2.3x less power compared to a Haswell.
All the while costing less. Ignoring the cost aspect, the power and performance numbers are truly impressive – especially for general code.
Automata Processing. The Challenge of Complex, Unstructured Data
Many of today’s most challenging computer science problems involve very large data structures, unstructured data, random access, or real-time data analysis. These computationally intensive problems are not well aligned with traditional CPU and memory system architectures; they require a fundamentally new approach to computing.
Blog: Playing with the CPU pipeline – Lol Engine. This article will show how basic knowledge of a modern CPU’s instruction pipeline can help micro-optimise code at very little cost, using a real-world example: the approximation of a trigonometric function. All this without necessarily having to look at lines of assembly code. The code used for this article is included in the attached file.
Evaluating polynomials
Who needs polynomials anyway? We’re writing games, not a computer algebra system, after all.
Modern Microprocessors - A 90 Minute Guide!
WARNING: This article is meant to be informal and fun!
Okay, so you're a CS graduate and you did a hardware course as part of your degree, but perhaps that was a few years ago now and you haven't really kept up with the details of processor designs since then. In particular, you might not be aware of some key topics that developed rapidly in recent times: pipelining (superscalar, OOO, VLIW, branch prediction, predication); multi-core and simultaneous multi-threading (SMT, hyper-threading); SIMD vector instructions (MMX/SSE/AVX, AltiVec); and caches and the memory hierarchy. Fear not!
This article will get you up to speed fast.
Out-of-the-Box Computing. Talk by Ivan Godard – 2013-07-11 at Google. Slides: 2013-07-11_mill_cpu_belt (.pptx)
Belt Machines
Data interchange without general registers
A large fraction of the power budget of modern superscalar CPUs is devoted to renaming registers: the CPU must track the dataflow of the executing program, assign physical registers and map them to the logical registers of the program, schedule operations when arguments are available, and restore visible state in the event of an exception, all while avoiding register update hazards.
Not all CPU architectures are subject to hazards that require register renaming. The belt machine model is inherently free of update hazards because all operation results go onto the belt by Single Assignment; in other words, once created they never change their value.
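The single-assignment idea behind the belt can be illustrated with a small toy model. The Python sketch below is not the Mill's actual design: the class name `Belt`, the belt length, the three-operation instruction set, and the convention that operands are named by their position counted from the newest result are all assumptions made for illustration. It shows only the property the excerpt describes: a result is dropped onto the belt once and never overwritten, so no update hazard can arise.

```python
from collections import deque


class Belt:
    """Toy belt: newest result sits at position 0, oldest falls off the end.

    Hypothetical model for illustration; the length and the addressing
    convention are assumptions, not details taken from the Mill talk.
    """

    def __init__(self, length=8):
        # A bounded deque discards the item at the far end when a new
        # item is pushed onto a full belt.
        self._slots = deque(maxlen=length)

    def drop(self, value):
        """Place a new result at the front; existing values are never modified."""
        self._slots.appendleft(value)

    def __getitem__(self, position):
        """Read the value at a position, where 0 is the most recent result."""
        return self._slots[position]

    def snapshot(self):
        """Return the belt contents, newest first."""
        return list(self._slots)


def run(program, belt):
    """Execute a list of (op, *args) tuples against the belt.

    Operands of "add" and "mul" are belt positions, not register names.
    """
    for op, *args in program:
        if op == "const":
            belt.drop(args[0])
        elif op == "add":
            a, b = args
            belt.drop(belt[a] + belt[b])
        elif op == "mul":
            a, b = args
            belt.drop(belt[a] * belt[b])
        else:
            raise ValueError(f"unknown operation: {op}")


# Compute (2 + 3) * 4 without naming any destination register.
program = [
    ("const", 2),
    ("const", 3),
    ("add", 0, 1),   # reads the two newest values, drops their sum
    ("const", 4),
    ("mul", 0, 1),   # reads 4 and the earlier sum, drops the product
]
demo_belt = Belt(length=4)
run(program, demo_belt)
print(demo_belt.snapshot())
```

Because every operation only appends, there is no destination operand to encode and nothing to rename; the cost is that a value must be used before it falls off the end of the belt.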