A number of Python-related libraries exist for programming solutions that employ either multiple CPUs or multicore CPUs in a symmetric multiprocessing (SMP) or shared-memory environment, or potentially huge numbers of computers in a cluster or grid environment. This page seeks to provide references to the different libraries and solutions available.

Symmetric Multiprocessing

Some libraries, often to preserve some similarity with more familiar concurrency models (such as Python's threading API), employ parallel processing techniques that limit their relevance to SMP-based hardware, mostly because they rely on process creation functions such as the UNIX fork system call. Advantages of such approaches include convenient process creation and the ability to share resources.

dispy - Python module for distributing computations (functions or programs) across computation processors (SMP or even distributed over a network) for parallel execution.

Cluster Computing
Cloud Computing
Grid Computing
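The fork-based process creation mentioned under Symmetric Multiprocessing can be sketched with nothing but the standard library. This is a minimal illustration, not how any of the libraries referenced here is implemented: the helper name `fork_map` is hypothetical, it assumes a UNIX-like system where `os.fork` is available, and it starts one child per item with no pooling or error handling.

```python
import os
import pickle


def fork_map(func, items):
    """Apply func to each item in its own forked child process (UNIX only).

    Hypothetical helper for illustration. Each child inherits the parent's
    memory, so func itself is never pickled; only the results travel back
    through a pipe.
    """
    children = []
    for item in items:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child process: compute, send the pickled result, then exit
            # immediately so no parent-side cleanup code runs twice.
            os.close(read_fd)
            try:
                with os.fdopen(write_fd, "wb") as out:
                    pickle.dump(func(item), out)
            finally:
                os._exit(0)
        # Parent process: keep only the read end of the pipe.
        os.close(write_fd)
        children.append((pid, read_fd))

    results = []
    for pid, read_fd in children:
        with os.fdopen(read_fd, "rb") as src:
            # A child whose func raised sends nothing, which shows up
            # here as EOFError.
            results.append(pickle.load(src))
        os.waitpid(pid, 0)  # reap the child to avoid zombie processes
    return results


if __name__ == "__main__":
    print(fork_map(lambda x: x * x, [1, 2, 3]))  # prints [1, 4, 9]
```

Because the children are created with fork, even a lambda or a closure over large in-memory data can be used without serializing it, which is one form of the resource sharing described above. The same property is why the approach does not extend beyond a single SMP machine.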
Related: Massive Parallel Processing Tools
CUDA

CUDA is a parallel computing platform and application programming interface (API) model created by Nvidia. It allows software developers and software engineers to use a CUDA-enabled graphics processing unit (GPU) for general-purpose processing, an approach termed GPGPU (General-Purpose computing on Graphics Processing Units). The CUDA platform is a software layer that gives direct access to the GPU's virtual instruction set and parallel computational elements for the execution of compute kernels.

Background

The graphics processing unit (GPU), as a specialized computer processor, addresses the compute-intensive demands of real-time, high-resolution 3D graphics. By 2012, GPUs had evolved into highly parallel multi-core systems allowing very efficient manipulation of large blocks of data.

Programming abilities

[Figure: Example of CUDA processing flow]

CUDA provides both a low-level API and a higher-level API. CUDA 8.0 comes with these other software components:

Advantages
Parallel Programming and Computing Platform | CUDA

What is CUDA?

Intro to Parallel Programming: an open, online course from Udacity. Instructors: Dr. John Owens, UC Davis, and Dr. David Luebke, NVIDIA.

CUDA® is a parallel computing platform and programming model invented by NVIDIA. With millions of CUDA-enabled GPUs sold to date, software developers, scientists and researchers are finding broad-ranging uses for GPU computing with CUDA.

Identify hidden plaque in arteries: Heart attacks are the leading cause of death worldwide.
Analyze air traffic flow: The National Airspace System manages the nationwide coordination of air traffic flow.
Visualize molecules: A molecular simulation called NAMD (nanoscale molecular dynamics) gets a large performance boost with GPUs.

Background

GPU Computing: The Revolution

You're faced with imperatives: Improve performance. Not anymore. Visit CUDA Zone for examples of applications in diverse vertical markets… and awaken your GPU giant.

History of GPU Computing
Tools and Training
Methodologies for Network-level Optimization of Cluster Computers

Supervisors: Dr Peter Strazdins

A Beowulf-style cluster computer is a parallel computer that uses a Commercial-off-the-Shelf (COTS) switch-based network to communicate between the processors. The ANU Beowulf cluster Bunyip is such a cluster, based on Fast (100 Mb/s) Ethernet switches. Clusters have proved a highly cost-effective high-performance computer model and have largely displaced the traditional massively parallel computers built with expensive vendor-supplied networks. However, the raw communication speed of COTS networks has not kept up with the dramatic increases in processor speed, and it limits the performance of many applications on clusters. It is therefore important to utilize all of the possible hardware capabilities of these networks in order to effectively increase communication speed. Hardware configuration options include using multiple NICs (channels). Time permitting, other aspects of communication performance on these networks may be considered.
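To see why multiple NICs (channels) are an attractive configuration option, a rough latency-plus-bandwidth estimate can help. The sketch below is an assumption of this note rather than a model taken from the project description: the function name `transfer_time` and its parameters are hypothetical, and it assumes a message can be striped perfectly evenly across all channels, which real hardware and drivers only approximate.

```python
FAST_ETHERNET_BITS_PER_S = 100e6  # Fast Ethernet: 100 Mb/s per link


def transfer_time(message_bytes, channels=1, latency_s=0.0,
                  link_bits_per_s=FAST_ETHERNET_BITS_PER_S):
    """Estimated seconds to send one message striped evenly over `channels` links.

    Assumption (not from the source): ideal striping, so aggregate bandwidth
    grows linearly with the number of NICs while per-message latency does not.
    """
    if channels < 1:
        raise ValueError("channels must be at least 1")
    return latency_s + (message_bytes * 8) / (link_bits_per_s * channels)


if __name__ == "__main__":
    # Compare one, two and four channels for a 1 MB message with an
    # illustrative 100 microsecond per-message latency.
    for nics in (1, 2, 4):
        seconds = transfer_time(1_000_000, channels=nics, latency_s=100e-6)
        print(f"{nics} NIC(s): {seconds * 1e3:.2f} ms")
```

Under these assumptions, extra channels shorten only the bandwidth term, so large messages benefit while small, latency-bound messages gain very little.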
High Performance Computing - HPC Cloud Computing

P2 instances are ideally suited for machine learning, engineering simulations, computational finance, seismic analysis, molecular modeling, genomics, rendering, high performance databases, and other GPU compute workloads. A P2 instance offers up to 16 NVIDIA K80 GPUs with a combined 192 gigabytes (GB) of video memory, 40,000 parallel processing cores, 70 teraflops of single-precision floating-point performance, over 23 teraflops of double-precision floating-point performance, and GPUDirect technology for higher-bandwidth, lower-latency peer-to-peer communication between GPUs. P2 instances also feature up to 732 GB of host memory, up to 64 vCPUs using custom Intel Xeon E5-2686 v4 (Broadwell) processors, dedicated network capacity for I/O operations, and enhanced networking through the Amazon EC2 Elastic Network Adapter.
OpenACC Home | www.openacc.org

EXASOL. The in-memory analytic database that just works.

The world’s fastest in-memory analytic database was built from the ground up to offer the highest performance and scalability for in-memory analytics. Guaranteed.

It does one thing. EXASOL is a high-performance, in-memory, MPP database specifically designed for in-memory analytics.

In-memory meets MPP. The analytic database achieves lightning-fast performance with linear scalability by combining in-memory technology, columnar compression and storage, and massively parallel processing. Its standard SQL interfaces also avoid the trap of NoSQL skills shortages and provide easy compatibility with pre-existing applications and data structures.

Key features and benefits

In-memory technology: Innovative in-memory algorithms enable large amounts of data to be processed in main memory for dramatically faster access times.

Hadoop?
Features