
Software optimization resources. C++ and assembly. Windows, Linux, BSD, macOS

See also my blog

Contents

    • Optimization manuals

Object file converter

This utility can be used for converting object files between COFF/PE, OMF, ELF and Mach-O formats for all 32-bit and 64-bit x86 platforms. It can modify symbol names in object files; build, modify and convert function libraries across platforms; and dump object files and executable files. It also includes a very good disassembler supporting the SSE4, AVX, AVX2, AVX512, FMA3, FMA4, XOP and Knights Corner instruction sets. Source code included (GPL). Manual.

File name: objconv.zip, last modified: Oct. Download.

Subroutine library

This is a library of optimized subroutines coded in assembly language. The functions in this library can be called from C, C++ and other compiled high-level languages. It supports many different compilers under Windows, Linux, BSD and Mac OS X operating systems, 32 and 64 bits. The library contains faster versions of common C/C++ memory and string functions, fast functions for string search and string parsing, fast integer division and integer vector division, as well as several useful functions not found elsewhere.

The package contains library files in many different file formats, a C++ header file and assembly language source code. The GNU General Public License applies. Manual.

File name: asmlib.zip, last modified: Apr. Download.

ForwardCom: An open standard instruction set for high performance microprocessors

This is a proposal and discussion of how an ideal instruction set architecture can be constructed. The proposed instruction set combines the best from the RISC and CISC principles to produce a flexible, consistent, modular, orthogonal, scalable and expansible instruction set for high performance microprocessors and large vector processors.

The ForwardCom instruction set has variable-length vector registers and a special addressing mode that allows the software to automatically adapt to different microprocessors with different maximum vector lengths and make efficient loops through arrays regardless of whether the array size is divisible by the vector length. Standardization of the corresponding ecosystem of ABI standards, function libraries, compilers, etc. makes it possible to combine different programming languages in the same program.
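The loop behaviour described above can be illustrated with a small emulation in ordinary C++. This is only a sketch of the idea, not ForwardCom code: the function name `add_arrays` and the parameter `max_len` (standing in for the hardware's maximum vector length) are hypothetical, and the inner loop stands in for what would be a single vector instruction. Each pass handles `min(max_len, remaining)` elements, so the same code works for any maximum vector length and needs no separate remainder loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Adds a[i] + b[i] into out[i] for i in [0, n), processing at most max_len
// elements per pass. Returns the number of passes, i.e. the number of
// vector-sized steps a machine with that maximum vector length would take.
// max_len is an assumed stand-in for the hardware's maximum vector length.
std::size_t add_arrays(const float* a, const float* b, float* out,
                       std::size_t n, std::size_t max_len) {
    assert(max_len > 0);
    std::size_t passes = 0;
    for (std::size_t done = 0; done < n; ) {
        // The effective vector length shrinks automatically on the last pass.
        const std::size_t len = std::min(max_len, n - done);
        for (std::size_t i = 0; i < len; ++i) {   // emulates one vector operation
            out[done + i] = a[done + i] + b[done + i];
        }
        done += len;
        ++passes;
    }
    return passes;
}
```

Calling the function on a 10-element array with `max_len` set to 4 or to 8 gives the same results; only the number of passes changes, which is the property the paragraph above describes.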

Introduction: www.forwardcom.info .

Manual: File name: forwardcom.pdf, last modified: Nov. Download.

Test programs for measuring clock cycles and performance monitoring

Test programs that I have used for my research. They can measure clock cycles and performance monitor counters such as cache misses, branch mispredictions and resource stalls in a small piece of code in C, C++ or assembly. They can also set up performance monitor counters for reading inside another program. Supports Windows and Linux, 32 and 64 bit mode, multiple threads.

For experts only. Useful for analyzing small pieces of code but not for profiling a whole program.

File name: testp.zip, last modified: Aug. Download.

NAN propagation versus fault trapping in floating point code

This article discusses the use of infinity and not-a-number (NAN) values in floating point code as an efficient alternative to fault trapping for detecting floating point errors. Relevant compiler optimization options are also discussed.
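As a rough illustration of the idea (not code taken from the article), the C++ sketch below lets an invalid operation produce a NAN that propagates through the rest of the calculation, and checks the final result once instead of trapping or testing inside the loop. The helper names `mean_sqrt` and `computation_failed` are hypothetical. It assumes IEEE 754 arithmetic with exceptions masked, which is the usual default.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Mean of the square roots of the inputs. A negative input makes std::sqrt
// return NAN; the NAN then propagates through the sum and the division,
// so no error check is needed inside the loop.
double mean_sqrt(const std::vector<double>& values) {
    assert(!values.empty());
    double sum = 0.0;
    for (double v : values) {
        sum += std::sqrt(v);
    }
    return sum / static_cast<double>(values.size());
}

// Single check on the final result: true if a NAN or infinity was produced
// anywhere in the calculation and survived to the end.
bool computation_failed(double result) {
    return !std::isfinite(result);
}
```

Compiler options matter here: options such as GCC's `-ffast-math` or `-ffinite-math-only` allow the compiler to assume that NAN and infinity never occur, so a check like `std::isfinite` may not behave as intended when they are enabled.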

File name: nan_propagation.pdf, size: 1033450, last modified: May. Download.

CPUID manipulation program for VIA

This is a program that can change the CPUID vendor string, family and model number on VIA Nano processors. See my blog for a discussion of the purpose of this program.

File name: cpuidfake.zip, last modified: Aug. Download.

Useful assembly links

Agner’s CPU blog www.agner.org/forum

Masm Forum www.masmforum.com

ASM Community Messageboard www.asmcommunity.net/forums

Hutch’s masm pages (www.masm) .com

CPU-id tools and information www.cpuid.com

Godbolt compiler explorer. Enter a piece of code and see how different compilers are coding it. www.godbolt.org

likwid performance measuring tools for Linux github.com/RRZE-HPC/likwid

Programmer’s heaven assembler zone Programmers’ Heaven

Virtual sandpile x86 processor information www.sandpile.org

Online computer books www.computer-books.us/assembler.php

Instruction latency listings (instlatx) . atw.hu/ and uops.info

NASM assembler www.nasm.us/

FASM assembler and messageboard flatassembler.net

JWASM assembler www.japheth.de

Yeppp open source library of assembly language functions bitbucket.org/MDukhan/yeppp

MAQAO (Modular Assembly Quality Analyzer and Optimizer), a tool for analyzing and optimizing binary codes. www.maqao.org

Newsgroup: comp.lang.asm.x

Intel resources

Reference manuals and other documents can be found at Intel’s web site. Intel’s web site is reorganized so often that any link I could provide here to specific documents would be broken after a few months. I therefore recommend that you use the search facilities at www.intel.com and search for “Software Developer’s Manual” and “Optimization Reference Manual”.

AMD resources

developer.amd.com/resources/developer-guides-manuals/

Microsoft resources

MASM manuals Microsoft Macro Assembler reference
