
Sleef Vectorized Math Library, Hacker News

  In addition to the SIMD implementation, a pure C (scalar) version is provided. For the x86 architecture, the library provides dispatchers that automatically choose the best subroutines for the computer on which the library is run. The supported combinations of architecture, operating system and compiler are shown in Table 1.1.

Table 1.1: Environment support matrix

Compilers, in column order: GCC, Clang, Intel Compiler, MSVC. Entries are listed in column order; cells that were blank in the original are omitted.

x86 (64-bit), Linux: Supported, Supported, Supported 1), N/A
x86 (32-bit), Linux: Supported 2), Supported 2), N/A
AArch64, Linux: Supported, Supported, N/A, N/A
AArch32: Supported 3), Supported 3), N/A, N/A
PowerPC, Linux: Supported 4), N/A, N/A
x86, FreeBSD: Supported, N/A
x86, OS X: Supported, Supported, N/A
x86, Windows: Supported (Cygwin) 5), Supported, Supported (Cygwin) 5)
  

  The supported compiler versions are as follows.

  • GCC: version 5 and later
  • Clang: version 3.9 and later
  • Intel Compiler: ICC version
  • MSVC: Visual Studio 6725

  1 FMA4 is not supported by Intel Compiler.

  2 SSE2 is required to run the scalar functions on 32-bit x86 architecture. x87 is not supported.

  3 NEON has only single precision support. The computation results do not have full accuracy since NEON is not IEEE 754-compliant.

  4 Clang-5.0 and later are supported.

  5 AVX functions are not supported for Cygwin, since AVX is not   supported by Cygwin ABI.

  All functions in the library are thread safe unless otherwise noted.

  

Credit

  • The main developer is Naoki Shibata at Nara Institute of Science and Technology.

  • Francesco Petrogalli at ARM Ltd. contributed the helper files for AArch64 (helperadvsimd.h, helpersve.h) and the GNUABI interface of the library. He also worked on migrating the build system to cmake, reviewed the code, and gave valuable comments and suggestions.

  • Hal Finkel at Argonne Leadership Computing Facility is now working on importing and adapting SLEEF as an LLVM runtime. He also gave valuable comments.

  • Diana Bite at University of Manchester and Alexandre Mutel at Unity Technologies worked on migrating the build system to cmake.

  • Martin Krastev at Chaos Group contributed a faster implementation of the fmin and fmax functions.

  • Mo Zhou is managing packages for Debian and Ubuntu PPA.

Partner institutes and corporations

  The Mobile Computing Lab at the Division of Information Science of Nara Institute of Science and Technology participates through Naoki Shibata.

  As the leading IP company in semiconductor design, ARM participates through Francesco Petrogalli.

  As the leading company in developing a video game engine, Unity Technologies participates through Alexandre Mutel.

License

  SLEEF is distributed under the Boost Software License Version 1.0.

  The Boost Software License is OSI-certified. See this page for more information about Boost Software License.

          History

Version 3.4.1, released on Oct 1

  • Fixed accuracy problem with tan_u 99, atan_u , log2f_u 90 and exp (f_u) (PR #)

  • SVE intrinsics that are not supported in newer ACLE are replaced (PR #)

  • FMA4 detection problem is fixed (PR #)

  • Compilation problem under Windows with MinGW is fixed (PR #)

Version 3.4, released in Apr

  • Faster and low precision functions are added (PR #)

  • Functions that return consistent results across platforms are added (PR #, #)

  • Many functions are now faster (PR #)

  • Quad precision math library (libsleefquad) is added (PR #265, #, #268)

  • Testers are now faster (PR #)

Version 3.3.1, released in Aug

  • i386 build problem is fixed

  • FreeBSD support is added

  • Trigonometric functions now evaluate correctly over the full FP domain (PR #)

Version 3.3, released on July 6

  • AArch64 SVE target support is added (PR #, #)

  • DFT is now faster (PR #)

  • 3.5-ULP hyperbolic functions are added (PR #)

  • PowerPC VSX target support is added (PR #)

  • Modified Payne-Hanek argument reduction is added to the trigonometric functions in libsleef (PR #)

Version 3.2, released in Feb

  • The whole build system of the project migrated from makefiles to cmake. The makefile build system is now removed.

  • GNUABI version of the library with compatibility tests is added.

  • Benchmarks that compare `libsleef` vs `SVML` on x86 Linux are available in the project tree under the src/libm-benchmarks directory.

  • Extensive upstream testing via Travis CI and Appveyor

  • log2 is added.

  • The library can be compiled to an LLVM bitcode object

  • Added masked interface to the library to support AVX512F masked vectorization.

  • Use native instructions if available for `sqrt`.

  • Removed `libm` dependency.

  • fmod (FP remainder), asin, acos, log, pow, log , exp2, exp and log1p functions are now faster.

  • Fixed a bug that was making the error of sinpi, cospi, sincospi, and tgamma functions larger than the specifications on very rare occasions.

  • Fixed a bug that was preventing the dispatcher from choosing the FMA4 implementation.

Version 3.1, released in July
