Friday, April 16, 2021

zwegner / faster-utf8-validator, Hacker News

This library is a very fast UTF-8 validator using AVX2/SSE4 instructions. As far as I am aware, it is the fastest validator in the world on the CPUs that support these instructions (... and not AVX-512). Using AVX2, it can validate random UTF-8 text at about 0.26 cycles/byte and random ASCII text at about 0.08 cycles/byte (figures taken from the z_validate_utf8_avx2 row of the benchmark table below). For UTF-8, this is roughly 1.5-1.7x faster than the fastvalidate-utf-8 library.

This repository contains the library (one C file), a build script for the make.py build system, and a Lua test script (which requires LuaJIT due to use of the ffi module).

A detailed description of the algorithm can be found in z_validate.c. This algorithm should map fairly nicely to AVX-512, and should in fact be a bit faster than 2x the speed of AVX2, since a few instructions can be saved. But I don't have an AVX-512 machine, so I haven't tried it yet.
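For readers who want to see what any UTF-8 validator has to check, here is a minimal scalar sketch in C. It is not the SIMD algorithm described in z_validate.c and makes no attempt at speed. The function names utf8_valid_scalar and utf8_self_check are hypothetical and not part of the library. The sketch only spells out the rules: correct lead and continuation byte patterns, no overlong encodings, no surrogates, and nothing above U+10FFFF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar reference check (hypothetical helper, NOT the library's SIMD code).
 * Returns true if s[0..len) is well-formed UTF-8. */
bool utf8_valid_scalar(const uint8_t *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        uint8_t b = s[i];
        uint32_t cp;   /* code point being decoded */
        uint32_t min;  /* smallest code point legal for this sequence length */
        size_t extra;  /* number of continuation bytes that must follow */

        if (b < 0x80) {            /* ASCII: one byte, always valid */
            i++;
            continue;
        } else if ((b & 0xE0) == 0xC0) {
            extra = 1; cp = b & 0x1F; min = 0x80;
        } else if ((b & 0xF0) == 0xE0) {
            extra = 2; cp = b & 0x0F; min = 0x800;
        } else if ((b & 0xF8) == 0xF0) {
            extra = 3; cp = b & 0x07; min = 0x10000;
        } else {
            return false;          /* stray continuation byte or 0xF8..0xFF */
        }

        if (len - i <= extra)      /* sequence is cut off by end of input */
            return false;

        for (size_t k = 1; k <= extra; k++) {
            uint8_t c = s[i + k];
            if ((c & 0xC0) != 0x80)   /* must look like 10xxxxxx */
                return false;
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < min) return false;                      /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  /* UTF-16 surrogate */
        if (cp > 0x10FFFF) return false;                 /* beyond Unicode */

        i += extra + 1;
    }
    return true;
}

/* A few spot checks showing how the helper is meant to be called. */
void utf8_self_check(void)
{
    assert(utf8_valid_scalar((const uint8_t *)"hello", 5));
    assert(utf8_valid_scalar((const uint8_t *)"\xE2\x82\xAC", 3));   /* U+20AC */
    assert(!utf8_valid_scalar((const uint8_t *)"\xC0\x80", 2));      /* overlong */
    assert(!utf8_valid_scalar((const uint8_t *)"\xE2\x82", 2));      /* truncated */
}
```

A byte-at-a-time loop like this is the kind of baseline that vectorized validators are designed to outperform, and it can be handy as an oracle when testing a faster implementation.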

Benchmark results in cycles/byte (lower is better):

Validator                          (K UTF-8)  (K ASCII)  16M UTF-8  16M ASCII
validate_utf8_fast_avx             0.410      0.410      0.496      0.429
validate_utf8_fast_avx_asciipath   0.436      0.074      0.457      0.156
z_validate_utf8_avx2               0.264      0.079      0.290      0.160
z_validate_utf8_sse4               0.568      0.163      0.596      0.202
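The "roughly 1.5-1.7x" figure can be checked directly against the table by dividing the validate_utf8_fast_avx cost by the z_validate_utf8_avx2 cost for each UTF-8 column. The helper names speedup and print_utf8_speedups below are made up for this illustration, and the only inputs are the four table values.

```c
#include <assert.h>
#include <stdio.h>

/* Speedup of a candidate over a baseline when both are given in
 * cycles/byte (lower is better). Hypothetical helper for illustration. */
double speedup(double baseline_cpb, double candidate_cpb)
{
    assert(candidate_cpb > 0.0);
    return baseline_cpb / candidate_cpb;
}

/* Prints the two UTF-8 ratios using the values from the table above. */
void print_utf8_speedups(void)
{
    /* First UTF-8 column: validate_utf8_fast_avx vs z_validate_utf8_avx2 */
    printf("first UTF-8 column: %.2fx\n", speedup(0.410, 0.264));
    /* 16M UTF-8 column: same two validators */
    printf("16M UTF-8 column:   %.2fx\n", speedup(0.496, 0.290));
}
```

Calling print_utf8_speedups() reports about 1.55x and 1.71x, which matches the range quoted in the introduction.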

  
