
Obscure Features of JPEG (2011)

                         

Posted in November 2011 by Chris Hodapp

(This is a modified version of what I wrote up at work when I saw that progressive JPEGs could be nearly a drop-in replacement that offered some additional functionality and ran some tests on this.)

The long-established JPEG standard contains a significant number of features that are seldom used and sometimes virtually unknown. This is in spite of the widespread use of JPEG and the fact that every JPEG decoder I tested was compatible with all of the features I will discuss, probably because IJG libjpeg runs basically everywhere.

One of the better-known features, though still obscure, is that of progressive JPEGs. Progressive JPEGs contain the data in a different order than more standard (sequential) JPEGs, enabling the JPEG decoder to produce a full-sized image from just the beginning portion of a file (at a reduced detail level) and then refine those details as more of the file is available.

This was originally made for web usage over slow connections. While it is rarely used, most modern browsers support this incremental display and refinement of the image, and even applications that do not attempt it are still able to read the full image.

Interestingly, since the only real difference between a progressive JPEG and a sequential one is that the coefficients come in a different order, the conversion between progressive and sequential is lossless. Various lossless compression steps are applied to these coefficients, and as this reordering may permit a more efficient encoding, a progressive JPEG is often smaller than a sequential JPEG expressing an identical image.

One command I’ve used pretty frequently before posting a large photo online is:

    jpegtran -optimize -progressive -copy all input.jpg > output.jpg

This losslessly converts input.jpg to a progressive version and optimizes it as well. (jpegtran can do some other things losslessly as well: flipping, cropping, rotating, transposing, converting to greyscale.)
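For example (these are standard jpegtran flags; the file names are arbitrary, and lossless operations such as cropping are constrained to MCU boundaries):

    jpegtran -rotate 90 -copy all input.jpg > rotated.jpg
    jpegtran -flip horizontal -copy all input.jpg > flipped.jpg
    jpegtran -grayscale -copy all input.jpg > grey.jpg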

More obscure still is that progressive JPEG is a particular case of something more general: a multi-scan JPEG.

Standard JPEGs are single-scan sequential: all of the data is stored top-to-bottom, with all of the color components and coefficients together and in full. This includes, per MCU (minimum coded unit, an 8×8 pixel square or some small multiple of it), 64 coefficients for each of the 3 color components (usually Y, Cb, Cr). The coefficients come from an 8×8 DCT transform matrix, but they are stored in a zigzag order that preserves locality with regard to spatial frequency, as this permits more efficient encoding. The first coefficient (0) is referred to as the DC coefficient; the others (1–63) are AC coefficients.
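For reference, the zigzag sequence can be written as a small C table (my own illustration, not something you need for scan scripts): entry k gives the row-major position within the 8×8 block of the k-th coefficient.

    /* Zigzag order: the k-th coded coefficient of an 8x8 block sits at
     * row-major index zigzag_to_natural[k] (row*8 + col). Index 0 is the
     * DC term; 1-63 are the AC terms in roughly increasing frequency. */
    static const int zigzag_to_natural[64] = {
         0,  1,  8, 16,  9,  2,  3, 10,
        17, 24, 32, 25, 18, 11,  4,  5,
        12, 19, 26, 33, 40, 48, 41, 34,
        27, 20, 13,  6,  7, 14, 21, 28,
        35, 42, 49, 56, 57, 50, 43, 36,
        29, 22, 15, 23, 30, 37, 44, 51,
        58, 59, 52, 45, 38, 31, 39, 46,
        53, 60, 61, 54, 47, 55, 62, 63
    };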

Multi-scan JPEG permits this information to be packed in a fairly arbitrary way (though with some restrictions). While information is still stored top-to-bottom, it permits only some of the data in each MCU to be given, with the intention being that later scans will provide other parts of this data (hence the name multi-scan). More specifically:

  • The three color components (Y for grayscale, and Cb/Cr for color) may be split up between scans.
  • The coefficients in each component may be split up. (Two restrictions apply here for any given scan: the DC coefficient must always precede the AC coefficients, and if only AC coefficients are sent, then they may only be for one single color component.)
     
  • Some bits of the coefficients may be split up. (This, too, is subject to a restriction, not to a given scan but to the entire image: You must specify some of the DC bits. AC bits are all optional. Information on how many bits are actually used here is almost nonexistent.)

    In other words:

  • You may leave color information out to be added later.
  • You may let spatial detail be only a low-frequency approximation to be refined later with higher-frequency coefficients. (As far as I can tell, you cannot consistently reduce grayscale detail beyond the 8×8 pixel MCU while still recovering that detail in later scans.)
  • You may leave grayscale and color values at a lower precision (i.e. coarsely quantized) to have more precision added later.
  • You may do all of the above in almost any order and almost any number of steps.

Your libjpeg distribution probably contains something called wizard.txt someplace (say, /usr/share/doc/libjpeg8a or /usr/share/doc/libjpeg-progs); I don't know if an online copy is readily available. I’ll leave detailed explanation of a scan script to the “Multiple Scan / Progression Control” section of that document, but note that:

  • Each non-commented line corresponds to one scan.
  • The first section, prior to the colon, specifies which plane(s) to send: Y (0), Cb (1), or Cr (2).
  • The two fields immediately after the colon give the first and last indices of the coefficients from that plane that should be in the scan. Those indices run from 0 to 63 in zigzag order; 0 = DC, 1–63 = AC in increasing frequency.
  • The two fields immediately after those specify which bits of those coefficients this scan contains.

According to that document, the standard script for a progressive JPEG is this:

    # Initial DC scan for Y, Cb, Cr (lowest bit not sent)
    0,1,2: 0-0,  0, 1 ;
    # First AC scan: send first 5 Y AC coefficients, minus 2 lowest bits:
    0:     1-5,  0, 2 ;
    # Send all Cr, Cb AC coefficients, minus lowest bit:
    # (chroma data is usually too small to be worth subdividing further;
    #  but note we send Cr first since eye is least sensitive to Cb)
    2:     1-63, 0, 1 ;
    1:     1-63, 0, 1 ;
    # Send remaining Y AC coefficients, minus 2 lowest bits:
    0:     6-63, 0, 2 ;
    # Send next-to-lowest bit of all Y AC coefficients:
    0:     1-63, 2, 1 ;
    # At this point we've sent all but the lowest bit of all coefficients.
    # Send lowest bit of DC coefficients
    0,1,2: 0-0,  1, 0 ;
    # Send lowest bit of AC coefficients
    2:     1-63, 1, 0 ;
    1:     1-63, 1, 0 ;
    # Y AC lowest bit scan is last; it's usually the largest scan
    0:     1-63, 1, 0 ;

And for a standard, sequential JPEG it is:
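    # Single interleaved scan: all three components, coefficients 0-63, all bits
    0,1,2: 0-63, 0, 0 ;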

In this image I used a custom scan script that sent all of the Y data, then all Cb, then all Cr. The scan script was just this:
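    # All of Y, then all of Cb, then all of Cr, each in full:
    0: 0-63, 0, 0 ;
    1: 0-63, 0, 0 ;
    2: 0-63, 0, 0 ;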

While not every browser may do this right, most browsers will render the greyscale as it comes in, then add color to it one plane at a time. It’ll be more obvious over a slower connection; I purposely left the image fairly large so that the transfer would be slower. You’ll note as well that the greyscale arrives much more slowly than the color.

The cjpeg tool from libjpeg will (among other things) create a JPEG using a custom scan script. Combined with ImageMagick, I used a command like:

    convert input.png ppm:- | cjpeg -quality 95 -optimize -scans scan_script > output.jpg

Or if the input is already a JPEG, jpegtran will do the same thing, losslessly (as it's merely reordering coefficients):

    jpegtran -scans scan_script input.jpg > output.jpg

libjpeg has some interesting features as well. Rather than decoding an entire full-resolution JPEG and then scaling it down (a common use case when generating thumbnails), you may set up the decoder so that it simply does the reduction for you while decoding. This takes less time and uses less memory compared with getting the full decompressed version and resampling afterward.
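A minimal sketch of that (error handling and file setup omitted; the 1/8 factor is just an example, and classic libjpeg only honors denominators of 1, 2, 4 and 8):

    // Ask libjpeg to produce a 1/8-scale image directly while decoding.
    struct jpeg_decompress_struct cinfo;
    struct jpeg_error_mgr jerr;
    cinfo.err = jpeg_std_error(&jerr);
    jpeg_create_decompress(&cinfo);
    jpeg_stdio_src(&cinfo, infile);   // infile: an already-opened FILE * for the JPEG
    jpeg_read_header(&cinfo, TRUE);
    cinfo.scale_num = 1;              // output size = scale_num/scale_denom of the original
    cinfo.scale_denom = 8;
    jpeg_start_decompress(&cinfo);
    // cinfo.output_width / cinfo.output_height now hold the reduced dimensions;
    // read rows with jpeg_read_scanlines() as usual.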

The C code below, based loosely on example.c from libjpeg, will split up a multi-scan JPEG into a series of numbered PPM files, each one containing a scan. Look for cinfo.scale_num to use the fast scaling features mentioned in the last paragraph, and note that the code only processes as much of the input JPEG as it needs for the next scan. (It needs nothing special to build besides a functioning libjpeg; gcc -ljpeg -o jpeg_split.o jpeg_split.c works for me.)
    // jpeg_split.c: Write each scan from a multi-scan / progressive JPEG.
    // This is based loosely on example.c from libjpeg, and should require only
    // libjpeg as a dependency (e.g. gcc -ljpeg -o jpeg_split.o jpeg_split.c).
    #include <stdio.h>
    #include <stdlib.h>
    #include <setjmp.h>
    #include "jpeglib.h"

    void read_scan(struct jpeg_decompress_struct *cinfo,
                   JSAMPARRAY buffer,
                   char *base_output);
    int read_JPEG_file(char *filename, int scanNumber, char *base_output);

    int main(int argc, char *argv[]) {
        if (argc < 3) {
            printf("Usage: %s input.jpg output_base\n", argv[0]);
            printf("This reads in the progressive/multi-scan JPEG given and writes out\n");
            printf("each scan to a separate PPM file, named with the scan number.\n");
            return 1;
        }
        char *fname = argv[1];
        char *out_base = argv[2];
        read_JPEG_file(fname, 1, out_base);
        return 0;
    }

    struct error_mgr {
        struct jpeg_error_mgr pub;
        jmp_buf setjmp_buffer;
    };

    METHODDEF(void) error_exit(j_common_ptr cinfo) {
        struct error_mgr *err = (struct error_mgr *) cinfo->err;
        (*cinfo->err->output_message)(cinfo);
        longjmp(err->setjmp_buffer, 1);
    }

    int read_JPEG_file(char *filename, int scanNumber, char *base_output) {
        struct jpeg_decompress_struct cinfo;
        struct error_mgr jerr;
        FILE *infile;       /* source file */
        JSAMPARRAY buffer;  /* output row buffer */
        int row_stride;     /* physical row width in output buffer */

        if ((infile = fopen(filename, "rb")) == NULL) {
            fprintf(stderr, "can't open %s\n", filename);
            return 0;
        }

        // Set up the normal JPEG error routines, then override error_exit.
        cinfo.err = jpeg_std_error(&jerr.pub);
        jerr.pub.error_exit = error_exit;
        // Establish the setjmp return context for error_exit to use:
        if (setjmp(jerr.setjmp_buffer)) {
            jpeg_destroy_decompress(&cinfo);
            fclose(infile);
            return 0;
        }
        jpeg_create_decompress(&cinfo);
        jpeg_stdio_src(&cinfo, infile);
        (void) jpeg_read_header(&cinfo, TRUE);

        // Set some decompression parameters.
        // Incremental (scan-by-scan) reading requires this flag:
        cinfo.buffered_image = TRUE;
        // To perform fast scaling in the output, set these:
        cinfo.scale_num = 1;
        cinfo.scale_denom = 1;

        // Decompression begins...
        (void) jpeg_start_decompress(&cinfo);
        printf("JPEG is %s-scan\n",
               jpeg_has_multiple_scans(&cinfo) ? "multi" : "single");
        printf("Outputting %ux%u\n", cinfo.output_width, cinfo.output_height);

        // row_stride = JSAMPLEs per row in output buffer
        row_stride = cinfo.output_width * cinfo.output_components;
        // Make a one-row-high sample array that will go away when done with image.
        buffer = (*cinfo.mem->alloc_sarray)
            ((j_common_ptr) &cinfo, JPOOL_IMAGE, row_stride, 1);

        // Start actually handling image data!
        while (!jpeg_input_complete(&cinfo)) {
            read_scan(&cinfo, buffer, base_output);
        }

        // Clean up.
        (void) jpeg_finish_decompress(&cinfo);
        jpeg_destroy_decompress(&cinfo);
        fclose(infile);

        if (jerr.pub.num_warnings) {
            printf("libjpeg indicates %ld warnings\n", jerr.pub.num_warnings);
        }
        return 1;
    }

    void read_scan(struct jpeg_decompress_struct *cinfo,
                   JSAMPARRAY buffer,
                   char *base_output) {
        char out_name[1024];
        FILE *outfile = NULL;
        int scan_num = 0;

        scan_num = cinfo->input_scan_number;
        jpeg_start_output(cinfo, scan_num);

        // Read up to the next scan.
        int status;
        do {
            status = jpeg_consume_input(cinfo);
        } while (status != JPEG_REACHED_SOS && status != JPEG_REACHED_EOI);

        // Construct a filename & write the PPM image header.
        snprintf(out_name, sizeof(out_name), "%s%i.ppm", base_output, scan_num);
        if ((outfile = fopen(out_name, "wb")) == NULL) {
            fprintf(stderr, "Can't open %s for writing!\n", out_name);
            return;
        }
        fprintf(outfile, "P6\n%u %u\n255\n", cinfo->output_width, cinfo->output_height);

        // Read each scanline into 'buffer' and write it to the PPM.
        // (Note that libjpeg updates cinfo->output_scanline automatically.)
        while (cinfo->output_scanline < cinfo->output_height) {
            jpeg_read_scanlines(cinfo, buffer, 1);
            fwrite(buffer[0], cinfo->output_components, cinfo->output_width, outfile);
        }

        jpeg_finish_output(cinfo);
        fclose(outfile);
    }
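Built and run as above, a session looks roughly like this (the image dimensions and number of scans are placeholders; the output file names follow the snprintf pattern in read_scan):

    $ ./jpeg_split.o progressive.jpg scan_
    JPEG is multi-scan
    Outputting 800x600
    $ ls scan_*.ppm
    scan_1.ppm  scan_2.ppm  scan_3.ppm  ...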

Here are all scans from a standard progressive JPEG, separated out with the example code:
