in ,

# Ask HN: I just wrote an O (N) diffing algorithm – what am I missing ?, Hacker News

Hey folks,

I’ve been building a rendering engine for a code editor the past couple of days. Rendering huge chunks of highlighted syntax can get laggy. It’s not worth switching to React at this stage, so I wanted to just write a quick diff algorithm that would selectively update only changed lines.

With a link to this paper, the initial Git diff implementation: http://www.xmailserver.org/diff2.pdf

I couldn’t find the PDF to start with, but read “edit graph” and immediately thought – why don ‘ t I just use a hashtable to store lines from LEFT_TEXT and references to where they are, then iterate over RIGHT_TEXT and return matches one by one, also making sure that I keep track of the last match to prevent jumbling?

The algorithm I produced is only a few lines and seems accurate. It’s O (N) time complexity, whereas the paper above gives a best case of O (ND) where D is minimum edit distance.

``function lineDiff (left, right) {     left=left.split (' n');     right=right.split (' n');     let lookup={};     // Store line numbers from LEFT in a lookup table     left.forEach (function (line, i) {       lookup [line]=lookup [line] || [];       lookup [line]. push (i);     });     // Last line we matched     var minLine=-1;     return right.map (function (line) {       lookup [line]=lookup [line] || [];       var lineNumber=-1;       if (lookup [line]. length) {         lineNumber=lookup [line]. shift ();         // Make sure we're looking ahead         if (lineNumber>minLine) {           minLine=lineNumber;         } else {           lineNumber=-1         }       }       return {         value: line,         from: lineNumber       };     });   }``