String Diff Utilities
Line-based and character-level text diffing utilities for comparing strings and highlighting differences.
These functions provide both line-level diffs (similar to git diff) and character-level highlighting for granular change visualization.
Importsā
import { computeTextDiff, getCharacterDifferences } from 'nhb-toolbox';
computeTextDiffā
Computes a line-based text diff between two strings using the Longest Common Subsequence (LCS) algorithm.
Function Signatureā
computeTextDiff(originalText: string, modifiedText: string): DiffResult
| Parameter | Type | Description |
|---|---|---|
originalText | string | The original (before) text |
modifiedText | string | The modified (after) text |
Returns: DiffResult containing the list of diff lines and summary statistics
Featuresā
- Lines are classified as
added,removed,modified, orunchanged - Detects and pairs similar
removedandaddedlines asmodifiedwhen similarity exceeds 60% - Similarity calculated using Levenshtein edit distance
- Line numbers are preserved (1-based indexing)
Examplesā
const result = computeTextDiff('hello\nworld', 'hello\nearth');
// {
// lines: [
// {
// type: 'unchanged',
// original: 'hello',
// modified: 'hello',
// originalLineNum: 1,
// modifiedLineNum: 1
// },
// { type: 'removed', original: 'world', originalLineNum: 2 },
// { type: 'added', modified: 'earth', modifiedLineNum: 2 }
// ],
// stats: {
// linesAdded: 1,
// linesRemoved: 1,
// linesChanged: 0,
// linesUnchanged: 1
// }
// }
const result = computeTextDiff('foo\nbar\nbaz', 'foo\nqux');
// {
// lines: [
// {
// type: 'unchanged',
// original: 'foo',
// modified: 'foo',
// originalLineNum: 1,
// modifiedLineNum: 1
// },
// { type: 'removed', original: 'bar', originalLineNum: 2 },
// { type: 'removed', original: 'baz', originalLineNum: 3 },
// { type: 'added', modified: 'qux', modifiedLineNum: 2 }
// ],
// stats: {
// linesAdded: 1,
// linesRemoved: 2,
// linesChanged: 0,
// linesUnchanged: 1
// }
// }
Algorithmā
- Splits input strings into lines
- Builds LCS table comparing lines exactly
- Backtracks to identify added, removed, and unchanged lines
- Post-processes to pair similar removed/added lines as
modifiedwhen similarity ā„ 0.6
Similarity Thresholdā
Lines are considered modified (rather than separate removed + added) when their character-level similarity score is ā„ 0.6. The similarity score is calculated as:
similarity = 1 - (editDistance / max(stringLengths))
Where editDistance is the Levenshtein distance between the two strings.
getCharacterDifferencesā
Highlights character-level differences between two strings using the LCS algorithm.
Function Signatureā
getCharacterDifferences(original: string, modified: string): CharDiffResult
| Parameter | Type | Description |
|---|---|---|
original | string | The original string to compare from |
modified | string | The modified string to compare to |
Returns: CharDiffResult with annotated character arrays for both strings
Featuresā
- Each character in both strings is annotated with a
highlightedflag highlighted: trueindicates the character differs from the other string- Handles empty strings gracefully
- Based on Longest Common Subsequence (LCS) algorithm
Examplesā
const diff = getCharacterDifferences('cat', 'car');
// {
// original: [
// { text: 'c', highlighted: false },
// { text: 'a', highlighted: false },
// { text: 't', highlighted: true }
// ],
// modified: [
// { text: 'c', highlighted: false },
// { text: 'a', highlighted: false },
// { text: 'r', highlighted: true }
// ]
// }
// When one string is empty, all characters in the other are highlighted
const diff = getCharacterDifferences('', 'hi');
// {
// original: [],
// modified: [
// { text: 'h', highlighted: true },
// { text: 'i', highlighted: true }
// ]
// }
const diff = getCharacterDifferences('hello world', 'hello earth');
// {
// original: [
// { text: 'h', highlighted: false },
// { text: 'e', highlighted: false },
// { text: 'l', highlighted: false },
// { text: 'l', highlighted: false },
// { text: 'o', highlighted: false },
// { text: ' ', highlighted: false },
// { text: 'w', highlighted: true },
// { text: 'o', highlighted: true },
// { text: 'r', highlighted: false },
// { text: 'l', highlighted: true },
// { text: 'd', highlighted: true }
// ],
// modified: [
// { text: 'h', highlighted: false },
// { text: 'e', highlighted: false },
// { text: 'l', highlighted: false },
// { text: 'l', highlighted: false },
// { text: 'o', highlighted: false },
// { text: ' ', highlighted: false },
// { text: 'e', highlighted: true },
// { text: 'a', highlighted: true },
// { text: 'r', highlighted: false },
// { text: 't', highlighted: true },
// { text: 'h', highlighted: true }
// ]
// }
Algorithmā
- Builds LCS table for character-level matching
- Backtracks to identify matched character indices
- Marks unmatched characters as
highlighted: true
Typesā
DiffResultā
The result of a line-level diff operation.
interface DiffResult {
/** An array of line differences */
lines: DiffLine[];
/** Summary statistics */
stats: {
/** Total number of lines that were added */
linesAdded: number;
/** Total number of lines that were removed */
linesRemoved: number;
/** Total number of lines that were modified (changed content) */
linesChanged: number;
/** Total number of lines that remained unchanged */
linesUnchanged: number;
};
}
DiffLineā
A single line difference between two strings.
interface DiffLine {
/** The type of difference */
type: 'added' | 'removed' | 'unchanged' | 'modified';
/** The content of the original line (undefined for added lines) */
original?: string;
/** The content of the modified line (undefined for removed lines) */
modified?: string;
/** The line number in the original string, 1-based (undefined for added lines) */
originalLineNum?: number;
/** The line number in the modified string, 1-based (undefined for removed lines) */
modifiedLineNum?: number;
}
CharDiffResultā
Result of a character-level diff.
interface CharDiffResult {
/** An array of characters from the original string, each with a highlighted flag */
original: HighlightedText[];
/** An array of characters from the modified string, each with a highlighted flag */
modified: HighlightedText[];
}
type HighlightedText = {
/** The character text */
text: string;
/** Whether this character differs from the other string */
highlighted: boolean;
};
Common Workflowsā
Displaying Line Diffā
const result = computeTextDiff(originalCode, modifiedCode);
for (const line of result.lines) {
switch (line.type) {
case 'added':
console.log(`+ ${line.modified}`);
break;
case 'removed':
console.log(`- ${line.original}`);
break;
case 'modified':
console.log(`- ${line.original}`);
console.log(`+ ${line.modified}`);
break;
case 'unchanged':
console.log(` ${line.original}`);
break;
}
}
Inline Character Highlightingā
const diff = getCharacterDifferences('old text', 'new text');
function renderHighlighted(chars: HighlightedText[]): string {
return chars
.map(char => char.highlighted ? `<mark>${char.text}</mark>` : char.text)
.join('');
}
const originalHtml = renderHighlighted(diff.original);
const modifiedHtml = renderHighlighted(diff.modified);
Comparing JSON Objectsā
const original = JSON.stringify({ name: 'John', age: 30 }, null, 2);
const modified = JSON.stringify({ name: 'Jane', age: 30 }, null, 2);
const diff = computeTextDiff(original, modified);
console.log(`Changed lines: ${diff.stats.linesChanged}`);
Change Summaryā
const diff = computeTextDiff(before, after);
const { linesAdded, linesRemoved, linesChanged, linesUnchanged } = diff.stats;
console.log(`Summary:
Added: ${linesAdded}
Removed: ${linesRemoved}
Modified: ${linesChanged}
Unchanged: ${linesUnchanged}
Total lines: ${linesAdded + linesRemoved + linesChanged + linesUnchanged}`);
Performance Notesā
- Line-based diff: O(mn) time complexity where m and n are line counts. Suitable for typical text files (up to thousands of lines).
- Character diff: O(n²) where n is string length. Best for short strings or individual lines.
- Modified line detection: Uses Levenshtein distance (O(mn) per candidate pair), only applied to adjacent removed/added line pairs.
- Pure implementation: No external dependencies, works in any JavaScript environment.