Here’s how trying to export images from a Word document led to a quest for data deduplication and classification using the shell. The images I wanted to export were diagrams drawn in Word itself, rather than PNG files¹. Because those doodle-shapes do not export to PNG well, I first copy-pasted them into PowerPoint to get the familiar “save as picture” context menu. But a couple of images were still deformed beyond recognition.