Tools

I’m a professor of anthropology and a writer. Sometimes, I write tools to help me write or do my research. Enjoy.

  • dejatext.py
    DejaText is a Python script for identifying duplicate and similar text in a directory of text or markdown files. It scans a directory of .txt or .md files, identifies duplicate and similar text segments, and produces organized reports for easy review. As part of my writing, I find it useful to go through a project and… Continue reading dejatext.py
  • pdf_splitr.py
    One problem with scanning books for academic purposes is that one often ends up with two pages side by side in a single PDF. This makes the file hard to OCR, read, annotate, or process. pdf_splitr is a (very) simple Python tool, which I ‘wrote’ with cursor.ai and Claude. It splits each page into left and right… Continue reading pdf_splitr.py
  • q_transcribe.py
    I want to introduce q_transcribe, a simple tool to transcribe images using QWEN 2 VL AI models. What did I do to write q_transcribe? I added some simple logic to a CLI wrapper that Andy Janco wrote to run QWEN 2 VL from the command line. q_transcribe can be used to transcribe typed and handwritten text… Continue reading q_transcribe.py
  • structur.py
    Structur is a simple, Python-based command-line tool to help extract and organize coded text from research notes. I’ve been using it for a year now, from the Finder. It’s useful to find the structure of longer pieces of text. I was inspired by John McPhee’s writing process, which he describes in Draft No. 4. Structur… Continue reading structur.py
  • wordwright.py
    When I was a graduate student, my supervisor sat me down one day and confessed that he used WordPerfect’s spell-checking window because it helped him find passive voice. He, like me, overused it. Since then, I’ve found many ways to automate the flagging of passive voice in my writing. I’ve written… Continue reading wordwright.py
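
To give a feel for the kind of scan dejatext.py is described as doing, here is a minimal sketch of exact-duplicate detection across .txt and .md files. It is not DejaText's actual code: the function names (split_sentences, normalize, find_duplicates), the sentence-splitting rule, and the normalization are all assumptions made for illustration, and the sketch only catches sentences that repeat exactly after lowercasing and whitespace cleanup, not "similar" text.

```python
import re
import tempfile
from collections import defaultdict
from pathlib import Path


def split_sentences(text):
    """Rough splitter (an assumption): break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def normalize(sentence):
    """Lowercase and collapse whitespace so trivial differences do not hide a match."""
    return re.sub(r"\s+", " ", sentence.lower()).strip()


def find_duplicates(directory, extensions=(".txt", ".md")):
    """Map each repeated sentence to the list of file names where it occurs."""
    seen = defaultdict(list)
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file() and path.suffix.lower() in extensions:
            text = path.read_text(encoding="utf-8", errors="replace")
            for sentence in split_sentences(text):
                seen[normalize(sentence)].append(path.name)
    # Keep only sentences seen more than once (across files or within one file).
    return {s: files for s, files in seen.items() if len(files) > 1}


if __name__ == "__main__":
    # Small self-contained demo using a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "a.txt").write_text(
            "The field site was quiet. I took notes.", encoding="utf-8"
        )
        Path(tmp, "b.md").write_text("I took notes. Then I left.", encoding="utf-8")
        for sentence, files in find_duplicates(tmp).items():
            print(f"{sentence!r} appears in {files}")
```

Finding near-duplicates would need a fuzzier comparison (for example, the standard library's difflib.SequenceMatcher), which this sketch leaves out to stay short.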