q_transcribe

I want to introduce q_transcribe, a simple tool for transcribing images using QWEN 2 VL AI models.

What did I do to write q_transcribe? I’ve added some simple logic to a CLI wrapper that Andy Janco wrote to run QWEN 2 VL from the CLI.

q_transcribe can be used to transcribe typed and handwritten text from any image.

How could it be used?

  • Transcribe handwritten notes. One of the methods I use is freewriting longhand. Notetaking is often the first step in my writing process. But, at times, it can feel like a slog to transcribe 20 pages of handwritten notes. Enter q_transcribe.
  • Transcribe handwritten archives. One of the projects I am working on with colleagues is an archival project in Colombia. We’re using QWEN 2B to extract text from images as part of a longer pipeline.

q_transcribe is a simplified version of our workflow. It works on a single image, a folder of images, or a folder of folders of images.

What is my contribution? I added logic to Andy Janco’s CLI wrapper around QWEN 2 VL’s sample code. My logic handles JPG, JPEG, and PNG files, sorts them, skips files that have already been transcribed, and chooses among CUDA (Nvidia GPU), MPS (Apple Silicon GPU), and CPU.
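As a rough illustration of the file handling described above, here is a minimal sketch. It is not the code in the repository: the function names `find_images` and `pending_images` are hypothetical, and the assumption that each transcription is saved next to its image with a `.md` extension is mine.

```python
from pathlib import Path

# Extensions named in the text; matching is case-insensitive.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_images(source):
    """Return sorted image paths for one image, a folder, or nested folders."""
    source = Path(source)
    if source.is_file():
        return [source] if source.suffix.lower() in IMAGE_EXTENSIONS else []
    # rglob walks the folder and every folder below it.
    return sorted(
        p for p in source.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def pending_images(source):
    """Yield (image, output) pairs that still need to be transcribed."""
    for image in find_images(source):
        # Assumption: output sits beside the image, e.g. page1.jpg -> page1.md
        output = image.with_suffix(".md")
        if output.exists():
            continue  # already transcribed, so skip it
        yield image, output
```

Calling `list(pending_images("images"))` would then list only the images that do not yet have a transcription, which is what makes it cheap to rerun an interrupted batch.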

In my testing, it works with QWEN 2B on my M1 MacBook Pro with 16 GB of RAM, and on a https://lightning.ai server, which offers free access to a GPU for researchers.
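The fallback from CUDA to MPS to CPU is what lets the same script run on both of those machines. A minimal sketch of that choice follows, assuming PyTorch is the backend. The function names `pick_device` and `detect_device` are hypothetical and not taken from the repository; `torch.cuda.is_available()` and `torch.backends.mps.is_available()` are the standard PyTorch checks.

```python
def pick_device(cuda_available, mps_available):
    """Prefer an Nvidia GPU, then an Apple Silicon GPU, then the CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device():
    """Ask PyTorch what is available; fall back to CPU if it is not installed."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    mps_available = mps is not None and mps.is_available()
    return pick_device(torch.cuda.is_available(), mps_available)
```

The returned string can be passed to `model.to(...)`. Keeping the preference order in a separate function makes it easy to check without any GPU present.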

To install, clone the repository from GitHub, install the necessary dependencies, and then run the script on a folder of images (here, a folder named images).

   git clone https://github.com/dtubb/q_transcribe.git
   cd q_transcribe
   pip install -r requirements.txt
   python q_transcribe.py images

Structur.py

Structur is a simple, Python-based command-line tool to help extract and organize coded text from research notes.

I’ve been using it for a year now, from the Finder. It’s useful for finding the structure of longer pieces of text.

I was inspired by John McPhee’s writing process, which he describes in Draft No. 4:

“Structur exploded my notes. It read the codes by which each note was given a destination or destinations (including the dustbin). It created and named as many new Kedit files as there were codes, and, of course, it preserved intact the original set. In my first I.B.M. computer, Structur took about four minutes to sift and separate fifty thousand words. My first computer cost five thousand dollars. I called it a five-thousand-dollar pair of scissors.”

Structur is my take on what McPhee describes.
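To make the idea concrete, here is a minimal sketch of sorting coded notes into one file per code, in the spirit of what McPhee describes. It is not Structur.py’s implementation. The `{{code}}` marker at the start of a line, the comma-separated list for several destinations, the "uncoded" bucket, and the function names are all assumptions made for illustration; the real tool’s syntax may differ.

```python
import re
from collections import defaultdict
from pathlib import Path

# Hypothetical syntax: "{{code}} text" or "{{code1, code2}} text".
CODE_PATTERN = re.compile(r"^\{\{(?P<codes>[^}]+)\}\}\s*(?P<text>.*)$")


def sort_notes(lines):
    """Group note text by code; lines without a code go to 'uncoded'."""
    buckets = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        match = CODE_PATTERN.match(line)
        if match is None:
            buckets["uncoded"].append(line)
            continue
        # A note may have more than one destination.
        for code in match.group("codes").split(","):
            buckets[code.strip()].append(match.group("text"))
    return dict(buckets)


def write_buckets(buckets, out_dir):
    """Write one text file per code; the original notes are left untouched."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for code, notes in buckets.items():
        (out_dir / f"{code}.txt").write_text("\n".join(notes) + "\n", encoding="utf-8")
```

A note tagged with two codes is copied into both files, which mirrors the "destination or destinations" in the passage quoted above.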

It is available on GitHub at https://github.com/dtubb/structur.

Cite as:

Tubb, Daniel. Structur.py. GitHub, 2024. https://github.com/dtubb/structur.