Writer’s Diary #53: Cleaning Up Repeated Text

What makes a book different from an article? What makes a book different from an essay? One thing: it’s long. In my case, what I thought would be one book seems to be becoming two or maybe three. Over the years I’ve been working on it, it has grown to almost a million words. This creates a writing challenge of its own: how to organize, cut, edit, and work with the detritus of various versions, drafts, initial starts, and notes on an idea. It just piles up in a chaotic, Escher-like jumble, so overwhelming that it leaves me lost. One problem is the order. One is duplication.

Order can be solved by coding: putting things about the same topic together. This solves an issue that comes from writing different iterations of these books for quite a long time. The ideas have sometimes flourished in different drafts, different versions of the same text, living in different places. I’ve long since lost track of where things are.

Pragmatically, they’re in a big Tinderbox file. But, how to turn that into a book?

When it comes to the final steps of editing, tightening, and turning rough ideas into book form, one step that is both banal and annoying is the general cutting of duplicate text.

There are different ways of doing this.

For a long time I’ve done it by hand. It’s not very efficient when dealing with so much text.

More recently, I have been using a script I wrote called structur.py. With structur.py I code text and send paragraphs and pieces of text to the right place. I put ideas together that are about similar topics. However, once structur.py has worked its magic, there is still a problem: I still have a lot of duplicate text. One way to deal with this is to keep coding until all the similar text is in the same place, then delete it. This is what I did. But it takes a long time just to read 10,000 words. Let alone 30,000 or 100,000, which is my problem. What I want is one copy of each piece of text, plus any good sentences that I might want to put together.

A few days ago, I put together a script called DejaText.py that flags files, paragraphs, sentences and even words that are duplicated. This is super useful for identifying where text is being reused in multiple places. The result was somewhat shocking. My drafts were full of duplicates.
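To make the idea of flagging concrete, here is a minimal sketch of paragraph-level duplicate detection. It is not the DejaText.py code itself; the function names, the blank-line paragraph split, and the sample file names are all assumptions made for illustration.

```python
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not hide a repeat."""
    return " ".join(text.lower().split())


def find_duplicate_paragraphs(files: dict[str, str]) -> dict[str, list[str]]:
    """Map each repeated paragraph to the names of the files it appears in.

    `files` maps a (hypothetical) file name to its full text.
    Paragraphs are assumed to be separated by blank lines.
    """
    seen = defaultdict(list)
    for name, text in files.items():
        for paragraph in text.split("\n\n"):
            key = normalize(paragraph)
            if key:
                seen[key].append(name)
    # Only paragraphs that occur more than once are worth flagging.
    return {para: names for para, names in seen.items() if len(names) > 1}


drafts = {
    "draft_a.txt": "Pencils are tools.\n\nA pencil is cheap.",
    "draft_b.txt": "A pencil is  cheap.\n\nErasers matter too.",
}
duplicates = find_duplicate_paragraphs(drafts)
print(duplicates)
```

The same approach would extend to sentences or words by changing how the text is split.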

This morning, as I was trying to code six files to deal with the most egregious case of duplicated paragraphs and sentences, I realised that I was coding the same text over and over again. Why not write a script that deletes the second and subsequent instance of repeated paragraphs and sentences? It’s dangerous. But, point it at a temporary folder, and it works. I call it dejatext_cleanup.py. It’s on GitHub.
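The core move, keeping the first instance and dropping every later one, can be sketched in a few lines. This is an illustration under my own assumptions, not the dejatext_cleanup.py code: `drop_repeats` is a hypothetical name, the sentence splitter is deliberately naive, and the sketch works on a string rather than on files, so nothing on disk is touched.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace before comparing."""
    return " ".join(text.lower().split())


def drop_repeats(text: str) -> str:
    """Keep the first instance of each paragraph and sentence; drop later ones."""
    seen_paragraphs = set()
    seen_sentences = set()
    kept_paragraphs = []
    for paragraph in text.split("\n\n"):
        para_key = normalize(paragraph)
        if not para_key or para_key in seen_paragraphs:
            continue  # empty, or a whole paragraph already seen
        seen_paragraphs.add(para_key)
        kept_sentences = []
        # Naive sentence split: break after ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
            key = normalize(sentence)
            if not key or key in seen_sentences:
                continue  # a sentence that already appeared earlier
            seen_sentences.add(key)
            kept_sentences.append(sentence)
        if kept_sentences:
            kept_paragraphs.append(" ".join(kept_sentences))
    return "\n\n".join(kept_paragraphs)


sample = (
    "Pencils are cheap. They are light.\n\n"
    "Pencils are cheap. They are light.\n\n"
    "Erasers matter. Pencils are cheap."
)
cleaned = drop_repeats(sample)
print(cleaned)
```

Because deletion is destructive, a sketch like this belongs on copies of the drafts, which is the point of the temporary folder.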

This combination of deleting duplicate text, and then coding with structur.py, allowed me to cut 35,000 words down to 13,000 words. The scripts will speed things up considerably. How to proceed? The way forward is to work at the level of what I’ve already organised. Take a section, say a section on pencils. Delete duplicate text. Code the remainder. Put the codes in order. Then, I’ll have a complete section on pencils. Revise it a couple times, and then I’ll have a draft.

These are power tools. They’re dangerous. But, they will speed up turning a folder of notes into a book.

Writer’s Diary #52: Repeated Words

Today was revision. Cutting and tightening a few sections on pencils. A week ago it was 6,000 words. Now, it’s 4,000. The task today was words and phrases that are superfluous. Overused. Bugaboos. It’s not that all repeats are bad. But, the trick is to be deliberate. My drafts are full of words and phrases reused, without deliberation. They can often be cut. The idea for this came to me from John McPhee’s Draft No. 4.

It is toward the end of the second draft, if I’m lucky, when the feeling comes over me that I have something I want to show to other people, something that seems to be working and is not going to go away. The feeling is more than welcome, but it is hardly euphoria. It’s just a new lease on life, a sense that I’m going to survive until the middle of next month. After reading the second draft aloud, and going through the piece for the third time (removing the tin horns and radio static that I heard while reading), I enclose words and phrases in pencilled boxes for Draft No. 4. If I enjoy anything in this process it is Draft No. 4. I go searching for replacements for the words in the boxes. The final adjustments may be small-scale, but they are large to me, and I love addressing them. You could call this the copy-editing phase if real copy editors were not out there in the future prepared to examine the piece. The basic thing I do with college students is pretend that I’m their editor and their copy editor. In preparation for conferences with them, I draw boxes around words or phrases in the pieces they write. I suggest to them that they might do this for themselves.

This is an early step. Cut early, then revise with care.

I use tools: a script that lists repeated words, and ProWritingAid, which has a tool to list repeated words and phrases.
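A script of this kind can be very small. The sketch below is not the one I use; `repeated_phrases` and its parameters are hypothetical names, and it simply counts every run of `size` consecutive words and reports those that occur at least `minimum` times.

```python
import re
from collections import Counter


def repeated_phrases(text: str, size: int = 1, minimum: int = 2) -> list[tuple[str, int]]:
    """List phrases of `size` words that occur at least `minimum` times."""
    # Treat straight and curly apostrophes as part of a word (it's, don’t).
    words = re.findall(r"[a-z'’]+", text.lower())
    phrases = (" ".join(words[i:i + size]) for i in range(len(words) - size + 1))
    counts = Counter(phrases)
    return [(phrase, n) for phrase, n in counts.most_common() if n >= minimum]


sample = "The pencil is a tool. The pencil is cheap."
for phrase, count in repeated_phrases(sample, size=2):
    print(count, phrase)
```

The list is only a set of candidates, the digital version of McPhee’s pencilled boxes. Deciding which repeats are deliberate is still done by hand.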

Writer’s Diary #51: NaNoWriMo

This is a quick update. It’s November—time for NaNoWriMo, time for ambitious goals, audacious writing, even if the words themselves at the end will be ever contingent. The words end up being imperfect. Often, in many cases, so imperfect as to be nearly useless. But the aim is to get something done.

On Friday, my task was a “Finishing Friday.” I sent an article that I’ve been fiddling with for a long time. Is it good? No. Is it perfect? No. Am I happy with it? Not really. But I sent it off to a journal. It will get reviewed, sent back, and then I’ll try again. Finishing Friday.

I think this month is going to be something similar: Finishing November. Or, of course, NaNoWriMo. My goal this year? “Finish the Goddamn Book Writing Month.”

I have a small writing group with some friends. One of them, along with me, is adopting some goals. She’s going to write the first three chapters of the book she’s working on.

My goal? I will revise and reorder and magpie my way into a complete draft of the book by the end of the month.

What does that mean? At first blush, that means 90,000 words.

90,000 words is a good-sized academic book. Mine will be ordered, broken into scenes, with narrative and argument woven together, in support of a makeshift way of proceeding.

I don’t mean 90,000 words, perfect. I don’t mean done, for good. I don’t mean tight as I can get it.

But, I do mean that I want to keep forward momentum, stop revising, and weave together a book of about 90,000 new words, organized, put into a temporary, contingent place, lightly polished enough that I can get a feel for the whole thing.

That’s the task.

What are the milestones?

Let’s see. First, a word budget can help.

90,000 words is a good-sized academic book, at least according to William Germano (2009, Getting It Published). Let’s break that down: take out 5,000 words for references and another 10,000 words for notes. That gets us to 75,000. Allow some padding both ways, and call it about 75,000 words. So, we’re left with getting a draft of about 75,000 words.

I’ve already got 5,000 words polished. So, that means my task for November is 70,000 words.

I have 10,000 words from Finishing Friday. So, that leaves 60,000 words.

60,000 words is the goal. It’s November 3rd today. Let’s break 60,000 into 27 days. My goal is to write, revise, or reorder 2,500 words or so each day. Seems audacious. But, it’s doable. I’ve done it before. Quite a few times actually.
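The word budget is simple arithmetic, and it can be checked in a few lines. The figures are the ones given above; the variable and function names are only for illustration.

```python
def daily_quota(remaining_words: int, days_left: int) -> float:
    """Words to write, revise, or reorder per day to hit the target."""
    return remaining_words / days_left


book_target = 90_000       # a good-sized academic book
references = 5_000         # set aside for references
notes = 10_000             # set aside for notes
already_polished = 5_000   # words already polished
finishing_friday = 10_000  # words from Finishing Friday
days_left = 27             # days remaining in November

draft_target = book_target - references - notes
remaining = draft_target - already_polished - finishing_friday
per_day = daily_quota(remaining, days_left)

print(f"Draft target: {draft_target:,} words")
print(f"Still to do: {remaining:,} words")
print(f"Per day: about {per_day:,.0f} words")
```

The strict quota comes out a little over 2,200 words a day, so a working goal of 2,500 or so leaves some slack for days that go badly.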

Crucially, my task right now is not to write 2,500 words. Or at least, not most of the time. Rather, it’s to code, reorganize, gather, bring together, cut up.

The model is more like making a patchwork quilt, than knitting something from scratch.

But, the method is like a magpie. Taking shiny things, bringing them together, attacking them, seeing how they work.

Crucially, at this stage of the game, it’s not much thinking. It’s manual work. Craft-like. Physical labour.

Wish me luck.

I’ll do updates, daily.

Writer’s Diary #50: Become a Writer

Writing is a craft; it takes practice. The evocative power of ethnography to convey understanding requires careful attention to words. Words matter. Notes, fragments and jottings are the opening gambit of anthropology. They give energy to anthropologists’ writing and give us subjects to write about. We often begin with stories, fragments, and ethnographic shorts. These are the building blocks of an anthropological enterprise. But, to write them requires reflecting on the writing process. So, why not experiment? Play. See what works. Read too. Keep reading. Write about what you read. Publish before you are ready. Write more. Rinse. Repeat. Writing is a practice.

Writing as a practice needs to be decolonised. Writing is often a black box, something that undergraduates and graduate students and professors aren’t really taught how to do.

As students, we engage with finished pieces and rarely see the messiness of the writing process. Writing is messy. It’s so, so messy. Write. Cut. Revise. Reorder. Move. Writing is done on the page. But, the work can be creative and playful. It need not be a slog. It can be a place to experiment. So, play. Play with free writing, play with genres, write essays, research notes, book reviews, articles, blog posts, social media posts. Experiment with writing short. Play with writing long. Write to think. Experiment with collaborative and group writing. Experiment in the classroom. See what works. See what doesn’t. Failure is fine. Try again.

What I am trying to say is: separate the writing from the anxieties and emotions of academic work. Ideas are important, sure. But, for me, they always emerge, truly, on the page. It’s on the page where I can make them do wonders. Where I can test them out. Feel them. Taste them. Let the words sing.

As Chilean poet Pablo Neruda wrote in his memoir:

… You can say anything you want, yessir, but it’s the words that sing, they soar and descend … I bow to them … I love them, I cling to them, I run them down, I bite into them, I melt them down… I love words so much… The unexpected ones… The ones I wait for greedily or stalk until, suddenly, they drop… Vowels I love… They glitter like colored stones, they leap like silver fish, they are foam, thread, metal, dew… I run after certain words … They are so beautiful that I want to fit them all into my poem … I catch them in mid-flight, as they buzz past, I trap them, clean them, peel them, I set myself in front of the dish, they have a crystalline texture to me, vibrant, ivory, vegetable, oily, like fruit, like algae, like agates, like olives… And then I stir them, I shake them, I drink them, I gulp them down, I mash them, I garnish them, I let them go… I leave them in my poem like stalactites, like slivers of polished wood, like coals, pickings from a shipwreck, gifts from the waves… Everything exists in the word… An idea goes through a complete change because one word shifted its place, or because another settled down like a spoiled little thing inside a phrase that was not expecting her but obeys her… They have shadow, transparence, weight, feathers, hair, and everything they gathered from so much rolling down the river, from so much wandering from country to country, from being roots so long … They are very ancient and very new… They live in the bier, hidden away, and in the budding flower …
— Pablo Neruda, “The Word,” in Memoirs. Penguin Books, 1978, p. 53.

Ideas that are unwritten cannot be made external, or thought about, or tested, or changed, or made to sing. Instead, they remain amorphous. Always a potential. Write them, reflect on them, rework them. Revise. Writing is fun. Revising is just as fun.

Don’t say, “What am I going to write about today?” Instead, go back. Ask, “What have I already written about?”

Try this exercise. Make a slip box. Put into it a collection of notes and essays and pieces of text you’ve already worked on.

Your task?

Engage with what you’ve already prepared. Reread it. Recycle. Work like British Marxist Eric Hobsbawm did. Develop a willingness to go back into the well of ideas that are yours and rework them. Revise them into new forms. Test them in public. Write lectures, op-eds, letters, articles, book chapters, and books from each other.

Writing becomes a task of preparing for publication, rather than starting from a blank page.

The trick? Not to draft or write every day, but to prepare for publication. Prepare short pieces and long pieces: book reviews, articles, essays. It need not be the biggest and most complicated piece; small pieces will do. What’s the smallest piece you could work on? Start with that.

Start with the words. See what comes from them.

I am working on a book that revisits particular moments from Colombia and which, for now, is also about writing. I say for now, because I keep changing the book. At times it’s both. At times, it’s neither.

But, its building blocks are ethnographic shorts. These are moments that let me build stories, place those stories in a wider context, and make more general claims. My method is inductive, relying on detailed description through elaboration and explanation, trying to make the particular speak to the universal. I boil moments down to their essence, return to field notes to expand and explain, develop one moment and link it to the next, and supplement my accounts with others’ reports: newspapers and archives and videos and research. I decide what to focus on and whose voices to hear.

The words matter too, though. Play them.
Punctuate dialogue, consider verb choice, and revise for active voice unless passive voice is preferred. The words matter. I use transitive verbs to drive description and animate the inanimate. Strive for clarity. Write then rewrite to assemble moments and analysis. Consider exposition, character, scenes, narrative voice, point of view, time and rhythm. I move around, change perspectives, take different approaches and revisit the same theme from different directions. I remove myself from the action or take someone else’s point of view. I adopt an omniscient perspective and choose my approach.

Each is an ethnographic choice, a way of writing to describe and make sense of moments. The raw material for this slip box is the field notes, written on the laptop or in one of the journals or notebooks, as the beginning of a long, slow, sustained process of my becoming a writer.

As a graduate student, in the field, in the Chocó, when there was electricity (carried by an unreliable power line through the jungle and across at least two rivers), I filled a database on my computer with field notes. When there was no electricity because of storms, rain and fallen branches, I wrote longhand with a pen and the light of a candle. Writing the notes was part of my learning to write. The book I am working on reflects on that learning because I am convinced that the evocative power of ethnography to give understanding requires careful attention to words.

Writing requires thinking, but it also requires playing, experimenting, and refining different techniques and processes to discover effective ways of communicating.

Writer’s Diary #49: Digital Workshops

At its best, my computer is not a distraction, but a place to work—a digital workshop. A text workshop, not so different from a carpenter’s workshop with its wood, chisels, drafting tables, power tools, planers, band saws, and jigsaws. In my digital workshop, many things are at hand.

Partly, I mean the storage—the hard drives and flash drives where I keep field notes, first drafts, projects in progress, publications, finished notes, video, film, maps, and photographs. Some of it comes from the computers, laptops, and iDevices I have used over the years: the tablets and readers. Much of it is the detritus accumulated over two decades as a student and then as an academic, stored in various folders. Mostly, I mean the tools. The tools of the word processors, screenwriting apps, mind-mapping apps, search tools, bibliographic managers, and search engines.

One of my favorite pieces of software is Eastgate Systems’ Tinderbox, which is, in many ways, a digital equivalent of an analogue notebook and a carpenter’s workshop and much more besides.

It’s a digital tool, and a place to work with text and to do things that were impossible before the digital age. To write, link, make maps, collect, edit, cut up, revise, reorder, outline, search, and much more. Mark Bernstein, the lead developer at Eastgate Systems, has offered regular updates for decades.

Tinderbox is a Swiss Army knife for notes, providing a single interface that suits the way I work.

It also has a powerful set of programming and automation tools that allow me to work with notes.

I think of it the same way John McPhee thinks of his tools.

John McPhee, one of the most prolific writers of long-form creative non-fiction, has an article in The New Yorker about his writing process, which became part of his book Draft No. 4. McPhee tells the story of lying on a picnic table with all his notes, research, interviews, and everything else in manila envelopes, distraught because he doesn’t know the structure. He says this is no way to write.

I agree with him, and indeed, structure is the hard part.

McPhee used to use analogue tools to find structure. But by the 1980s, he had adopted a computer and a text editor called Kedit. Kedit, from the Mansfield Software Group, is a full-screen text editor. McPhee describes how he moved chunks of text around using custom-built text macros to code, split up, and bring back together text. It’s something I’ve duplicated for my own work.

For me, Tinderbox is my computer. A lot of writing is rewriting and revising, linking and connecting, making connections, and undertaking an archaeology of your own ideas and notes. Tinderbox is, for me, a powerful tool for that.

It’s the heart of my digital workshop.

Still, at times the computer is a place of distraction. There are times when I sit down with nothing but a pen and write longhand for an hour to see what comes out.

Writer’s Diary #48: Writing is also thinking about writing

What did I do today?

I went for a walk, thought about the Makeshift book, and listened to the Unpublished podcast. When I wrote my first book, I listened to various podcasts on writing, but I had stopped for a few years. Today, I listened to an interview with Devon Price. It was really good.

While Price quoted from Paul J. Silvia’s How to Write a Lot, their advice was more generally about deconstructing the Protestant work ethic. So, I liked it. All of this, of course, was relevant both to the book and to the process of writing it.

While this wasn’t really the day’s writing I had planned, the idea of sitting down to write at seven o’clock at night just seems exhausting to my 42-year-old self in a way it perhaps didn’t to my 32-year-old self. A walk with the dog and the moon and a podcast, much more doable.

So, I’m going to call this a win—not a word written today. And it was glorious. Tomorrow, I’ll do a couple of hours before reading for class. Gently, but not rushed.

Writer’s Diary #47: Reading

On the principle that I should write only when I feel like it, I didn’t think I would write today. Too much family, too many mini-crises of my own and others’ making, and too much sludge work to do. But, I was re-reading Friction for class. It’s so good, and the opening gave me some inspiration. I’m sure I’ll revise it a lot, but it’s the start of an idea.

Reading is central to writing.

I knew that, you did too. But, sometimes we forget.

Writer’s Diary #46: Structur.py

Today, I worked on coding a section for the introduction to Makeshift. It was about 3,000 words, very disorganized, repetitive, and in need of cutting and more focus.

I followed the process John McPhee described in Draft No. 4:

Structur exploded my notes. It read the codes by which each note was given a destination or destinations (including the dustbin). It created and named as many new Kedit files as there were codes, and, of course, it preserved intact the original set. In my first I.B.M. computer, Structur took about four minutes to sift and separate fifty thousand words. My first computer cost five thousand dollars. I called it a five-thousand-dollar pair of scissors (John McPhee, Draft No. 4, “Chapter Structure”).

I wrote a script six months ago that extracts coded text from a folder of text files.

I’ve been using the script for months now. It splits a text file into small chunks, based on codes I added. It’s almost a manual process—not too much thought. I coded 3,000 words, split them into 10 files, put them in order, and started revising them. Much easier.
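The splitting step can be sketched roughly as follows. This is not the structur.py code, and the code syntax is an assumption made for the example: here a line such as `{{pencils}}` marks where a chunk begins, and everything up to the next marker is sent to that code.

```python
import re
from collections import defaultdict

# Hypothetical code syntax: a line containing only {{code-name}}.
CODE_LINE = re.compile(r"^\{\{(?P<code>[\w-]+)\}\}\s*$")


def split_by_code(text: str) -> dict[str, str]:
    """Gather the text that follows each code marker, keyed by code."""
    chunks = defaultdict(list)
    current = "uncoded"  # anything before the first marker lands here
    for line in text.splitlines():
        match = CODE_LINE.match(line)
        if match:
            current = match.group("code")
        else:
            chunks[current].append(line)
    gathered = {code: "\n".join(lines).strip() for code, lines in chunks.items()}
    # Drop codes that ended up with no text at all.
    return {code: body for code, body in gathered.items() if body}


notes = (
    "{{pencils}}\nPencils are cheap.\n\n"
    "{{erasers}}\nErasers matter.\n\n"
    "{{pencils}}\nPencils are light."
)
sections = split_by_code(notes)
for code, body in sections.items():
    print(f"--- {code} ---\n{body}\n")
```

Each entry could then be written out to its own file, one per code, leaving the original untouched, which is how McPhee describes Structur preserving the original set.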

When I ran out of steam, I posted structur.py to GitHub. It’s the first script I’ve ever posted publicly. Perhaps someone will find it useful.

Writer’s Diary #45: Writing on the Weekend

An update on progress today. First, it’s Saturday. It took some effort to get started. Weekend. But as soon as I got going, it’s been steady. Nice. Moving around the document in a bit of a flow state. Picking away at different pieces, listening to Belle and Sebastian. Why write on the weekend? Why not take a break? I know where I am going, and I am not willing to give up the momentum every week.

What did I do? I picked away at the front matter. I adjusted the title and edited a blurb (371 words). Mostly, I edited a piece on makeshift (571 words). Then, I got stuck into some notes riffing for the next section, on the epigraph I want to use from Pablo Neruda, which unexpectedly is a way to get into writing about the Orkney Islands, where I was first exposed to Latin America.

This will allow me to get into a bit of an autobiography that I wrote years ago about how I ended up working in Colombia. However, it’s 3,000 words and ought to be about 1,000.

I’m not sure digressing to Scotland and autobiography makes sense, but I read somewhere “Don’t be afraid to digress.”

But, since the book is about tools and a tool we all use to write is a computer, getting to Scotland for a little will let me spend time on my first computer, and learning how to write. It would beep at every spelling error, and had a wonderful pixelated font, not unlike the one I am writing on now.

Writer’s Diary #44: Departure Mono

For the last few weeks, I’ve been drafting my book Makeshift in Tinderbox using the fantastic Departure Mono font. It is a monospaced pixel font, reminiscent of the fonts of the 1990s command-line interfaces I learned to type on, and of the graphical user interface of the Atari ST, which I learned to write on. It’s a glorious, free font. Kudos to its creator, Helena Zhang.