Blog: November 2020

Most of these posts were originally posted somewhere else and link to the originals. While this blog is not set up for comments, the original locations generally are, and I welcome comments there. Sorry for the inconvenience.

Our legacies are not always what we think they will be

In the mid-80s, in my first full-time position after college, I worked for a now-defunct software company doing artificial intelligence, specifically natural-language processing. The most significant project I worked on while there was a text categorization system. I was the tech lead (this was 1987ish). The client was Reuters, who at the time had literal rooms full of people whose job was to skim news stories coming over the wire, attach categories to them, and send them back out quickly. Our job was to automate that -- or, more realistically, to automate the parts that machines could do and send a much smaller set of "don't know" cases to humans. I'm writing this from memory; it's been more than 30 years and details are fuzzy.

I left that company and went on to do other things. I was vaguely aware that, at some point, the corpus of news stories we used for training data had been released publicly, by agreement between Reuters and my then-employer. I wasn't a researcher, wasn't in the NLP business any more, and lost touch. Technology moves on, and I figured our little project had long since faded into obscurity.

Tonight I got email with a question about that data set. My name is in the README file as one of the original compilers, and somebody tracked me down.

Somebody still cares about that data set.

I Googled it. Our data set was popular for close to a decade, during which time people improved the formatting (SGML, baby!) and cleaned up some other things. It spawned a child -- the original either had, or had acquired, some duplicate entries, and the new one removed them. (The question I got was actually about the child data set.) And now I'm curious about the question I was asked too, because I either don't know or don't remember how it got that way.


Odds and ends

I haven't been posting regularly. Oops.

I've been baking bread about once a week. This past week I finally scored some rye flour (that was not exorbitantly priced), so I made a rye sourdough for the first time. I think I prefer less molasses than this recipe called for, so I'll adjust that next time or try a different recipe. The bread is tasty, aside from the molasses overwhelming the caraway. Most "rye bread" recipes I've seen use rye for only one third of the flour, which sent me searching for "all rye" rye bread, which apparently works and tastes good but might not rise as much? I'll probably try it at some point, especially since I had to buy four (small) bags of rye flour to get it.

Dani and I play board games every Shabbat now, and occasionally we have two other friends (who are also careful, and I guess this is a "pod"?) over to play. We play Pandemic in every session because, well, pandemic. Yesterday we pulled out Kings and Things, a game we all had vague memories of, and by the end had concluded that while it's appealing it's also kind of tedious and maybe sort of a shorter Titan, a game I like in principle but dislike actually playing. Ok, now we've refreshed our memories...

A friend has a game called McMulti, which is an economic game (oil/gas theme)... in German. There are lots of places where text matters, so when we've played we've used cheat sheets since none of us read German. We recently became aware of an English-language derivative, called Crude, and got a copy. They've changed some of the mechanics and made one really annoying change to how the board is laid out, but other changes are positive and the game's a little faster. I like it, but am tempted to figure out how to print my own board. The game is really strongly designed for four players, but there are rules for a two-player version, which Dani and I have played once and which seems to work ok.

Codidact, the project that consumes most of my spare time, is in the process of incorporating as a non-profit. We've got our lawyer on our Discord server and are having conversations about incorporation documents via Google Docs comments. It looks like we will be able to clear an important hurdle soon. Neat!

On the project front, I'm not writing code -- I keep feeling like I should learn Ruby and the dev environment so I can help, then concluding that I probably won't be helping because I'd be taking time and attention from the developers who are actually being productive. But I've taken over bug-wrangling -- some analysis and testing, clarifying vague reports, and, especially, triaging. I was surprised to find that GitHub counts filing issues as contributions. I think that's new?

We just had our first birthday, counting from when the project founder set up a Discord server to talk about maybe building an alternative to Somewhere Else. We've still got a lot of work ahead of us, both technical and community development, but I'm pleased with where we are.

I've been reading a lot of fiction, a mix of short stories, novellas, and novels, many through the BookFunnel network (and also StoryBundle). I'm "meeting" a lot of authors I didn't previously know. I should really write a separate post about that.


I am relieved that my state came through in the end, and hopeful about the few other ones that are still outstanding.

I am very disappointed by how close that was. We've known what was inside the package for four years, and 70 million people, give or take, voted for a bigoted, bullying, self-absorbed fascist. I'm very glad Biden/Harris won, but with the margins this close, repairing the damage of the last four years is going to be hard. I'm glad to see they're losing no time on planning reversals of the things that can be reversed. Live by the executive order, die by the executive order -- Trumpists have no grounds for complaining there.

And, of course, he's not going to concede, let alone work constructively with a transition team. He's going to do whatever he can to trash the country on the way out. It's going to take a lot of work from a lot of people to build better relationships, better national cooperation if not actual unity. It's going to take more than one term of office, which we all need to remember in four years.

Healing is going to be hard, but the alternative is worse. I hope we can do something about the extreme polarization we're currently living with.

Recommended reading: what hurts about this election by "hudebnik".


The temperature tonight is supposed to be below freezing, so today I did a final harvest. There are a few small green tomatoes that would need rather a while to grow and then ripen, and I don't think I can keep the plant warm enough for long enough, so I picked everything that was larger even though it was still green, and I'll see if they ripen indoors. I've picked tomatoes before when they were orange but not fully red (to beat the critters to them), and I've had the occasional green one ripen on the windowsill. Today's are in a brown paper bag with a sacrificial apple. Even if I lose these last couple dozen, I had a pretty good bounty for the year, as best I can tell having never done this before.

The last of the rosemary and basil are currently drying. I had two different rosemary plants -- no idea what the difference was, but one is lighter than the other and they smell a little different. I decided to oven-dry one and hang-dry the other, to see how the methods compare. It's not true science because there's a second variable; I didn't split each variety into two groups. So I won't really know if any differences are due to the type or the method, but oh well. The main goal is to get dried herbs.

The lunchbox peppers were a disappointment. The peppers I got were nice, but I only got a total of 15 between the two plants. I will probably skip those next year and use the pots for something else.

I think next year I want to add some oregano.