Selected coding projects
Nonconsumptive
A standard and set of Python libraries for distributing large textual collections with fast random access, using the Apache Arrow format. (Python)
Deepscatter
Fast, animated, interactive online maps that scale easily to billions, not millions, of points using WebGL and Apache Arrow. (TypeScript)
Stable Random Projection
General-purpose, lightweight dimensionality reduction for book- or article-length texts. A trick involving cryptographic hashes makes it possible to use the same space for any language without a pre-trained model or dictionary. (Python, JavaScript)
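The hashing trick can be illustrated with a minimal sketch. This is not the project's actual code: the function names `token_signs` and `project`, the choice of SHA-1, the whitespace tokenizer, and the log weighting are all assumptions made for illustration. The idea shown is that hashing a word yields a reproducible string of bits, which can stand in for a random +1/-1 vector, so any word in any language lands in the same space with no dictionary.

```python
import hashlib
import math
from collections import Counter


def token_signs(token: str, dims: int = 160) -> list[int]:
    """Derive a fixed +1/-1 vector for a token from the bits of its SHA-1 hash."""
    signs: list[int] = []
    salt = 0
    # One SHA-1 digest yields 160 bits; re-hash with a salt if more are needed.
    while len(signs) < dims:
        digest = hashlib.sha1(f"{salt}:{token}".encode("utf-8")).digest()
        for byte in digest:
            for shift in range(8):
                signs.append(1 if (byte >> shift) & 1 else -1)
        salt += 1
    return signs[:dims]


def project(text: str, dims: int = 160) -> list[float]:
    """Sum log-weighted sign vectors for every distinct token in the text."""
    counts = Counter(text.lower().split())  # naive whitespace tokenizer (assumption)
    vector = [0.0] * dims
    for token, n in counts.items():
        weight = math.log1p(n)  # dampen very frequent words (assumption)
        for i, sign in enumerate(token_signs(token, dims)):
            vector[i] += weight * sign
    return vector
```

Because the vectors come from a hash rather than a trained model, `project("the cat sat on the mat")` returns the same 160 numbers on any machine, and the same function accepts text in any script.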
WordVectors
An R package for training and exploring word2vec models with a fluent vocabulary that takes advantage of R's ability to add, subtract, and perform other vector-space operations. (R)
Quires
An implementation of djot's rich document model as Svelte components, allowing the creation of rich interactive documents from markdown files. This is the software that renders the blog posts here! (TypeScript)
Bookworm
Tools for tokenizing and visually exploring large textual collections, backed by an extremely fast MySQL architecture and served over the web through an expressive API. (Python, JavaScript)
Markdown Lectures
Document transformation scripts for writing talks and course lectures that simultaneously generate their own slide decks and outlines with identifying terms, to keep everything aligned. (Haskell)