Changelog
A running log of how I spend my time at work.
Tuesday Aug 27
UIUC.chat
Add cron job + background job queue runner.
Monday Aug 26
MMLI
Fully self-hosted the MMLI backend Kubernetes stack. Debugging required applying some K8s PVCs and ConfigMaps. Now it's all fully documented and self-hosted.
Friday Aug 23
Meet with Amazon AICE research team.
Tried and failed to configure the ELK stack to use Elastic Fleet, which would make it easier to connect external servers to the central ELK instance. Hit issues with self-signed TLS certificates, even though I don't need or want TLS certs: I'm using Tailscale to connect services, and it's a secure tunnel with automatic HTTPS, which removes the need for separate certs.
Thursday Aug 22
UIUC.chat
Several hours of pair programming with Drshika, our new hire. Nearly completed an end-to-end feature implementation: support for Azure OpenAI models with user-supplied API keys. Great onboarding session to our codebase.
Wednesday, Aug 21
UIUC.chat
Onboarded two new devs to UIUC.chat using our brand-new Developer Quickstart docs, and the docs worked flawlessly! Pretty slick.
Fixed a critical production bug. Did incremental refactoring, the platonic ideal of production code.
Finally set up centralized log monitoring. Still WIP: I need to add filebeat and metricbeat to the other physical servers in our fleet. 100% self-hosted, because log-hosting companies have horrible pricing. My storage is cheap, nearly free, and 10x better than hosted offerings, so it's worth the extra effort.

Tuesday, Aug 20
Infinite meetings on Tuesday. Updated stakeholders 😵💫
UIUC.chat
Fix Llama 3.1 context window size.
Fix the automatic LLM selection function: default to the best available model, with price tradeoffs.
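Roughly the shape of that selection logic, as a Python sketch; the model names, quality scores, and prices below are made-up placeholders, not the actual UIUC.chat implementation.

```python
# Pick the highest-quality available model, breaking ties toward the cheaper one.
# All names, scores, and prices are hypothetical.
MODELS = [
    # (name, quality_score, price_per_1k_tokens)
    ("gpt-4o", 0.95, 0.0050),
    ("llama-3.1-70b", 0.85, 0.0009),
    ("gpt-4o-mini", 0.82, 0.0006),
]

def pick_best_model(available: set[str], max_price: float | None = None) -> str | None:
    candidates = [m for m in MODELS if m[0] in available]
    if max_price is not None:
        candidates = [m for m in candidates if m[2] <= max_price]
    if not candidates:
        return None
    # Best quality first; cheaper price wins ties.
    return min(candidates, key=lambda m: (-m[1], m[2]))[0]
```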
Monday, Aug 19
Did an incredible amount of Kubernetes debugging so I can self-host the MMLI backend. Lots of PVCs and kubectl apply.
Sunday, Aug 18
Dotfiles refactor
Migrated my dotfiles from Gitlab to Github.
Greatly refined all my install scripts and README.
Notable mentions:
Powerlevel 10k, greatest thing since sliced bread
glances - a better htop
lsd - a better ls
bat - a better cat
fzf - a better grep // excellent at reverse terminal search (command history search)
ag - a better find
some great OhMyZsh plugins
miniconda. Edit: actually mini-mamba, the faster version of conda developed by high-frequency traders.
Aug 15-16, 2024
UIUC.chat
Extensive debugging of the Postgres database. It looks like a JSONB column in a table is causing pg_dump -> pg_restore to error out. Our guess is malformed JSON snuck into the DB.
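To chase the bad rows down, a scan along these lines can help; the DSN, table, and column names are hypothetical, and this is a debugging sketch rather than exactly what we ran.

```python
# Flag JSONB values that contain \u0000 escapes (a classic pg_dump -> pg_restore
# pain point) or that fail to round-trip through a JSON parser.
# DSN, table, and column names are hypothetical.
import json
import psycopg2

conn = psycopg2.connect("dbname=uiuc_chat")
with conn.cursor() as cur:
    cur.execute("SELECT id, metadata::text FROM documents")
    for row_id, raw in cur:
        if raw is None:
            continue
        if "\\u0000" in raw:
            print(f"row {row_id}: contains a \\u0000 escape")
        try:
            json.loads(raw)
        except ValueError as exc:
            print(f"row {row_id}: unparseable JSON ({exc})")
conn.close()
```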
Amazon AICE
Synthetic data generation with Distilabel. Implementing our filtering rules via their classes.
More manual data cleaning: removed a further 30 bad questions from our 800. Quality is everything; it's worth the manual filtering for the "last bit of ultra-high quality post-training data".
Home servers
Provisioned ZFS on mirrored Optane drives, passed through to the VM so it gets the raw performance of the Optane drives (instead of creating an NFS share, which adds tons of latency and overhead).
Aug 14, 2024
UIUC.chat
See: UIUC.chat Vision & Medium Term Plan
Group planning for automated Metadata Extraction and Insights on UIUC.chat. Key ideas: structured outputs, creating charts and visualizations from users' documents, and creating insights over multiple documents, specifically hierarchical summarization and contradiction identification.
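A minimal sketch of the hierarchical-summarization idea, assuming a hypothetical llm(prompt) -> str helper; this is a planning illustration, not code we've shipped.

```python
# Summarize each document, then repeatedly summarize groups of summaries
# until a single top-level summary remains. llm() is a hypothetical helper.
def summarize_hierarchically(documents: list[str], llm, group_size: int = 5) -> str:
    summaries = [llm(f"Summarize this document:\n\n{doc}") for doc in documents]
    while len(summaries) > 1:
        groups = [summaries[i:i + group_size] for i in range(0, len(summaries), group_size)]
        summaries = [llm("Combine these summaries into one:\n\n" + "\n\n".join(g)) for g in groups]
    return summaries[0]
```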
Servers
Storage server: Installed 4x 18TB HDDs
It’s still much cheaper to DIY than anything else. The cheapest cloud storage is about $7/TB/mo, and I can buy raw drives at $9/TB. So even at 9 * 1.33 = $11.97/TB, I’m still at under a two-month payback period.
Web server: installed 2x 1TB Optane P905 U.2 drives.
These are the lowest-latency drives ever made for random database access, e.g. queue-depth-1 (QD-1) reads from disk. Optane U.2 drives are extremely impressive as ZFS special metadata devices, and now I'm provisioning a DB server with mirrored Optane drives. This setup (fast CPU, tons of DDR4 memory, and Optane storage) should be perfect for qdrant and postgres web serving.

Paragliding

Aug 13, 2024
MMLI
Fixed database initialization (was missing a secrets file, so I'm glad I asked for help).
Met with Bingji to plan frontend-backend integration. We wrote a spec for the two endpoints we need. It was a delightfully efficient meeting: as long as we agree on the API shape, we're good.
UIUC.chat
Merge Rohan's PR that adds Tools and Llama 3 support to our API.
Merge my PR Improve Default Model on /chat page
Respect previous choices, e.g. the last selected model is tried first, falling back to the preference list (sketched after this list).
Respect enabled/disabled models when selecting the default
Bugfix: gpt-4o-mini can now be disabled; previously that one model bypassed checks.
Redirect NCSA.ai -> UIUC.chat, because NCSA.ai is now deprecated and superseded by UIUC.chat.
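The default-model rules from that PR, sketched as plain Python; the preference list and names are hypothetical stand-ins for the real frontend code.

```python
# Hypothetical preference ordering; the real list lives in the app.
PREFERENCE_LIST = ["gpt-4o", "gpt-4o-mini", "llama-3.1-70b"]

def default_model(last_selected: str | None, enabled: set[str]) -> str | None:
    # Respect the user's previous choice if that model is still enabled.
    if last_selected in enabled:
        return last_selected
    # Otherwise fall back to the first enabled model in the preference list.
    # Disabled models (including gpt-4o-mini) never bypass this check.
    return next((m for m in PREFERENCE_LIST if m in enabled), None)
```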
Aug 12, 2024
Set up local Kubernetes cluster for new endpoint testing
Encountered challenges with database initialization
Refined input/output handling for MMLI backend integration with ACERetro
📜 Added comprehensive documentation: "How to run locally with Docker and Minikube"
Research & Development
Completed critical reading for LLM-Guided Retrieval work.
Administrative
Handled extensive email and Slack communications
UIUC.chat (chat.illinois.edu) campus adoption progressing
Awaiting budget approval from Illinois CIO this week
Gies expansion advancing successfully with professors Melanie Wiscount and Vishal Sachdev
Bug Fixes
Corrected URL references to tools.uiuc.chat across the UIUC.chat frontend
Aug 9, 2024
Amazon AICE Research
Manually cleaned 1,000 real user questions (90-minute process)
Focused on retaining highest quality STEM questions
Data cleaning process (sketched below) reduced the dataset from 5,000 to 800 high-quality questions:
De-duplication using longest common substring
Near-match de-duplication via embedding cosine similarity
AI filtering for STEM-related questions
Final manual human review
Key takeaway: Quality > Quantity 🎯
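Roughly what that pipeline looks like in plain Python; the thresholds, embedding source, and is_stem() filter are hypothetical stand-ins, and the last pass is still a manual human review.

```python
# Sketch of the cleaning pipeline: LCS de-dup, embedding-cosine de-dup,
# then an AI STEM filter. Thresholds and helpers are hypothetical.
from difflib import SequenceMatcher
import numpy as np

def lcs_duplicate(a: str, b: str, threshold: float = 0.8) -> bool:
    # Longest common substring, relative to the shorter question.
    match = SequenceMatcher(None, a, b).find_longest_match(0, len(a), 0, len(b))
    return match.size / max(1, min(len(a), len(b))) > threshold

def embedding_duplicate(ea: np.ndarray, eb: np.ndarray, threshold: float = 0.95) -> bool:
    cos = float(np.dot(ea, eb) / (np.linalg.norm(ea) * np.linalg.norm(eb)))
    return cos > threshold

def clean(questions: list[str], embeddings: list[np.ndarray], is_stem) -> list[str]:
    kept, kept_emb = [], []
    for q, e in zip(questions, embeddings):
        if any(lcs_duplicate(q, k) or embedding_duplicate(e, ke)
               for k, ke in zip(kept, kept_emb)):
            continue                      # drop exact/near duplicates
        if not is_stem(q):
            continue                      # AI filter for STEM relevance
        kept.append(q)
        kept_emb.append(e)
    return kept                           # final manual review happens after this
```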
Molecule Maker Lab Institute (MMLI): finished integrating the backend into MMLI's custom Kubernetes job-running framework (4+ hours). Basically, it'll take any Docker image and any command and run them on the K8s cluster as a job. Pretty neat way to have one set of infra that runs many different Docker images.
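The core idea, sketched with the official Kubernetes Python client: wrap an arbitrary image and command in a Job. The image, command, and namespace below are placeholders, not MMLI's actual framework.

```python
# Minimal "run any image + command as a K8s Job" sketch using the official
# kubernetes Python client. Names below are placeholders.
from kubernetes import client, config

def run_as_job(name: str, image: str, command: list[str], namespace: str = "default") -> None:
    config.load_kube_config()  # or config.load_incluster_config() inside the cluster
    job = client.V1Job(
        metadata=client.V1ObjectMeta(name=name),
        spec=client.V1JobSpec(
            backoff_limit=0,
            template=client.V1PodTemplateSpec(
                spec=client.V1PodSpec(
                    restart_policy="Never",
                    containers=[client.V1Container(name=name, image=image, command=command)],
                )
            ),
        ),
    )
    client.BatchV1Api().create_namespaced_job(namespace=namespace, body=job)

# e.g. run_as_job("hello-job", "busybox", ["echo", "hello from the cluster"])
```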
Aug 8, 2024
Finalized Vyriad project allocations. This project should officially begin in 2 weeks; then I'll transition off of MMLI and spend 25% of my time on it.
Created this changelog so my team has greater visibility into how I spend my time.
Set up Posthog monitoring of our Docs website traffic via the GitBook integration. Our docs are getting really good, so I want to make sure people are finding them.
UIUC.chat: support Ollama models in the API. Assist Rohan with adding Tools support to our API.
Significant refactor to improve maintainability, especially regarding our API. Now the /chat page and our internal API use the exact same functions to invoke a chat. Previously they were separate because we had one "client-centric implementation" and one "server-only implementation". Now we've broken up the server-only part and call it from the client. No more duplicated code, and trivial support for new features, like Ollama and tool calling, in our API.
Upgrade from Clerk v4 to v5 for new features (Google "one tap" sign-in) and better-designed components.
Bugfix: gpt-4o-mini can now be disabled; previously that one model bypassed checks.
Logging: add Posthog logs to monitor the distribution of which LLM models are used.
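For the shape of that logging, here's an illustrative capture call with the Posthog Python client; the event name and properties are hypothetical, and the real instrumentation lives in the app.

```python
# Illustrative only: record which LLM served each chat, so model distribution
# shows up as a Posthog insight. Key, event name, and properties are hypothetical.
from posthog import Posthog

posthog = Posthog("phc_your_project_key", host="https://app.posthog.com")

def log_model_usage(user_id: str, model: str, course: str) -> None:
    posthog.capture(
        distinct_id=user_id,
        event="llm_invocation",
        properties={"model": model, "course": course},
    )
```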
Aug 7, 2024
Brought new home server online. It's designed to be a phenomenal single-purpose web server for UIUC.chat. It'll host our vector database (mostly in memory) and some of our helper docker containers, like our new secrets manager.

New secrets manager: Infisical, like Bitwarden for devs with .env secrets. It's delightful to self-host, and it's here: env.ncsa.ai. It enables a great developer experience: no more shipping around .env files via Slack!
Light refactor of our frontend and backend repos to automatically use the secrets manager. Just do npm run dev and your secrets are auto-injected (so long as you log in with our secrets CLI tool). It's actually great.
Wrote fantastic developer onboarding docs for UIUC.chat contributors. Primarily for our new HPC-GPT project, funded by NSF CSSI.
Aug 6, 2024
Spent a few hours getting Grobid running on Delta. I used a gradle build instead of Docker/Apptainer. UPDATE (Aug 8): I got Apptainer working, thanks to the devs on GitHub.
Wrote a script to automatically port forward all the way from my personal server to an active Delta compute node. This enables me to run Grobid as a web server and make API requests against it.
Next: write a script to create a reverse proxy against a pool of Delta compute nodes, to enable multi-node scaling of my ingest process. On a single node my job might take a month of CPU-node-hours, so I need ~30x parallel nodes to do it in a day (a rough sketch of the proxy is below).
Spent many hours over the weekend (August 4th) adding Posthog logging to this process; now I have a beautiful dashboard to track my overall speeds (average and 95th-percentile latencies). Good monitoring is fundamental to my mission of increasing speed.
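A rough sketch of what that reverse proxy could look like, using aiohttp with naive round-robin; the node hostnames and port 8070 (Grobid's default) are assumptions, and hop-by-hop headers are ignored for brevity.

```python
# Naive round-robin reverse proxy over a pool of compute nodes.
# Hostnames and ports are assumptions for illustration.
import itertools
import aiohttp
from aiohttp import web

BACKENDS = itertools.cycle([
    "http://delta-node-01:8070",
    "http://delta-node-02:8070",
])

async def proxy(request: web.Request) -> web.Response:
    backend = next(BACKENDS)              # pick the next node in the pool
    headers = {}
    if "Content-Type" in request.headers:
        headers["Content-Type"] = request.headers["Content-Type"]
    async with aiohttp.ClientSession() as session:
        async with session.request(
            request.method,
            backend + str(request.rel_url),
            data=await request.read(),
            headers=headers,
        ) as upstream:
            body = await upstream.read()
            return web.Response(status=upstream.status, body=body,
                                content_type=upstream.content_type)

app = web.Application()
app.router.add_route("*", "/{tail:.*}", proxy)

if __name__ == "__main__":
    web.run_app(app, port=8080)
```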