Changelog

A running log of how I spend my time at work.

Tuesday Aug 27

UIUC.chat

  1. Add cron job + background job queue runner.
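A minimal sketch of the background-job-queue-runner idea, using only Python's stdlib. The real runner and its job types are assumptions here; this just shows a worker draining a queue of callables.

```python
import queue
import threading

def run_jobs(job_queue, results):
    """Drain the queue, executing each job callable in FIFO order."""
    while True:
        try:
            job = job_queue.get_nowait()
        except queue.Empty:
            break
        results.append(job())
        job_queue.task_done()

# Enqueue a few illustrative jobs (a cron tick would do this periodically).
jobs = queue.Queue()
for i in range(3):
    jobs.put(lambda i=i: f"job-{i} done")

results = []
worker = threading.Thread(target=run_jobs, args=(jobs, results))
worker.start()
worker.join()
print(results)  # → ['job-0 done', 'job-1 done', 'job-2 done']
```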

Monday Aug 26

MMLI

  1. Fully self-hosted the MMLI backend Kubernetes stack. Debugging required applying some K8s PVCs and ConfigMaps. Now it's all fully documented and self-hosted.

Friday Aug 23

Meet with Amazon AICE research team.

Tried and failed to configure the ELK stack to use Elastic Fleet, which would make it easier to connect external servers to the central ELK instance. I hit issues with TLS self-signed certificates, even though I don't need or want TLS certs: I'm using Tailscale to connect services, and that secure tunnel with automatic HTTPS removes the need for them.

Thursday Aug 22

UIUC.chat

  1. Several hours of pair programming with Drshika, our new hire. Nearly did an end-to-end feature implementation: support for Azure OpenAI models, with user-supplied API keys. Great onboarding session to our codebase.

Wednesday, Aug 21

UIUC.chat

  1. Onboarded two new devs to UIUC.chat using our brand-new Developer Quickstart docs, and they worked flawlessly! Pretty slick.

  2. Fixed a critical production bug. Did incremental refactoring, the platonic ideal of production code.

  3. Finally set up centralized log monitoring. Still WIP; I need to add Filebeat and Metricbeat to the other physical servers in our fleet.

    1. 100% self-hosted, because log-hosting companies have horrible pricing. My storage is cheap, nearly free, roughly 10x cheaper than hosted offerings, so it's worth the extra effort.

ELK Stack (Elasticsearch, Logstash, Kibana). All 5 Docker containers on a single server, viewable in a single browser tab! Finally.

Tuesday, Aug 20

Infinite meetings on Tuesday. Updated stakeholders 😵‍💫

UIUC.chat

  1. Fix Llama 3.1 context window size.

  2. Fix the automatic LLM selection function: default to the best available model, accounting for price tradeoffs.

Monday, Aug 19

  1. Did an incredible amount of Kubernetes debugging so I can self-host the MMLI backend. Lots of PVCs and kubectl apply.

Sunday, Aug 18

Dotfiles refactor

  1. Migrated my dotfiles from Gitlab to Github.

  2. Notable mentions:

    • Powerlevel10k, the greatest thing since sliced bread

    • glances - a better htop

    • lsd - a better ls

    • bat - a better cat

    • fzf - a fuzzy finder // excellent at reverse terminal search (command history search)

    • ag (the silver searcher) - a better grep

    • some great OhMyZsh plugins

    • miniconda. Edit: actually micromamba, the faster reimplementation of conda.

Aug 15-16, 2024

UIUC.chat

  • Extensive debugging of the Postgres database; it looks like a JSONB column in one table is causing pg_dump -> pg_restore to error out. Our guess is that malformed JSON snuck into the DB.
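A quick way to hunt for the suspect rows, assuming you can pull the JSONB column out as raw text (the table/column names and sample rows below are hypothetical):

```python
import json

def find_malformed(rows):
    """Return (id, error) pairs for rows whose JSON payload fails to parse.

    `rows` is a list of (id, raw_json_text) tuples, e.g. fetched with
    SELECT id, payload::text FROM documents;  (names hypothetical).
    """
    bad = []
    for row_id, raw in rows:
        try:
            json.loads(raw)
        except (json.JSONDecodeError, TypeError) as err:
            bad.append((row_id, str(err)))
    return bad

sample = [
    (1, '{"ok": true}'),
    (2, '{"broken": '),  # truncated JSON
    (3, None),           # a NULL that snuck in
]
print(find_malformed(sample))  # rows 2 and 3 are flagged
```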

Amazon AICE

  • Synthetic data generation with Distilabel. Implementing our filtering rules via their classes.

  • More manual data cleaning: removed a further 30 bad questions from our 800. Quality is everything. Manual filtering is worth it for the last bit of ultra-high-quality post-training data.

Home servers

  • Provisioned ZFS on mirrored Optane drives, passed through to the VM so it gets the raw performance of the Optane drives (instead of an NFS share, which adds tons of latency and overhead).

Aug 14, 2024

UIUC.chat

See: UIUC.chat Vision & Medium Term Plan

  • Group planning for automated Metadata extraction and Insights on UIUC.chat.

    • Key ideas: structured outputs; creating charts and visualizations from a user's documents; and generating insights across multiple documents, specifically hierarchical summarization and contradiction identification.
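A minimal sketch of the hierarchical-summarization idea, with a stand-in summarizer where a real LLM call would go (the fan-in, documents, and summarizer below are all illustrative):

```python
def hierarchical_summary(docs, summarize, fan_in=2):
    """Repeatedly summarize groups of `fan_in` texts until one summary remains."""
    level = list(docs)
    while len(level) > 1:
        level = [
            summarize(" ".join(level[i:i + fan_in]))
            for i in range(0, len(level), fan_in)
        ]
    return level[0]

def fake_summarize(text):
    """Stand-in: keep the first 5 words. A real pipeline would call an LLM here."""
    return " ".join(text.split()[:5])

docs = ["alpha beta gamma delta", "epsilon zeta eta theta", "iota kappa"]
print(hierarchical_summary(docs, fake_summarize))
```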

Servers

  • Storage server: Installed 4x 18TB HDDs

    • It's still much cheaper to DIY than anything else. The cheapest cloud storage is $7/TB/month, and I can buy raw disks at $9/TB. Even with 1.33x overhead (9 * 1.33 = $11.97/TB), I'm still under a 2-month payback period.

  • Web server: installed 2x 1TB Optane P905 U.2 drives.

    • These are the lowest-latency drives ever made for random database access, e.g. queue-depth-1 (QD1) reads from disk. Optane U.2 drives are extremely impressive as ZFS special metadata devices, and now I'm provisioning a DB server with mirrored Optane drives. This setup (fast CPU, tons of DDR4 memory, and Optane storage) should be perfect for serving Qdrant and Postgres.
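The DIY-storage payback math above, spelled out as a quick check (numbers taken from the note):

```python
cloud_per_tb_month = 7.00   # cheapest cloud storage, $/TB/month
raw_disk_per_tb = 9.00      # raw HDD price, $/TB
overhead_factor = 1.33      # redundancy/filesystem overhead

diy_per_tb = raw_disk_per_tb * overhead_factor      # one-time cost, ≈ $11.97/TB
payback_months = diy_per_tb / cloud_per_tb_month    # ≈ 1.71 months
print(f"DIY pays for itself in {payback_months:.2f} months")
```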

Paragliding

My first time flying. Poo Poo Point, WA.

Aug 13, 2024

MMLI

  • Fixed database initialization (was missing a secrets file, so I'm glad I asked for help).

  • Met with Bingji to plan frontend-backend integration. We wrote a spec for the two endpoints we need. It was a delightfully efficient meeting: as long as we agree on the API shape, we're good.

UIUC.chat

  • Merged my PR: Improve Default Model on the /chat page

    • Respect previous choices: try the last-selected model first, then fall back to the preference list.

    • Respect enabled/disabled models when selecting the default

    • Bugfix: gpt-4o-mini can now be disabled; previously that one model bypassed the checks.

  • Redirect NCSA.ai to UIUC.chat, since NCSA.ai is now deprecated and superseded.
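The default-model logic from the PR above can be sketched as follows; the model names and preference list here are illustrative, not the production list:

```python
def pick_default_model(last_selected, enabled, preference_order):
    """Choose the default chat model.

    1. Reuse the last-selected model if it's still enabled.
    2. Otherwise fall back to the first enabled model in the preference list.
    Disabled models (including gpt-4o-mini) never bypass the check.
    """
    if last_selected in enabled:
        return last_selected
    for model in preference_order:
        if model in enabled:
            return model
    return None  # no enabled models at all

enabled = {"gpt-4o", "llama-3.1-70b"}
prefs = ["gpt-4o-mini", "gpt-4o", "llama-3.1-70b"]

print(pick_default_model("gpt-4o-mini", enabled, prefs))   # disabled → falls back to gpt-4o
print(pick_default_model("llama-3.1-70b", enabled, prefs)) # still enabled → reused
```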

Aug 12, 2024

Research & Development

Administrative

  • Handled extensive email and Slack communications

  • UIUC.chat (chat.illinois.edu) campus adoption progressing

    • Awaiting budget approval from Illinois CIO this week

  • Gies expansion advancing successfully with professors Melanie Wiscount and Vishal Sachdev

Aug 9, 2024

Amazon AICE Research

  • Manually cleaned 1,000 real user questions (90-minute process)

  • Focused on retaining highest quality STEM questions

  • Data cleaning process that reduced the dataset from 5,000 to 800 high-quality questions:

    1. De-duplication using longest common substring

    2. Near-match de-duplication via embedding cosine similarity

    3. AI filtering for STEM-related questions

    4. Final manual human review

  • Key takeaway: Quality > Quantity 🎯

Here's the full questions dataset (as .jsonl)
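The first two de-duplication steps above can be sketched in plain Python; the thresholds, questions, and 2-dimensional "embeddings" below are illustrative stand-ins (real embeddings would come from a model):

```python
def lcs_len(a, b):
    """Length of the longest common substring of a and b (dynamic programming)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0] * (len(b) + 1)
        for j, cb in enumerate(b, 1):
            if ca == cb:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sum(x * x for x in u) ** 0.5
    norm_v = sum(x * x for x in v) ** 0.5
    return dot / (norm_u * norm_v)

def dedup(questions, embeddings, lcs_frac=0.8, cos_thresh=0.95):
    """Drop later questions too similar to an earlier one (thresholds illustrative)."""
    keep = []
    for i, q in enumerate(questions):
        duplicate = False
        for j in keep:
            p = questions[j]
            if lcs_len(q, p) >= lcs_frac * min(len(q), len(p)):
                duplicate = True  # long shared substring → near-exact duplicate
                break
            if cosine(embeddings[i], embeddings[j]) >= cos_thresh:
                duplicate = True  # semantically near-identical
                break
        if not duplicate:
            keep.append(i)
    return [questions[i] for i in keep]

qs = ["What is entropy?", "What is entropy??", "Define osmosis."]
embs = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
print(dedup(qs, embs))  # the near-duplicate second question is dropped
```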

Aug 8, 2024

  • Finalized Vyriad project allocations. The project should officially begin in 2 weeks; then I'll transition off of MMLI and onto it at 25% time.

  • Created this changelog so my team has greater visibility into how I spend my time.

  • Set up Posthog monitoring of our Docs website traffic via the GitBook integration. Our docs are getting really good, so I want to make sure people are finding them.

  • UIUC.chat: support Ollama models in the API. Assist Rohan with adding Tools support to our API.

    • Significant refactor to improve maintainability, especially regarding our API. Now the /chat page and our internal API use the exact same functions to invoke a chat. Previously they were separate: one "client-centric" implementation and one "server-only" implementation. We broke up the server-only part and now call it from the client, so there's no more duplicated code, and new features (like Ollama and tool calling) land in the API trivially.

    • Upgraded from Clerk v4 to v5 for new features (Google One Tap sign-in) and better-designed components.

    • Logging: added Posthog events to monitor the distribution of which LLM models are used.
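A rough Python analogue of the shared-invocation refactor described above (the production code is TypeScript, and all names here are illustrative): both entry points call one server-side function, so new providers and tool support land in both automatically.

```python
def invoke_chat(messages, model, tools=None):
    """Single server-side entry point for running a chat completion.
    Routing to providers (OpenAI, Ollama, ...) and tool handling live here,
    so both callers below share the exact same behavior."""
    return {"model": model, "tools": tools or [], "reply": f"echo: {messages[-1]}"}

def chat_page_handler(messages):
    """Used by the /chat UI (fixed default model, for illustration)."""
    return invoke_chat(messages, model="gpt-4o")

def api_handler(messages, model, tools):
    """Used by the public API (caller picks model and tools)."""
    return invoke_chat(messages, model=model, tools=tools)

print(chat_page_handler(["hi"]))
print(api_handler(["hi"], "llama-3.1", ["calculator"]))
```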

Aug 7, 2024

  • Brought a new home server online. It's designed to be a phenomenal single-purpose web server for UIUC.chat. It'll host our vector database (mostly in memory) and some of our helper Docker containers, like our new secrets manager.

Using mirrored (RAID 1) Optane drives as the boot drive and main database host. This should make queue-depth-1 database lookups extremely low latency, with extremely high IOPS; it should be about 6x faster than the best M.2 SSDs. I've had great experiences using the Optane 905P as a special metadata device for ZFS, and now I'm doing the same for my vector databases.
This server uses a consumer platform, an Intel i7-8700K that turbos to 5 GHz, which is still meaningful single-core performance even today (~6 years later). But consumer motherboards don't typically support PCIe bifurcation 😭 so I had to buy 2x of these U.2 -> PCIe adapters for the drives.
On my EPYC server I use the same adapter with 4x Optane 905P U.2 drives as ZFS special metadata devices: 2x RAID-1 mirrors, because I value redundancy these days. https://www.amazon.com/dp/B0D3XX7PSF
  • New secrets manager: Infisical, like Bitwarden for devs with .env secrets. It's delightful to self-host. It's here: env.ncsa.ai. It enables a delightful developer experience: no more shipping .env files around via Slack!

    • Light refactor of our frontend and backend repos to automatically use the secrets manager: just run npm run dev and your secrets are auto-injected (as long as you're logged in with our secrets CLI tool). It's actually great.

  • Wrote fantastic developer onboarding docs for UIUC.chat contributors. Primarily for our new HPC-GPT project, funded by NSF CSSI.

Aug 6, 2024

  • Spent a few hours getting Grobid running on Delta. I used a Gradle build instead of Docker/Apptainer.

    • Wrote a script to automatically port forward all the way from my personal server to an active Delta compute node. This enables me to run Grobid as a web server and make API requests against it.

    • Next: write a script that creates a reverse proxy over a pool of Delta compute nodes, enabling multi-node scaling of my ingest process. On a single node the job would take about a month of node-hours, so I need ~30 nodes in parallel to finish in a day.

  • Spent many hours over the weekend (August 4th) adding Posthog logging to this process; now I have a beautiful dashboard tracking my overall speeds (average and 95th-percentile latencies). Good monitoring is fundamental to my mission of increasing speed.
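The Delta port-forwarding script and the planned node pool above can be sketched like this. Hostnames and node names are placeholders; 8070 is Grobid's default HTTP port. Each compute node is forwarded to its own local port, and requests round-robin across the pool.

```python
from itertools import cycle

def forward_cmd(local_port, login_host, node, remote_port=8070):
    """Build an ssh command that forwards local_port, via the login node, to a
    compute node's Grobid server (hostnames are placeholders)."""
    return ["ssh", "-N", "-L", f"{local_port}:{node}:{remote_port}", login_host]

# Round-robin over a pool of forwarded compute nodes (node names hypothetical).
nodes = ["cn001", "cn002", "cn003"]
pool = cycle(enumerate(nodes))

def next_target(base_port=9000):
    """Pick the next node in the pool; node i is reachable at base_port + i."""
    i, node = next(pool)
    return f"localhost:{base_port + i}", node

print(forward_cmd(9000, "login.delta.example.edu", "cn001"))
targets = [next_target() for _ in range(4)]
print(targets)  # wraps back to cn001 on the 4th request
```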

Charts: Grobid processing time, and overall PDFs per second of the parallel ingest.
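The dashboard's average and 95th-percentile latencies come from raw timing samples. Posthog computes these for you; a hand-rolled version (sample values illustrative) looks like:

```python
def percentile(samples, p):
    """p-th percentile via linear interpolation between order statistics."""
    xs = sorted(samples)
    k = (len(xs) - 1) * p / 100
    lo, hi = int(k), min(int(k) + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)

latencies_s = [1.2, 0.9, 3.4, 1.1, 1.0, 2.8, 1.3, 0.95, 1.15, 7.0]
avg = sum(latencies_s) / len(latencies_s)
p95 = percentile(latencies_s, 95)
print(f"avg={avg:.2f}s  p95={p95:.2f}s")
```

The p95 matters more than the average here: one slow PDF (the 7.0 s outlier) barely moves the mean but dominates the tail.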
