A sort of timeline and personal diary. Inspired by the nownownow movement but with my own twist.
[240229] - Microsoft published the ResLoRA paper.
[240229] - Qualcomm released 80+ models that run on their Snapdragon devices. More than 80 models?!
[240228] - Just read the 1-bit LLM paper, very interesting stuff. We reach a Pareto improvement from 4-bit quantization. Looking forward to seeing a 1-bit MoE very soon. Oh, and I tidied up this website a bit. I’m very pleased.
Also, I was today years old when I found out what a Ulysses Pact is.
[240227] - Playing around with Mistral’s latest model, Mistral Large, on Le Chat. Le good model was teaching me how to build a transformer from scratch. Today’s reading was a paper on some diffusion thing; honestly I forgot the details, but it was published around 2020.
TIL about Bing’s deranged alter ego called Sydney. Chat, is Sydney real?
[240226] - Read papers on Gemma (Google’s 7B and 2B models) and Mixtral of Experts (the Mixtral 8x7B model). Added more papers to my bookmarks to read on the weekend. Cooking up more blog posts and essays; the notes in my drafts are steadily growing. Working on work stuff, really busy right now. I’m just glad I’ve kept my writing streak going for more than 10 days now, a new record for me.
[240225] - Reading a lot about deep learning, training models, fine-tuning, and the like. I’ve got so much to learn, and I am feeling left behind by miles, but that’s what makes it fun. Other than that, I am socializing more on X and slowly finding my people. I am also practicing my writing a lot. In short: reading, writing, and playing around with models.