13 February 2023

  • Microprediction
    • Made a pull request to the microprediction/rediz repo.

15 February 2023

  • The Little Learner
    • Book about machine learning in the style of The Little Schemer.
      • https://mitpress.mit.edu/9780262546379/the-little-learner/
      • https://news.ycombinator.com/item?id=34810332

16 February 2023

Lion optimizer

  • https://arxiv.org/pdf/2302.06675.pdf
  • Wrote a small implementation of this in Python with PyTorch.
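
A minimal sketch of what such an implementation might look like, not the one mentioned above. It follows the update rule from the paper: step in the direction of the sign of an interpolation between the momentum and the current gradient, apply decoupled weight decay, then update the momentum. The class name, the exp_avg state key, and the default hyperparameters (lr=1e-4, betas=(0.9, 0.99)) are assumptions chosen for illustration.

```python
import torch
from torch.optim import Optimizer


class Lion(Optimizer):
    """Illustrative Lion optimizer: sign of an interpolated momentum."""

    def __init__(self, params, lr=1e-4, betas=(0.9, 0.99), weight_decay=0.0):
        defaults = dict(lr=lr, betas=betas, weight_decay=weight_decay)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self):
        for group in self.param_groups:
            lr = group["lr"]
            beta1, beta2 = group["betas"]
            wd = group["weight_decay"]
            for p in group["params"]:
                if p.grad is None:
                    continue
                grad = p.grad
                state = self.state[p]
                if "exp_avg" not in state:
                    # Momentum buffer starts at zero.
                    state["exp_avg"] = torch.zeros_like(p)
                m = state["exp_avg"]

                # Decoupled weight decay: p <- p - lr * wd * p
                p.mul_(1 - lr * wd)

                # Update direction: sign(beta1 * m + (1 - beta1) * g)
                update = (beta1 * m + (1 - beta1) * grad).sign()
                p.add_(update, alpha=-lr)

                # Momentum update: m <- beta2 * m + (1 - beta2) * g
                m.mul_(beta2).add_(grad, alpha=1 - beta2)


# One step on a toy parameter with a hand-set gradient.
p = torch.tensor([1.0, -2.0], requires_grad=True)
opt = Lion([p], lr=0.1)
p.grad = torch.tensor([0.5, -0.3])
opt.step()
```

Because every coordinate moves by exactly lr per step (before weight decay), the first step here changes each entry of p by 0.1 regardless of the gradient's magnitude.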

28 February 2023

rsync

  • Copying files with rsync requires the rsync binary on the remote host as well as on the local machine.
  • If the remote host lacks it, install rsync in Rocky Linux using podman, then copy /usr/bin/rsync to the remote.
  • When executing rsync on the local machine, pass --rsync-path /path/to/remote/rsync to use the binary you copied.
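
The last step can be sketched as a small Python helper that builds the rsync command line. The host name, paths, and the -av flags are placeholders picked for illustration; only --rsync-path comes from the notes above.

```python
import shlex


def rsync_command(src, dest, remote_rsync):
    """Build an rsync argv that uses a specific rsync binary on the remote side."""
    return ["rsync", "-av", f"--rsync-path={remote_rsync}", src, dest]


# Hypothetical source, destination, and remote binary location.
cmd = rsync_command(
    "./data/",
    "user@remote.example:/srv/data/",
    "/home/user/bin/rsync",
)
command_line = shlex.join(cmd)
print(command_line)
```

To actually run the transfer, the list can be handed to subprocess.run(cmd, check=True).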

Machine learning

The Pile is a large open-source dataset for training large language models. It can be found here. It is 825 GiB and includes books, GitHub repositories, webpages, chat logs, and medical, physics, math, computer science, and philosophy papers.