This is going to be the last monthly status update of 2021. It contains small snippets of projects I’ve been working on since the last status update (2021-09-15 to 2021-12-15, 91 days).
I wish everyone happy holidays and a merry Christmas. I hope to be back with more consistent status updates and a lot of exciting projects next year.
OSDev / Kernel
This has been a relatively quiet period for my kernel. There are two small items that are both performance-related.
In Cardinal, tasks can sleep for some time and be blocked / unblocked externally. This meant that in order to react to CPU interrupts, a task either had to sleep and poll in a very ugly and inefficient way, or custom code had to be written into the interrupt handler to notify the task.
Now each task can get notified of CPU interrupts and unblock itself if the interrupt is relevant to it. After this change, idle CPU usage of the system has gone down significantly.
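The mechanism can be illustrated with a toy userspace analogy (this is not the actual kernel code, just a Python sketch of the idea using a sticky event per task):

```python
import threading

class Task:
    """Toy model: a task registers interest in one interrupt vector and
    blocks until the interrupt handler wakes it."""
    def __init__(self, vector):
        self.vector = vector
        self._event = threading.Event()

    def wait_for_interrupt(self):
        self._event.wait()       # block instead of sleep-and-poll
        self._event.clear()

    def notify(self, vector):
        if vector == self.vector:
            self._event.set()    # unblock only if the interrupt is relevant

tasks = []

def interrupt_handler(vector):
    """Called on a CPU interrupt; wakes every task waiting on this vector."""
    for task in tasks:
        task.notify(vector)
```

Because the wakeup is event-driven rather than polled, a task waiting for an interrupt consumes no CPU time at all, which is where the drop in idle usage comes from.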
While watching a Serenity OS video, I saw Andreas use an external tool that connects to QEMU in order to profile his kernel. This seemed like a cool idea, and I didn’t have any other method of profiling my kernel, so I decided to implement the same thing.
After some research, I found that you can pass a command-line argument to QEMU to make it listen on a socket. QEMU uses a protocol called QMP, which stands for QEMU Machine Protocol, that allows external tools to inspect and interact with the guest VMs. This is a JSON-based protocol, and turned out to be very simple to implement.
I wrote a sampling profiler that uses QMP and tells me which functions (and even lines) in my kernel are taking up the most CPU time. I do this by periodically asking QEMU to dump the CPU registers, and recording the instruction pointers. After the data is collected, I map the instruction pointers to function names using addr2line.
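The core loop can be sketched like this. The QMP commands (`qmp_capabilities`, `human-monitor-command` with `info registers`) are real, but the host, port, and sample count here are placeholders, and reading exactly one reply line per command is a simplification — real QMP can interleave asynchronous events:

```python
import json
import re
import socket
from collections import Counter

def parse_rip(register_dump):
    """Pull the instruction pointer out of an x86-64 'info registers' dump."""
    return int(re.search(r"RIP=([0-9a-f]{16})", register_dump).group(1), 16)

def sample_rips(host="localhost", port=4444, samples=1000):
    """Repeatedly ask QEMU for a register dump over QMP and tally the
    instruction pointers. Assumes QEMU was started with something like
    -qmp tcp:localhost:4444,server,wait=off."""
    sock = socket.create_connection((host, port))
    chan = sock.makefile("rw")
    chan.readline()                                   # QMP greeting banner
    chan.write(json.dumps({"execute": "qmp_capabilities"}) + "\n")
    chan.flush()
    chan.readline()                                   # capabilities ack
    hits = Counter()
    for _ in range(samples):
        chan.write(json.dumps({
            "execute": "human-monitor-command",
            "arguments": {"command-line": "info registers"},
        }) + "\n")
        chan.flush()
        hits[parse_rip(json.loads(chan.readline())["return"])] += 1
    return hits
```

The hottest addresses can then be piped through `addr2line -e <kernel ELF>` to resolve function names and line numbers.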
Right now my profiler is not very usable outside of my kernel; it has hardcoded paths and port numbers, and its command line argument parsing is not very user-friendly. I’m planning to clean up the code a little and push it to a repo once it’s ready for public use. I might also write about this profiling technique in a future blog post.
Radio / DSP
SSTV, or slow-scan television, is an amateur radio technology that can be used to transmit grayscale or colour images using a radio. The system modulates the picture information as audible tones, and can be used over any medium that can carry audio at those frequencies.
I wrote a modulator in Python that can turn images into audio tones that can be transmitted with a radio. I was able to decode the images I transmitted with the Robot36 Android app.
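The heart of such a modulator is mapping pixel brightness onto the standard SSTV tone range (1500 Hz for black up to 2300 Hz for white). This is a heavily simplified sketch — the line duration here is a placeholder, and real SSTV modes add sync pulses (around 1200 Hz) and mode-specific timings — but it shows the phase-continuous tone generation that keeps the audio click-free:

```python
import math

SAMPLE_RATE = 44100
BLACK_HZ, WHITE_HZ = 1500.0, 2300.0   # standard SSTV luminance tone range

def scanline_to_tones(pixels, line_ms=150.0):
    """Turn one row of 0-255 grayscale pixels into audio samples, mapping
    brightness linearly onto the 1500-2300 Hz range. Phase is accumulated
    across pixels so frequency changes don't produce clicks."""
    samples = []
    phase = 0.0
    samples_per_pixel = int(SAMPLE_RATE * line_ms / 1000 / len(pixels))
    for p in pixels:
        freq = BLACK_HZ + (WHITE_HZ - BLACK_HZ) * p / 255.0
        step = 2 * math.pi * freq / SAMPLE_RATE
        for _ in range(samples_per_pixel):
            samples.append(math.sin(phase))
            phase += step
    return samples
```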
In the future, I am planning to try writing a demodulator as well. But as usual, producing a signal is easier than decoding it.
JPEG file format
In order to use it in future projects, I wanted to get more familiar with the JPEG format. So I wrote my own encoder and decoder for it.
I guess it would be more appropriate to call it a JFIF encoder and decoder, but a JFIF file has the .jpeg file extension and is universally called a JPEG file, so I’m not going to be pedantic about it.
These links were very helpful in understanding the JFIF format.
- The JFIF article on Wikipedia
- The digital photography section of ImpulseAdventure. Specifically, the JPEG Huffman coding tutorial and their JPEGsnoop tool.
- This list of JPEG markers.
- This article by Yasoob Khalid
- This article by Koushtav Chakrabarty
JPEG decoder / viewer
I read some documentation about the JPEG format and started writing my own JPEG decoder. My goal was to read a JPEG file and turn it into the pixel values to display on the screen.
For practicality, and because of limited free time, I only implemented decoding for a subset of full JPEG functionality: progressive images and chroma subsampling are not supported. Fortunately, I don’t need those features for my projects, and it’s pretty easy to instruct encoders not to emit such files.
The decoder works perfectly, but it is quite slow. I believe this is because it’s written in Python without much thought given to optimization. While it is possible to optimize the code or rewrite it in another language, this is not a high priority for me. Code clarity was prioritized, as the whole project was meant to teach me more about JPEG rather than to produce a fast decoder.
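One decoder step that is easy to get wrong is undoing the zigzag scan: JPEG stores each 8x8 block of DCT coefficients in a diagonal zigzag order, and the decoder has to dequantize them and put them back into a matrix. A minimal sketch of that step (not my decoder’s actual code):

```python
def zigzag_order(n=8):
    """Return (row, col) pairs in JPEG zigzag scan order for an n x n block.
    Cells are grouped by diagonal (row + col); odd diagonals run top-right
    to bottom-left, even diagonals run bottom-left to top-right."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )

def dequantize_block(coeffs, qtable):
    """Take 64 coefficients as stored in the bitstream, multiply each by its
    quantization table entry, and place it at its zigzag position."""
    block = [[0] * 8 for _ in range(8)]
    for (r, c), coeff, q in zip(zigzag_order(), coeffs, qtable):
        block[r][c] = coeff * q
    return block
```

After this step, an inverse DCT of the block yields the pixel values.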
Compared to the decoder, the encoder was a lot easier to write. The JPEGsnoop tool came in really handy: at each step I could check the internal state of a verbose JPEG decoder to see if I got any fields wrong.
There were two goals for the JPEG encoder project. The first was to output the most minimal JPEG file that displays the target image. Minimal here doesn’t refer to output size; it refers to the lack of any superfluous fields that could be used to fingerprint the version / system the file was generated on.
The second goal was to give full access to every field of the JPEG, from the quantization tables to the DCT coefficients and the Huffman tables. While this is not necessary to encode images, it is useful for another side project related to steganography.
As I’m following the spec closely and not exercising any edge cases, the output files should be viewable both by my own JPEG decoder and by existing decoders. I’m happy to report that this is the case: I can view the output images with my own JPEG reader, feh, GIMP, Firefox, Signal, Telegram, and JPEGsnoop, and I have no reason to suspect problems with other viewers.
Quantization table optimization
Most cameras and image editing applications use fixed quantization tables and scale them for different quality settings. Optimizing the quantization table for each image instead would increase quality and decrease file size.
I’ve written a tool to do this exact thing. My current implementation is quite slow, too slow to run per image. I can think of two ways to improve this process.
The first method is to come up with a more optimized algorithm, and rewrite it in a language that is not Python.
The second method is to collect samples of various categories of images (such as nature, text, animation, drawing), and generate optimized quantization tables for each category. This will ensure you’re not using a sub-optimal table that’s meant to be good enough for every image, but something designed specifically for the type of image you have.
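For reference, the conventional fixed-table approach that per-image (or per-category) optimization aims to beat looks like this: take the example luminance table from Annex K of the JPEG spec and scale it with the widely-used IJG (libjpeg) quality formula. This is a sketch of that baseline, not my optimizer:

```python
# Example luminance quantization table from Annex K of the JPEG spec.
BASE_LUMA = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]

def scale_qtable(base, quality):
    """Scale a base quantization table for a 1-100 quality setting using
    the IJG formula: lower quality -> larger divisors -> smaller files."""
    quality = max(1, min(100, quality))
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    return [max(1, min(255, (q * scale + 50) // 100)) for q in base]
```

Every image encoded at a given quality gets the same divisors, regardless of its content — which is exactly the compromise an optimized per-image or per-category table avoids.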
Steganography
I took an interest in steganography. In particular, image steganography seems like a very interesting field. I think open source tools for this purpose could be better, and I haven’t used any proprietary steganography tools before.
I wrote a few simple demo programs that can be used to hide encrypted messages into PNG files.
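The classic approach for lossless formats like PNG is least-significant-bit embedding. This toy sketch (not my actual demo programs, and skipping the PNG parsing and the encryption step) operates on raw pixel bytes:

```python
def embed(pixel_bytes, message):
    """Hide message bytes in the least significant bits of pixel bytes,
    LSB-first within each message byte. Needs 8 carrier bytes per
    message byte."""
    bits = [(byte >> i) & 1 for byte in message for i in range(8)]
    if len(bits) > len(pixel_bytes):
        raise ValueError("message too long for this carrier")
    out = bytearray(pixel_bytes)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)

def extract(pixel_bytes, length):
    """Recover `length` bytes hidden by embed()."""
    bits = [b & 1 for b in pixel_bytes[:length * 8]]
    return bytes(
        sum(bits[i * 8 + j] << j for j in range(8)) for i in range(length)
    )
```

Each carrier byte changes by at most 1, which is visually invisible — but statistically detectable, which is part of why JPEG is more interesting as a carrier.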
While PNG is a lossless format that is good for packing lots of data into pixels, it is not the best one for hiding the presence of data. This is why the next step for this project is to use the JPEG encoder I wrote to generate steganographic JPEG images. This format is much more suitable for hiding data while having a naturally noisy and lossy look that is hard to tell apart from a non-steganographic image.
JS build tool
I wrote a small build tool for JavaScript. It can seamlessly import code from local paths and from HTTP URLs, recursively gathers all the files that are needed, and builds a minified and optimized version using the Closure Compiler.
The tool is (uncreatively) named jsbuild. This is meant to be a temporary name and the repo is currently a code-dump rather than a fully-functional open-source project.
Dithering
There are many dithering algorithms with different strengths and weaknesses, and each algorithm has a distinct look to it. The algorithm and the parameters to use for it depend on the type of image (animation, hand-drawn, human face, nature photography) and even the resolution.
I wanted to try something different and treat dithering as an optimization problem. My error function is how close the candidate image is to the target after both are passed through a 3x3 or 5x5 blur kernel. The blur approximates the low-pass filtering our eyes do, in order to guess how a dithered image will look to a human. After that, pretty much any black-box optimization method can be used.
I ended up using a simple hill climbing algorithm. The starting point is an image with its pixels randomly selected from the palette (perhaps it would be better to start with the target image blindly nearest-neighboured to the constrained palette). On each iteration, several candidate solutions are generated; each has one of its pixels randomly flipped to a palette colour. Each candidate is then ranked according to the error function described above, and the one with the lowest error becomes the current image for the next generation. This step repeats until we end up with a satisfying image.
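The whole loop can be sketched like this — a toy version working on small grayscale images stored as nested lists, with a 3x3 box blur standing in for the low-pass filter:

```python
import random

def box_blur(img):
    """3x3 box blur approximating the eye's low-pass filtering."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = count = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        total += img[ny][nx]
                        count += 1
            out[y][x] = total / count
    return out

def error(candidate, target):
    """Squared difference between the blurred candidate and blurred target."""
    bc, bt = box_blur(candidate), box_blur(target)
    return sum(
        (bc[y][x] - bt[y][x]) ** 2
        for y in range(len(target)) for x in range(len(target[0]))
    )

def dither(target, palette=(0, 255), iterations=2000, candidates=4):
    """Hill climbing: start from random palette pixels, generate candidates
    with one pixel flipped, and keep whichever scores lowest error."""
    h, w = len(target), len(target[0])
    current = [[random.choice(palette) for _ in range(w)] for _ in range(h)]
    for _ in range(iterations):
        best_err, best = error(current, target), current
        for _ in range(candidates):
            cand = [row[:] for row in current]
            y, x = random.randrange(h), random.randrange(w)
            cand[y][x] = random.choice(palette)
            cand_err = error(cand, target)
            if cand_err < best_err:
                best_err, best = cand_err, cand
        current = best
    return current
```

Recomputing the full blurred error for every candidate is wasteful — since a flip only affects a 3x3 neighbourhood, the error can be updated incrementally — but the naive version keeps the sketch short.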
In the future, I’d love to try different optimization methods other than simple hill climbing. Genetic algorithms might be a good fit as there is a lot of crossover opportunity due to each random modification only affecting the result around that pixel.