Accessing your hard drive – the OS developer's moment of truth

When building your own operating system, the moment when you first write data to a real physical hard disk of a real PC is nothing less than thrilling – after all, a mistake at this point could mean happily overwriting random parts of your hard drive and wiping out important data on your test machine.

So working with real hard drives is nothing for the faint of heart – but of course, no operating system would be complete without that ability. In this post, we look at the basic concepts behind the communication between the operating system and the hard drive.

First, let us see how data is stored on a hard drive. The most basic conceivable hard drive consists of a thin disk, called a platter, on which the data is stored, and a device called the head that is able to read and write data. The disk spins, and the head can be moved back and forth across the disk, so that it can be positioned at an arbitrary distance from the center of the disk.

Physically, the disk is organized in concentric tracks. When the head is positioned at a certain distance from the center of the disk, it can therefore access an individual track. The disk is further divided into sectors – pie-shaped slices of the platter – as shown in the upper part of the diagram below (taken from this Wikipedia page). Finally, a block is the intersection of a sector with a track. So to identify a specific block, you can use the combination of a sector and the track on which the block is located.

Cylinder_Head_Sector

This simple design actually wastes some space, as it uses only one side of the disk. If you want to store data on both sides, you need two heads – one below and one above the disk. You can increase the capacity even further by building a stack of platters, with heads located between the platters so that each side of a platter has its own head reading from it and writing to it, as displayed in the lower part of the diagram.

In this setup, the set of tracks located at corresponding positions on all platters is called a cylinder. If you know the cylinder and the head, you know the track. You can therefore specify the location of a block on the disk by the combination of a cylinder, a head and a sector. This addressing method is called cylinder-head-sector (CHS) addressing and was the first addressing mode used on early IBM PCs and compatible machines.

Later, the actual geometry of hard drives became more complicated, and addressing schemes were introduced that decouple the address from the actual geometry of the drive, most notably the so-called logical block addressing (LBA). In this addressing mode, which is the current standard, a hard disk is thought of as a sequence of blocks, numbered starting with zero. To access a block, you simply specify the number of the block. The hard disk controller, i.e. the circuitry that actually operates the hard drive and sits between the CPU and the drive itself, will convert this LBA number into the actual physical location of the data on the drive.
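
To see how the two schemes relate: for drives that expose a CHS geometry, there is a well-known formula that maps a CHS triple to an LBA number. Here is a small C sketch of that conversion – the geometry values used in the example are purely illustrative.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Classical CHS-to-LBA conversion. Cylinders and heads are numbered
 * from 0, while sectors are traditionally numbered from 1 - hence the
 * "sector - 1". The geometry parameters are properties of the drive.
 */
static uint32_t chs_to_lba(uint32_t cylinder, uint32_t head, uint32_t sector,
                           uint32_t heads_per_cylinder,
                           uint32_t sectors_per_track) {
    return (cylinder * heads_per_cylinder + head) * sectors_per_track
           + (sector - 1);
}

int main() {
    /* Example geometry (illustrative): 16 heads, 63 sectors per track.
     * Cylinder 1, head 0, sector 1 maps to LBA (1*16+0)*63 + 0 = 1008 */
    printf("LBA = %u\n", chs_to_lba(1, 0, 1, 16, 63));
    return 0;
}
```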

So how do we actually access a block? The easiest method is usually called programmed input/output (PIO). To explain this, we first need to recall that a CPU does not only access the system memory, but can also talk to devices using designated channels called ports. If the CPU writes to a port, the data is sent to the device, and conversely, data provided by the device can be retrieved by the CPU by reading from a port.
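
On the x86 architecture, these ports are accessed via the dedicated in and out instructions. Here is a minimal sketch of the helper functions a kernel typically wraps around them, written with GCC inline assembly:

```c
#include <stdint.h>

/*
 * Minimal x86 port I/O helpers. outb writes one byte to an I/O port,
 * inb reads one byte, and inw reads a 16-bit word (the ATA data port
 * is 16 bits wide).
 */
static inline void outb(uint16_t port, uint8_t value) {
    __asm__ volatile ("outb %0, %1" : : "a"(value), "Nd"(port));
}

static inline uint8_t inb(uint16_t port) {
    uint8_t value;
    __asm__ volatile ("inb %1, %0" : "=a"(value) : "Nd"(port));
    return value;
}

static inline uint16_t inw(uint16_t port) {
    uint16_t value;
    __asm__ volatile ("inw %1, %0" : "=a"(value) : "Nd"(port));
    return value;
}
```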

So to read from a hard drive, we could proceed as follows. First, we would write a command to a specific port associated with the hard drive. This command would tell the hard drive what to do – for instance to read – and it would contain the LBA number of the block we want to read. The hard disk controller would then perform the actual read operation and, once complete, make the data available to the CPU, which would read it from a port, typically one word at a time.
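
To make this concrete, here is a sketch of such a PIO read for the legacy primary ATA channel, which traditionally lives at I/O base 0x1F0 and uses the READ SECTORS command (0x20) with 28-bit LBA. The port helpers from the previous sketch are repeated to keep this one self-contained. A real driver would add error handling, timeouts and the recommended delays after selecting a drive – this is just the skeleton of the protocol.

```c
#include <stdint.h>

/* Port I/O helpers as in the previous sketch */
static inline void outb(uint16_t port, uint8_t v) {
    __asm__ volatile ("outb %0, %1" : : "a"(v), "Nd"(port));
}
static inline uint8_t inb(uint16_t port) {
    uint8_t v;
    __asm__ volatile ("inb %1, %0" : "=a"(v) : "Nd"(port));
    return v;
}
static inline uint16_t inw(uint16_t port) {
    uint16_t v;
    __asm__ volatile ("inw %1, %0" : "=a"(v) : "Nd"(port));
    return v;
}

/* Register layout of the legacy primary ATA channel */
#define ATA_DATA       0x1F0   /* 16-bit data port */
#define ATA_SECCOUNT   0x1F2
#define ATA_LBA_LOW    0x1F3
#define ATA_LBA_MID    0x1F4
#define ATA_LBA_HIGH   0x1F5
#define ATA_DRIVE      0x1F6
#define ATA_CMD_STATUS 0x1F7

#define ATA_CMD_READ_SECTORS 0x20
#define ATA_STATUS_BSY 0x80
#define ATA_STATUS_DRQ 0x08

/*
 * Read one 512-byte sector from the master drive on the primary
 * channel using 28-bit LBA and PIO, busy-waiting on the status
 * register. No error handling - a sketch, not a driver.
 */
void ata_pio_read_sector(uint32_t lba, uint16_t *buffer) {
    outb(ATA_DRIVE, 0xE0 | ((lba >> 24) & 0x0F)); /* master, LBA mode */
    outb(ATA_SECCOUNT, 1);                        /* read one sector */
    outb(ATA_LBA_LOW,  lba & 0xFF);
    outb(ATA_LBA_MID,  (lba >> 8) & 0xFF);
    outb(ATA_LBA_HIGH, (lba >> 16) & 0xFF);
    outb(ATA_CMD_STATUS, ATA_CMD_READ_SECTORS);   /* issue the command */
    /* Busy loop until the drive is no longer busy and has data ready */
    while ((inb(ATA_CMD_STATUS) & (ATA_STATUS_BSY | ATA_STATUS_DRQ))
            != ATA_STATUS_DRQ);
    /* Transfer 256 words = 512 bytes through the data port */
    for (int i = 0; i < 256; i++)
        buffer[i] = inw(ATA_DATA);
}
```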

The format of the commands and the exact interaction between the CPU and the device in this process were defined in a standard called ATA, also known as IDE or Parallel ATA (PATA). This specification not only described the interaction between the hard disk controller and the CPU, but also the low-level electronics, cables, connectors and so forth.

This method of reading and writing data is simple, but it has a major disadvantage – it keeps the CPU busy. In the worst case, the CPU has to write the command, wait in a busy loop and then read the data one word at a time – and if you want to read several blocks, you have to do this over and over again. It is not surprising that alternatives were soon developed.

The first improvement we can make is to turn this into an interrupt-driven operation. In this version, the CPU sends a command and can then go off and do something else while the hard disk controller is working. Once the data has been read, the controller informs the CPU that data is available by raising an interrupt, and the CPU can go ahead and fetch the data. This is already more efficient, but still suffers from the fact that the data has to be read word by word.
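
A sketch of how this pattern might look inside a kernel. The primitives sem_down/sem_up and the ata_* helpers are hypothetical placeholders for whatever sleep/wakeup mechanism and command-issuing code the kernel actually provides – the spinning semaphore below is only there to keep the sketch compilable:

```c
#include <stdint.h>

/*
 * Interrupt-driven read, sketched with a toy semaphore. A real kernel
 * would put the waiting task to sleep and re-schedule instead of
 * spinning; ata_issue_read() and ata_read_data() are assumed to
 * program the command registers and fetch the data as in the PIO
 * sketch above.
 */
typedef struct {
    volatile int count;   /* a real semaphore also has a wait queue */
} semaphore_t;

static void sem_down(semaphore_t *s) {
    while (s->count == 0) { /* a real kernel would block here */ }
    s->count--;
}

static void sem_up(semaphore_t *s) {
    s->count++;
}

void ata_issue_read(uint32_t lba);    /* program the command registers */
void ata_read_data(uint16_t *buffer); /* fetch data from the data port */

static semaphore_t read_done = { 0 };

/* Called by the task that wants to read a sector */
void read_sector(uint32_t lba, uint16_t *buffer) {
    ata_issue_read(lba);   /* send the command to the drive */
    sem_down(&read_done);  /* wait until the interrupt handler signals us */
    ata_read_data(buffer); /* the drive has data ready - transfer it */
}

/* Installed as handler for the drive's interrupt
 * (IRQ 14 for the legacy primary channel) */
void ata_irq_handler(void) {
    sem_up(&read_done);    /* wake up the waiting task */
}
```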

Consequently, the next step in the development was what is called Direct Memory Access (DMA). With DMA, the hard disk controller can communicate directly with the memory controller and transfer the data read directly into a designated area of the system's main memory without any involvement of the CPU. Only once all the data has been read is an interrupt raised to inform the CPU that the data has been transferred and can be worked with. With this approach, large blocks of data can be read and written while the CPU works on other threads.
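
How the OS tells the controller where to put the data depends on the controller. For the classical IDE bus-master DMA engine, for instance, the OS builds a table of physical region descriptors (PRDs) in memory that describes the target buffer. A minimal sketch:

```c
#include <stdint.h>

/*
 * A Physical Region Descriptor (PRD) table entry as used by the
 * classical IDE bus-master DMA engine. The controller reads this
 * table from memory to learn where to place the data, so the buffer
 * address must be a physical address, and the buffer must not cross
 * a 64 KB boundary.
 */
typedef struct {
    uint32_t buffer_phys;   /* physical address of the DMA buffer */
    uint16_t byte_count;    /* bytes to transfer (0 means 64 KB) */
    uint16_t flags;         /* bit 15 marks the last entry in the table */
} __attribute__((packed)) prd_entry_t;

#define PRD_EOT 0x8000      /* end-of-table marker */

/* Fill a one-entry PRD table describing a single 512-byte buffer */
void setup_prd(prd_entry_t *prd, uint32_t buffer_phys) {
    prd->buffer_phys = buffer_phys;
    prd->byte_count = 512;
    prd->flags = PRD_EOT;
}
```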

In 2000, a serial version of the ATA standard called Serial ATA (SATA) was introduced. Due to its serial signaling, SATA allows for much higher transfer rates between the hard drive and the motherboard and therefore a much higher I/O throughput, and it soon evolved into the de-facto standard, also for consumer PCs. A SATA device can still operate in a so-called legacy mode, in which it is presented to the CPU as a PATA/IDE device. However, to take advantage of some of the new features of the SATA protocol, for instance a higher number of devices per controller, a new protocol for the communication between the CPU and the controller, called AHCI, needs to be used. The AHCI protocol is significantly more complicated than the ATA protocol, but both protocols still follow the same logical steps to read and write data.
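
To give an impression of the AHCI data structures: each port of an AHCI controller has a command list with up to 32 command headers, each of which points to a separate command table holding the actual command and the scatter/gather list. A rough sketch of the command header layout as defined in the AHCI specification:

```c
#include <stdint.h>

/*
 * An AHCI command header - one of the 32 slots in a port's command
 * list. The actual command FIS and the PRD entries live in a separate
 * command table that this header points to via a physical address.
 */
typedef struct {
    uint16_t flags;          /* command FIS length, write bit, etc. */
    uint16_t prdt_length;    /* number of PRD entries in the command table */
    uint32_t prd_byte_count; /* bytes transferred, updated by the HBA */
    uint32_t ctba;           /* physical address of the command table (low) */
    uint32_t ctbau;          /* physical address of the command table (high) */
    uint32_t reserved[4];
} __attribute__((packed)) ahci_cmd_header_t;
```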

Let us now combine this understanding with what we have learned about interrupts and multitasking to see how the operating system would typically orchestrate a disk drive access, for instance to read data from the disk. The following diagram illustrates the process.

HardDriveRead

First, the operating system reserves an area of system memory for the DMA transfer. Then, the kernel prepares the command. The format of the command depends on the protocol used (AHCI or legacy IDE), but typically, the command would contain the location of the data on the disk, the number of blocks to be read and the location of the DMA buffer. Then the command is sent to the hard drive.
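
Regardless of the protocol, the kernel will typically collect this information in some request structure before programming the controller. A purely illustrative sketch – the names are made up for this example and do not correspond to any specific kernel:

```c
#include <stdint.h>

/* The three pieces of information a read command typically carries */
typedef struct {
    uint64_t first_lba;    /* location of the data on the disk */
    uint32_t block_count;  /* number of blocks to read */
    uint32_t buffer_phys;  /* physical address of the DMA buffer */
} read_request_t;
```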

At this point in time, the CPU is free to work on something else, so the kernel would typically put the current task to sleep and re-schedule so that a different task can start executing.

While this happens, the hard drive starts to execute the command. It reads the requested data and places it in the DMA buffer prepared by the OS. Then it raises an interrupt to indicate that the data has arrived in main memory.

Once the operating system receives the interrupt, it will stop the currently executing task and switch back to the task that requested the read. This task can now continue execution and work with the data.

Of course, the real picture is more complicated. What the image above calls the “device”, for instance, is in reality a combination of the actual hard disk controller and components of your motherboard's chipset, like the AHCI controller or the DMA controller. In addition, several hard drives can operate in parallel, so the kernel needs a way to keep track of the currently executing requests and to map incoming interrupts to the requests currently in flight. And in an SMP environment with more than one CPU, semaphores or other locking mechanisms need to be in place to make sure that different requests do not collide. If you are interested in all those details, you might want to take a look at the documentation of the ctOS hard disk driver.
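
As a purely illustrative sketch of the bookkeeping involved: on AHCI, for instance, each port has up to 32 command slots, and the interrupt handler has to figure out which of the in-flight requests have completed. The names below are hypothetical:

```c
#include <stdint.h>

#define MAX_SLOTS 32

/* One entry per command slot currently in flight */
typedef struct {
    int in_use;                      /* is this slot owned by a request? */
    void *buffer;                    /* where the data ends up */
    void (*callback)(void *buffer);  /* invoked on completion */
} request_slot_t;

static request_slot_t slots[MAX_SLOTS];

/* Called from the interrupt handler with a bitmask of completed slots */
void complete_requests(uint32_t completed_mask) {
    for (int slot = 0; slot < MAX_SLOTS; slot++) {
        if ((completed_mask & (1u << slot)) && slots[slot].in_use) {
            slots[slot].callback(slots[slot].buffer);
            slots[slot].in_use = 0;  /* slot can be reused */
        }
    }
}
```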

We now understand how a hard drive and an operating system kernel communicate to read or write sectors. However, an application does of course not see individual blocks of a hard disk; instead, the hard disk appears as a collection of files and directories. The translation between this logical view and the low-level view is done by the file system, at which we will look in the next post in this series.

Why build an operating system from scratch?

What happens if you turn on a PC? How is an operating system able to run multiple tasks in parallel? What happens if you hit a key on your keyboard? And what actually is a process? If you have ever thought for more than a second about one of these things, then read on…

A few years back – in late 2010 or early 2011 – I was at a point where my growing interest in computer science naturally led me to those questions. So I did what most of us would probably do – I started to search for information and tried to understand what I found.

One day, I hit upon a tutorial that explained the boot process of a PC and actually contained short instructions on how to write a very simple program that would act as a boot loader – you would copy this program to a floppy disk (yes, those still existed in some machines at that time) or a CD-ROM and boot from there, and – voilà – instead of Windows or Linux coming up, the system would run your program and print some deep wisdom like “Hello World!” onto the screen.

Now printing something is easy, but what if I wanted the user to be able to enter something and print a response? Obviously, this requires some sort of communication with the keyboard, and there is no operating system around to help you with that, so I needed to learn about a thing called the keyboard controller. But wait, why could I only use a few kB of memory? I found that I needed to switch from real mode to protected mode, install interrupt handlers … and slowly, the attempt to just build something that would boot turned into more.

Over the next roughly 18 months, I continued to walk down this path and eventually had built a small Unix operating system kernel. I then added a simple command line interface, a graphics mode and a C library, and eventually got to a point where I was able to take simple programs originally written for Linux or other Unix-like operating systems and compile and run them on my new operating system, which I called ctOS. Here is a screenshot – my OS is running in a virtual machine on the left, and on the right you see a few measurements of the physical CPU load while running some multi-threading tests within ctOS.

ctOS_SMP_Test

As you might imagine, this required quite some work, and I eventually got to a point where I spent more time on other things, and further development came to an end.

Earlier this year, I was cleaning up a few things on my hard drive and – more or less by accident – hit upon my old code archive again. Being a curious person, I wondered whether I would still be able to compile and run the system – and of course it failed badly when I tried. This was not really surprising – my development environment had changed from a 32-bit to a 64-bit system, and the emulators that I used to run and test the OS had changed quite a bit as well, fixing defects that had actually hidden some bugs in my OS. And of course the “average PC” had changed a lot since 2011. The BIOS is now a thing of the past – PCs use UEFI to boot and ACPI tables instead of BIOS tables to store their configuration – and you will hardly ever find a system with a floppy disk drive or only one CPU any more.

So I decided to invest some time to make my OS work again and to adapt it to the current generation of hardware. This included building a new development environment, fixing some defects that suddenly became apparent as the emulators had changed, making the system ACPI-aware, removing all dependencies on the BIOS and so forth. I also cleaned up the source code, migrated everything to a GitHub repository and added a few additional ports.

I have a realistic view of how much time I will have in the future to further develop and improve my OS. However, it can still serve a purpose – education. After all, this is what it was created for: helping me to understand some of the inner workings of an operating system.

In fact, my plan is to publish a series of blog posts on the principles and structures behind a Unix operating system, using ctOS to illustrate these concepts and linking to the documentation I have created over the years for the details. This will cover topics like the overall structure of an operating system, processes and the scheduler, interrupts, memory management, device drivers and the networking stack. So if you ever wanted to understand how an operating system really works, this is the time to learn it – stay tuned and join me again for the first part of the series. Until then, you might want to check out the ctOS source code on GitHub and the documentation that comes with it – if you want, you can even download some of the binary packages from the release page and play with them.