Who are the Ramones?

Basically, an early punk band that's known for playing three-chord progressions. If you're learning to play guitar and want to play some relatively "easy" but somewhat well-known "classic" songs, start with the Ramones. See http://www.ramonesonline.com/faq.htm However, my personal favorite of the "same last name" bands is the BoDeans, named after Jethro of the Beverly Hillbillies.

What's the difference between sector, block, and page?

A sector is the smallest unit of disk access, and is usually 512 bytes. A page is the unit of memory access that the hardware designers and operating system have designated as convenient. Memory is split into pages, and lots of bookkeeping takes place on whole pages only. For example, the OS never sees the malloc calls the library makes; it just allocates memory at page granularity. Since doing bookkeeping on individual sectors would create too much overhead, the filesystem generally builds blocks out of several sectors and performs its bookkeeping on a block basis. Blocks are often selected to be the same size as pages.

What's a bit map, and how does it work?

There are lots of times when the bookkeeping you need to perform contains only one bit of information, such as true/false, on/off, or used/empty. Keeping these values in integers is convenient, but wastes lots of memory. Instead, by treating part of memory as an array of bits, you can keep the same information in a lot less space. So, a bit map is an array of bits, usually packed into integers or characters. For us, bit N in the free-space bitmap tells us whether block N on the disk is available.

What is cylinder-major order, and why are blocks numbered in cylinder-major order in the free bitmap?

Any sector on disk can be identified by three things - its platter, the track number on that platter, and the sector number within the track. So, reducing this three-dimensional array to a single dimension for the bitmap requires that we pick an order in which to walk the dimensions.
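(Stepping back to the bitmap question for a moment: the bit-packing idea can be sketched in a few lines of Python. The sizes and the "1 bit = in use" convention here are made up for illustration; a real filesystem picks its own.)

```python
# Sketch of a free-space bitmap packed into bytes (sizes are illustrative).
# Convention here: bit N set = block N in use, bit N clear = block N free.
NUM_BLOCKS = 64
bitmap = bytearray(NUM_BLOCKS // 8)   # 8 blocks tracked per byte

def mark_used(n):
    bitmap[n // 8] |= 1 << (n % 8)

def mark_free(n):
    bitmap[n // 8] &= ~(1 << (n % 8))

def is_free(n):
    return not (bitmap[n // 8] >> (n % 8)) & 1

mark_used(9)
print(is_free(9), is_free(10))   # False True
```

Note the space saving: tracking 64 blocks takes 8 bytes here, versus 256 bytes if each flag were stored in a 4-byte integer.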
Let's assume the first set of bits are all of the sectors on track 0, platter 0. We then have to decide whether to move to track 1 or platter 1. If we choose track 1, it means we're going across the tracks on the same platter first. If we choose platter 1 instead, it means we're going across the same track on all the platters. The set of the same track number on all platters is known as a cylinder, and cylinder-major means that we're choosing to organize the bitmap in this manner. The reason for choosing cylinder-major order is that if you start walking down the bitmap taking free sectors in order, it's likely that they'll all end up on the same cylinder. If they end up on the same cylinder, no extra seeking is required. If you organized them to be on the same platter instead, you'd have to do track-to-track seeks.

When disks lie about their actual geometry, is the translation taking place at the device driver level?

No - the translation takes place in the disk firmware itself. You generally want to avoid having to ship a different device driver with each disk, so it's simpler to have a single device driver that works for all ATA/IDE disks, and have the disks do their own translation.

Why have inodes smaller than a sector? Why not just make inodes sector-sized?

If you look at the information an inode "needs", it comes to less than 80 bytes. So, most of the rest of the space in a Unix inode is consumed by pointers to the "direct blocks". I believe there are around 13 of these direct block pointers in standard inodes. If the block size is 4KB, that means that files < 52KB don't need any indirect blocks - the inode points to all of the blocks of the file. So, making the inode sector-sized would give us more space for direct blocks, but would only benefit files larger than 52KB. At the same time, it would waste a lot of space.
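The direct-block arithmetic in that answer can be checked quickly (using the 13 pointers and 4KB blocks stated above):

```python
# With 13 direct block pointers and 4KB blocks, any file up to
# 13 * 4KB = 52KB fits entirely in the direct blocks.
DIRECT_POINTERS = 13
BLOCK_SIZE = 4 * 1024

def needs_indirect_block(file_size):
    # A file needs an indirect block only if it overflows the direct pointers.
    return file_size > DIRECT_POINTERS * BLOCK_SIZE

print(DIRECT_POINTERS * BLOCK_SIZE // 1024)   # 52 (KB)
print(needs_indirect_block(52 * 1024))        # False
print(needs_indirect_block(52 * 1024 + 1))    # True
```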
If you assume that the average file size is 10KB, that means that 128 bytes of inode are used for every 10KB of disk space. That's about 1% of the space used for inodes. Increasing inodes to 512 bytes would add roughly another 4%.

Are inodes always in the same place on disk?

Traditionally, they're allocated when the filesystem is created (conceptually, when the disk gets initialized), and there's a fixed number of them. So, yes, they tend to be in the same places on disk. Filesystems with large numbers of files will need lots of inodes, and filesystems with larger files need fewer inodes, so the number of inodes can usually be specified. See "newfs" on Solaris or "mke2fs" on Linux.

Does the filesystem cache inodes?

Any modern Unix system will cache recently-used inodes, and will almost always keep the inodes for all currently open files in memory as long as the file is open.

Do directories have inode numbers?

All files have corresponding inodes, and directories are treated like files, so directories also have inodes. When a directory file contains an entry for a subdirectory, it contains the name of that subdirectory and its inode number.

When we want to read a file, does the filesystem have to traverse the whole directory tree? Can't we do a hash on the absolute pathname or something like that?

The first time the filesystem accesses that particular path, it will have to traverse the whole directory tree. Once that's done, it can cache the appropriate parts and avoid some painful lookups.

Does the disk inode get deleted if you create and delete a temp file?

If you've got the temp file open and are using it, the inode is kept alive. The exact mechanism varies from system to system, but basically involves having an on-disk reference count and an in-memory reference count. A simple trick to convince yourself that this works is to start a program and delete its executable file while it's running.

What does "rm -i" do?
It prompts you for confirmation before deleting each file. The "i" is short for "interactive".

Who picks the head of the FSF? Battle to the death?

I don't know.

Can you explain again the problem of restoring a disk dump onto a disk with a different geometry?

The dumps and restores just work on a stream of sectors and don't care about where those sectors lie on the disk. So, think about the replication of the superblock. It generally gets replicated on each platter, on a different track on each platter, etc. If you save this information and load it onto a different disk, the superblocks won't be in the places you expect, since the disk geometries are different.

Why does scandisk run a lot faster than 11+ hours?

Because these recovery programs run as root and can read inodes, etc., directly, rather than having to walk through the filesystem. So, they can optimize how they access the disk more than a regular application could.

Since we mentioned scandisk, does what we're doing relate to defragmentation?

Defragmentation programs need the same low-level information about disk layout and usage that the fsck program and its counterparts need. However, they need to do different things with the info.

Is it possible to crash with a sector partially written?

Disks usually provide some sort of guarantee about their behavior when power is lost. Very often, the guarantee is that the disk has enough energy to finish writing out any sector that was being written. This may be done by using a capacitor to store energy, or by using the momentum of the spinning platters.

Can a user belong to more than one group in Unix?

Yes. So, when the filesystem is checking file permissions, it basically needs to ask "does the user belong to the group that owns this file", and may need to check a list of groups for that user.

What did you do at Company X and why did you leave it?
I was in a group that ported mainframe-based engineering programs to PCs, and did everything from numerical optimization to user interface design. However, after a while, I ran out of interesting things to do, and various projects in the company were failing due to poor computer-knowledgeable oversight. Management decided that it would be better to have me as an internal consultant doing oversight on these projects rather than writing code. That wasn't interesting to me, so I wrapped up my projects and left for grad school.

What kind of computers have hard drive platters big enough to fit around Stallman's head?

Remember that bit density wasn't as high back then, so making physically large platters was necessary to get high capacity. I've seen people using old mainframe platters as coffee tables in offices.

In the Unix paper, how do they determine what is rotationally optimal?

Trial and error. In the old days, you had to "interleave" the sectors. It was easy to tell when your interleave value was too small - when it was, reading a series of N sectors would require N full rotations of the drive. When the interleave was right, you instead got all of the sectors at a rate of (transfer rate)/interleave, which was much higher. Track buffers now make all of this less useful.

Doesn't the system generate an interrupt when it senses that you've lost power and try to back up memory?

If the system has a UPS attached, that device will generate the interrupt. Otherwise, the system has no way of doing this itself.

There's a big debate going on in the Mac community about extended metadata. What do I think about that?

Mac files generally had two "forks" - the data fork and the resource fork. All of the data for things like images, fonts, etc., could be stored with the program in the resource fork. What this means at some low level is that each "file" basically has to glob together two files, one of which is highly structured.
Complexity is ugly, and you can generally wrap this stuff in libraries, so in the end, it's probably a lot of huffing and puffing without a lot of difference.

On slide 14, why was the disk inode box inside the memory inode box?

The structure for the in-memory inode usually has a substructure that contains the entire on-disk inode. The disk inode is usually copied into this structure when the file is opened.

In the reading-10-bytes example, if you read 10 bytes again, will it come from memory or from disk?

Assuming that the system has a cache, the second 10 bytes will come from the cache, rather than having to be read from the disk again.

Why would you have to read anything at all when issuing a write, and how is this related to block/page size?

Since you can only write in multiples of the sector size, and since the filesystem may be operating in multiples of the block size, in order to perform a write you first need to read in the unit of granularity, modify the part you want, and write out the same unit of granularity. Otherwise, if you're writing into the middle of a pre-existing file, you'll destroy the surrounding data.

What's the difference again between writing 10 bytes and 4096 bytes?

If the large write fills a whole unit of granularity and doesn't cross sector/page boundaries, then you don't have to read in the original data, because none of it will be preserved when you copy in the new data that you're writing.

Can you clarify the corner cases for writing 4096 bytes?

If the 4096-byte write is not aligned with the sector/page boundaries, then some sectors will end up as a mix of their original data and the new data. For those sectors, you'll need to read in the original sector before doing the write.

If you're writing a large file but it's not a multiple of the sector size, does it need to read in only the last sector/page?

On optimized systems, yes.

Can you explain how "unlinked" inodes are used to create temp files?
Create a file, open it, and then delete the name from the directory. Since the file is still open, you can access it via the file descriptor.

How exactly can the system "lose" space on a power failure when the temp file trick is used?

Assume that a temporary file was created and was being used after its name was deleted from the directory. If the power fails at this point, the in-memory reference is gone, but the inode still exists on disk and points to data blocks. There's no way to delete this inode by name, since it isn't reachable from anywhere in the directory tree, so that space stays lost until something like fsck finds the orphaned inode.

Do any attacks on a system involve trying to edit directory contents?

Indirectly - the more common ones tend to involve trying to take over temporary files. Look at the man pages for mktemp and mkstemp. Also, note that anyone can create a file or symbolic link in the mail spool with someone else's name. Mail programs generally check for this attack before delivering mail to a user.

When are blocks returned to the free list?

As soon as the inode is deleted, the blocks used by that file can be returned to the free list. The question then becomes "when is the inode deleted?", and the answer is "when its on-disk and in-memory reference counts both drop to zero".

What disk-specific data actually need to be in main memory?

The in-memory inode contains the disk inode, some information about what parts of the file are in memory, and some data blocks for the file. Likewise, recently-used metadata, such as parts of the free-block bitmap and the directory files, are also likely to be in memory.

Can I explain the diagrams on opening and reading a file? What is the PCB?

The PCB is the process control block. It's basically the structure the OS keeps for each process to hold the information relevant to that process.

Can you explain why you need to know that a directory is a directory?

If I could create my own directory files, I could put in whatever inode numbers I wanted. Let's say that someone makes their home directory unreadable.
This would normally preclude anyone from getting into any files or directories below it. However, if I could guess the inode number of any of their files, I could put that number in my own directory and try to bypass their security.

It seems that the bit that distinguishes files from directories can have other meanings. What are they?

Find me a pointer and I'll try to explain.

What does a lock do?

Out of the scope of these notes. They're getting too long.
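As one last sketch, though: the create-open-unlink temp-file trick from earlier looks like this in Python. The filename is made up for illustration; on Unix, tempfile.TemporaryFile does essentially this dance for you.

```python
import os

# Create a temp file, open it, then delete its name from the directory.
fd = os.open("scratch.tmp", os.O_RDWR | os.O_CREAT, 0o600)
os.unlink("scratch.tmp")   # name is gone, but the open fd keeps the inode alive

os.write(fd, b"still works")     # we can keep using the file through the fd
os.lseek(fd, 0, os.SEEK_SET)
data = os.read(fd, 11)           # reads back b"still works"

os.close(fd)   # last reference drops; the inode and its blocks are freed
```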