System Programming

" One vision, one purpose. "

Copyright © Tony's Studio 2020 - 2022


Chapter Five - File System

5.1 Meet File System

“In UNIX Everything is a File.”

5.1.1 Types of Files

Generally, there are seven types of files. The type is shown by the first character of each line in the output of ls -l.

character   type
d           directory
l           symbolic link
s           socket
b           block device
c           character device
p           pipe (FIFO)
-           regular file, for example a text file or an executable

5.1.x Trivia

Here is a little background on how the file system looks to us. As in the composite pattern, directories and files behave alike in most respects, except that a directory maintains a list of entries for its contents. Every directory always has two special entries, . and .., which refer to itself and its parent. For the root directory /, these two are the same.
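The claim about the root directory can be checked by comparing device and inode numbers. This is only a sketch, assuming a POSIX system; same_file is a hypothetical helper, not a library function.

```c
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if a and b name the same file
   (same device and same inode), 0 if they differ,
   and -1 if either stat() call fails. */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;
    if (stat(a, &sa) == -1 || stat(b, &sb) == -1)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Calling same_file("/", "/..") should return 1, because the parent of the root directory is the root directory itself.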

Here is a good reference on Linux file system: https://www.eet-china.com/mp/a38145.html.

5.2 File Operation

5.2.1 Two Types of File I/O

Generally, there are two kinds of file I/O: file I/O (system calls) and standard I/O (the C library). Standard I/O maintains an extra buffer in user space, while file I/O has no such buffer and goes straight to the system call. File I/O follows the POSIX standard, and standard I/O follows the C standard.

standard I/O
    pros: has a user-space buffer that is read and written first, so the system call is made only when necessary and fewer system calls are issued
    cons: data may not be read or written in time, and the buffer size stays at its default unless it is changed with setvbuf
file I/O (system call)
    pros: reads and writes the file directly, with whatever buffer size the caller chooses
    cons: too many system calls keep the system busy
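The effect of the user-space buffer can be observed directly. The sketch below is an illustration under assumptions: the names size_on_disk and buffered_write_demo are invented here, and full buffering is requested explicitly with setvbuf so the behavior does not depend on the library default. It writes five bytes through standard I/O and records the file size before and after fflush.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: size of the file as the kernel sees it, -1 on error. */
static long size_on_disk(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1;
}

/* Hypothetical demo: writes 5 bytes with standard I/O and stores the
   file size measured before and after fflush().
   Returns 0 on success, -1 if the file cannot be opened. */
int buffered_write_demo(const char *path, long *before, long *after)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    setvbuf(fp, NULL, _IOFBF, BUFSIZ);  /* full buffering, library-allocated buffer */
    fputs("hello", fp);                 /* stays in the user-space buffer */
    *before = size_on_disk(path);
    fflush(fp);                         /* now the write() system call happens */
    *after = size_on_disk(path);
    fclose(fp);
    return 0;
}
```

With full buffering, the size measured before fflush is expected to be 0 and the size after it 5, which is the "not in time" drawback in practice.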

5.2.2 File Operation

Basically, there are six file operations: open, creat, lseek, read, write, and close. Since we are already familiar with their C library counterparts, only the system calls are shown here. open and creat are declared in fcntl.h, the other four in unistd.h, and sys/types.h may be needed for types such as mode_t and off_t.

5.2.2.1 open

int open(const char *pathname, int flags, ...);
int open(const char *pathname, int flags, mode_t mode); // The ellipsis stands for mode

Exactly one of the three access flags must be given: O_RDONLY, O_WRONLY, or O_RDWR.

Some extra flags can be OR-ed in.

flags      meaning
O_APPEND   append data to the end of the file on every write
O_TRUNC    if the file exists and is opened for writing (O_WRONLY or O_RDWR), it is truncated to empty
O_CREAT    if the file doesn't exist, a new one is created; the third parameter (mode) is required
O_EXCL     used together with O_CREAT: if the file already exists, open fails with an error (exclusive creation)

On success, open returns a file descriptor, usually stored in a variable named fd. Otherwise -1 is returned.

The process holds a table that maps each fd to a file pointer, through which the actual file is located.


mode_t holds the permission bits, the familiar rwxr--r-- pattern, written as an octal number (0744 in this case).
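To see how the flags and the mode fit together, here is a small sketch that creates a file only if it does not exist yet. The function name create_exclusive and the mode 0644 are choices made for this example, not part of any API.

```c
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>

/* Hypothetical helper: creates path only if it does not exist yet.
   Returns 0 if the file was created, 1 if it already existed,
   and -1 on any other error. */
int create_exclusive(const char *path)
{
    /* O_CREAT needs the third argument; O_EXCL makes an existing file an error */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd == -1)
        return errno == EEXIST ? 1 : -1;
    close(fd);
    return 0;
}
```

Calling it twice on the same fresh path should return 0 the first time and 1 the second time.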

5.2.2.2 creat

Creating a file requires a mode. open can do this, too. With creat, the returned descriptor is write-only, and the file is emptied if it already exists. To get a descriptor that can also be read, use open with O_RDWR instead.

int creat(const char *pathname, mode_t mode);
int fd1 = open(pathname, O_WRONLY | O_CREAT | O_TRUNC, mode); // what creat does
int fd2 = open(pathname, O_RDWR | O_CREAT | O_TRUNC, mode);   // same, but also readable
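The truncating behavior of creat is easy to miss, so here is a hedged sketch of it. The helper name creat_truncates and the mode 0644 are assumptions for the example. It writes three bytes, calls creat on the same path again, and reports the size afterwards.

```c
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

/* Hypothetical demo: returns the file size right after a second
   creat() on an existing, non-empty file, or -1 on error. */
long creat_truncates(const char *path)
{
    int fd = creat(path, 0644);
    if (fd == -1)
        return -1;
    if (write(fd, "abc", 3) != 3) {
        close(fd);
        return -1;
    }
    close(fd);

    fd = creat(path, 0644);     /* file exists: it is emptied and opened write-only */
    if (fd == -1)
        return -1;
    struct stat st;
    long size = fstat(fd, &st) == 0 ? (long)st.st_size : -1;
    close(fd);
    return size;
}
```

The returned size should be 0, since the second creat discards the three bytes written earlier.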

5.2.2.3 lseek

lseek changes the current offset of an open file. off_t is a signed integer type (a long int on typical Linux systems). lseek returns the new file offset, or -1 on failure. It performs no I/O on the file; it only updates the offset recorded in the kernel.

off_t lseek(int fd, off_t offset, int whence);

whence     meaning
SEEK_SET   ret = 0 + offset
SEEK_CUR   ret = cur + offset
SEEK_END   ret = end + offset

So we can get the current offset by calling lseek(fd, 0, SEEK_CUR).

lseek may leave a hole in a file: seeking past the end and then writing leaves a gap that reads back as zero bytes.
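A hole can be produced in a few lines. The sketch below is illustrative only; make_hole is an invented name, and the offsets 1000 and 500 are arbitrary. It writes one byte, seeks 1000 bytes further, writes another byte, then reads one byte from inside the gap.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo: creates a file with a hole in the middle.
   Stores one byte read from inside the hole in *hole_byte and
   returns the final file size, or -1 on error. */
long make_hole(const char *path, char *hole_byte)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;

    long size = -1;
    if (write(fd, "A", 1) == 1 &&
        lseek(fd, 1000, SEEK_CUR) == 1001 &&  /* offset is now past the end */
        write(fd, "B", 1) == 1 &&             /* this write creates the hole */
        lseek(fd, 500, SEEK_SET) == 500 &&    /* a position inside the hole */
        read(fd, hole_byte, 1) == 1)
        size = (long)lseek(fd, 0, SEEK_END);  /* end offset equals file size */

    close(fd);
    return size;
}
```

The file ends up 1002 bytes long (1 + 1000 + 1), and the byte read from the gap is 0.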


5.2.2.4 read

This one is easy to understand. read does not treat \0 specially, and returns the number of bytes actually read (0 at end of file, -1 on error).

ssize_t read(int fd, void *buff, size_t nbytes);
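Because read may return fewer bytes than requested, it is normally called in a loop. A minimal sketch, with the invented name count_bytes and an arbitrary 4096-byte buffer, that counts every byte in a file, \0 included:

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns the number of bytes in the file,
   or -1 on error. Bytes with value 0 are counted like any other. */
long count_bytes(const char *path)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    while ((n = read(fd, buf, sizeof buf)) > 0)  /* 0 means end of file */
        total += n;
    close(fd);
    return n == -1 ? -1 : total;
}
```

For a file containing the three bytes 'a', '\0', 'b', the result is 3, not 1.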

5.2.2.5 write

This one is easy to understand as well. write does not treat \0 specially, and returns the number of bytes actually written (-1 on error).

ssize_t write(int fd, const void *buff, size_t nbytes);

5.2.2.6 close

Just close a file by its descriptor. close returns 0 on success and -1 on error.

int close(int fd);

5.2.3 Duplication

The dup function duplicates a file descriptor and returns the lowest unused descriptor number. dup2 does the same thing, but the new descriptor number is given explicitly; if that number is already in use, it is closed automatically first. Both only duplicate the descriptor: the old and the new descriptor share the same open file.

#include <unistd.h>
int dup(int oldfd);
int dup2(int oldfd, int newfd);

File descriptors 0, 1, and 2 are occupied by default in user programs: they are stdin, stdout, and stderr.
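That a duplicated descriptor shares the same open file can be seen from the file offset. In this sketch (dup_shares_offset is a hypothetical name), five bytes are written through the original descriptor and the offset is then queried through the copy.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo: writes 5 bytes through fd and reads the current
   offset through a dup() of it. Returns that offset, or -1 on error. */
long dup_shares_offset(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;
    int copy = dup(fd);               /* lowest unused descriptor number */
    if (copy == -1) {
        close(fd);
        return -1;
    }

    long result = -1;
    if (write(fd, "hello", 5) == 5)
        result = (long)lseek(copy, 0, SEEK_CUR);  /* offset seen by the copy */

    close(copy);
    close(fd);
    return result;
}
```

The result is 5: the write through fd moved the offset that copy sees, because both refer to one open file. The same sharing is what makes dup2(fd, 1) usable for redirecting standard output to a file.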

5.3 Directory Operation

“Not that important, I guess?”

Here is an example that implements a basic ls command.

// ls.c
#include <stdio.h>
#include <sys/types.h>
#include <dirent.h>

int main(int argc, char* argv[])
{
    if (argc != 2)
    {
        printf("Usage: ./ls directory\n");
        return 1;
    }

    // Open a stream over the entries of the directory
    DIR* dir = opendir(argv[1]);
    if (dir == NULL)
    {
        printf("Oops, invalid directory!\n");
        return 2;
    }

    // readdir returns NULL once every entry has been read
    struct dirent* file;
    while ((file = readdir(dir)) != NULL)
        printf("%s ", file->d_name);
    putchar('\n');

    closedir(dir);

    return 0;
}

There are two types of links in Linux: hard links and soft links (symbolic links). A hard link shares the file content with the original name, which means both have the same inode; creating one just increases the inode's reference count. A symbolic link stores the path of its target and is in fact an independent file. A hard link and the source file are two equal names for the same file rather than copies of each other, while a symbolic link is like a shortcut in Windows.

To create a link, use the ln command, and add the -s option to create a symbolic link.

ln [-s] src_name link_name
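The inode relationship can also be checked from C with the link and symlink system calls, which do the same job as ln and ln -s. This is a sketch under assumptions: compare_links is an invented helper, and the three paths are expected to live on a file system that supports both kinds of link.

```c
#include <unistd.h>
#include <sys/stat.h>

/* Hypothetical demo: creates a hard link and a symbolic link to target.
   Returns 1 if the hard link has the same inode as target while the
   symbolic link is a separate file of link type, 0 if not, -1 on error. */
int compare_links(const char *target, const char *hard, const char *soft)
{
    struct stat t, h, s;

    if (link(target, hard) == -1)       /* like: ln target hard */
        return -1;
    if (symlink(target, soft) == -1)    /* like: ln -s target soft */
        return -1;

    /* lstat does not follow the symbolic link, so s describes the link itself */
    if (lstat(target, &t) == -1 || lstat(hard, &h) == -1 || lstat(soft, &s) == -1)
        return -1;

    return h.st_ino == t.st_ino &&
           S_ISLNK(s.st_mode) &&
           s.st_ino != t.st_ino;
}
```

On such a file system the function should return 1: the hard link is the same inode under another name, and the symbolic link is its own small file that merely records the target path.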

" Do or do not. There is no try. "
