
This is a guest post by Mihalis Tsoukalos. Mihalis is a Unix administrator, programmer, and Mathematician who enjoys writing. He is the author of Go Systems Programming from which this Go programming tutorial is taken.

What is Go?

Back when UNIX was first introduced, the only way to write systems software was by using C; nowadays you can program systems software using programming languages including Go. Apart from Go, other preferred languages for developing system utilities are Python, Perl, Rust and Ruby.

Go is a modern, general-purpose, open-source programming language that was officially announced at the end of 2009. It began as an internal Google project and was inspired by many other programming languages, including C, Pascal, Alef and Oberon. Its spiritual fathers are Robert Griesemer, Ken Thompson and Rob Pike, who designed Go as a language for professional programmers who want to build reliable and robust software. Apart from its syntax and standard functions, Go comes with a rich and convenient standard library.

What is systems programming?

Systems programming is a special area of programming that is most closely associated with UNIX machines, although it is not limited to them. Most commands that deal with system administration tasks, such as disk formatting, network interface configuration, module loading and kernel performance tracking, are implemented using the techniques of systems programming.

Additionally, the /etc directory, which can be found on all UNIX systems, contains plain text files that deal with the configuration of a UNIX machine and its services and are also manipulated using systems software. You can group the various areas of systems software and related system calls in the following sets:

  1. File I/O: This area deals with file reading and writing operations, which is the most important task of an operating system. File input and output must be fast and efficient and, above all, it must be reliable.
  2. Advanced File I/O: Apart from the basic input and output system calls, there are also more advanced ways to read or write a file including asynchronous I/O and non-blocking I/O.
  3. System files and Configuration: This group of systems software includes functions that allow you to handle system files such as /etc/passwd and get system specific information such as system time and DNS configuration.
  4. Files and Directories: This cluster includes functions and system calls that allow the programmer to create and delete directories and get information such as the owner and the permissions of a file or a directory.
  5. Process Control: This group of software allows you to create and interact with UNIX processes.
  6. Threads: When a process has multiple threads, it can perform multiple tasks. However, threads must be created, terminated and synchronized, which is the purpose of this collection of functions and system calls.
  7. Server Processes: This set includes techniques that allow you to develop server processes, which are processes that get executed in the background without the need for an active terminal. Go is not that good at writing server processes in the traditional UNIX way, so let me explain this a little more. UNIX servers like Apache use fork(2) to create one or more child processes; this is called forking and refers to cloning the parent process into a child process that continues executing the same executable from the same point with its own copy of the parent's memory. Although Go does not offer an equivalent to fork(2), this is not an issue because you can use goroutines to cover most of the uses of fork(2).
  8. Interprocess Communication: This set of functions allows processes that run on the same UNIX machine to communicate with each other using features such as pipes, FIFOs, message queues, semaphores and shared memory.
  9. Signal Processing: Signals offer processes a way of handling asynchronous events, which can be very handy. Almost all server processes have extra code that allows them to handle UNIX signals using the system calls of this group.
  10. Network Programming: This is the art of developing applications that work over computer networks with the help of TCP/IP and is not Systems programming per se. However, most TCP/IP servers and clients deal with system resources, users, files and directories, so most of the time you cannot create network applications without doing some kind of Systems programming.
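
Item 7 of the list above says that goroutines cover most of the uses of fork(2). As a minimal sketch of that idea, each unit of work that a forking server would hand to a child process can run in its own goroutine instead. The handle() and serveAll() names are hypothetical and do not come from the book:

```go
package main

import (
	"fmt"
	"sync"
)

// handle stands in for the per-request work that a forked child would do.
func handle(id int) string {
	return fmt.Sprintf("request %d served", id)
}

// serveAll runs handle for ids 0..n-1, one goroutine per request, and
// returns the results indexed by request id.
func serveAll(n int) []string {
	results := make([]string, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			results[id] = handle(id) // each goroutine writes only its own slot
		}(i)
	}
	wg.Wait() // the rough equivalent of waiting for all children
	return results
}

func main() {
	for _, r := range serveAll(3) {
		fmt.Println(r)
	}
}
```

Unlike child processes, goroutines live in the same address space, so anything they share has to be coordinated; here every goroutine writes to a different slice element, which avoids the need for a lock.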

The challenging thing with Systems programming is that you cannot afford to have an incomplete program; you can either have a fully working, secure program that can be used on a production system or nothing at all. This mainly happens because you cannot trust end users and hackers! The key difficulty in systems programming is the fact that an erroneous system call can make your UNIX machine misbehave or, even worse, crash it!

Most security issues on UNIX systems usually come from wrongly implemented systems software because bugs in systems software can compromise the security of an entire system. The worst part is that this can happen many years after using a certain piece of software!

Systems programming examples with Go

Printing the permission of a file or a directory

With the help of the ls(1) command, you can find out the permissions of a file:

$ ls -l /bin/ls
-rwxr-xr-x  1 root  wheel  38624 Mar 23 01:57 /bin/ls

The Go program presented here, which is named permissions.go, shows how to print the permissions of a file or a directory using Go and is presented in two parts. The first part is the following:

package main

import (
        "fmt"
        "os"
)

func main() {
        arguments := os.Args
        if len(arguments) == 1 {
                fmt.Println("Please provide an argument!")
                os.Exit(1)
        }

        file := arguments[1]

The second part contains the important Go code:

        info, err := os.Stat(file)
        if err != nil {
                fmt.Println("Error:", err)
                os.Exit(1)
        }
        mode := info.Mode()
        fmt.Print(file, ": ", mode, "\n")
}

Most of the Go code deals with the command line argument and makes sure that you have one! The Go code that does the actual job is mainly the call to the os.Stat() function, which returns a FileInfo value that describes the file or directory examined by os.Stat(). From the FileInfo value you can discover the permissions of a file by calling its Mode() method. Executing permissions.go creates the following kind of output:

$ go run permissions.go /bin/ls
/bin/ls: -rwxr-xr-x
$ go run permissions.go /usr
/usr: drwxr-xr-x
$ go run permissions.go /us
Error: stat /us: no such file or directory
exit status 1
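
The value returned by Mode() carries more than the rwx string. As a rough sketch, and not part of permissions.go, the hypothetical describe() helper below also extracts the permission bits in octal with Perm() and checks for a directory with IsDir():

```go
package main

import (
	"fmt"
	"os"
)

// describe is a hypothetical helper: it returns the mode string, the
// permission bits in octal and whether path is a directory.
func describe(path string) (string, string, bool, error) {
	info, err := os.Stat(path)
	if err != nil {
		return "", "", false, err
	}
	mode := info.Mode()
	// Perm() keeps only the permission bits and drops the file type bits.
	octal := fmt.Sprintf("%04o", mode.Perm())
	return mode.String(), octal, mode.IsDir(), nil
}

func main() {
	path := "." // default to the current directory
	if len(os.Args) > 1 {
		path = os.Args[1]
	}
	full, octal, isDir, err := describe(path)
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(1)
	}
	fmt.Printf("%s: %s (octal %s, directory: %t)\n", path, full, octal, isDir)
}
```

The octal form is handy when you want to compare permissions against a value such as 0644 or pass them on to os.Chmod().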

How to write to files using fmt.Fprintf()

The use of the fmt.Fprintf() function allows you to write formatted text to files in a way that is similar to the way the fmt.Printf() function works.

The Go code that illustrates the use of fmt.Fprintf() will be named fmtF.go and is going to be presented in three parts. The first part is the expected preamble of the program:

package main

import (
        "fmt"
        "os"
)

The second part has the following Go code:

func main() {
        if len(os.Args) != 2 {
                fmt.Println("Please provide a filename")
                os.Exit(1)
        }

        filename := os.Args[1]
        destination, err := os.Create(filename)
        if err != nil {
                fmt.Println("os.Create:", err)
                os.Exit(1)
        }
        defer destination.Close()

First, you make sure that you have exactly one command line argument before continuing. Then, you pass that argument to os.Create() in order to create the file. Please note that the os.Create() function will truncate the file if it already exists.

The last part is the following:

        fmt.Fprintf(destination, "[%s]: ", filename)
        fmt.Fprintf(destination, "Using fmt.Fprintf in %s\n", filename)
}

Here, you write the desired text data to the file that is identified by the destination variable using fmt.Fprintf() as if you were using the fmt.Printf() function. Executing fmtF.go will generate the following output:

$ go run fmtF.go test
$ cat test
[test]: Using fmt.Fprintf in test

In other words, you can create plain text files using fmt.Fprintf().
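
The first parameter of fmt.Fprintf() is an io.Writer, so the same calls work with anything that implements that interface, not just files. The sketch below is one possible way to show this; writeReport() and render() are hypothetical names that are not part of fmtF.go, and the sketch also checks the error that fmt.Fprintf() returns:

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// writeReport writes the same two fragments as fmtF.go to any io.Writer
// and returns the first write error it meets.
func writeReport(w io.Writer, name string) error {
	if _, err := fmt.Fprintf(w, "[%s]: ", name); err != nil {
		return err
	}
	_, err := fmt.Fprintf(w, "Using fmt.Fprintf in %s\n", name)
	return err
}

// render collects the report in memory instead of in a file.
func render(name string) string {
	var sb strings.Builder
	if err := writeReport(&sb, name); err != nil {
		return ""
	}
	return sb.String()
}

func main() {
	// os.Stdout is an io.Writer too, so nothing else has to change.
	if err := writeReport(os.Stdout, "test"); err != nil {
		fmt.Fprintln(os.Stderr, "write failed:", err)
		os.Exit(1)
	}
	fmt.Print(render("other"))
}
```

Writing against io.Writer rather than *os.File also makes this kind of code easy to test without touching the disk.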

Developing wc(1) in Go

The principal idea behind the code of the wc.go program is that you read a text file line by line until there is nothing left to read. For each line you read you find out the number of characters and the number of words it has. As you need to read your input line by line, the use of bufio is preferred instead of the plain io because it simplifies the code. However, trying to implement wc.go on your own using io would be a very educational exercise.

But first you will see the kind of output the wc(1) utility generates:

$ wc wc.go cp.go
      68     160    1231 wc.go
      45     112     755 cp.go
     113     272    1986 total

So, if wc(1) has to process more than one file, it automatically generates summary information.

Counting words

The trickiest part of the implementation is word counting, which is implemented using Go regular expressions:

r := regexp.MustCompile("[^\\s]+")
for range r.FindAllString(line, -1) {
        numberOfWords++
}

The regular expression matches every run of non-whitespace characters, so each match is one word of the line; counting the matches gives the number of words!
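
If you prefer to avoid regular expressions, strings.Fields() from the standard library splits a string around runs of whitespace and gives the same count for ordinary text. The following sketch compares the two approaches; the function names are made up for this illustration, and the two can disagree on unusual whitespace characters because strings.Fields() follows the Unicode definition of a space:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// nonSpace matches one run of non-whitespace characters, that is, one word.
// `\S+` is equivalent to the "[^\\s]+" pattern used in the article.
var nonSpace = regexp.MustCompile(`\S+`)

// wordsRegexp counts words by counting regular expression matches.
func wordsRegexp(line string) int {
	return len(nonSpace.FindAllString(line, -1))
}

// wordsFields counts words by splitting on runs of whitespace.
func wordsFields(line string) int {
	return len(strings.Fields(line))
}

func main() {
	line := "  one two\tthree\n"
	fmt.Println(wordsRegexp(line), wordsFields(line))
}
```

Note that the regular expression is compiled once, outside the function, instead of once per line.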

The code!

After this little introduction, it is time to see the Go code of wc.go, which will be presented in five parts. The first part is the expected preamble:

package main

import (
        "bufio"
        "flag"
        "fmt"
        "io"
        "os"
        "regexp"
)

The second part is the implementation of the count() function, which includes the core functionality of the program:

func count(filename string) (int, int, int) {
        var err error
        var numberOfLines int
        var numberOfCharacters int
        var numberOfWords int
        numberOfLines = 0
        numberOfCharacters = 0
        numberOfWords = 0

        f, err := os.Open(filename)
        if err != nil {
                fmt.Printf("error opening file %s", err)
                os.Exit(1)
        }
        defer f.Close()

        r := bufio.NewReader(f)
        for {
                line, err := r.ReadString('\n')

                if err == io.EOF {
                        break
                } else if err != nil {
                        fmt.Printf("error reading file %s", err)
                        // Stop here: a persistent read error would otherwise loop forever.
                        os.Exit(1)
                }

                numberOfLines++
                r := regexp.MustCompile("[^\\s]+")
                for range r.FindAllString(line, -1) {
                        numberOfWords++
                }
                numberOfCharacters += len(line)
        }

        return numberOfLines, numberOfWords, numberOfCharacters
}

There are a lot of interesting things here. First of all, you can see the Go code presented in the previous section for counting the words of each line. Counting lines is easy because each time the bufio reader reads a new line the value of the numberOfLines variable is increased by one. The ReadString() function tells the program to read until the first occurrence of '\n' in the input; multiple calls to ReadString() mean that you are reading a file line by line. Next, you can see that the count() function returns three integer values. Last, counting characters is implemented with the help of the len() function, which returns the length in bytes of a given string, which in this case is the line that was read. The for loop terminates when you get the io.EOF error, which signifies that there is nothing left to read from the input file.
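
A detail worth keeping in mind about count(): len() applied to a Go string counts bytes, which matches what wc -c reports but is not the same as the number of characters once the input contains non-ASCII text. The small sketch below, which is not part of wc.go, shows the difference using utf8.RuneCountInString(); the sizes() helper is a hypothetical name:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// sizes returns the length of s in bytes and in runes (Unicode code points).
func sizes(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	// "\u00ef" is the letter i with a diaeresis, which takes two bytes in UTF-8.
	for _, s := range []string{"plain", "na\u00efve"} {
		b, r := sizes(s)
		fmt.Printf("%q: %d bytes, %d runes\n", s, b, r)
	}
}
```

For plain ASCII source files, such as the ones used in the examples of this article, both numbers are the same.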

The third part of wc.go starts with the beginning of the implementation of the main() function, which also includes the configuration of the flag package:

func main() {
        minusC := flag.Bool("c", false, "Characters")
        minusW := flag.Bool("w", false, "Words")
        minusL := flag.Bool("l", false, "Lines")

        flag.Parse()
        flags := flag.Args()

        if len(flags) == 0 {
                fmt.Printf("usage: wc <file1> [<file2> [... <fileN>]]\n")
                os.Exit(1)
        }

        totalLines := 0
        totalWords := 0
        totalCharacters := 0
        printAll := false

        for _, filename := range flag.Args() {

The last for statement is for processing all input files given to the program. The wc.go program supports three flags: the -c flag is for printing the character count, the -w flag is for printing the word count and the -l flag is for printing the line count.

The fourth part is the following:

                numberOfLines, numberOfWords, numberOfCharacters := count(filename)

                totalLines = totalLines + numberOfLines
                totalWords = totalWords + numberOfWords
                totalCharacters = totalCharacters + numberOfCharacters

                if (*minusC && *minusW && *minusL) || (!*minusC && !*minusW && !*minusL) {
                        fmt.Printf("%d", numberOfLines)
                        fmt.Printf("\t%d", numberOfWords)
                        fmt.Printf("\t%d", numberOfCharacters)
                        fmt.Printf("\t%s\n", filename)
                        printAll = true
                        continue
                }

                if *minusL {
                        fmt.Printf("%d", numberOfLines)
                }

                if *minusW {
                        fmt.Printf("\t%d", numberOfWords)
                }

                if *minusC {
                        fmt.Printf("\t%d", numberOfCharacters)
                }

                fmt.Printf("\t%s\n", filename)
        }

This part deals with the printing of the information on a per file basis depending on the command line flags. As you can see, most of the Go code here is for handling the output according to the command line flags. The last part is the following:

        if (len(flags) != 1) && printAll {
                fmt.Printf("%d", totalLines)
                fmt.Printf("\t%d", totalWords)
                fmt.Printf("\t%d", totalCharacters)
                fmt.Println("\ttotal")
                return
        }

        if (len(flags) != 1) && *minusL {
                fmt.Printf("%d", totalLines)
        }

        if (len(flags) != 1) && *minusW {
                fmt.Printf("\t%d", totalWords)
        }

        if (len(flags) != 1) && *minusC {
                fmt.Printf("\t%d", totalCharacters)
        }

        if len(flags) != 1 {
                fmt.Printf("\ttotal\n")
        }
}

This is where you print the total number of lines, words and characters read according to the flags of the program. Once again, most of the Go code here is for modifying the output according to the command line flags. Executing wc.go generates the following kind of output:

$ go build wc.go
$ ls -l wc
-rwxr-xr-x  1 mtsouk  staff  2264384 Apr 29 21:10 wc
$ ./wc wc.go sparse.go notGoodCP.go
120        280        2319        wc.go
44        98        697        sparse.go
27        61        418        notGoodCP.go
191        439        3434        total
$ ./wc -l wc.go sparse.go
120        wc.go
44        sparse.go
164        total
$ ./wc -w -l wc.go sparse.go
120        280        wc.go
44        98        sparse.go
164        378        total
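
The flag handling of wc.go can also be tried in isolation. The sketch below is an illustration under stated assumptions and not the code of the book: it uses flag.NewFlagSet() so that the arguments come from a slice instead of os.Args, the options type and the parse() function are hypothetical names, and io.Discard requires Go 1.16 or newer:

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

// options mirrors the three switches of wc.go; the type is hypothetical.
type options struct {
	chars, words, lines bool
	files               []string
}

// parse reads the flags from args instead of from os.Args.
func parse(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("wc", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep the sketch quiet on bad input
	fs.BoolVar(&o.chars, "c", false, "Characters")
	fs.BoolVar(&o.words, "w", false, "Words")
	fs.BoolVar(&o.lines, "l", false, "Lines")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	o.files = fs.Args() // everything after the flags
	return o, nil
}

func main() {
	o, err := parse(os.Args[1:])
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", o)
}
```

As with flag.Parse(), parsing stops at the first argument that is not a flag, which is why the flags have to come before the filenames on the command line.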

If you do not execute go build wc.go in order to create an executable file, then executing go run wc.go using Go source files as arguments will fail because the compiler will try to compile the Go source files instead of treating them as command line arguments to the go run wc.go command:

$ go run wc.go sparse.go
# command-line-arguments
./sparse.go:11: main redeclared in this block
        previous declaration at ./wc.go:49
$ go run wc.go wc.go
package main: case-insensitive file name collision:
"wc.go" and "wc.go"
$ go run wc.go cp.go sparse.go
# command-line-arguments
./cp.go:35: main redeclared in this block
        previous declaration at ./wc.go:49
./sparse.go:11: main redeclared in this block
        previous declaration at ./cp.go:35

Additionally, trying to execute wc.go on a Linux system with Go version 1.3.3 will fail because the program uses the variable-free form of the for range loop, which was introduced in Go 1.4; if you use a recent Go version you will have no problem running wc.go. The error message you will get will be the following:

$ go version
go version go1.3.3 linux/amd64
$ go run wc.go
# command-line-arguments
./wc.go:40: syntax error: unexpected range, expecting {
./wc.go:46: non-declaration statement outside function body
./wc.go:47: syntax error: unexpected }

Reading a text file character by character

Although reading a text file character by character is not needed for the development of the wc(1) utility, it would be good to know how to implement it in Go. The name of the file will be charByChar.go and will be presented in four parts.

The first part comes with the following Go code:

package main

import (
        "bufio"
        "fmt"
        "io/ioutil"
        "os"
        "strings"
)

Although charByChar.go does not have many lines of Go code, it needs lots of Go standard packages, which is a naïve indication that the task it implements is not trivial. The second part is:

func main() {
        arguments := os.Args
        if len(arguments) == 1 {
                fmt.Println("Not enough arguments!")
                os.Exit(1)
        }
        input := arguments[1]

The third part is the following:

        buf, err := ioutil.ReadFile(input)
        if err != nil {
                fmt.Println(err)
                os.Exit(1)
        }

The last part has the next Go code:

        in := string(buf)
        s := bufio.NewScanner(strings.NewReader(in))
        s.Split(bufio.ScanRunes)

        for s.Scan() {
                fmt.Print(s.Text())
        }
}

ScanRunes is a split function that returns each character (rune) as a token. Then the call to Scan() allows us to process each character one by one. There also exist ScanWords and ScanLines for getting words and lines scanned, respectively. If you use fmt.Println(s.Text()) inside the for loop instead of fmt.Print(s.Text()), then each character will be printed on its own line and the task of the program will be more obvious. Executing charByChar.go generates the following kind of output:

$ go run charByChar.go test
package main
…

The wc(1) command can verify the correctness of the Go code of charByChar.go by comparing the input file with the output generated by charByChar.go:

$ go run charByChar.go test | wc
      32      54     439
$ wc test
      32      54     439 test
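
To see how the three split functions of bufio differ, you can run them over the same in-memory string. This is only a sketch that reuses the scanner technique of charByChar.go; the scanAll(), runesOf(), wordsOf() and linesOf() names are hypothetical:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// scanAll returns every token that the given split function finds in s.
func scanAll(s string, split bufio.SplitFunc) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(s))
	sc.Split(split)
	for sc.Scan() {
		out = append(out, sc.Text())
	}
	return out
}

// runesOf, wordsOf and linesOf wrap the three standard split functions.
func runesOf(s string) []string { return scanAll(s, bufio.ScanRunes) }
func wordsOf(s string) []string { return scanAll(s, bufio.ScanWords) }
func linesOf(s string) []string { return scanAll(s, bufio.ScanLines) }

func main() {
	text := "first line\nsecond line\n"
	fmt.Println(len(runesOf(text)), "runes")
	fmt.Println(len(wordsOf(text)), "words")
	fmt.Println(len(linesOf(text)), "lines")
}
```

Keep in mind that ScanWords and ScanLines drop the separators, so only ScanRunes reproduces the input exactly when the tokens are printed back to back.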

How to create sparse files in Go

Big files that are created with the help of the Seek() method of os.File may have holes in them and occupy fewer disk blocks than files with the same size but without holes in them; such files are called sparse files. This section will develop a program that creates sparse files.

The Go code of sparse.go will be presented in three parts. The first part is:

package main

import (
        "fmt"
        "log"
        "os"
        "path/filepath"
        "strconv"
)

The second part of sparse.go has the following Go code:

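func main() {
        if len(os.Args) != 3 {
                fmt.Printf("usage: %s SIZE filename\n", filepath.Base(os.Args[0]))
                os.Exit(1)
        }

        SIZE, err := strconv.ParseInt(os.Args[1], 10, 64)
        if err != nil || SIZE < 1 {
                // A non-numeric or non-positive size would make Seek(SIZE-1) fail later.
                fmt.Printf("Invalid size: %s\n", os.Args[1])
                os.Exit(1)
        }
        filename := os.Args[2]

        _, err = os.Stat(filename)
        if err == nil {
                fmt.Printf("File %s already exists.\n", filename)
                os.Exit(1)
        }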

The strconv.ParseInt() function is used for converting the command line argument that defines the size of the sparse file from its string value to its integer value. Additionally, the os.Stat() call makes sure that you will not accidentally overwrite an existing file. The last part is where the action takes place:

        fd, err := os.Create(filename)
        if err != nil {
                log.Fatal("Failed to create output")
        }

        _, err = fd.Seek(SIZE-1, 0)
        if err != nil {
                fmt.Println(err)
                log.Fatal("Failed to seek")
        }

        _, err = fd.Write([]byte{0})
        if err != nil {
                fmt.Println(err)
                log.Fatal("Write operation failed")
        }

        err = fd.Close()
        if err != nil {
                fmt.Println(err)
                log.Fatal("Failed to close file")
        }
}

First, you try to create the desired sparse file using os.Create(). Then, you call fd.Seek() in order to make the file bigger without adding actual data. Last, you write a byte to it using fd.Write(). As you do not have anything more to do with the file, you call fd.Close() and you are done. Executing sparse.go generates the following output:

$ go run sparse.go 1000 test
$ go run sparse.go 1000 test
File test already exists.
exit status 1

How can you tell whether a file is a sparse file or not? You will learn in a while, but first let us create some files:

$ go run sparse.go 100000 testSparse
$ dd if=/dev/urandom  bs=1 count=100000 of=noSparseDD
100000+0 records in
100000+0 records out
100000 bytes (100 kB) copied, 0.152511 s, 656 kB/s
$ dd if=/dev/urandom seek=100000 bs=1 count=0 of=sparseDD
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.000159399 s, 0.0 kB/s
$ ls -l noSparseDD sparseDD testSparse
-rw-r--r-- 1 mtsouk mtsouk 100000 Apr 29 21:43 noSparseDD
-rw-r--r-- 1 mtsouk mtsouk 100000 Apr 29 21:43 sparseDD
-rw-r--r-- 1 mtsouk mtsouk 100000 Apr 29 21:40 testSparse

So, how can you tell if any of the three files is a sparse file or not? The -s flag of the ls(1) utility shows the number of file system blocks actually used by a file. So, the output of the ls -ls command allows you to detect if you are dealing with a sparse file or not:

$ ls -ls noSparseDD sparseDD testSparse
104 -rw-r--r-- 1 mtsouk mtsouk 100000 Apr 29 21:43 noSparseDD
  0 -rw-r--r-- 1 mtsouk mtsouk 100000 Apr 29 21:43 sparseDD
  8 -rw-r--r-- 1 mtsouk mtsouk 100000 Apr 29 21:40 testSparse

Now look at the first column of the output. The noSparseDD file, which was generated using the dd(1) utility, occupies 104 blocks and is not a sparse file. The sparseDD file, which occupies 0 blocks, is a sparse file generated using the dd(1) utility. Last, testSparse, which occupies only 8 blocks for its 100000 bytes, is also a sparse file; it was created using sparse.go.
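
The same check can be approximated from Go. The sketch below is a UNIX-only illustration and not code from the book: it assumes that info.Sys() returns a *syscall.Stat_t and that its Blocks field counts 512-byte units, which is the case on Linux, and the usage() helper is a hypothetical name. Comparing the allocated bytes with the file size is only a heuristic:

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// usage returns the apparent size of path and the number of bytes that
// are actually allocated for it. It relies on syscall.Stat_t, so it
// does not build on non-UNIX systems.
func usage(path string) (size, allocated int64, err error) {
	info, err := os.Stat(path)
	if err != nil {
		return 0, 0, err
	}
	st, ok := info.Sys().(*syscall.Stat_t)
	if !ok {
		return 0, 0, fmt.Errorf("no stat_t data for %s", path)
	}
	// Assumption: st.Blocks is expressed in 512-byte units, as on Linux.
	return info.Size(), int64(st.Blocks) * 512, nil
}

func main() {
	for _, path := range os.Args[1:] {
		size, allocated, err := usage(path)
		if err != nil {
			fmt.Println("Error:", err)
			continue
		}
		// Heuristic: fewer allocated bytes than the size suggests holes.
		fmt.Printf("%s: size %d, allocated %d, sparse: %t\n",
			path, size, allocated, allocated < size)
	}
}
```

You can run it against the three files created earlier and compare its verdict with the first column of ls -ls.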


Mihalis Tsoukalos is a Unix administrator, programmer, DBA and mathematician who enjoys writing. He is currently writing Mastering Go. His research interests include programming languages, databases and operating systems. He holds a B.Sc in Mathematics from the University of Patras and an M.Sc in IT from University College London (UK). He has written various technical articles for Sys Admin, MacTech, C/C++ Users Journal, Linux Journal, Linux User and Developer, Linux Format and Linux Voice.
