Published on 2025-10-01
Today I was confronted at work with a bizarre test failure in a project I do not know. It would only happen in CI, and only sometimes, probably due to slow I/O and the low number of threads in this environment.
An esteemed colleague of mine hypothesized this was a data race on a file. A component is writing to the file, another component is concurrently reading from this file, and due to improper synchronization, the latter sometimes reads empty or partial data.
My job was to prove it, without knowing anything about the codebase, armed only with the knowledge of the line and test file that fails.
For the impatient, here is the fix.
The real codebase is big, but here is a minimal reproducer for this article. It's conceptually quite simple: one goroutine writes to the file, and another goroutine reads from this file as soon as it is present and parses the data (which is simply a host and port, e.g. localhost:4567). In the real code, the two goroutines are in different components (they might even run in different OS processes, now that I think of it), and thus an in-memory synchronization mechanism (mutex, etc.) is not feasible:
package main

import (
	"fmt"
	"math/rand/v2"
	"net"
	"os"
	"time"
)

const fileName = "test.addr"

func retryForever(fn func() error) {
	for {
		if err := fn(); err != nil {
			time.Sleep(10 * time.Millisecond)
		} else {
			return
		}
	}
}

func writeHostAndPortToFile() {
	addr := fmt.Sprintf("localhost:%d", rand.Uint32N(1<<16))
	_ = os.WriteFile(fileName, []byte(addr), 0600)
}

func readHostAndPortFromFile() (string, string, error) {
	retryForever(func() error {
		_, err := os.Stat(fileName)
		return err
	})

	content, err := os.ReadFile(fileName)
	if err != nil {
		return "", "", err
	}

	host, addr, err := net.SplitHostPort(string(content))
	return host, addr, err
}

func main() {
	os.Remove(fileName)

	go writeHostAndPortToFile()

	host, addr, err := readHostAndPortFromFile()
	if err != nil {
		fmt.Fprintf(os.Stderr, "error reading: %v\n", err)
		os.Exit(1)
	}

	fmt.Printf("host=%s addr=%s\n", host, addr)
}
The code tries to (spoiler alert: not quite correctly) spin in a loop to wait for the file to appear, and then reads from it.
When we run this, sometimes it works fine, and sometimes it fails. A clear data race on a shared resource!
$ go run io_race.go
host=localhost addr=28773
$ go run io_race.go
error reading: missing port in address
exit status 1
The astute reader may have already detected a classic TOCTOU (time-of-check to time-of-use) issue: the os.Stat check and the subsequent os.ReadFile are two distinct, non-atomic operations, and the file can change (or still be empty) between them.
We have two main possible approaches: observe system calls (with the syscall::read:entry
syntax), or observe Go function calls (with the pid$target::os.ReadFile:entry
syntax). Let's do both because it will turn out useful in a moment:
pid$target::os.ReadFile:entry {
	this->name = stringof(copyin(arg0, arg1));
	ustack();
	printf("name=%s\n", this->name);
}

pid$target::os.ReadFile:return {}

pid$target::os.WriteFile:entry {
	this->name = stringof(copyin(arg0, arg1));
	ustack();
	printf("name=%s\n", this->name);
}

pid$target::os.WriteFile:return {}

pid$target::os.Stat:entry {}

pid$target::os.Stat:return {
	printf("err=%d\n", arg2);
}

syscall::write:entry
/pid==$target && arg0 > 2/
{
	self->trace = 1;
}

syscall::write:return
/pid==$target && self->trace != 0/
{
	printf("write return: res=%d\n", arg0);
	self->trace = 0;
}

syscall::read:entry
/pid==$target && arg0 > 2/
{
	printf("fd=%d\n", arg0);
	self->trace = 1;
}

syscall::read:return
/pid==$target && self->trace != 0/
{
	printf("read return: res=%d\n", arg0);
	self->trace = 0;
}
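Assuming the reproducer has been compiled to a binary (e.g. with go build io_race.go) and the script saved to a file such as data_race_rw.d, it can be run with something like:

$ go build io_race.go
$ sudo dtrace -s data_race_rw.d -c ./io_race

The -c option makes DTrace start the program and bind $target to its pid.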
Let's observe the happy case:
dtrace: script '/Users/philippe.gaultier/scratch/data_race_rw.d' matched 10 probes
host=localhost addr=45564
CPU ID FUNCTION:NAME
5 65087 os.Stat:entry
5 65162 os.Stat:return err=0
7 65081 os.WriteFile:entry
io_race`os.WriteFile
io_race`main.writeHostAndPortToFile+0x88
io_race`runtime.goexit.abi0+0x4
name=test.addr
7 177 write:return write return: res=15
7 65130 os.WriteFile:return
7 65087 os.Stat:entry
7 65162 os.Stat:return err=0
7 62760 os.ReadFile:entry
io_race`os.ReadFile
io_race`main.readHostAndPortFromFile+0x34
io_race`main.main+0x3c
io_race`runtime.main+0x278
io_race`runtime.goexit.abi0+0x4
name=test.addr
7 174 read:entry fd=4
7 175 read:return read return: res=15
7 174 read:entry fd=4
7 175 read:return read return: res=0
7 65044 os.ReadFile:return
dtrace: pid 99037 has exited
The last read returning 0 is normal: this is how os.ReadFile
detects the end of the file.
We see that this case worked out well because the write fully finished before the read started.
And now the error case (note how the syscalls are interleaved differently from the previous case):
dtrace: script '/Users/philippe.gaultier/scratch/data_race_rw.d' matched 10 probes
error reading: missing port in address
CPU ID FUNCTION:NAME
7 532133 os.Stat:entry
7 532134 os.Stat:return err=0
6 532131 os.WriteFile:entry
io_race`os.WriteFile
io_race`main.writeHostAndPortToFile+0x88
io_race`runtime.goexit.abi0+0x4
name=test.addr
8 532133 os.Stat:entry
8 532134 os.Stat:return err=0
8 532129 os.ReadFile:entry
io_race`os.ReadFile
io_race`main.readHostAndPortFromFile+0x34
io_race`main.main+0x3c
io_race`runtime.main+0x278
io_race`runtime.goexit.abi0+0x4
name=test.addr
8 174 read:entry fd=5
8 175 read:return read return: res=0
10 532130 os.ReadFile:return
6 177 write:return write return: res=15
6 532132 os.WriteFile:return
dtrace: pid 432 has exited
Now, the bug triggered because the read operation finished before the write operation could complete. Thus, the read sees no data (res=0).
Note the order of operations for this bug to occur:

1. The writer goroutine creates the file with os.WriteFile. At this point this file is empty.
2. The reader goroutine spins on os.Stat, and notices the file has been created.
3. The reader goroutine reads the file with os.ReadFile and gets zero bytes, since the data has not been written yet, so the parsing fails.
4. The writer goroutine finishes writing the data, too late.

Alright, so how do we systematically reproduce the issue, to convince ourselves that this is indeed the root cause? Ideally, we would like to add write latency, to simulate a slow disk. This is ultimately the main factor: if the write is too slow to complete, the read finishes too early, before the data (fully) appears.
My first attempt was to use the DTrace system action:
void system(string program, ...)
The system action causes the program specified by program to be executed as if it were given to the shell as input.
For example: system("sleep 10")
.
However this did nothing, because this action happens asynchronously, as noted by the docs:
The execution of the specified command does not occur in the context of the firing probe – it occurs when the buffer containing the details of the system action are processed at user-level. How and when this processing occurs depends on the buffering policy, described in Buffers and Buffering.
Alright, let's try something else, maybe chill
?
void chill(int nanoseconds)
The chill action causes DTrace to spin for the specified number of nanoseconds. chill is primarily useful for exploring problems that might be timing related. For example, you can use this action to open race condition windows, or to bring periodic events into or out of phase with one another. Because interrupts are disabled while in DTrace probe context, any use of chill will induce interrupt latency, scheduling latency, and dispatch latency.
Perfect! Let's try it (and remember to use the -w
option on the DTrace command line to allow destructive actions):
syscall::write:entry
/pid==$target/
{
+	chill(500000000);
	self->trace = 1;
}
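Concretely, the invocation now needs the -w flag (reusing the same hypothetical file names as before):

$ sudo dtrace -w -s data_race_rw.d -c ./io_race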
To limit adverse effects on the system, DTrace limits the chill
value to 500 ms:
DTrace will refuse to execute the chill action for more than 500 milliseconds out of each one-second interval on any given CPU.
This means that we significantly increased our odds of this bug occurring, but not to 100%. Still, this is enough.
Note that since goroutines are involved, the number of threads on the system and the behavior of the Go scheduler are also factors.
My biggest, niche pet peeve is seeing the use of stat(2) before doing an operation on a file. Not only is this unnecessary in nearly every case (the I/O operation itself, e.g. the open(2) preceding a read(2) or write(2), will report ENOENT if the file does not exist), not only is this a waste of time and battery, not only will many standard libraries do a stat(2) call anyway as part of reading/writing a file, but it also opens the door to various hard-to-diagnose TOCTOU issues, such as the one here.
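To make this concrete, here is a minimal Go sketch (with a made-up file name) contrasting the check-then-act pattern with simply attempting the operation and inspecting its error:

package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

func main() {
	const name = "some.file" // hypothetical file, for illustration only

	// TOCTOU-prone: check, then act. The file can appear, disappear, or be
	// truncated between the Stat and the ReadFile.
	if _, err := os.Stat(name); err == nil {
		content, err := os.ReadFile(name) // may still fail, or see different data
		fmt.Println(len(content), err)
	}

	// Simpler: just do the operation and inspect its error.
	content, err := os.ReadFile(name)
	if errors.Is(err, fs.ErrNotExist) {
		fmt.Println("not there (yet)") // retry, back off, or give up
	} else {
		fmt.Println(len(content), err)
	}
}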
I guess one fine use of stat(2) before reading a file is to discover the file size, in order to reserve capacity in a growable array before reading the file data into it; the size is just a hint anyway, to avoid lots of reallocations. This is in fact exactly what os.ReadFile does under the covers, which makes our use of os.Stat even more redundant.
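For illustration, here is a rough sketch of that size-hint idea (readWithSizeHint is a made-up name; the real os.ReadFile is more careful):

package main

import (
	"fmt"
	"io"
	"os"
)

// readWithSizeHint reads a whole file, using Stat only as a capacity hint
// to avoid reallocations. The hint may be wrong; growing the buffer handles that.
func readWithSizeHint(name string) ([]byte, error) {
	f, err := os.Open(name)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	hint := 512 // fallback if the size is unknown
	if info, err := f.Stat(); err == nil && info.Size() > 0 {
		hint = int(info.Size()) // just a hint: the file may grow or shrink concurrently
	}

	buf := make([]byte, 0, hint)
	for {
		if len(buf) == cap(buf) {
			// The hint was too small; grow the buffer (append reallocates).
			buf = append(buf, 0)[:len(buf)]
		}
		n, err := f.Read(buf[len(buf):cap(buf)])
		buf = buf[:len(buf)+n]
		if err == io.EOF {
			return buf, nil
		}
		if err != nil {
			return buf, err
		}
	}
}

func main() {
	content, err := readWithSizeHint("test.addr")
	fmt.Println(string(content), err)
}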
Now, let's fix the bug by doing less (as is often the case, I have found): we simply try to read the file and parse its content, and if we fail, we retry.
--- io_race.go	2025-10-01 18:01:47
+++ io_race_fixed.go	2025-10-01 18:14:51
@@ -25,18 +25,18 @@
 	_ = os.WriteFile(fileName, []byte(addr), 0600)
 }
 
-func readHostAndPortFromFile() (string, string, error) {
+func readHostAndPortFromFile() (host string, addr string, err error) {
 	retryForever(func() error {
-		_, err := os.Stat(fileName)
+		var content []byte
+		content, err = os.ReadFile(fileName)
+		if err != nil {
+			return err
+		}
+
+		host, addr, err = net.SplitHostPort(string(content))
 		return err
 	})
 
-	content, err := os.ReadFile(fileName)
-	if err != nil {
-		return "", "", err
-	}
-
-	host, addr, err := net.SplitHostPort(string(content))
 	return host, addr, err
 }
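For readability, here is the resulting function in full:

func readHostAndPortFromFile() (host string, addr string, err error) {
	retryForever(func() error {
		var content []byte
		content, err = os.ReadFile(fileName)
		if err != nil {
			return err
		}

		host, addr, err = net.SplitHostPort(string(content))
		return err
	})

	return host, addr, err
}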
With the patch applied, we see in the DTrace output, even with the write delay, that the bug is fixed. We sometimes do multiple reads until one succeeds with the right data.
There are other possible ways to fix this problem, such as using a lockfile, but this is the simplest.
Every time you use a stat(2)
syscall, ask yourself if this is necessary and if a TOCTOU bug is possible.
Also, I'm happy to have discovered the chill
DTrace action to simulate disk (or network, or anything really) latency. I can see myself running the test suite with this on, to detect other cases of TOCTOU. Another tool in the toolbox!