
Published on 2024-10-29

Tip of the day #1: Count lines of Rust code, ignoring tests


I have a Rust codebase at work. The other day, I was wondering how many lines of code it contains. Whether you use wc -l **/*.rs or a fancier tool like tokei, there is an issue: the count includes the tests as well as the implementation.

That's because in Rust, and in some other languages, people write their tests in the same files as the implementation. Typically it looks like this:

// src/foo.rs

fn foo() {
    ...
}

#[cfg(test)]
mod tests {
    #[test]
    fn test_foo() {
        ...
    }

    ...
}

But I only want to know how big the implementation is. I don't care about the tests. And neither wc nor tokei will show me that.

So I resorted to my trusty awk. Let's first count all lines, like wc does:

$ awk '{count += 1} END{print(count)}' src/**/*.rs
# Equivalent to:
$ wc -l src/**/*.rs

On my open-source Rust project, this prints 11485.

Alright, now let's exclude the tests. When we encounter the line mod tests, we stop counting for that file. Note that this name is just a convention, but it is one that is followed pretty much universally in Rust code, and there is usually no more code after this section. Tweak the name if needed:

$ awk '/mod tests/{skip[FILENAME]=1} !skip[FILENAME]{count += 1} END{print(count)}' src/**/*.rs

On the same project, this prints 10057.
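To check the behavior on something smaller than a real project, here is a minimal sketch that runs both commands on two throwaway files. The file names a.rs and b.rs and their contents are made up for illustration, they are not from my codebase:

```shell
# Hypothetical fixture: two tiny Rust files in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' 'fn a() {}' '' '#[cfg(test)]' 'mod tests {' '    #[test]' '    fn t() {}' '}' > "$tmp/a.rs"
printf '%s\n' 'fn b() {}' 'fn c() {}' > "$tmp/b.rs"

# Count every line of every file, like `wc -l` does.
total=$(awk '{count += 1} END{print(count)}' "$tmp"/*.rs)

# Count only the lines that come before `mod tests` in each file.
impl=$(awk '/mod tests/{skip[FILENAME]=1} !skip[FILENAME]{count += 1} END{print(count)}' "$tmp"/*.rs)

echo "total=$total impl=$impl"   # prints: total=9 impl=5
rm -r "$tmp"
```

Note that the #[cfg(test)] line is still counted, since it comes before mod tests. For a rough size estimate, being off by one line per file seems acceptable to me.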

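Let's unpack it:

The first rule, /mod tests/{skip[FILENAME]=1}, runs on every line matching mod tests and sets a flag for the current file in the skip array. The second rule, !skip[FILENAME]{count += 1}, runs on every line for which the flag of the current file is not set, and increments the counter. Since the rules are evaluated in order, the mod tests line itself is not counted. Finally, the END rule runs once after all the input has been read and prints the counter.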

And that's it. AWK is always very nifty.

Addendum: exit

Originally I implemented it wrongly, like this:

$ awk '/mod tests/{exit 0} {count += 1} END{print(count)}' src/**/*.rs

The idea was: if we encounter tests, stop processing the file altogether, with the built-in statement exit (docs).

Running this on the same Rust codebase prints 1038, which is obviously wrong.

Why is it wrong then?

Well, as I understand it, AWK processes all input files one by one, as if they were one big sequential file (it still updates the built-in variable FILENAME though, which is why the solution above works). Since there is no isolation between the processing of each file (AWK does not spawn a subprocess per file), exit stops reading input altogether at the first test module encountered in any file, and jumps straight to the END rule.
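A small experiment makes this visible. This is only a sketch on two made-up files (a.rs and b.rs are hypothetical). It also shows one possible alternative to the skip array: resetting a plain flag whenever FNR, the per-file line number, goes back to 1:

```shell
# Hypothetical fixture: the first file has a test module, the second has none.
tmp=$(mktemp -d)
printf '%s\n' 'fn a() {}' 'mod tests {' '}' > "$tmp/a.rs"
printf '%s\n' 'fn b() {}' 'fn c() {}' > "$tmp/b.rs"

# `exit` ends all input processing: b.rs is never read, only END still runs.
with_exit=$(awk '/mod tests/{exit 0} {count += 1} END{print(count)}' "$tmp/a.rs" "$tmp/b.rs")

# FNR restarts at 1 for each input file, so the flag is cleared per file.
with_reset=$(awk 'FNR==1{skip=0} /mod tests/{skip=1} !skip{count += 1} END{print(count)}' "$tmp/a.rs" "$tmp/b.rs")

echo "exit=$with_exit reset=$with_reset"   # prints: exit=1 reset=3
rm -r "$tmp"
```

Some AWK implementations, GNU awk for example, also provide a nextfile statement that skips the rest of the current file. I did not rely on it here so that the sketch only uses features I expect to work in any AWK.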


This blog is open-source! If you find a problem, please open a GitHub issue. The content of this blog, as well as the code snippets, is under the BSD-3 License, which I also usually use for all my personal projects. It's basically free for every use, but you have to mention me as the original author.