cross-posted from: https://programming.dev/post/10657765

I made a replacement for the venerable paccheck. It checks whether files managed by the package manager have changed and, if so, reports that back to the user. Unlike paccheck it is cross-distro (it supports Debian too and could be extended further), and it uses all your CPU cores to be as fast as possible.

Oh and it is written in Rust (that may be a plus or minus depending on your opinion, but it wouldn’t have happened at all in any language except Rust, and Rust makes it very easy to add this sort of parallelism).
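
To make the idea concrete, here is a minimal sketch of a parallel file check using only the Rust standard library. It is not taken from paketkoll's source: the `Expected` record, the `Issue` enum, and the functions `check_one` and `check_all` are hypothetical names, and the sketch only compares file sizes, whereas a real checker would presumably also compare checksums, permissions, and so on.

```rust
use std::fs;
use std::path::PathBuf;
use std::thread;

/// Hypothetical record of what the package database expects for one file.
#[derive(Debug, Clone)]
struct Expected {
    path: PathBuf,
    size: u64,
}

/// Problems this sketch can detect (illustrative, far from exhaustive).
#[derive(Debug, PartialEq)]
enum Issue {
    /// The file is missing or its metadata could not be read.
    Inaccessible(PathBuf),
    /// The file exists but its size differs from the recorded one.
    SizeChanged { path: PathBuf, expected: u64, actual: u64 },
}

/// Check a single file against its expected metadata.
fn check_one(e: &Expected) -> Option<Issue> {
    match fs::metadata(&e.path) {
        Err(_) => Some(Issue::Inaccessible(e.path.clone())),
        Ok(m) if m.len() != e.size => Some(Issue::SizeChanged {
            path: e.path.clone(),
            expected: e.size,
            actual: m.len(),
        }),
        Ok(_) => None,
    }
}

/// Split the file list into roughly one chunk per core and check the
/// chunks in parallel. Results come back in input order.
fn check_all(files: &[Expected]) -> Vec<Issue> {
    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    // Ceiling division, with a minimum of 1 so `chunks` never gets 0.
    let chunk_size = ((files.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = files
            .chunks(chunk_size)
            .map(|chunk| {
                s.spawn(move || chunk.iter().filter_map(check_one).collect::<Vec<Issue>>())
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().unwrap())
            .collect()
    })
}
```

Calling `check_all(&files)` returns one `Issue` per file that deviates from its record. Static chunking like this is the simplest option; a work-stealing pool would likely balance better when file sizes vary a lot.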

There are more details (including benchmarks) in the README on GitHub. Maybe it is useful to some of you.

(The main goal of this project is not actually the program produced so far, but to continue building this into a library. I have a larger project in the planning phase that needs this (in library form) as part of it.)

    • Vorpal@programming.dev (OP)
      8 months ago

      I went ahead and implemented support for filtering packages (just made a new release: v0.1.3).

      paketkoll is of course still faster than pacman. Here are two examples that show a small package (where it doesn’t really matter that much) and a huge package (where it makes a massive difference). Excuse the strange paths; this is straight from the development tree.

      Let’s check pacman itself, and let’s include config files too (not sure if pacman even has that option?). Including config files or not doesn’t make a measurable difference though:

      $ hyperfine -i -N --warmup 1 "./target/release/paketkoll --config-files=include pacman" "pacman -Qkk pacman"
      Benchmark 1: ./target/release/paketkoll --config-files=include pacman
        Time (mean ± σ):      14.0 ms ±   0.2 ms    [User: 21.1 ms, System: 19.0 ms]
        Range (min … max):    13.4 ms …  14.5 ms    216 runs
       
        Warning: Ignoring non-zero exit code.
       
      Benchmark 2: pacman -Qkk pacman
        Time (mean ± σ):      20.2 ms ±   0.2 ms    [User: 11.2 ms, System: 8.8 ms]
        Range (min … max):    19.9 ms …  21.1 ms    147 runs
       
      Summary
        ./target/release/paketkoll --config-files=include pacman ran
          1.44 ± 0.02 times faster than pacman -Qkk pacman
      

      Let’s check davinci-resolve as well, which is massive (5.89 GB):

      $ hyperfine -i -N --warmup 1 "./target/release/paketkoll --config-files=include pacman davinci-resolve" "pacman -Qkk pacman davinci-resolve"
      Benchmark 1: ./target/release/paketkoll --config-files=include pacman davinci-resolve
        Time (mean ± σ):     770.8 ms ±   4.3 ms    [User: 2891.2 ms, System: 641.5 ms]
        Range (min … max):   765.8 ms … 778.7 ms    10 runs
       
        Warning: Ignoring non-zero exit code.
       
      Benchmark 2: pacman -Qkk pacman davinci-resolve
        Time (mean ± σ):     10.589 s ±  0.018 s    [User: 9.371 s, System: 1.207 s]
        Range (min … max):   10.550 s … 10.620 s    10 runs
       
        Warning: Ignoring non-zero exit code.
       
      Summary
        ./target/release/paketkoll --config-files=include pacman davinci-resolve ran
         13.74 ± 0.08 times faster than pacman -Qkk pacman davinci-resolve
      

      What about some mid-sized packages (vtk 359 MB, linux 131 MB)?

      $ hyperfine -i -N --warmup 1 "./target/release/paketkoll vtk" "pacman -Qkk vtk"
      Benchmark 1: ./target/release/paketkoll vtk
        Time (mean ± σ):      46.4 ms ±   0.6 ms    [User: 204.9 ms, System: 93.4 ms]
        Range (min … max):    45.7 ms …  48.8 ms    65 runs
       
      Benchmark 2: pacman -Qkk vtk
        Time (mean ± σ):     702.7 ms ±   4.4 ms    [User: 590.0 ms, System: 109.9 ms]
        Range (min … max):   698.6 ms … 710.6 ms    10 runs
       
      Summary
        ./target/release/paketkoll vtk ran
         15.15 ± 0.23 times faster than pacman -Qkk vtk
      
      $ hyperfine -i -N --warmup 1 "./target/release/paketkoll linux" "pacman -Qkk linux"
      Benchmark 1: ./target/release/paketkoll linux
        Time (mean ± σ):      34.9 ms ±   0.3 ms    [User: 95.0 ms, System: 78.2 ms]
        Range (min … max):    34.2 ms …  36.4 ms    84 runs
       
      Benchmark 2: pacman -Qkk linux
        Time (mean ± σ):     313.9 ms ±   0.4 ms    [User: 233.6 ms, System: 79.8 ms]
        Range (min … max):   313.4 ms … 314.5 ms    10 runs
       
      Summary
        ./target/release/paketkoll linux ran
          9.00 ± 0.09 times faster than pacman -Qkk linux
      

      For small sizes where neither tool performs much work, most of the time is spent on fixed overheads that both tools have (loading the binary, setting up glibc internals, parsing the command line arguments, etc.). For medium sizes paketkoll pulls ahead quite rapidly. And for large sizes pacman is painfully slow.
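
      One way to read the hyperfine numbers: dividing CPU time (user + system) by wall-clock time gives the average number of cores kept busy, and dividing the two wall-clock means gives the speedup. The sketch below does this arithmetic for the davinci-resolve run above; the `Run` struct and the function names are made up for illustration, and the values are the means copied from that benchmark, converted to milliseconds.

```rust
/// One hyperfine result; all times are means in milliseconds.
struct Run {
    wall: f64,
    user: f64,
    system: f64,
}

impl Run {
    /// Average number of cores kept busy: CPU time / wall-clock time.
    fn cores_busy(&self) -> f64 {
        (self.user + self.system) / self.wall
    }
}

/// How many times faster `a` is than `b` in wall-clock time.
fn speedup(a: &Run, b: &Run) -> f64 {
    b.wall / a.wall
}

/// Means from the davinci-resolve benchmark: (paketkoll, pacman).
fn davinci_runs() -> (Run, Run) {
    let paketkoll = Run { wall: 770.8, user: 2891.2, system: 641.5 };
    let pacman = Run { wall: 10589.0, user: 9371.0, system: 1207.0 };
    (paketkoll, pacman)
}
```

      With these inputs, `cores_busy` comes out at roughly 4.6 for paketkoll and roughly 1.0 for pacman, which is consistent with pacman doing this check on a single core, and `speedup` reproduces the 13.74 that hyperfine reports.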

      Just for laughs I decided to check an empty meta-package (base, 0 bytes). Here pacman actually beats paketkoll, slightly. Not a useful scenario, but for full transparency I should include it:

      $ hyperfine -i -N --warmup 1 "./target/release/paketkoll base" "pacman -Qkk base"
      Benchmark 1: ./target/release/paketkoll base
        Time (mean ± σ):      13.3 ms ±   0.2 ms    [User: 15.3 ms, System: 18.8 ms]
        Range (min … max):    12.8 ms …  14.1 ms    218 runs
       
      Benchmark 2: pacman -Qkk base
        Time (mean ± σ):       8.8 ms ±   0.2 ms    [User: 2.8 ms, System: 5.8 ms]
        Range (min … max):     8.4 ms …  10.0 ms    327 runs
       
      Summary
        pacman -Qkk base ran
          1.52 ± 0.05 times faster than ./target/release/paketkoll base
      

      I always start a thread pool regardless of whether I have work to do (and changing that would slow down the case I actually care about). That is the most likely cause of this slightly larger fixed overhead.
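
      For readers unfamiliar with the pattern, here is a rough sketch of what "always start a thread pool" means, using only the standard library. It is not paketkoll's implementation (which pool it actually uses is not stated here); `EagerPool` and its methods are hypothetical names. The point is that `new` spawns every worker up front, so even a run with zero jobs pays for creating and joining the threads.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

type Job = Box<dyn FnOnce() + Send>;

/// Minimal eager pool: all workers are spawned in `new`, before we
/// know whether any job will ever be submitted.
struct EagerPool {
    sender: Option<mpsc::Sender<Job>>,
    workers: Vec<thread::JoinHandle<()>>,
}

impl EagerPool {
    fn new(size: usize) -> Self {
        let (sender, receiver) = mpsc::channel::<Job>();
        let receiver = Arc::new(Mutex::new(receiver));
        let workers = (0..size)
            .map(|_| {
                let receiver = Arc::clone(&receiver);
                thread::spawn(move || loop {
                    // The lock is released at the end of this statement,
                    // so jobs run without holding it.
                    let job = receiver.lock().unwrap().recv();
                    match job {
                        Ok(job) => job(),
                        Err(_) => break, // channel closed and drained
                    }
                })
            })
            .collect();
        EagerPool { sender: Some(sender), workers }
    }

    /// Queue a job for whichever worker becomes free first.
    fn submit(&self, job: impl FnOnce() + Send + 'static) {
        self.sender.as_ref().unwrap().send(Box::new(job)).unwrap();
    }

    /// Close the channel, wait for all workers, and return how many
    /// threads had to be joined.
    fn shutdown(mut self) -> usize {
        drop(self.sender.take());
        let joined = self.workers.len();
        for worker in self.workers.drain(..) {
            worker.join().unwrap();
        }
        joined
    }
}
```

      `EagerPool::new(4).shutdown()` still creates and joins four threads without running anything, which is the kind of fixed cost that only shows up in a case like the empty `base` package.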