Rust’s SemVer Snares: Sizedness and Size

2021-01-05

In Rust, changes to a type’s size are not usually understood to be Breaking Changes™. Of course, that isn’t to say you can’t break safe downstream code by changing the size of a type…

Sizedness

For one, you can change the sizedness of a type by adding an unsized field:

pub mod upstream {
  #[repr(C)]
  pub struct Foo {
    bar: u8,
    // uncommenting this field is a breaking change:
    /* baz: [u8] */
  }
}

pub mod downstream {
  use super::upstream::*;

  fn example(foo: Foo) {
    todo!()
  }
}

error[E0277]: the size for values of type `[u8]` cannot be known at compilation time
  --> src/lib.rs:11:14
   |
11 |   fn example(foo: Foo) {
   |              ^^^ doesn't have a size known at compile-time
   |
   = help: within `upstream::Foo`, the trait `Sized` is not implemented for `[u8]`
   = note: required because it appears within the type `upstream::Foo`
help: function arguments must have a statically known size, borrowed types always have a known size
   |
11 |   fn example(&foo: Foo) {
   |              ^
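The error arises because an unsized value can only be handled behind a pointer. As a rough sketch of that distinction, the block below uses a hypothetical generic variant of Foo (the type parameter T and the helpers example and demo are assumptions made for illustration, not part of the code above) so that an unsized instance can be built in safe code and passed by reference:

```rust
use std::mem::size_of_val;

// Hypothetical generic variant of `Foo`: `T: ?Sized` allows the
// last field to be an unsized type such as `[u8]`.
#[allow(dead_code)]
#[repr(C)]
pub struct Foo<T: ?Sized> {
    bar: u8,
    baz: T,
}

// Accepting `Foo<[u8]>` by value would be rejected with E0277;
// accepting it by reference compiles, because a reference always
// has a statically known size.
pub fn example(foo: &Foo<[u8]>) -> usize {
    size_of_val(foo)
}

// Builds a sized `Foo<[u8; 3]>` and lets the compiler coerce the
// reference to `&Foo<[u8]>` at the call site.
pub fn demo() -> usize {
    let sized: Foo<[u8; 3]> = Foo { bar: 0, baz: [1, 2, 3] };
    example(&sized)
}
```

Downstream code that only ever touches such a type through references is unaffected by this particular change, though that is a property of the sketch rather than a general guarantee.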

Size

Changing the size of a Sized type can also break (poorly-behaving) downstream code. The mem::size_of function is safe to call and returns the size (in bytes) of any Sized type. By convention, downstream code should not rely on mem::size_of producing a SemVer-stable result, but that’s only a convention; nothing in the language enforces it. Consider:

pub mod upstream {
  #[repr(C)]
  pub struct Foo {
    bar: u8,
    // uncommenting this field is a breaking change for `downstream`:
    /* baz: u8 */
  }
}

pub mod downstream {
  use super::upstream::*;
  
  const _: [(); 1] = [(); std::mem::size_of::<Foo>()];
}

error[E0308]: mismatched types
  --> src/lib.rs:12:22
   |
12 |   const _: [(); 1] = [(); std::mem::size_of::<Foo>()];
   |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected an array with a fixed size of 1 element, found one with 2 elements
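To make the observation concrete, here is a minimal sketch in which FooV1 and FooV2 are hypothetical stand-ins for the two versions of upstream’s Foo (the names and the observed_sizes helper are assumptions for illustration). It shows the two values that safe code would see from mem::size_of before and after the field is added:

```rust
use std::mem::size_of;

// Hypothetical stand-in for `upstream::Foo` before the change.
#[allow(dead_code)]
#[repr(C)]
pub struct FooV1 {
    bar: u8,
}

// Hypothetical stand-in for `upstream::Foo` after `baz` is added.
#[allow(dead_code)]
#[repr(C)]
pub struct FooV2 {
    bar: u8,
    baz: u8,
}

// Returns the sizes, in bytes, that safe code observes for the
// two versions of the type.
pub fn observed_sizes() -> (usize, usize) {
    (size_of::<FooV1>(), size_of::<FooV2>())
}
```

Any downstream code that bakes the first value into a type, as the array length above does, stops compiling once the second value applies.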

Zero Sizedness

Avoiding mem::size_of is not enough for a downstream crate author to stay within the stability contract of upstream code. Since 2018, there has been another mechanism that observes the size of a type: #[repr(transparent)].

The repr(transparent) attribute can be applied to types with at most one non-zero-sized field to specify that the annotated type’s layout is identical to that of the field. Applying repr(transparent) to a type with more than one non-zero-sized field is a compiler error:

#[repr(transparent)]
pub struct Foo {
    bar: u8,
    baz: u8
}

error[E0690]: transparent struct needs exactly one non-zero-sized field, but has 2
 --> src/lib.rs:2:1
  |
2 | pub struct Foo {
  | ^^^^^^^^^^^^^^ needs exactly one non-zero-sized field, but has 2
3 |     bar: u8,
  |     ------- this field is non-zero-sized
4 |     baz: u8
  |     ------- this field is non-zero-sized
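By contrast, a type with one non-zero-sized field and any number of zero-sized fields is accepted. A minimal sketch, assuming a hypothetical wrapper named Tagged whose second field is a PhantomData marker (both the name and the layout_matches helper are illustrative):

```rust
use std::marker::PhantomData;
use std::mem::{align_of, size_of};

// Hypothetical wrapper: one non-zero-sized field (`u8`) plus a
// zero-sized marker field, which `repr(transparent)` permits.
#[repr(transparent)]
pub struct Tagged(pub u8, PhantomData<String>);

// Checks that the wrapper has the same size and alignment as its
// single non-zero-sized field.
pub fn layout_matches() -> bool {
    size_of::<Tagged>() == size_of::<u8>()
        && align_of::<Tagged>() == align_of::<u8>()
}
```

The attribute compiles here only because the marker field is zero-sized, which is exactly the property the compiler has to inspect.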

Consequently, upstream changes that turn ZSTs into non-ZSTs can break downstream code:

pub mod upstream {
  #[repr(C)]
  pub struct Foo {
    bar: (),
    // uncommenting this field is a breaking change for `downstream`:
    /* baz: u8, */
  }
}

pub mod downstream {
  use super::upstream::*;

  #[repr(transparent)]
  struct Bar(u8, Foo);
}

error[E0690]: transparent struct needs exactly one non-zero-sized field, but has 2
  --> src/lib.rs:12:3
   |
12 |   struct Bar(u8, Foo);
   |   ^^^^^^^^^^^--^^---^^
   |   |          |   |
   |   |          |   this field is non-zero-sized
   |   |          this field is non-zero-sized
   |   needs exactly one non-zero-sized field, but has 2

You should therefore avoid relying on an upstream type being zero-sized inside a #[repr(transparent)] type unless upstream documents that the type will remain a ZST.

Email comments and corrections to jack@wrenn.fyi.