Tracking issue for RFC 1861: Extern types #43467

Open
1 of 3 tasks
aturon opened this issue Jul 25, 2017 · 285 comments
Labels
A-ffi Area: Foreign Function Interface (FFI) B-RFC-implemented Blocker: Approved by a merged RFC and implemented. B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. F-extern_types `#![feature(extern_types)]` S-tracking-needs-summary Status: It's hard to tell what's been done and what hasn't! Someone should do some investigation. T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@aturon
Member

aturon commented Jul 25, 2017

This is a tracking issue for RFC 1861 "Extern types".

Steps:

Unresolved questions:

@aturon aturon added B-RFC-approved Blocker: Approved by a merged RFC but not yet implemented. T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Jul 25, 2017
@aturon aturon changed the title Tracking issue for RFC 1861: Extern tyupes Tracking issue for RFC 1861: Extern types Jul 25, 2017
@jethrogb
Contributor

jethrogb commented Jul 25, 2017

This is not explicitly mentioned in the RFC, but I'm assuming different instances of extern type are actually different types? Meaning this would be illegal:

extern {
    type A;
    type B;
}

fn convert_ref(r: &A) -> &B { r }

@canndrew
Contributor

@jethrogb That's certainly the intention, yes.

@glaebhoerl
Contributor

glaebhoerl commented Jul 25, 2017

Relatedly, is deciding whether we want to call it extern type or extern struct something that can still be done as part of the stabilization process, or is the extern type syntax effectively final as of having accepted the RFC?

EDIT: rust-lang/rfcs#2071 is also relevant here w.r.t. the connotations of type "aliases". In stable Rust a type declaration is "effect-free" and just a transparent alias for some existing type. Both extern type and type Foo = impl Bar would change this by making it implicitly generate a new module-scoped existential type or type constructor (nominal type) for it to refer to.

@Ericson2314
Contributor

Can we get a bullet for the panic vs DynSized debate?

@plietar
Contributor

plietar commented Aug 10, 2017

I've started working on this, and I have a working simple initial version (no generics, no DynSized).

I've however noticed a slight usability issue. In FFI code, it's frequent for raw pointers to be initialized to null using std::ptr::null/null_mut. However, the function only accepts sized type arguments, since it would not be able to pick a metadata for the fat pointer.

Despite being unsized, extern types are used through thin pointers, so it should be possible to use std::ptr::null.

It is still possible to cast an integer to an extern type pointer, but this is not as nice as just using the function designed for this. Also this can never be done in a generic context.

extern {
    type foo;
}
fn null_foo() -> *const foo {
    0usize as *const foo
}

What we'd really want is a new trait to distinguish types which use thin pointers. It would be implemented automatically for all sized types and extern types. Then the cast above would succeed whenever the type is bounded by this trait. E.g., the function std::ptr::null becomes:

fn null<T: ?Sized + Thin>() -> *const T {
    0usize as *const T
}

However, there's a risk of more and more such traits creeping up, such as DynSized, making it confusing for users. There's also some overlap with the various custom RFC proposals which allow arbitrary metadata. For instance, instead of Thin, Referent<Meta=()> could be used.
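
For illustration only, here is a minimal sketch of the idea on stable Rust. The `Thin` trait, its `null_ptr` method and the `Foo` stand-in are hypothetical names, not anything in std or in the RFC. Since stable has no extern types, `Foo` is a zero-sized placeholder and is therefore covered by the blanket impl for sized types; a real design would also need the compiler to implement the trait for extern types.

```rust
use std::ptr;

/// Hypothetical marker for types whose raw pointers carry no metadata.
/// This is an assumption for the sketch, not a real std trait.
pub trait Thin {
    /// Returns a null pointer to `Self`.
    fn null_ptr() -> *const Self;
}

// Every sized type uses thin pointers (the implicit `T: Sized` bound applies here).
impl<T> Thin for T {
    fn null_ptr() -> *const T {
        ptr::null()
    }
}

/// Stable stand-in for an opaque foreign type (hypothetical).
#[repr(C)]
pub struct Foo {
    _private: [u8; 0],
}

/// A `null` that only asks for `Thin`, not `Sized`.
pub fn null<T: ?Sized + Thin>() -> *const T {
    T::null_ptr()
}
```

With this shape, `null::<Foo>()` type-checks without requiring `Foo: Sized` at the call boundary, which is the property the comment above is asking for.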

@SimonSapin
Contributor

I think we can add extern types now and live with std::ptr::null not supporting them for a while until we figure out what to do about Thin/DynSized/Referent<Meta=…> etc.

@plietar
Contributor

plietar commented Aug 10, 2017

@SimonSapin yeah, it's definitely a minor concern for now.

I do think this problem of not having a trait bound to express "this type may be unsized but must have a thin pointer" might crop up in other places though.

@SimonSapin
Contributor

Oh yeah, I agree we should solve that eventually too. I’m only saying we might not need to solve all of it before we ship any of it.

@plietar
Contributor

plietar commented Sep 3, 2017

I've pushed an initial implementation in #44295

bors added a commit that referenced this issue Oct 28, 2017
Implement RFC 1861: Extern types

A few notes :

- Type parameters are not supported. This was an unresolved question from the RFC. It is not clear how useful this feature is, and how variance should be treated. This can be added in a future PR.

- `size_of_val` / `align_of_val` can be called with extern types, and respectively return 0 and 1. This differs from the RFC, which specified that they should panic, but after discussion with @eddyb on IRC this seems like a better solution.
If/when a `DynSized` trait is added, this will be disallowed statically.

- Auto traits are not implemented by default, since the contents of extern types is unknown. This means extern types are `!Sync`, `!Send` and `!Freeze`. This seems like the correct behaviour to me.
Manual `unsafe impl Sync for Foo` is still possible.

- This PR allows extern type to be used as the tail of a struct, as described by the RFC :
```rust
extern {
    type OpaqueTail;
}

#[repr(C)]
struct FfiStruct {
    data: u8,
    more_data: u32,
    tail: OpaqueTail,
}
```

However this is undesirable, as the alignment of `tail` is unknown (the current PR assumes an alignment of 1). Unfortunately we can't prevent it in the general case as the tail could be a type parameter :
```rust
#[repr(C)]
struct FfiStruct<T: ?Sized> {
    data: u8,
    more_data: u32,
    tail: T,
}
```

Adding a `DynSized` trait would solve this as well, by requiring tail fields to be bound by it.

- Despite being unsized, pointers to extern types are thin and can be cast from/to integers. However it is not possible to write a `null<T>() -> *const T` function which works with extern types, as I've explained here : #43467 (comment)

- Trait objects cannot be built from extern types. I intend to support it eventually, although how this interacts with `DynSized`/`size_of_val` is still unclear.

- The definition of `c_void` is unmodified
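
For comparison, a sketch of the zero-sized stand-in commonly used on stable in place of an extern type. The `Opaque` name and the helper functions are hypothetical. The stand-in reports size 0 and alignment 1 through the ordinary `size_of`/`align_of`, the same numbers this PR returns from `size_of_val`/`align_of_val`, and the raw-pointer marker keeps it `!Send`/`!Sync` until the author opts in manually.

```rust
use std::marker::{PhantomData, PhantomPinned};
use std::mem::{align_of, size_of};

/// Stable stand-in for an opaque foreign type (hypothetical name).
/// The `*mut u8` marker suppresses the `Send`/`Sync` auto traits,
/// and `PhantomPinned` suppresses `Unpin`.
#[repr(C)]
pub struct Opaque {
    _data: [u8; 0],
    _marker: PhantomData<(*mut u8, PhantomPinned)>,
}

// Manual opt-in, as the note above describes for extern types.
// Only sound if the foreign library really is thread-safe.
unsafe impl Sync for Opaque {}

/// Compiles only when `T: Sync`; used to check the opt-in above.
pub fn is_sync<T: ?Sized + Sync>() -> bool {
    true
}

/// Returns `(size, alignment)` of the stand-in.
pub fn opaque_layout() -> (usize, usize) {
    (size_of::<Opaque>(), align_of::<Opaque>())
}
```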
@kennytm
Member

kennytm commented Nov 16, 2017

@plietar In #44295 you wrote

Auto traits are not implemented by default, since the contents of extern types is unknown. This means extern types are !Sync, !Send and !Freeze. This seems like the correct behaviour to me. Manual unsafe impl Sync for Foo is still possible.

While it is possible for Sync, Send, UnwindSafe and RefUnwindSafe, doing impl Freeze for Foo is not possible as it is a private trait in libcore. This means it is impossible to convince the compiler that an extern type is cell-free.

Should Freeze be made public (even if #[doc(hidden)])? cc @eddyb #41349.

Or is it possible to declare an extern type is safe-by-default, which opt-out instead of opt-in?

extern {
    #[unsafe_impl_all_auto_traits_by_default]
    type Foo;
}
impl !Send for Foo {}

@eddyb
Member

eddyb commented Nov 16, 2017

@kennytm What's the usecase? The semantics of extern type are more or less that of a hack being used before the RFC, which is struct Opaque(UnsafeCell<()>);, so the lack of Freeze fits.
That prevents rustc from telling LLVM anything different from what C signatures in clang result in.

@kennytm
Member

kennytm commented Nov 16, 2017

@eddyb Use case: Trying to see if it's possible to make CStr a thin DST.

I don't see anything related to a cell in #44295? It is reported to LLVM as an i8 similar to str. And the places where librustc_trans involves the Freeze trait reads the real type, not the LLVM type, so LLVM treating all extern type as i8 should be irrelevant?

@eddyb
Member

eddyb commented Nov 16, 2017

@kennytm So with extern type CStr;, writes through &CStr would be legal, and you don't want that?
The Freeze trait is private because it's used to detect UnsafeCell and not meant to be overridden.

@RalfJung
Member

as I previously proposed (sample usage), Box<T, A> can be changed to have impl Drop for Box<T, A> just call a trait instead of the usual drop_in_place and deallocation (that trait's impl for allocators does dropping and deallocation), this allows FFI code to just have that trait impl call whatever destroy/free/release FFI function you need.

That sounds like what you actually want is the ability to write impl Drop for ExternType.

@programmerjake
Member

That sounds like what you actually want is the ability to write impl Drop for ExternType.

that seems kinda problematic, since you can't actually pass an extern type around by value, and references/pointers aren't a good substitute for Box. Also, the idea of having a trait for dropping boxes is quite useful for automatic object pooling and other things that aren't related to extern types.

@Skepfyr
Contributor

Skepfyr commented Apr 24, 2024

What would be the benefits of min_extern_types over:

mod hide_the_internals {
  struct ExternType;
  pub struct OwnedExternType(*mut ExternType);
  pub struct ExternTypeRefMut<'a>(*mut ExternType, PhantomData<&'a mut ExternType>);
  pub struct ExternTypeRef<'a>(*const ExternType, PhantomData<&'a ExternType>);
}

Given all the limitations of extern types, I'm not clear on how having them at all would be better than just hiding a unit struct.
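
To make the comparison concrete, here is a runnable sketch of that wrapper pattern on stable. Everything beyond the original struct names is an assumption: the `fake_*` functions stand in for real foreign functions (here the "foreign" object is just a heap-allocated `u32`), and `round_trip` and `as_extern_ref` are hypothetical helpers.

```rust
mod hide_the_internals {
    use std::marker::PhantomData;

    // Private: code outside this module cannot name or construct it.
    struct ExternType;

    pub struct OwnedExternType(*mut ExternType);
    pub struct ExternTypeRef<'a>(*const ExternType, PhantomData<&'a ExternType>);

    // Hypothetical stand-ins for the foreign API. A real binding would
    // declare these in an `extern "C"` block instead.
    fn fake_create(value: u32) -> *mut ExternType {
        Box::into_raw(Box::new(value)).cast()
    }
    fn fake_get(ptr: *const ExternType) -> u32 {
        // SAFETY: `ptr` always comes from `fake_create` and is still live.
        unsafe { *ptr.cast::<u32>() }
    }
    fn fake_destroy(ptr: *mut ExternType) {
        // SAFETY: `ptr` comes from `fake_create` and is destroyed exactly once.
        unsafe { drop(Box::from_raw(ptr.cast::<u32>())) }
    }

    impl OwnedExternType {
        pub fn new(value: u32) -> Self {
            Self(fake_create(value))
        }
        /// Borrows the handle; the lifetime ties the view to `self`.
        pub fn as_extern_ref(&self) -> ExternTypeRef<'_> {
            ExternTypeRef(self.0, PhantomData)
        }
    }

    impl Drop for OwnedExternType {
        fn drop(&mut self) {
            fake_destroy(self.0);
        }
    }

    impl ExternTypeRef<'_> {
        pub fn get(&self) -> u32 {
            fake_get(self.0)
        }
    }
}

/// Creates a handle, reads through a borrowed view, then drops the handle.
pub fn round_trip(value: u32) -> u32 {
    let owned = hide_the_internals::OwnedExternType::new(value);
    owned.as_extern_ref().get()
}
```

Callers outside the module only ever see the wrapper types, so they cannot form a `&ExternType` or a `*mut ExternType` themselves.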

@RalfJung
Member

That's a fair question. Your proposal completely hides the extern type from the outside so it seems basically equivalent to the "only provide pointer types" variant of min-extern-type. My proposed variant would go a bit further and accommodate some of the things the compiler does with extern types. But the compiler actually implements a bunch of traits for structs with extern type tail as well, which probably means it relies on instantiating generic types with such an extern-type-tail struct.

@jmillikin
Contributor

A Rust reference must be aligned and you don't know the alignment of UiWidget. A Rust reference must point to a valid object, and the pointer might have metadata stuffed into the high or low bits.

why not just say that rustc doesn't know what the alignment should be, so it just assumes that whatever unsafe code originally made the reference was correct and doesn't assume that there is any particular alignment...therefore having a reference to an extern type should be valid, rather than saying rust can't have references without known alignments even though it's perfectly capable of passing the address around correctly.

Passing around the address (without additional semantics) is what *mut T (or NonNull<T>, or usize) is for. A usize, *mut (), *mut T, NonNull<T>, and &mut () all have the same representation in the registers, but rustc treats them differently.
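
A small sketch of the "same representation" point, using a zero-sized stable stand-in for `UiWidget` (the struct body and helper names are assumptions). All of these pointer-like types occupy one machine word, and `Option<NonNull<T>>` stays that size thanks to the null niche; what differs is the set of guarantees the compiler attaches to each.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Stable stand-in for an opaque foreign widget type (hypothetical body).
#[repr(C)]
pub struct UiWidget {
    _private: [u8; 0],
}

/// Sizes of the pointer-like types listed above, in order.
pub fn pointer_sizes() -> [usize; 5] {
    [
        size_of::<usize>(),
        size_of::<*mut ()>(),
        size_of::<*mut UiWidget>(),
        size_of::<NonNull<UiWidget>>(),
        size_of::<&mut ()>(),
    ]
}

/// `None` is encoded as the null pointer, so no extra space is needed.
pub fn option_nonnull_size() -> usize {
    size_of::<Option<NonNull<UiWidget>>>()
}
```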

Constructing a reference that is misaligned or dangling is undefined behavior, and (except in extremely narrow circumstances) there's no way to know whether a C opaque pointer would meet Rust's requirements. Could your FFI code handle the case where the call returns structs with different alignments at runtime?


@RalfJung I think the compiler's use of extern types to implement thin-pointer DSTs is probably unhelpful -- it continues to tangle up the two completely separate use cases:

  1. Types defined in an external language and used to represent C opaque pointers. This is what the original RFC covers in the summary and motivation sections, and the only thing those types need to do is be pointed to without being dereferenceable. These types should not be usable with size_of_val() or align_of_val().

  2. Types representing a memory region of dynamic size that is prefixed with some fixed-layout header, as used in RawList and discussed in RFC 3536. These types have known alignments, have sizes that can be peek'd, and are defined within Rust code (i.e. are not "external").

It's unfortunate that the original RFC enabled both patterns in such a way that the two features became conjoined, and I think the most likely path forward involves separating them.

@jmillikin
Contributor

@Skepfyr That design is one of the common ways to implement external types in stable, and it's mentioned in the RFC. The main downside is that code with access to struct ExternType; can accidentally use it "as a structure" -- it's not a fatal drawback, but it's a bit unfortunate.

It's possible that the subset of extern type that could be an MVP is essentially just the existing ZST structure pattern plus a bit of extra type-checking to enforce opacity.

@RalfJung
Member

RalfJung commented Apr 24, 2024

Constructing a reference that is misaligned or dangling is undefined behavior, and (except in extremely narrow circumstances) there's no way to know whether a C opaque pointer would meet Rust's requirements. Could your FFI code handle the case where the call returns structs with different alignments at runtime?

Rust's requirements for extern types would be to not require alignment since we don't know it. Same for data validity. There's really no problem here. Undefined Behavior arises when the code violates the assumptions made by the compiler, but here the compiler cannot make any assumptions, and therefore the code cannot violate them. This is, in fact, already implemented correctly both in codegen and in Miri.

I think the compiler's use of extern types to implement thin-pointer DSTs is probably unhelpful -- it continues to tangle up the two completely separate use cases:

I tend to agree, what the compiler does here is a hack.

The main downside is that code with access to struct ExternType; can accidentally use it "as a structure" -- it's not a fatal drawback, but it's a bit unfortunate.

That's impossible in the design proposed by @Skepfyr as that struct is private to the module.

@jmillikin
Contributor

jmillikin commented Apr 24, 2024

Rust's requirements for extern types would be to not require alignment since we don't know it. Same for data validity. There's really no problem here. Undefined Behavior arises when the code violates the assumptions made by the compiler, but here the compiler cannot make any assumptions, and therefore the code cannot violate them. This is, in fact, already implemented correctly both in codegen and in Miri.

I'm not trying to be snarky when I write this response, so please read literally: if extern types are compatible with Rust's reference semantics and the compiler already correctly emits code for them, then (1) why does size_of_val() cause a runtime panic and (2) why is stabilization blocked on incorrect handling of Box<T>?

It seems to me that the main blocker for extern type is that <T: ?Sized> makes assumptions about references that are guaranteed by the current semantics, and are broken by extern types. Given that extern types as described in the RFC could be restricted to pointers without breaking their main use case, and doing so would also fix the remaining known blockers, it's worth considering that as an approach.


That's impossible in the design proposed by @Skepfyr as that struct is private to the module.

mod hide_the_internals {
  struct ExternType;
  pub struct OwnedExternType(*mut ExternType);
  pub struct ExternTypeRefMut<'a>(*mut ExternType, PhantomData<&'a mut ExternType>);
  pub struct ExternTypeRef<'a>(*const ExternType, PhantomData<&'a ExternType>);

  fn do_something(x: &ExternType) { ... }
}

I know it's not the most critical failure mode, but considering that solving that specific drawback is the motivation for the RFC that this tracking issue exists for, I think it's worth at least keeping in mind.

@RalfJung
Member

RalfJung commented Apr 24, 2024

why does size_of_val() cause a runtime panic

Because that's the most correct least dangerous semantics.

I was only talking about the runtime semantics. The type system clearly needs work. I never claimed the feature is ready.

We were talking about the question of alignment requirements of references, why are you now deflecting by talking about size_of_val? Your claim, as I understood it, was specifically that there's a problem defining the validity invariant for references (which typically involves alignment) as we don't know the alignment of these types. Two people now told you that this is in fact not a problem as we can just make the validity invariant not require any particular alignment for extern types.

why is stabilization blocked on incorrect handling of Box<T>?

I don't know what you are referring to here. Stabilization is blocked on the issue that the Rust type system does not understand the concept of a type that does not have a dynamically computable size.

mod hide_the_internals {

If you change other people's working code then it may not work any more. Not sure which point you are trying to make here. Usually when we evaluate the correctness of other people's code we only consider what can be done by calling that code as-if it was in another crate; we can't just access private fields or private types. It's trivial to break basically everything if you allow yourself to access private fields and private types.

But if we have e.g. a macro that generates the mod hide_the_internals then it won't be possible to do that.

@jmillikin
Contributor

We were talking about the question of alignment requirements of references, why are you now deflecting by talking about size_of_val?

My point is that size_of_val() (and align_of_val()) currently accept references to external types, but crash because those operations do not make sense for external types. When C code returns an opaque pointer, that pointer cannot be assumed to meet the requirements of Rust references that existing APIs depend on, so creating a reference to such a type should be considered undefined behavior.

A practical way to solve that problem is to prevent external types from being used as references, because if the only thing they're good for is their bit pattern then they're not actually references in the Rust sense.

why is stabilization blocked on incorrect handling of Box?

I don't know what you are referring to here. Stabilization is blocked on the issue that the Rust type system does not understand the concept of a type that does not have a dynamically computable size.

I'm referring to #115709, in which a Box<T> is allowed to be constructed for an extern type despite that being undefined behavior (because extern types do not have a size or alignment). And then the thing gets passed as a dereferenced T function parameter, which should also be undefined behavior because there's no way to move or copy such a type.

If you change other people's working code then it may not work any more. Not sure which point you are trying to make here.

But if we have a macro that generates the mod hide_the_internals then it won't be possible to do that.

I thought I was clear in my point, but in case not, the RFC for this tracking issue exactly describes the use of a ZST for extern types in the "motivation" section, and also describes why that is not the ideal solution.

Scrapping the extern type implementation and replacing it with ZSTs would be one possible approach, but I think it would be frustrating to the people who are subscribed to this tracking issue in the hopes of one day seeing this feature stabilize.

@crumblingstatue
Contributor

crumblingstatue commented Apr 24, 2024

A practical way to solve that problem is to prevent external types from being used as references, because if the only thing they're good for is their bit pattern then they're not actually references in the Rust sense.

I have a use case that looks similar to this, that utilizes references to opaque types:

#![feature(extern_types)]

#[repr(C)]
#[derive(Debug)]
struct TextureSize {
    width: u32,
    height: u32,
}

extern "C" {
    type FfiSprite;
    
    fn ffi_sprite_create() -> *mut FfiSprite;
    fn ffi_sprite_get_texture(sprite: *const FfiSprite) -> *const Texture;
    
    type Texture;
    
    fn ffi_texture_get_size(texture: *const Texture) -> TextureSize;
}


struct Sprite<'texture> {
    handle: *mut FfiSprite,
    // The handle to the texture is on the FFI side
    _texture: core::marker::PhantomData<&'texture Texture>,
}

impl<'texture> Sprite<'texture> {
    fn new() -> Self { Self {
        handle: unsafe { ffi_sprite_create() },
        _texture: std::marker::PhantomData,
    }}
    fn texture(&self) -> &Texture {
        unsafe { &*ffi_sprite_get_texture(self.handle) }
    }
}

// Directly implement the opaque type
impl Texture {
    // Method by reference
    fn size(&self) -> TextureSize {
        unsafe { ffi_texture_get_size(self) }
    }
}

fn main() {
    let sprite = Sprite::new();
    let texture = sprite.texture();
    dbg!(texture.size());
}

Is this/should this be undefined behavior because you can't have a reference to an opaque type?

Here is a real life example in rust-sfml (using pseudo-opaque types similar to what bindgen generates, but the principle is the same): https://github.com/jeremyletang/rust-sfml/blob/3e275e06f408343c63133108e65009d1f4e6295c/src/graphics/texture.rs

Is this undefined behavior? In all my testing it worked fine.

@Skepfyr
Contributor

Skepfyr commented Apr 24, 2024

I think there are usability problems with what I've described, reborrowing the Ref(Mut)? versions is painful, and it means that no-one else can name a *mut ExternType preventing them from using it directly (which I think might actually be a problem for -sys crates). I agree the fact that you can misuse it from within the module is bad, but we already have the module as the scope of unsafety so I think it's acceptable.

I agree we should consider getting the opaque tail stuff working a separate issue, it's incredibly painful due to alignment, and I haven't seen a working proposal yet.

@jmillikin Box<T> wouldn't be an issue though if we banned extern types from showing up in generic parameters, and it's not constructable anyway so it's of limited concern. On references, I think banning them from generics also solves this problem, as it prevents them from being passed to *_of_val, or any other API that could inspect the type behind the reference. A reference to an extern type does match Rust's existing semantics, if you assume it doesn't satisfy T: ?Sized, that's why rust-lang/rfcs#3396 works, it proposes to add another bound less restrictive than the current meaning of T: ?Sized.

@RalfJung
Member

RalfJung commented Apr 24, 2024

My point is that size_of_val() (and align_of_val()) currently accept references to external types, but crash because those operations do not make sense for external types.

Yes. As I said, that's a type system issue. They crash in a safe way (a non-unwinding panic) so this is not unsound.

When C code returns an opaque pointer, that pointer cannot be assumed to meet the requirements of Rust references that existing APIs depend on, so creating a reference to such a type should be considered undefined behavior.

This does not follow, not at all. UB is a very big hammer we use to enable the compiler to optimize code better. "size_of_val would panic" is most definitely not an excuse for introducing UB. The consequences of accidental UB are way too severe to gratuitously use it here.

Ideally we will use the type system to prevent size_of_val on references or pointers to extern types. But if that doesn't work we can use a runtime check like we do right now, with a panic. That's still miles better than Undefined Behavior. There's no need to risk arbitrary memory corruption just because someone created a reference to an extern type.
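
As a rough sketch of the "use the type system" option, one can emulate a `DynSized`-style bound with an ordinary trait on stable. `DynSizedLike`, `Opaque` and `size_of_val_checked` are hypothetical names, and the trait is implemented by hand for a few types here, whereas a real `DynSized` would presumably be compiler-provided.

```rust
use std::mem;

/// Hypothetical stand-in for a `DynSized`-style trait: "the size of a
/// value of this type can be computed at run time".
pub trait DynSizedLike {}

impl DynSizedLike for u32 {}
impl DynSizedLike for str {}
impl<T> DynSizedLike for [T] {}

/// Stable stand-in for an opaque foreign type. It deliberately does NOT
/// implement `DynSizedLike`.
#[repr(C)]
pub struct Opaque {
    _private: [u8; 0],
}

/// Like `size_of_val`, but rejected at compile time for types that do
/// not opt in, instead of panicking at run time.
pub fn size_of_val_checked<T: ?Sized + DynSizedLike>(value: &T) -> usize {
    mem::size_of_val(value)
}
```

With this bound, a call such as `size_of_val_checked(opaque_ref)` for an `&Opaque` fails to type-check, so the runtime panic path never has to exist for that function.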

I'm referring to #115709, in which a Box is allowed to be constructed for an extern type despite that being undefined behavior (because extern types do not have a size or alignment). And then the thing gets passed as a dereferenced T function parameter, which should also be undefined behavior because there's no way to move or copy such a type.

That issue isn't about Box, it is about unsized_fn_params, which (like size_of_val) assumes that all types are DynSized. There are indeed some gaps in the support for the combination of these two unstable features. Whatever type system solution is used to safeguard size_of_val against extern types should also be used to ensure unsized function parameters (and unsized locals) are dyn-sized.

@kennytm
Member

kennytm commented Apr 24, 2024

@crumblingstatue

IIUC #43467 (comment) the proposed min_extern_type feature did not outright "ban" &Texture.

One big issue of &Texture was that align_of_val::<Texture>(_) is undefined, so a Rust program could not properly generate a valid &Texture reference [1]. According to https://doc.rust-lang.org/reference/behavior-considered-undefined.html this was one of the UB scenarios:

  • Producing an invalid value, … The following values are invalid (at their respective type):
    • A reference or Box<T> that is dangling, misaligned, or points to an invalid value.

But given that Rust can never safely create a Texture/Box<Texture> value or dereference a &Texture place this sentence can be relaxed or clarified for extern type (treating them effectively as align-1 behind a pointer, similar to the treatment of dyn Trait in LLVM IR).

@jmillikin

A practical way to solve that problem is to prevent external types from being used as references, because if the only thing they're good for is their bit pattern then they're not actually references in the Rust sense.

I don't see how banning &Texture is the practical solution. Using the min_extern_type restriction i.e.

  1. Texture does not implement ?Sized, i.e. can't be substituted into any generic arguments or associated types
  2. Texture can't be used as a struct member field (except #[repr(transparent)])

it can already prevent {size,align}_of_val{,_raw}::<Texture>() from being even instantiated, and Texture can't be used as a struct tail either, side-stepping the entire alignment/size question.

I'm not sure what "references in the Rust sense" encompasses. IMO at least calling methods given a &mut Texture / &Texture do make sense.

Footnotes

  1. That said, the Texture type in the real-world code already produced an under-aligned structure: declare_opaque! results in an align-1 ZST, but the actual C++ definition of sf::Texture contains an Uint64 m_cacheId field meaning the actual alignment must be at least 8.
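
The mismatch described in this footnote can be sketched as follows. Both struct definitions are assumptions for illustration: `OpaqueTexture` mimics an align-1 zero-sized placeholder, and `ForeignTexture` models only the single 64-bit field mentioned above, not the real `sf::Texture` layout.

```rust
use std::mem::align_of;

/// Align-1 zero-sized placeholder, as an opaque-type macro might generate
/// (hypothetical definition).
#[repr(C)]
pub struct OpaqueTexture {
    _private: [u8; 0],
}

/// Hypothetical partial model of the foreign definition: one 64-bit field
/// is enough to raise the required alignment.
#[repr(C)]
pub struct ForeignTexture {
    pub cache_id: u64,
}

/// Returns `(placeholder alignment, modelled foreign alignment)`.
pub fn alignments() -> (usize, usize) {
    (align_of::<OpaqueTexture>(), align_of::<ForeignTexture>())
}
```

The placeholder's alignment is 1, while the modelled foreign struct inherits the alignment of `u64` on the target, so the Rust-side type claims less alignment than the real object has.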

@jmillikin
Contributor

@kennytm The problem with the proposed min_extern_types is the inability to use such a type as a generic parameter. It's sacrificing an important capability (extern type generics) to retain something unnecessary (extern type references).

Given the following sets of goals for an extern type T:

  • Allow *const T and *mut T, which are required in FFI signatures.
  • Allow NonNull<T> and Option<NonNull<T>>, for niche optimization.
  • Forbid size_of_val::<T>(_) and align_of_val::<T>(_), as extern types don't have sizes or alignments.
  • Forbid struct W(T), Option<T>, [T], and &[T], as a consequence of not having sizes or alignments.

... the proposal to forbid their use as references solves all four, and allowing references but forbidding generics would leave two unsolved.

@programmerjake
Member

programmerjake commented Apr 25, 2024

  • Forbid struct W(T), Option<T>, [T], and &[T], as a consequence of not having sizes or alignments.

the types Option<T>, [T], and [T; N] are forbidden anyway for extern types since extern types aren't Sized.

@kennytm
Member

kennytm commented Apr 25, 2024

@jmillikin Forbidding &ExternType alone will not prevent calling align_of_val_raw which takes a pointer, not a reference.

Meanwhile, forbidding references but not generics means the following code is still possible:

// crate A
pub struct Tailed<T: ?Sized> {
    a: usize,
    b: T,
}

// crate B
extern { type Texture; }
type TailedTexture = crate_a::Tailed<Texture>;

The NonNull<T> case can be supported once we have sorted out the Sized hierarchy issue. ?Sized could become an alias of DynSized and the bound of NonNull<T> will be changed from ?Sized to Unsized.

@jmillikin
Contributor

@kennytm I don't think size_of_val_raw() and align_of_val_raw() matter that much, because they're not stable and therefore their signatures can change to have new trait bounds or whatever.

I don't understand why the code you provided would be valid for extern type -- they don't have a size or alignment, so they can't be struct fields.

@RalfJung
Member

RalfJung commented Apr 25, 2024

You can't forbid references without forbidding extern types in generics. Otherwise it's always possible to use generics to bypass the restriction.

@programmerjake
Member

programmerjake commented Apr 25, 2024

I don't understand why the code you provided would be valid for extern type -- they don't have a size or alignment, so they can't be struct fields.

I think it should be valid for extern type, since that allows you to declare a struct that you can make references to that is unsized and is a header for whatever unknown larger struct is actually in memory. actually accessing the extern type field shouldn't be valid, unless the extern type has a known alignment, e.g.:

extern {
    #[repr(align(4))]
    type ExternTypeWithKnownAlignment;
}

@kennytm
Member

kennytm commented Apr 25, 2024

@jmillikin

I don't think size_of_val_raw() and align_of_val_raw() matter that much, because they're not stable and therefore their signatures can change to have new trait bounds or whatever.

What really matters is the intrinsic they are being forwarded to (min_align_of_val) are actually used by the language to compute alignment of a struct tail field.

I don't understand why the code you provided would be valid for extern type -- they don't have a size or alignment, so they can't be struct fields.

  1. If it is forbidden from "crate A", how could it know that T is an extern type and has unspecified size and alignment?
  2. If it is forbidden from "crate B", how could it know that T is being used as a struct tail (given that all members of Tailed are private)?

The min_extern_type restriction is to forbid such code from "crate B" (since from "crate A"'s perspective, Tailed<[u8]> and Tailed<dyn Trait> must continue to work) by declaring ExternType does not implement ?Sized, simple.
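
For reference, a sketch of why "crate A" cannot simply reject unsized tails: on stable today, `Tailed<[u8]>` and `Tailed<dyn Trait>` are both reachable through unsizing coercions, and the size of the whole struct is computed at run time from the tail's dynamic size and alignment. The `#[repr(C)]` attribute and the helper function names are assumptions added to keep the layout easy to reason about.

```rust
use std::fmt::Debug;
use std::mem::{size_of, size_of_val};

#[repr(C)]
pub struct Tailed<T: ?Sized> {
    a: usize,
    b: T,
}

/// Returns (dynamic size via `&Tailed<[u8]>`, static size of `Tailed<[u8; 3]>`).
pub fn slice_tail_size() -> (usize, usize) {
    let sized = Tailed { a: 1, b: [1u8, 2, 3] };
    // Unsizing coercion: the array tail becomes a slice tail.
    let unsized_ref: &Tailed<[u8]> = &sized;
    (size_of_val(unsized_ref), size_of::<Tailed<[u8; 3]>>())
}

/// Returns (dynamic size via `&Tailed<dyn Debug>`, static size of `Tailed<u64>`).
pub fn dyn_tail_size() -> (usize, usize) {
    let sized = Tailed { a: 1, b: 7u64 };
    // Unsizing coercion: the concrete tail becomes a trait-object tail,
    // whose size and alignment are read from the vtable at run time.
    let unsized_ref: &Tailed<dyn Debug> = &sized;
    (size_of_val(unsized_ref), size_of::<Tailed<u64>>())
}
```

An extern type in the tail position has no such dynamic size or alignment to consult, which is the gap the restriction above is meant to close from the "crate B" side.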


@programmerjake

unless the extern type has a known alignment, e.g.:

While it sounds good in theory I doubt in practice if there is any actual scenario where the FFI type specified the exact alignment yet the size and all fields are implementation details. Usually you either have none or all of these 3 details.

Like for the <cstdio> type FILE*, either you just use whatever returned by fopen() and don't care about its alignment/size/fields, or you'll actually use the complete definition like

#[repr(C)]
struct FILE {
    _flags: c_int,
    _IO_read_ptr: *mut c_char,
    ...
}

I see no case that you specifically only want to know align_of::<FILE>() == align_of::<usize>() and nothing else.

The only exception might be dynamic-sized array, like an array tail like xxxx_t tail[0], or C-style string const char* / const wchar_t*, but I think they are a different kind of Custom DST that shouldn't be hacked through extern type.

@nikomatsakis
Contributor

@RalfJung

That's not giving the function an extra capability, but gives the caller an extra capability! Those are very different things. In fact they are the exact opposite of each other (in a precise, formal sense even). I think it would be extremely confusing to treat them like the same thing.

I agree they are not the same (and my blog post said that). However, my point was that I do not think that understanding this distinction is as important as you do, at least at first, and I think that by the time it is, people will be able to handle it. In contrast, the ? operator puts that price up front -- it marks this bound as a very different thing -- in a way that I think can intimidate readers and isn't especially necessary for them to understand early on. Now, if ?Sized could scale to all the use cases we need, then maybe that'd be one thing, but it seems to me that it cannot (though maybe there are proposals I've missed; I feel like I have to do a bit of a review, since I didn't realize how much conversation had been going on).

Let me explain this a different way. Early on in their Learning Rust journey, I think the main way that people will interact with Unsized or DynSized bounds is when calling library APIs -- i.e., reading them in documentation, not authoring them. They might not realize that Unsized or DynSized is in fact saying that this function accepts a broader set of types (unlike Debug, which narrows it). But so long as they can tell that their types implement those traits, do they care? I don't think so.

Maybe they then go to author an impl that needs T: Unsized to be maximally applicable. My experience in authoring such impls is that I often forget the T: ?Sized initially. I write the code. It type checks. Then I go to use it, and I have to add some bounds. This is ... surprisingly similar to what happens with ordinary bounds. Of course the "compiler complains, add bound, recompile" cycle is happening when I use the impl, not when I author the impl, but the overall interplay feels similar. Especially because there is a kind of reverse thing that sometimes plays out, where I add bounds to write the code, then find out that the types I want to use don't support those bounds (sometimes for good reason), and I have to go back and refactor the code not to require so many bounds.
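The "write it, then discover the missing bound at the use site" cycle can be sketched on stable Rust with today's ?Sized syntax. The function names below are hypothetical and chosen only for this illustration.

```rust
use std::fmt::Debug;

// First attempt: type checks, but the implicit `T: Sized` bound
// silently excludes `str`, slices, and trait objects.
fn describe_sized<T: Debug>(value: &T) -> String {
    format!("{:?}", value)
}

// After hitting the error at a call site, the bound is relaxed.
fn describe<T: Debug + ?Sized>(value: &T) -> String {
    format!("{:?}", value)
}

fn main() {
    let text: &str = "hi";
    let numbers: &[i32] = &[1, 2, 3];

    // describe_sized(text); // rejected: `str` does not have a size known at compile time
    println!("{}", describe_sized(&42));
    println!("{}", describe(text));
    println!("{}", describe(numbers));
}
```

The body of the function is identical in both versions; only the set of types the caller may pass changes, which is the sense in which the relaxed bound gives the caller an extra capability.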

All that said, I think we would really benefit from doing some experimentation here. It's often hard to judge how something will feel after using it for a while. My hunch is that Unsized bounds will feel pretty natural, but I'd feel better if we had a nightly implementation to play with. And, as I said, I'd like to review the other ideas. I'm open to something better. But mostly I don't want us to delay forever and ever.

@nikomatsakis
Contributor

(Has anybody written lints to detect when a type parameter could have been ?Sized?)

@zetanumbers
Contributor

zetanumbers commented Apr 27, 2024

My concern is that drop glue for extern types should not be instantiated at all, whereas it currently resolves to an empty drop_in_place. The same goes for needs_drop, which should always return true instead of false, just like with dyn Trait.

P.S. It should probably become some sort of post-monomorphization (or linker) error, since this feature usually relies on the linker finding symbols anyway. I could try to implement that. Perhaps this could also be done for size_of_val? I would say that calling drop_in_place most probably indicates a bug in the code. I wonder what a crater run would say about that.

@Skepfyr
Contributor

Skepfyr commented Apr 27, 2024

@zetanumbers Could you give a code example of what you're talking about? I can't imagine a scenario where drop glue would be emitted for an extern type, because you can't ever have an owned extern type in Rust. Similarly, I'd expect needs_drop to return false, because you can't implement Drop for extern types (which means I'd expect drop_in_place to do nothing).

@zetanumbers
Contributor

@zetanumbers Could you give a code example of what you're talking about? I can't imagine a scenario where drop glue would be emitted for an extern type, because you can't ever have an owned extern type in Rust. Similarly, I'd expect needs_drop to return false, because you can't implement Drop for extern types (which means I'd expect drop_in_place to do nothing).

It is as you have said: drop_in_place::<ForeignType> is instantiated with empty drop glue.

I'd expect needs_drop to return false, because you can't implement Drop for extern types

You cannot implement Drop for a number of types including dyn Trait for which needs_drop would return true.

I understand that you don't see a difference yourself; I am just trying to find out which behavior people would prefer, if they care.

@RalfJung
Member

You cannot implement Drop for a number of types including dyn Trait for which needs_drop would return true.

dyn Trait forwards Drop to the actual underlying type. So not being able to implement Drop here is an entirely different matter.

Extern types do not have any implicit drop glue, so there's no good reason for needs_drop to return true here. needs_drop() == false just means that the drop shim does nothing, and indeed drop_in_place does nothing for these types, so all works out.
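The needs_drop behavior being compared here can be checked on stable Rust for the non-extern cases. In this sketch, `Opaque` is a hypothetical plain struct used as a stand-in, since extern types themselves require nightly; the values it prints for `dyn Debug` and `String` reflect the behavior described in this thread.

```rust
use std::fmt::Debug;
use std::mem::needs_drop;

// Hypothetical stand-in: a type with no fields and no `Drop` impl.
// (An actual extern type needs `#![feature(extern_types)]` on nightly.)
#[allow(dead_code)]
struct Opaque;

fn main() {
    // No drop glue of its own, so dropping it does nothing.
    println!("Opaque:    {}", needs_drop::<Opaque>());
    // The trait object forwards to the drop glue of the underlying type,
    // which is unknown here, so the answer is conservative.
    println!("dyn Debug: {}", needs_drop::<dyn Debug>());
    // Owns a heap allocation that must be freed.
    println!("String:    {}", needs_drop::<String>());
}
```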

Maybe we want to allow impl Drop for ExternType in the future, but that's a separate discussion and the current behavior is forward-compatible with that.

@zetanumbers
Contributor

zetanumbers commented Apr 27, 2024

I am worried that such behavior may affect support for unforgettable types. Extern types seem like a very raw feature meant for unsafe code only, which is not inherently bad, but I wonder whether usage from safe code is planned.

Also note that if unforgettable types are indeed only meaningful when they have a lifetime, as I believe would be the case without accounting for other hypothetical features, then extern types should probably not be allowed to have lifetime arguments.

@Skepfyr
Contributor

Skepfyr commented Apr 27, 2024

I don't see how extern types are particularly interesting here. In terms of needs_drop and drop_in_place, I think extern type Foo acts exactly the same as struct Foo. Currently, you can't implement Drop for an extern type, but as Ralf says, it's not clear that restriction will stay forever, and if it is lifted, an extern type with a Drop impl will continue to look like struct Foo with one: needs_drop will return true and drop_in_place will call the drop function.
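A minimal sketch of the struct Foo comparison on stable Rust, with and without a Drop impl. The type names, the counter, and the `drops_during_drop_in_place` helper are all hypothetical, introduced only to make the effect of drop_in_place observable.

```rust
use std::mem::{needs_drop, ManuallyDrop};
use std::ptr;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times `WithDrop::drop` has run.
static DROPS: AtomicUsize = AtomicUsize::new(0);

// Behaves like today's extern types: no `Drop` impl, so dropping is a no-op.
struct Plain;

// Behaves like a hypothetical future extern type with `impl Drop`.
struct WithDrop;

impl Drop for WithDrop {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Runs `drop_in_place` on `value` exactly once and reports how many
// `WithDrop` destructors ran as a result.
fn drops_during_drop_in_place<T>(value: T) -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    // `ManuallyDrop` prevents a second, automatic drop at end of scope.
    let mut slot = ManuallyDrop::new(value);
    unsafe { ptr::drop_in_place(&mut *slot as *mut T) };
    DROPS.load(Ordering::SeqCst) - before
}

fn main() {
    println!("Plain:    needs_drop = {}", needs_drop::<Plain>());
    println!("WithDrop: needs_drop = {}", needs_drop::<WithDrop>());
    println!("Plain:    destructors run = {}", drops_during_drop_in_place(Plain));
    println!("WithDrop: destructors run = {}", drops_during_drop_in_place(WithDrop));
}
```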
