Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CIR][CodeGen][LowerToLLVM] End-to-end implementation of offload_* cases for OpenCL with SPIR-V target #724

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

seven-mile
Copy link
Collaborator

@seven-mile seven-mile commented Jul 7, 2024

This PR implements offload_* cases discussed in this thread.

  • Integrate target-specific CIR-to-LLVM address space map into TargetLoweringInfo
  • CIRGen: Implement these cases in getValueFromLangAS
  • Lowering: Extend the state of type converter with the CIR-to-LLVM address space map

When frontend provides a new LangAS like opencl_generic, it would be processed by CIRGenTypes and Builder.getPointerTo() and encoded as offload_generic.

When we lower CIR to LLVM, the address space map is queried from TargetLoweringInfo at the beginning. General targets like X86 or ARM64 use the default map, which maps all non-target cases to 0. For SPIR-V target here, it further maps offload_generic to 4.

Copy link
Collaborator Author

@seven-mile seven-mile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sitio-couto Several ABI-related refactor and new code paths. I'll appreciate it if you can take a look!

@@ -59,6 +59,15 @@ bool ItaniumCXXABI::classifyReturnType(LowerFunctionInfo &FI) const {
return false;
}

class AppleARM64CXXABI : public ItaniumCXXABI {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AppleARM64CXXABI is used because Nathan introduced macOS host in the test driver.c. When we create LowerModule, the CXXABI is created meanwhile, leading to NYI. It's simple so I just add it here for completeness.


// FIXME: No actual usage of this rewriter, just to make the ctor happy.
mlir::PatternRewriter rewriter{&getContext()};
LowerModule lowerModule = createLowerModule(module, rewriter);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All LowerModule stuff is unpolished. @sitio-couto and I reached an agreement to defer the unification of the states like LowerModule. All the info needed here is the CIR AS map in TargetLoweringInfo, which is trivial enough. It's just for future convenience to define rewriteAddrSpace method in LowerModule.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's trivial enough why we need a specific pass?

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is doing different things and moving code while at it.

I'd like to see this in more incremental and well explained pieces (e.g. SPIRV skeleton can be added separately, you also don't provide any description about this address map stuff, a sudden change to struct type which indirectly increases memory size of every single struct allocation, etc).

The approach also feels a bit over engineered, see my questions inline. If you have refactoring to do, please do them separately.

address space, which means it's not yet converted by the address space map
to carry target-specific semantics.
The address space attribute is used in pointer types. It essentially models
`clang::LangAS` rather than the LLVM address space.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps change It essentially models clang::LangAS rather than the LLVM address space to It essentially provides an unified models on top of clang::LangAS, rather than LLVM address spaces

@@ -35,11 +35,14 @@ struct StructTypeStorage : public TypeStorage {
bool packed;
StructType::RecordKind kind;
ASTRecordDeclInterface ast;
// Extra unique key to differentiate replacements on mutable parts
int replaceVersion;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This struct change sounds super adhoc and specific, doesn't fit the overall picture. Please elaborate why we need this and what are you trying to do, so perhaps we can find a different solution?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. The whole AttrTypeSubElementHandler was taken from LLVMStructType, which is necessary for pure C++ types as such to support walk and replace infrastructure in MLIR -- It's necessary for the AS lowering pass: recursivelyReplaceElementsIn to work.

/// Allow walking and replacing the subelements of a LLVMStructTypeStorage key.
template <>
struct AttrTypeSubElementHandler<LLVM::detail::LLVMStructTypeStorage::Key> {
static void walk(const LLVM::detail::LLVMStructTypeStorage::Key &param,
AttrTypeImmediateSubElementWalker &walker) {
if (param.isIdentified())
walker.walkRange(param.getIdentifiedStructBody());
else
walker.walkRange(param.getTypeList());
}
static FailureOr<LLVM::detail::LLVMStructTypeStorage::Key>
replace(const LLVM::detail::LLVMStructTypeStorage::Key &param,
AttrSubElementReplacements &attrRepls,
TypeSubElementReplacements &typeRepls) {
// TODO: It's not clear how we support replacing sub-elements of mutable
// types.
if (param.isIdentified())
return failure();
return LLVM::detail::LLVMStructTypeStorage::Key(
typeRepls.take_front(param.getTypeList().size()), param.isPacked());
}
};

But they left a TODO comment for the mutable case. I thought an extra tracing key would work, so gave it a try.

@@ -41,6 +41,8 @@ std::unique_ptr<Pass> createGotoSolverPass();
/// Create a pass to lower ABI-independent function definitions/calls.
std::unique_ptr<Pass> createCallConvLoweringPass();

std::unique_ptr<Pass> createAddrSpaceLoweringPass();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we need a separate pass to do address space lowering, I don't see any good reason why it shouldn't be done directly in CIRGen.

Copy link
Collaborator Author

@seven-mile seven-mile Jul 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it can be done in CIRGen immediately. Did you mean the TypeConverter of DirectToLLVM? I also prefer that way and already implemented it previously. If we use TypeConverter instead, we can also get rid of recursivelyReplaceElementsIn and replaceVersion (the handler should still be fixed though, can be another patch). It just bother a bit ConvertCIRToLLVMPass::runOnOperation to maintain the states required by LowerModule, and a lazy initialization logic for LowerModule as we do not always have triple present. If it's okay, let me update the changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just bother a bit ConvertCIRToLLVMPass::runOnOperation to maintain the states required by LowerModule

Can you elaborate a bit what would be needed specifically? Would it require a hacky solution or just some extra work?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. This is the patch looks like:
https://gist.github.com/seven-mile/c57ddfb96e180af1609d429e1e66f67b

Apart from messing up ConvertCIRToLLVMPass::runOnOperation, it doesn't look too hacky IMO.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Thoughts:

  • The lazy initialization of the lowering state requiring synchronization doesn't look right to me... The address space map is going to be small and immutable, so maybe you could even pass it by value to prepareTypeConverter(...)?
  • Would it make sense to just use an empty map in case no triple is present? Or asked differently, would we expect offload AS to be used for the default target?
  • In the gist you mention the triple is not available for .cir test-cases -- could you (or @sitio-couto) address that in a separate PR first? I'd expect that we only need to specify that for OpenCL-related test-cases, assuming there's a well-defined default in case no triple is given.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, perhaps I've been too insistent on using an ideal method to call TargetLoweringInfo. We could opt for an interim solution (passing AS map by value) and refine it as it becomes more accessible.

A feasible approach would be:

// All CIR AS maps have static lifetime, it's safe to ref it
auto getCIRASMap = [&]() -> mlir::cir::AddressSpaceAttr::map_t const & {
  // If the triple is not present, e.g. CIR modules parsed from text, we
  // cannot init LowerModule properly.
  assert(!::cir::MissingFeatures::makeTripleAlwaysPresent());
  // Here we will use a default AS map to ignore all AS stuff.
  static const mlir::cir::AddressSpaceAttr::map_t defaultASMap{0};
  if (module->hasAttr("cir.triple")) {
    mlir::PatternRewriter rewriter{&getContext()};
    auto lowerModule = mlir::cir::createLowerModule(module, rewriter);
    return lowerModule.getTargetLoweringInfo().getCIRAddrSpaceMap();
  } else {
    return defaultASMap;
  }
};
prepareTypeConverter(converter, dataLayout, getCIRASMap());


// FIXME: No actual usage of this rewriter, just to make the ctor happy.
mlir::PatternRewriter rewriter{&getContext()};
LowerModule lowerModule = createLowerModule(module, rewriter);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's trivial enough why we need a specific pass?

@@ -0,0 +1,72 @@
//===- AArch64.cpp --------------------------------------------------------===//
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filename is missing one letter and header is wrong.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! For the filename, both clang libBasic and libCodeGen use SPIR.cpp, because there are spir and spirv target in clang. Usually spirv is based on spir with some overrides, e.g. class SPIRVTargetCodeGenInfo : public CommonSPIRTargetCodeGenInfo. Here I didn't include spir because it's not used yet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gotcha, thanks for the clarification

@bcardosolopes
Copy link
Member

Let me try to expand a bit on what I meant by over design, and ask a few more clarifying questions, since I might have misunderstood part of the approach.

This PR abstracts target specific address map into more generic, unified approach. It does that by keeping at the CIR level a more generic representation for address spaces, example: addrspace(offload_local) instead of a target specific number.

CIR is then lowered as part of target specific ABI lowering to a CIR representation that has a target specific address space information.
Q: Is this right?
Q: I don't see any tests checking what become of those when the ABI lowering is done, do they become addrspace(target<N>) at that point?
Q: What happens when the __attribute__... is speficied with an arbitrary N number, let's say 777?, is that still represented by addrspace(target<N>)?

I find that ABI lowering for address space a bit premature. If I'm writing a pass on top of CIR, now I need to know whether that CIR is pre or post target ABI lowering to try understand what's version of address spaces I should be looking at...
Q: In case other __attribute__s are used within the same code, with arbitrary numbers, how is one supposed to work with the mixed things?
Q: What kind of compiler transformations are you imagining that would require this kind of distiction this early? When users of OpenCL want to write a transformation, do they care whether they are unified or already target specific?

If you have to use a AttrTypeSubElementHandler to replace every pointer inside a struct, this isn't a viable approach IMO. It's adding extra complexity for something that has no use or no intend use - or please specify what you have in mind here for for future transformations and analysis. A better place to do that instead would be during LLVM lowering, such that we keep the unified info all the in CIR and convert it to target specific whenever we have to convert pointers anwyays. ABI info should be designed to allow queries from within LowerToLLVM, not sure if it's possible to do that now, but I brought up this point to @sitio-couto previously.

In a gist, I'd like to keep this out of target lowering, I really not convinced by any trade offs.

GSoC mentor feedback hat here: in the future, please elaborate on workflow and design decisions on the PR description, preferably with examples. Also, before going all the way into design (which I think it's pretty cool and good work overall), consider first introducing the simple and direct approach, once that lands, you can incrementally think about how to redesign on top of existing, reliable testcases. Same is true about use cases, in CIR we try not to raise representations without a concrete use case in mind, that could be a guiding approach to consider. As an example: the RFC was very nice, but probably an improvement that could have come up incrementally, once we have fledged how the pieces connect in practice for more than opencl address spaces.

@seven-mile
Copy link
Collaborator Author

CIR is then lowered as part of target specific ABI lowering to a CIR representation that has a target specific address space information.
Q: Is this right?

No. We planned to only touch the CIR AS while or just before LowerToLLVM. Wherever we do the lowering, in the TypeConverter or a separate pass, it just intend to do the lowering directly or prepare for the lowering.

Let me describe the original motive of a separate pass. When I first try introducing target-specific info in the type converter of LLVM lowering, I found that it have to mess up the ConvertCIRToLLVMPass with strange synchronization and an unused rewriter state. I think that's not acceptable. So a direct thought on that would be to use a separate pass to extract and isolate these details. Note that I also added the new pass in populateCIRPreLoweringPasses.

But when a separate pass leads to the misfunctional replacing in struct type, I didn't do a rethink on the approach, just thinking "it's a bug so let's fix it anyway". Now I realized that the ground is not solid enough. I think we can just go for a solution in TypeConverter.

Q: I don't see any tests checking what become of those when the ABI lowering is done, do they become addrspace(target<N>) at that point?

Same no and same reason. Sorry for the confusion.

Q: What happens when the __attribute__... is speficied with an arbitrary N number, let's say 777?, is that still represented by addrspace(target<N>)?

Yes.

I find that ABI lowering for address space a bit premature. If I'm writing a pass on top of CIR, now I need to know whether that CIR is pre or post target ABI lowering to try understand what's version of address spaces I should be looking at...
Q: In case other __attribute__s are used within the same code, with arbitrary numbers, how is one supposed to work with the mixed things?
Q: What kind of compiler transformations are you imagining that would require this kind of distiction this early? When users of OpenCL want to write a transformation, do they care whether they are unified or already target specific?

True. It's totally a different thing from CallConvLowering. I think the unified one always carries more information than the target-specific one, otherwise we wouldn't be able to do the lowering. In other words, it shouldn't provide any extra optimization opportunities if we do it earlier than LowerToLLVM. I think we are in agreement on this topic.

ABI info should be designed to allow queries from within LowerToLLVM, not sure if it's possible to do that now, but I brought up this point to @sitio-couto previously.

Sure. TBH, introducing target-specific information in TypeConverter is not a common use case. I don't expect it to be done.😂

Currently @sitio-couto suggested me that a general method here is to keep an instance of LowerModule for the pass requiring TargetLoweringInfo just like what CallConvLowering does. It's an interim solution and waiting for future refactor. The @jopperm's proposed approach that keeps the reference to the AS map rather than the LowerModule looks more graceful to me (at least no strange lazy initialization logic). We will also need to update it to a more fluent way when one is available. Wdyt?


Many thanks for the feedback and suggestions! I often have a feel that I lack the ability to describe my obscure design choices in a brief and clear manner: I'm always replaying what I did and presenting verbose details, or just providing too less information by a too short description. Understanding this fact would be a good start, and trying providing more examples should helps greatly. I'll try my best to do better next time ; )

@seven-mile
Copy link
Collaborator Author

seven-mile commented Jul 12, 2024

Made some changes:

  • Use the type converter and remove the codes around AttrTypeSubElementHandler.
  • Rename map_t to MapTy to match the naming convention of MLIR.
  • To pass the AS map into the type converter, just like DataLayout, I use a pointer with null semantic corresponding to "absence of cir.triple". When we are lowering non-null AddressSpaceAttr, we assert that the AS map can not be null. It avoids implicit miscomilations.
  • Extended existing test cases: IR/address-space.cir & Lowering/address-space.cir.

Sorry. Ignore the close and re-open, mistake.🥲


Regarding breaking this PR into dependent pieces, I'm planning:

  • [NFC] Refactor createLowerModule to be used by libLowerToLLVM
  • [NFC] enable TargetLoweringInfo for AppleArm64
  • (This PR) The core offload_* supporting codes with SPIR TargetInfo

Will rebase this PR after that, sorry for the review churn.

@seven-mile seven-mile closed this Jul 12, 2024
@seven-mile seven-mile reopened this Jul 12, 2024
@seven-mile seven-mile marked this pull request as draft July 12, 2024 13:12
@bcardosolopes
Copy link
Member

Will rebase this PR after that, sorry for the review churn.

Don't worry about that, thanks for breaking this up!

@bcardosolopes
Copy link
Member

bcardosolopes commented Jul 12, 2024

Further PR splitting you could do here:

  1. Add the CIR dialect bits for unified address spaces + clang/test/CIR/IR/address-space.cir.
  2. Add SPIR.cpp skeleton without the address space maps.

This will basically leave the contentios part of this PR isolated and easier to iterate.

@seven-mile seven-mile changed the title [CIR][Dialect] Add offload_* cases in address space attribute [CIR][CodeGen][LowerToLLVM] End-to-end implementation of offload_* cases Jul 13, 2024
@seven-mile seven-mile changed the title [CIR][CodeGen][LowerToLLVM] End-to-end implementation of offload_* cases [CIR][CodeGen][LowerToLLVM] End-to-end implementation of offload_* cases for OpenCL with SPIR-V target Jul 17, 2024
@seven-mile
Copy link
Collaborator Author

seven-mile commented Jul 18, 2024

Changes and PR description updated. This PR is ready for another round of review. ; )

Comment on lines +3598 to +3599
if (!module->hasAttr("cir.triple"))
return nullptr;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that IRASMap const * is a pointer to array. It would be nullptr if triple is not present.

If no address space attributes are present in CIR, we ensure that a null IRASMap remains functional. However, if there is an address space attribute to convert, the guard at line 3619 will reject it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought a bit about using the pointer here, but couldn't come up with a better design. I believe using std::optional<std::reference_wrapper<IRASMap>> would work, but that doesn't really simplify anything.

The underlying question is, do we really need this miscompilation protection here or could we just assume a default mapping similar to always having a DataLayout? In case a target triple is present and the target doesn't supply an AS map (or an incomplete AS map), we'd happily use a default flattening to (target) AS 0. To catch that, we'd need to use an actual map (e.g. SmallDenseMap) to let targets opt-in to the CIR AS they support, but that design point would deviate from OG clang.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, for this PR I'm suggesting to either (1) keep the "no triple -> null map" logic or (2) drop the check and use a default map when no triple is present. I meant (3) "targets opt-in to supported AS" more as food for thought and potential future improvement that should not hold up this PR which finally reactivates the in-tree OpenCL tests 🎉

Copy link
Contributor

@jopperm jopperm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat after all the prep PRs landed 👍

@@ -707,6 +707,7 @@ def AddressSpaceAttr : CIR_Attr<"AddressSpace", "addrspace"> {
let extraClassDeclaration = [{
static constexpr char kTargetKeyword[] = "}]#targetASCase.symbol#[{";
static constexpr int32_t kFirstTargetASValue = }]#targetASCase.value#[{;
using MapTy = unsigned[kFirstTargetASValue];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
using MapTy = unsigned[kFirstTargetASValue];
using MapTy = std::array<unsigned, kFirstTargetASValue>;

(more idiomatic in C++, and might prevent future headaches if we need to copy the maps etc.)

@@ -3583,24 +3585,44 @@ void populateCIRToLLVMConversionPatterns(mlir::RewritePatternSet &patterns,
}

namespace {

using IRASMap = mlir::cir::AddressSpaceAttr::MapTy;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the short name be IRASMapTy?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think MeowTy is a common pattern for member types in class. For free type names used in variable/parameter declarations, I prefer a clean CamelCase without suffix, because CamelCase is already the feature of types, no?

Btw, IRASMap copies its name from clang::LangASMap XD

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer a clean CamelCase without suffix

We should use the pattern that's used overall in LLVM despite our personal preferences, otherwise someone will need to fix this when we upstream this chunk of code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I mean the same, by "because CamelCase is already the feature of types, no?"

@@ -8,5 +8,10 @@ TargetLoweringInfo::TargetLoweringInfo(std::unique_ptr<ABIInfo> Info)

TargetLoweringInfo::~TargetLoweringInfo() = default;

AddressSpaceAttr::MapTy const &TargetLoweringInfo::getCIRAddrSpaceMap() const {
static AddressSpaceAttr::MapTy defaultAddrSpaceMap = {0};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Add a comment saying that the default target behavior is to flatten all address spaces to AS 0.
  • The = {0} is not strictly necessary here, correct? I find it a bit misleading, and the zero-initialization is better explained as a comment I think.

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another round of review, thanks for breaking it, this is much simpler and easier to review.

Have you considered adding a AddressMapAttr (wrapping an ArrayAttr perhaps`)? Seems like it would make verifying the length easy, easy to deal with the default value because we can encode in tablegen, and easy for targets to populate and return. Given how often those will be built in a compilation instance, value semantics is probably easier to follow here.

@@ -8,5 +8,11 @@ TargetLoweringInfo::TargetLoweringInfo(std::unique_ptr<ABIInfo> Info)

TargetLoweringInfo::~TargetLoweringInfo() = default;

AddressSpaceAttr::MapTy const &TargetLoweringInfo::getCIRAddrSpaceMap() const {
// Flatten all non-target address spaces to addrspace(0)
static AddressSpaceAttr::MapTy defaultAddrSpaceMap = {};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this need to be static? We usually reserve using local static variables only when it's strictly needed, and looks like this function could just be a return {}? Note that this or the target versions are not called much more than a couple times in a pipeline, so probably not worth any early optimization.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can replace the reference with just value.

1, // offload_global
2, // offload_constant
4, // offload_generic
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if a target define an address space map that has fewer, more or none elements? Why this need to be a static global? Seems like a constant global would be enough.

Copy link
Collaborator Author

@seven-mile seven-mile Jul 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if a target define an address space map that has fewer, more or none elements?

The target author should use any dummy value they want, and ensure by their own that no unexpected AS would be emitted for normal CodeGen.

In other words, all maps for different targets should encode all cases that is always sync with AddressSpaceAttr.

This is the approach used by clang::LangASMap. I think it's reasonable because

  • It does not make sense to add a target that overrides getCIRAddrSpaceMap without considering all (language or unified) addrspace cases.
  • It does not make sense to add a (language or unified) addrspace case without considering all maps in all targets.

// cannot init LowerModule properly.
assert(!::cir::MissingFeatures::makeTripleAlwaysPresent());
// Here we will use a default AS map to ignore all AS stuff.
if (!module->hasAttr("cir.triple"))
Copy link
Member

@bcardosolopes bcardosolopes Jul 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What testcases related to address space break right now when we don't have cir.triple? We should either update them or place a direct assert so that folks in the future don't write tests like that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For testcases related to address space, only Lowering/address-space.cir updated in this PR would break if we don't set cir.triple. But those testcases unrelated to address space, i.e. almost all Lowering/*.cir testcases, will fail because of absence of cir.triple and llvm.data_layout, if we don't set the optional semantic but place a direct assert here. The optional IRASMap makes us able to lower AS on demand.

Btw, I don't know whether or not cir.triple should be mandatory in our text-form IR -- we may alternatively set a default triple when parsing to migrate progressively. Moreover, in llc for reference, there is an option -mtriple to override the target triple = .. section in LLVM IR. We can even choose to do something similar in cir-translate. This topic needs more discussion to be decided, therefore I'd better leave the strategy space unchanged in this PR by using a less intrusive approach.

module {
module attributes {
cir.triple = "spirv64-unknown-unknown",
llvm.data_layout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-G1"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for this PR, but opening for discussion: we should make llvm.data_layout computable from cir.triple. @sitio-couto do we already have that or plans to do so?

@@ -3583,24 +3585,44 @@ void populateCIRToLLVMConversionPatterns(mlir::RewritePatternSet &patterns,
}

namespace {

using IRASMap = mlir::cir::AddressSpaceAttr::MapTy;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer a clean CamelCase without suffix

We should use the pattern that's used overall in LLVM despite our personal preferences, otherwise someone will need to fix this when we upstream this chunk of code.

Copy link
Collaborator Author

@seven-mile seven-mile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the attribute approach, I think it's a bit weird for the value returned by TargetLoweringInfo:

  • The ideal pattern for CIR addrspace map in the future should be as simple as lowerModule.getTargetLoweringInfo().getCIRAddrSpaceMap()[cirAS], align to other target-specific information. We need an extra wrapper getIRAddrSpaceMap for now only because TargetLoweringInfo is premature temporarily.
  • MLIR objects require a context to live in, which would complicate the construction.
  • Attribute that never exists in IR is probably abusing the infrastructure.
  • MLIR attributes are usually harder to work with: the dynamic element type and variable length of ArrayAttr. We do not even need a verification logic if the map has the type std::array<unsigned, kFirstTargetASValue>.

I don't see explicit drawbacks of "no triple -> null map". Note that the approach is relatively self-contained. getIRAddrSpaceMap resides in an anonymous namespace as the implementation details.

So if we don't expect a reference to static storage, we can pass it by value overall. To be more specific, use the following combination:

  • MapTy = std::array<unsigned, kFirstTargetASValue>: leaves the addrspace map as is
  • AddressSpaceAttr::MapTy TargetLoweringInfo::getCIRAddrSpaceMap(): copies the std::array and pass it around
  • std::optional<AddressSpaceAttr::MapTy> getIRAddrSpaceMap(mlir::ModuleOp module): provides the extra semantic for "no triple -> null map"

1, // offload_global
2, // offload_constant
4, // offload_generic
};
Copy link
Collaborator Author

@seven-mile seven-mile Jul 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if a target define an address space map that has fewer, more or none elements?

The target author should use any dummy value they want, and ensure by their own that no unexpected AS would be emitted for normal CodeGen.

In other words, all maps for different targets should encode all cases that is always sync with AddressSpaceAttr.

This is the approach used by clang::LangASMap. I think it's reasonable because

  • It does not make sense to add a target that overrides getCIRAddrSpaceMap without considering all (language or unified) addrspace cases.
  • It does not make sense to add a (language or unified) addrspace case without considering all maps in all targets.

@@ -3583,24 +3585,44 @@ void populateCIRToLLVMConversionPatterns(mlir::RewritePatternSet &patterns,
}

namespace {

using IRASMap = mlir::cir::AddressSpaceAttr::MapTy;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I mean the same, by "because CamelCase is already the feature of types, no?"

// cannot init LowerModule properly.
assert(!::cir::MissingFeatures::makeTripleAlwaysPresent());
// Here we will use a default AS map to ignore all AS stuff.
if (!module->hasAttr("cir.triple"))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For testcases related to address space, only Lowering/address-space.cir updated in this PR would break if we don't set cir.triple. But those testcases unrelated to address space, i.e. almost all Lowering/*.cir testcases, will fail because of absence of cir.triple and llvm.data_layout, if we don't set the optional semantic but place a direct assert here. The optional IRASMap makes us able to lower AS on demand.

Btw, I don't know whether or not cir.triple should be mandatory in our text-form IR -- we may alternatively set a default triple when parsing to migrate progressively. Moreover, in llc for reference, there is an option -mtriple to override the target triple = .. section in LLVM IR. We can even choose to do something similar in cir-translate. This topic needs more discussion to be decided, therefore I'd better leave the strategy space unchanged in this PR by using a less intrusive approach.

@@ -8,5 +8,11 @@ TargetLoweringInfo::TargetLoweringInfo(std::unique_ptr<ABIInfo> Info)

TargetLoweringInfo::~TargetLoweringInfo() = default;

AddressSpaceAttr::MapTy const &TargetLoweringInfo::getCIRAddrSpaceMap() const {
// Flatten all non-target address spaces to addrspace(0)
static AddressSpaceAttr::MapTy defaultAddrSpaceMap = {};
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can replace the reference with just value.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants