Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement compatibility for external LLVM-IL #1325

Open
AlexDemydenko opened this issue Mar 19, 2024 · 15 comments
Open

Implement compatibility for external LLVM-IL #1325

AlexDemydenko opened this issue Mar 19, 2024 · 15 comments

Comments

@AlexDemydenko
Copy link
Contributor

The compiler has an API to compile LLVM-IL code directly.
The idea of clspv is to use the semi-compiled sources (from cl to IL) for future compilation with other parameters or to compile with a different combination of the semi-compiled sources. In this case all IL-sources are generated by the internal (LLVM) clspv FrontEnd.

The clvk has mode CLVK_ENABLE_SPIRV_IL.
In this case the driver prepares the IL using SPIRV-LLVM-Translator.

The problem with "use-native-builtins". The SPIRV-LLVM-Translator doesn't use this option and generate LLVM-IL using it native instructions instead of to using the built-ins from clspv--.bc.

The proposal is to implement a new Pass that should intercept instructions that not in the list of "use-native-builtins" and replace them with the one of built-ins from clspv--.bc.

This pass could implemented for all external IL or directly to support the SPIRV-LLVM-Translator only.

@AlexDemydenko
Copy link
Contributor Author

This is a related PR from which it all began, just for the record.
#1249

@rjodinchr
Copy link
Collaborator

I think this would have to be done before linking with the builtins library as we link in lazy mode to be efficient (both time-wise and size-wise).

We would need to parse each instruction of the module in CompileModule, before LinkBuiltinLibrary to replace the instruction that should be mapped to an emulated builtin call.

Then NativeMathPass would take care of using the native or emulated implementation depending on what is requested by the driver.

@alan-baker
Copy link
Collaborator

Can you provide an example of a translation from the spirv-llvm-translator you wish to replace? Would the replacement be directly using a SPIR-V instruction (or extended instruction) or an emulation? If it is an emulation, would it be different than libclc?

@AlexDemydenko
Copy link
Contributor Author

AlexDemydenko commented Mar 20, 2024

All files in frem.zip

There 2 compilations:

  1. frem_float.cl to CL_frem_emulated.spv
  2. source_frem_float.spv to result_frem_float.spv
    The results in CL_frem_emulated.spv & result_frem_float.spv should be equal.

FYI. fmod function is equal to OpFRem instuction in spirv and the OpFMod instruction is similar but not equal.

In this case, I want to replace the OpFMod instruction with the fmod function and take the body of this instruction from libclc.

@alan-baker
Copy link
Collaborator

If I understand correctly if you compile directly from the spirv-llvm-translator you end up OpFMod in the SPIR-V because the translator maps directly to the llvm instruction and clspv maps that to the GLSL std450 instruction. Instead you want to replace fmod with the libclc implementation which we have as the mangled _Z4fmodff function.

If that's correct then it sounds like almost like a reverse of the native math pass. Specify a list of functions you wish to emulate using the libclc implementation. That seems reasonable if the source is a single instruction. It's probably a little more complicated if you need to replace a sequence of instructions, but maybe the pattern matcher LLVM uses in InstCombine could be used for such cases.

I could image an early pass in the flow where that does this before we discard the unused libclc functions. Is it the case that the libclc functions are all you need? That you don't require a different emulation than what clspv normally uses?

@rjodinchr
Copy link
Collaborator

Native math pass do not discard that much the unused libclc functios anymore.
Since we link in lazy the module, we only get what we need from there. So this will have to be a pass done before linking with the builtin library.

@alan-baker
Copy link
Collaborator

Native math pass do not discard that much the unused libclc functios anymore. Since we link in lazy the module, we only get what we need from there. So this will have to be a pass done before linking with the builtin library.

That's unfortunate and not ideal as we don't currently run any passes prior to linking.

@AlexDemydenko
Copy link
Contributor Author

The OpFMod instruction is only an example. Maybe it is a bad example. I have deal with a fork of the clspv witch is somewhat outdated.
But the idea is the same for all other functions.
For example, the spirv_new test from the OpenCL-CTS test suite contains tests for: OpFAdd, OpFSub, OpFMul, OpFDiv, OpFMod, OpFRem. And this is only the math section. The test has other test cases.

@AlexDemydenko
Copy link
Contributor Author

AlexDemydenko commented Mar 20, 2024

@rjodinchr
Copy link
Collaborator

Maybe it would make sense to implement the translation directly in the LLVM-SPIRV-TRANSLATOR?
I could understand that we are not the only ones who prefer to see this OpFMod mapped to the libclc implementation of fmod and not directly to the LLVM IR operator. At least for OpenCL SPIR-V.

@alan-baker
Copy link
Collaborator

The OpFMod instruction is only an example. Maybe it is a bad example. I have deal with a fork of the clspv witch is somewhat outdated. But the idea is the same for all other functions. For example, the spirv_new test from the OpenCL-CTS test suite contains tests for: OpFAdd, OpFSub, OpFMul, OpFDiv, OpFMod, OpFRem. And this is only the math section. The test has other test cases.

I would suggest documenting what the various translations need to be then. You can also compare Vulkan's accuracy to OpenCL's accuracy. I wouldn't expect + or - to need special translations. In any event, before finalizing any design it would be good to know the full extent of changes that are necessary.

@AlexDemydenko
Copy link
Contributor Author

AlexDemydenko commented Mar 21, 2024

Maybe it would make sense to implement the translation directly in the LLVM-SPIRV-TRANSLATOR? I could understand that we are not the only ones who prefer to see this OpFMod mapped to the libclc implementation of fmod and not directly to the LLVM IR operator. At least for OpenCL SPIR-V.

The same instructions have different precision requirements based on different memory models (Vulka/OpenCL). Take a look: Vulkan SPIRV Precision & OpenCL SPIRVPrecision.
LLVM-SPIRV-TRANSLATOR is only responsible for OpenCL kernels (based on the description from the project). Why it should handle cases when the driver for some reasons uses vulkan versions of the instructions instead of OpenCL versions in the OpenCL kernel?

Even if it does. Where is the guaranty what they would use a solution identical to the solution in clspv? The problem is that results are not identical. CTS tests don't check that one of the results is not good enough. Both versions can produce results with valid precision, but if the are not identical, then the test will fails.

@AlexDemydenko
Copy link
Contributor Author

The OpFMod instruction is only an example. Maybe it is a bad example. I have deal with a fork of the clspv witch is somewhat outdated. But the idea is the same for all other functions. For example, the spirv_new test from the OpenCL-CTS test suite contains tests for: OpFAdd, OpFSub, OpFMul, OpFDiv, OpFMod, OpFRem. And this is only the math section. The test has other test cases.

I would suggest documenting what the various translations need to be then. You can also compare Vulkan's accuracy to OpenCL's accuracy. I wouldn't expect + or - to need special translations. In any event, before finalizing any design it would be good to know the full extent of changes that are necessary.

The idea is to support only builtin functions from libclc. I think this library with math functions is there to solve the same problem. I don't want to do more delta between spirv and Opencl-C sources. I want only reduce it. In this case, I only need to use what exists for the Opencl-C sources.

@rjodinchr
Copy link
Collaborator

Maybe it would make sense to implement the translation directly in the LLVM-SPIRV-TRANSLATOR? I could understand that we are not the only ones who prefer to see this OpFMod mapped to the libclc implementation of fmod and not directly to the LLVM IR operator. At least for OpenCL SPIR-V.

The same instructions have different precision requirements based on different memory models (Vulka/OpenCL). Take a look: Vulkan SPIRV Precision & OpenCL SPIRVPrecision. LLVM-SPIRV-TRANSLATOR is only responsible for OpenCL kernels (based on the description from the project). Why it should handle cases when the driver for some reasons uses vulkan versions of the instructions instead of OpenCL versions in the OpenCL kernel?

Even if it does. Where is the guaranty what they would use a solution identical to the solution in clspv? The problem is that results are not identical. CTS tests don't check that one of the results is not good enough. Both versions can produce results with valid precision, but if the are not identical, then the test will fails.

My thought here is that when the LLVM-SPIRV-TRANSLATOR maps OpFMod to the LLVM IR fmod, we kind of lose the precision requirement. To compare it with clang, clang would not generate the IR fmod from a fmod call in an OpenCL-C kernel, because clang knows that precision wise it's not enough for OpenCL.

But this is just an idea. It might be easier architecture-wise to do that in the LLVM-SPIRV-TRANSLATOR, but this is just a guess, an idea to consider. Maybe it's much harder to do it in LLVM-SPIRV-TRANSLATOR and we should focus on clspv.
I'm just saying we should consider it.

@AlexDemydenko
Copy link
Contributor Author

What I found about the LLVM IR fmod/frem instruction.
https://llvm.org/docs/LangRef.html#frem-instruction

My opinion is this:
if the math instruction doesn't have any fast-math flags then it is the OpenCL version.
if the math instruction labeled fast flag then a function from the graphics pipeline.

I will do an issue for LLVM-SPIRV-TRANSLATOR in parallels.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants