Add e2e test suite for FA2 #17751

erman-gurses · 2024-06-27T07:20:55Z

Draft PR to add an e2e test suite for FA2.

ScottTodd · 2024-06-27T15:26:27Z

tests/e2e/linalg_ext_ops/generate_e2e_fa2_tests.py

+    if shapes_id == ShapesId.SMALL:
+        return [
+            TestShapeAndScale(batch=4, numHeads=4, seqLen=1024, headDim=64, scale=1.0),
+        ]
+    if shapes_id == ShapesId.MEDIUM:
+        return [
+            TestShapeAndScale(batch=8, numHeads=4, seqLen=2048, headDim=128, scale=1.0),
+        ]
+    if shapes_id == ShapesId.LARGE:
+        return [
+            TestShapeAndScale(batch=16, numHeads=4, seqLen=4096, headDim=128, scale=1.0),
+            TestShapeAndScale(batch=16, numHeads=4, seqLen=16384, headDim=64, scale=1.0),
+        ]


I see that this is a draft PR, but I'm curious about how this PR differs from the previous PR with the same name: #16953

I'm always hesitant to add new testing infrastructure like these generator scripts, since they add layers of indirection and more lines of code to maintain. Can you explain why the existing tests are insufficient and why generated tests for fa2 are worth the costs?

The initial plan was to stay with those test cases, but we switched to a generated test suite because the existing cases were insufficient to test FA2 thoroughly. The suite covers all four dimensions (batch, numHeads, seqLen, headDim) with large sizes, and adding a test takes less effort than writing a hard-coded one.
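To illustrate the "less effort" point, here is a minimal sketch of how the shape table from the diff above could be wired up. The field names and values come from the diff; the `ShapesId` enum values, the dataclass definition, and the function name `get_test_shapes` are assumptions for illustration and may not match the actual script.

```python
import enum
from dataclasses import dataclass


@enum.unique
class ShapesId(enum.Enum):
    # Enum values are assumed; only the member names appear in the diff.
    SMALL = "small"
    MEDIUM = "medium"
    LARGE = "large"


@dataclass(frozen=True)
class TestShapeAndScale:
    # Field names taken from the diff; the types are assumed.
    batch: int
    numHeads: int
    seqLen: int
    headDim: int
    scale: float


def get_test_shapes(shapes_id):
    """Return the list of shapes to generate tests for (hypothetical name)."""
    if shapes_id == ShapesId.SMALL:
        return [
            TestShapeAndScale(batch=4, numHeads=4, seqLen=1024, headDim=64, scale=1.0),
        ]
    if shapes_id == ShapesId.MEDIUM:
        return [
            TestShapeAndScale(batch=8, numHeads=4, seqLen=2048, headDim=128, scale=1.0),
        ]
    if shapes_id == ShapesId.LARGE:
        return [
            TestShapeAndScale(batch=16, numHeads=4, seqLen=4096, headDim=128, scale=1.0),
            TestShapeAndScale(batch=16, numHeads=4, seqLen=16384, headDim=64, scale=1.0),
        ]
    raise ValueError(f"unknown shapes_id: {shapes_id}")
```

With this layout, covering another configuration means appending one `TestShapeAndScale(...)` entry to the relevant list rather than writing a new hand-coded test file.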

Is there an issue tracking those plans / discussions?

Also follow-up question: is this unique to "flash attention 2", or is it generic to all kinds of "attention"? I'm not clear on where the distinction lies, in either the model input code or the code generation. Wondering if we'll fork this boilerplate code again when the inevitable "flashier attention 3" comes around :P

I can open a tracking issue tomorrow and link it to this PR. The reference implementation is the plain attention algorithm, so, to the best of my knowledge, it should work with all the faster, optimized variants.
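For context, a reference of this kind can be sketched in a few lines. This is not the code from the PR: it is a minimal pure-Python illustration of naive attention, softmax(scale * Q K^T) V, over a single [seqLen, headDim] slice, and the function name `reference_attention` is hypothetical. Because FA2 is meant to produce the same result as plain attention up to floating-point tolerance, a reference like this does not depend on which optimized variant is being tested.

```python
import math


def reference_attention(query, key, value, scale=1.0):
    """Naive attention for one slice; inputs are lists of rows (hypothetical helper)."""
    output = []
    for q_row in query:
        # Scaled dot product of this query row with every key row.
        scores = [
            scale * sum(q * k for q, k in zip(q_row, k_row)) for k_row in key
        ]
        # Numerically stable softmax: subtract the row maximum before exp.
        max_score = max(scores)
        exps = [math.exp(s - max_score) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Output row is the softmax-weighted sum of the value rows.
        head_dim = len(value[0])
        output.append(
            [
                sum(w * v_row[d] for w, v_row in zip(weights, value))
                for d in range(head_dim)
            ]
        )
    return output
```

A generated test would then compare the compiled op's output against this result within a tolerance, since the optimized kernels reorder the floating-point accumulation.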

Add e2e tests for FA2

1824f3b

iree-org deleted a comment from google-cla bot Jun 27, 2024

ScottTodd reviewed Jun 27, 2024


erman-gurses changed the title ~~Add e2e tests for FA2~~ Add e2e test suite for FA2 Jun 27, 2024

aviator19941 and others added 3 commits June 28, 2024 00:06

Add dynamic allocation for scores and attention matrices

fde9839

Remove the batch dim since attention op only accepts 3D tensors

7222cf5

Add formatting

1041117

erman-gurses mentioned this pull request Jul 12, 2024

E2E tests suite for Flash Attention 2. #17892

Open

IanNod added 2 commits July 16, 2024 21:32

Add different dimension sizes for Q K V tensors

9dc718d

Add new dims into the func call

8f7cfff
