-
Notifications
You must be signed in to change notification settings - Fork 558
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add e2e test suite for FA2 #17751
base: main
Are you sure you want to change the base?
Add e2e test suite for FA2 #17751
Conversation
if shapes_id == ShapesId.SMALL: | ||
return [ | ||
TestShapeAndScale(batch=4, numHeads=4, seqLen=1024, headDim=64, scale=1.0), | ||
] | ||
if shapes_id == ShapesId.MEDIUM: | ||
return [ | ||
TestShapeAndScale(batch=8, numHeads=4, seqLen=2048, headDim=128, scale=1.0), | ||
] | ||
if shapes_id == ShapesId.LARGE: | ||
return [ | ||
TestShapeAndScale(batch=16, numHeads=4, seqLen=4096, headDim=128, scale=1.0), | ||
TestShapeAndScale(batch=16, numHeads=4, seqLen=16384, headDim=64, scale=1.0), | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that this is a draft PR, but I'm curious about how this PR differs from the previous PR with the same name: #16953
I'm always hesitant to add new testing infrastructure like these generator scripts, since they add layers of indirection and more lines of code to maintain. Can you explain why the existing tests are insufficient and why generated tests for fa2 are worth the costs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initial plan was to stay with those test cases, but it was switched to the test suite since they were insufficient to test FA2 thoroughly. The test suite covers the four dimensions with big numbers. It also allows you to add more tests with less effort than hard-coded tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an issue tracking those plans / discussions?
Also follow-up question: is this unique to "flash attention 2", or is it generic to all kinds of "attention"? I'm not clear on where the distinction lies, in either the model input code or the code generation. Wondering if we'll fork this boilerplate code again when the inevitable "flashier attention 3" comes around :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can open that issue tracker tomorrow and connect it with this PR. The reference implementation is the Attention algorithm, so, to the best of my knowledge, it should work with all the faster and optimized versions.
Draft PR for Add e2e test suite for FA2