add speculative decoding example #432

Open
wants to merge 1 commit into
base: main
from

XiaobingSuper
Contributor

This PR adds a speculative decoding example.

@XiaobingSuper
Contributor Author

@Shixiaowei02, could you help review this PR?

@avianion

@XiaobingSuper did you test that this actually works?

@XiaobingSuper
Contributor Author

> @XiaobingSuper did you test that this actually works?

Yes, I checked and it works.

@biaochen

biaochen commented Jul 2, 2024

Hi Xiaobing, great job! I've done something similar and successfully run SPS (speculative sampling). It works fine with tp=1, but with tp=4 tritonserver occasionally blocks and prints no log. Have you encountered a similar issue? The issue is detailed here:
#498

Thanks~
