[ layer ] Lower the computational code of lstmcell_core to BLAS level #2642

Merged: 2 commits, Jun 20, 2024

skykongkong8
Member

This PR does two things:

  1. Cleaner code: move the #ifdef code blocks from the layer level down to the tensor level.
  2. Acceleration: use SIMD haxpy in the fp16 case. Previously it used only a naive loop.

- Occasionally, the add_i computation is wanted only for a section of interest.
- Moreover, this function can move the #ifdef code blocks out of the layer level.
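
The idea of a partial, axpy-style add can be sketched roughly as follows. This is a minimal illustration only: `axpy_partial`, its parameters, and `accumulate_section` are hypothetical names chosen for this example, not the nntrainer API, and the plain loop stands in for whatever BLAS or SIMD kernel a real backend would dispatch to for each data type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor-level partial add (NOT the nntrainer API).
// Computes y[y_off + i*incY] += alpha * x[x_off + i*incX] for i in [0, len).
template <typename T>
void axpy_partial(std::size_t len, T alpha, const std::vector<T> &x,
                  std::size_t x_off, std::size_t incX, std::vector<T> &y,
                  std::size_t y_off, std::size_t incY) {
  if (len == 0)
    return;
  // Bounds checks on the last element touched in each buffer.
  assert(x_off + (len - 1) * incX < x.size());
  assert(y_off + (len - 1) * incY < y.size());
  // Assumption: a real backend would select a BLAS/SIMD axpy here based on
  // the data type, so the caller needs no per-type conditional code.
  for (std::size_t i = 0; i < len; ++i)
    y[y_off + i * incY] += alpha * x[x_off + i * incX];
}

// Layer-level caller: accumulate into one section of the buffer only.
std::vector<float> accumulate_section() {
  std::vector<float> grad = {1.f, 2.f, 3.f, 4.f, 5.f, 6.f};
  std::vector<float> acc(6, 0.f);
  // Add 2 * grad[3..5] into acc[3..5]; the first three entries stay untouched.
  axpy_partial<float>(3, 2.f, grad, 3, 1, acc, 3, 1);
  return acc;
}
```

With this shape, the caller issues a single call per section, and any type-specific dispatch stays inside the tensor-level function rather than in the layer code.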

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>
Using the add_i_partial function in the LSTM layer reduces the #ifdef code blocks and also lowers the function's latency.

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>
@taos-ci
Collaborator

taos-ci commented Jun 18, 2024

📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2642. Please follow the 1commit/1PR (one commit per PR) policy to get comments quickly from reviewers. Your PR must pass all verification processes of cibot before reviewers start the review process. If you are a new member of this project, please read the manuals in the documentation folder and on the wiki page. To monitor the progress of your PR in more detail, visit http://ci.nnstreamer.ai/.

Collaborator

@taos-ci taos-ci left a comment

@skykongkong8, 💯 All CI checkers are successfully verified. Thanks.

Member

@SeoHyungjun SeoHyungjun left a comment

LGTM

nntrainer/tensor/tensor.cpp (review thread resolved)
Collaborator

@jijoongmoon jijoongmoon left a comment

LGTM

@jijoongmoon jijoongmoon merged commit 3cdc478 into nnstreamer:main Jun 20, 2024
32 checks passed