Audio recording enhancement #1341

Open
wants to merge 2 commits into
base: master

nishant-sg

Before:
The user could only record for a fixed 5 seconds.

After:
Everything is recorded for as long as the user is talking. Once the user stops talking, recording stops after three seconds.

Print statements have been removed; for a demo, please see the video.

demo-record-until-silent-enhancement.mp4

@@ -15,6 +17,15 @@
print("chat: failed to import pyaudio, wave or openai. See https://ardupilot.org/mavproxy/docs/modules/chat.html")
exit()

def rms( data ):
Contributor

Hi, thanks for this. A few things to fix here:

  1. Could you add comments above the function?
  2. I think we should rename the function so its purpose is clear, perhaps calc_audio_volume(). Maybe it should also return the decibels directly instead of making the caller do the conversion.
  3. Our normal style is no spaces inside the brackets, so change "def rms( data )" to "def rms(data)".
  4. I think count could be zero if the microphone is not working. In any case, we should protect against the divide-by-zero that would happen if count is zero.
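
To make the four points concrete, here is a rough sketch of what the renamed function could look like. It is not taken from the PR: it assumes the chunks are 16-bit signed little-endian PCM (the usual pyaudio paInt16 format), and the floor_db parameter is a hypothetical fallback returned when there is nothing to measure.

```python
import math
import struct


def calc_audio_volume(data, floor_db=-100.0):
    """Return the volume of a chunk of audio in decibels relative to full scale.

    Assumes data holds 16-bit signed little-endian PCM samples.
    floor_db (a hypothetical parameter) is returned for empty or silent input.
    """
    # each sample is 2 bytes, so count may be zero if nothing was captured
    count = len(data) // 2
    if count == 0:
        # protect against divide-by-zero (e.g. microphone not working)
        return floor_db

    # unpack the raw bytes into signed 16-bit integers
    samples = struct.unpack("<%dh" % count, data[:count * 2])

    # normalise each sample to the range -1..1 and take the root mean square
    sum_squares = sum((sample / 32768.0) ** 2 for sample in samples)
    rms = math.sqrt(sum_squares / count)

    # log10 is undefined for zero, so treat pure silence as the floor
    if rms <= 0.0:
        return floor_db
    return max(floor_db, 20.0 * math.log10(rms))
```

With this shape the caller never sees the raw RMS value and does not need its own zero check before calling math.log10.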

while curr_time < time_stop:

# logic for recording sound until someone is speaking.
isSpeaking = True
Contributor

Hi, I think our normal style is to use underscores between the words in variable names, so let's change "isSpeaking" to "is_speaking".

Also, maybe change the comment to "record sound while user is speaking".
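
For reference, a minimal sketch of the "record while the user is speaking" loop using the snake_case naming. This is an illustration, not the PR's code: record_while_speaking, read_chunk, get_volume_db, silence_timeout, max_duration and clock are all hypothetical names, and the audio source is passed in as a callable so the logic can be exercised without a microphone.

```python
import time


def record_while_speaking(read_chunk, get_volume_db, threshold_db=-80.0,
                          silence_timeout=3.0, max_duration=30.0,
                          clock=time.monotonic):
    """Collect audio chunks until the user has been silent for silence_timeout seconds.

    read_chunk() returns the next chunk of audio and get_volume_db(chunk)
    returns its volume in decibels (both are assumptions of this sketch).
    """
    frames = []
    start_time = clock()
    last_spoke_time = start_time

    # record sound while user is speaking
    while True:
        now = clock()
        # stop after a long enough silence or once the overall limit is reached
        if now - last_spoke_time >= silence_timeout:
            break
        if now - start_time >= max_duration:
            break

        data = read_chunk()
        frames.append(data)

        # remember the last time the volume was above the threshold
        is_speaking = get_volume_db(data) > threshold_db
        if is_speaking:
            last_spoke_time = clock()

    return frames
```

Tracking the time of the last loud chunk, instead of a single is_speaking flag, is what gives the three second grace period before recording stops.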

rms1 = rms(data)
if rms1!=0.0:
decibel = 20 * math.log10(rms1)
isSpeaking = decibel>-80.0 # -80 is the hardcoded threshold. higher number means louder. Set threshold in the range (-100,0)
Contributor

If possible, let's move this -80 to a definition at the top of the file, where it will be easier to find.
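
Something along these lines is what I have in mind. The constant name CHAT_SILENCE_THRESHOLD_DB and the helper is_loud_enough() are only suggestions, not existing names in the module.

```python
# volume in decibels above which the user is considered to be speaking.
# higher (closer to zero) means louder; sensible values are in the range (-100, 0)
CHAT_SILENCE_THRESHOLD_DB = -80.0  # hypothetical name, defined near the top of the file


def is_loud_enough(volume_db):
    """Return True if volume_db (in decibels) is above the speaking threshold."""
    return volume_db > CHAT_SILENCE_THRESHOLD_DB
```

The recording loop would then compare against the named constant instead of the literal -80.0.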

@@ -34,7 +45,7 @@ def check_connection(self):
try:
self.client = OpenAI()
except Exception:
-print("chat: failed to connect to OpenAI")
+print("chat: failed to connect to OpenAI - 4")
Contributor

I think this change to the print statement can be removed.
