Follow our WeChat official account to learn more about the ChatGPT and Whisper APIs.
https://openai.com/blog/introducing-chatgpt-and-whisper-apis
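The launch of the ChatGPT API drew a great deal of attention, while the Whisper API released alongside it received far less. Whisper is nonetheless useful in many scenarios: it transcribes speech into text. By combining ChatGPT and Whisper, we can build an intelligent chatbot that holds a spoken conversation in real time.
This video shows how to use OpenAI's ChatGPT and Whisper APIs, through the two models "gpt-3.5-turbo" and "whisper-1", to build a real-time voice chatbot on a Windows PC.
Written version: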
https://www.notion.so/updayday/Chat-GPT-WHISPER-API-GPT-3-5-TURBO-2af2630c857a4f0da92abcc763b4fd48?pvs=4
Music from #Uppbeat (free for Creators!):
https://uppbeat.io/t/qube/breezy
License code: KBQB8OTBISWO2U8I
Music from #Uppbeat (free for Creators!):
https://uppbeat.io/t/prigida/picture-frames
License code: IXBHX2IYQMQQV12P
How it works:
Speech → Whisper API → text → POST to the ChatGPT API → reply text → read aloud by the system text-to-speech tool
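The flow above can be sketched as one function that chains three stages. This is only an illustration: `voice_chat_turn`, `speech_to_text`, `chat`, and `speak` are hypothetical names, and the stand-in lambdas replace the real Whisper, ChatGPT, and wsay calls so the sketch runs without an API key.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def voice_chat_turn(
    audio_path: str,
    history: List[Message],
    speech_to_text: Callable[[str], str],
    chat: Callable[[List[Message]], str],
    speak: Callable[[str], None],
) -> str:
    """Run one turn: audio file -> text -> chat reply -> spoken output."""
    user_text = speech_to_text(audio_path)            # Whisper's role
    history.append({"role": "user", "content": user_text})
    reply = chat(history)                             # ChatGPT's role
    history.append({"role": "assistant", "content": reply})
    speak(reply)                                      # wsay's role
    return reply


# Stand-in stages (assumptions for illustration only, no network access).
spoken: List[str] = []
history: List[Message] = []
reply = voice_chat_turn(
    "question.wav",
    history,
    speech_to_text=lambda path: "hello",
    chat=lambda msgs: "you said: " + msgs[-1]["content"],
    speak=spoken.append,
)
```

Passing the stages in as arguments keeps the flow itself testable; the final script further down wires the same three steps directly to the real services.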
Prerequisites: programming experience is not required;
you can install programs on Windows;
you can copy and paste.
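Preparation (test environment):
Operating system: Windows 11
Install ffmpeg:
https://ffmpeg.org/
Code editor (Visual Studio Code):
https://code.visualstudio.com/
Install Python:
https://www.python.org/
Windows text-to-speech command-line tool wsay:
https://github.com/p-groarke/wsay/releases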
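Before moving on, it can help to confirm that the command-line tools listed above are reachable from a terminal. The following is a minimal sketch using only the Python standard library; `missing_tools` is a hypothetical helper name, and the check assumes the tools were added to PATH under the names `ffmpeg`, `wsay`, and `python`.

```python
import shutil
from typing import Iterable, List


def missing_tools(names: Iterable[str]) -> List[str]:
    """Return the tools from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(["ffmpeg", "wsay", "python"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All tools found.")
```

If a tool is reported as missing, add its install folder to the PATH environment variable and open a new terminal before retrying.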
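Step-by-step walkthrough:
Install the libraries
The code below uses openai.Audio.transcribe, openai.ChatCompletion.create, and gr.Audio(source=...), which exist only in openai releases before 1.0 and Gradio releases before 4.0, so pin those versions:
pip install "openai<1.0"
pip install "gradio<4.0"
Build the user interface with Gradio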
https://gradio.app/quickstart/
# Test 1: create a basic input/output interface
import gradio as gr

def greet(name):
    return "Hello " + name + "!"

demo = gr.Interface(
    fn=greet,
    inputs=gr.Textbox(lines=2, placeholder="Name Here..."),
    outputs="text",
)
demo.launch()
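# Test 2: microphone input
import gradio as gr

def transcribe(audio):
    # With type="filepath", Gradio passes the path of the recorded audio file
    print(audio)
    return "The audio transcript will appear here"

# Build the interface first, then launch it once
ui = gr.Interface(
    fn=transcribe,
    inputs=gr.Audio(source="microphone", type="filepath"),
    outputs="text"
)
ui.launch()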
OpenAI documentation: https://platform.openai.com/docs/guides/speech-to-text
# Test 3: Whisper API
import gradio as gr
import openai

openai.api_key = "XXXXXXXXXXXXXX"  # replace with your own OpenAI API key

def transcribe(audio):
    print(audio)
    # Send the recorded file to Whisper and return the recognized text
    with open(audio, "rb") as audio_file:
        transcript = openai.Audio.transcribe("whisper-1", audio_file)
    return transcript["text"]

# Build the interface first, then launch it once
ui = gr.Interface(
    fn=transcribe,
    inputs=gr.Audio(source="microphone", type="filepath"),
    outputs="text"
)
ui.launch()
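The snippets here paste the API key directly into the script. A common alternative is to read it from an environment variable so the key never ends up in a shared file. The sketch below is one way to do that, not part of the original code: `load_api_key` is a hypothetical helper, and the variable name `OPENAI_API_KEY` is an assumption you can change.

```python
import os
from typing import Mapping, Optional


def load_api_key(
    env: Optional[Mapping[str, str]] = None,
    name: str = "OPENAI_API_KEY",  # assumed variable name
) -> str:
    """Read the API key from the environment, failing loudly if it is absent."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first")
    return key
```

In the scripts, the hard-coded assignment would then become `openai.api_key = load_api_key()`.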
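# Final version:
import gradio as gr
import openai, subprocess

openai.api_key = "xxxxxxxxxxxxxxxxxx"  # replace with your own OpenAI API key

# System prompt, kept in Chinese so the bot answers in Chinese. In English:
# "You are a knowledgeable, helpful chatbot. Your job is to chat with me.
# Reply conversationally and briefly, in Chinese, in at most 50 characters per answer!"
messages = [{"role": "system", "content": '你是一名知识渊博,乐于助人的智能聊天机器人.你的任务是陪我聊天,请用简短的对话方式,用中文讲一段话,每次回答不超过50个字!'}]

def transcribe(audio):
    global messages
    # Speech to text with Whisper
    with open(audio, "rb") as audio_file:
        transcript = openai.Audio.transcribe("whisper-1", audio_file)
    messages.append({"role": "user", "content": transcript["text"]})
    # Send the whole conversation so far to ChatGPT
    response = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
    assistant_message = response["choices"][0]["message"]
    messages.append(assistant_message)
    # Read the reply aloud with wsay
    subprocess.call(["wsay", assistant_message['content']])
    # Build the on-screen transcript, skipping the system prompt
    chat_transcript = ""
    for message in messages:
        if message['role'] != 'system':
            chat_transcript += message['role'] + ": " + message['content'] + "\n\n"
    return chat_transcript

# Build the interface first, then launch it once
ui = gr.Interface(fn=transcribe, inputs=gr.Audio(source="microphone", type="filepath"), outputs="text")
ui.launch()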
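Because `messages` grows with every turn and the whole list is sent on each request, a long conversation will eventually exceed the model's context limit. One possible mitigation, which is not part of the original script, is to keep the system prompt and only the most recent exchanges. `trim_history` and its default of 10 messages are assumptions made for this sketch.

```python
from typing import Dict, List

Message = Dict[str, str]


def trim_history(messages: List[Message], max_recent: int = 10) -> List[Message]:
    """Keep every system message plus the last `max_recent` other messages."""
    system = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    recent = others[-max_recent:] if max_recent > 0 else []
    return system + recent


# Example: one system prompt followed by three user/assistant exchanges.
demo_history: List[Message] = [{"role": "system", "content": "be brief"}]
for i in range(3):
    demo_history.append({"role": "user", "content": f"question {i}"})
    demo_history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_history(demo_history, max_recent=2)
```

In the final script this could be applied as `messages = trim_history(messages)` just before the ChatGPT request. Counting messages is a rough stand-in for counting tokens, so treat the threshold as something to tune.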