
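Developing a translation tool with Python

Updated: 2020-10-10 16:11:36 Author: 無毀的湖光-Al

This article shows how to develop a translation tool with Python, to help readers understand and learn Python better. Interested readers may want to take a look.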

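Recently, a certain fruit-branded phone maker held a much-anticipated launch event that did not launch the much-anticipated phone. It announced several other products instead, including the latest "Fruit 14" system. A few days later, onlookers who had updated and played with the new system discovered a really fun feature in it: translation. For example:

Strange translation knowledge acquired!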

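Compared with ordinary translation tools, a simultaneous interpretation tool has more practical value. Imagine chatting with foreign friends without barriers even though you are not fluent in their language. That would be wonderful, so why not build such a tool myself and keep it on hand! The logic of a simultaneous interpretation tool is roughly: recognise the speech first, then translate it. Whether the translation succeeds depends critically on recognition accuracy. To reduce the difficulty, I decided to build the tool in two steps, starting with the speech recognition part.

As before, this demo calls the Youdao Zhiyun (有道智云) API, this time for real-time speech recognition.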

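Demo results

First, a look at the interface and the results:

Several languages can be selected; only four common ones are included here:

I tested Chinese, Korean, and English in turn. It looks pretty good.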

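Preparation for calling the API

First, on your Youdao Zhiyun personal page, create an instance, create an application, and bind the application to the instance to obtain the application ID and secret key needed to call the API. The registration and application creation process is described in the article "Sharing the development of a batch file translation tool".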

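Development walkthrough

The code development process is described below.

The first step is to analyse the interface's inputs and outputs against the real-time speech recognition documentation. The interface is designed to recognise a continuous audio stream in real time, convert it to text, and return the corresponding text stream, so communication uses WebSocket. A call has two phases: authentication and real-time communication.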

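In the authentication phase, the following parameters must be sent:

Parameter	Type	Required	Description	Example
appKey	String	Yes	ID of the registered application	ID
salt	String	Yes	UUID	UUID
curtime	String	Yes	Timestamp (seconds)	TimeStamp
sign	String	Yes	Encrypted digital signature	sha256
signType	String	Yes	Signature type	v4
langType	String	Yes	Language selection; see the list of supported languages	zh-CHS
format	String	Yes	Audio format; wav is supported	wav
channel	String	Yes	Number of channels; 1 (mono) is supported	1
version	String	Yes	API version	v1
rate	String	Yes	Sample rate	16000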

The signature sign is generated as follows:
signType=v4;
sign=sha256(application ID+salt+curtime+application secret).
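The signing rule and the parameter table can be put together in a few lines. The following is a minimal sketch, not the article's own code: the helper name build_auth_params is hypothetical, and the fixed values (wav, channel 1, v1, 16000) are simply the examples from the table.

```python
import hashlib
import time
import uuid


def build_auth_params(app_key, app_secret, lang_type="zh-CHS"):
    """Build the authentication parameters listed in the table (hypothetical helper)."""
    salt = str(uuid.uuid1())          # UUID used as the salt
    curtime = str(int(time.time()))   # Unix timestamp in seconds
    # sign = sha256(application ID + salt + curtime + application secret), as a hex string
    raw = app_key + salt + curtime + app_secret
    sign = hashlib.sha256(raw.encode("utf-8")).hexdigest()
    return {
        "appKey": app_key,
        "salt": salt,
        "curtime": curtime,
        "sign": sign,
        "signType": "v4",
        "langType": lang_type,
        "format": "wav",
        "channel": "1",
        "version": "v1",
        "rate": "16000",
    }
```

The returned dictionary can be turned into a query string with urllib.parse.urlencode and appended to the WebSocket address.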

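After authentication comes the real-time communication phase: send the audio stream, receive the recognition results, and finally send an end marker to close the session. Note that the audio should ideally be a clear WAV file: mono, 16-bit depth, 16 kHz sample rate. When I started development, my recording device was faulty and the audio quality was terrible, so the API kept returning error code 304 (facepalm).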
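Before sending a file, it can help to confirm that its header really matches the mono, 16-bit, 16 kHz format described above. The sketch below uses only the standard wave module; the function name check_wav_format is hypothetical. It checks the format only and cannot tell whether the recording itself is noisy or unclear.

```python
import wave


def check_wav_format(path, rate=16000, channels=1, sample_width=2):
    """Return a list of mismatches between a WAV file and the expected format."""
    problems = []
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != channels:
            problems.append(f"channels={wf.getnchannels()}, expected {channels}")
        # Sample width is reported in bytes: 2 bytes = 16-bit samples
        if wf.getsampwidth() != sample_width:
            problems.append(f"sample width={wf.getsampwidth()} bytes, expected {sample_width}")
        if wf.getframerate() != rate:
            problems.append(f"rate={wf.getframerate()} Hz, expected {rate}")
    return problems
```

An empty list means the header matches; otherwise each entry names one mismatch.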

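Demo development:

This demo is written in Python 3 and consists of three files: maindow.py, audioandprocess.py, and recobynetease.py. The interface uses the tkinter library that ships with Python to select the language, start recording, and stop recording and run recognition. audioandprocess.py implements the recording and audio handling logic, and finally calls the real-time speech recognition API through the functions in recobynetease.py.

1. Interface:

Main elements: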

import tkinter as tk
from tkinter import ttk

# lang_type_dict, get_lang_type, start_rec and get_result are defined elsewhere in maindow.py.
root = tk.Tk()
root.title("netease youdao translation test")
frm = tk.Frame(root)
frm.grid(padx='80', pady='80')

# Language selector
label = tk.Label(frm, text='Select language:')
label.grid(row=0, column=0)
combox = ttk.Combobox(frm, textvariable=tk.StringVar(), width=38)
combox["values"] = list(lang_type_dict)
combox.current(0)
combox.bind("<<ComboboxSelected>>", get_lang_type)
combox.grid(row=0, column=1)

# Start recording
btn_start_rec = tk.Button(frm, text='Start recording', command=start_rec)
btn_start_rec.grid(row=2, column=0)

# Status indicator
lb_Status = tk.Label(frm, text='Ready', anchor='w', fg='green')
lb_Status.grid(row=2, column=1)

# Stop recording and run recognition
btn_sure = tk.Button(frm, text="Stop and recognise", command=get_result)
btn_sure.grid(row=3, column=0)

root.mainloop()
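The interface code relies on a language table and a selection callback that the article does not show. One possible shape for that lookup is sketched below, kept free of tkinter so it can be read on its own. Everything here is an assumption: the names LANG_TYPE_DICT and resolve_lang_type are hypothetical, and only the code zh-CHS appears in the article. The other codes should be checked against the API's list of supported languages.

```python
# Hypothetical mapping from the label shown in the combobox to the langType
# value sent to the API. Only "zh-CHS" comes from the article; the remaining
# codes are assumptions to verify against the supported-language list.
LANG_TYPE_DICT = {
    "Chinese": "zh-CHS",
    "English": "en",
    "Japanese": "ja",
    "Korean": "ko",
}


def resolve_lang_type(display_name, default="zh-CHS"):
    """Return the langType code for a combobox label, or the default if unknown."""
    return LANG_TYPE_DICT.get(display_name, default)
```

A ComboboxSelected handler could then call resolve_lang_type(combox.get()) and store the result for the recognition request.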

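2. Audio recording. The pyaudio library (installed via pip) is used to access the audio device and record the WAV file the interface requires, and the wave library is used to store the file: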

import threading
import wave

import pyaudio

# Methods of the recorder class in audioandprocess.py (the class statement is not shown here).
# language_dict is defined elsewhere in that file.

def __init__(self, audio_path, language_type, is_recording):
    self.audio_path = audio_path
    self.audio_file_name = ''
    self.language_type = language_type
    self.language = language_dict[language_type]
    print(language_dict[language_type])
    self.is_recording = is_recording
    self.audio_chunk_size = 1600          # frames per buffer
    self.audio_channels = 1               # mono
    self.audio_format = pyaudio.paInt16   # 16-bit samples
    self.audio_rate = 16000               # 16 kHz sample rate

def record_and_save(self):
    self.is_recording = True
    # self.audio_file_name = self.audio_path + '/recordtmp.wav'
    self.audio_file_name = '/recordtmp.wav'

    # Record on a background thread so the interface stays responsive.
    threading.Thread(target=self.record, args=(self.audio_file_name,)).start()

def record(self, file_name):
    print(file_name)
    p = pyaudio.PyAudio()
    stream = p.open(
        format=self.audio_format,
        channels=self.audio_channels,
        rate=self.audio_rate,
        input=True,
        frames_per_buffer=self.audio_chunk_size
    )
    wf = wave.open(file_name, 'wb')
    wf.setnchannels(self.audio_channels)
    wf.setsampwidth(p.get_sample_size(self.audio_format))
    wf.setframerate(self.audio_rate)

    # Read audio from the device and write it to the file until recording is stopped
    while self.is_recording:
        data = stream.read(self.audio_chunk_size)
        wf.writeframes(data)
    wf.close()
    stream.stop_stream()
    stream.close()
    p.terminate()
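The recording parameters above determine how much audio each buffer holds. PyAudio counts buffers in frames, not bytes, so a small piece of arithmetic shows what the value 1600 means in time and in bytes. This is an illustrative sketch; the two helper names are hypothetical.

```python
def chunk_duration(frames, rate=16000):
    """Seconds of audio contained in one buffer of `frames` sample frames."""
    return frames / rate


def chunk_bytes(frames, channels=1, sample_width=2):
    """Size in bytes of one buffer: frames x channels x bytes per sample."""
    return frames * channels * sample_width
```

With the recorder's settings, each read of 1600 frames covers 0.1 s of audio and yields 3200 bytes. The sending code in the next part reads 1600 bytes at a time, which is 800 frames or 0.05 s, the same as the pause it takes between chunks.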

3. Calling the API:

import hashlib
import json
import time
import uuid

import websocket  # provided by the websocket-client package

# app_key and app_secret (application ID and secret) are defined elsewhere in recobynetease.py.


def recognise(filepath, language_type):
    global file_path
    file_path = filepath
    nonce = str(uuid.uuid1())
    curtime = str(int(time.time()))
    signStr = app_key + nonce + curtime + app_secret
    sign = encrypt(signStr)

    uri = "wss://openapi.youdao.com/stream_asropenapi?appKey=" + app_key + "&salt=" + nonce + "&curtime=" + curtime + \
          "&sign=" + sign + "&version=v1&channel=1&format=wav&signType=v4&rate=16000&langType=" + language_type
    print(uri)
    start(uri, 1600)


def encrypt(signStr):
    # SHA-256 hex digest of the signature string
    digest = hashlib.sha256()
    digest.update(signStr.encode('utf-8'))
    return digest.hexdigest()


def on_message(ws, message):
    result = json.loads(message)
    try:
        # The recognised text is at result -> [0] -> st -> sentence
        resultmessage1 = result['result'][0]
        resultmessage2 = resultmessage1["st"]['sentence']
        print(resultmessage2)
    except (KeyError, IndexError, TypeError):
        # Messages without a recognised sentence are skipped
        print('')


def on_error(ws, error):
    print(error)


def on_close(ws, *args):
    # Newer websocket-client versions also pass a close status code and message
    print("### closed ###")


def on_open(ws):
    count = 0
    with open(file_path, 'rb') as file_object:
        while True:
            chunk_data = file_object.read(1600)
            if not chunk_data:
                break
            ws.send(chunk_data, websocket.ABNF.OPCODE_BINARY)
            time.sleep(0.05)
            count = count + 1
    print(count)
    # Tell the server that the audio stream has ended
    ws.send('{"end": "true"}', websocket.ABNF.OPCODE_BINARY)


def start(uri, step):
    websocket.enableTrace(True)

    ws = websocket.WebSocketApp(uri,
                                on_message=on_message,
                                on_error=on_error,
                                on_close=on_close)

    ws.on_open = on_open
    ws.run_forever()
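The message handler above digs the recognised text out of a nested JSON structure. That lookup can be isolated into a small function that is easy to test without a network connection. This is a sketch: the function name extract_sentence is hypothetical, and the field names result, st and sentence are taken from the handler code, not from the API documentation.

```python
import json


def extract_sentence(message):
    """Return the recognised sentence from one server message, or None if absent."""
    try:
        payload = json.loads(message)
        # Same path as the handler: result -> first item -> st -> sentence
        return payload["result"][0]["st"]["sentence"]
    except (ValueError, KeyError, IndexError, TypeError):
        # Invalid JSON, or a message that carries no sentence
        return None
```

Returning None for messages without a sentence lets the caller decide whether to ignore them or log them, instead of printing an empty line.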

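Summary

The Youdao Zhiyun API is as easy to use as ever. Most of the effort in this round was wasted on recognition failures caused by the poor quality of my own recordings. Once the audio quality was OK, the recognition results were accurate. The next step is to feed them into translation. With the Youdao Zhiyun API, real-time translation can be this simple!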
