Recording at a 16000 Hz sample rate produces distorted audio #7

Closed
jinggoing opened this issue Jul 4, 2019 · 12 comments
Labels
question Further information is requested

Comments

@jinggoing

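Most speech recognition services currently require audio at a 16 kHz sample rate, but when I record at 16 kHz the audio comes out distorted. Can this be fixed?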

@2fps
Owner

2fps commented Jul 4, 2019

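A 16 kHz sample rate is bound to sound lossy to the human ear, since there aren't enough samples.
What is the actual requirement? Speech recognition services all need 16 kHz PCM audio. If you also want to save or play back the recording, you could save or play it at 48 kHz (or convert it to another format such as MP3) and downsample to 16 kHz only when sending it to the speech recognizer.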

@2fps 2fps added the question Further information is requested label Jul 4, 2019
@jinggoing
Author

Is there a JS or Node.js plugin that can do the compression?

@2fps
Owner

2fps commented Jul 5, 2019

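For simple PCM compression you can write it by hand (see my compress() method); for 48k -> 16k, keep one sample out of every three.
For converting to other formats, take a look at lamejs on the front end and fluent-ffmpeg on the Node side.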
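
The "keep one out of every three" idea can be sketched as below. This is only an illustration, assuming mono samples in a Float32Array; `decimate` is a hypothetical helper, not the library's compress() method. Note that plain decimation applies no low-pass filter, so any content above the new Nyquist frequency (8 kHz for a 16 kHz target) will alias into the result.

```javascript
// Naive decimation sketch: keep every `factor`-th sample of mono PCM.
// `decimate` is a hypothetical name used for illustration only.
function decimate(samples, factor) {
  if (!Number.isInteger(factor) || factor < 1) {
    throw new RangeError("factor must be a positive integer");
  }
  const out = new Float32Array(Math.ceil(samples.length / factor));
  for (let i = 0, j = 0; i < samples.length; i += factor, j++) {
    out[j] = samples[i]; // copy sample 0, factor, 2 * factor, ...
  }
  return out;
}

// Usage: 48 kHz mono input, factor 3, gives 16 kHz output.
const input48k = Float32Array.from([0, 1, 2, 3, 4, 5, 6]);
const output16k = decimate(input48k, 3);
```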

@xiangyuecn

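At a glance, the compress function doesn't handle stereo. I think it at least needs to sample in pairs (sampling the left and right channels separately). I'm not sure whether this has any effect; changing a single parameter on the demo test page took a lot of effort 😂

Unless you listen closely, the difference in audio quality between 16 kHz and 48 kHz isn't that large. Using lamejs is also unlikely to improve the quality.

If it's mono, you can refer to my improved and optimized downsampling method.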
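
The point about sampling in pairs can be sketched as follows. This is a minimal illustration and not the method referred to above: it assumes interleaved PCM (L0, R0, L1, R1, ...) in a Float32Array, and `decimateInterleaved` is a hypothetical helper. It keeps whole frames so the left and right channels never get swapped or mixed.

```javascript
// Frame-wise decimation sketch for interleaved PCM.
// A "frame" is one sample per channel; we keep every `factor`-th frame.
// `decimateInterleaved` is a hypothetical name used for illustration only.
function decimateInterleaved(samples, channels, factor) {
  const frames = Math.floor(samples.length / channels);
  const outFrames = Math.ceil(frames / factor);
  const out = new Float32Array(outFrames * channels);
  for (let f = 0, o = 0; f < frames; f += factor, o++) {
    for (let c = 0; c < channels; c++) {
      // Copy channel c of input frame f to channel c of output frame o.
      out[o * channels + c] = samples[f * channels + c];
    }
  }
  return out;
}

// Usage: four stereo frames, keep every third frame (frames 0 and 3).
const stereoIn = Float32Array.from([10, -10, 11, -11, 12, -12, 13, -13]);
const stereoOut = decimateInterleaved(stereoIn, 2, 3);
```

Decimating the flat interleaved array with an odd step instead would alternate between left and right samples, which is the failure mode the comment is worried about.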

@2fps
Owner

2fps commented Jul 29, 2019

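Two Stephen Chow avatars, hahaha. Right, it isn't handled. Stereo support was added later and I forgot about this part... I'll fix it later.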

@xiangyuecn

@2fps I hadn't noticed the two avatars until you mentioned it 😁 Zixia hugging the Saint of Gamblers

@2fps 2fps closed this as completed Aug 2, 2019
@ThoughtZer

ThoughtZer commented Nov 6, 2019

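@2fps I used this library to record at 16 kHz and the audio was distorted... I switched to @xiangyuecn's library for recording, then downsampled to 16 kHz with its sampling method, and that worked fine~ The sampling methods are probably different.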

@2fps
Owner

2fps commented Nov 6, 2019

@ThoughtZer Could you run this in the console and give me the input sample rate it prints?

new (window.AudioContext || window.webkitAudioContext)().sampleRate 

@ThoughtZer

@2fps 44100, in Chrome

@2fps
Owner

2fps commented Nov 6, 2019

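@ThoughtZer My rounding to an integer made the data effectively 22050 Hz, while the output sample rate in the header is still 16000. I'll push an update.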
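
The mismatch described above is consistent with an integer step: 44100 / 16000 = 2.75625, and if that ratio is truncated to 2, the data comes out at 44100 / 2 = 22050 samples per second. One way around it, sketched here under the assumption of mono Float32Array input, is to step by the fractional ratio and interpolate linearly. `resampleLinear` is a hypothetical helper, not the fix the library shipped, and it does no anti-alias filtering.

```javascript
// Fractional-ratio resampling sketch using linear interpolation.
// `resampleLinear` is a hypothetical name used for illustration only.
function resampleLinear(samples, inputRate, outputRate) {
  const ratio = inputRate / outputRate; // e.g. 44100 / 16000 = 2.75625
  // Number of output positions i with i * ratio <= last input index.
  const outLength = Math.floor((samples.length - 1) / ratio) + 1;
  const out = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const pos = i * ratio;           // fractional read position in the input
    const left = Math.floor(pos);
    const right = Math.min(left + 1, samples.length - 1);
    const frac = pos - left;
    out[i] = samples[left] * (1 - frac) + samples[right] * frac;
  }
  return out;
}

// Usage: one second of 44.1 kHz input becomes 16000 output samples,
// which matches a header that declares 16000 Hz.
const oneSecond = new Float32Array(44100);
for (let i = 0; i < oneSecond.length; i++) oneSecond[i] = i;
const resampled = resampleLinear(oneSecond, 44100, 16000);
```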

@ThoughtZer

@2fps Tried it~~ Perfect.

@yangmingqi

yangmingqi commented Jan 10, 2023

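> For simple PCM compression you can write it by hand (see my compress() method); for 48k -> 16k, keep one sample out of every three. For converting to other formats, take a look at lamejs on the front end and fluent-ffmpeg on the Node side.

48 kHz audio contains usable frequencies from 0 to 24 kHz. Downsampling to 16 kHz keeps only the 0-8 kHz content and discards the 8-24 kHz content.
For 48k -> 16k, simply keeping one sample out of every three is not correct. It causes aliasing: the 8-24 kHz content gets folded into the new audio and shows up as noise.
The correct approach is to first low-pass filter the 48 kHz audio to remove the 8-24 kHz content, and then keep one sample out of every three.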
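
The filter-then-decimate procedure described above can be sketched as follows. This is a minimal illustration, not code from either library: it assumes mono Float32Array input, and the helper names, the 63-tap Hamming-windowed sinc filter, and the cutoff at 90% of the new Nyquist frequency are all assumptions chosen for the example.

```javascript
// Hamming-windowed sinc low-pass FIR. `cutoff` is in cycles per input
// sample (0 < cutoff < 0.5). `numTaps` must be odd so the filter has a
// centre tap. Both helper names are hypothetical.
function designLowPass(numTaps, cutoff) {
  if (numTaps % 2 !== 1) throw new RangeError("numTaps must be odd");
  const taps = new Float64Array(numTaps);
  const mid = (numTaps - 1) / 2;
  let sum = 0;
  for (let n = 0; n < numTaps; n++) {
    const x = n - mid;
    const sinc = x === 0
      ? 2 * cutoff
      : Math.sin(2 * Math.PI * cutoff * x) / (Math.PI * x);
    const hamming = 0.54 - 0.46 * Math.cos((2 * Math.PI * n) / (numTaps - 1));
    taps[n] = sinc * hamming;
    sum += taps[n];
  }
  for (let n = 0; n < numTaps; n++) taps[n] /= sum; // unity gain at DC
  return taps;
}

// Low-pass filter and keep every `factor`-th sample in one pass: the filter
// is only evaluated at the positions that survive decimation.
function filterAndDecimate(samples, factor, numTaps = 63) {
  // Cutoff at 90% of the new Nyquist (0.5 / factor), an assumed margin.
  const taps = designLowPass(numTaps, 0.45 / factor);
  const half = (numTaps - 1) / 2;
  const out = new Float32Array(Math.ceil(samples.length / factor));
  for (let o = 0; o < out.length; o++) {
    const center = o * factor;
    let acc = 0;
    for (let k = 0; k < numTaps; k++) {
      const idx = center + k - half;
      // Samples outside the buffer are treated as zero.
      if (idx >= 0 && idx < samples.length) acc += taps[k] * samples[idx];
    }
    out[o] = acc;
  }
  return out;
}

// Usage: a 20 kHz tone sampled at 48 kHz lies above the 8 kHz limit, so it
// should be strongly attenuated instead of aliasing to 4 kHz.
const tone = new Float32Array(300);
for (let i = 0; i < tone.length; i++) {
  tone[i] = Math.sin((2 * Math.PI * 20000 * i) / 48000);
}
const filtered = filterAndDecimate(tone, 3);
```

For 48k -> 16k this puts the cutoff at about 7.2 kHz. A production resampler would usually tune the tap count and window to the quality and latency it needs.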
