Audio basics
The PCM format
PCM is the raw, uncompressed data stream you get straight from the microphone after recording.
Data size (bytes) = sample rate × bit depth × number of channels × duration (seconds) / 8
The sample rate is typically 44.1 kHz, the bit depth is typically 8 or 16 bits, and the audio is usually mono or stereo.
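For example, one second of 44.1 kHz, 16-bit, stereo PCM works out to 44100 × 16 × 2 × 1 / 8 = 176,400 bytes, roughly 172 KB per second, or about 10 MB per minute.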
PCM is an encoding format: it is just a stream of sample values, with no header information and no notion of frames. Unless you are the one who recorded the audio, a chunk of PCM data by itself gives you no way to tell its sample rate or other parameters.
The AAC format
From a first look, an AAC file can have no file header at all and consist entirely of a sequence of frames, each made up of a frame header and a data section. The frame header carries the sample rate, channel count, frame length, and so on, which is somewhat similar to the MP3 format.
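That per-frame header is the ADTS (Audio Data Transport Stream) header. The raw AAC packets produced by the encoder below do not include it, so to write a playable .aac file you have to prepend 7 bytes to every packet yourself. The sketch below shows one way to fill those bytes; it assumes AAC-LC and a header without CRC, and the function name MakeADTSHeader is made up for illustration.

// Fill a 7-byte ADTS header (no CRC) for one raw AAC packet.
// sampleRateIndex: 4 = 44.1 kHz, 3 = 48 kHz, ...; channelConfig: 1 = mono, 2 = stereo.
static void MakeADTSHeader(uint8_t header[7], int sampleRateIndex, int channelConfig, size_t aacDataLength) {
    size_t frameLength = aacDataLength + 7;                   // Length field covers header + payload
    header[0] = 0xFF;                                         // Syncword, high 8 bits
    header[1] = 0xF1;                                         // Syncword low 4 bits, MPEG-4, layer 0, no CRC
    header[2] = (uint8_t)((1 << 6) |                          // Profile: AAC-LC (object type 2, stored as 2 - 1)
                          (sampleRateIndex << 2) |
                          (channelConfig >> 2));
    header[3] = (uint8_t)(((channelConfig & 0x3) << 6) | (frameLength >> 11));
    header[4] = (uint8_t)((frameLength >> 3) & 0xFF);
    header[5] = (uint8_t)(((frameLength & 0x7) << 5) | 0x1F); // Low bits of length + buffer fullness (0x7FF = VBR)
    header[6] = 0xFC;                                         // Rest of buffer fullness + one raw data block
}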
AAC encoding
Initializing the encoder converter
- (BOOL)createAudioConvert { // Create the encoder converter from the input sample format
    if (m_converter != nil) {
        return YES;
    }

    AudioStreamBasicDescription inputFormat = {0};
    inputFormat.mSampleRate = _configuration.audioSampleRate;
    inputFormat.mFormatID = kAudioFormatLinearPCM;
    inputFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagsNativeEndian | kAudioFormatFlagIsPacked;
    inputFormat.mChannelsPerFrame = (UInt32)_configuration.numberOfChannels;
    inputFormat.mFramesPerPacket = 1;
    inputFormat.mBitsPerChannel = 16;
    inputFormat.mBytesPerFrame = inputFormat.mBitsPerChannel / 8 * inputFormat.mChannelsPerFrame;
    inputFormat.mBytesPerPacket = inputFormat.mBytesPerFrame * inputFormat.mFramesPerPacket;

    AudioStreamBasicDescription outputFormat; // Output (encoded) audio format starts here
    memset(&outputFormat, 0, sizeof(outputFormat));
    outputFormat.mSampleRate = inputFormat.mSampleRate;  // Keep the same sample rate as the input
    outputFormat.mFormatID = kAudioFormatMPEG4AAC;       // AAC encoding: kAudioFormatMPEG4AAC or kAudioFormatMPEG4AAC_HE_V2
    outputFormat.mChannelsPerFrame = (UInt32)_configuration.numberOfChannels;
    outputFormat.mFramesPerPacket = 1024;                // One AAC packet holds 1024 PCM frames (samples per channel), not 1024 bytes

    const OSType subtype = kAudioFormatMPEG4AAC;
    AudioClassDescription requestedCodecs[2] = {
        {
            kAudioEncoderComponentType,
            subtype,
            kAppleSoftwareAudioCodecManufacturer
        },
        {
            kAudioEncoderComponentType,
            subtype,
            kAppleHardwareAudioCodecManufacturer
        }
    };
    OSStatus result = AudioConverterNewSpecific(&inputFormat, &outputFormat, 2, requestedCodecs, &m_converter);
    if (result != noErr) {
        return NO;
    }
    return YES;
}
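The method above leaves the output bitrate at the encoder's default. If a specific bitrate is wanted, it could be requested with AudioConverterSetProperty right after AudioConverterNewSpecific succeeds. A minimal sketch (the 64 kbit/s per channel figure is just an example, and the encoder may clamp values it does not support):

// Optionally request an AAC bitrate once the converter has been created.
UInt32 outputBitrate = 64000 * outputFormat.mChannelsPerFrame; // e.g. 64 kbit/s per channel
OSStatus status = AudioConverterSetProperty(m_converter,
                                            kAudioConverterEncodeBitRate,
                                            sizeof(outputBitrate),
                                            &outputBitrate);
if (status != noErr) {
    NSLog(@"Failed to set AAC bitrate: %d", (int)status);
}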
Performing the encoding conversion
char *aacBuf; // Note: this should be an instance variable initialized to NULL, so the buffer is allocated once and reused across calls
if (!aacBuf) {
    aacBuf = malloc(inBufferList.mBuffers[0].mDataByteSize);
}

// Set up the output buffer list
AudioBufferList outBufferList;
outBufferList.mNumberBuffers = 1;
outBufferList.mBuffers[0].mNumberChannels = inBufferList.mBuffers[0].mNumberChannels;
outBufferList.mBuffers[0].mDataByteSize = inBufferList.mBuffers[0].mDataByteSize; // Output buffer size
outBufferList.mBuffers[0].mData = aacBuf;                                         // AAC output buffer

UInt32 outputDataPacketSize = 1;
if (AudioConverterFillComplexBuffer(m_converter, inputDataProc, &inBufferList, &outputDataPacketSize, &outBufferList, NULL) != noErr) {
    return;
}

AudioFrame *audioFrame = [AudioFrame new];
audioFrame.timestamp = timeStamp;
audioFrame.data = [NSData dataWithBytes:aacBuf length:outBufferList.mBuffers[0].mDataByteSize];

char exeData[2];
exeData[0] = _configuration.asc[0];
exeData[1] = _configuration.asc[1];
audioFrame.audioInfo = [NSData dataWithBytes:exeData length:2];
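AudioConverterFillComplexBuffer pulls its PCM input through the inputDataProc callback passed above, with &inBufferList as the user-data pointer. The callback itself is not part of the snippet; a minimal sketch of what it could look like, assuming the user data is simply the AudioBufferList holding the PCM to encode:

// Called by AudioConverterFillComplexBuffer whenever the encoder needs more PCM input.
// inUserData is the &inBufferList passed as the third argument above.
static OSStatus inputDataProc(AudioConverterRef inConverter,
                              UInt32 *ioNumberDataPackets,
                              AudioBufferList *ioData,
                              AudioStreamPacketDescription **outDataPacketDescription,
                              void *inUserData) {
    AudioBufferList *bufferList = (AudioBufferList *)inUserData;
    // Hand the whole PCM buffer to the converter in one go.
    ioData->mBuffers[0].mNumberChannels = bufferList->mBuffers[0].mNumberChannels;
    ioData->mBuffers[0].mData = bufferList->mBuffers[0].mData;
    ioData->mBuffers[0].mDataByteSize = bufferList->mBuffers[0].mDataByteSize;
    return noErr;
}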
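audioFrame.audioInfo carries the 2-byte AudioSpecificConfig (ASC), which tells a downstream muxer or decoder the object type, sample rate, and channel count. The snippet reads it from _configuration.asc without showing how it is built; a sketch of how those two bytes could be assembled for AAC-LC (assuming 44.1 kHz stereo; the sampleRateIndex and channelConfig values are the standard MPEG-4 ones):

// AudioSpecificConfig for AAC-LC:
// 5 bits object type (2 = AAC-LC) | 4 bits sample-rate index | 4 bits channel config | 3 bits zero.
UInt8 sampleRateIndex = 4;   // 4 = 44.1 kHz (3 = 48 kHz, 11 = 8 kHz, ...)
UInt8 channelConfig   = 2;   // 2 = stereo
char asc[2];
asc[0] = (2 << 3) | (sampleRateIndex >> 1);
asc[1] = ((sampleRateIndex & 0x1) << 7) | (channelConfig << 3);
// For 44.1 kHz stereo this gives 0x12 0x10.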