Multimedia Technology

Huang Dongjun, School of Information Science and Engineering, Central South University

Chapter 4: Lossless Data Compression Techniques

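1 Entropy Coding

1.1 Arithmetic Coding: Example

Let the source alphabet be s = (a1, a2, a3, a4) with probabilities p = (0.1, 0.4, 0.2, 0.3), and let the message be a3 a1 a4 a1 a3 a4 a2. The arithmetic coding algorithm then proceeds as follows.

① Divide the interval [0.0, 1.0) into four parts according to the symbol probabilities:

Symbol   Probability   Initial subinterval
a1       0.1           [0.0, 0.1)
a2       0.4           [0.1, 0.5)
a3       0.2           [0.5, 0.7)
a4       0.3           [0.7, 1.0)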

② Read the first character of the message, a3, and take its initial subinterval as the coding interval: [0.5, 0.7), of width 0.2.

③ Read the second character, a1. Since a1 occupies the first tenth of [0.0, 1.0), take the first tenth of the previous coding interval as the coding interval of the string a3a1: [0.5, 0.52), of width 0.02.

④ Read the third character, a4. Since a4 occupies the last three tenths of [0.0, 1.0), take the last three tenths of the previous coding interval (the part that begins after the seventh tenth) as the coding interval of the string a3a1a4: [0.514, 0.52), of width 0.006.

⑤ Repeat the same process for the remaining characters a1 a3 a4 a2. The coding interval narrows as follows:

After a1: [0.514, 0.5146), width 0.0006
After a3: [0.5143, 0.51442), width 0.00012
After a4: [0.514384, 0.51442), width 0.000036
After a2: [0.5143876, 0.514402)

⑥ When all input characters have been processed, a value from the final coding interval [0.5143876, 0.514402) is output as the code of the message. The lower bound is used here (the upper bound is excluded from the half-open interval), so the output is the real number 0.5143876.
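The encoding steps can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not a production coder: the names MODEL and arithmetic_encode are hypothetical, and exact fractions are used so that the interval bounds can be compared with the example without floating-point rounding.

```python
from fractions import Fraction

# Initial subintervals from the example: symbol -> (low, high).
MODEL = {
    "a1": (Fraction("0.0"), Fraction("0.1")),
    "a2": (Fraction("0.1"), Fraction("0.5")),
    "a3": (Fraction("0.5"), Fraction("0.7")),
    "a4": (Fraction("0.7"), Fraction("1.0")),
}

def arithmetic_encode(message, model=MODEL):
    """Return the final coding interval [low, high) for a list of symbols."""
    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        width = high - low
        sym_low, sym_high = model[symbol]
        # Keep only the symbol's share of the current interval.
        high = low + width * sym_high
        low = low + width * sym_low
    return low, high

message = ["a3", "a1", "a4", "a1", "a3", "a4", "a2"]
low, high = arithmetic_encode(message)
print(float(low), float(high))
```

Practical coders work with fixed-width integers and renormalization rather than exact fractions; the fractions here only keep the sketch close to the hand calculation.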

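Decoding process

① Read the encoded real number and find the initial subinterval it falls into; the character of that subinterval is the first decoded character. Here 0.5143876 lies in [0.5, 0.7), so the first character is a3.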

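② Divide the interval of the previously decoded character into subintervals in proportion to the probabilities of all source symbols. The character whose subinterval contains the input value is the second decoded character. Here [0.5, 0.7) splits into [0.5, 0.52), [0.52, 0.6), [0.6, 0.64), and [0.64, 0.7); 0.5143876 lies in [0.5, 0.52), so the second character is a1.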

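③ Repeat this process until decoding is complete. For example, [0.5, 0.52) splits into [0.5, 0.502), [0.502, 0.51), [0.51, 0.514), and [0.514, 0.52); 0.5143876 lies in [0.514, 0.52), so the third character is a4, and so on.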
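The decoding steps can be sketched the same way. The sketch below assumes the decoder is told the message length (7 symbols here), since the slides do not say how the end of the message is signalled; MODEL and arithmetic_decode are hypothetical names.

```python
from fractions import Fraction

# Initial subintervals from the example: symbol -> (low, high).
MODEL = {
    "a1": (Fraction("0.0"), Fraction("0.1")),
    "a2": (Fraction("0.1"), Fraction("0.5")),
    "a3": (Fraction("0.5"), Fraction("0.7")),
    "a4": (Fraction("0.7"), Fraction("1.0")),
}

def arithmetic_decode(value, length, model=MODEL):
    """Decode `length` symbols from a code value in [0, 1)."""
    low, high = Fraction(0), Fraction(1)
    decoded = []
    for _ in range(length):
        width = high - low
        for symbol, (sym_low, sym_high) in model.items():
            sub_low = low + width * sym_low
            sub_high = low + width * sym_high
            # Half-open test: the lower bound belongs to the subinterval.
            if sub_low <= value < sub_high:
                decoded.append(symbol)
                low, high = sub_low, sub_high
                break
        else:
            raise ValueError("code value lies outside [0, 1)")
    return decoded

decoded = arithmetic_decode(Fraction("0.5143876"), 7)
print(" ".join(decoded))
```

Passing the length explicitly is one design choice; another common one is to reserve an end-of-message symbol in the model.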

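1.2 Run-Length Encoding (RLE): Basic Idea

Given the message

abcdddddddddffffgggg (20 characters)

it can be represented in the form index (repeat count) + indexed object (the repeated object):

abc9d4f4g (9 characters)

In RLE, an index plus its indexed object forms one coding unit.

Applications: BMP files (RLE_4 and RLE_8 encoding), JPEG, and others.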
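A minimal Python sketch of this scheme follows. It assumes, as in the example, that single characters are written without a count and that the input contains no digits (otherwise counts and data would be ambiguous); rle_encode and rle_decode are hypothetical names.

```python
import re
from itertools import groupby

def rle_encode(text):
    """Write each run as count + character; single characters stay as they are."""
    pieces = []
    for char, run in groupby(text):
        count = len(list(run))
        pieces.append(char if count == 1 else f"{count}{char}")
    return "".join(pieces)

def rle_decode(encoded):
    """Invert rle_encode, assuming the original text contained no digits."""
    pairs = re.findall(r"(\d*)(\D)", encoded)
    return "".join(char * int(count or 1) for count, char in pairs)

message = "abc" + "d" * 9 + "ffff" + "gggg"   # the 20-character example
encoded = rle_encode(message)
print(encoded, len(encoded))
```

Real formats such as the BMP RLE modes use fixed-size binary count fields instead of decimal digits, which removes the ambiguity this sketch sidesteps.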

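2 Dictionary Coding

2.1 Categories of Dictionary Coding

Type 1 dictionary coding. Basic idea: represent the string currently being coded by a pointer to an earlier occurrence of the same string.

[Figure: the sequence A B C B C B C B C A B C, in which later repeats are replaced by pointers to earlier occurrences, labelled P(BC) and P(BCBC).]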

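Type 2 dictionary coding. Build a dictionary of phrases from the input data stream. When a phrase already in the dictionary appears later in the stream, output its index in the dictionary instead of the phrase itself.

[Figure: input A B C B C A B C B; a dictionary holding the phrases BC and ABC; coded output A B C 1 2 B, where 1 and 2 are dictionary indices.]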

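2.2 Type 1 Dictionary Coding: The LZ77 Algorithm (by Abraham Lempel and Jacob Ziv)

[Figure: a sliding window of at most Max-Win-Size characters covers the data already coded; the coding position marks the start of the data waiting to be coded. A match is described by Off (offset back from the coding position), Len (match length), and Next char (the character following the match).]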

LZ77: Algorithm Description

① Set the coding position to the start of the input.
② Find the longest string in the sliding window that matches the data waiting to be coded at the coding position.
③ If a match is found, output (Off, Len, Next char) and slide the window forward by Len + 1; otherwise output (0, 0, Next char) and slide the window forward by 1. If input remains, return to ②.

LZ77: Example

Given the message

abcdbbccaaabaeaaabaee

with MAX-WIN-SIZE = 10, the algorithm produces:

Output: (0,0,a) (0,0,b) (0,0,c) (0,0,d) (3,1,b) (4,1,c) (8,1,a) (10,2,a) (0,0,e) (6,6,e)
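The following Python sketch implements the LZ77 steps with a brute-force window search. It rests on a few assumptions that the slides leave open: Off counts characters back from the coding position, a match may not extend past the window into the data still waiting to be coded, and a match always leaves one character over to serve as Next char. The names lz77_encode and lz77_decode are hypothetical.

```python
def lz77_encode(data, window_size=10):
    """Return a list of (offset, length, next_char) triples."""
    output = []
    pos = 0
    while pos < len(data):
        window_start = max(0, pos - window_size)
        best_off, best_len = 0, 0
        for start in range(window_start, pos):
            length = 0
            # Stop one character early so that next_char always exists,
            # and keep the match inside the already coded window.
            while (pos + length < len(data) - 1
                   and start + length < pos
                   and data[start + length] == data[pos + length]):
                length += 1
            if length > best_len:
                best_off, best_len = pos - start, length
        output.append((best_off, best_len, data[pos + best_len]))
        pos += best_len + 1
    return output

def lz77_decode(triples):
    """Rebuild the text by copying `length` characters from `offset` back."""
    out = []
    for offset, length, next_char in triples:
        start = len(out) - offset
        for i in range(length):
            out.append(out[start + i])
        out.append(next_char)
    return "".join(out)

triples = lz77_encode("abcdbbccaaabaeaaabaee", window_size=10)
print(triples)
```

The search here is quadratic in the window size; practical implementations usually index the window with hash chains or a similar structure.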

LZ77: Characteristics

① The output contains some unnecessary data; for example, an unmatched character still carries a full (0, 0, Next char) triple.
② Data coded earlier may reduce the coding efficiency of later data.
③ The sliding window size affects the efficiency of the whole coding process, but it is hard to decide which size works best.

The LZSS Algorithm

LZSS is a refinement of LZ77 that requires the match length Len to be at least a constant MIN-LENGTH.

① Set the coding position to the start of the input.
② Find the longest string in the sliding window that matches the data waiting to be coded at the coding position.
③ If Len >= MIN-LENGTH, output (Off, Len) and slide the window forward by Len; otherwise output the next character (the one at the coding position) and slide the window forward by 1. If input remains, return to ②.

LZSS: Example

Given the message

AABBCBBAABC

with MIN_LENGTH = 2 and WIN_SIZE = 10, the algorithm produces:

Output: A A B B C (3,2) (7,3) C
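A Python sketch of the LZSS steps is given below, under the same assumptions as for LZ77: Off counts back from the coding position and matches stay inside the already coded window. Literals are returned as one-character strings and matches as (Off, Len) tuples; how a real bit stream distinguishes the two is not covered by the slides and is left out. The names lzss_encode and lzss_decode are hypothetical.

```python
def lzss_encode(data, window_size=10, min_length=2):
    """Return a list of literal characters and (offset, length) pairs."""
    output = []
    pos = 0
    while pos < len(data):
        window_start = max(0, pos - window_size)
        best_off, best_len = 0, 0
        for start in range(window_start, pos):
            length = 0
            # Extend the match while it stays inside the coded window.
            while (pos + length < len(data)
                   and start + length < pos
                   and data[start + length] == data[pos + length]):
                length += 1
            if length > best_len:
                best_off, best_len = pos - start, length
        if best_len >= min_length:
            output.append((best_off, best_len))
            pos += best_len
        else:
            # Match too short to pay off: emit the character itself.
            output.append(data[pos])
            pos += 1
    return output

def lzss_decode(tokens):
    """Rebuild the text from literals and (offset, length) pairs."""
    out = []
    for token in tokens:
        if isinstance(token, tuple):
            offset, length = token
            start = len(out) - offset
            for i in range(length):
                out.append(out[start + i])
        else:
            out.append(token)
    return "".join(out)

tokens = lzss_encode("AABBCBBAABC", window_size=10, min_length=2)
print(tokens)
```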

2.3 Type 2 Dictionary Coding: The LZ78 Algorithm

Algorithm description:

① Dictionary = NULL; P = NULL.
② C = next character.
③ If P + C is in Dictionary, set P = P + C; otherwise output the index of P in Dictionary together with C (index 0 if P is NULL), add P + C to Dictionary, and set P = NULL.
④ If data remains to be processed, return to ②; otherwise, if P ≠ NULL, output the index of P. The algorithm then ends.

LZ78: Example

Given the message

ABBCBCABA

the algorithm proceeds as shown in the table:

Step   Coding position   Dictionary entry   Output
1      1                 A                  (0,A)
2      2                 B                  (0,B)
3      3                 BC                 (2,C)
4      5                 BCA                (3,A)
5      8                 BA                 (2,A)
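The LZ78 steps translate almost directly into Python. In this sketch the dictionary is a dict from phrase to 1-based index, index 0 stands for the empty phrase, and a leftover phrase at the end of the input is emitted as (index, ""); that last convention and the names lz78_encode and lz78_decode are assumptions made for illustration.

```python
def lz78_encode(data):
    """Return (index, char) pairs and the phrase dictionary that was built."""
    dictionary = {}   # phrase -> 1-based index
    output = []
    phrase = ""       # P in the algorithm description
    for char in data:             # C in the algorithm description
        if phrase + char in dictionary:
            phrase += char
        else:
            # Index 0 means the empty phrase.
            output.append((dictionary.get(phrase, 0), char))
            dictionary[phrase + char] = len(dictionary) + 1
            phrase = ""
    if phrase:
        # Input ended inside a known phrase: emit its index alone.
        output.append((dictionary[phrase], ""))
    return output, dictionary

def lz78_decode(pairs):
    """Rebuild the text, re-creating the dictionary along the way."""
    phrases = [""]    # position 0 holds the empty phrase
    out = []
    for index, char in pairs:
        phrase = phrases[index] + char
        out.append(phrase)
        phrases.append(phrase)
    return "".join(out)

pairs, dictionary = lz78_encode("ABBCBCABA")
print(pairs)
print(dictionary)
```

The decoder needs no copy of the dictionary because it rebuilds the same entries in the same order as the encoder.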

Thank you!
