[The BIOS as I Understand It] -> Decompression
By LightSeed
2009-5-22
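Most of the content of a BIOS .bin file exists in the form of modules, and as is well known, those modules are compressed. This chapter looks at how the modules are compressed and decompressed. As in the earlier chapters, the goal is to understand the code from Awxxx.
1. LHA
1.1 Background of LHA
LHA is a file compression program and also the name of its archive format; the corresponding file extensions are .lha and .lzh. It was created by Haruyasu Yoshizaki (吉崎榮泰), a Japanese amateur programmer. Yoshizaki, a physician by profession, used his spare time to develop a file compression program and format called LHarc, based on an algorithm invented by Haruhiko Okumura (奧村晴彥), and first released it online in 1988. Around 1990 the program was completely rewritten and renamed LHA. It may not be used much in other countries any more, but it is still one of the common compression formats in Japan. (Source: the Internet)
1.2 The LZH format
Note that the LZH header comes in three levels: Level-0, Level-1, and Level-2. Their data layouts are shown in Figure 1.
Figure 1 Format differences between Level-0, Level-1, and Level-2
The data held in each byte of the Level-0, Level-1, and Level-2 formats is listed in the three tables below.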
level-0
Offset Length Contents
0 1 byte Size of archived file header (h)
1 1 byte Header checksum
2 5 bytes Method ID
7 4 bytes Compressed size (n)
11 4 bytes Uncompressed size
15 4 bytes Original file date/time (Generic time stamp)
19 1 byte File attribute
20 1 byte Level (0x00)
21 1 byte Filename / path length in bytes (f)
22 (f)bytes Filename / path
22+(f) 2 bytes CRC-16 of original file
24+(f) (n)bytes Compressed data
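The level-0 layout above maps directly onto a few fixed-width reads. The following is a minimal Python sketch, not code taken from the BIOS or from any LHA tool. The function name `parse_level0_header` and the dictionary keys are hypothetical, and the checksum rule it checks (the sum of the h bytes that start at offset 2, modulo 256) is an assumption about how the header checksum is computed.

```python
import struct

def parse_level0_header(buf: bytes) -> dict:
    """Parse an LZH level-0 header laid out as in the level-0 table."""
    header_size, checksum = buf[0], buf[1]
    # Offsets 2..21: method ID, sizes, timestamp, attribute, level, name length.
    (method, comp_size, orig_size, timestamp,
     attribute, level, name_len) = struct.unpack_from("<5sIIIBBB", buf, 2)
    if level != 0:
        raise ValueError("not a level-0 header")
    name = buf[22:22 + name_len]
    (crc16,) = struct.unpack_from("<H", buf, 22 + name_len)
    # Assumption: the checksum covers header_size bytes starting at offset 2.
    computed = sum(buf[2:2 + header_size]) & 0xFF
    return {
        "method": method.decode("ascii"),
        "compressed_size": comp_size,
        "uncompressed_size": orig_size,
        "timestamp": timestamp,
        "attribute": attribute,
        "name": name,
        "crc16": crc16,
        "checksum_ok": computed == checksum,
        "data_offset": 24 + name_len,  # compressed data starts at 24+(f)
    }
```

All multi-byte fields are read as little endian (`<`), matching the byte order described for the cbrom header later in this chapter.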
level-1
Offset Length Contents
0 1 byte Size of archived file header (h)
1 1 byte Header checksum
2 5 bytes Method ID
7 4 bytes Compressed size (n)
11 4 bytes Uncompressed size
15 4 bytes Original file date/time (Generic time stamp)
19 1 byte 0x20
20 1 byte Level (0x01)
21 1 byte Filename / path length in bytes (f)
22 (f)bytes Filename / path
22+(f) 2 bytes CRC-16 of original file
24+(f) 1 byte OS ID
25+(f) 2 bytes Next header size(x) (0 means no extension header)
[ // Extension headers
1 byte Extension type
(x)-3 bytes Extension fields
2 bytes Next header size(x) (0 means no next extension header)
]*
(n)bytes Compressed data
level-2
Offset Length Contents
0 2 bytes Total size of archived file header (h)
2 5 bytes Method ID
7 4 bytes Compressed size (n)
11 4 bytes Uncompressed size
15 4 bytes Original file time stamp(UNIX type, seconds since 1970)
19 1 byte Reserved
20 1 byte Level (0x02)
21 2 bytes CRC-16 of original file
23 1 byte OS ID
24 2 bytes Next header size(x) (0 means no extension header)
[
1 byte Extension type
(x)-3 bytes Extension fields
2 bytes Next header size(x) (0 means no next extension header)
]*
(n)bytes Compressed data
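In the level-1 and level-2 tables, the bracketed block repeats until a "next header size" of 0 is read. As a rough illustration, that chain could be walked as in the Python sketch below. The function name `walk_extension_headers` and its parameter `first_size_pos` are hypothetical; the caller passes the offset of the first "next header size" field (25+(f) for level-1, 24 for level-2, per the tables). The sketch assumes that each size x counts the type byte, the fields, and the trailing 2-byte size, as the "(x)-3 bytes" entry implies.

```python
import struct

def walk_extension_headers(buf: bytes, first_size_pos: int):
    """Return ([(ext_type, fields), ...], offset just past the last extension)."""
    extensions = []
    (size,) = struct.unpack_from("<H", buf, first_size_pos)
    pos = first_size_pos + 2
    while size != 0:
        if size < 3:
            # A non-zero size must hold at least the type byte and next-size word.
            raise ValueError("extension header too small")
        ext_type = buf[pos]
        fields = buf[pos + 1:pos + size - 2]          # (x)-3 bytes of fields
        (next_size,) = struct.unpack_from("<H", buf, pos + size - 2)
        extensions.append((ext_type, fields))
        pos += size                                    # jump to the next extension
        size = next_size
    return extensions, pos
```

When the first size field is 0, the function returns an empty list and the offset immediately after that field.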
The Method ID field is explained in Table 1.
Table 1 Meaning of the LHA compression method IDs
"-lh0-" | No compression |
"-lh1-" | 4k sliding dictionary(max 60 bytes) + dynamic Huffman + fixed encoding of position |
"-lh2-" | 8k sliding dictionary(max 256 bytes) + dynamic Huffman (Obsoleted) |
"-lh3-" | 8k sliding dictionary(max 256 bytes) + static Huffman (Obsoleted) |
"-lh4-" | 4k sliding dictionary(max 256 bytes) + static Huffman + improved encoding of position and trees |
"-lh5-" | 8k sliding dictionary(max 256 bytes) + static Huffman + improved encoding of position and trees |
"-lh6-" | 32k sliding dictionary(max 256 bytes) + static Huffman + improved encoding of position and trees |
"-lh7-" | 64k sliding dictionary(max 256 bytes) + static Huffman + improved encoding of position and trees |
"-lzs-" | 2k sliding dictionary(max 17 bytes) |
"-lz4-" | No compression |
"-lz5-" | 4k sliding dictionary(max 17 bytes) |
"-lhd-" | Directory (no compressed data) |
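2. Decompression in detail
2.1 Background for decompression
2.1.1 Compression by Cbrom
Most of the BIOS.bin file that gets burned into the ROM with the AWxxx code is packed in by cbrom, module by module. The BIOS of the board I have at hand, for example, contains 8 modules in total; see Figure 2.
Figure 2 BIOS modules
As the figure shows, both the System BIOS and the PCI ROM are packed into the .bin file as modules. When cbrom packs a module, it writes a header in a specific format. Its structure is given in Table 2.
Table 2 Header of a module packed by cbrom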
Offset from 1st byte | Offset in Real Header | Contents |
00h | N/A | The header length of the component. It depends on the file/component name. |
01h | N/A | The header 8-bit checksum, not including the first 2 bytes (header length and header checksum byte). |
02h - 06h | 00h - 04h | LZH Method ID (ASCII string signature). In my BIOS it's "-lh5-" which means: 8k sliding dictionary(max 256 bytes) + static Huffman + improved encoding of position and trees. |
07h - 0Ah | 05h - 08h | Compressed file/component size in little endian dword value, i.e. MSB at 0Ah and so forth |
0Bh - 0Eh | 09h - 0Ch | Uncompressed file/component size in little endian dword value, i.e. MSB at 0Eh and so forth |
0Fh - 10h | 0Dh - 0Eh | Decompression offset address in little endian word value, i.e. MSB at 10h and so forth. The component will be decompressed into this offset address (real mode addressing is in effect here). |
11h - 12h | 0Fh - 10h | Decompression segment address in little endian word value, i.e. MSB at 12h and so forth. The component will be decompressed into this segment address (real mode addressing is in effect here). |
13h | 11h | File attribute. My BIOS components contain 20h here, which is normally found in LZH level-1 compressed files. |
14h | 12h | Level. My BIOS components contain 01h here, which means it's an LZH level-1 compressed file. |
15h | 13h | Component filename length in bytes. |
16h - [15h+filename_len] | 14h - [13h+filename_len] | component filename (ASCII string) |
[16h+filename_len] - [17h+filename_len] | [14h+filename_len] - [15h+filename_len] | file/component CRC-16 in little endian word value, i.e. MSB at [HeaderSize - 2h] and so forth. |
[18h+filename_len] | [16h+filename_len] | Operating System ID. In my BIOS it's always 20h (ASCII space character), which doesn't resemble any LZH OS ID known to me. |
[19h+filename_len] - [1Ah+filename_len] | [17h+filename_len] - [18h+filename_len] | Next header size. In my BIOS it's always 0000h which means no extension header. |
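Worth noting are the entries marked with a blue line in the table (the decompression offset and segment fields). They give the address to which the module packed into the .bin file will be decompressed. Since Cbrom is a tool published by Awxxx, the decompression code works hand in hand with it. (Author's note: for modules packed by cbrom, the decompression address is essentially always in segment 4000; the exact location varies somewhat and depends on the index.) See screenshot 3, a binary dump of the system BIOS from my board's BIOS.bin after cbrom compression.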
Figure 3 The decompression segment
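To make the role of the decompression offset and segment fields concrete, the Table 2 layout can be read as in the Python sketch below. This is only an illustration under stated assumptions, not the BIOS decompression code or anything shipped with cbrom. The function name `parse_cbrom_header` and the dictionary keys are hypothetical. The linear address is formed with the standard real-mode rule, segment * 16 + offset.

```python
import struct

def parse_cbrom_header(buf: bytes) -> dict:
    """Read the fields of a cbrom module header as laid out in Table 2."""
    header_len, checksum = buf[0], buf[1]
    # 02h..15h: method ID, sizes, destination offset/segment, attribute,
    # level, filename length (all little endian).
    (method, comp_size, orig_size, dest_off, dest_seg,
     attribute, level, name_len) = struct.unpack_from("<5sIIHHBBB", buf, 0x02)
    name_start = 0x16
    name = buf[name_start:name_start + name_len].decode("ascii", errors="replace")
    crc16, os_id, next_size = struct.unpack_from("<HBH", buf, name_start + name_len)
    return {
        "header_len": header_len,
        "checksum": checksum,
        "method": method.decode("ascii", errors="replace"),
        "compressed_size": comp_size,
        "uncompressed_size": orig_size,
        "dest_offset": dest_off,
        "dest_segment": dest_seg,
        # Real-mode address translation: segment shifted left 4 bits plus offset.
        "linear_address": (dest_seg << 4) + dest_off,
        "attribute": attribute,
        "level": level,
        "name": name,
        "crc16": crc16,
        "os_id": os_id,
        "next_header_size": next_size,
    }
```

For a header whose segment field reads 4000h and whose offset field reads 0000h, `linear_address` would come out as 40000h.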