您好,登錄后才能下訂單哦!
no zuo no die系列,來自于pg的wiki。
這一節的內容是:不要使用char(n) 。
理由是:
Any string you insert into a char(n) field will be padded with spaces to the declared width. That’s probably not what you actually want.
The manual says:
Values of type character are physically padded with spaces to the specified width n, and are stored and displayed that way. However, trailing spaces are treated as semantically insignificant and disregarded when comparing two values of type character. In collations where whitespace is significant, this behavior can produce unexpected results; for example SELECT ‘a ‘::CHAR(2) collate “C” < E’a\n’::CHAR(2) returns true, even though C locale would consider a space to be greater than a newline. Trailing spaces are removed when converting a character value to one of the other string types. Note that trailing spaces are semantically significant in character varying and text values, and when using pattern matching, that is LIKE and regular expressions.
That should scare you off it.
The space-padding does waste space, but doesn’t make operations on it any faster; in fact the reverse, thanks to the need to strip spaces in many contexts.
It’s important to note that from a storage point of view char(n) is not a fixed-width type. The actual number of bytes varies since characters may take more than one byte, and the stored values are therefore treated as variable-length anyway (even though the space padding is included in the storage).
原因是期望指定n長,但由于字符編碼(如中文字符,GB2312是2個字節,而UTF8是3個字節)的原因,實際跟預想的不符,而且會出現影響排序等其他副作用。
testdb=# drop table if exists t_char;
DROP TABLE
testdb=# create table t_char(id int,c1 char(10),c2 varchar(10));
CREATE TABLE
testdb=#
testdb=# insert into t_char values(1,'測試123','123123');
INSERT 0 1
testdb=# insert into t_char values(2,'abc123','123123');
INSERT 0 1
testdb=#
testdb=# insert into t_char values(3,'a','a ');
INSERT 0 1
testdb=# insert into t_char values(4,E'a\n',E'a\n');
INSERT 0 1
testdb=#
使用length函數獲取長度
testdb=# select id,length(c1),length(c2) from t_char order by id;
id | length | length
----+--------+--------
1 | 5 | 6
2 | 6 | 6
3 | 1 | 2
4 | 2 | 2
(4 rows)
如上所述,使用length函數獲取的實際是字符個數而不是實際的字節數,如“測試123”實際的字節數是9+5=14字節。
testdb=# select id,length(c1),octet_length(c1),length(c2),octet_length(c2) from t_char order by id;
id | length | octet_length | length | octet_length
----+--------+--------------+--------+--------------
1 | 5 | 14 | 6 | 6
2 | 6 | 10 | 6 | 6
3 | 1 | 10 | 2 | 2
4 | 2 | 10 | 2 | 2
(4 rows)
在字符串比較上面,雖然空格的ascii碼值(0x20)比’\n’(0x0a)要大,但查詢的實際效果看起來卻是char(10)定義的’a’比’a\n’要小:
testdb=# select E'a\n'::bytea;
bytea
--------
\x610a
(1 row)
testdb=# select E'a '::bytea;
bytea
--------
\x6120
(1 row)
testdb=# select * from t_char where c1 < E'a\n';
id | c1 | c2
----+------------+----
3 | a | a
(1 row)
參考資料
Don’t Do This
免責聲明:本站發布的內容(圖片、視頻和文字)以原創、轉載和分享為主,文章觀點不代表本網站立場,如果涉及侵權請聯系站長郵箱:is@yisu.com進行舉報,并提供相關證據,一經查實,將立刻刪除涉嫌侵權內容。