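This article explains in detail how to read selected columns of a CSV file with pandas. I found it quite practical, so I am sharing it here for reference, and I hope you get something out of it.
I wanted to read only the first few columns because a CSV file I had on hand happens to have several trailing columns that hold no usable data but are always present. The original data looks like this: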
GreydeMac-mini:chapter06 greyzhang$ cat data.csv
1,name_01,coment_01,,,,
2,name_02,coment_02,,,,
3,name_03,coment_03,,,,
4,name_04,coment_04,,,,
5,name_05,coment_05,,,,
6,name_06,coment_06,,,,
7,name_07,coment_07,,,,
8,name_08,coment_08,,,,
9,name_09,coment_09,,,,
10,name_10,coment_10,,,,
11,name_11,coment_11,,,,
12,name_12,coment_12,,,,
13,name_13,coment_13,,,,
14,name_14,coment_14,,,,
15,name_15,coment_15,,,,
16,name_16,coment_16,,,,
17,name_17,coment_17,,,,
18,name_18,coment_18,,,,
19,name_19,coment_19,,,,
20,name_20,coment_20,,,,
21,name_21,coment_21,,,,
If pandas reads all of the data, printing it gives the following result:
In [41]: data = pd.read_csv('data.csv')
In [42]: data
Out[42]:
     1  name_01  coment_01  Unnamed: 3  Unnamed: 4  Unnamed: 5  Unnamed: 6
0    2  name_02  coment_02         NaN         NaN         NaN         NaN
1    3  name_03  coment_03         NaN         NaN         NaN         NaN
2    4  name_04  coment_04         NaN         NaN         NaN         NaN
3    5  name_05  coment_05         NaN         NaN         NaN         NaN
4    6  name_06  coment_06         NaN         NaN         NaN         NaN
5    7  name_07  coment_07         NaN         NaN         NaN         NaN
6    8  name_08  coment_08         NaN         NaN         NaN         NaN
7    9  name_09  coment_09         NaN         NaN         NaN         NaN
8   10  name_10  coment_10         NaN         NaN         NaN         NaN
9   11  name_11  coment_11         NaN         NaN         NaN         NaN
10  12  name_12  coment_12         NaN         NaN         NaN         NaN
11  13  name_13  coment_13         NaN         NaN         NaN         NaN
12  14  name_14  coment_14         NaN         NaN         NaN         NaN
13  15  name_15  coment_15         NaN         NaN         NaN         NaN
14  16  name_16  coment_16         NaN         NaN         NaN         NaN
15  17  name_17  coment_17         NaN         NaN         NaN         NaN
16  18  name_18  coment_18         NaN         NaN         NaN         NaN
17  19  name_19  coment_19         NaN         NaN         NaN         NaN
18  20  name_20  coment_20         NaN         NaN         NaN         NaN
19  21  name_21  coment_21         NaN         NaN         NaN         NaN
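One detail in this output is worth noting: data.csv has no header row, so read_csv by default uses its first line (1,name_01,coment_01) as the column names, and only 20 of the 21 records appear as data. A minimal sketch of the difference follows. It assumes the sample is rebuilt in memory with io.StringIO, and the csv_text variable is illustrative rather than part of the original session:

```python
import io

import pandas as pd

# Rebuild the 21-line sample in memory so the sketch does not need data.csv.
csv_text = "".join(f"{i},name_{i:02d},coment_{i:02d},,,,\n" for i in range(1, 22))

# Default behaviour: the first line is consumed as the header row.
with_header = pd.read_csv(io.StringIO(csv_text))

# header=None keeps every line as data and labels the columns 0..6.
no_header = pd.read_csv(io.StringIO(csv_text), header=None)

print(with_header.shape)  # (20, 7)
print(no_header.shape)    # (21, 7)
```

Whether to pass header=None depends on whether the first line really is data. For this file it appears to be.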
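Although this does not get in my way while learning, after spending a long time in a command-line terminal I tend to prefer a cleaner look. The usecols parameter of read_csv can reduce this clutter to some extent.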
In [45]: data = pd.read_csv('data.csv',usecols=[0,1,2,3])
In [46]: data
Out[46]:
     1  name_01  coment_01  Unnamed: 3
0    2  name_02  coment_02         NaN
1    3  name_03  coment_03         NaN
2    4  name_04  coment_04         NaN
3    5  name_05  coment_05         NaN
4    6  name_06  coment_06         NaN
5    7  name_07  coment_07         NaN
6    8  name_08  coment_08         NaN
7    9  name_09  coment_09         NaN
8   10  name_10  coment_10         NaN
9   11  name_11  coment_11         NaN
10  12  name_12  coment_12         NaN
11  13  name_13  coment_13         NaN
12  14  name_14  coment_14         NaN
13  15  name_15  coment_15         NaN
14  16  name_16  coment_16         NaN
15  17  name_17  coment_17         NaN
16  18  name_18  coment_18         NaN
17  19  name_19  coment_19         NaN
18  20  name_20  coment_20         NaN
19  21  name_21  coment_21         NaN
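Where the real data ends can also be checked programmatically rather than by eye. The sketch below assumes the same 21-line sample rebuilt in memory (csv_text is an illustrative name) and passes header=None, which the original session does not use, so that columns are labeled by position:

```python
import io

import pandas as pd

# Rebuild the 21-line sample in memory so the sketch does not need data.csv.
csv_text = "".join(f"{i},name_{i:02d},coment_{i:02d},,,,\n" for i in range(1, 22))

# Keep the three data columns plus the first empty one, selected by position.
subset = pd.read_csv(io.StringIO(csv_text), header=None, usecols=[0, 1, 2, 3])

print(subset.columns.tolist())  # [0, 1, 2, 3]
print(subset[3].isna().all())   # True: column 3 holds no data at all
print(subset[2].isna().any())   # False: column 2 is fully populated
```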
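To make the "boundary" of the data visible, the read with usecols=[0,1,2,3] deliberately kept the first column of invalid data. In normal use we would probably want to drop that last column from the result as well, which only requires removing its column number from the parameter.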
In [47]: data = pd.read_csv('data.csv',usecols=[0,1,2])
In [48]: data
Out[48]:
     1  name_01  coment_01
0    2  name_02  coment_02
1    3  name_03  coment_03
2    4  name_04  coment_04
3    5  name_05  coment_05
4    6  name_06  coment_06
5    7  name_07  coment_07
6    8  name_08  coment_08
7    9  name_09  coment_09
8   10  name_10  coment_10
9   11  name_11  coment_11
10  12  name_12  coment_12
11  13  name_13  coment_13
12  14  name_14  coment_14
13  15  name_15  coment_15
14  16  name_16  coment_16
15  17  name_17  coment_17
16  18  name_18  coment_18
17  19  name_19  coment_19
18  20  name_20  coment_20
19  21  name_21  coment_21
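As a variation on the same idea, the sketch below combines usecols with explicit column labels, and shows an alternative for when the number of empty trailing columns is not known in advance. The labels id, name and comment are hypothetical (the original file does not name its columns), and the sample is again rebuilt in memory instead of being read from data.csv:

```python
import io

import pandas as pd

# Rebuild the 21-line sample in memory so the sketch does not need data.csv.
csv_text = "".join(f"{i},name_{i:02d},coment_{i:02d},,,,\n" for i in range(1, 22))

# Select the first three columns by position and give them readable labels.
# The labels are hypothetical; header=None keeps the first line as data.
named = pd.read_csv(
    io.StringIO(csv_text),
    header=None,
    usecols=[0, 1, 2],
    names=["id", "name", "comment"],
)

# Alternative: read everything, then drop columns whose values are all missing.
trimmed = pd.read_csv(io.StringIO(csv_text), header=None).dropna(axis=1, how="all")

print(named.shape)               # (21, 3)
print(trimmed.columns.tolist())  # [0, 1, 2]
```

usecols avoids parsing the unwanted columns in the first place, while dropna needs no knowledge of the file layout. Which one fits better depends on how stable the file format is.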
That is all for this article on how to read selected columns of a CSV file with pandas. I hope the content above is of some help and that you learn something from it.