This article explains how to draw scatter plots with the scatter method in matplotlib, starting from a minimal example, moving on to a styled plot, and ending with the function's parameters.
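1. The simplest plot
Drawing scatter plots is a common need in data analysis. matplotlib is the best-known plotting library for Python, and its scatter method makes scatter plots straightforward. Let's start with the simplest possible scatter plot.
The data format is as follows: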
0 746403
1 1263043
2 982360
3 1202602
...
The first column is the X coordinate and the second column is the Y coordinate. The plot is drawn as follows.
#!/usr/bin/env python
# coding: utf-8
import matplotlib.pyplot as plt


def pltpicture():
    file = "xxx"  # path to the data file
    xlist = []
    ylist = []
    with open(file, "r") as f:
        for line in f:
            lines = line.strip().split()
            # Skip malformed rows and rows whose Y value is below 100000.
            if len(lines) != 2 or int(lines[1]) < 100000:
                continue
            x, y = int(lines[0]), int(lines[1])
            xlist.append(x)
            ylist.append(y)
    plt.xlabel('X')
    plt.ylabel('Y')
    plt.scatter(xlist, ylist)
    plt.show()


# The function must be called for the plot to appear.
if __name__ == "__main__":
    pltpicture()
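The parsing and plotting steps in the listing above can be separated so that the row filter is easy to check on its own. The following is only a minimal sketch: `parse_points`, `plot_points`, and the `min_y` parameter are illustrative names that do not appear in the original, and the default threshold simply mirrors the 100000 cutoff used above.

```python
def parse_points(lines, min_y=100000):
    """Return (xs, ys) parsed from whitespace-separated "x y" lines.

    Lines that do not have exactly two fields, or whose y value is
    below min_y, are skipped (the same rule as the listing above).
    """
    xs, ys = [], []
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue
        x, y = int(fields[0]), int(fields[1])
        if y < min_y:
            continue
        xs.append(x)
        ys.append(y)
    return xs, ys


def plot_points(xs, ys):
    """Draw the points with default scatter settings."""
    # Imported here so parse_points can be used without matplotlib.
    import matplotlib.pyplot as plt

    plt.scatter(xs, ys)
    plt.xlabel("X")
    plt.ylabel("Y")
    plt.show()


# Sample rows in the data format shown earlier, plus three rows that are skipped.
sample = ["0 746403", "1 1263043", "2 50", "malformed line here", ""]
xs, ys = parse_points(sample)
print(xs, ys)
```

Calling `plot_points(xs, ys)` then draws the figure. Because `parse_points` accepts any iterable of strings, an open file object can be passed to it directly.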
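2. A nicer-looking plot
The plot above is rough: it is the simplest possible form and sets no configuration options. Below we use another data set to draw a better-looking figure.
The data set is publicly available online, and its format is as follows: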
40920 8.326976 0.953952 3
14488 7.153469 1.673904 2
26052 1.441871 0.805124 1
75136 13.147394 0.428964 1
...
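Column 1: frequent-flyer miles earned per year;
Column 2: percentage of time spent playing video games;
Column 3: liters of ice cream consumed per week;
Column 4: the label:
1 means a person who is disliked
2 means a person of average charm
3 means a person of great charm
Now take the frequent-flyer miles earned per year as the X coordinate and the percentage of time spent playing video games as the Y coordinate, and draw the plot.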
from matplotlib import pyplot as plt

file = "/home/mi/wanglei/data/datingTestSet2.txt"
label1X, label1Y, label2X, label2Y, label3X, label3Y = [], [], [], [], [], []
with open(file, "r") as f:
    for line in f:
        lines = line.strip().split()
        if len(lines) != 4:
            continue
        # Convert to float so the axes are numeric rather than categorical.
        distance, rate, label = float(lines[0]), float(lines[1]), lines[3]
        if label == "1":
            label1X.append(distance)
            label1Y.append(rate)
        elif label == "2":
            label2X.append(distance)
            label2Y.append(rate)
        elif label == "3":
            label3X.append(distance)
            label3Y.append(rate)

plt.figure(figsize=(8, 5), dpi=80)
axes = plt.subplot(111)
# One scatter call per label, each with its own size and color.
label1 = axes.scatter(label1X, label1Y, s=20, c="red")
label2 = axes.scatter(label2X, label2Y, s=40, c="green")
label3 = axes.scatter(label3X, label3Y, s=50, c="blue")
plt.xlabel("every year fly distance")
plt.ylabel("play video game rate")
# loc=2 places the legend in the upper-left corner.
axes.legend((label1, label2, label3),
            ("don't like", "attraction common", "attraction perfect"),
            loc=2)
plt.show()
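The resulting figure shows the three classes as red, green, and blue points of increasing size, with the legend in the upper-left corner.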
3. The scatter function in detail
Let's look at the signature of the scatter function:
def scatter(self, x, y, s=None, c=None, marker=None, cmap=None, norm=None,
            vmin=None, vmax=None, alpha=None, linewidths=None,
            verts=None, edgecolors=None,
            **kwargs):
    """
    Make a scatter plot of `x` vs `y`

    Marker size is scaled by `s` and marker color is mapped to `c`

    Parameters
    ----------
    x, y : array_like, shape (n, )
        Input data

    s : scalar or array_like, shape (n, ), optional
        size in points^2. Default is `rcParams['lines.markersize'] ** 2`.

    c : color, sequence, or sequence of color, optional, default: 'b'
        `c` can be a single color format string, or a sequence of color
        specifications of length `N`, or a sequence of `N` numbers to be
        mapped to colors using the `cmap` and `norm` specified via kwargs
        (see below). Note that `c` should not be a single numeric RGB or
        RGBA sequence because that is indistinguishable from an array of
        values to be colormapped. `c` can be a 2-D array in which the
        rows are RGB or RGBA, however, including the case of a single
        row to specify the same color for all points.

    marker : `~matplotlib.markers.MarkerStyle`, optional, default: 'o'
        See `~matplotlib.markers` for more information on the different
        styles of markers scatter supports. `marker` can be either
        an instance of the class or the text shorthand for a particular
        marker.

    cmap : `~matplotlib.colors.Colormap`, optional, default: None
        A `~matplotlib.colors.Colormap` instance or registered name.
        `cmap` is only used if `c` is an array of floats. If None,
        defaults to rc `image.cmap`.

    norm : `~matplotlib.colors.Normalize`, optional, default: None
        A `~matplotlib.colors.Normalize` instance is used to scale
        luminance data to 0, 1. `norm` is only used if `c` is an array of
        floats. If `None`, use the default :func:`normalize`.

    vmin, vmax : scalar, optional, default: None
        `vmin` and `vmax` are used in conjunction with `norm` to normalize
        luminance data. If either are `None`, the min and max of the
        color array is used. Note if you pass a `norm` instance, your
        settings for `vmin` and `vmax` will be ignored.

    alpha : scalar, optional, default: None
        The alpha blending value, between 0 (transparent) and 1 (opaque)

    linewidths : scalar or array_like, optional, default: None
        If None, defaults to (lines.linewidth,).

    verts : sequence of (x, y), optional
        If `marker` is None, these vertices will be used to
        construct the marker. The center of the marker is located
        at (0,0) in normalized units. The overall marker is rescaled
        by ``s``.

    edgecolors : color or sequence of color, optional, default: None
        If None, defaults to 'face'

        If 'face', the edge color will always be the same as
        the face color.

        If it is 'none', the patch boundary will not
        be drawn.

        For non-filled markers, the `edgecolors` kwarg
        is ignored and forced to 'face' internally.

    Returns
    -------
    paths : `~matplotlib.collections.PathCollection`

    Other parameters
    ----------------
    kwargs : `~matplotlib.collections.Collection` properties

    See Also
    --------
    plot : to plot scatter plots when markers are identical in size and
        color

    Notes
    -----

    * The `plot` function will be faster for scatterplots where markers
      don't vary in size or color.

    * Any or all of `x`, `y`, `s`, and `c` may be masked arrays, in which
      case all masks will be combined and only unmasked points will be
      plotted.

      Fundamentally, scatter works with 1-D arrays; `x`, `y`, `s`, and `c`
      may be input as 2-D arrays, but within scatter they will be
      flattened. The exception is `c`, which will be flattened only if its
      size matches the size of `x` and `y`.

    Examples
    --------
    .. plot:: mpl_examples/shapes_and_collections/scatter_demo.py

    """
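The main parameters have the following meanings:
x and y are arrays of the same length.
s is either a scalar or an array of the same length as x and y, and sets the marker size as an area in points^2. According to the docstring above, the default is rcParams['lines.markersize'] ** 2 (36 with the stock markersize of 6), not a fixed 20.
c is the color of the points: a single color string, a sequence of colors, or a sequence of numbers mapped to colors through cmap and norm.
marker is the shape of the points, with 'o' (a circle) as the default.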
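The parameters described above can be combined so that both marker size and color follow a third variable. The following is a minimal sketch rather than a recommended recipe: `areas_from_values` and `demo` are hypothetical helpers introduced here, the 20 to 200 points^2 range is an arbitrary assumption, and the data in `demo` is made up.

```python
def areas_from_values(values, min_area=20.0, max_area=200.0):
    """Linearly rescale values to marker areas (points^2) for the s argument."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal: use one mid-range area for every point.
        return [(min_area + max_area) / 2.0] * len(values)
    scale = (max_area - min_area) / (hi - lo)
    return [min_area + (v - lo) * scale for v in values]


def demo():
    """Scatter plot whose marker size and color both follow z."""
    # Imported here so areas_from_values can be used without matplotlib.
    import matplotlib.pyplot as plt

    x = [1, 2, 3, 4, 5]
    y = [2.0, 3.5, 1.0, 4.2, 3.0]
    z = [10, 40, 25, 5, 50]  # made-up third variable

    fig, ax = plt.subplots()
    points = ax.scatter(
        x, y,
        s=areas_from_values(z),  # one area per point
        c=z,                     # numbers are mapped to colors through cmap
        cmap="viridis",
        marker="^",              # triangles instead of the default circles
        alpha=0.7,               # partly transparent so overlaps stay visible
    )
    fig.colorbar(points, ax=ax, label="z")
    plt.show()


print(areas_from_values([0, 5, 10]))
```

Because s is an area, doubling a value in s does not double the marker's width. Call `demo()` to display the figure.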
That covers how to draw scatter plots with the scatter method in matplotlib.