ODPS Functions

Published: 2020-10-23 02:14:57 | Source: Internet | Views: 1604 | Author: 猿程序G | Category: Cloud Computing

Common functions

Built-in functions

coalesce(): returns the first non-NULL value in the argument list; if every value in the list is NULL, it returns NULL.

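The rule above can be illustrated with a small Python sketch in which `None` stands in for NULL. The `coalesce` helper here is an ordinary Python function written for this illustration, not an ODPS API.

```python
def coalesce(*values):
    """Return the first argument that is not None, or None if there is none."""
    for value in values:
        if value is not None:
            return value
    return None


print(coalesce(None, None, "fallback"))  # the first non-None argument wins
print(coalesce(None, None))              # every argument is None
```

Note that a value such as 0 or an empty string still counts as non-NULL and is returned.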

concat(): string concatenation function.


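least(): returns the smallest of the input arguments.

greatest(): returns the largest of the input arguments. var1 and var2 may be bigint, double, datetime, or string. If every value is NULL, the result is NULL.

Return value: the largest of the input arguments; when no implicit conversion takes place, the result has the same type as the inputs. NULL is treated as the smallest value. When the argument types differ, comparisons among double, bigint, and string are carried out as double, and comparisons between string and datetime are carried out as datetime. No other implicit conversions are allowed.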
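The NULL handling described for greatest() can be sketched in Python, with `None` standing in for NULL. This is only an emulation for arguments of one type; the implicit conversions between double, bigint, string, and datetime are not modeled.

```python
def greatest(*values):
    """Largest non-None argument; None only when every argument is None."""
    # Treating NULL as the smallest value means it can never win
    # unless nothing else is present.
    present = [v for v in values if v is not None]
    return max(present) if present else None


print(greatest(3, None, 7))
print(greatest(None, None))
```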

decode(): implements branch selection.

e.g.:

    select decode(customer_id,
                  1, 'Taobao',
                  2, 'Alipay',
                  3, 'Aliyun',
                  NULL, 'N/A',
                  'Others') as result
    from sale_detail;

The decode call above does the same job as the following if-then-else statement:

    if customer_id = 1 then
        result := 'Taobao';
    elsif customer_id = 2 then
        result := 'Alipay';
    elsif customer_id = 3 then
        result := 'Aliyun';
    ...
    else
        result := 'Others';
    end if;
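For readers who want to trace the branch selection step by step, here is a Python sketch of the same lookup. It assumes that a trailing unpaired argument is the default and that a NULL search value matches a NULL expression, as the `NULL, 'N/A'` pair in the example suggests. The `decode` and `label` functions are written for this illustration only.

```python
def decode(expression, *args):
    """Emulate decode(expression, search1, result1, ..., [default])."""
    # An odd number of trailing arguments means the last one is the default.
    pairs, default = (args[:-1], args[-1]) if len(args) % 2 else (args, None)
    for search, result in zip(pairs[0::2], pairs[1::2]):
        # Assumption: NULL matches NULL here; Python's == on None agrees.
        if expression == search:
            return result
    return default


def label(customer_id):
    """The mapping used in the sale_detail example."""
    return decode(customer_id,
                  1, "Taobao",
                  2, "Alipay",
                  3, "Aliyun",
                  None, "N/A",
                  "Others")


for customer_id in (1, 3, None, 42):
    print(customer_id, label(customer_id))
```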

if function: if(condition, column1, column2) returns the value of column1 when the condition holds and the value of column2 otherwise.

e.g.: if(cap_direction not in ('0','1'), null, cast(cap_direction as bigint));
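The effect of that expression on individual values can be sketched in Python. The function name `clean_direction` is made up for this illustration, and `None` stands in for NULL.

```python
def clean_direction(cap_direction):
    """Return 0 or 1 as an int, or None for any other input."""
    # Mirrors: if(cap_direction not in ('0','1'), null, cast(... as bigint))
    return None if cap_direction not in ("0", "1") else int(cap_direction)


for raw in ("0", "1", "2", None):
    print(repr(raw), "->", clean_direction(raw))
```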

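substr(): returns the substring of str that starts at start_position and is length characters long; as the first example shows, omitting length returns everything through the end of the string.

e.g.: substr("abc", 2) = "bc"; substr("abc", 2, 1) = "b";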
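Because positions are counted from 1 rather than 0, the two results above are easy to get wrong when porting logic to another language. A minimal Python sketch of the same indexing, assuming start_position is at least 1 (other start positions are not modeled):

```python
def substr(text, start_position, length=None):
    """1-based substring, mirroring the two examples above."""
    begin = start_position - 1  # SQL positions start at 1, Python indexes at 0
    end = None if length is None else begin + length
    return text[begin:end]


print(substr("abc", 2))
print(substr("abc", 2, 1))
```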

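to_char(): converts a Boolean, bigint, decimal, or double value to its string representation.

e.g.: to_char(123) = '123'; to_char(true) = 'TRUE'; to_char(1.23) = '1.23'; to_char(null) = NULL;

to_char() for dates: the first argument is the Datetime value to convert. A string input is implicitly converted to datetime before it is evaluated; any other type raises an exception.

e.g.: to_char(getdate(), 'yyyymmdd')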
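As a point of comparison, the 'yyyymmdd' pattern corresponds to `%Y%m%d` in Python's `strftime`. The sketch below uses a fixed timestamp instead of the current time so the result is reproducible; the function name is invented for this illustration.

```python
from datetime import datetime


def to_char_yyyymmdd(moment):
    """Format a datetime the way the 'yyyymmdd' pattern does."""
    return moment.strftime("%Y%m%d")


print(to_char_yyyymmdd(datetime(2020, 10, 23, 2, 14, 57)))
```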

concat(column1, ',', column2): string concatenation function.

Matching coordinates to two decimal places:

concat(substr(to_char(lng),1,6), ',', substr(to_char(lat),1,5)) like '120.08,30.28';
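The following Python sketch reproduces that key construction so its behavior can be checked on sample values. It assumes `to_char` yields the plain decimal text of the number, and `coordinate_key` is a name chosen for this illustration.

```python
def coordinate_key(lng, lat):
    """Mimic concat(substr(to_char(lng),1,6), ',', substr(to_char(lat),1,5))."""
    return str(lng)[:6] + "," + str(lat)[:5]


print(coordinate_key(120.0812345, 30.2898765))
```

The approach truncates rather than rounds, and the fixed lengths 6 and 5 yield two decimals only when the longitude has three integer digits and the latitude has two.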

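regexp_extract(column, 'pattern', number): string extraction function; it returns the text captured by group number of the regular expression.

For example, with inter_name = 臨東路與火神塘路交叉口 (the intersection of Lindong Road and Huoshentang Road):

regexp_extract(inter_name,'(.*?)(路)',1) = 臨東

regexp_extract(inter_name,'與(.*?)(交叉口)',1) = 火神塘路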
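The two patterns can be tried out with Python's `re` module, which handles lazy quantifiers and capture groups in the same way for these simple cases. The `regexp_extract` wrapper below is a stand-in written for this illustration; returning `None` when nothing matches is a choice made here, not a statement about ODPS.

```python
import re


def regexp_extract(source, pattern, group=1):
    """Return the given capture group of the first match, or None."""
    match = re.search(pattern, source)
    return match.group(group) if match else None


inter_name = "臨東路與火神塘路交叉口"
print(regexp_extract(inter_name, "(.*?)(路)", 1))
print(regexp_extract(inter_name, "與(.*?)(交叉口)", 1))
```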

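regexp_replace: string replacement function.

regexp_replace(round_name,'-','',1) replaces the first '-' in round_name with an empty string, which removes it (the replacement is an empty string, not NULL).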
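A Python sketch of this call, under the assumption that the fourth argument is an occurrence index where 0 means every match and n means only the n-th match. The wrapper is illustrative and the sample string is made up.

```python
import re


def regexp_replace(source, pattern, replacement, occurrence=0):
    """Replace every match when occurrence is 0, else only the occurrence-th."""
    if occurrence == 0:
        return re.sub(pattern, replacement, source)
    seen = 0

    def swap(match):
        nonlocal seen
        seen += 1
        # Leave every match untouched except the requested one.
        return match.expand(replacement) if seen == occurrence else match.group(0)

    return re.sub(pattern, swap, source)


print(regexp_replace("a-b-c", "-", "", 1))
print(regexp_replace("a-b-c", "-", "", 0))
```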

split_part: string splitting function.

split_part('環北-密渡橋','-',2) = 密渡橋
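In Python terms this is a split followed by a 1-based index. The sketch below covers only the three-argument form shown above; returning an empty string for an out-of-range index is an assumption made for this illustration.

```python
def split_part(text, separator, index):
    """Return the index-th piece (1-based) of text split on separator."""
    parts = text.split(separator)
    # Assumption: an index past the last piece yields an empty string.
    return parts[index - 1] if 1 <= index <= len(parts) else ""


print(split_part("環北-密渡橋", "-", 2))
```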

instr: returns the position of substring str2 within string str1.

instr('Tech on the net', 'e') = 2; instr('Tech on the net', 'e', 1, 1) = 2
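The four-argument form can be read as instr(str1, str2, start_position, nth_appearance). A Python sketch under that reading, with 1-based positions and 0 returned when there is no such occurrence (both are assumptions of this illustration):

```python
def instr(str1, str2, start_position=1, nth_appearance=1):
    """1-based position of the nth occurrence of str2 in str1, or 0."""
    index = start_position - 1
    for _ in range(nth_appearance):
        index = str1.find(str2, index)
        if index == -1:
            return 0
        index += 1  # doubles as the 1-based position and the next search start
    return index


print(instr("Tech on the net", "e"))
print(instr("Tech on the net", "e", 1, 2))
```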

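cast: type conversion, e.g. cast(cap_direction as bigint).

coors_convert(lng,lat,1): converts Google coordinates to AutoNavi (Gaode) coordinates, e.g. coors_convert(120.2334214,30.21829241,1)

WHERE judge_location(split_part(coors_convert(a.lng,a.lat,1),',',1),split_part(coors_convert(a.lng,a.lat,1),',',2))=1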

Window functions

Statistics: count, sum, avg, max/min, median, stddev, stddev_samp

Ranking: row_number, rank, dense_rank, percent_rank

Other: lag, lead, cluster_sample

--------------------

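Basic usage: splitting the data into groups according to some condition is called windowing, and each group is called a window.

The partition by clause specifies the columns used for windowing.

Rows with the same values in the partition columns are considered to be in the same window.

order by specifies how the rows inside a window are sorted.

Restrictions: window functions may appear only in the select clause.

Do not nest window functions or aggregate functions inside a window function.

Window functions cannot be used together with aggregate functions at the same level.

A single ODPS SQL statement may use at most 5 window functions.

When windowing with partition by, a single window may contain at most 100 million rows.

When windowing with rows, the bounds x and y must be integer constants greater than or equal to 0, in the range 0 to 10000; a value of 0 means the current row.

The rows form of window range can be used only together with order by.

Not every window function supports rows; the ones that do are avg, count, max, min, stddev, and sum.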
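To make the rows frame concrete, the Python sketch below computes a sum over "x rows preceding through y rows following" for the already sorted rows of a single window. It only illustrates the frame arithmetic; the function name and sample data are invented.

```python
def rows_window_sum(values, x, y):
    """Sum each row together with its x preceding and y following rows."""
    result = []
    for i in range(len(values)):
        lo = max(0, i - x)                 # the frame is clipped at the window start
        hi = min(len(values), i + y + 1)   # and at the window end
        result.append(sum(values[lo:hi]))
    return result


ordered = [1, 2, 3, 4]  # rows of one window, already sorted by the order by column
print(rows_window_sum(ordered, 1, 0))
print(rows_window_sum(ordered, 1, 1))
```

With x and y both 0 the frame is just the current row, which matches the statement that 0 means the current row.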

----------------------

An example:

select *, rank() over (partition by monitor_id order by distance) as mindistance_monitor_id from ()
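Since the subquery in the statement above is left out, here is a self-contained variant that can be run locally with Python's built-in `sqlite3` module (SQLite 3.25 or newer is needed for window functions). The `candidates` table, the `point_id` column, and the sample rows are hypothetical, and SQLite stands in for ODPS only to show how rank() numbers the rows inside each partition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table candidates (point_id text, monitor_id text, distance real)"
)
conn.executemany(
    "insert into candidates values (?, ?, ?)",
    [("p1", "m1", 30.0), ("p2", "m1", 12.5), ("p3", "m1", 12.5), ("p4", "m2", 8.0)],
)

rows = conn.execute(
    """
    select point_id, monitor_id, distance,
           rank() over (partition by monitor_id order by distance)
               as mindistance_monitor_id
    from candidates
    order by monitor_id, distance, point_id
    """
).fetchall()

for row in rows:
    print(row)
```

Rows that tie on distance share a rank and the following rank is skipped; filtering on mindistance_monitor_id = 1 in an outer query would keep the nearest row or rows for each monitor_id.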

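User-defined functions

Building custom functions on Alibaba Cloud ODPS

Note: the ODPS version used in this example is quite old, so the function was not built by following the Alibaba Cloud example step by step with Maven packaging. The JAR was packaged manually instead, because the JAR produced by following the example was faulty (it contained no class files, only configuration resources).

Glossary:

UDF: user-defined scalar function. Its input and output are one-to-one: it reads one row of data (which may supply several arguments) and writes one output value.

UDTF: user-defined table-valued function. It handles the case where a single function call must output multiple rows, and it is the only kind of user-defined function that can return multiple fields.

UDAF: user-defined aggregation function. Its input and output are many-to-one: it aggregates multiple input records into one output value (and can be used together with a group by clause).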
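The three input/output shapes can be pictured with plain Python functions. This is a conceptual sketch only: it does not use the ODPS UDF interfaces, and all three function names are invented.

```python
def udf_add_one(value):
    """UDF shape: one input row produces exactly one output value."""
    return value + 1


def udtf_explode(csv_text):
    """UDTF shape: one input row may produce several output rows."""
    for position, item in enumerate(csv_text.split(","), start=1):
        yield (position, item)  # each output row can carry several fields


def udaf_total(values):
    """UDAF shape: many input rows collapse into a single output value."""
    total = 0
    for value in values:
        total += value
    return total


print(udf_add_one(41))
print(list(udtf_explode("a,b,c")))
print(udaf_total([1, 2, 3]))
```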
