This article explains what the Definitions section does in PostgreSQL's Flex scanner file. The material is short and easy to follow, so let's work through it step by step.
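A Flex input file is made up of four parts:
%{
Declarations
%}
Definitions
%%
Rules
%%
User subroutines
(The flex manual itself describes three sections separated by %% and treats the %{ ... %} block as part of the definitions section; the four-part view used here simply lists that block separately.)
The part between Declarations and Rules is the Definitions section. It is where named "macros" for regular expressions are defined, and those names can then be used in the Rules section. For example: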
newline [\n\r]
With this definition in place, a rule can write {newline} (the name wrapped in braces) wherever it means [\n\r].
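To make the macro idea concrete, here is a minimal Python sketch of how a {name} reference can be expanded into the pattern it stands for. It is an illustration under stated assumptions, not how flex is implemented: the `expand` helper and the `DEFINITIONS` table are hypothetical names, the patterns are transcribed into Python `re` syntax, and flex's quoted literal "--" is written as a plain -- because Python has no quoted-literal syntax.

```python
import re

# A few definitions taken from the article, transcribed into Python regex
# syntax (assumption: flex's "--" literal becomes a plain --).
DEFINITIONS = {
    "space": r"[ \t\n\r\f]",
    "newline": r"[\n\r]",
    "non_newline": r"[^\n\r]",
    "comment": r"(--{non_newline}*)",
    "whitespace": r"({space}+|{comment})",
}

# A name reference is {identifier}; repetition counts such as {1,3} start
# with a digit and are therefore left untouched.
NAME_REF = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand(pattern, definitions=DEFINITIONS):
    """Recursively replace each {name} with its definition, grouped."""
    def substitute(match):
        name = match.group(1)
        # Group the expansion so a following * or + applies to all of it.
        return "(?:" + expand(definitions[name], definitions) + ")"
    return NAME_REF.sub(substitute, pattern)


if __name__ == "__main__":
    print(expand("{newline}"))
    print(expand("{whitespace}"))
    print(bool(re.fullmatch(expand("{whitespace}"), "-- a comment")))
```

Grouping each expansion matters: without it, a definition such as {space}+ would apply the + only to the last element of whatever the name expanded to.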
/* Option settings */
%option reentrant
%option bison-bridge
%option bison-locations
%option 8bit
%option never-interactive
%option nodefault
%option noinput
%option nounput
%option noyywrap
%option noyyalloc
%option noyyrealloc
%option noyyfree
%option warn
%option prefix="core_yy"

/*
 * OK, here is a short description of lex/flex rules behavior.
 * The longest pattern which matches an input string is always chosen.
 * For equal-length patterns, the first occurring in the rules list is chosen.
 * INITIAL is the starting state, to which all non-conditional rules apply.
 * Exclusive states change parsing rules while the state is active.  When in
 * an exclusive state, only those rules defined for that state apply.
 *
 * We use exclusive states for quoted strings, extended comments,
 * and to eliminate parsing troubles for numeric strings.
 * Exclusive states:
 *  <xb> bit string literal
 *  <xc> extended C-style comments
 *  <xd> delimited identifiers (double-quoted identifiers)
 *  <xh> hexadecimal numeric string
 *  <xq> standard quoted strings
 *  <xe> extended quoted strings (support backslash escape sequences)
 *  <xdolq> $foo$ quoted strings
 *  <xui> quoted identifier with Unicode escapes
 *  <xuiend> end of a quoted identifier with Unicode escapes, UESCAPE can follow
 *  <xus> quoted string with Unicode escapes
 *  <xusend> end of a quoted string with Unicode escapes, UESCAPE can follow
 *  <xeu> Unicode surrogate pair in extended quoted string
 *
 * Remember to add an <<EOF>> case whenever you add a new exclusive state!
 * The default one is probably not the right thing.
 */

/* INITIAL is the start state; every other state must be declared with %s or %x */
%x xb
%x xc
%x xd
%x xh
%x xe
%x xq
%x xdolq
%x xui
%x xuiend
%x xus
%x xusend
%x xeu

/*
 * In order to make the world safe for Windows and Mac clients as well as
 * Unix ones, we accept either \n or \r as a newline.  A DOS-style \r\n
 * sequence will be seen as two successive newlines, but that doesn't cause
 * any problems.  Comments that start with -- and extend to the next
 * newline are treated as equivalent to a single whitespace character.
 *
 * NOTE a fine point: if there is no newline following --, we will absorb
 * everything to the end of the input as a comment.  This is correct.  Older
 * versions of Postgres failed to recognize -- as a comment if the input
 * did not end with a newline.
 *
 * XXX perhaps \f (formfeed) should be treated as a newline as well?
 *
 * XXX if you change the set of whitespace characters, fix scanner_isspace()
 * to agree, and see also the plpgsql lexer.
 */

/* \t is tab, \n is line feed, \r is carriage return, \f is form feed */
/* space: blank, tab, line feed, carriage return or form feed */
space           [ \t\n\r\f]
/* horiz_space: blank, tab or form feed */
horiz_space     [ \t\f]
/* newline: line feed or carriage return */
newline         [\n\r]
/* non_newline: any character except line feed and carriage return */
non_newline     [^\n\r]

/* single-line comment */
comment         ("--"{non_newline}*)

/* whitespace: one or more space characters, or a comment */
whitespace      ({space}+|{comment})

/*
 * SQL requires at least one newline in the whitespace separating
 * string literals that are to be concatenated.  Silly, but who are we
 * to argue?  Note that {whitespace_with_newline} should not have * after
 * it, whereas {whitespace} should generally have a * after it...
 */

/* special_whitespace: one or more space characters, or a comment followed by a newline */
special_whitespace      ({space}+|{comment}{newline})
/* horiz_whitespace: one horizontal space character, or a comment */
horiz_whitespace        ({horiz_space}|{comment})
/* zero or more horiz_whitespace, then a newline, then zero or more special_whitespace */
whitespace_with_newline ({horiz_whitespace}*{newline}{special_whitespace}*)

/*
 * To ensure that {quotecontinue} can be scanned without having to back up
 * if the full pattern isn't matched, we include trailing whitespace in
 * {quotestop}.  This matches all cases where {quotecontinue} fails to match,
 * except for {quote} followed by whitespace and just one "-" (not two,
 * which would start a {comment}).  To cover that we have {quotefail}.
 * The actions for {quotestop} and {quotefail} must throw back characters
 * beyond the quote proper.
 */
quote           '
quotestop       {quote}{whitespace}*
quotecontinue   {quote}{whitespace_with_newline}{quote}
quotefail       {quote}{whitespace}*"-"

/* <xb> */
/* Bit string
 * It is tempting to scan the string for only those characters
 * which are allowed. However, this leads to silently swallowed
 * characters if illegal characters are included in the string.
 * For example, if xbinside is [01] then B'ABCD' is interpreted
 * as a zero-length string, and the ABCD' is lost!
 * Better to pass the string forward and let the input routines
 * validate the contents.
 */
/* start: b or B followed by a single quote */
xbstart         [bB]{quote}
/* contents: zero or more characters other than a single quote */
xbinside        [^']*

/* <xh> */
/* Hexadecimal number */
/* start: x or X followed by a single quote */
xhstart         [xX]{quote}
/* contents: zero or more characters other than a single quote */
xhinside        [^']*

/* National character */
/* start: n or N followed by a single quote (no separate exclusive state is declared for it above) */
xnstart         [nN]{quote}

/* <xe> */
/* Quoted string that allows backslash escapes */
/* start: e or E followed by a single quote */
xestart         [eE]{quote}
/* contents: one or more characters other than backslash and single quote */
xeinside        [^\\']+
/* escape: a backslash followed by any character other than 0-7 */
xeescape        [\\][^0-7]
/* octal escape: a backslash followed by one to three digits in 0-7 */
xeoctesc        [\\][0-7]{1,3}
/* hex escape: a backslash, then x, then one or two hex digits */
xehexesc        [\\]x[0-9A-Fa-f]{1,2}
/* Unicode escape: a backslash, then u and exactly 4 hex digits, or U and exactly 8 hex digits */
xeunicode       [\\](u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})
/* incomplete Unicode escape: too few hex digits after u or U */
xeunicodefail   [\\](u[0-9A-Fa-f]{0,3}|U[0-9A-Fa-f]{0,7})

/* <xq> */
/* Extended quote
 * xqdouble implements embedded quote, ''''
 */
xqstart         {quote}
xqdouble        {quote}{quote}
xqinside        [^']+

/* <xdolq> */
/* $foo$ style quotes ("dollar quoting")
 * The quoted string starts with $foo$ where "foo" is an optional string
 * in the form of an identifier, except that it may not contain "$",
 * and extends to the first occurrence of an identical string.
 * There is *no* processing of the quoted text.
 *
 * {dolqfailed} is an error rule to avoid scanner backup when {dolqdelim}
 * fails to match its trailing "$".
 */
/* first character of the tag: a letter, a byte in 0x80-0xFF (octal 200-377), or underscore */
dolq_start      [A-Za-z\200-\377_]
/* later characters of the tag: as dolq_start, plus digits */
dolq_cont       [A-Za-z\200-\377_0-9]
/* delimiter: $tag$, where the tag is optional */
dolqdelim       \$({dolq_start}{dolq_cont}*)?\$
/* failure case: starts with $ and a tag but has no closing $ */
dolqfailed      \${dolq_start}{dolq_cont}*
/* contents: one or more characters other than $ */
dolqinside      [^$]+

/* <xd> */
/* Double quote
 * Allows embedded spaces and other special characters into identifiers.
 */
dquote          \"
/* start and stop: a double quote */
xdstart         {dquote}
xdstop          {dquote}
/* two adjacent double quotes */
xddouble        {dquote}{dquote}
/* contents: one or more characters other than a double quote */
xdinside        [^"]+

/* Unicode escapes */
/* the UESCAPE clause: UESCAPE, optional whitespace, then one quoted character */
uescape         [uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']{quote}
/* error rule to avoid backup */
uescapefail     [uU][eE][sS][cC][aA][pP][eE]{whitespace}*"-"|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*|[uU][eE][sS][cC][aA][pP]|[uU][eE][sS][cC][aA]|[uU][eE][sS][cC]|[uU][eE][sS]|[uU][eE]|[uU]

/* <xui> */
/* Quoted identifier with Unicode escapes */
/* start: u or U, then &, then a double quote */
xuistart        [uU]&{dquote}

/* <xus> */
/* Quoted string with Unicode escapes */
/* start: u or U, then &, then a single quote */
xusstart        [uU]&{quote}

/* Optional UESCAPE after a quoted string or identifier with Unicode escapes. */
xustop1         {uescapefail}?
xustop2         {uescape}

/* error rule to avoid backup */
xufailed        [uU]&

/* <xc> */
/* C-style comments
 *
 * The "extended comment" syntax closely resembles allowable operator syntax.
 * The tricky part here is to get lex to recognize a string starting with
 * slash-star as a comment, when interpreting it as an operator would produce
 * a longer match --- remember lex will prefer a longer match!  Also, if we
 * have something like plus-slash-star, lex will think this is a 3-character
 * operator whereas we want to see it as a + operator and a comment start.
 * The solution is two-fold:
 * 1. append {op_chars}* to xcstart so that it matches as much text as
 *    {operator} would. Then the tie-breaker (first matching rule of same
 *    length) ensures xcstart wins.  We put back the extra stuff with yyless()
 *    in case it contains a star-slash that should terminate the comment.
 * 2. In the operator rule, check for slash-star within the operator, and
 *    if found throw it back with yyless().  This handles the plus-slash-star
 *    problem.
 * Dash-dash comments have similar interactions with the operator rule.
 */
/* start: slash-star followed by zero or more operator characters */
xcstart         \/\*{op_chars}*
/* stop: one or more stars followed by a slash */
xcstop          \*+\/
/* contents: one or more characters other than star and slash */
xcinside        [^*/]+

/* digit: 0-9 */
digit           [0-9]
/* first character of an identifier: a letter, a byte in 0x80-0xFF, or underscore */
ident_start     [A-Za-z\200-\377_]
/* later characters of an identifier: as ident_start, plus digits and $ */
ident_cont      [A-Za-z\200-\377_0-9\$]

identifier      {ident_start}{ident_cont}*

/* Assorted special-case operators and operator-like tokens */
/* type cast operator */
typecast        "::"
/* dot-dot operator */
dot_dot         \.\.
/* assignment operator */
colon_equals    ":="

/*
 * These operator-like tokens (unlike the above ones) also match the {operator}
 * rule, which means that they might be overridden by a longer match if they
 * are followed by a comment start or a + or - character. Accordingly, if you
 * add to this list, you must also add corresponding code to the {operator}
 * block to return the correct token in such cases. (This is not needed in
 * psqlscan.l since the token value is ignored there.)
 */
/* equals-greater */
equals_greater  "=>"
/* less than or equal */
less_equals     "<="
/* greater than or equal */
greater_equals  ">="
/* less-greater (not equal) */
less_greater    "<>"
/* not equal */
not_equals      "!="

/*
 * "self" is the set of chars that should be returned as single-character
 * tokens.  "op_chars" is the set of chars that can make up "Op" tokens,
 * which can be one or more characters long (but if a single-char token
 * appears in the "self" set, it is not to be returned as an Op).  Note
 * that the sets overlap, but each has some chars that are not in the other.
 *
 * If you change either set, adjust the character lists appearing in the
 * rule for "operator"!
 */
self            [,()\[\].;\:\+\-\*\/\%\^\<\>\=]
op_chars        [\~\!\@\#\^\&\|\`\?\+\-\*\/\%\<\>\=]
operator        {op_chars}+

/* we no longer allow unary minus in numbers.
 * instead we pass it separately to parser. there it gets
 * coerced via doNegate() -- Leon aug 20 1999
 *
 * {decimalfail} is used because we would like "1..10" to lex as 1, dot_dot, 10.
 *
 * {realfail1} and {realfail2} are added to prevent the need for scanner
 * backup when the {real} rule fails to match completely.
 */
/* integer: one or more digits */
integer         {digit}+
/* decimal: digits with a decimal point, with digits on at least one side */
decimal         (({digit}*\.{digit}+)|({digit}+\.{digit}*))
/* digits followed by two dots, as in 1..10, which is not a decimal */
decimalfail     {digit}+\.\.
/* real: an integer or decimal followed by an exponent */
real            ({integer}|{decimal})[Ee][-+]?{digit}+
/* incomplete real: exponent marker with nothing after it */
realfail1       ({integer}|{decimal})[Ee]
/* incomplete real: exponent marker and sign with no digits */
realfail2       ({integer}|{decimal})[Ee][-+]

/* positional parameter such as $1 */
param           \${integer}

/* any other single character */
other           .

/*
 * Dollar quoted strings are totally opaque, and no escaping is done on them.
 * Other quoted strings must allow some special characters such as single-quote
 *  and newline.
 * Embedded single-quotes are implemented both in the SQL standard
 *  style of two adjacent single quotes "''" and in the Postgres/Java style
 *  of escaped-quote "\'".
 * Other embedded escaped characters are matched explicitly and the leading
 *  backslash is dropped from the string.
 * Note that xcstart must appear before operator, as explained above!
 *  Also whitespace (comment) must appear before operator.
 */
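The comments above state two matching principles (the longest match wins, and equal-length matches go to the earlier rule) and say that {decimalfail} exists so that "1..10" lexes as 1, dot_dot, 10. The following Python sketch shows how those pieces can fit together. It is a toy model only: the `RULES` table and `tokenize` function are hypothetical, the patterns are transcribed into Python regex syntax, and the "push back two characters" action for decimalfail is my assumption about what the corresponding rule does, since the Rules section is not shown in this article.

```python
import re

# Patterns transcribed from the definitions into Python regex syntax.
DIGIT = r"[0-9]"
INTEGER = rf"{DIGIT}+"
DECIMAL = rf"(?:{DIGIT}*\.{DIGIT}+|{DIGIT}+\.{DIGIT}*)"
DECIMALFAIL = rf"{DIGIT}+\.\."
DOT_DOT = r"\.\."

# (token name, pattern, number of characters to give back after matching).
# Order matters: on equal-length matches the earlier rule wins.
RULES = [
    ("dot_dot", DOT_DOT, 0),
    ("integer", INTEGER, 0),
    ("decimal", DECIMAL, 0),
    # Assumed action: return an integer and push the ".." back to the input.
    ("integer", DECIMALFAIL, 2),
]
COMPILED = [(name, re.compile(pattern), back) for name, pattern, back in RULES]


def tokenize(text):
    """Split text into (token name, lexeme) pairs using longest match."""
    tokens, pos = [], 0
    while pos < len(text):
        best = None
        for name, regex, back in COMPILED:
            match = regex.match(text, pos)
            # Only a strictly longer match replaces the current best,
            # so ties are resolved in favour of the earlier rule.
            if match and match.end() > pos and (best is None or match.end() > best[1]):
                best = (name, match.end(), back)
        if best is None:
            raise ValueError(f"no rule matches at offset {pos}")
        name, end, back = best
        end -= back  # emulate giving characters back to the input
        tokens.append((name, text[pos:end]))
        pos = end
    return tokens


if __name__ == "__main__":
    print(tokenize("1..10"))
    print(tokenize("1.5"))
```

In this model, removing the decimalfail entry makes "1..10" come out as two decimals, "1." and ".10", because the decimal pattern is then the longest match at the start of the input. That is the situation the extra definition is there to avoid.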
Thanks for reading. That covers what the Definitions section does in PostgreSQL's scanner file. To see how these definitions behave in practice, try them out and verify the results yourself.