[Doc] add milliseconds_diff func and update file external table according to v3.1.9 and 3.2.4 rn (backport #42437) (backport #42439) (backport #42447) (backport #42462) #42466

Merged
merged 2 commits into from
Mar 12, 2024
4 changes: 2 additions & 2 deletions docs/docusaurus/sidebars.json
@@ -447,7 +447,7 @@
},
{
"type": "category",
"label": "Date Functions",
"label": "Date and Time Functions",
"link": {"type": "doc", "id": "cover_pages/functions_date"},
"items": [ { "type": "autogenerated", "dirName": "sql-reference/sql-functions/date-time-functions" } ]
},
@@ -997,7 +997,7 @@
},
{
"type": "category",
"label": "日期函数",
"label": "时间日期函数",
"link": {"type": "doc", "id": "cover_pages/functions_date"},
"items": [ { "type": "autogenerated", "dirName": "sql-reference/sql-functions/date-time-functions" } ]
},
2 changes: 1 addition & 1 deletion docs/en/data_source/file_external_table.md
@@ -79,7 +79,7 @@ A set of parameters for accessing the target data file.
| ------------------------ | -------- | ------------------------------------------------------------ |
| path | Yes | The path of the data file. <ul><li>If the data file is stored in HDFS, the path format is `hdfs://<IP address of HDFS>:<port>/<path>`. The default port number is 8020. If you use the default port, you do not need to specify it.</li><li>If the data file is stored in AWS S3, the path format is `s3://<bucket name>/<folder>/`.</li></ul> Note the following rules when you enter the path: <ul><li>If you want to access all files in a path, end this parameter with a slash (`/`), such as `hdfs://x.x.x.x/user/hive/warehouse/array2d_parq/data/`. When you run a query, StarRocks traverses all data files under the path. It does not traverse data files by using recursion.</li><li>If you want to access a single file, enter a path that directly points to this file, such as `hdfs://x.x.x.x/user/hive/warehouse/array2d_parq/data`. When you run a query, StarRocks only scans this data file.</li></ul> |
| format | Yes | The format of the data file. Only Parquet and ORC are supported. |
| enable_recursive_listing | No | Specifies whether to recursively traverse all files under the current path. Default value: false. |
| enable_recursive_listing | No | Specifies whether to recursively traverse all files under the current path. Default value: `false`. |
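
The difference between the default listing and `enable_recursive_listing` can be pictured with a local directory tree. The Python sketch below is only an analogy on a local file system (StarRocks lists files in HDFS or S3 through its own connectors). The helper names `list_flat` and `list_recursive` and the sample file names are hypothetical.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def list_flat(root: Path) -> list[str]:
    """Files directly under root only (analogous to the default, false)."""
    return sorted(p.relative_to(root).as_posix() for p in root.iterdir() if p.is_file())


def list_recursive(root: Path) -> list[str]:
    """Files under root and all of its subdirectories (analogous to true)."""
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())


def demo() -> tuple[list[str], list[str]]:
    # Build a small hypothetical layout: one file at the top level, one nested.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "part-0.parquet").touch()
        (root / "dt=2024-03-12").mkdir()
        (root / "dt=2024-03-12" / "part-1.parquet").touch()
        return list_flat(root), list_recursive(root)


if __name__ == "__main__":
    flat, recursive = demo()
    print(flat)       # only the top-level file
    print(recursive)  # the top-level file and the nested file
```

In this analogy, the flat listing skips the file inside the subdirectory, which matches the note above that StarRocks does not traverse data files recursively unless the parameter is set to `true`.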

#### StorageCredentialParams (Optional)

6 changes: 5 additions & 1 deletion docs/en/reference/System_variable.md
@@ -140,7 +140,11 @@ The variables are described **in alphabetical order**. Variables with the `globa

Whether to enable low cardinality optimization. After this feature is enabled, the performance of querying STRING columns improves by about three times. Default value: true.

### character_set_database (global)
### cbo_eq_base_type (2.5.14 and later)

Specifies the data type used when DECIMAL data is compared with STRING data. The default value is `VARCHAR`. `DECIMAL` is also a valid value, in which case the values are compared numerically.
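
Why the comparison type matters can be seen outside StarRocks too. The Python sketch below is only an analogy and does not reproduce StarRocks's cast or comparison logic: it shows that the same two values can compare differently as text and as decimals. The function names and sample values are assumptions made for illustration.

```python
from decimal import Decimal


def equal_as_varchar(decimal_value: Decimal, string_value: str) -> bool:
    """Compare by text: the decimal is rendered as a string first."""
    return str(decimal_value) == string_value


def equal_as_decimal(decimal_value: Decimal, string_value: str) -> bool:
    """Compare by numeric value: the string is parsed as a decimal first."""
    return decimal_value == Decimal(string_value)


if __name__ == "__main__":
    price = Decimal("1.10")
    print(equal_as_varchar(price, "1.1"))  # text "1.10" against text "1.1"
    print(equal_as_decimal(price, "1.1"))  # number 1.10 against number 1.1
```

The text comparison treats `1.10` and `1.1` as different strings, while the numeric comparison treats them as the same value.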

### character_set_database (global)

The character set supported by StarRocks. Only UTF8 (`utf8`) is supported.

5 changes: 3 additions & 2 deletions docs/en/sql-reference/sql-functions/function-list.md
@@ -15,7 +15,7 @@ You can find a function based on the following categories:
- [Bitmap functions](#bitmap-functions)
- [Conditional functions](#conditional-functions)
- [Cryptographic functions](#cryptographic-functions)
- [Date functions](#date-functions)
- [Date and time functions](#date-and-time-functions)
- [Geographic functions](#geographic-functions)
- [Hash functions](#hash-functions)
- [JSON functions](#json-functions)
@@ -154,7 +154,7 @@ You can find a function based on the following categories:
- [sm3](../sql-functions/crytographic-functions/sm3.md)
- [to_base64](../sql-functions/crytographic-functions/from_base64.md)

## Date functions
## Date and time functions

- [add_months](../sql-functions/date-time-functions/add_months.md)
- [adddate, days_add](../sql-functions/date-time-functions/adddate.md)
@@ -353,6 +353,7 @@ You can find a function based on the following categories:
- [like](./like-predicate-functions/like.md)
- [regexp](./like-predicate-functions/regexp.md)
- [regexp_extract](./like-predicate-functions/regexp_extract.md)
- [regexp_extract_all](./like-predicate-functions/regexp_extract_all.md)
- [regexp_replace](./like-predicate-functions/regexp_replace.md)

## Percentile functions
@@ -0,0 +1,57 @@
---
displayed_sidebar: "English"
---

# regexp_extract_all

## Description

Extracts all substrings from the target string (`str`) that match a regular expression pattern (`pattern`) and correspond to the regex group index specified by `pos`. This function returns an array.

In regex, groups are enclosed in parentheses `()` and numbered by counting their opening parentheses from left to right, starting from 1. For example, `([[:lower:]]+)C([[:lower:]]+)` matches the lowercase letters to the left and right of the uppercase letter `C`. This pattern contains two groups: `([[:lower:]]+)` to the left of `C` is the first group and `([[:lower:]]+)` to the right of `C` is the second group.

The pattern must completely match some parts of `str`. If no matches are found, an empty string is returned.

This function is supported from v2.5.19.

## Syntax

```Haskell
ARRAY<VARCHAR> regexp_extract_all(VARCHAR str, VARCHAR pattern, BIGINT pos)
```

## Parameters

- `str`: the string to be matched.

- `pattern`: the regular expression pattern used to match substrings.

- `pos`: the index of the regex group to extract, starting from 1. `pattern` may contain multiple groups.

## Return value

Returns an ARRAY that consists of VARCHAR elements.

## Examples

```Plain Text
-- Return all the letters that match group 1 in the pattern.
MySQL > SELECT regexp_extract_all('AbCdE', '([[:lower:]]+)C([[:lower:]]+)', 1);
+-----------------------------------------------------------------+
| regexp_extract_all('AbCdE', '([[:lower:]]+)C([[:lower:]]+)', 1) |
+-----------------------------------------------------------------+
| ['b']                                                           |
+-----------------------------------------------------------------+

-- Return all the letters that match group 2 in the pattern.
MySQL > SELECT regexp_extract_all('AbCdExCeF', '([[:lower:]]+)C([[:lower:]]+)', 2);
+---------------------------------------------------------------------+
| regexp_extract_all('AbCdExCeF', '([[:lower:]]+)C([[:lower:]]+)', 2) |
+---------------------------------------------------------------------+
| ['d','e']                                                           |
+---------------------------------------------------------------------+
```
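
The group-index behavior in the examples above can be mimicked outside StarRocks. The following Python sketch is not StarRocks's implementation: it reuses the name `regexp_extract_all` for a hypothetical helper built on Python's `re.finditer`, and it substitutes `[a-z]` for `[[:lower:]]` because Python's `re` module does not support POSIX character classes.

```python
from __future__ import annotations

import re


def regexp_extract_all(text: str, pattern: str, pos: int) -> list[str]:
    """Collect group `pos` from every non-overlapping match of `pattern`."""
    return [m.group(pos) for m in re.finditer(pattern, text)]


# [a-z] stands in for the POSIX class [[:lower:]] used in the SQL examples.
PATTERN = r"([a-z]+)C([a-z]+)"

if __name__ == "__main__":
    print(regexp_extract_all("AbCdE", PATTERN, 1))      # ['b']
    print(regexp_extract_all("AbCdExCeF", PATTERN, 2))  # ['d', 'e']
```

In the second call the pattern matches twice (`bCd` and `xCe`), so the second group contributes one element per match.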

## Keywords

REGEXP_EXTRACT_ALL, REGEXP, EXTRACT
4 changes: 4 additions & 0 deletions docs/zh/reference/System_variable.md
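@@ -135,6 +135,10 @@ SELECT /*+ SET_VAR

Whether to enable low-cardinality global dictionary optimization. After this feature is enabled, queries on STRING columns run about three times faster. Default value: true.

### cbo_eq_base_type (2.5.14 and later)

Specifies the type to which data is coerced when DECIMAL data is compared with STRING data. By default, the data is compared as `VARCHAR`. You can also set it to `DECIMAL` to compare by numeric value.

### character_set_database (global)

The character set supported by StarRocks databases. Currently, only UTF8 encoding (`utf8`) is supported.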
5 changes: 3 additions & 2 deletions docs/zh/sql-reference/sql-functions/function-list.md
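@@ -9,7 +9,7 @@ StarRocks provides a rich set of functions to help with your daily data queries and analysis
You can find a function by the following categories.

- [Function list](#函数列表)
- [Date functions](#日期函数)
- [Date and time functions](#时间日期函数)
- [String functions](#字符串函数)
- [Aggregate functions](#聚合函数)
- [Math functions](#数学函数)
@@ -27,7 +27,7 @@ StarRocks provides a rich set of functions to help with your daily data queries and analysis
- [Geographic functions](#地理位置函数)
- [Hash functions](#hash-函数)

## Date functions
## Date and time functions

| Function | Description |
| :-: | :-: |
@@ -352,6 +352,7 @@ StarRocks provides a rich set of functions to help with your daily data queries and analysis
| [like](./like-predicate-functions/like.md) | Checks whether a string **fuzzy-matches** the given pattern `pattern`. |
| [regexp](./like-predicate-functions/regexp.md) | Checks whether a string matches the given regular expression `pattern`. |
| [regexp_extract](./like-predicate-functions/regexp_extract.md) | Matches a string against a regular expression and extracts the `pos`-th matching part of `pattern`. The pattern must completely match some part of `str` for the matching part to be returned. If there is no match, an empty string is returned. |
| [regexp_extract_all](./like-predicate-functions/regexp_extract_all.md) | Extracts the substrings of `str` that match the regular expression `pattern` and returns them as an array of strings. |
| [regexp_replace](./like-predicate-functions/regexp_replace.md) | Matches a string against a regular expression and replaces the parts that match `pattern` with `repl`. |

## Conditional functions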
@@ -0,0 +1,57 @@
---
displayed_sidebar: "Chinese"
---

# regexp_extract_all

## Description

Extracts the substrings of `str` that match the regular expression `pattern` and returns them as an array of strings. The substrings must match the regex group specified by `pos`.

In a regular expression, groups are enclosed in parentheses `()`. A `pattern` may contain multiple groups, which are numbered by their parentheses from left to right, starting from 1. For example, `([[:lower:]]+)C([[:lower:]]+)` contains two groups: `([[:lower:]]+)` to the left of `C` is the first group, and `([[:lower:]]+)` to the right is the second group. This regular expression matches the lowercase letters to the left and right of the uppercase letter `C`.

`pattern` must completely match some part of `str`. If no string matches, an empty string is returned.

This function is supported from v2.5.19.

## Syntax

```Haskell
ARRAY<VARCHAR> regexp_extract_all(VARCHAR str, VARCHAR pattern, BIGINT pos)
```

## Parameters

- `str`: the string from which to extract characters.

- `pattern`: the regular expression pattern to match.

- `pos`: `pattern` may contain multiple groups. `pos` specifies which group to extract, starting from 1.

## Return value

Returns an array of strings.

## Examples

```Plain Text
-- Return all the characters matched by group 1 in the pattern.
MySQL > SELECT regexp_extract_all('AbCdE', '([[:lower:]]+)C([[:lower:]]+)', 1);
+-----------------------------------------------------------------+
| regexp_extract_all('AbCdE', '([[:lower:]]+)C([[:lower:]]+)', 1) |
+-----------------------------------------------------------------+
| ['b']                                                           |
+-----------------------------------------------------------------+

-- Return all the characters matched by group 2 in the pattern.
MySQL > SELECT regexp_extract_all('AbCdExCeF', '([[:lower:]]+)C([[:lower:]]+)', 2);
+---------------------------------------------------------------------+
| regexp_extract_all('AbCdExCeF', '([[:lower:]]+)C([[:lower:]]+)', 2) |
+---------------------------------------------------------------------+
| ['d','e']                                                           |
+---------------------------------------------------------------------+
```

## Keywords

REGEXP_EXTRACT_ALL,REGEXP,EXTRACT