[Fix][Doc] Fix LocalFile doc (#7887)
Continue to optimize the document about filtering files and add some examples
YOMO-Lee committed Oct 26, 2024
1 parent e64b8a6 commit 42e5919
Showing 1 changed file with 60 additions and 0 deletions.
60 changes: 60 additions & 0 deletions docs/en/connector-v2/source/LocalFile.md
@@ -256,10 +256,70 @@ Filter pattern, which used for filtering files.

The filtering format is similar to wildcard matching file names in Linux.

| Wildcard | Meaning | Example |
|--------------|--------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------|
| * | Match 0 or more characters | f* &emsp;&ensp;&emsp; Any file starting with f<br/>b*.txt &emsp; Any file starting with b, followed by any characters, and ending with .txt |
| [] | Match a single character listed in the brackets | [abc]* &emsp; Any file starting with a, b, or c |
| ? | Match any single character | f?.txt &emsp; Any file starting with f, followed by exactly one character, and ending with .txt |
| [!] | Match any single character not listed in the brackets | [!abc]* &emsp; Any file that does not start with a, b, or c |
| [a-z] | Match any single character from a to z | [a-z]* &emsp; Any file starting with a character from a to z |
| {a,b,c} / {a..z} | Separated by commas: matches any one of the listed characters<br/>Separated by two dots: matches any character in the continuous range | {a,b,c}* &emsp; Any file starting with a, b, or c<br/>{a..z}* &emsp;&ensp; Any file starting with a character from a to z |
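
As a rough illustration of the wildcards in the table, the sketch below uses Python's standard `fnmatch` module as a stand-in matcher. This is an assumption made for demonstration only: it is not the connector's own matching code, the sample file names are hypothetical, and `fnmatch` does not support the `{a,b,c}` / `{a..z}` brace forms, so those rows are left out.

```python
from fnmatch import fnmatchcase

# Each entry: (pattern, hypothetical file name, expected result under fnmatch).
# fnmatch is only a stand-in here; the connector's matcher may differ.
checks = [
    ("f*", "foo.txt", True),          # * matches 0 or more characters
    ("b*.txt", "bar.txt", True),
    ("[abc]*", "cat.csv", True),      # [] matches one listed character
    ("[abc]*", "dog.csv", False),
    ("f?.txt", "f1.txt", True),       # ? matches exactly one character
    ("f?.txt", "f12.txt", False),
    ("[!abc]*", "dog.csv", True),     # [!] matches one character not listed
    ("[!abc]*", "apple.csv", False),
    ("[a-z]*", "report.txt", True),   # [a-z] matches one character in the range
    ("[a-z]*", "Report.txt", False),  # fnmatchcase is case-sensitive
]


def run_checks(cases):
    """Return (pattern, name, matched) for every case, using fnmatchcase."""
    return [(pattern, name, fnmatchcase(name, pattern)) for pattern, name, _ in cases]


for pattern, name, matched in run_checks(checks):
    print(f"{pattern!r:12} {name!r:14} -> {matched}")
```

`fnmatchcase` is used instead of `fnmatch` so that the results do not depend on the operating system's case-folding rules.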

Note that, unlike Linux wildcards, the dot before a file suffix cannot be omitted from the pattern.

For example, to match `abc20241022.csv`, the Linux wildcard `abc*` would be sufficient, but here the pattern must be `abc*.*`, with the dot included.

File Structure Example:
```
report.txt
notes.txt
input.csv
abch20241022.csv
abcw20241022.csv
abcx20241022.csv
abcq20241022.csv
abcg20241022.csv
abcv20241022.csv
abcb20241022.csv
old_data.csv
logo.png
script.sh
helpers.sh
```
Matching Rules Example:

**Example 1**: *Match all .txt files*, pattern:
```
*.txt
```
The result of this example matching is:
```
report.txt
notes.txt
```
**Example 2**: *Match all .csv files starting with abc*, pattern:
```
abc*.csv
```
The result of this example matching is:
```
abch20241022.csv
abcw20241022.csv
abcx20241022.csv
abcq20241022.csv
abcg20241022.csv
abcv20241022.csv
abcb20241022.csv
```
**Example 3**: *Match all .csv files starting with abc whose fourth character is either x or g*, pattern:
```
abc[x,g]*.csv
```
The result of this example matching is:
```
abcx20241022.csv
abcg20241022.csv
```
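
The three examples above can be reproduced approximately with Python's standard `fnmatch` module. This is a sketch under the assumption that `fnmatch` behaves like the connector's matcher for these particular patterns; it is not the connector's implementation, and `select` is a hypothetical helper introduced only for this illustration.

```python
from fnmatch import fnmatchcase

# The file names from the "File Structure Example" above.
files = [
    "report.txt", "notes.txt", "input.csv",
    "abch20241022.csv", "abcw20241022.csv", "abcx20241022.csv",
    "abcq20241022.csv", "abcg20241022.csv", "abcv20241022.csv",
    "abcb20241022.csv", "old_data.csv", "logo.png",
    "script.sh", "helpers.sh",
]


def select(pattern, names):
    """Hypothetical helper: keep the names that match the pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


print(select("*.txt", files))          # Example 1
print(select("abc*.csv", files))       # Example 2
print(select("abc[x,g]*.csv", files))  # Example 3
```

In `fnmatch`, every character inside `[...]` is a candidate, so `[x,g]` also accepts a literal comma; for this file list `abc[xg]*.csv` selects the same two files.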
### compress_codec [string]

The compression codec of the files. The supported codecs are listed below: