Commit

CDPD-32844: Speeding up HDFS scan by skipping folders not containing jar or tar.gz files (#25)
isuller authored and GitHub Enterprise committed Dec 21, 2021
1 parent f277913 commit 5094942
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion hdp_support_scripts/patch_hdfs_tgz.sh
@@ -55,7 +55,7 @@ tmpdir=${TMPDIR:-/tmp}
mkdir -p $tmpdir
echo "Using tmp directory '$tmpdir'"

-for hdfs_file_path in $($user_option hdfs dfs -ls -R $hdfs_path | awk 'BEGIN {LAST=""} {if (match($8,LAST"/")>0) { print LAST; } LAST=$8}')
+for hdfs_file_path in $($user_option hdfs dfs -ls -R $hdfs_path | awk 'BEGIN {LAST=""} /^d/ {LAST=$8} /^-.*(jar|tar.gz)/ {if (LAST) { print LAST; } LAST=""}')
do
echo $hdfs_file_path
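
The effect of the new awk filter can be checked without a cluster by feeding it a canned listing. The sketch below is an illustration only: the `sample_listing` and `filter_dirs` helper names and all paths are hypothetical, and the listing assumes the usual eight-column `hdfs dfs -ls -R` layout with the path in field 8. The awk program itself is copied from the added line.

```shell
#!/bin/sh
# Hypothetical stand-in for `hdfs dfs -ls -R $hdfs_path`; all paths are made up.
sample_listing() {
cat <<'EOF'
drwxr-xr-x   - hdfs hdfs          0 2021-12-01 10:00 /apps/docs
-rw-r--r--   3 hdfs hdfs        120 2021-12-01 10:00 /apps/docs/readme.txt
drwxr-xr-x   - hdfs hdfs          0 2021-12-01 10:00 /apps/lib
-rw-r--r--   3 hdfs hdfs      40960 2021-12-01 10:00 /apps/lib/a.jar
-rw-r--r--   3 hdfs hdfs      40960 2021-12-01 10:00 /apps/lib/b.jar
drwxr-xr-x   - hdfs hdfs          0 2021-12-01 10:00 /apps/mapreduce
-rw-r--r--   3 hdfs hdfs     901120 2021-12-01 10:00 /apps/mapreduce/mapreduce.tar.gz
EOF
}

# The awk program from the added line of the diff:
#   - a directory line (starts with "d") is remembered in LAST
#   - a file line (starts with "-") that matches jar|tar.gz prints LAST once,
#     then clears it so the same directory is not printed again
filter_dirs() {
  awk 'BEGIN {LAST=""} /^d/ {LAST=$8} /^-.*(jar|tar.gz)/ {if (LAST) { print LAST; } LAST=""}'
}

sample_listing | filter_dirs
```

With this sample input, `/apps/docs` is skipped because it holds no archive, and `/apps/lib` is printed once even though it holds two jars. Two things are worth noting about the pattern as written: it is matched against the whole listing line, not only the file name, and the `.` in `tar.gz` is an unescaped regex wildcard, so the match is looser than a strict suffix check. A directory is also only reported when its line is the most recent directory line before a matching file.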
