Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HIVE-20917: Support applyQuotesToAll option in OpenCSVSerde. #3718

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

gigem
Copy link

@gigem gigem commented Oct 31, 2022

OpenCSV version 3.10 added an option called "applyQuotesToAll." When set to false, output columns were not quoted unless they needed it. This commit adds support for that option in OpenCSVSerde and also updates the support for OpenCSV version 5.7.1, which is the current version at the time of this commit. This commit is based on a patch from Sungwoo Park [email protected], which was based on a patch from David Engel [email protected].

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

OpenCSV version 3.10 added an option called "applyQuotesToAll."  When
set to false, output columns were not quoted unless they needed it.
This commit adds support for that option in OpenCSVSerde and also
updates the support for OpenCSV version 5.7.1, which is the current
version at the time of this commit.  This commit is based on a patch
from Sungwoo Park <[email protected]>, which was based on a patch from
David Engel <[email protected]>.
@github-actions
Copy link

github-actions bot commented Oct 31, 2022

@check-spelling-bot Report

🔴 Please review

See the files view or the action log for details.

Unrecognized words (1)

APPLYQUOTESTOALL

Previously acknowledged words that are now absent aarry bytecode timestamplocal yyyy
To accept these unrecognized words as correct (and remove the previously acknowledged and now absent words), run the following commands

... in a clone of the [email protected]:gigem/hive.git repository
on the HIVE-20917 branch:

update_files() {
perl -e '
my @expect_files=qw('".github/actions/spelling/expect.txt"');
@ARGV=@expect_files;
my @stale=qw('"$patch_remove"');
my $re=join "|", @stale;
my $suffix=".".time();
my $previous="";
sub maybe_unlink { unlink($_[0]) if $_[0]; }
while (<>) {
if ($ARGV ne $old_argv) { maybe_unlink($previous); $previous="$ARGV$suffix"; rename($ARGV, $previous); open(ARGV_OUT, ">$ARGV"); select(ARGV_OUT); $old_argv = $ARGV; }
next if /^(?:$re)(?:(?:\r|\n)*$| .*)/; print;
}; maybe_unlink($previous);'
perl -e '
my $new_expect_file=".github/actions/spelling/expect.txt";
use File::Path qw(make_path);
use File::Basename qw(dirname);
make_path (dirname($new_expect_file));
open FILE, q{<}, $new_expect_file; chomp(my @words = <FILE>); close FILE;
my @add=qw('"$patch_add"');
my %items; @items{@words} = @words x (1); @items{@add} = @add x (1);
@words = sort {lc($a)."-".$a cmp lc($b)."-".$b} keys %items;
open FILE, q{>}, $new_expect_file; for my $word (@words) { print FILE "$word\n" if $word =~ /\w/; };
close FILE;
system("git", "add", $new_expect_file);
'
}

comment_json=$(mktemp)
curl -L -s -S \
-H "Content-Type: application/json" \
"https://api.github.com/repos/apache/hive/issues/comments/1297703933" > "$comment_json"
comment_body=$(mktemp)
jq -r ".body // empty" "$comment_json" > $comment_body
rm $comment_json

patch_remove=$(perl -ne 'next unless s{^</summary>(.*)</details>$}{$1}; print' < "$comment_body")

patch_add=$(perl -e '$/=undef; $_=<>; if (m{Unrecognized words[^<]*</summary>\n*```\n*([^<]*)```\n*</details>$}m) { print "$1" } elsif (m{Unrecognized words[^<]*\n\n((?:\w.*\n)+)\n}m) { print "$1" };' < "$comment_body")

update_files
rm $comment_body
git add -u
If the flagged items do not appear to be text

If items relate to a ...

  • well-formed pattern.

    If you can write a pattern that would match it,
    try adding it to the patterns.txt file.

    Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

    Note that patterns can't match multiline strings.

  • binary file.

    Please add a file path to the excludes.txt file matching the containing file.

    File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

    ^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

@sonarcloud
Copy link

sonarcloud bot commented Nov 1, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 5 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

@github-actions
Copy link

github-actions bot commented Jan 1, 2023

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Feel free to reach out on the [email protected] list if the patch is in need of reviews.

@github-actions github-actions bot added the stale label Jan 1, 2023
@github-actions github-actions bot closed this Jan 9, 2023
@BsoBird
Copy link
Contributor

BsoBird commented Oct 15, 2024

@deniskuzZ @ayushtkn Hello. can we reopen this?

@@ -90,6 +95,8 @@ public void initialize(Configuration configuration, Properties tableProperties,
separatorChar = getProperty(properties, SEPARATORCHAR, CSVWriter.DEFAULT_SEPARATOR);
quoteChar = getProperty(properties, QUOTECHAR, CSVWriter.DEFAULT_QUOTE_CHARACTER);
escapeChar = getProperty(properties, ESCAPECHAR, CSVWriter.DEFAULT_ESCAPE_CHARACTER);
String temp = properties.getProperty(APPLYQUOTESTOALL);
applyQuotesToAll = (temp == null || Boolean.parseBoolean(temp));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could just inline

applyQuotesToAll = Boolean.parseBoolean(properties.getProperty(APPLYQUOTESTOALL))

} else {
return new CSVReader(reader, separator, quote, escape);
parser = new CSVParserBuilder()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we avoid duplication by creating a builder var or using ternary operator? the only diff here is the missing escape char setup

} else {
return new CSVWriter(writer, separator, quote, escape, "");
}
return new CSVWriter(writer, separator, quote, escape, "");
Copy link
Member

@deniskuzZ deniskuzZ Oct 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we change the escape char handling here, but keep it in a reader?

@github-actions github-actions bot removed the stale label Oct 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants