Merge pull request #39 from gnawhnehpets/transfer_s3_to_gcs
Transfer s3 to gcs - add option to rerun transfer job
rmathur87 authored May 29, 2024
2 parents 0a53e52 + 0d89d55 commit e31ee34
Showing 3 changed files with 19 additions and 9 deletions.
2 changes: 1 addition & 1 deletion generate_manifest_s3/v1.0/Dockerfile
@@ -45,4 +45,4 @@ COPY ./opt/*.py /opt/
RUN chmod +x /usr/local/bin/configure_aws_cli.sh /opt/entrypoint.sh /opt/generate_manifest_for_aws.py

# Expose port
EXPOSE 80
EXPOSE 80
6 changes: 3 additions & 3 deletions transfer_s3_to_gcs/README.md
@@ -137,12 +137,12 @@ docker run \
From within the interactive session, trigger `entrypoint.sh` to 1.) activate the Google Service Account, and 2.) create a Google transfer job.
- Verify credentials were applied by running `gcloud auth list`

A transfer job can be created one time for each Google Storage bucket. If a job needs to be repeated, the transfer job must be deleted from the operations list.
Note: A transfer job can be created one time for each Google Storage bucket. If a job needs to be repeated, use the `--run` flag on `/opt/entrypoint.sh`.

In order to do that:
1. set project name for gcloud to the project name ( `gcloud config set project "$GC_PROJECT"` )
2. find the name of the transfer job ( `gcloud transfer operations list` )
3. delete the transfer job ( `gcloud transfer job delete <NAME_OF_JOB>`)
Once complete, you can initiate a new transfer job by rerunning `bash /opt/entrypoint.sh`
3. run the transfer job with `bash /opt/entrypoint.sh --run`

Transfer jobs can be monitored through `gcloud transfer operations list`.
Once the job is complete, the destination Google Storage bucket can be examined with `gsutil ls -r gs://$S3_BUCKET`.
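
As a quick reference, the rerun flow described above boils down to the following commands; this is a minimal sketch that assumes `GC_PROJECT` and `S3_BUCKET` are already exported inside the container:

```bash
gcloud config set project "$GC_PROJECT"   # point gcloud at the target project
gcloud transfer operations list           # find the existing job and check its status
bash /opt/entrypoint.sh --run             # re-run the existing transfer job
gsutil ls -r gs://"$S3_BUCKET"            # inspect the destination bucket when done
```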
20 changes: 15 additions & 5 deletions transfer_s3_to_gcs/v1.0/opt/entrypoint.sh
@@ -4,28 +4,38 @@
# Check if file or json environment variables exist
if [ -n "$GC_ADC_FILE" ]; then
echo "[[ Activating Google Cloud service account ]]"
gcloud auth activate-service-account --key-file="$GC_ADC_FILE"
# If container binding is not supported, create temporary adc.json file
gcloud auth activate-service-account --key-file="$GC_ADC_FILE" --no-user-output-enabled
# Otherwise, fall back to the GC_ADC_JSON environment variable
elif [ -n "$GC_ADC_JSON" ]; then
echo "--- GC_ADC_JSON detected ---"
echo "[[ Create Google Service Account key file ]]"
echo "$GC_ADC_JSON" > /opt/adc.json
echo "[[ Activating Google Cloud service account ]]"
gcloud auth activate-service-account --key-file="/opt/adc.json"
gcloud auth activate-service-account --key-file="/opt/adc.json" --no-user-output-enabled

# Neither credential variable is set, so exit with an error
else
echo "GC_ADC_FILE or GC_ADC_JSON not detected."
exit 1
fi
echo "[[ Create temporary AWScreds.txt file ]]"
echo "{ \"accessKeyId\": \"$AWS_ACCESS_KEY_ID\", \"secretAccessKey\": \"$AWS_SECRET_ACCESS_KEY\" }" > /opt/AWScreds.txt

echo "[[ Initiate transfer from s3:// to gs://$S3_BUCKET ]]"
gcloud transfer jobs create s3://"$S3_BUCKET" gs://"$S3_BUCKET" \
# Check for the "--run" flag in the script arguments
if [[ " $* " =~ " --run " ]]; then
echo "--run flag is present."
echo "[[ Run transfer job { $S3_BUCKET } ]]"
gcloud transfer jobs run "$S3_BUCKET" --project "$GC_PROJECT"
else
# echo "--run flag is not present."
echo "[[ Initiate transfer from s3:// to gs://$S3_BUCKET ]]"
gcloud transfer jobs create s3://"$S3_BUCKET" gs://"$S3_BUCKET" \
--name "$S3_BUCKET" \
--description "$S3_BUCKET" \
--source-creds-file /opt/AWScreds.txt \
--project "$GC_PROJECT" \
--no-enable-posix-transfer-logs
fi

echo "[[ Cleanup AWScreds.txt file ]]"
rm /opt/AWScreds.txt
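
With this change, the script supports two modes from inside the container's interactive session. A minimal usage sketch, assuming the environment variables referenced above (`GC_ADC_FILE` or `GC_ADC_JSON`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `S3_BUCKET`, `GC_PROJECT`) are already set:

```bash
# First run for a bucket: creates a transfer job named after $S3_BUCKET
bash /opt/entrypoint.sh

# Subsequent runs: re-run the existing job instead of creating a new one
bash /opt/entrypoint.sh --run
```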
