Merge master into rc #124

Merged
merged 205 commits on Oct 6, 2019
205 commits
f35f0f7
Adding a configurationsource for use in ASP
tylorhl-msft Apr 30, 2019
a3e69ed
Adding dependencies
tylorhl-msft Apr 30, 2019
85b7790
Fixing name
tylorhl-msft Apr 30, 2019
236c0b1
Modifying for new settings and optional SF
tylorhl Apr 30, 2019
fc104f2
Added Configuration parameter to StartUpUtil, not being used yet
tylorhl Apr 30, 2019
8ddead9
Moved service fabric configuration to servicehost; added settings con…
tylorhl May 1, 2019
924816b
Reducing surface area of SF config calls
tylorhl May 1, 2019
2832380
Adding SF to config builder
tylorhl May 1, 2019
fb9ce9f
Correcting erroneous default value
tylorhl May 1, 2019
3e91d60
Runs and apis are hittable
tylorhl May 1, 2019
f8225ea
Removing dev strings
tylorhl May 1, 2019
e4a2383
Adding gitignore entry
tylorhl May 1, 2019
7d999f4
Modifying dev default auth connection string
tylorhl May 1, 2019
f627988
Modifying launch api to simple get call
tylorhl May 1, 2019
323972f
Switching connection string
tylorhl May 1, 2019
3e30cbb
Adding onebox
tylorhl May 1, 2019
3d3238d
Updating dev settings
tylorhl May 1, 2019
db6946e
Modified the sf configuration source to better conform to customizabl…
tylorhl May 2, 2019
90dc210
Using constant instead of literal
tylorhl May 2, 2019
0518723
Enabling more of the local scenario
tylorhl May 2, 2019
862c779
Modified mef extension to generate ILoggers for classes automatically
tylorhl May 2, 2019
a97c9f7
Updating settings for onebox scenario
tylorhl May 2, 2019
4cad95d
Adding authentication
tylorhl May 2, 2019
5210b62
Begin adding auth attribute; Not working yet
tylorhl May 6, 2019
ee06bc4
Adding auth scenario; needs cleaning
tylorhl-msft May 6, 2019
547f49b
Disable RequireAuthenticatedUser if in OneBox mode
tylorhl-msft May 7, 2019
572de3e
Revert "Disable RequireAuthenticatedUser if in OneBox mode"
tylorhl-msft May 7, 2019
3047142
Removing authentication requirement from OneBox scenario
May 7, 2019
0e79e8c
Temporary: modified for rerouteService scenario
May 7, 2019
8182575
Minor code improvements
May 8, 2019
951b229
Removing unnecessary project dependency
May 9, 2019
d7a006b
Added a gateway policy for compatibility in SF auth scenario; Moved s…
May 9, 2019
9b4c8ec
Cleanup
May 10, 2019
10186cc
Reverting auto changes
May 10, 2019
16c683f
Renaming appinsights setting to name in SF
May 15, 2019
9044b14
Modified gateway policy for fix in SF
May 15, 2019
a93b6c8
Updating SF xmls with EnableOneBox
May 15, 2019
ce22cf2
Added varying documentation by request
May 15, 2019
87ca83e
Refactored settings into contract; added todos for consolidation
May 17, 2019
8fa3935
Some extensibility improvements
May 17, 2019
8c7c1c8
Reconfigured to be IStartupFilter instead of a simple startup
May 20, 2019
7d0f1e0
Tylake/configuration (#51)
tylorhl May 20, 2019
6b07669
Improved startup experience
May 20, 2019
7e0d4e1
Added documentation; removed original startup for Flow
May 20, 2019
b811406
Converting startup and settings
May 20, 2019
ca0de26
Updating appsettings
May 20, 2019
c1a1aee
Modifying startups
May 20, 2019
e8c4018
Adding auth attributes
May 20, 2019
7761e29
Merging from main repo
May 20, 2019
df3a10f
Fixing usings
May 20, 2019
b6539b2
Deleting old settings classes
May 21, 2019
f25bcec
Simplifying call
May 21, 2019
1c42345
Adding EnableOneBox setting for SF
May 21, 2019
17e517d
Added comments by request
May 21, 2019
7931454
Removing inaccurate comment
May 21, 2019
443d172
Missed override
May 21, 2019
5fb1b7d
Merge pull request #66 from tylorhl/tylake/commonstartup
tylorhl May 21, 2019
838364f
Fixing the backend services post the refactoring for the various star…
s-tuli May 26, 2019
3caff02
Removed the environment variable value from the Services/DataX.Flow/F…
s-tuli May 26, 2019
4adf14d
reverting the typo from /Flow.InteractiveQueryService/appsettings.Dev…
s-tuli May 26, 2019
ad494aa
1. Adding IConfiguration in each of the controller such that the appS…
s-tuli May 28, 2019
c716417
Remove the auth connecting string
s-tuli May 28, 2019
3300943
Adding support for an optional secret for setting the json object whi…
s-tuli May 31, 2019
ddfa31a
Enable batch processing from blob input
vijayupadya May 31, 2019
ef8e7de
Batch processing from blob input
vijayupadya May 31, 2019
5e93e36
Merge branch 'master' of https://github.com/Microsoft/data-accelerato…
vijayupadya May 31, 2019
52e5d46
Merge branch 'master' of https://github.com/microsoft/data-accelerato…
s-tuli Jun 3, 2019
e7dfa5b
Adding the json parsing logic in securedSettings.js for the optional …
s-tuli Jun 3, 2019
ffaa462
Updating readme.md such that customers know how to listen to services…
s-tuli Jun 3, 2019
baf2858
Adding a few checks for when the Kubernetes secret may not be present…
s-tuli Jun 3, 2019
4e1c575
Updating the comments and readme along with some of the checks
s-tuli Jun 3, 2019
3f832ea
Resolving conflicts and merging with master
s-tuli Jun 3, 2019
1423094
Merge pull request #72 from microsoft/s-tuli/kubernetes
s-tuli Jun 3, 2019
1e2a2e9
Merge branch 'master' of https://github.com/microsoft/data-accelerato…
s-tuli Jun 3, 2019
ad35f41
Making metrics.sln compile by adding a missing project. Also fixing a…
s-tuli Jun 4, 2019
a96dfd4
Adding the missing project DataX.ServiceHost to DataX.Gateway.sln
s-tuli Jun 4, 2019
d053365
Nuget Restore on DataX.Gateway.sln fixed
s-tuli Jun 4, 2019
dd0c7c6
Adding the licensing header for some the newly added files since it w…
s-tuli Jun 4, 2019
2d5eea1
Adding EnableOneBox parameter under ConfigOverrides for DataX.Flow. T…
s-tuli Jun 4, 2019
2989c61
Updating FinalRun.sh and appsettings.json for Flow.ManagementService …
s-tuli Jun 4, 2019
375420a
Removing the local values for LocalRoot and SparkHome 'cause this wil…
s-tuli Jun 4, 2019
57e797a
Merge pull request #74 from microsoft/kubernetes/Integrate
s-tuli Jun 5, 2019
c13af9c
Fixing the signing issue related to DataX.ServiceHost project and rem…
s-tuli Jun 7, 2019
c1d7f33
Adding the Microbuild NuGet package to DataX.ServiceHost project
s-tuli Jun 8, 2019
0b94abc
Merge pull request #75 from microsoft/s-tuli/FixBuild
s-tuli Jun 9, 2019
d143bef
Add sql output
vijayupadya Jun 12, 2019
3d3c546
- Refactor output manager to enable sync'ing non-json data
vijayupadya Jun 12, 2019
aba008a
Merge branch 'master' of https://github.com/Microsoft/data-accelerato…
vijayupadya Jun 12, 2019
f77f07c
Kjcho/revert (#77)
kjcho-msft Jun 12, 2019
f3250a7
Rev spark version to 2.4 and add support for secret scope (#79)
rohit489 Jun 14, 2019
1db0840
ARM: remove snapshot from the templates to unblock ARM deployment fro…
kjcho-msft Jun 17, 2019
e108051
Enable databricks support on Web (#80)
rohit489 Jun 18, 2019
1ed2516
Updating the website code to extract out the Query package named data…
s-tuli Jun 18, 2019
46192fb
fixing merge conflicts
s-tuli Jun 18, 2019
161d466
Updating the versions for all the packages
s-tuli Jun 18, 2019
b43b067
cleaning the code and removing the comments
s-tuli Jun 18, 2019
857ec6c
Updating the package.json for datax-common since we don't require jso…
s-tuli Jun 18, 2019
b11b7f4
Refactored the Dockerfile and finalrun.sh such that CICD can be enabl…
s-tuli Jun 18, 2019
828c5ab
Making a few tweaks: updating all files to be LF instead of CRLF. Add…
s-tuli Jun 19, 2019
60ea61e
Merge pull request #85 from microsoft/s-tuli/kubimages
s-tuli Jun 19, 2019
fdb381a
Adding a Helper function ConvertFlowToQueryMetadata. This creates the…
s-tuli Jun 19, 2019
82bb799
Adding comment header for the new function: ConvertFlowToQueryMetadata
s-tuli Jun 19, 2019
3a7d0d7
Merge branch 'master' of https://github.com/microsoft/data-accelerato…
s-tuli Jun 19, 2019
a880c27
Removing the dupe style under datax-pipeline.
s-tuli Jun 19, 2019
d0d9bd8
Refactor OutputManager to configure outputing non-json data easily (#78)
vijayupadya Jun 19, 2019
d0631a2
SQL Output: UI and config gen service updates (#86)
vijayupadya Jun 20, 2019
42848d9
Fixing a few bugs I found while testing: The query was not getting up…
s-tuli Jun 20, 2019
ce07680
Fixing a few bugs I found while testing: The query was not getting up…
s-tuli Jun 20, 2019
ab49bcf
Commit merge conflicts
s-tuli Jun 20, 2019
81a68ad
Removing the redundant code from datax-pipeline. Removing the term fl…
s-tuli Jun 20, 2019
d75864f
Rev'ing the package versions for each of the packages
s-tuli Jun 20, 2019
2bca074
Fix pom for datax-host (#88)
rohit489 Jun 20, 2019
1506a6f
Removing some redundant code and calling into the QueryActions initQu…
s-tuli Jun 20, 2019
8d7a16d
Merge branch 'master' of https://github.com/microsoft/data-accelerato…
s-tuli Jun 20, 2019
03d3b72
Updating the version of packages in package.json for datax-home
s-tuli Jun 20, 2019
7c2a559
Removing the style.css import from website to datax-pipeline
s-tuli Jun 20, 2019
e45b867
Fixing the memory heap issue because monacoeditor was being imported …
s-tuli Jun 21, 2019
7adcc28
Rev'ing the version of all website packages
s-tuli Jun 22, 2019
77ab420
Adding react-monaco-editor and the Monaco Editor plugin to peerDepend…
s-tuli Jun 24, 2019
7f3f247
Enable batch processing of blob input (#90)
vijayupadya Jun 25, 2019
2b73d79
Updating the package versions. Updating the code to use MonacoEditorC…
s-tuli Jun 25, 2019
6c68902
Merge branch 'master' into s-tuli/querypackage
s-tuli Jun 25, 2019
0aaac6e
Merge pull request #84 from microsoft/s-tuli/querypackage
s-tuli Jun 25, 2019
44b01de
pass in databricks token for live query (#92)
rohit489 Jun 26, 2019
972dfa0
Databricks support in services (#87)
rohit489 Jun 27, 2019
5266e58
Codesign DatabricksClient (#93)
rohit489 Jun 28, 2019
5314dac
Fixing and adding a check for Databricks vs HDInsight when saving and…
s-tuli Jul 1, 2019
c96f1d5
Merge pull request #94 from microsoft/s-tuli/querysample
s-tuli Jul 1, 2019
9b24672
Migrate to latest jackson.core databind artifact.
carlbrochu Jul 8, 2019
630fa78
Updating all projects to use .Net Core 2.2 to resolve the component g…
s-tuli Jul 8, 2019
ae3f784
Updating a few more references and NuGets to 2.2
s-tuli Jul 8, 2019
0543c02
Updating LivyClient.Test project as well
s-tuli Jul 8, 2019
cbe3044
Updating Gatewaycloud.xml
s-tuli Jul 8, 2019
2660148
reverting the signing unintentional change
s-tuli Jul 9, 2019
e4d6d23
Merge pull request #97 from microsoft/s-tuli/updatenetcore
s-tuli Jul 9, 2019
9be463c
Update resource files to have unit tests pass again.
carlbrochu Jul 9, 2019
31de617
Merge branch 'master' into fixunittests
carlbrochu Jul 9, 2019
e17ad4d
Merge branch 'master' into updatejackson
carlbrochu Jul 9, 2019
f0562ba
Merge pull request #98 from microsoft/fixunittests
carlbrochu Jul 9, 2019
c697a07
Adding support for reading the white listed ClientID AAD app from Key…
s-tuli Jul 10, 2019
cf91432
Adding the Whitelisting logic to RolesCheck for the ScenarioTester
s-tuli Jul 11, 2019
8aa0d4a
Renaming the helper and tweaking the logic
s-tuli Jul 11, 2019
0961310
more tweaks to the code
s-tuli Jul 11, 2019
3bcbd2e
Removing the redundant project dependencies in DataX.Utilities.Web
s-tuli Jul 11, 2019
1cf9765
Adding the parameter in appsettings.json as well. This will be useful …
s-tuli Jul 11, 2019
f1c8f84
Updating the white listed clientId value and the code for handling a …
s-tuli Jul 12, 2019
0c5db00
Merge pull request #99 from microsoft/s-tuli/scenariotester
s-tuli Jul 12, 2019
2d1b455
Updating the SimulatedData service to .net core 2.2
s-tuli Jul 12, 2019
6cd874f
Merge branch 'master' of https://github.com/microsoft/data-accelerato…
s-tuli Jul 12, 2019
56d923e
Merge pull request #100 from microsoft/s-tuli/scenariotester
s-tuli Jul 12, 2019
338bad7
provide a scenario tester to run through actions on a host in sequenc…
shibbas Jul 15, 2019
a458dd2
fixed namespaces and nuspec as per PR feedback
shibbas Jul 15, 2019
ffd79a0
Merge pull request #101 from microsoft/shibbas/scenariotester
shibbas Jul 15, 2019
364c911
Merge branch 'master' of https://github.com/microsoft/data-accelerato…
s-tuli Jul 16, 2019
ee25d39
Adding dependency for NewtonSoft.Json in the nuspec
s-tuli Jul 17, 2019
9037716
Adding the signing requirements and updating .nuspec for the Scenario…
s-tuli Jul 17, 2019
b53e650
bug fixes: spark nuspec, iothub sku, simulator service num events (#102)
kjcho-msft Jul 17, 2019
02f07d7
Merge pull request #103 from microsoft/s-tuli/scenariotester
s-tuli Jul 17, 2019
c6d9ef0
Merge pull request #96 from microsoft/updatejackson
carlbrochu Jul 17, 2019
523270e
Fix unit tests (#104)
rohit489 Jul 25, 2019
013aa64
Flow service: Add Blob support (#89)
kjcho-msft Jul 29, 2019
11aef90
For databricks access azure storage account using account key from ke…
rohit489 Aug 1, 2019
443e473
Refactor DataX.Flow to support batch scenarios better (#106)
kjcho-msft Aug 2, 2019
bbc671e
Web: add batching support (#91)
kjcho-msft Aug 2, 2019
86fd4f0
Update the microbuild signing cert to sha2 (#107)
kjcho-msft Aug 2, 2019
3767ec0
ARM: assign the writer role to the service AAD app (#109)
kjcho-msft Aug 6, 2019
49d0475
Update to use .netcore 2.2 and aspnetcore to 2.2.6 (#111)
rohit489 Aug 8, 2019
e6b2e95
Migrate to latest databind component. (#113)
carlbrochu Aug 9, 2019
d8a7c1a
Fix the way to read the blob for GetSchema feature (#112)
kjcho-msft Aug 9, 2019
206334f
update datax spark jar version (#114)
rohit489 Aug 10, 2019
4923294
ARM support for databricks (#108)
rohit489 Aug 12, 2019
57ce427
Update GetSampleEvents for Kafka to run asynchronously (#115)
kjcho-msft Aug 12, 2019
8940fd9
Add null checks for guiConfig() as some custom configs don't have a g…
kjcho-msft Aug 14, 2019
9146c1f
For databricks livequery mount storage account container (#118)
rohit489 Aug 15, 2019
43bb67c
Add code coverage options for gathering CC results in validation (#119)
carlbrochu Aug 15, 2019
9ae0268
Merge master into rc
kjcho-msft Aug 15, 2019
2bc8a06
Change parameters for MountStorage method to only use the required pr…
rohit489 Aug 15, 2019
5949fcb
Update proj to force pdb and exclude other tests binaries (#121)
carlbrochu Aug 15, 2019
456d289
Merge remote-tracking branch 'origin/master' into kjcho/releaseRC
kjcho-msft Aug 15, 2019
e31ba65
avoid creation of datax-host-with-dependency jar (#125)
rohit489 Aug 16, 2019
f78473d
Merge remote-tracking branch 'origin/master' into kjcho/releaseRC
kjcho-msft Aug 16, 2019
d4e789d
reset query each time a new flow is opened (#129)
rohit489 Aug 22, 2019
d7e4e7b
Enable deploy button even after saving the flow (#130)
carlbrochu Aug 22, 2019
c87989c
Update the settings name for sql output and fail fast if null. (#131)
vijayupadya Aug 22, 2019
07aa2e6
fix live query involving udf in databricks and add dependency jars fo…
rohit489 Aug 27, 2019
d7ec69d
Adding JobRunner Service and the first DataX mainline job that calls …
s-tuli Aug 27, 2019
cf20dd9
Adding signing for DataX.Utilities.Composition project (#137)
s-tuli Aug 28, 2019
12f7ee9
return job status after job has been stopped (#136)
rohit489 Aug 29, 2019
a5ca184
fix bugs-metrics dashboard and switching mode, and also enable scro… …
kjcho-msft Sep 4, 2019
75e0faa
Databricks fix output to blobs (#138)
rohit489 Sep 5, 2019
6713833
Fix batch job for databricks (#141)
rohit489 Sep 11, 2019
2841616
merge with master
kjcho-msft Sep 12, 2019
f99abc8
ARM: for the sample deployment script, pass the servicefabric cluster…
kjcho-msft Sep 12, 2019
611640a
Enable rerunning jobs that were previously in error state (#143)
rohit489 Sep 18, 2019
35c1c4f
- Do a better job of handling the case where there is no batch job to…
kjcho-msft Sep 18, 2019
60d1164
Change bulkInsert UI flag data type to bool (#147)
vijayupadya Sep 19, 2019
a1bcd69
Set internal transaction to false by default for bulk insert, The API…
vijayupadya Sep 19, 2019
b4c9c1b
Adding steps for a new JobRunner job calling into ScenarioTester. (#…
s-tuli Sep 20, 2019
8cd7810
Remove databricks token from APIs (#149)
rohit489 Sep 26, 2019
f6d398c
update jackson bit (#150)
kjcho-msft Sep 26, 2019
d26e17e
Fetch value from promise of isDatabricksSparkType (#151)
rohit489 Sep 27, 2019
ac4d0b8
merge with master
kjcho-msft Sep 27, 2019
535a47f
For databricks by default disable autoscale (#152)
rohit489 Oct 3, 2019
6d475df
merge with master #152
kjcho-msft Oct 4, 2019
07a685a
updated the version datax packages
kjcho-msft Oct 5, 2019
6 changes: 6 additions & 0 deletions .gitignore
@@ -6,3 +6,9 @@
/v15
npm-debug.log
/DataProcessing/DataX.Utilities/DataX.Utility.CodeSign/obj
/Tests/ScenarioTester/ScenarioTester/bin
/Tests/ScenarioTester/.vs
*.suo
/Tests/ScenarioTester/ScenarioTester/obj
/Tests/ScenarioTester/ScenarioTesterTests/bin
/Tests/ScenarioTester/ScenarioTesterTests/obj
1 change: 1 addition & 0 deletions DataProcessing/DataX.Utilities/.gitignore
@@ -0,0 +1 @@
*/obj
@@ -2,7 +2,7 @@

<PropertyGroup>
<OutputType>Library</OutputType>
<TargetFramework>netcoreapp2.1</TargetFramework>
<TargetFramework>netcoreapp2.2</TargetFramework>
<ApplicationIcon />
<StartupObject />
<SignAssembly>true</SignAssembly>
17 changes: 11 additions & 6 deletions DataProcessing/Spark.nuspec
@@ -20,16 +20,21 @@
<file src="**\azure-eventhubs-1.2.1.jar" target="lib" />
<file src="**\azure-eventhubs-spark_2.11-2.3.6.jar" target="lib" />
<file src="**\azure-keyvault-webkey-1.1.jar" target="lib" />
<file src="**\datax-core_2.3_2.11-1.1.0.jar" target="lib" />
<file src="**\datax-host_2.3_2.11-1.1.0.jar" target="lib" />
<file src="**\datax-utility_2.3_2.11-1.1.0.jar" target="lib" />
<file src="**\datax-keyvault_2.3_2.11-1.1.0-with-dependencies.jar" target="lib" />
<file src="**\datax-udf-samples_2.3_2.11-1.1.0.jar" target="lib" />
<file src="**\datax-core*.jar" target="lib" />
<file src="**\datax-host*.jar" target="lib" />
<file src="**\datax-utility*.jar" target="lib" />
<file src="**\datax-keyvault*.jar" target="lib" />
<file src="**\datax-udf-samples*.jar" target="lib" />
<file src="**\java-uuid-generator-3.1.5.jar" target="lib" />
<file src="**\proton-j-0.31.0.jar" target="lib" />
<file src="**\scala-java8-compat_2.11-0.9.0.jar" target="lib" />
<file src="**\kafka-clients-2.0.0.jar" target="lib" />
<file src="**\spark-streaming-kafka-0-10_2.11-2.4.0.jar" target="lib" />
<file src="**\azure-sqldb-spark-1.0.2.jar" target="lib" />
<file src="**\hadoop-azure-2.7.3.jar" target="lib" />
<file src="**\jetty-util-6.1.25.jar" target="lib" />
<file src="**\json-20180813.jar" target="lib" />
<file src="**\azure-storage-3.1.0.jar" target="lib" />
<file src="NOTICE.txt" target="" />
</files>
</package>
</package>
6 changes: 3 additions & 3 deletions DataProcessing/datax-core/pom.xml
@@ -50,15 +50,15 @@ SOFTWARE
</scm>

<groupId>com.microsoft.datax</groupId>
<artifactId>datax-core_2.3_2.11</artifactId>
<version>1.1.0</version>
<artifactId>datax-core_2.4_2.11</artifactId>
<version>1.2.0</version>
<name>Data Accelerator Core</name>
<description>This package contains the core module of Data Accelerator functionality.</description>
<url>https://github.com/Microsoft/data-accelerator</url>
<packaging>jar</packaging>

<properties>
<spark.version>2.3.0</spark.version>
<spark.version>2.4.0</spark.version>
<scala.version.major>2.11</scala.version.major>
<scala.version.minor>8</scala.version.minor>
<scala.version>${scala.version.major}.${scala.version.minor}</scala.version>
@@ -0,0 +1,10 @@
// *********************************************************************
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License
// *********************************************************************
package datax.constants

object BlobProperties {
// Define constants for blobs
val BlobHostPath = ".blob.core.windows.net"
}
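As a rough illustration of how a host suffix such as `BlobHostPath` can be used, the sketch below pulls the storage account name out of a `wasbs://container@account.blob.core.windows.net/...` style path, in the spirit of `getStorageAccountName` in `ExecutorHelper` later in this diff. The object name, method name, and example path are hypothetical. Unlike the merged code, this version quotes the suffix so its dots match literally and returns an `Option` instead of `null`.

```scala
import java.util.regex.Pattern

object StorageAccountNameSketch {
  // Same value as BlobProperties.BlobHostPath in the diff above.
  val BlobHostPath = ".blob.core.windows.net"

  // Capture the account name between '@' and the host suffix.
  // Pattern.quote makes the dots in the suffix literal (an assumption; the diff interpolates it unquoted).
  private val accountPattern =
    ("@([a-zA-Z0-9_-]+)" + Pattern.quote(BlobHostPath)).r

  // Returns Some(accountName) when the path contains "@<account>.blob.core.windows.net", otherwise None.
  def storageAccountName(path: String): Option[String] =
    accountPattern.findFirstMatchIn(path).map(_.group(1))
}
```

Returning `Option` lets a caller decide what to do when the path is not a blob path, rather than passing a `null` account name on to a key lookup.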
@@ -15,4 +15,7 @@ object JobArgument {
def ConfName_AppInsightKeyRef = s"${ConfNamePrefix}APPINSIGHTKEYREF"
def ConfName_BlobWriterTimeout: String = s"${ConfNamePrefix}BlobWriterTimeout"
def ConfName_DefaultVaultName: String = s"${ConfNamePrefix}DEFAULTVAULTNAME"
def ConfName_DefaultStorageAccount: String = s"${ConfNamePrefix}DEFAULTSTORAGEACCOUNT"
def ConfName_DefaultContainer: String = s"${ConfNamePrefix}DEFAULTCONTAINER"
def ConfName_AzureStorageJarPath: String = s"${ConfNamePrefix}AZURESTORAGEJARPATH"
}
@@ -10,7 +10,8 @@ import datax.config.SettingDictionary
import org.apache.spark.sql.{DataFrame, Row, SparkSession}

package object sink {
type SinkDelegate = (Row, Seq[Row], Timestamp, Int, String)=>Map[String, Int]
type JsonSinkDelegate = (Row, Seq[Row], Timestamp, Int, String)=>Map[String, Int]
type SinkDelegate = (DataFrame, Timestamp, String)=>Map[String, Int]
type Metrics = Map[String, Double]

trait SinkOperatorFactory{
@@ -21,6 +22,7 @@ package object sink {

case class SinkOperator(name: String,
isEnabled: Boolean,
sinkAsJson: Boolean,
flagColumnExprGenerator: () => String,
generator: (Int)=>SinkDelegate,
onInitialization: (SparkSession)=>Unit = null,
61 changes: 33 additions & 28 deletions DataProcessing/datax-host/pom.xml
@@ -50,11 +50,11 @@ SOFTWARE
</scm>

<groupId>com.microsoft.datax</groupId>
<artifactId>datax-host_2.3_2.11</artifactId>
<version>1.1.0</version>
<artifactId>datax-host_2.4_2.11</artifactId>
<version>1.2.0</version>

<properties>
<spark.version>2.3.0</spark.version>
<spark.version>2.4.0</spark.version>
<scala.version.major>2.11</scala.version.major>
<scala.version.minor>8</scala.version.minor>
<scala.version>${scala.version.major}.${scala.version.minor}</scala.version>
@@ -117,13 +117,13 @@ SOFTWARE
</dependency>
<dependency>
<groupId>com.microsoft.datax</groupId>
<artifactId>datax-core_2.3_2.11</artifactId>
<version>1.1.0</version>
<artifactId>datax-core_2.4_2.11</artifactId>
<version>1.2.0</version>
</dependency>
<dependency>
<groupId>com.microsoft.datax</groupId>
<artifactId>datax-utility_2.3_2.11</artifactId>
<version>1.1.0</version>
<artifactId>datax-utility_2.4_2.11</artifactId>
<version>1.2.0</version>
</dependency>
<dependency>
<groupId>com.microsoft.azure</groupId>
@@ -138,21 +138,44 @@ SOFTWARE
<dependency>
<groupId>com.microsoft.azure</groupId>
<artifactId>azure-storage</artifactId>
<version>5.3.0</version>
<version>3.1.0</version>
</dependency>
<dependency>
<groupId>com.microsoft.azure</groupId>
<artifactId>azure-documentdb</artifactId>
<version>1.16.1</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-azure</artifactId>
<version>2.7.3</version>
</dependency>
<dependency>
<groupId>org.mortbay.jetty</groupId>
<artifactId>jetty-util</artifactId>
<version>6.1.25</version>
</dependency>
<dependency>
<groupId>org.json</groupId>
<artifactId>json</artifactId>
<version>20180813</version>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
<version>2.0.0</version>
</dependency>
<dependency>
<groupId>com.microsoft.azure</groupId>
<artifactId>azure-sqldb-spark</artifactId>
<version>1.0.2</version>
</dependency>
<dependency>
<groupId>com.databricks</groupId>
<artifactId>dbutils-api_2.11</artifactId>
<version>0.0.3</version>
</dependency>
</dependencies>


<profiles>
<profile>
<id>build</id>
@@ -228,24 +251,6 @@ SOFTWARE
</execution>
</executions>
</plugin>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<version>2.4.1</version>
<configuration>
<descriptors>
<descriptor>with-dependencies.xml</descriptor>
</descriptors>
</configuration>
<executions>
<execution>
<id>make-assembly</id>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>com.github.github</groupId>
<artifactId>site-maven-plugin</artifactId>
@@ -11,6 +11,6 @@ object BatchApp {
def main(inputArguments: Array[String]): Unit = {
BlobBatchingHost.runBatchApp(
inputArguments,
config => CommonProcessorFactory.createProcessor(config).asBlobPointerProcessor())
config => CommonProcessorFactory.createProcessor(config).asBatchBlobProcessor())
}
}
@@ -0,0 +1,26 @@
// *********************************************************************
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License
// *********************************************************************
package datax.client.sql

case class SqlConf(name: String,
connectionString: String,
url: String,
encrypt: String,
trustServerCertificate: String,
hostNameInCertificate: String,
databaseName:String,
table: String,
writeMode: String,
userName:String,
password:String,
filter: String,
connectionTimeout: String,
queryTimeout: String,
useBulkCopy : Boolean,
useBulkCopyTableLock: String,
useBulkCopyInternalTransaction: String,
bulkCopyTimeout:String,
bulkCopyBatchSize:String
)
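Most `SqlConf` fields are carried as `String`, including flag-like and numeric ones such as `encrypt` and `connectionTimeout`, while `useBulkCopy` is already a `Boolean`. A minimal sketch, assuming a caller wants typed values with explicit defaults, is shown below. `SqlSettingParsingSketch`, `flag`, and `number` are hypothetical helpers and are not part of this PR.

```scala
import scala.util.Try

object SqlSettingParsingSketch {
  // Normalise a raw setting: null, empty, or whitespace-only values become None.
  private def clean(value: String): Option[String] =
    Option(value).map(_.trim).filter(_.nonEmpty)

  // Parse a string-typed flag; fall back to the default when missing or malformed.
  def flag(value: String, default: Boolean): Boolean =
    clean(value).flatMap(v => Try(v.toBoolean).toOption).getOrElse(default)

  // Parse a string-typed integer; fall back to the default when missing or malformed.
  def number(value: String, default: Int): Int =
    clean(value).flatMap(v => Try(v.toInt).toOption).getOrElse(default)
}
```

Keeping the default at the call site makes choices such as "internal transaction is false unless set" visible where the setting is read.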
@@ -0,0 +1,85 @@
// *********************************************************************
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License
// *********************************************************************
package datax.executor

import datax.config.ConfigManager
import datax.constants.{BlobProperties, JobArgument}
import datax.fs.HadoopClient
import datax.host.SparkJarLoader
import datax.securedsetting.KeyVaultClient
import org.apache.log4j.LogManager
import org.apache.spark.broadcast
import org.apache.spark.sql.SparkSession

object ExecutorHelper {
private val logger = LogManager.getLogger(this.getClass)

/***
* Create broadcast variable for blob storage account key
* @param path blob storage path
* @param spark SparkSession
*/
def createBlobStorageKeyBroadcastVariable(path: String, spark : SparkSession): broadcast.Broadcast[String] ={
addJarToExecutor(spark)
val sc = spark.sparkContext
val sa = getStorageAccountName(path)
var key = ""
KeyVaultClient.withKeyVault {vaultName => key = HadoopClient.resolveStorageAccount(vaultName, sa).get}
val blobStorageKey = sc.broadcast(key)
blobStorageKey
}

/***
* Get the storage account name from blob path
* @param path blob storage path
*/
private def getStorageAccountName(path:String):String ={
val regex = s"@([a-zA-Z0-9-_]+)${BlobProperties.BlobHostPath}".r
regex.findFirstMatchIn(path) match {
case Some(partition) => partition.group(1)
case None => null
}
}

/***
* Add azure-storage jar to executor nodes
* @param spark Spark Session
*/
private def addJarToExecutor(spark : SparkSession){
try{
logger.warn("Adding azure-storage jar to executor nodes")
withStorageAccount {(storageAccount,containerName,azureStorageJarPath) => SparkJarLoader.addJar(spark, s"wasbs://$containerName@$storageAccount${BlobProperties.BlobHostPath}$azureStorageJarPath")}
}
catch {
case e: Exception => {
logger.error(s"azure-storage jar could not be added to executor nodes", e)
throw e
}
}
}

/***
* Scope that runs an operation with the default storageAccount/container/azureStorageJarPath; the operation is skipped if any of the three settings is not defined.
* @param callback execution within the scope
*/
private def withStorageAccount(callback: (String, String, String)=> Unit) = {
ConfigManager.getActiveDictionary().get(JobArgument.ConfName_DefaultStorageAccount) match {
case Some(storageAccount) =>
logger.warn(s"Default Storage Account is $storageAccount")
ConfigManager.getActiveDictionary().get(JobArgument.ConfName_DefaultContainer) match {
case Some(containerName) =>
logger.warn(s"Default container is $containerName")
ConfigManager.getActiveDictionary().get(JobArgument.ConfName_AzureStorageJarPath) match {
case Some(azureStorageJarPath) =>
logger.warn(s"Azure storage jar path is $azureStorageJarPath")
callback(storageAccount, containerName, azureStorageJarPath)
case None => logger.warn(s"No azure storage jar path is defined")
}
case None => logger.warn(s"No default container is defined")
}
case None => logger.warn(s"No default storage account is defined")
}
}
}
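The three nested `match` blocks in `withStorageAccount` can also be expressed as a single `for` comprehension over `Option`s. The sketch below is an alternative shape, not the merged implementation: it takes a plain `Map[String, String]` in place of `ConfigManager.getActiveDictionary()`, the key constants are hypothetical stand-ins for the `JobArgument.ConfName_*` names (whose prefix is not shown in this diff), and it drops the per-setting log messages.

```scala
object WithStorageAccountSketch {
  // Hypothetical keys standing in for the JobArgument.ConfName_* names.
  val StorageAccountKey = "DEFAULTSTORAGEACCOUNT"
  val ContainerKey = "DEFAULTCONTAINER"
  val JarPathKey = "AZURESTORAGEJARPATH"

  // Runs the callback only when all three settings are present.
  // Returns true if the callback ran, false if any setting was missing.
  def withStorageAccount(settings: Map[String, String])
                        (callback: (String, String, String) => Unit): Boolean = {
    val resolved = for {
      storageAccount <- settings.get(StorageAccountKey)
      containerName  <- settings.get(ContainerKey)
      jarPath        <- settings.get(JarPathKey)
    } yield (storageAccount, containerName, jarPath)

    resolved match {
      case Some((storageAccount, containerName, jarPath)) =>
        callback(storageAccount, containerName, jarPath)
        true
      case None =>
        false
    }
  }
}
```

The trade-off is that a missing setting is no longer reported individually; the nested form in the diff logs which of the three was absent.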