DataX.Flow: fix blob output issue #50
Merged
Conversation
kjcho-msft force-pushed the kjcho/bloboutput branch from 26a2f2d to 47d2147 on May 10, 2019 at 00:21.
vijayupadya approved these changes on May 10, 2019.
rohit489 added a commit that referenced this pull request on Jun 10, 2019. Its squashed commit message:
- Initial Checkin
- Update README.md
- Update README.md
- Update README.md
- Update README.md
- Update README.md
- Update README.md
- Update CONTRIBUTING.md
- Update CONTRIBUTING.md
- Update pom files in master to build against GitHub Maven repo
- Use 1.0.0 as dependencies in Maven.
- Update CONTRIBUTING.md
- Update CONTRIBUTING.md
- update to use the new sfpkg names (#4)
- Update Maven repo
- Update configgenConfigs.json
- Sync json files to latest nuget content
- Data Accelerator whitepaper
- Add files via upload (#8): adding the Introductory Whitepaper to complement the Architecture Whitepaper
- Formatting and other updates to docs. (#9)
- Update README.md
- Update issue templates
- Add GIT links to each web package
- Add descriptions to each package's readme
- Update README.md
- Replace samples local.db with template files. (#31)
- remove eval when fetching the environment variable (#33)
- remove eval when fetching the environment variable
- add check to see if localServices are defined in webComposition
- expose a new param for installing modules and use the minimumVersion for checking (#34)
- bump version of jackson dependency
- update website deploy script to be able to take a resource group as the third parameter
- Adding code to support signing the binaries for the backend service DataX.FlowManagement for the local scenario.
- Include status badge on Readme (#37) (see #23)
- Change table structure
- Add status badge to contributing.md
- update status badge table
- Update README.md
- Update README.md
- Dineshc/databricksliveq (#40)
- Added support for Live Query for the Databricks stack. It still needs work for: 1. initialization steps; 2. the HDFS story (for reference data, etc., and for kernel garbage collection)
- Removed obj files
- Created utility class for HttpClient for Databricks
- Add multiple credentials to the AAD app (#39)
- Update README.md
- Protect the service by not loading non-managed assemblies (#42)
- Protect the service by not loading non-managed assemblies from the storage account. For native DLLs, Assembly.Load can throw BadImageFormatException; this change wraps that call in a try-catch so the assembly is safely skipped rather than loaded, which is required for creating the CompositionContainer
- Added logging in the exception handler
- Changed the logger area to Startup when logging exceptions while loading assemblies from the storage account
- Edited comments
- Edited comments
- add support to simulate data into kafka (#43)
- add support to simulate data into kafka
- fetch broker from connection string
- rename Broker to BootstrapServers
- add license header (#44)
- fix ensurejobstate infinite retries (#46)
- Kjcho/add kafka input (#41)
- UI: add eventhub for kafka input support
- UI: add eventhub for kafka input support
- changed to Kafka (Event Hub)
- add native kafka support
- add snapshot to the package version
- simulate data into kafka HDInsight (#47)
- Adding contents for creating a docker container for Flow.ManagementService
- fix blob output issue (#50)
- remove the check for the Azure function key on UI (#52)
- DataX.Flow: fix the bugs in the azure function config generation (#53)
- Add support for reading from multiple json schema files and fixed issue where the 1st field no longer needs to be of struct type (#55)
- rev maven package version (#56)
- rev maven package version
- update a few more files
- remove filter to include only jar
- Adding the required files for creating and deploying the Pod Identity Service. This code also adds the required DockerFile and finalrun.sh for creating the docker container for the Kubernetes scenario
- Updating the line endings to be LF and not CRLF for the DockerFile and the yaml
- Datax.Flow: Add kafka support (#45)
- NPOT: add Kafka support
- NPOT: add Kafka support
- update to support the native Kafka
- change AutoOffsetReset to Latest for kafka sampling
- Move the hard-coded cacert source to the keyvault
- update not to create consumer groups for EventHub Kafka
- Enable http post functions and keyvault retrieval fix. (#54)
- Enable http post functions and keyvault retrieval fix.
- Take function baseUrl instead of host name in config. Also updated pom files.
- Add null checks for function params
- Web: Remove subscription and resource group input fields for Kafka, and update labels for Azure Function (#64)
- remove subscription field for kafka input and update azure function labels
- remove subscription field for kafka input and update azure function labels
- remove subscription field for kafka input and update azure function labels
- update the pipeline version in the website package.json
- Converting the DataX.Metrics app from being stateful to stateless
- Removing redundant code
- Add kafka input support for data processing (#57)
- Support for EventHub for Kafka as input
- Eventhub for kafka input changes
- Support for native kafka
- Support for native kafka
- kafka producer
- Updated pom file and a few minor updates to comments.
- Add license header to new files.
- pom file updates to revert spark version.
- Adding more headers
- Add comments to new files.
- Revert Spark 2.4 signature change for UDF.
- ignore codesign
- Define constants and convert "return null" to throwing an exception.
- Making tweaks to the DataX.Metrics.Ingestor_InstanceCount value and logging the exception message
- Remove special case when the request comes from office software since it is no longer being used. (#67)
- DataProcessing: Add missing dependencies for Kafka and update nuspec to include them (#69)
- Add missing dependencies and update nuspec to include them
- Add missing dependencies and update nuspec to include them
- add rel="noopener noreferrer" to links (#68)
- add rel="noopener noreferrer" to links to prevent reverse tabnabbing
- rev versions of all packages
- Update proton-j and jackson lib versions. (#71)
- ARM deployment: add kafka support (#70)
- ARM deployment: add kafka support
- update based on feedback
- update proton version for the kafkaDataXDirect template, which was missing from a previous commit
- Remove snapshots
- Remove snapshots from Spark.nuspec
- remove snapshot
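One recurring item in the list above is the #68 change adding rel="noopener noreferrer" to links. A minimal sketch of that kind of fix, assuming a hypothetical helper name withSafeRel (not the repo's actual code):

```javascript
// Reverse tabnabbing: a page opened via target="_blank" can navigate
// the opener through window.opener unless rel="noopener" is set.
// This helper adds the safe rel tokens to any _blank link's attributes.
function withSafeRel(attrs) {
  if (attrs.target !== '_blank') return attrs; // only _blank links are at risk
  const rel = new Set((attrs.rel || '').split(/\s+/).filter(Boolean));
  rel.add('noopener');
  rel.add('noreferrer');
  return { ...attrs, rel: [...rel].join(' ') };
}

console.log(withSafeRel({ href: 'https://example.com', target: '_blank' }).rel);
// → noopener noreferrer
```

noreferrer additionally suppresses the Referer header and implies noopener in modern browsers, which is why the two are usually added together.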
kjcho-msft added a commit that referenced this pull request on Oct 8, 2019.
Its squashed commit message repeats the Jun 10 commit list above, then continues:

- Merge master into rc (#124)
- Adding a configurationsource for use in ASP
- Adding dependencies
- Fixing name
- Modifying for new settings and optional SF
- Added Configuration parameter to StartUpUtil, not being used yet
- Moved service fabric configuration to servicehost; added settings constants; modified startuputil for DataXSettings
- Reducing surface area of SF config calls
- Adding SF to config builder
- Correcting erroneous default value
- Runs and apis are hittable
- Removing dev strings
- Adding gitignore entry
- Modifying dev default auth connection string
- Modifying launch api to simple get call
- Switching connection string
- Adding onebox
- Updating dev settings
- Modified the sf configuration source to better conform to customizable settings; began work on local scenario
- Using constant instead of literal
- Enabling more of the local scenario
- Modified mef extension to generate ILoggers for classes automatically
- Updating settings for onebox scenario
- Adding authentication
- Begin adding auth attribute; not working yet
- Adding auth scenario; needs cleaning
- Disable RequireAuthenticatedUser if in OneBox mode
- Revert "Disable RequireAuthenticatedUser if in OneBox mode" (this reverts commit 547f49b)
- Removing authentication requirement from OneBox scenario
- Temporary: modified for rerouteService scenario
- Minor code improvements
- Removing unnecessary project dependency
- Added a gateway policy for compatibility in the SF auth scenario; moved some auth classes to be better accessed; RoleCheck changed to consider the SF environment
- Cleanup
- Reverting auto changes
- Renaming appinsights setting to name in SF
- Modified gateway policy for fix in SF
- Updating SF xmls with EnableOneBox
- Added varying documentation by request
- Refactored settings into contract; added todos for consolidation
- Some extensibility improvements
- Reconfigured to be IStartupFilter instead of a simple startup
- Tylake/configuration (#51) (its squashed items repeat the configuration changes listed above)
- Adding onebox to settings
- Renaming based on feedback
- Improved startup experience
- Added documentation; removed original startup for Flow
- Converting startup and settings
- Updating appsettings
- Modifying startups
- Adding auth attributes
- Fixing usings
- Deleting old settings classes
- Simplifying call
- Adding EnableOneBox setting for SF
- Added comments by request
- Removing inaccurate comment
- Missed override
- Fixing the backend services after the refactoring of the various startups in DataX.Flow: updating AppSettings.json and appsettings.Development.json, adding app.UseAuthentication(); in DataXServiceStartup.cs, and tweaking the way we get the settings in StartUpUtil.cs
- Removed the environment variable value from Services/DataX.Flow/Flow.SchemaInferenceService/Properties/launchSettings.json
- reverting the typo from /Flow.InteractiveQueryService/appsettings.Development.json
- 1. Adding IConfiguration in each of the controllers so that the appSettings properties/parameters are available for the EngineEnvironment call, passing this IConfiguration to other classes as needed; 2. fixing DataX.Flow.sln, since it previously gave an error because one of the projects was missing from the solution
- Remove the auth connection string
- Adding support for an optional secret that sets a json object holding a key/value pair for each service name and the IP address where the service can be reached
- Enable batch processing from blob input
- Batch processing from blob input
- Adding the json parsing logic in securedSettings.js for the optional secret for kubernetes services
- Updating readme.md so that customers know how to listen to services deployed on the AKS cluster.
- Adding a few checks for when the Kubernetes secret may not be present altogether
- Updating the comments and readme along with some of the checks
- Making metrics.sln compile by adding a missing project. Also fixing a merge conflict.
- Adding the missing project DataX.ServiceHost to DataX.Gateway.sln
- Nuget Restore on DataX.Gateway.sln fixed
- Adding the licensing header to some newly added files where it was missing. Making a few tweaks based on PR feedback
- Adding the EnableOneBox parameter under ConfigOverrides for DataX.Flow. This is needed for ServiceFabric.
- Updating FinalRun.sh and appsettings.json for Flow.ManagementService to enable the OneBox scenario
- Removing the local values for LocalRoot and SparkHome because these vary depending upon the environment.
- Fixing the signing issue related to the DataX.ServiceHost project and removing the folders DataX.ServiceHost and SolutionItems from the DataX.Flow solution
- Adding the Microbuild NuGet package to the DataX.ServiceHost project
- Add sql output
- Refactor output manager to enable sync'ing non-json data; add SQL server output
- Kjcho/revert (#77)
- Revert "- Refactor output manager to enable sync'ing non-json data" (this reverts commit 3d3c546)
- Revert "Add sql output" (this reverts commit d143bef)
- Rev spark version to 2.4 and add support for secret scope (#79)
- Rev scala version to 2.4 and add support for secret scope
- Synchronize calls to dbutils
- change artifact id to 2.4
- rev dependencies version
- ARM: remove snapshot from the templates to unblock ARM deployment from master branch (#82)
- Enable databricks support on Web (#80)
- Updating the website code to extract out the Query package named datax-query. This datax-query package content used to be part of the datax-pipeline package; it is being extracted so it can be used by other customers who do not want a dependency on the datax-pipeline package.
- Updating the versions for all the packages
- cleaning the code and removing the comments
- Updating the package.json for datax-common since we don't require the jsoneditor and monaco editor packages
- Refactored the Dockerfile and finalrun.sh so that CI/CD can be enabled easily by passing in the service name as a parameter. Also adding the yaml files for each service, which will need the parameters to be passed in prior to deploying the service to the Kubernetes cluster (just as we do when deploying to the service fabric cluster)
- Making a few tweaks: updating all files to be LF instead of CRLF, adding quotes for servicedllname, and adding a new line for each of the Dockerfile and finalrun.sh
- Adding a helper function ConvertFlowToQueryMetadata. This creates the object that contains all the parameters needed by the datax-query package. Also cleaning up the code a bit and addressing PR feedback.
- Adding a comment header for the new function ConvertFlowToQueryMetadata
- Removing the dupe style under datax-pipeline.
- Refactor OutputManager to configure outputting non-json data easily (#78); add SqlServer output
- SQL Output: UI and config gen service updates (#86)
- SQL Output: UI and config gen service updates
- Minor updates to UI based on review feedback.
- Flattener template update and minor UI tweaks
- Update package version.
- Fix a typo
- Fixing a few bugs found while testing: the query was not getting updated when calling codegen in the UI, and the deploy button was not getting enabled when the query was dirty
- Fixing a few bugs found while testing: the query was not getting updated when calling codegen in the UI, and the deploy button was not getting enabled when the query was dirty
- Removing the redundant code from datax-pipeline. Removing the term "flow" from datax-query.
- Rev'ing the package versions for each of the packages
- Fix pom for datax-host (#88)
- Removing some redundant code and calling into the QueryActions initQuery function
- Updating the version of packages in package.json for datax-home
- Removing the style.css import from website to datax-pipeline
- Fixing the memory heap issue caused by monacoeditor being imported twice for the datax-pipeline package. The solution is to create a common control, MonacoEditorControl, that can be consumed by both the datax-query and datax-pipeline packages. This commit also removes the need to include monaco editor and jsoneditor in other packages.
- Rev'ing the version of all website packages
- Adding react-monaco-editor and the Monaco Editor plugin to peerDependencies in package.json for datax-query and datax-pipeline
- Enable batch processing of blob input (#90)
- Updating the package versions. Updating the code to use MonacoEditorControl as defined in the datax-query package. Removing the dependency on react-monaco-editor and the plugin in datax-pipeline
- pass in databricks token for live query (#92)
- Databricks support in services (#87)
- Databricks support in services
- Fix autoscale and use flow-specific token to send requests to databricks
- Fix live query
- refactor uriPrefix to be handled by keyVaultClient
- Codesign DatabricksClient (#93)
- Fixing and adding a check for Databricks vs HDInsight when saving and resolving the spark job params
- Migrate to latest jackson.core databind artifact.
- Updating all projects to use .NET Core 2.2 to resolve the component governance issue
- Updating a few more references and NuGets to 2.2
- Updating the LivyClient.Test project as well
- Updating Gatewaycloud.xml
- reverting the unintentional signing change
- Update resource files to have unit tests pass again.
- Adding support for reading the whitelisted ClientID AAD app from KeyVault
- Adding the whitelisting logic to RolesCheck for the ScenarioTester
- Renaming the helper and tweaking the logic
- more tweaks to the code
- Removing the redundant project dependencies in DataX.Utilities.Web
- Adding the parameter in appsettings.json as well; this will be useful when we add support for kubernetes. Addressing some PR feedback: adding a header and renaming the helper method that adds the whitelisted client user id for testing purposes.
- Updating the whitelisted clientId value and the code for handling a list of whitelisted clientIds, which are of the format {objectIdentifier}.{tenantId} so that each is unique
- Updating the SimulatedData service to .NET Core 2.2
- provide a scenario tester to run through actions on a host in sequence and in parallel; enables creating simulated test loads
- fixed namespaces and nuspec as per PR feedback
- Adding dependency for NewtonSoft.Json in the nuspec
- Adding the signing requirements and updating the .nuspec for the ScenarioTester so that it is packaged with its own NuGet dependencies.
- bug fixes: spark nuspec, iothub sku, simulator service num events (#102)
- Fix unit tests (#104)
- Fix unit tests
- make sparkType property optional, add end-to-end test for databricks, fix config.local test
- Flow service: Add Blob support (#89)
- The flow service: Add Blob support
- The flow service: Add Blob support
- move the kv secret resolution from FlattenJobConfig to GenerateJobConfigBatching
- add tests for batch and update based on feedback
- enable sign for the new projects
- merge w/ master
- updated based on feedback
- update based on feedback
- fixing tests
- fixing tests
- fixing tests
- clean up code
- For databricks, access the azure storage account using the account key from keyvault (#105)
- For databricks, access the azure storage account using the account key from keyvault
- added comment
- remove fileUrl as global variable
- Refactor DataX.Flow to support batch scenarios better (#106)
- refactor DataX.Flow to support batch better
- refactor DataX.Flow to support batch better
- Updated based on feedback
- update based on feedback
- Web: add batching support (#91)
- initial commit for Blob input support
- initial commit for Blob input support
- Web: add batch support
- update the logic for the save button and the label for the schedule tab
- use a datetime picker control for starttime and endtime
- update based on feedback
- merge with master and clean up code
- update the version for all packages
- Update the microbuild signing cert to sha2 (#107)
- Update the microbuild signing cert to sha2
- ARM: assign the writer role to the service AAD app (#109)
- ARM: assign the writer role to the service AAD app
- Refactor the Set-AzureAADApiPermission function not to pass the roleId as a param
- Update to use .netcore 2.2 and aspnetcore 2.2.6 (#111)
- Update .netcore 2.2 and aspnetcore to 2.2.6
- update netcoreapp version in nuspec
- Migrate to latest databind component. (#113)
- Fix the way to read the blob for the GetSchema feature (#112)
- Fix the way to read the blob for the GetSchema feature
- Fix the way to read the blob for the GetSchema feature
- update based on feedback
- add header comments and update based on feedback
- add one more test for this pattern: {yyyy/MM/dd}
- update based on feedback
- update based on feedback
- update datax spark jar version (#114)
- ARM support for databricks (#108)
- ARM support for databricks
- PR feedback
- create databricks in an existing vnet, update hdinsight kafka zookeeper vmsize to standard_a4_v2, add a port 443 rule for hdinsight
- Update GetSampleEvents for Kafka to run asynchronously (#115)
- Add null checks for guiConfig() as some custom configs don't have a gui section (#116)
- add null checks for guiConfig() before we use it, as not all configs have a gui section
- add null checks for guiConfig() before we use it, as not all configs have a gui section
- clean up code
- updated the sample data for the EndToEndGenerationCustom test
- For databricks livequery, mount the storage account container (#118)
- For databricks livequery, mount the storage account container
- PR feedback
- add license header and move methods to helper
- Add code coverage options for gathering CC results in validation (#119)
- Add stop and get jobs unit tests
- update to test
- add coverage settings
- fix path in cc
- add livy test project
- merge solution
- Feedback and revert .sln file
- Change parameters for the MountStorage method to only use the required properties instead of the complete flowConfigObject (#123)
- Update proj to force pdb and exclude other tests binaries (#121)
- avoid creation of datax-host-with-dependency jar (#125)
- reset query each time a new flow is opened (#129)
- Enable deploy button even after saving the flow (#130)
- Remove restriction to deploy if the Flow has been saved. Remove restriction to deploy if the file has already been saved.
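Several items above (#112, and the {yyyy/MM/dd} test) concern reading blobs behind date-partitioned paths. A minimal sketch of expanding such a pattern for a given UTC date; the helper name and the path are illustrative, not the DataX implementation:

```javascript
// Hypothetical sketch: expand a {yyyy/MM/dd}-style token in a blob path
// into a concrete prefix for a given UTC date.
function expandDatePattern(path, date) {
  return path.replace(/\{(.+?)\}/, (_, fmt) =>
    fmt
      .replace('yyyy', String(date.getUTCFullYear()))
      .replace('MM', String(date.getUTCMonth() + 1).padStart(2, '0')) // 1-based month
      .replace('dd', String(date.getUTCDate()).padStart(2, '0')));
}

console.log(expandDatePattern('wasbs://logs/{yyyy/MM/dd}/', new Date(Date.UTC(2019, 4, 10))));
// → wasbs://logs/2019/05/10/
```

A batch job can then list only the blobs under the expanded prefix for the date range it is processing, instead of scanning the whole container.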
- SNAPSHOT-
- Rev package and dependencies
- Update the settings name for sql output and fail fast if null. (#131)
- Update the settings name and fail fast if null.
- Update to use the right get function
- fix live query involving udf in databricks and add dependency jars for various outputs (#133)
- fix live query involving udf in databricks and add dependency jars for various outputs
- PR feedback
- Adding JobRunner Service and the first DataX mainline job that calls … (#127)
- Adding JobRunner Service and the first DataX mainline job that calls into the ScenarioTester. All sensitive info is in the KeyVault
- Refactoring the storage utility files to be part of the DataX.Utility.Blob project and updating the gitignore file to include appsettings.Development.json
- Removing duplicate code for InstanceExportDescriptorProvider. Creating a new utility project: DataX.Utility.Composition.
- Updating the method GetExportDescriptors. Also updating the namespace for the storage utility classes.
- Adding signing for the DataX.Utilities.Composition project (#137)
- return job status after the job has been stopped (#136)
- return job status after the job has been stopped
- add unit tests and add max retries to fetch job state when the job is in the process of termination
- fix bugs: metrics dashboard and switching mode, and also enable scro… (#132)
- fix bugs: metrics dashboard and switching mode, and also enabling scroll for the job page
- updated the package versions
- update based on feedback
- added some unit tests for the helper functions in the ConfigDeleter API
- Enable scrollbar for Input panel
- Databricks: fix output to blobs (#138)
- Databricks: fix output to blobs
- PR feedback
- create a function to create the broadcast variable and change the return type of the resolveStorageAccount method to Option[String]
- create a method to set the Storage Account Key on the Hadoop Conf
- Fix batch job for databricks (#141)
- Fix batch job for databricks
- move createBlobStorageKeyBroadcastVariable to its own class, refactor storage account key retrieval, add a new default storage account environment variable
- rename file
- PR feedback
- ARM: for the sample deployment script, pass the servicefabric cluster name to the utility module (#142)
- Enable rerunning jobs that were previously in error state (#143)
- Enable rerunning jobs that were previously in error state
- Add unit test
- Do a better job of handling the case where there is no batch job to deploy (#144); clean up job names for the samples, as for a job which hasn't been deployed it should be null
- Change bulkInsert UI flag data type to bool (#147)
- Change bulkInsert UI flag data type to bool
- Change UseBulkInsert datatype in the sql output model
- Make the bool nullable since it's optional.
- Set internal transaction to false by default for bulk insert; the API we are using doesn't accept this being set to true. (#146)
- Adding steps for a new JobRunner job calling into ScenarioTester. (#145)
- Adding steps for a new JobRunner job calling into ScenarioTester. These steps call into and test the APIs within InteractiveQueryService, LiveDataService and SchemaGeneratorService. Also adding support for running the JobRunner on both Databricks and HDInsight clusters.
- Updating the code per PR feedback, essentially updating one of the parameters' names. Removing some redundant code
- Refactoring the code a little to create a helper class and a helper method for constructing the Initialization Kernel json object
- Remove databricks token from APIs (#149)
- Remove databricks token from APIs. Create new save button for databricks token. Fix delete kernel API
- PR feedback
- Enable databricks/HDInsight env validation check for save button
- remove isDatabricksSparkType state from flowDefinitionPanel
- create secretScopePrefix constant
- Extract status code and error message from response string
- add try catch
- update jackson bit (#150)
- Fetch value from promise of isDatabricksSparkType (#151)
- Fetch value from promise of isDatabricksSparkType
- update package version
- PR feedback
- For databricks, disable autoscale by default (#152)
- updated the version of datax packages
No description provided.