[SPARK-20544][SPARKR] R wrapper for input_file_name #17818

Closed
zero323 wants to merge 5 commits

zero323 (Member) commented May 1, 2017

What changes were proposed in this pull request?

Adds an R wrapper for org.apache.spark.sql.functions.input_file_name.

How was this patch tested?

Existing unit tests, additional unit tests, check-cran.sh.

SparkQA commented May 1, 2017

Test build #76344 has finished for PR 17818 at commit 21d658d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented May 1, 2017

Test build #76357 has finished for PR 17818 at commit 7c53668.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

zero323 force-pushed the SPARK-20544 branch 3 times, most recently from 8457f01 to 2dd17dc, May 1, 2017 19:45

SparkQA commented May 1, 2017

Test build #76358 has finished for PR 17818 at commit f3ec7b7.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented May 1, 2017

Test build #76359 has finished for PR 17818 at commit 2dd17dc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

felixcheung (Member) commented May 2, 2017

Hmm, it's not clear why AppVeyor failed. You could trigger it again, without affecting Jenkins, by closing and re-opening this PR.


#' input_file_name
#'
#' Creates a string column for the file name of the current Spark task.
Member

I actually find this description in the Scala API quite confusing: what is a "Spark task", and how does it have a "file name"?

Member Author

How about the new one?

path <- tempfile(pattern = "input_file_name_test", fileext = ".txt")
write.table(iris[1:50, ], path, row.names = FALSE, col.names = FALSE)

df <- read.text(path)
Member

Does it work with df <- read.json(jsonPath)?

If yes, consider adding it to the test for column functions.

(Again, this relates to #17817, to consolidate/skip tests.)

Member Author

It should work with any file-based input, as far as I remember. I'd skip collecting, but there have been some issues with PySpark in the past.
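
For readers following the thread, here is a minimal sketch of how the read.text case could be checked interactively. It assumes a SparkR build that already includes this wrapper and a local Spark installation that SparkR can find (for example via SPARK_HOME). The file contents, the "local[1]" master, and the "file" column alias are made up for illustration and are not taken from the patch.

```r
library(SparkR)

# Assumption: a local Spark installation is available to SparkR.
sparkR.session(master = "local[1]")

# Write a small text file to read back (contents are arbitrary).
path <- tempfile(pattern = "input_file_name_example", fileext = ".txt")
writeLines(c("a", "b", "c"), path)

df <- read.text(path)

# Every row comes from the same file, so a single distinct value is expected.
# "file" is a hypothetical alias chosen only for readability.
files <- collect(distinct(select(df, alias(input_file_name(), "file"))))

# The column holds a URI rather than a bare path, so compare base names only.
print(basename(files$file))
print(basename(path))

sparkR.session.stop()
```

Swapping read.text for read.json on a JSON file would be the analogous check for the question above; whether it behaves identically is not confirmed in this thread.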

zero323 closed this May 2, 2017
zero323 reopened this May 2, 2017

zero323 (Member Author) commented May 2, 2017

> hmm, not clear why AppVeyor failed. you could trigger it again by closing and re-opening this PR
> without affecting Jenkins

Looks like I'll have to rebase it anyway, but thank you so much for the hint. I've been meaning to ask whether there is some equivalent of the Jenkins helpers.

SparkQA commented May 2, 2017

Test build #76375 has finished for PR 17818 at commit 38f43d0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

felixcheung (Member) left a comment

LGTM

#'
#' @rdname input_file_name
#' @name input_file_name
#' @aliases input_file_name,missing-method
Member

Actually, could you add @family normal_funcs here? I missed this earlier, and in the other PR too.

Member Author

Done.

SparkQA commented May 3, 2017

Test build #76397 has finished for PR 17818 at commit 72f3fb7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

zero323 changed the title from "[SPARK-20544] R wrapper for input_file_name" to "[SPARK-20544][SPARKR] R wrapper for input_file_name" May 3, 2017

#' input_file_name
#'
#' Creates a string column with the input file name for a given row
Member

this actually makes a lot more sense...

felixcheung (Member) left a comment

LGTM

felixcheung (Member) commented

merged to master

asfgit closed this in f21897f May 4, 2017
zero323 deleted the SPARK-20544 branch February 2, 2020 17:49