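I'm a long-time C# programmer but just getting my feet wet with .NET for Apache Spark. Following many "getting started" instructions and videos, I installed 7-Zip, Java 8, Apache Spark (downloaded from https://spark.apache.org/downloads.html), .NET for Apache Spark v2.1.1, and WinUtils.exe. I'm running this on Windows 10.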
Problem:
When I call DataFrame.Show() after doing a DataFrame.WithColumn() using a UDF, I always get an error:
[2023-02-07T15:45:31.3903664Z] [DESKTOP-H37P8Q0] [Error] [TaskRunner] [0] ProcessStream() failed with exception: System.ArgumentNullException: Value cannot be null. Parameter name: type
Note that the same error appears when executing many different methods on the DataFrame object, but only after a call to the WithColumn method that uses a UDF. In this case, the code looks like this:
// user defined function
Func<Column, Column, Column> GetSubst = Udf<string, string, int>(
    (strOrder, strPlayers) =>
    {
        return GetSubstance(strOrder, strPlayers);
    });

// call the user defined function and add a new column to the dataframe
ordersFrame = ordersFrame.WithColumn(
    "substance",
    GetSubst(ordersFrame["names"], ordersFrame["players"]).Cast("Integer"));

// *** This is where the error will be thrown, but if I comment it out, the same error will be thrown later
// print out the data
ordersFrame.Show(20, 20, false);
However, I've tried it with other UDFs followed by other DataFrame method calls, and I always get the same error. In the Main() function, you will see a later foreach loop. If I comment out the ordersFrame.Show() call and uncomment the contents of the loop, I get the same error when I access row.Values[0].ToString().
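For anyone trying to reproduce this without the attachments, the pattern described above can be sketched as a small standalone program. This is only a sketch under assumptions: the one-row DataFrame built with spark.Sql, its made-up values, the class name UdfRepro, and the body of GetSubstance are hypothetical stand-ins, not the code from the attached Program.cs. It uses the foreach loop over Collect() in place of Show(), as described in the paragraph above.

```csharp
using System;
using Microsoft.Spark.Sql;
using static Microsoft.Spark.Sql.Functions;

public class UdfRepro
{
    // Hypothetical stand-in: the real GetSubstance lives in the attached Program.cs.
    public static int GetSubstance(string strOrder, string strPlayers)
    {
        return (strOrder?.Length ?? 0) + (strPlayers?.Length ?? 0);
    }

    public static void Main()
    {
        SparkSession spark = SparkSession.Builder().AppName("UdfRepro").GetOrCreate();

        // Assumption: a one-row frame with made-up values replaces OrderList.csv.
        DataFrame ordersFrame = spark.Sql("SELECT 'a,b' AS names, 'x,y' AS players");

        // Same UDF shape as in the report: two string inputs, int result.
        Func<Column, Column, Column> getSubst = Udf<string, string, int>(
            (strOrder, strPlayers) => GetSubstance(strOrder, strPlayers));

        ordersFrame = ordersFrame.WithColumn(
            "substance",
            getSubst(ordersFrame["names"], ordersFrame["players"]));

        // Collect() forces the UDF to run on the worker, just as Show() does.
        foreach (Row row in ordersFrame.Collect())
        {
            Console.WriteLine(row.Values[0].ToString());
        }

        spark.Stop();
    }
}
```

If a program this small fails with the same exception, the cause is more likely the environment than the application code. If it succeeds, the difference from the real Program.cs would be the place to look.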
I wonder if I have missed something in my installation?
Desktop (please complete the following information):
OS: Windows 10
Browser: n/a
Version: see above
Well, it has been 5 days and I'm getting crickets.
I noticed that other questions have gone without responses for long periods of time, and those that did get responses had to wait weeks, if not months.
Should I interpret this to mean that .NET for Apache Spark is sundowned and no longer supported?
Is this a dead product and we should not incorporate it in new development?
I'm a long-time C# programmer but just getting my feet wet with .NET for Apache Spark. Following many "getting started" instructions and videos, I installed:
7-Zip
Java 8
I downloaded Apache Spark from https://spark.apache.org/downloads.html
.NET for Apache Spark v2.1.1
WinUtils.exe
I'm running this on Windows 10.
TestCases.csv looks like this:
TestCases.csv
OrderList.csv looks like this:
OrderList.csv
Here is the Program class of the TestSparkApp console project:
Program.cs.txt
and supporting classes:
Player.cs.txt
Collector.cs.txt
Here is the output of the above app:
TestSpartAppOutput.txt