Support pyspark #19
Same need, +1.
@melin @harryprince

```shell
/spark/bin/pyspark --driver-class-path nebula-spark-connector-3.0.0.jar --jars nebula-spark-connector-3.0.0.jar
```

```python
df = (
    spark.read.format("com.vesoft.nebula.connector.NebulaDataSource")
    .option("type", "vertex")
    .option("spaceName", "basketballplayer")
    .option("label", "player")
    .option("returnCols", "name,age")
    .option("metaAddress", "metad0:9559")
    .option("partitionNumber", 1)
    .load()
)
```

```
>>> df.show(n=2)
+---------+--------------+---+
|_vertexId|          name|age|
+---------+--------------+---+
|player105|   Danny Green| 31|
|player109|Tiago Splitter| 34|
+---------+--------------+---+
only showing top 2 rows
```
Could the schema information be detected automatically, like when we use Hive with its meta info?
I think you could use the nebula-python client to fetch the meta/schema more easily (it should also be possible via the Spark connector, with py4j under the hood, though I haven't tried that yet).
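Along those lines, here is a minimal sketch of turning a fetched tag schema into the connector's `returnCols` option value. The field list is hard-coded as a stand-in for what `DESCRIBE TAG player` would return from a live cluster, and `return_cols` is a hypothetical helper, not part of either library:

```python
def return_cols(fields):
    """Join (name, type) pairs into the comma-separated `returnCols` string."""
    return ",".join(name for name, _type in fields)

# Hypothetical result of `DESCRIBE TAG player` in `basketballplayer`,
# hard-coded instead of fetched via nebula-python:
player_fields = [("name", "string"), ("age", "int64")]

print(return_cols(player_fields))  # -> name,age
```

The resulting string can then be passed directly as `.option("returnCols", return_cols(player_fields))` in the read shown above.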
Both write and read examples are now provided in #55.
PySpark support would be the most useful: algorithm engineers are familiar with Python, and it is easy to use.