diff --git a/docs-2.0-en/import-export/nebula-exchange/parameter-reference/ex-ug-para-import-command.md b/docs-2.0-en/import-export/nebula-exchange/parameter-reference/ex-ug-para-import-command.md index 6431bf9d4b7..dc6a63e893b 100644 --- a/docs-2.0-en/import-export/nebula-exchange/parameter-reference/ex-ug-para-import-command.md +++ b/docs-2.0-en/import-export/nebula-exchange/parameter-reference/ex-ug-para-import-command.md @@ -2,28 +2,33 @@ After editing the configuration file, run the following commands to import specified source data into the NebulaGraph database. -- First import +## Import data - ```bash - /bin/spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange -c - ``` +```bash +/bin/spark-submit --master "spark://HOST:PORT" --class com.vesoft.nebula.exchange.Exchange -c +``` -- Import the reload file +The following table lists command parameters. - If some data fails to be imported during the first import, the failed data will be stored in the reload file. Use the parameter `-r` to import the reload file. +| Parameter | Required | Default value | Description | +| :--- | :--- | :--- | :--- | +| `--class`  | Yes | - | Specify the main class of the driver.| +| `--master`  | Yes | - | Specify the URL of the master process in a Spark cluster. For more information, see [master-urls](https://spark.apache.org/docs/latest/submitting-applications.html#master-urls). Optional values are:
`local`: Local Mode. Run Spark applications on a single thread. Suitable for importing small data sets in a test environment.
`yarn`: Run Spark applications on a YARN cluster. Suitable for importing large data sets in a production environment.
`spark://HOST:PORT`: Connect to the specified Spark standalone cluster.
`mesos://HOST:PORT`: Connect to the specified Mesos cluster.
`k8s://HOST:PORT`: Connect to the specified Kubernetes cluster.
| +| `-c`/`--config`  | Yes | - | Specify the path of the configuration file. | +| `-h`/`--hive`  | No | `false` | Specify whether importing Hive data is supported. | +| `-D`/`--dry`  | No | `false` | Specify whether to check the format of the configuration file. This parameter checks only the format of the configuration file; it does not check the validity of the `tags` and `edges` configurations, and it does not import data. Don't add this parameter if you need to import data. | +| `-r`/`--reload` | No | - | Specify the path of the reload file that needs to be reloaded. | - ```bash - /bin/spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange -c -r "" - ``` +For more Spark parameter configurations, see [Spark Configuration](https://spark.apache.org/docs/latest/configuration.html#runtime-environment). !!! note - The version number of a JAR file is subject to the name of the JAR file that is actually compiled. - - If users use the [yarn-cluster mode](https://spark-reference-doc-cn.readthedocs.io/zh_CN/latest/deploy-guide/running-on-yarn.html) to submit a job, see the following command, **especially the two '--conf' commands in the example**. + - If users use the [yarn mode](https://spark-reference-doc-cn.readthedocs.io/zh_CN/latest/deploy-guide/running-on-yarn.html) to submit a job, see the following command, **especially the two `--conf` options in the example**. ```bash - $SPARK_HOME/bin/spark-submit --master yarn-cluster \ + $SPARK_HOME/bin/spark-submit --master yarn \ --class com.vesoft.nebula.exchange.Exchange \ --files application.conf \ --conf spark.driver.extraClassPath=./ \ @@ -32,15 +37,12 @@ After editing the configuration file, run the following commands to import speci -c application.conf ``` -The following table lists command parameters. 
+## Import the reload file -| Parameter | Required | Default value | Description | -| :--- | :--- | :--- | :--- | -| `--class`  | Yes | - | Specify the main class of the driver.| -| `--master`  | Yes | - | Specify the URL of the master process in a Spark cluster. For more information, see [master-urls](https://spark.apache.org/docs/latest/submitting-applications.html#master-urls "click to open Apache Spark documents"). | -| `-c`  / `--config`  | Yes | - | Specify the path of the configuration file. | -| `-h`  / `--hive`  | No | `false` | Indicate support for importing Hive data. | -| `-D`  / `--dry`  | No | `false` | Check whether the format of the configuration file meets the requirements, but it does not check whether the configuration items of `tags` and `edges` are correct. This parameter cannot be added when users import data. | -| `-r` / `--reload` | No | - | Specify the path of the reload file that needs to be reloaded. | +If some data fails to be imported, the failed data is stored in a reload file. Use the parameter `-r` to import the data in the reload file. -For more Spark parameter configurations, see [Spark Configuration](https://spark.apache.org/docs/latest/configuration.html#runtime-environment). +```bash +/bin/spark-submit --master "spark://HOST:PORT" --class com.vesoft.nebula.exchange.Exchange -c -r "" +``` + +If the import still fails, ask for help on the [official forum](https://github.com/vesoft-inc/nebula/discussions). 
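The dry-run check and the real import described above can be sketched as a small shell script. Everything here is a hedged illustration: the Spark home, master URL, configuration file, and JAR name are placeholder assumptions, and the script only prints the commands it would run rather than submitting them.

```shell
#!/bin/sh
# All paths and addresses below are placeholder assumptions.
SPARK_HOME=/opt/spark
MASTER="spark://127.0.0.1:7077"
CONFIG=application.conf
JAR=nebula-exchange.jar   # the version number depends on the JAR actually compiled

# Shared part of the spark-submit invocation.
SUBMIT="$SPARK_HOME/bin/spark-submit --master $MASTER --class com.vesoft.nebula.exchange.Exchange $JAR"

# 1. Dry run (-D): validate the configuration file format only; no data is imported.
echo "$SUBMIT -c $CONFIG -D"

# 2. Real import: the same command without -D.
echo "$SUBMIT -c $CONFIG"
```

Printing the commands first makes them easy to inspect; dropping the `echo` would actually submit the jobs.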
\ No newline at end of file diff --git a/docs-2.0-zh/import-export/nebula-exchange/parameter-reference/ex-ug-para-import-command.md b/docs-2.0-zh/import-export/nebula-exchange/parameter-reference/ex-ug-para-import-command.md index 44b52a5fdac..260e8320683 100644 --- a/docs-2.0-zh/import-export/nebula-exchange/parameter-reference/ex-ug-para-import-command.md +++ b/docs-2.0-zh/import-export/nebula-exchange/parameter-reference/ex-ug-para-import-command.md @@ -2,28 +2,33 @@ 完成配置文件修改后,可以运行以下命令将指定来源的数据导入{{nebula.name}}数据库。 -- 首次导入 +## 导入数据 - ```bash - /bin/spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange -c - ``` +```bash +/bin/spark-submit --master "spark://HOST:PORT" --class com.vesoft.nebula.exchange.Exchange -c +``` -- 导入 reload 文件 - - 如果首次导入时有一些数据导入失败,会将导入失败的数据存入 reload 文件,可以用参数`-r`尝试导入 reload 文件。 - - ```bash - /bin/spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange -c -r "" - ``` +参数说明如下。 + +| 参数 | 是否必需 | 默认值 | 说明 | +| :--- | :--- | :--- | :--- | +| `--class`  | 是 | 无 | 指定驱动的主类。 | +| `--master`  | 是 | 无 | 指定 Spark 集群的 master URL。详情请参见 [master-urls](https://spark.apache.org/docs/latest/submitting-applications.html#master-urls)。可选值为:
`local`:本地模式,使用单个线程运行 Spark 应用程序。适合在测试环境进行小数据量导入。
`yarn`:在 YARN 集群上运行 Spark 应用程序。适合在线上环境进行大数据量导入。
`spark://HOST:PORT`:连接到指定的 Spark standalone 集群。
`mesos://HOST:PORT`:连接到指定的 Mesos 集群。
`k8s://HOST:PORT`:连接到指定的 Kubernetes 集群。
| +| `-c`/`--config`  | 是 | 无 | 指定配置文件的路径。 | +| `-h`/`--hive`  | 否 | `false` | 添加这个参数表示支持从 Hive 中导入数据。 | +| `-D`/`--dry`  | 否 | `false` | 指定是否检查配置文件的格式。该参数仅用于检查配置文件的格式,不检查`tags`和`edges`配置项的有效性,也不会导入数据。需要导入数据时不要添加这个参数。 | +|`-r`/`--reload` | 否 | 无 | 指定需要重新加载的 reload 文件路径。 | + +更多 Spark 的参数配置说明请参见 [Spark Configuration](https://spark.apache.org/docs/latest/configuration.html#runtime-environment)。 !!! note - JAR 文件版本号以实际编译得到的 JAR 文件名称为准。 - - 如果使用 [yarn-cluster 模式](https://spark-reference-doc-cn.readthedocs.io/zh_CN/latest/deploy-guide/running-on-yarn.html)提交任务,请参考如下示例,**尤其是示例中的两个**`--conf`。 + - 如果使用 [yarn 模式](https://spark-reference-doc-cn.readthedocs.io/zh_CN/latest/deploy-guide/running-on-yarn.html)提交任务,请参考如下示例,**尤其是示例中的两个**`--conf`。 ```bash - $SPARK_HOME/bin/spark-submit --master yarn-cluster \ + $SPARK_HOME/bin/spark-submit --master yarn \ --class com.vesoft.nebula.exchange.Exchange \ --files application.conf \ --conf spark.driver.extraClassPath=./ \ @@ -32,15 +37,12 @@ -c application.conf ``` -下表列出了命令的相关参数。 +## 导入 reload 文件 + +如果导入数据时有一些数据导入失败,会将导入失败的数据存入 reload 文件,可以用参数`-r`尝试导入 reload 文件中的数据。 -| 参数 | 是否必需 | 默认值 | 说明 | -| :--- | :--- | :--- | :--- | -| `--class`  | 是 | 无 | 指定驱动的主类。 | -| `--master`  | 是 | 无 | 指定 Spark 集群中 master 进程的 URL。详情请参见 [master-urls](https://spark.apache.org/docs/latest/submitting-applications.html#master-urls "点击前往 Apache Spark 文档")。 | -| `-c`  / `--config`  | 是 | 无 | 指定配置文件的路径。 | -| `-h`  / `--hive`  | 否 | `false` | 添加这个参数表示支持从 Hive 中导入数据。 | -| `-D`  / `--dry`  | 否 | `false` | 添加这个参数表示检查配置文件的格式是否符合要求,但不会校验`tags`和`edges`的配置项是否正确。正式导入数据时不能添加这个参数。 | -|-r / --reload | 否 | 无 | 指定需要重新加载的 reload 文件路径。 | +```bash +/bin/spark-submit --master "spark://HOST:PORT" --class com.vesoft.nebula.exchange.Exchange -c -r "" +``` -更多 Spark 的参数配置说明请参见 [Spark Configuration](https://spark.apache.org/docs/latest/configuration.html#runtime-environment)。 +如果仍然导入失败,请到[论坛](https://discuss.nebula-graph.com.cn/)寻求帮助。 \ No newline at end of file
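The import-then-reload flow covered by both files in this diff might be scripted as below. This is a sketch under assumptions: the reload file location is whatever your Exchange configuration writes failed records to (shown here as a hypothetical `/tmp/errors/reload`), all paths are placeholders, and the commands are echoed rather than executed.

```shell
#!/bin/sh
# Placeholder assumptions: adjust paths and the master URL to your environment.
SUBMIT="/opt/spark/bin/spark-submit --master spark://127.0.0.1:7077 \
--class com.vesoft.nebula.exchange.Exchange nebula-exchange.jar"
RELOAD_FILE=/tmp/errors/reload   # hypothetical path where failed records were written

# First pass: the normal import.
echo "$SUBMIT -c application.conf"

# Second pass: only when some records failed, retry them with -r.
if [ -f "$RELOAD_FILE" ]; then
  echo "$SUBMIT -c application.conf -r $RELOAD_FILE"
fi
```

Gating the second pass on the reload file's existence keeps the retry a no-op when the first import succeeds completely.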