refine readme (#1785)
ZhengshuaiPENG authored May 18, 2022
1 parent c39d83a commit 9c50863
Showing 1 changed file with 71 additions and 156 deletions.
227 changes: 71 additions & 156 deletions README.md
@@ -4,29 +4,8 @@
<img src="/images/Byzer_Logo.png" alt="drawing" width="200"/>
</p>

## TOC
* [Byzer-Lang](#byzer-lang)
* [Byzer Code Example](#byzer-code-example)
* [Byzer Architecture](#byzer-aritchitechture)
* [Official WebSite](#official-website)
* [VSCode Extension(MacOS、Linux)](#vscode-extensionmacoslinux)
* [Docker Sandbox](#docker-sandbox)
* [Pulling Sandbox Docker Image](#pulling-sandbox-docker-image)
* [Start Container](#start-container)
* [Download Byzer](#download-byzer)
* [Building a Distribution](#building-a-distribution)
* [Prerequisites](#prerequisites)
* [Downloading Source Code](#downloading-source-code)
* [Building Spark 2.4.3 Bundle](#building-spark-243-bundle)
* [Building Spark 3.1.1 Bundle](#building-spark-311-bundle)
* [Building without Chinese Analyzer](#building-without-chinese-analyzer)
* [Building with Aliyun OSS Support](#building-with-aliyun-oss-support)
* [Deploying](#deploying)
* [How to contribute to Byzer-Lang](#how-to-contribute-to-byzer-lang)
* [Contributors](#contributors)
* [WeChat Group](#wechat-group)

## Byzer-Lang

## Byzer-lang

**Byzer** (formerly MLSQL) is a low-code, open-source, distributed programming language for data pipelines, analytics, and AI in a cloud-native way.

@@ -38,175 +17,110 @@ We believe that everything is a table, a simple and powerful SQL-like language c

![Byzer-lang Arch](images/Byzer-arch.png)

### Byzer Code Example


```sql
load hive.`raw.stripe_discounts` as discounts;
load hive.`raw.stripe_invoice_items` as invoice_items;

select
invoice_items.*,
case
when discounts.discount_type = 'percent'
then amount * (1.0 - discounts.discount_value::float / 100)
else amount - discounts.discount_value
end as discounted_amount
You can build a data product on the Byzer engine and Byzer-lang without interacting directly with a computing framework such as Spark in your data app. This simplifies your data app significantly.

from invoice_items
For example, Byzer org contributes a data app [Byzer Notebook](https://github.com/byzer-org/byzer-notebook), which provides notebook interaction & workflow GUI interaction.

left outer join discounts
on invoice_items.customer_id = discounts.customer_id
and invoice_items.invoice_date > discounts.discount_start
and (invoice_items.invoice_date < discounts.discount_end
or discounts.discount_end is null)
as joined;
### BIP (Byzer Improvement Proposal)

The Byzer project uses [BIP](https://github.com/byzer-org/byzer-lang/wiki) for community collaboration; you can check out feature and architecture designs there.

### Online Trial

select
You can access the official website [https://www.byzer.org/](https://www.byzer.org/) and try Byzer-lang & Byzer Notebook online.

id,
invoice_id,
customer_id,
coalesce(discounted_amount, amount) as discounted_amount,
currency,
description,
created_at,
deleted_at

from joined
as final;
### Download

You can download Byzer engine via:
- [https://download.byzer.org/](https://download.byzer.org/)
- [Byzer Docker Hub](https://hub.docker.com/u/byzer)
- [Github Release](https://github.com/byzer-org/byzer-lang/releases)

For more details, please refer to the [docs](https://docs.byzer.org/#/byzer-lang/zh-cn/installation/README)

set allColumns = "all,wow";
### Install

!if ''' split(:allColumns,",")[0] == "all" ''';
select * from final as final2;
!else;
select id,invoice from final as final2;
!fi;
For **dev/test** purposes, download the [Byzer All In One Package](https://docs.byzer.org/#/byzer-lang/zh-cn/installation/server/byzer-all-in-one-deployment), extract it, and run the commands below:

select * from final2 as output;
```
$ cd {BYZER_HOME}
$ ./bin/byzer.sh start
```


## Official WebSite
For **production** purposes, we recommend using the [Byzer Server Package](https://docs.byzer.org/#/byzer-lang/zh-cn/installation/server/binary-installation) and deploying it on Hadoop.

[https://www.byzer.org](https://www.byzer.org)

## Notebook Support
You can also install [Byzer VSCode Extension](https://docs.byzer.org/#/byzer-lang/zh-cn/installation/vscode/byzer-vscode-extension-installation) to use Byzer-lang.

[byzer-notebook](https://github.com/byzer-org/byzer-notebook)
For the Docker image, please refer to the [docs](https://docs.byzer.org/#/byzer-lang/zh-cn/installation/README)


## VSCode Extension(MacOS、Linux、Windows)

[VSCode IDE Extension](https://github.com/byzer-org/byzer-desktop)
### Byzer Code Example

[More document about byzer-lang vscode extension(Chinese version)](https://docs.byzer.org/#/byzer-lang/zh-cn/installation/desktop-installation)
The example below shows how to process the GitHub API as a table to get information about the Byzer org:

## Docker Sandbox (With Notebook)
```sql
-- Get Github Organization Info

```
export MYSQL_PASSWORD=${1:-root}
export SPARK_VERSION=${SPARK_VERSION:-3.1.1}
export MLSQL_VERSION=${MLSQL_VERSION:-2.2.0-SNAPSHOT}
docker run -d \
-p 3306:3306 \
-p 9002:9002 \
-p 9003:9003 \
-e MYSQL_ROOT_HOST=% \
-e MYSQL_ROOT_PASSWORD="${MYSQL_PASSWORD}" \
--name mlsql-sandbox-${SPARK_VERSION}-${MLSQL_VERSION} \
mlsql-sandbox:${SPARK_VERSION}-${MLSQL_VERSION}
```
-- set API URL and params
SET org_name="byzer-org";
SET GITHUB_ORGANIZATION_URL="https://api.github.com/orgs/${org_name}";

Then you can visit `http://127.0.0.1:9002` .
-- Load Github Organization API as table
LOAD Rest.`$GITHUB_ORGANIZATION_URL`
where `config.connect-timeout`="10s"
and `config.method`="GET"
and `header.accept`="application/vnd.github.v3+json"
as github_org;


## Download Byzer
-- decode API response from binary to a json string
select string(content) as content from github_org as response_content;

* The latest stable version is release-2.2.0
* You can download from [Byzer Download Website](https://download.byzer.org)
* Spark 2.4.3/3.1.1 have been tested
-- expand the json string
run response_content as JsonExpandExt.`` where inputCol="content" and structColumn="true" as github_org;

***Naming Convention***
-- retrieve user information and process as a table
select content.* from github_org as org_info;

mlsql-engine_${spark_major_version}-${mlsql_version}.tgz
```shell
## Pre-built for Spark 2.4.3
byzer-lang-2.4.3-2.1.0.tar.gz
-- save the table to delta lake
save overwrite org_info as delta.`github_info_db.byzer_org`;
```
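
For readers unfamiliar with Byzer-lang, the steps in the example above (build the URL from a variable, decode the binary response, expand the JSON into columns) can be sketched in plain Python. This is only an illustration, not how the Byzer engine implements them: the function names are hypothetical, and `SAMPLE_CONTENT` is a made-up payload standing in for the real API response so that the sketch runs offline.

```python
import json

# Made-up stand-in for the binary `content` column returned by the Rest
# data source; the values are illustrative, not a real API response.
SAMPLE_CONTENT = b'{"login": "byzer-org", "id": 1, "public_repos": 2}'


def build_org_url(org_name: str) -> str:
    """Mirror the SET step: substitute the org name into the API URL."""
    return f"https://api.github.com/orgs/{org_name}"


def decode_content(content: bytes) -> str:
    """Rough analogue of `select string(content)`: bytes to a JSON string."""
    return content.decode("utf-8")


def expand_json(json_text: str) -> dict:
    """Rough analogue of the JSON expansion step: one key per column."""
    return json.loads(json_text)


if __name__ == "__main__":
    url = build_org_url("byzer-org")
    row = expand_json(decode_content(SAMPLE_CONTENT))
    print(url)
    print(sorted(row))
```

Each function maps to one statement in the Byzer script; the final `save` step has no counterpart here because the sketch keeps everything in memory.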

## Pre-built for Spark 3.1.1
byzer-lang-3.1.1-2.1.0.tar.gz
```

## Building a Distribution
### Prerequisites
- JDK 8+
- Maven
- Linux or MacOS
For more details about the Byzer-lang grammar, please refer to the user manual [Byzer-lang Grammar](https://docs.byzer.org/#/byzer-lang/zh-cn/grammar/outline)

### Downloading Source Code
```shell
## Clone the code base
git clone https://github.com/byzer-org/byzer-lang.git
cd byzer-lang
```
### Development

### Building Spark 2.4.3 Bundle
```shell
export MLSQL_SPARK_VERSION=2.4
./dev/make-distribution.sh
```
1. Fork the repository and clone

### Building Spark 3.1.1 Bundle
```shell
export MLSQL_SPARK_VERSION=3.0
./dev/make-distribution.sh
```
### Building without Chinese Analyzer
```shell
## Chinese analyzer is enabled by default.
export ENABLE_CHINESE_ANALYZER=false
./dev/make-distribution.sh <spark_version>
git clone https://github.com/{YOUR_GITHUB}/byzer-lang.git
```

## Deploying
1. [Download](#Download) or [build a distribution](#Build)
2. Install Spark and set environment variable SPARK_HOME, make sure Spark version matches that of MLSQL
3. Deploy tgz
- Set environment variable MLSQL_HOME
- Copy distribution tar ball over and untar it

4.Start Byzer in local mode
```shell
cd $MLSQL_HOME
## Run process in background
nohup ./bin/start-local.sh > ./local_mlsql.log 2>&1 &
```
5. Open a browser and type in http://localhost:9003, have fun.

Directory structure
```shell
|-- mlsql
|-- bin
|-- conf
|-- data
|-- examples
|-- libs
|-- main
|-- README.md
|-- LICENSE
|-- RELEASE
```
2. Open the project in IntelliJ IDEA and choose Scala version `2.12.10`

3. In the IntelliJ IDEA Maven settings, enable the profiles below
- gpg
- local
- scala-2.12
- spark-3.0.0
- streamingpro-spark-3.0.0-adaptor

## How to contribute to Byzer-Lang
4. Click Maven Refresh and wait for IntelliJ IDEA to finish loading

If you are planning to contribute to this repository, please create an issue at [our Issue page](https://github.com/byzer-org/byzer-lang/issues)
5. Find the class `tech.mlsql.example.app.LocalSparkServiceApp` and click the Debug button to start the Byzer engine. You can then access the Byzer Web Console at [http://localhost:9003/](http://localhost:9003/#/)

### Build

You can refer to the project [byzer-org/byzer-build](https://github.com/byzer-org/byzer-build) to see how to build the Byzer engine binary packages and images

### How to contribute to Byzer-Lang

If you are planning to contribute to this project, please create an issue at [our Issue page](https://github.com/byzer-org/byzer-lang/issues)
even if the topic is not related to source code itself (e.g., documentation, new idea and proposal).

This is an active open source project for everyone,
@@ -215,17 +129,18 @@ and we are always open to people who want to use this system or contribute to it
For more details about how to contribute to the Byzer Org, please refer to [How to Contribute](https://docs.byzer.org/#/byzer-lang/zh-cn/appendix/contribute)


## Contributors
### Contributors

<a href="https://github.com/byzer-org/byzer-lang/graphs/contributors">
<img src="https://contrib.rocks/image?repo=byzer-org/byzer-lang" />
</a>

Made with [contrib.rocks](https://contrib.rocks).

## WeChat Group
### Community

- **Slack**: [byzer-org.slack.com](https://byzer-org.slack.com)
- **WeChat Official Account:** Byzer Community

Scan the QR code to add the K小助 WeChat account. Once added, send the five letters mlsql to join the group.

![](https://github.com/allwefantasy/mlsql/blob/master/images/dc0f4493-570f-4660-ab41-0e487b17a517.png)
