[web] all links validation #1170 #1206

Merged
merged 1 commit into from
Aug 11, 2024
40 changes: 40 additions & 0 deletions agdb_web/e2e/allLinks.spec_.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
import { test, expect, Page } from "@playwright/test";

const validatedLinks: string[] = [];

const validateLinks = async (page: Page) => {
    const links = await page
        .locator("a")
        .evaluateAll((els) => els.map((el) => el.getAttribute("href")));
    for (const href of links) {
        if (
            href &&
            !validatedLinks.includes(href) &&
            !href.startsWith("mailto") &&
            !href.startsWith("tel") &&
            !href.startsWith("javascript") &&
            !href.includes("/blob/main/")
        ) {
            await page.goto(href);

            const pageTitle = await page.title();
            expect(pageTitle.length).toBeGreaterThan(0);
            expect(pageTitle).not.toContain("404");

            validatedLinks.push(href);

            if (href.startsWith("https") || href.startsWith("http")) {
                continue;
            }
            await validateLinks(page);
        }
    }
};

test("should validate all links", async ({ page }) => {
    test.setTimeout(300000);

    await page.goto("http://localhost:5001/");

    await validateLinks(page);
});
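The skip rules in `validateLinks` can also be read as a pure predicate, which makes them easy to check without a browser. The following is a minimal sketch and not part of this PR: `shouldValidate`, `isExternal` and `SKIPPED_PREFIXES` are hypothetical names, and a `Set` stands in for the `validatedLinks` array.

```typescript
// Hypothetical helpers for illustration; they mirror the conditions
// used in validateLinks but are not part of the committed spec file.
const SKIPPED_PREFIXES = ["mailto", "tel", "javascript"];

// Returns true when a link has not been seen yet and is not excluded.
function shouldValidate(href: string | null, seen: Set<string>): boolean {
    if (!href || seen.has(href)) {
        return false;
    }
    if (SKIPPED_PREFIXES.some((prefix) => href.startsWith(prefix))) {
        return false;
    }
    // Links into the repository source tree are excluded, as in the spec.
    return !href.includes("/blob/main/");
}

// External links are checked once but not crawled any further.
function isExternal(href: string): boolean {
    return href.startsWith("http");
}
```

Keeping the filter separate from the navigation would let the crawl logic change without touching the exclusion rules.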
2 changes: 1 addition & 1 deletion agdb_web/pages/api-docs/openapi.en-US.mdx
Expand Up @@ -5,7 +5,7 @@ description: "API, Agnesoft Graph Database"

# api

The [agdb server](docs/guides/server) can be accessed using OpenAPI (REST) via any HTTP client. In addition to the API specification `agdb` offers a wide range of clients for many languages that use the same API but provide convenience and ease-of-use:
The [agdb server](/docs/guides/how-to-run-server) can be accessed using OpenAPI (REST) via any HTTP client. In addition to the API specification `agdb` offers a wide range of clients for many languages that use the same API but provide convenience and ease-of-use:

<p>
<a href="/api-docs/rust">
Expand Down
2 changes: 1 addition & 1 deletion agdb_web/pages/api-docs/rust.en-US.mdx
Expand Up @@ -7,4 +7,4 @@ description: "Rust, Agnesoft Graph Database"

The rust agdb API client is **async only** and can be used with any HTTP client that would implement the `agdb_api::HttpClient` trait. The default implementation uses [reqwest](https://crates.io/crates/reqwest).

See [Quickstart - client](/docs/guides/quickstart_client) for usage.
See [Quickstart - client](/docs/guides/quickstart-client) for usage.
4 changes: 1 addition & 3 deletions agdb_web/pages/blog/blog.en-US.mdx
Expand Up @@ -11,6 +11,4 @@ Articles written about the `agdb` and related topics adding insight into technol

<br />

1. [Why not SQL](blog/why_not_sql.md)
<br />
<br />
1. [Why not SQL](/blog/why-not-sql)
12 changes: 6 additions & 6 deletions agdb_web/pages/blog/why-not-sql.en-US.mdx
Expand Up @@ -13,7 +13,7 @@ The following items provide explanations for some of the design choices of `agdb
- [Why single file?](#why-single-file)
- [What about sharding, replication and performance at scale?](#what-about-sharding-replication-and-performance-at-scale)

# Why graph?
## Why graph?

The database area has been dominated by relational database systems (tables) and text queries since the 1970s. However, the issues with relational database systems are numerous, and they even gave rise to a regular software profession: the database engineer. This is because, contrary to their name, they are very awkward at representing actual relations between data, which real world applications always demand. They typically use foreign keys and/or proxy tables to represent them. Additionally, tables naturally enforce a fixed, immutable schema upon the data they store. To change the schema one needs to create a new database with the changed schema and copy the data over (this is called database migration). Such an operation is very costly and most database systems fare poorly when there are foreign keys involved (requiring them to be disabled for the migration to happen). As it turns out, nowadays no database schema is truly immutable. New and changed requirements happen so often that database schemas usually need updating (migrating) nearly every time there is an update to the systems using them.

Expand All @@ -33,7 +33,7 @@ That is in a nutshell why the graph database is the best choice for most problem

Everything has a cost and graph databases are no exception. Some operations and some data representations may be costlier in them as opposed to table based databases. For example, if you had an immutable schema that never updates, then a table based database might be a better fit, as the representation in the form of tables is more storage efficient. Or if you always read the whole table or whole rows, then once again the table based databases might be more performant. Typically though, these are uncommon edge cases unlikely to be found in real world applications. The data is almost always sparse and diverse in nature, the schema is never truly stable etc. On the other hand, most use cases benefit greatly from a graph based representation and thus such a database is well worth it despite some (often more theoretical) costs.

# Why not use an existing graph database?
## Why not use an existing graph database?

The following is the list of requirements for an ideal graph database:

Expand All @@ -46,7 +46,7 @@ The following is the list of requirements for an ideal graph database:

Surprisingly there is no database that would fit the bill. Even the most popular graph databases such as `Neo4J` or `OrientDB` fall short on several of these requirements. They do have their own text based language (e.g. Cypher for Neo4J). They lack the drivers for C++/Rust. They are not particularly efficient (being mostly written in Java). Even the recent addition built in Rust - `SurrealDb` - is using text based SQL queries. Quite incomprehensibly its driver support for Rust itself is not very mature so far and was added only later despite the system being written in Rust. This is oddly common in the database world, e.g. `RethinkDb`, itself a document database written mostly in C++, has no C++ support but does officially support, for example, Ruby. On top of these issues they often do not leverage the graph structure very well (except for Neo4J, which does a great job at this), still leaning heavily towards tables.

# Why object queries?
## Why object queries?

The most ubiquitous database query language is SQL, which is a text based language created in the 1970s. Its biggest advantage is that, being text based, it can be used from any language to communicate with the database. However, just like the relational (table) based databases from the same era, it has some major flaws:

Expand All @@ -61,20 +61,20 @@ The solutions include heavily sanitizing the user inputs in an attempt to preven

Using native objects to represent the queries eliminates all of the SQL issues at the cost of portability between languages. However, that can relatively easily be made up for via the already very mature (de)serialization of native objects available in most languages. Using the builder pattern to construct these objects further improves their correctness and readability. Native objects carry no additional cognitive load for the programmer and can be used just like any other code.
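As a rough illustration of the idea, the sketch below builds a query as a plain object through a builder and then serializes it. All names here (`SelectBuilder`, `SelectQuery`, `Comparison`) are hypothetical and chosen for this example only; this is not the `agdb` API, just a minimal model of "queries as objects" under those assumptions.

```typescript
// Hypothetical types for illustration only; this is not the agdb API.
type Comparison = { key: string; equals: string | number };

interface SelectQuery {
    kind: "select";
    keys: string[];
    from: string;
    where: Comparison[];
}

class SelectBuilder {
    private readonly query: SelectQuery = { kind: "select", keys: [], from: "", where: [] };

    keys(...keys: string[]): this {
        this.query.keys.push(...keys);
        return this;
    }

    from(alias: string): this {
        this.query.from = alias;
        return this;
    }

    where(key: string, equals: string | number): this {
        // The value is stored as data and never spliced into query text,
        // so it cannot change the structure of the query.
        this.query.where.push({ key, equals });
        return this;
    }

    build(): SelectQuery {
        return this.query;
    }
}

// Input that would be dangerous if concatenated into an SQL string.
const userInput = "'; DROP TABLE users; --";
const query = new SelectBuilder().keys("name").from("users").where("name", userInput).build();

// Serialization is what restores portability between languages.
const wire = JSON.stringify(query);
```

The compiler checks the shape of the query, and the user-supplied value travels as an ordinary string field rather than as part of a command.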

# Why single file?
## Why single file?

All operating systems have a fairly low limit on the number of open file descriptors for a program and for all programs in total, making this system resource one of the scarcest. Furthermore, operating over multiple files does not seem to bring any substantial benefit for the database while it complicates its implementation significantly. The graph database typically needs to have access to the full graph at all times, unlike, say, key-value stores or document databases. Splitting the data into multiple files would therefore actually be detrimental. Lastly, the overall storage taken by the multiple files would not actually change as the amount of data would be the same.

Conversely, using just a single file (with a second temporary write ahead log file) makes everything simpler and easier. You can, for example, easily transfer the data to a different machine - it is just one file. The database could also operate on the file directly if memory mapping were turned off to save RAM at the cost of performance. The program would not need to juggle multiple files and consume valuable system resources.

The one file is the database and the data.

# What about sharding, replication and performance at scale?
## What about sharding, replication and performance at scale?

Most databases tackle the issue of (poor) performance at scale by scaling up using replication/sharding strategies. While these techniques are definitely useful and are planned for `agdb`, they should be avoided as much as possible. The increase in complexity when using replication and/or sharding is dramatic and it has an adverse performance impact, meaning it is only worth it if there is no other choice.

The `agdb` is designed so that it performs well regardless of the data set size. Direct access operations are O(1) and there is no limit on concurrency. Write operations are O(1) amortized, however they are exclusive: there can be only one write operation running on the database at any given time, preventing any other read or write operations at the same time. You will still get O(n) complexity when searching the (sub)graph, as reading 1000 connected nodes will take 1000 O(1) operations = O(n), same as reading 1000 rows in a table. However, if the data does not indiscriminately connect everything to everything, one can have as large a data set as the hardware can fit without performance issues. The key is querying only a subset of the graph (subgraph), since your query will have performance based on that subgraph and not on all the data stored in the database.
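The claim that cost follows the size of the searched subgraph rather than the size of the whole data set can be illustrated with a small model. The sketch below is an assumption-laden toy: it uses an in-memory adjacency map and a breadth-first search, and says nothing about how `agdb` stores or searches data internally. The names `Graph` and `visitSubgraph` are made up for this example.

```typescript
// Illustrative model only: a graph as an adjacency map, not agdb's storage.
type Graph = Map<number, number[]>;

// Breadth-first search from `start`; returns how many nodes were visited.
// Each node lookup is a single map access, so the work is proportional
// to the number of reachable nodes and edges.
function visitSubgraph(graph: Graph, start: number): number {
    const visited = new Set<number>([start]);
    const queue: number[] = [start];
    for (let i = 0; i < queue.length; i++) {
        for (const next of graph.get(queue[i]) ?? []) {
            if (!visited.has(next)) {
                visited.add(next);
                queue.push(next);
            }
        }
    }
    return visited.size;
}

// 100,000 nodes in total, but only a chain of 1000 is reachable from node 0.
const graph: Graph = new Map();
for (let id = 0; id < 100_000; id++) {
    graph.set(id, id < 999 ? [id + 1] : []);
}

const visitedCount = visitSubgraph(graph, 0); // visits nodes 0..999
```

Growing the map to many more unconnected nodes would leave `visitedCount`, and the work done by the search, unchanged.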

The point here is that scaling has a significant cost regardless of technology or clever tricks. Replication and sharding should be considered only when the database starts exceeding the limits of a single machine, because adding data replication/backup will mean a huge performance hit. To mitigate it to some extent caching can be used, but it can never be as performant as a local database. The features "at scale" are definitely coming, but you should avoid using them as much as possible even if available.

[For real world performance see dedicated documentation.](performance.md)
[For real world performance see dedicated documentation.](/docs/references/performance)
79 changes: 11 additions & 68 deletions agdb_web/pages/docs/docs.en-US.mdx
Expand Up @@ -13,103 +13,46 @@ Documentation for the Agnesoft Graph Database can be found here.

1. [Concepts](/docs/guides/concepts)

<br />
<br />

2. [Quickstart](/docs/guides/quickstart)

<br />
<br />

3. [Quickstart - Client](/docs/guides/quickstart_client)

<br />
<br />

4. [How to run a server?](/docs/guides/server)

<br />
<br />
3. [Quickstart - Client](/docs/guides/quickstart-client)

5. [How to use the studio?](/docs/guides/studio)
4. [How to run a server?](/docs/guides/how-to-run-server)

<br />
<br />
5. [How to use the studio?](/docs/guides/how-to-use-studio)

6. [Troubleshooting](docs/guides/troubleshooting)
6. [Troubleshooting](/docs/guides/troubleshooting)

<br />
<br />

7. [Migration from SQL](docs/guides/migration_from_sql)
<br />
<br />
7. [Migration from SQL](/docs/guides/migration-from-sql)

## examples

<br />

1. [app_db](https://github.com/agnesoft/agdb/tree/main/examples/app_db)

<br />
<br />

2. [indexes](https://github.com/agnesoft/agdb/tree/main/examples/indexes)

<br />
<br />

3. [joins](https://github.com/agnesoft/agdb/tree/main/examples/joins)

<br />
<br />

4. [schema migration](https://github.com/agnesoft/agdb/tree/main/examples/schema_migration)

<br />
<br />

5. [server client - rust](https://github.com/agnesoft/agdb/tree/main/examples/server_client_rust)

<br />
<br />

6. [server client - typescript](https://github.com/agnesoft/agdb/tree/main/examples/server_client_typescript)

<br />
<br />

7. [strong types](https://github.com/agnesoft/agdb/tree/main/examples/strong_types)
<br />
<br />

## references

1. [Queries](docs/references/queries)

<br />
<br />

2. [Server](docs/references/server)

<br />
<br />

3. [Studio](docs/references/studio)

<br />
<br />
1. [Queries](/docs/references/queries)

4. [Cloud](docs/references/cloud)
2. [Server](/docs/references/server)

<br />
<br />
3. [Studio](/docs/references/studio)

5. [Efficient agdb](docs/references/efficient_agdb)
4. [Cloud](/docs/references/cloud)

<br />
<br />
5. [Efficient agdb](/docs/references/efficient-agdb)

6. [Performance](docs/references/performance)
<br />
<br />
6. [Performance](/docs/references/performance)
22 changes: 1 addition & 21 deletions agdb_web/pages/docs/examples.en-US.mdx
Expand Up @@ -5,40 +5,20 @@ description: "Examples, Agnesoft Graph Database"

# Examples

The following links lead you to the example code in the `agdb` repository. For the guided examples see [guides](docs/guides).
The following links lead you to the example code in the `agdb` repository. For the guided examples see [guides](/docs/guides).

<br />

1. [app_db](https://github.com/agnesoft/agdb/tree/main/examples/app_db)

<br />
<br />

2. [indexes](https://github.com/agnesoft/agdb/tree/main/examples/indexes)

<br />
<br />

3. [joins](https://github.com/agnesoft/agdb/tree/main/examples/joins)

<br />
<br />

4. [schema migration](https://github.com/agnesoft/agdb/tree/main/examples/schema_migration)

<br />
<br />

5. [server client - rust](https://github.com/agnesoft/agdb/tree/main/examples/server_client_rust)

<br />
<br />

6. [server client - typescript](https://github.com/agnesoft/agdb/tree/main/examples/server_client_typescript)

<br />
<br />

7. [strong types](https://github.com/agnesoft/agdb/tree/main/examples/strong_types)
<br />
<br />
29 changes: 6 additions & 23 deletions agdb_web/pages/docs/guides.en-US.mdx
Expand Up @@ -7,31 +7,14 @@ description: "Guides, Agnesoft Graph Database"

The following guides are guided examples of usage of the agdb:

1. [Concepts](guides/concepts)
1. [Concepts](/docs/guides/concepts)

<br />
<br />
2. [Quickstart](/docs/guides/quickstart)

2. [Quickstart](guides/quickstart)
3. [Quickstart - Client](/docs/guides/quickstart-client)

<br />
<br />
4. [How to run a server?](/docs/guides/how-to-run-server)

3. [Quickstart - Client](guides/quickstart_client)
5. [How to use the studio?](/docs/guides/how-to-use-studio)

<br />
<br />

4. [How to run a server?](guides/how_to_run_server)

<br />
<br />

5. [How to use the studio?](guides/how_to_use_studio)

<br />
<br />

6. [Troubleshooting](guides/troubleshooting)
<br />
<br />
6. [Troubleshooting](/docs/guides/troubleshooting)
8 changes: 4 additions & 4 deletions agdb_web/pages/docs/guides/_meta.en-US.json
@@ -1,9 +1,9 @@
{
"concepts": "Concepts",
"quickstart": "Quickstart",
"quickstart_client": "Quickstart (Client)",
"how_to_run_server": "How to Run Server",
"how_to_use_studio": "How to Use Studio",
"quickstart-client": "Quickstart (Client)",
"how-to-run-server": "How to Run Server",
"how-to-use-studio": "How to Use Studio",
"troubleshooting": "Troubleshooting",
"migration_from_sql": "Migration from SQL"
"migration-from-sql": "Migration from SQL"
}
18 changes: 5 additions & 13 deletions agdb_web/pages/docs/guides/concepts.en-US.md
Expand Up @@ -5,17 +5,9 @@ description: "Concepts, Agnesoft Graph Database"

# concepts

- [graph](#graph)
- [query](#query)
- [transaction](#transaction)
- [storage](#storage)
- [data types](#data-types)

<br/>

## graph

_Related:_ [Why graph?](but_why.md#why-graph)
_Related:_ [Why graph?](/blog/why-not-sql#why-graph)

Graph is a set of nodes (also vertices, points) that are connected to each other through edges (also arcs, links). In `agdb` the data is plotted on directed graphs and there are no restrictions on their structure. They can be cyclic (forming a cycle), acyclic (being open ended), sparse (having only some connections between nodes), disjointed (thus forming multiple graphs), having self-referential edges (nodes being connected to themselves), having multiple edges to the same node (even itself) and/or in the same direction.

Expand All @@ -34,15 +26,15 @@ Nodes and edges are `graph elements` and each can have key-value pairs associate

## query

_Related:_ [Why object queries?](but_why#why-object-queries), [Queries](queries.md)
_Related:_ [Why object queries?](/blog/why-not-sql#why-object-queries), [Queries](/docs/references/queries)

Query is a request to retrieve or manipulate data in a database (both the graph structure and `values` associated with the nodes and edges). In `agdb` queries are not texts (like in SQL) but rather objects that contain details about what is being requested. These objects are typically constructed via a query builder but it is also possible to create them like any other object. The builder steps resemble, and often indeed are, direct translations of well known SQL equivalents (e.g. `QueryBuilder::select() == SELECT`, `QueryBuilder::insert() == INSERT INTO`).

Queries are executed by the database directly. The `agdb` distinguishes between `immutable` (retrieving data) and `mutable` (manipulating data) queries. Each query execution produces either a result or an error. In `agdb` there is a single `result` object containing a numerical result (i.e. number of affected elements or values) and a list of database elements. Each element in a result is comprised of a database id and a list of `values` (associated key-value pairs).

In case of a failure the database execution yields an error detailing what went wrong instead of a result.

See dedicated [queries](queries.md) documentation for details.
See dedicated [queries](/docs/references/queries) documentation for details.

**terminology:**

Expand All @@ -55,7 +47,7 @@ See dedicated [queries](queries.md) documentation for details.

## transaction

_Related_: [Queries](queries.md)
_Related_: [Queries](/docs/references/queries)

Transactions are a way to provide atomicity, isolation and data consistency in a database (three of [ACID](https://en.wikipedia.org/wiki/ACID) properties). In `agdb` every query is a transaction but it is also possible to execute multiple queries as a single transaction. Just like `queries` transactions are immutable or mutable. One important rule is borrowed directly from Rust and enforced on the type level:

Expand All @@ -73,7 +65,7 @@ In multithreaded environment you can easily synchronize the access to the databa

## storage

_Related_: [Why single file?](but_why.md#why-single-file)
_Related_: [Why single file?](/blog/why-not-sql#why-single-file)

Every persistent database eventually stores its data somewhere on disk in one or more files. The `agdb` stores its data in a single file (that is being shadowed by another temporary write ahead log file). Its internal structure is very similar to that of a memory which makes it very easy to map between the two. The file format is fully platform agnostic and the file can be safely transferred to another machine and loaded there. Similarly the `agdb` is by default a memory mapped database but it could just as easily operate purely on the file itself at the cost of read performance (might be implemented as a feature in the future).

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -82,7 +82,7 @@ curl -X POST -H "Authorization: Bearer ${token}" localhost:3000/api/v1/admin/use
token=$(curl -X POST -H 'Content-Type: application/json' localhost:3000/api/v1/user/login -d '{"username":"my_db_user","password":"password123"}')
```

<br/>9. To interact with the database you can either continue using `curl`, interactive OpenAPI GUI from any browser `localhost:3000/api/v1` (provided by `rapidoc`) or choose one of the [available API clients](/api.md). The raw OpenAPI specification can be downloaded from the server at `localhost:3000/api/v1/openapi.json`.
<br/>9. To interact with the database you can either continue using `curl`, interactive OpenAPI GUI from any browser `localhost:3000/api/v1` (provided by `rapidoc`) or choose one of the [available API clients](/api-docs/openapi). The raw OpenAPI specification can be downloaded from the server at `localhost:3000/api/v1/openapi.json`.
<br/><br/>

<br/>10. The server can be shut down with `CTRL+C` or programmatically by posting to the shutdown endpoint as a logged in server admin: