docs: Add "What is Sentence?"
fix #27
azu committed Feb 11, 2023
1 parent e223009 commit 8acfd8b
Showing 3 changed files with 321 additions and 0 deletions.
35 changes: 35 additions & 0 deletions README.md
@@ -2,6 +2,41 @@

Split {Japanese, English} text into sentences.

## What is a sentence?

This library splits the following text into 3 sentences.

```
We are talking about pens.
He said "This is a pen. I like it".
I could relate to that statement.
```

The result is:

![Sentence Image](./docs/img/sentence-result.png)

You can check the actual AST in the online playground.

- <https://sentence-splitter.netlify.app/#We%20are%20talking%20about%20pens.%0AHe%20said%20%22This%20is%20a%20pen.%20I%20like%20it%22.%0AI%20could%20relate%20to%20that%20statement.>

The second line contains `"This is a pen. I like it"`, but this library does not split the quoted text into separate sentences.
The whole second line is treated as one sentence.

The reason is that text inside `"..."` or `「...」` is ambiguous: it may be a sentence or a proper noun.
Also, HTML does not have suitable semantics for marking up a conversation.

- [html - Most semantic way to markup a conversation (or interview)? - Stack Overflow](https://stackoverflow.com/questions/8798685/most-semantic-way-to-markup-a-conversation-or-interview)

As a result, sentence-splitter does not support nested sentences.
A rule implementation should probably handle the `"..."` and `「...」` text itself, after sentence-splitter has parsed the sentences.

- Issue: [Nesting Sentences Support · Issue #27 · textlint-rule/sentence-splitter](https://github.com/textlint-rule/sentence-splitter/issues/27)
- Related PR
    - https://github.com/textlint-ja/textlint-rule-no-doubled-joshi/pull/47
    - https://github.com/textlint-ja/textlint-rule-no-doubled-conjunctive-particle-ga/pull/27
    - https://github.com/textlint-ja/textlint-rule-max-ten/pull/24
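
As a rough illustration of handling quoted text on the rule side, the sketch below scans the raw text of one already-split sentence and collects the `"..."` and `「...」` spans. `extractQuotedSpans`, `QuotedSpan`, and `QUOTE_PAIRS` are hypothetical names used only for this example; they are not part of sentence-splitter, and the linked PRs may take a different approach. The scan is deliberately naive: it does not handle nested quotes, and it skips an opening quote that has no closing partner.

```typescript
// Hypothetical helper, NOT part of sentence-splitter.
// It finds quoted spans inside the raw text of a single sentence.
type QuotedSpan = {
    open: string; // opening quote character
    text: string; // text between the quotes
    start: number; // index of the opening quote in the sentence
    end: number; // index just past the closing quote
};

// Assumed quote pairs (opening -> closing); extend as a rule requires.
const QUOTE_PAIRS = new Map<string, string>([
    ['"', '"'],
    ["「", "」"]
]);

function extractQuotedSpans(sentence: string): QuotedSpan[] {
    const result: QuotedSpan[] = [];
    let i = 0;
    while (i < sentence.length) {
        const open = sentence.charAt(i);
        const close = QUOTE_PAIRS.get(open);
        if (close === undefined) {
            // Not an opening quote: move on.
            i++;
            continue;
        }
        const closeIndex = sentence.indexOf(close, i + 1);
        if (closeIndex === -1) {
            // Unbalanced quote: skip it and keep scanning.
            i++;
            continue;
        }
        result.push({
            open,
            text: sentence.slice(i + 1, closeIndex),
            start: i,
            end: closeIndex + 1
        });
        // Resume after the closing quote (no nesting support).
        i = closeIndex + 1;
    }
    return result;
}
```

A rule could call this on the raw text of each sentence and then decide for itself whether to treat each span as an inner sentence or as a proper noun, for example by splitting the span text again or by ignoring it.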

## Installation

npm install sentence-splitter
286 changes: 286 additions & 0 deletions docs/img/sentence-result.excalidraw
@@ -0,0 +1,286 @@
{
"type": "excalidraw",
"version": 2,
"source": "file://",
"elements": [
{
"type": "rectangle",
"version": 127,
"versionNonce": 155757330,
"isDeleted": false,
"id": "CPWsxIp5UqqLknHc4shTM",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 1,
"opacity": 100,
"angle": 0,
"x": 377.07421875,
"y": 206.6640625,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 830.79296875,
"height": 366.7890625,
"seed": 955301074,
"groupIds": [],
"strokeSharpness": "sharp",
"boundElementIds": []
},
{
"id": "KEWyk2eG0Ng__NtfJQoTv",
"type": "text",
"x": 380.765625,
"y": 168.265625,
"width": 140,
"height": 36,
"angle": 0,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 1,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "sharp",
"seed": 1895227154,
"version": 117,
"versionNonce": 1352565710,
"isDeleted": false,
"boundElementIds": null,
"text": "Paragraph",
"fontSize": 28,
"fontFamily": 1,
"textAlign": "left",
"verticalAlign": "top",
"baseline": 25
},
{
"id": "FIKY5VYRApZ6JUgjQ9jfg",
"type": "rectangle",
"x": 401.61328125,
"y": 366.47265625,
"width": 767.2265624999999,
"height": 88.44531250000001,
"angle": 0,
"strokeColor": "#364fc7",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 1,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "sharp",
"seed": 2114809806,
"version": 1034,
"versionNonce": 1640867730,
"isDeleted": false,
"boundElementIds": null
},
{
"id": "NkxByDzhJeuWMlRkzsq05",
"type": "text",
"x": 425.53515625,
"y": 396.77734375,
"width": 466,
"height": 36,
"angle": 0,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 1,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "sharp",
"seed": 1248001874,
"version": 1039,
"versionNonce": 722702,
"isDeleted": false,
"boundElementIds": null,
"text": "He said \"This is a pen. I like it\". ",
"fontSize": 28,
"fontFamily": 1,
"textAlign": "left",
"verticalAlign": "top",
"baseline": 25
},
{
"id": "uCtHIvKHx3_xD5gg4JvzJ",
"type": "text",
"x": 428.8359375,
"y": 486.79296875,
"width": 484,
"height": 36,
"angle": 0,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 1,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "sharp",
"seed": 28050254,
"version": 181,
"versionNonce": 1353967506,
"isDeleted": false,
"boundElementIds": null,
"text": "I could relate to that statement.",
"fontSize": 28,
"fontFamily": 1,
"textAlign": "left",
"verticalAlign": "top",
"baseline": 25
},
{
"type": "rectangle",
"version": 1161,
"versionNonce": 1068043214,
"isDeleted": false,
"id": "Jm8BQlR_Ktcs7-ZmHYDIS",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 1,
"opacity": 100,
"angle": 0,
"x": 402.1875,
"y": 465.87890625,
"strokeColor": "#364fc7",
"backgroundColor": "transparent",
"width": 767.2265624999999,
"height": 88.44531250000001,
"seed": 1229023694,
"groupIds": [],
"strokeSharpness": "sharp",
"boundElementIds": []
},
{
"id": "50cFtlC92aNBurqqVzEnO",
"type": "text",
"x": 423.134765625,
"y": 285.083984375,
"width": 372,
"height": 36,
"angle": 0,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 1,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "sharp",
"seed": 239201038,
"version": 91,
"versionNonce": 1877841234,
"isDeleted": false,
"boundElementIds": null,
"text": "We are talking about pens.",
"fontSize": 28,
"fontFamily": 1,
"textAlign": "center",
"verticalAlign": "middle",
"baseline": 25
},
{
"type": "rectangle",
"version": 1275,
"versionNonce": 1072058702,
"isDeleted": false,
"id": "33pywk-M1KtaAGr6La9Eo",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 1,
"opacity": 100,
"angle": 0,
"x": 399.94921875,
"y": 261.18359375,
"strokeColor": "#364fc7",
"backgroundColor": "transparent",
"width": 767.2265624999999,
"height": 88.44531250000001,
"seed": 1633354898,
"groupIds": [],
"strokeSharpness": "sharp",
"boundElementIds": []
},
{
"id": "htXPaO1pgoZXGhCieO1u8",
"type": "text",
"x": 399.80859375,
"y": 226.19921875,
"width": 134,
"height": 36,
"angle": 0,
"strokeColor": "#364fc7",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 1,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "sharp",
"seed": 259192142,
"version": 51,
"versionNonce": 1533734862,
"isDeleted": false,
"boundElementIds": null,
"text": "Sentences",
"fontSize": 28,
"fontFamily": 1,
"textAlign": "left",
"verticalAlign": "top",
"baseline": 25
},
{
"id": "R5-1rsrenF25cdcexTzSQ",
"type": "line",
"x": 539.3203125,
"y": 438.66015625,
"width": 329.5234375,
"height": 2.8984375,
"angle": 0,
"strokeColor": "#364fc7",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "dashed",
"roughness": 1,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "round",
"seed": 1884800526,
"version": 153,
"versionNonce": 1910055886,
"isDeleted": false,
"boundElementIds": null,
"points": [
[
0,
0
],
[
329.5234375,
-2.8984375
]
],
"lastCommittedPoint": null,
"startBinding": null,
"endBinding": null,
"startArrowhead": null,
"endArrowhead": null
}
],
"appState": {
"gridSize": null,
"viewBackgroundColor": "#FFFFFF"
}
}
Binary file added docs/img/sentence-result.png