Commit

Move /document_loaders to granular entrypoints
nfcampos committed Apr 9, 2023
1 parent b156d8c commit 632d07b
Showing 88 changed files with 379 additions and 128 deletions.
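The import changes in this commit all follow one pattern: a named import from `langchain/document_loaders` becomes an import from a per-loader entrypoint. As a rough illustration, the sketch below applies that mapping to source text. The table is copied from the import lines visible in this diff; the names `ENTRYPOINTS` and `rewriteImports` are hypothetical and not part of the commit, and the regex only handles plain named imports (aliases such as `X as Y` are left untouched).

```typescript
// Loader name -> granular entrypoint, as seen in the import lines of this diff.
const ENTRYPOINTS: Record<string, string> = {
  CSVLoader: "langchain/document_loaders/fs/csv",
  DirectoryLoader: "langchain/document_loaders/fs/directory",
  DocxLoader: "langchain/document_loaders/fs/docx",
  EPubLoader: "langchain/document_loaders/fs/epub",
  JSONLoader: "langchain/document_loaders/fs/json",
  JSONLinesLoader: "langchain/document_loaders/fs/json",
  NotionLoader: "langchain/document_loaders/fs/notion",
  PDFLoader: "langchain/document_loaders/fs/pdf",
  SRTLoader: "langchain/document_loaders/fs/srt",
  TextLoader: "langchain/document_loaders/fs/text",
  UnstructuredLoader: "langchain/document_loaders/fs/unstructured",
  CheerioWebBaseLoader: "langchain/document_loaders/web/cheerio",
  CollegeConfidentialLoader: "langchain/document_loaders/web/college_confidential",
  GitbookLoader: "langchain/document_loaders/web/gitbook",
  GithubRepoLoader: "langchain/document_loaders/web/github",
  HNLoader: "langchain/document_loaders/web/hn",
  IMSDBLoader: "langchain/document_loaders/web/imsdb",
  PuppeteerWebBaseLoader: "langchain/document_loaders/web/puppeteer",
};

// Matches only the old barrel import, not the new granular paths.
const OLD_IMPORT =
  /import\s*\{([^}]*)\}\s*from\s*"langchain\/document_loaders";?/g;

// Hypothetical helper: rewrites old-style imports into one import per entrypoint.
function rewriteImports(source: string): string {
  return source.replace(OLD_IMPORT, (match: string, names: string) => {
    const grouped = new Map<string, string[]>();
    for (const raw of names.split(",")) {
      const name = raw.trim();
      if (name === "") continue; // trailing comma
      const target = ENTRYPOINTS[name];
      if (target === undefined) return match; // unknown name: leave the import as-is
      const existing = grouped.get(target) ?? [];
      grouped.set(target, existing.concat(name));
    }
    if (grouped.size === 0) return match;
    return Array.from(grouped.entries())
      .map(([path, ns]) => `import { ${ns.join(", ")} } from "${path}";`)
      .join("\n");
  });
}
```

Names that share an entrypoint (here `JSONLoader` and `JSONLinesLoader`) are grouped into a single import, which matches how the `DirectoryLoader` example is rewritten in this commit.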
@@ -21,7 +21,7 @@ id,text
Example code:

```typescript
-import { CSVLoader } from "langchain/document_loaders";
+import { CSVLoader } from "langchain/document_loaders/fs/csv";

const loader = new CSVLoader("src/document_loaders/example_data/example.csv");

@@ -61,7 +61,7 @@ id,text
Example code:

```typescript
-import { CSVLoader } from "langchain/document_loaders";
+import { CSVLoader } from "langchain/document_loaders/fs/csv";

const loader = new CSVLoader(
"src/document_loaders/example_data/example.csv",
@@ -20,13 +20,13 @@ src/document_loaders/example_data/example/
Example code:

```typescript
+import { DirectoryLoader } from "langchain/document_loaders/fs/directory";
import {
-  DirectoryLoader,
  JSONLoader,
  JSONLinesLoader,
-  TextLoader,
-  CSVLoader,
-} from "langchain/document_loaders";
+} from "langchain/document_loaders/fs/json";
+import { TextLoader } from "langchain/document_loaders/fs/text";
+import { CSVLoader } from "langchain/document_loaders/fs/csv";

const loader = new DirectoryLoader(
"src/document_loaders/example_data/example",
@@ -15,7 +15,7 @@ npm install mammoth
# Usage

```typescript
-import { DocxLoader } from "langchain/document_loaders";
+import { DocxLoader } from "langchain/document_loaders/fs/docx";

const loader = new DocxLoader(
"src/document_loaders/tests/example_data/attention.docx"
@@ -15,7 +15,7 @@ npm install epub2 html-to-text
# Usage, one document per chapter

```typescript
-import { EPubLoader } from "langchain/document_loaders";
+import { EPubLoader } from "langchain/document_loaders/fs/epub";

const loader = new EPubLoader("src/document_loaders/example_data/example.epub");

@@ -25,7 +25,7 @@ const docs = await loader.load();
# Usage, one document per file

```typescript
-import { EPubLoader } from "langchain/document_loaders";
+import { EPubLoader } from "langchain/document_loaders/fs/epub";

const loader = new EPubLoader(
"src/document_loaders/example_data/example.epub",
@@ -18,7 +18,7 @@ Example JSON file:
Example code:

```typescript
-import { JSONLoader } from "langchain/document_loaders";
+import { JSONLoader } from "langchain/document_loaders/fs/json";

const loader = new JSONLoader("src/document_loaders/example_data/example.json");

@@ -73,7 +73,7 @@ In this example, we want to only extract information from "from" and "surname" e
Example code:

```typescript
-import { JSONLoader } from "langchain/document_loaders";
+import { JSONLoader } from "langchain/document_loaders/fs/json";

const loader = new JSONLoader(
"src/document_loaders/example_data/example.json",
@@ -16,7 +16,7 @@ Example JSONLines file:
Example code:

```typescript
-import { JSONLinesLoader } from "langchain/document_loaders";
+import { JSONLinesLoader } from "langchain/document_loaders/fs/json";

const loader = new JSONLinesLoader(
"src/document_loaders/example_data/example.jsonl",
@@ -15,7 +15,7 @@ npm install pdfjs-dist
# Usage, one document per page

```typescript
-import { PDFLoader } from "langchain/document_loaders";
+import { PDFLoader } from "langchain/document_loaders/fs/pdf";

const loader = new PDFLoader("src/document_loaders/example_data/example.pdf");

@@ -25,7 +25,7 @@ const docs = await loader.load();
# Usage, one document per file

```typescript
-import { PDFLoader } from "langchain/document_loaders";
+import { PDFLoader } from "langchain/document_loaders/fs/pdf";

const loader = new PDFLoader("src/document_loaders/example_data/example.pdf", {
splitPages: false,
@@ -39,7 +39,7 @@ const docs = await loader.load();
In legacy environments, you can use the `pdfjs` option to provide a function that returns a promise resolving to the `PDFJS` object. This is useful if you want to use a custom build or a different version of `pdfjs-dist`. For example, here we use the legacy build of `pdfjs-dist`, which includes several polyfills that are not included in the default build.

```typescript
-import { PDFLoader } from "langchain/document_loaders";
+import { PDFLoader } from "langchain/document_loaders/fs/pdf";

const loader = new PDFLoader("src/document_loaders/example_data/example.pdf", {
pdfjs: () =>
@@ -15,7 +15,7 @@ npm install srt-parser-2
## Usage

```typescript
-import { SRTLoader } from "langchain/document_loaders";
+import { SRTLoader } from "langchain/document_loaders/fs/srt";

const loader = new SRTLoader(
"src/document_loaders/example_data/Star_Wars_The_Clone_Wars_S06E07_Crisis_at_the_Heart.srt"
@@ -7,7 +7,7 @@ hide_table_of_contents: true
This example goes over how to load data from text files.

```typescript
-import { TextLoader } from "langchain/document_loaders";
+import { TextLoader } from "langchain/document_loaders/fs/text";

const loader = new TextLoader("src/document_loaders/example_data/example.txt");

@@ -15,7 +15,7 @@ npm install cheerio
## Usage

```typescript
-import { CollegeConfidentialLoader } from "langchain/document_loaders";
+import { CollegeConfidentialLoader } from "langchain/document_loaders/web/college_confidential";

const loader = new CollegeConfidentialLoader(
"https://www.collegeconfidential.com/colleges/brown-university/"
@@ -15,7 +15,7 @@ npm install cheerio
## Load from single GitBook page

```typescript
-import { GitbookLoader } from "langchain/document_loaders";
+import { GitbookLoader } from "langchain/document_loaders/web/gitbook";

const loader = new GitbookLoader(
"https://docs.gitbook.com/product-tour/navigation"
@@ -29,7 +29,7 @@ const docs = await loader.load();
For this to work, the GitbookLoader needs to be initialized with the root path (https://docs.gitbook.com in this example) and have `shouldLoadAllPaths` set to `true`.

```typescript
-import { GitbookLoader } from "langchain/document_loaders";
+import { GitbookLoader } from "langchain/document_loaders/web/gitbook";

const loader = new GitbookLoader("https://docs.gitbook.com", {
shouldLoadAllPaths: true,
@@ -8,7 +8,7 @@ This example goes over how to load data from a GitHub repository.
You can set the `GITHUB_ACCESS_TOKEN` environment variable to a GitHub access token to increase the rate limit and access private repositories.

```typescript
-import { GithubRepoLoader } from "langchain/document_loaders";
+import { GithubRepoLoader } from "langchain/document_loaders/web/github";

const loader = new GithubRepoLoader(
"https://github.com/hwchase17/langchainjs",
@@ -15,7 +15,7 @@ npm install cheerio
## Usage

```typescript
-import { HNLoader } from "langchain/document_loaders";
+import { HNLoader } from "langchain/document_loaders/web/hn";

const loader = new HNLoader("https://news.ycombinator.com/item?id=34817881");

@@ -15,7 +15,7 @@ npm install cheerio
## Usage

```typescript
-import { IMSDBLoader } from "langchain/document_loaders";
+import { IMSDBLoader } from "langchain/document_loaders/web/imsdb";

const loader = new IMSDBLoader("https://imsdb.com/scripts/BlacKkKlansman.html");

@@ -20,7 +20,7 @@ npm install cheerio
## Usage

```typescript
-import { CheerioWebBaseLoader } from "langchain/document_loaders";
+import { CheerioWebBaseLoader } from "langchain/document_loaders/web/cheerio";

const loader = new CheerioWebBaseLoader(
"https://news.ycombinator.com/item?id=34817881"
@@ -20,7 +20,7 @@ npm install puppeteer
## Usage

```typescript
-import { PuppeteerWebBaseLoader } from "langchain/document_loaders";
+import { PuppeteerWebBaseLoader } from "langchain/document_loaders/web/puppeteer";

/**
* Loader uses `page.evaluate(() => document.body.innerHTML)`
@@ -54,7 +54,7 @@ By passing these options to the `PuppeteerWebBaseLoader` constructor, you can cu
Here is a basic example to do it:

```typescript
-import { PuppeteerWebBaseLoader } from "langchain/document_loaders";
+import { PuppeteerWebBaseLoader } from "langchain/document_loaders/web/puppeteer";

const loader = new PuppeteerWebBaseLoader("https://www.tabnews.com.br/", {
launchOptions: {
2 changes: 1 addition & 1 deletion examples/src/document_loaders/cheerio_web.ts
@@ -1,4 +1,4 @@
-import { CheerioWebBaseLoader } from "langchain/document_loaders";
+import { CheerioWebBaseLoader } from "langchain/document_loaders/web/cheerio";

export const run = async () => {
const loader = new CheerioWebBaseLoader(
2 changes: 1 addition & 1 deletion examples/src/document_loaders/college_confidential.ts
@@ -1,4 +1,4 @@
-import { CollegeConfidentialLoader } from "langchain/document_loaders";
+import { CollegeConfidentialLoader } from "langchain/document_loaders/web/college_confidential";

export const run = async () => {
const loader = new CollegeConfidentialLoader(
2 changes: 1 addition & 1 deletion examples/src/document_loaders/gitbook.ts
@@ -1,4 +1,4 @@
-import { GitbookLoader } from "langchain/document_loaders";
+import { GitbookLoader } from "langchain/document_loaders/web/gitbook";

export const run = async () => {
const loader = new GitbookLoader("https://docs.gitbook.com");
2 changes: 1 addition & 1 deletion examples/src/document_loaders/github.ts
@@ -1,4 +1,4 @@
-import { GithubRepoLoader } from "langchain/document_loaders";
+import { GithubRepoLoader } from "langchain/document_loaders/web/github";

export const run = async () => {
const loader = new GithubRepoLoader(
2 changes: 1 addition & 1 deletion examples/src/document_loaders/hn.ts
@@ -1,4 +1,4 @@
-import { HNLoader } from "langchain/document_loaders";
+import { HNLoader } from "langchain/document_loaders/web/hn";

export const run = async () => {
const loader = new HNLoader("https://news.ycombinator.com/item?id=34817881");
2 changes: 1 addition & 1 deletion examples/src/document_loaders/imsdb.ts
@@ -1,4 +1,4 @@
-import { IMSDBLoader } from "langchain/document_loaders";
+import { IMSDBLoader } from "langchain/document_loaders/web/imsdb";

export const run = async () => {
const loader = new IMSDBLoader(
2 changes: 1 addition & 1 deletion examples/src/document_loaders/notion_markdown.ts
@@ -1,4 +1,4 @@
-import { NotionLoader } from "langchain/document_loaders";
+import { NotionLoader } from "langchain/document_loaders/fs/notion";

export const run = async () => {
/** Provide the directory path of your notion folder */
2 changes: 1 addition & 1 deletion examples/src/document_loaders/puppeteer_web.ts
@@ -1,4 +1,4 @@
-import { PuppeteerWebBaseLoader } from "langchain/document_loaders";
+import { PuppeteerWebBaseLoader } from "langchain/document_loaders/web/puppeteer";

export const run = async () => {
const loader = new PuppeteerWebBaseLoader("https://www.tabnews.com.br/");
2 changes: 1 addition & 1 deletion examples/src/document_loaders/srt.ts
@@ -1,4 +1,4 @@
-import { SRTLoader } from "langchain/document_loaders";
+import { SRTLoader } from "langchain/document_loaders/fs/srt";

export const run = async () => {
const loader = new SRTLoader(
2 changes: 1 addition & 1 deletion examples/src/document_loaders/text.ts
@@ -1,4 +1,4 @@
-import { TextLoader } from "langchain/document_loaders";
+import { TextLoader } from "langchain/document_loaders/fs/text";

export const run = async () => {
const loader = new TextLoader(
2 changes: 1 addition & 1 deletion examples/src/document_loaders/unstructured.ts
@@ -1,4 +1,4 @@
-import { UnstructuredLoader } from "langchain/document_loaders";
+import { UnstructuredLoader } from "langchain/document_loaders/fs/unstructured";

export const run = async () => {
const loader = new UnstructuredLoader(
2 changes: 1 addition & 1 deletion examples/src/indexes/vector_stores/hnswlib_fromdocs.ts
@@ -1,6 +1,6 @@
import { HNSWLib } from "langchain/vectorstores/hnswlib";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
-import { TextLoader } from "langchain/document_loaders";
+import { TextLoader } from "langchain/document_loaders/fs/text";

export const run = async () => {
// Create docs with a loader
57 changes: 57 additions & 0 deletions langchain/.gitignore
@@ -94,6 +94,63 @@ docstore.d.ts
document_loaders.cjs
document_loaders.js
document_loaders.d.ts
+document_loaders/base.cjs
+document_loaders/base.js
+document_loaders/base.d.ts
+document_loaders/web/cheerio.cjs
+document_loaders/web/cheerio.js
+document_loaders/web/cheerio.d.ts
+document_loaders/web/puppeteer.cjs
+document_loaders/web/puppeteer.js
+document_loaders/web/puppeteer.d.ts
+document_loaders/web/college_confidential.cjs
+document_loaders/web/college_confidential.js
+document_loaders/web/college_confidential.d.ts
+document_loaders/web/gitbook.cjs
+document_loaders/web/gitbook.js
+document_loaders/web/gitbook.d.ts
+document_loaders/web/hn.cjs
+document_loaders/web/hn.js
+document_loaders/web/hn.d.ts
+document_loaders/web/imsdb.cjs
+document_loaders/web/imsdb.js
+document_loaders/web/imsdb.d.ts
+document_loaders/web/github.cjs
+document_loaders/web/github.js
+document_loaders/web/github.d.ts
+document_loaders/fs/directory.cjs
+document_loaders/fs/directory.js
+document_loaders/fs/directory.d.ts
+document_loaders/fs/buffer.cjs
+document_loaders/fs/buffer.js
+document_loaders/fs/buffer.d.ts
+document_loaders/fs/text.cjs
+document_loaders/fs/text.js
+document_loaders/fs/text.d.ts
+document_loaders/fs/json.cjs
+document_loaders/fs/json.js
+document_loaders/fs/json.d.ts
+document_loaders/fs/srt.cjs
+document_loaders/fs/srt.js
+document_loaders/fs/srt.d.ts
+document_loaders/fs/pdf.cjs
+document_loaders/fs/pdf.js
+document_loaders/fs/pdf.d.ts
+document_loaders/fs/docx.cjs
+document_loaders/fs/docx.js
+document_loaders/fs/docx.d.ts
+document_loaders/fs/epub.cjs
+document_loaders/fs/epub.js
+document_loaders/fs/epub.d.ts
+document_loaders/fs/csv.cjs
+document_loaders/fs/csv.js
+document_loaders/fs/csv.d.ts
+document_loaders/fs/notion.cjs
+document_loaders/fs/notion.js
+document_loaders/fs/notion.d.ts
+document_loaders/fs/unstructured.cjs
+document_loaders/fs/unstructured.js
+document_loaders/fs/unstructured.d.ts
chat_models.cjs
chat_models.js
chat_models.d.ts
1 change: 1 addition & 0 deletions langchain/document_loaders/fs/notion_markdown.cjs
@@ -0,0 +1 @@
+module.exports = require('../../dist/document_loaders/fs/notion_markdown.cjs');
1 change: 1 addition & 0 deletions langchain/document_loaders/fs/notion_markdown.d.ts
@@ -0,0 +1 @@
+export * from '../../dist/document_loaders/fs/notion_markdown.js'
1 change: 1 addition & 0 deletions langchain/document_loaders/fs/notion_markdown.js
@@ -0,0 +1 @@
+export * from '../../dist/document_loaders/fs/notion_markdown.js'
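Each granular entrypoint in this commit comes with three one-line shim files (`.cjs`, `.d.ts`, `.js`) that re-export from `dist/`, and three matching `.gitignore` entries. The sketch below reproduces that pattern for a given entrypoint path. How the repository actually generates these files is not shown in this diff, so `shimsFor` and `gitignoreLines` are hypothetical names, and the `./` prefix for top-level entrypoints is an assumption.

```typescript
interface Shim {
  path: string;
  contents: string;
}

// Hypothetical helper: builds the three shim files for one entrypoint,
// e.g. "document_loaders/fs/notion_markdown".
function shimsFor(entrypoint: string): Shim[] {
  // One "../" per directory level between the shim and the package root.
  const depth = entrypoint.split("/").length - 1;
  const up = depth === 0 ? "./" : "../".repeat(depth); // "./" is an assumption
  const target = `${up}dist/${entrypoint}`;
  return [
    { path: `${entrypoint}.cjs`, contents: `module.exports = require('${target}.cjs');` },
    { path: `${entrypoint}.d.ts`, contents: `export * from '${target}.js'` },
    { path: `${entrypoint}.js`, contents: `export * from '${target}.js'` },
  ];
}

// Hypothetical helper: the .gitignore entries for one entrypoint,
// in the same order as the entries added in this commit.
function gitignoreLines(entrypoint: string): string[] {
  return [".cjs", ".js", ".d.ts"].map((ext) => `${entrypoint}${ext}`);
}
```

For `document_loaders/fs/notion_markdown`, the generated contents match the three files added above.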