Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clipper: Add capability of limiting downloads #9788

Merged
merged 17 commits into from
Mar 9, 2024
Merged
Prev Previous commit
Next Next commit
removing dummy and checking before using download controller
  • Loading branch information
pedr committed Feb 27, 2024
commit 4f9b6464b4add8769f9071c8407b6890b525019f
9 changes: 0 additions & 9 deletions packages/lib/downloadController.ts
Original file line number Diff line number Diff line change
Expand Up @@ -14,15 +14,6 @@ export interface DownloadController {
limitMessage(): string;
}

export class DummyDownloadController implements DownloadController {
public totalBytes = 0;
public imagesCount = 0;
public imageCountExpected = 0;
public printStats(): void {}
public handleChunk() { return () => {}; }
public limitMessage() { return ''; }
}

export class LimitedDownloadController implements DownloadController {
private totalBytes_ = 0;
// counts before the downloaded has finished, so at the end if the totalBytes > maxTotalBytesAllowed
Expand Down
25 changes: 14 additions & 11 deletions packages/lib/services/rest/routes/notes.ts
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ const { MarkupToHtml } = require('@joplin/renderer');
const { ErrorNotFound } = require('../utils/errors');
import { fileUriToPath } from '@joplin/utils/url';
import { NoteEntity } from '../../database/types';
import { DownloadController, DummyDownloadController } from '../../../downloadController';
import { DownloadController } from '../../../downloadController';
import { ErrorCode } from '../../../errors';
import { PromisePool } from '@supercharge/promise-pool';

Expand Down Expand Up @@ -69,6 +69,7 @@ type RequestNote = {
type FetchOptions = {
timeout?: number;
maxRedirects?: number;
downloadController?: DownloadController;
};

async function requestNoteToNote(requestNote: RequestNote): Promise<NoteEntity> {
Expand Down Expand Up @@ -226,7 +227,7 @@ const isValidUrl = (url: string, isDataUrl: boolean, urlProtocol?: string, allow
return isAllowedProtocol;
};

export async function downloadMediaFile(url: string, downloadController: DownloadController, fetchOptions?: FetchOptions, allowedProtocols?: string[]) {
export async function downloadMediaFile(url: string, fetchOptions?: FetchOptions, allowedProtocols?: string[]) {
logger.info('Downloading media file', url);

// The URL we get to download have been extracted from the Markdown document
Expand All @@ -250,7 +251,7 @@ export async function downloadMediaFile(url: string, downloadController: Downloa
const localPath = fileUriToPath(url);
await shim.fsDriver().copy(localPath, mediaPath);
} else {
const response = await shim.fetchBlob(url, { path: mediaPath, maxRetry: 1, ...fetchOptions, downloadController });
const response = await shim.fetchBlob(url, { path: mediaPath, maxRetry: 1, ...fetchOptions });

if (!fileExt) {
// If we could not find the file extension from the URL, try to get it
Expand All @@ -265,12 +266,14 @@ export async function downloadMediaFile(url: string, downloadController: Downloa
}
}

async function downloadMediaFiles(urls: string[], downloadController: DownloadController, fetchOptions: FetchOptions, allowedProtocols?: string[]) {
async function downloadMediaFiles(urls: string[], fetchOptions: FetchOptions, allowedProtocols?: string[]) {
const output: any = {};

const downloadController = fetchOptions?.downloadController ?? null;

const downloadOne = async (url: string) => {
downloadController.imagesCount += 1;
const mediaPath = await downloadMediaFile(url, downloadController, fetchOptions, allowedProtocols);
if (downloadController) downloadController.imagesCount += 1;
const mediaPath = await downloadMediaFile(url, fetchOptions, allowedProtocols);
if (mediaPath) output[url] = { path: mediaPath, originalUrl: url };
};

Expand All @@ -286,8 +289,10 @@ async function downloadMediaFiles(urls: string[], downloadController: DownloadCo
})
.process(downloadOne);

downloadController.imageCountExpected = urls.length;
downloadController.printStats(urls.length);
if (downloadController) {
downloadController.imageCountExpected = urls.length;
downloadController.printStats(urls.length);
}

return output;
}
Expand Down Expand Up @@ -402,7 +407,6 @@ export const extractNoteFromHTML = async (
requestNote: RequestNote,
requestId: number,
imageSizes: any,
downloadController: DownloadController,
fetchOptions?: FetchOptions,
allowedProtocols?: string[],
) => {
Expand All @@ -412,7 +416,7 @@ export const extractNoteFromHTML = async (

logger.info(`Request (${requestId}): Downloading media files: ${mediaUrls.length}`);

const mediaFiles = await downloadMediaFiles(mediaUrls, downloadController, fetchOptions, allowedProtocols);
const mediaFiles = await downloadMediaFiles(mediaUrls, fetchOptions, allowedProtocols);

logger.info(`Request (${requestId}): Creating resources from paths: ${Object.getOwnPropertyNames(mediaFiles).length}`);

Expand Down Expand Up @@ -469,7 +473,6 @@ export default async function(request: Request, id: string = null, link: string
requestNote,
requestId,
imageSizes,
new DummyDownloadController(),
undefined,
allowedProtocolsForDownloadMediaFiles,
);
Expand Down
8 changes: 5 additions & 3 deletions packages/lib/shim-init-node.ts
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ import * as fs from 'fs-extra';
import * as pdfJsNamespace from 'pdfjs-dist';
import { writeFile } from 'fs/promises';
import { ResourceEntity } from './services/database/types';
import { DownloadController, DummyDownloadController } from './downloadController';
import { DownloadController } from './downloadController';

const { FileApiDriverLocal } = require('./file-api-driver-local');
const mimeUtils = require('./mime-utils.js').mime;
Expand Down Expand Up @@ -484,7 +484,7 @@ function shimInit(options: ShimInitOptions = null) {
shim.fetchBlob = async function(url: any, options: FetchBlobOptions) {
if (!options || !options.path) throw new Error('fetchBlob: target file path is missing');
if (!options.method) options.method = 'GET';
if (!options.downloadController) options.downloadController = new DummyDownloadController();
if (!options.downloadController) options.downloadController = null;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not necessary

// if (!('maxRetry' in options)) options.maxRetry = 5;

// 21 maxRedirects is the default amount from follow-redirects library
Expand Down Expand Up @@ -562,7 +562,9 @@ function shimInit(options: ShimInitOptions = null) {

const request = http.request(requestOptions, (response: any) => {

response.on('data', downloadController.handleChunk(request));
if (downloadController !== null) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for strict null checking here. What you want to check is if it's defined or not and for that just if (downloadController) is enough. There are places maybe where we need to make the difference between undefined and null or other falsy values, but not when we just need to know if something exists or not.

response.on('data', downloadController.handleChunk(request));
}

response.pipe(file);

Expand Down
Loading