read big file line by line #4007
In Deno, the I/O model is different. The But to give an example of your code above:

import { BufReader } from "https://deno.land/std@v0.33.0/io/mod.ts";
import { TextProtoReader } from "https://deno.land/std@v0.33.0/textproto/mod.ts";
import { parse } from "https://deno.land/std@v0.33.0/flags/mod.ts";
import { basename } from "https://deno.land/std@v0.33.0/path/mod.ts";

export async function read(r: Deno.Reader) {
  const reader = new TextProtoReader(BufReader.create(r));
  console.log("Reading data...");
  let lineCount = 0;
  while (true) {
    let line = await reader.readLine();
    if (line === Deno.EOF) break;
    // do something with `line`
    lineCount += 1;
  }
  console.log(`${lineCount} lines read.`);
}

if (import.meta.main) {
  const args = parse(Deno.args, {
    boolean: ["h"],
    alias: {
      h: ["help"]
    }
  });
  if (args.h) {
    printUsage();
    Deno.exit(0);
  }
  const [filename] = args._;
  if (!filename) {
    printUsage();
    Deno.exit(1);
  }
  const file = filename === "-" ? Deno.stdin : await Deno.open(filename);
  await read(file);
  file.close();

  function printUsage() {
    console.error(
      `Usage: deno --allow-read ${basename(import.meta.url)} <filename>`
    );
  }
}

The thing to note is that I'm using
I tried three approaches to read a 307 MB test file:

import { BufReader } from "https://deno.land/std/io/bufio.ts";

export async function stream_file(filename: string) {
  const file = await Deno.open(filename);
  const bufReader = new BufReader(file);
  console.log("Reading data...");
  let line: string | any;
  let lineCount: number = 0;
  while ((line = await bufReader.readString("\n")) != Deno.EOF) {
    lineCount++;
    // do something with `line`.
  }
  file.close();
  console.log(`${lineCount} lines read.`);
}

import { BufReader } from "https://deno.land/std@v0.33.0/io/mod.ts";
import { TextProtoReader } from "https://deno.land/std@v0.33.0/textproto/mod.ts";

export async function textProtoReader(filename: string) {
  const r: Deno.Reader = await Deno.open(filename);
  const reader = new TextProtoReader(BufReader.create(r));
  console.log("Reading data...");
  let lineCount = 0;
  while (true) {
    let line = await reader.readLine();
    if (line === Deno.EOF) break;
    // do something with `line`
    lineCount += 1;
  }
  console.log(`${lineCount} lines read.`);
}

import { BufReader } from "https://deno.land/std/io/bufio.ts";

export async function readLine(filename: string) {
  const file = await Deno.open(filename);
  const bufReader = new BufReader(file);
  console.log("Reading data...");
  let line: string | any;
  let lineCount: number = 0;
  while ((line = await bufReader.readLine()) != Deno.EOF) {
    lineCount++;
    // do something with `line`.
  }
  file.close();
  console.log(`${lineCount} lines read.`);
}

Rust itself does it in

use std::fs::File;
use std::io::{self, BufRead};
use std::path::Path;
use std::io::{stdin, stdout, Read, Write};
use std::time::{Instant};

fn main() {
    let start = Instant::now();
    let mut counter = 0;
    // The file ./enwik9 must exist in the current path before this produces output
    if let Ok(lines) = read_lines("./enwik9") {
        // Consumes the iterator; each item is an io::Result<String>
        for _line in lines {
            counter = counter + 1;
        }
        println!("{}", counter)
    }
    let duration = start.elapsed();
    println!("Time elapsed in expensive_function() is: {:?}", duration);
    pause()
}

// Returns an iterator over the lines of the file, read through a BufReader.
fn read_lines<P>(filename: P) -> io::Result<io::Lines<io::BufReader<File>>>
where P: AsRef<Path>, {
    let file = File::open(filename)?;
    Ok(io::BufReader::new(file).lines())
}

// Keeps the console window open until a key is pressed.
fn pause() {
    let mut stdout = stdout();
    stdout.write(b"Press Enter to continue...").unwrap();
    stdout.flush().unwrap();
    stdin().read(&mut [0]).unwrap();
}

Node does that:

const fs = require("fs");
const readline = require("readline");

async function processLineByLine() {
  console.log(Date());
  const fileStream = fs.createReadStream("./enwik9");
  const rl = readline.createInterface({
    input: fileStream,
    crlfDelay: Infinity
  });
  let counter = 0;
  for await (const line of rl) {
    counter++;
  }
  console.log(Date());
  return counter;
}

Can anybody explain why it's different?
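
When comparing runtimes like this, it can help to have a baseline that separates raw read cost from per-line string work. The sketch below is only a measurement aid, not an explanation of the gap: it counts newline bytes in each chunk without decoding anything to a string, and wraps a task in a simple wall-clock timer. The names `countNewlines`, `countLinesInChunks`, and `timeAsync` are hypothetical and do not come from Deno, Node, or this thread.

```typescript
// Hypothetical helper: count "\n" bytes in one chunk without decoding it.
function countNewlines(chunk: Uint8Array): number {
  let count = 0;
  for (let i = 0; i < chunk.length; i++) {
    if (chunk[i] === 0x0a) count++;
  }
  return count;
}

// Hypothetical helper: total newline count over any async source of byte chunks.
async function countLinesInChunks(
  chunks: AsyncIterable<Uint8Array>,
): Promise<number> {
  let total = 0;
  for await (const chunk of chunks) {
    total += countNewlines(chunk);
  }
  return total;
}

// Hypothetical helper: run an async task and report elapsed wall-clock time.
async function timeAsync<T>(
  task: () => Promise<T>,
): Promise<{ result: T; ms: number }> {
  const start = Date.now();
  const result = await task();
  return { result, ms: Date.now() - start };
}
```

If this byte-level count runs much faster than a line-by-line reader in the same runtime, most of the time is likely going into decoding and allocating one string per line rather than into reading the file.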
Side note, feels like something we need to add to the benchmarks.
saostad's approach 3 doesn't seem to be working for me (possibly due to the unversioned import). For others who arrive here via Google looking for a quick copy/paste line-by-line file reader, deepakshrma has a neat example that seems to work well:

import { readLine } from "https://raw.githubusercontent.com/deepakshrma/deno-by-example/feacd84bd5cd5b1a630dfcc72afb3cd64de21b91/examples/file_reader.ts";

let lineIterator = await readLine("yourFile.txt");
for await (let line of lineIterator) {
  console.log(line);
}

(I haven't benchmarked it.)
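
For readers who would rather not depend on a remote import, here is a minimal sketch of a line iterator built only on `TextDecoder` and async iteration. `linesFromChunks` and `stripCr` are hypothetical names, not part of Deno or its standard library. It assumes UTF-8 input and treats both "\n" and "\r\n" as line endings. I haven't benchmarked it either.

```typescript
// Hypothetical helper: drop a trailing "\r" so "\r\n" line endings are handled.
function stripCr(line: string): string {
  return line.endsWith("\r") ? line.slice(0, -1) : line;
}

// Hypothetical helper: yield one line at a time from any async source of
// byte chunks, keeping only the unfinished tail of the text in memory.
async function* linesFromChunks(
  chunks: AsyncIterable<Uint8Array>,
): AsyncGenerator<string> {
  const decoder = new TextDecoder(); // UTF-8 by default (assumption about the input)
  let pending = "";
  for await (const chunk of chunks) {
    // stream: true keeps multi-byte characters split across chunks intact.
    pending += decoder.decode(chunk, { stream: true });
    let start = 0;
    let newline = pending.indexOf("\n", start);
    while (newline !== -1) {
      yield stripCr(pending.slice(start, newline));
      start = newline + 1;
      newline = pending.indexOf("\n", start);
    }
    pending = pending.slice(start);
  }
  pending += decoder.decode(); // flush any bytes still buffered in the decoder
  if (pending.length > 0) yield stripCr(pending);
}
```

In recent Deno versions, the `readable` property of an opened file should be usable as the chunk source, for example `for await (const line of linesFromChunks(file.readable))`. Treat that as an assumption to check against the Deno version you are running, since the APIs in this thread date from std v0.33.0.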
I am trying to read a big file (13,147,026 lines of text) line by line with Deno, but it's giving me an error:
Here is my code:
Versions:
I am trying to find an equivalent of Node streams.
Please advise!