Streaming example #276
Comments
A Rust program that does the task I described is here:

    use std::fs::File;
    use std::io::{BufRead, BufReader};

    fn main() {
        let file = File::open("test8GB").unwrap();
        let reader = BufReader::new(file);
        let n_lines = reader.lines().fold(0, |n, maybe_line| {
            if let Ok(line) = maybe_line {
                if !line.bytes().all(|b| b.is_ascii_alphabetic()) {
                    panic!("Encountered non-alphabetic character on line {}", n);
                }
                n + 1
            } else {
                panic!("Could not read line");
            }
        });
        println!("{}", n_lines);
    }
|
To generate test data, I wrote the following program:

    fn main() {
        let mut str = String::from("");
        loop {
            str.push('a');
            println!("{}", str);
        }
    }

You can run it with
to generate an 8GB test file consisting of:
|
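For anyone who wants test data of a known size without truncating the output externally, here is a minimal sketch of a bounded variant of the generator. The function name `write_test_data` and the `max_bytes` parameter are made up for illustration and are not part of the program above. Unlike the original loop, it stops once the next whole line would exceed the byte budget.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical helper: writes the lines "a", "aa", "aaa", ... to `out`
/// and stops before the next full line would exceed `max_bytes`.
/// Returns the number of bytes actually written.
fn write_test_data<W: Write>(out: W, max_bytes: u64) -> io::Result<u64> {
    // Buffer the writes so that each short line does not cause a syscall.
    let mut out = BufWriter::new(out);
    let mut line = Vec::new();
    let mut written = 0u64;
    loop {
        line.push(b'a');
        let next = line.len() as u64 + 1; // line content plus '\n'
        if written + next > max_bytes {
            break;
        }
        out.write_all(&line)?;
        out.write_all(b"\n")?;
        written += next;
    }
    out.flush()?;
    Ok(written)
}

fn main() -> io::Result<()> {
    // Small budget for demonstration; a real test file would use a much larger one.
    let stdout = io::stdout();
    let written = write_test_data(stdout.lock(), 20)?;
    eprintln!("wrote {} bytes", written);
    Ok(())
}
```

Because the writer is generic over `Write`, the same function can target a `File` instead of stdout.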
By the way, I know that |
This is how I have done it in redis-rs: https://github.com/mitsuhiko/redis-rs/blob/75bfe24f7f34faad2460343699bd65bdcfaabaf0/src/parser.rs#L213-L282 . I always meant to port that code into a more generalized form in combine, but I kept forgetting. I should have something later today or tomorrow. |
It is on master now, releasing in 4.0 in a few days. |
Hello @Marwes! Thank you for your work on generalizing the redis-rs code. I just researched a bit to see how to adapt the examples in Long story short: For a complete Rust newcomer (but experienced functional programmer) like me, it is not obvious to see how to put things together, and what the best practices for a simple program folding over a list of items parsed from a file are. A small best practice example would tremendously help me and probably quite a few other people out there. I believe that my use-case appears sufficiently often that it could justify a small example in I hope that my feedback is useful to you to improve the experience of users of your library. :) |
Sorry, I should have linked explicitly to what I ported over. With the added |
Thanks, that makes more sense! |
Figured out a way to relax the |
I am trying to create a program that reads files (potentially not fitting into memory) and folds over items found in the file. An item can span several lines.
My problem is that I found no example for "combine" that I could adapt to my use-case. (examples/async.rs seems to go in the right direction, but it is too complicated for me.)

So consider the simplified assignment: Assume we want to implement wc -l, with the little twist that the program should fail when encountering a character that is neither alphabetic nor a newline.

Effectively, this program should recognise the grammar ([a-zA-Z]*\n)* and then output the number of newlines. The program should be able to handle files that do not fit into RAM.

A naive attempt at handling this assignment is:
However, the problem here is that:
I would love a solution where I could just specify a line parser, then run fold(0, |acc, word| acc + 1) on repeated parses of line on some file to obtain the number of lines.

The file should be read lazily, e.g. by using BufRead.
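To make the requested shape concrete, here is a minimal sketch that uses only the standard library, not combine's API. `parse_line` and `fold_lines` are hypothetical names: `parse_line` stands in for a real `line` parser, and `fold_lines` applies it to one chunk at a time read via `BufRead::read_until`, so only the current line is held in memory. One assumption: a final line without a trailing newline is accepted, as `BufRead::lines` would do.

```rust
use std::io::BufRead;

/// Hypothetical stand-in for a `line` parser: accepts `[a-zA-Z]*` followed
/// by an optional trailing '\n' (the newline is missing only at end of input).
fn parse_line(chunk: &[u8]) -> Result<(), String> {
    let body = if chunk.last() == Some(&b'\n') {
        &chunk[..chunk.len() - 1]
    } else {
        chunk
    };
    match body.iter().position(|b| !b.is_ascii_alphabetic()) {
        None => Ok(()),
        Some(col) => Err(format!(
            "non-alphabetic byte {:#04x} at column {}",
            body[col], col
        )),
    }
}

/// Folds `f` over every successfully parsed line, reading lazily from `reader`.
/// The buffer is reused, so memory use is bounded by the longest line.
fn fold_lines<R, A, F>(mut reader: R, init: A, mut f: F) -> Result<A, String>
where
    R: BufRead,
    F: FnMut(A, &[u8]) -> A,
{
    let mut acc = init;
    let mut buf = Vec::new();
    let mut line_no = 0usize;
    loop {
        buf.clear();
        let n = reader
            .read_until(b'\n', &mut buf)
            .map_err(|e| e.to_string())?;
        if n == 0 {
            return Ok(acc); // end of input
        }
        line_no += 1;
        parse_line(&buf).map_err(|e| format!("line {}: {}", line_no, e))?;
        acc = f(acc, &buf);
    }
}

fn main() {
    // An in-memory input for demonstration; `&[u8]` implements `BufRead`.
    // For a large file, pass `BufReader::new(File::open(path)?)` instead.
    let input: &[u8] = b"abc\nDEF\n\nxyz\n";
    let n_lines = fold_lines(input, 0usize, |acc, _line| acc + 1).unwrap();
    println!("{}", n_lines); // prints 4
}
```

Returning a `Result` instead of panicking lets the caller decide how to report a bad line. An item spanning several lines would need a parser that can ask for more input, which this sketch does not attempt.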