You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I'm not crazy about the current impl_reader! design for generating a "reader" for each file type and it would be nice if that could be replaced by a generic solution like Reader<T> (although we'd still need to impl a trait like RecordReader on this to let it be boxed up for e.g. get_reader, I think?) A generic design like this could potentially be used to replace the experimental chunk interface for multi-threaded reading (new below is inspired by the init_state in that).
I wrote out the following before I started running into some lifetime issues; I think the type bounds for S and T probably need to be tweaked? Refactoring to next(&'r mut self) -> Result<Option<Vec<Value<'r>>>, EtError> breaks the caller because they're no longer able to iterate over multiple records?
use core::fmt::Debug;
use core::marker::PhantomData;
use crate::parsers::FromSlice;
use crate::record::StateMetadata;
/// A reader that abstracts over parsing files
#[derive(Debug)]
pub struct Reader<'r, S, T> where
S: Debug + 'r,
T: Debug + FromSlice<'r, State=&'r mut S>,
{
rb: ReadBuffer<'r>,
state: S,
returns: PhantomData<T>,
}
impl<'r, S, T> Reader<'r, S, T> where
S: Debug + 'r,
T: Debug + FromSlice<'r, State=&'r mut S> + Into<Vec<Value<'r>>>,
{
/// Create a new Reader
pub fn new<B, P>(data: B, params: Option<P>) -> Result<Self, EtError>
where
B: TryInto<ReadBuffer<'r>>,
EtError: From<<B as TryInto<ReadBuffer<'r>>>::Error>,
S: for<'a> FromSlice<'a, State = P>,
P: Default,
{
let mut rb = data.try_into()?;
if let Some(state) = rb.next::<S>(params.unwrap_or_default())? {
Ok(Reader {
rb,
state,
returns: PhantomData,
})
} else {
Err(format!(
"Could not initialize state {}",
::core::any::type_name::<S>()
)
.into())
}
}
/// Get the next record from the `Reader`
#[allow(clippy::should_implement_trait)]
pub fn next(&mut self) -> Result<Option<T>, EtError> {
// FIXME: fails on the next line because of lifetime issues with borrowing both `self.rb` and `self.state`
self.rb.next::<T>(&mut self.state)
}
}
impl<'r, S, T> RecordReader for Reader<'r, S, T> where
S: Debug + StateMetadata + 'r,
T: Debug + FromSlice<'r, State=&'r mut S> + Into<Vec<Value<'r>>>,
{
fn next_record(&mut self) -> Result<Option<Vec<Value>>, EtError> {
Ok(self.next()?.map(|v| v.into()))
}
fn headers(&self) -> Vec<String> {
self.state.header().iter().map(|s| s.to_string()).collect()
}
fn metadata(&self) -> BTreeMap<String, Value> {
self.state.metadata()
}
}
The text was updated successfully, but these errors were encountered:
I got a little further, but having a lot of trouble with the impl RecordReader for Reader<S, T> where T isn't actually returned from next_record. T's lifetime mucks up the whole conversion process because it should be 'a from next_value, but you can't actually set that outside the scope of next_value.
Construct a reader with: FileType::Bam => Box::new(Reader::<_, parsers::sam::BamRecord>::new(data, None)?),
Code below is changes from above (also tried modifying the signature of RecordReader, but still can't get it to work).
/// A reader that abstracts over parsing files
#[derive(Debug)]
pub struct Reader<'r, S, T> where
S: Debug,
T: Debug,
{
rb: ReadBuffer<'r>,
state: S,
returns: PhantomData<T>,
}
impl<'r, 's, S, T> Reader<'r, S, T> where
S: Debug,
T: Debug,
{
/// Get the next record from the `Reader`
#[allow(clippy::should_implement_trait)]
pub fn next<'a>(&'a mut self) -> Result<Option<T>, EtError> where
T: FromSlice<'a, State=&'a mut S>,
{
self.rb.next::<T>(&mut self.state)
}
/// Get the next record from the `Reader`
#[allow(clippy::should_implement_trait)]
pub fn next_value<'a>(&'a mut self) -> Result<Option<Vec<Value>>, EtError> where
T: FromSlice<'a, State=&'a mut S> + Into<Vec<Value<'a>>>
{
Ok(self.next()?.map(|v| v.into()))
}
}
impl<'r> RecordReader for Reader<'r, parsers::fasta::FastaState, parsers::fasta::FastaRecord<'r>> where Self: 'r {
fn next_record<'a>(&'a mut self) -> Result<Option<Vec<Value<'a>>>, EtError> {
self.next_value() // <- error here: expected FastaRecord<'r>, found FastaRecord<'_>
}
...
}
I'm not crazy about the current
impl_reader!
design for generating a "reader" for each file type and it would be nice if that could be replaced by a generic solution likeReader<T>
(although we'd still need toimpl
a trait likeRecordReader
on this to let it be boxed up for e.g.get_reader
, I think?) A generic design like this could potentially be used to replace the experimentalchunk
interface for multi-threaded reading (new
below is inspired by theinit_state
in that).I wrote out the following before I started running into some lifetime issues; I think the type bounds for S and T probably need to be tweaked? Refactoring to
next(&'r mut self) -> Result<Option<Vec<Value<'r>>>, EtError>
breaks the caller because they're no longer able to iterate over multiple records?The text was updated successfully, but these errors were encountered: