A Rust Iterator example

The idea of having an iterator type allowing you to navigate containers in a standard fashion has become common in many languages including Rust.

Sadly I think the documentation makes a mess of it. I seem to come across two examples on which gives out Fibonacci  sequence numbers and is not really what I'd call an iterator, you can't reiterate it again and so it's plain confusing to post it as an example. The other I come across is a structure with a Vec in it and that just returns the iterator for the Vec which does not exactly help you in coding a real iterator.

So let's start off with what an iterator essentially is an go from there.

When we have a structure which we want to iterate an iterator has two pieces of information it needs, it needs a reference to the structure itself and it needs to know where we are in indexing that structure. We can't store that position index in the structure itself as if we had a nested loop or even multiple threads it could be overwritten.

So to start with:

pub struct BVecIter<'a>{
v: &'a BVec,
pos: usize
}

This is an iterator for a structure of mine called BVec, we won't even define it, in this article. We will see in the code it supports a get_element(index) function and you can work this out for you own code. Note that "pos" our position is all we need here, for other constructs like a linked list you would need another form for pos  maybe  "prev" and "next" references.

If you are coming from another language you may be bemused by <'a> and 'a. These are lifetime indicators. One of the main premises of Rust is that it handles freeing memory differently, no garbage collection and no manual free() statements. The bad news that accompanies this is that if you want that to work rust needs to be given a hand in knowing what's going on in the code.

Let's repeat the same code more verbosely:

pub struct BVecIter<'linkedvalidity>{
v: &'linkedvalidity BVec,
pos: usize
}

I have invented the identifier "linkedvalidity" anything lowercase after the ' in a lifetime indicator is valid. The convention for some reason is 'a, maybe someone just likes being cryptic?

Here we can understand the logic, our iterator is only valid if the underlying BVec reference is still in scope. There life times are tied, though in this case the iterator going out of scope doesn't invalidate our actual BVec just a reference to it.

Moving along:

impl Iterator for BVecIter<'_>{
type Item=u64;
fn next(&mut self) ->Option<Self::Item> {
if self.pos==self.v.used{
return None;
}else{
let r = self.v.get_element(self.pos);
self.pos += 1;
Some(r)
}
}
}

Here we have some mystery meat, we alias Item to the return value we are iterating, that's because the iterator system underlying the code wants that. We also introduce "Self" which refers to the implementation so Option<Self::Item> can be used meaning in this case Option<u64>. Again why Option? that's down to how iterators work and we are just implementing the functions in the desired way.  The last mystery is another lifetime indicator <'_>.  This is known as the anonymous lifetime and I have yet to read a decent explanation of its role, the good thing is that the compiler tells us it's needed, but if it can do that explicitly why doesn't it just do it?

But beyond the language issues, the code reads easily, we are if possible moving to the next position and returning both the value and success/fail in the format needed.

Lastly we need the implementation that allows "for in blah" to work:

impl<'a> IntoIterator for &'a BVec{
type Item=u64;
type IntoIter = BVecIter<'a>;
fn into_iter(self) -> BVecIter<'a>{
BVecIter{v: &self, pos: 0}
}
}

 This again tells the underlying code the iterator type and the value it returns and constructs an iterator type from the reference to a BVec and a starting point.

Again we see the lifetime indicator _a dotted around which tells us that the code won't be valid if anyone of the values goes out of scope.

Having learned how to do an iterator I googled again and confirmed that the vast majority of examples are as I thought at the start, either not real world, or plain lazy.

But on a more positive note, once you have coded your "next" method properly all of Rusts iterator goodies like steps, filters and folds are available to you.

One word missing from this article is Trait. The code above for IntoIterator and Iterator are in fact implementing Rust Traits. In C++ we have classes, in golang we have interfaces, but Rust feels the need to have Traits. So what are they and do they bring anything good to the table?

A Trait worse than death


 



Comments

Popular posts from this blog

A reliable tuya authentication in node-red

Arduino boards - what I wish I'd known when starting

node-red a full home automation example