What are the performance impacts of 'functional' Rust?















I am following the Rust track on Exercism.io. I have a fair amount of C/C++ experience. I like the 'functional' elements of Rust but I'm concerned about the relative performance.



I solved the 'run length encoding' problem:



pub fn encode(source: &str) -> String {
    let mut retval = String::new();
    let firstchar = source.chars().next();
    let mut currentchar = match firstchar {
        Some(x) => x,
        None => return retval,
    };
    let mut currentcharcount: u32 = 0;
    for c in source.chars() {
        if c == currentchar {
            currentcharcount += 1;
        } else {
            if currentcharcount > 1 {
                retval.push_str(&currentcharcount.to_string());
            }
            retval.push(currentchar);
            currentchar = c;
            currentcharcount = 1;
        }
    }
    if currentcharcount > 1 {
        retval.push_str(&currentcharcount.to_string());
    }
    retval.push(currentchar);
    retval
}


I noticed that one of the top-rated answers looked more like this:



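extern crate itertools;

use itertools::Itertools;

pub fn encode(data: &str) -> String {
    // NOTE: everything after `.group_by(` was lost in extraction; the rest is
    // reconstructed from the answers (each group is mapped to a String, then collected).
    data.chars()
        .group_by(|&c| c)
        .into_iter()
        .map(|(c, group)| match group.count() {
            1 => c.to_string(),
            n => format!("{}{}", n, c),
        })
        .collect()
}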

I love the top-rated solution; it is simple, functional, and elegant. This is what they promised me Rust would be all about. Mine, on the other hand, is gross and full of mutable variables. You can tell I'm used to C++.

My problem is that the functional style has a SIGNIFICANT performance impact. I tested both versions with the same 4MB of random data encoded 1000 times. My imperative solution took under 10 seconds; the functional solution took about 2 minutes 30 seconds.



  • Why is the functional style so much slower than the imperative style?

  • Is there some problem with the functional implementation which is causing such a huge slowdown?

  • If I want to write high performance code, should I ever use this functional style?





























  • The difference looks extremely surprising to me; that's a factor of 15! Have you checked that both implementations yield the same result?

    – Matthieu M.
    9 hours ago











  • @MatthieuM. yep, or at least both functions pass all unit tests defined by exercism.

    – David Copernicus Bowie
    8 hours ago






  • I am thinking that there should be a way to replace the map step with a flat_map step, with a special-purpose iterator implementation taking the character and count and outputting the required stream of bytes. Forward encoding the integer is a bit tricky, but not too bad with count_leading_zeroes giving a hint of the magnitude (clz(i) * 77 / 256 gives the log 10).

    – Matthieu M.
    8 hours ago
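
The integer-encoding half of that idea can be sketched with only the standard library. This is a rough illustration rather than the commenter's code: the names decimal_digits and push_decimal are hypothetical, the magnitude hint is taken from the bit length (32 minus leading_zeros) instead of the raw clz value, and the flat_map iterator itself is not shown.

```rust
/// Number of decimal digits in `n`, estimated from its bit length.
/// (Hypothetical helper; 77/256 is used as an approximation of log10(2).)
pub fn decimal_digits(n: u32) -> u32 {
    if n == 0 {
        return 1;
    }
    let bits = 32 - n.leading_zeros(); // bit length, in 1..=32
    // `t` is either floor(log10(n)) or one more than that.
    let t = (bits * 77) >> 8;
    if n >= 10u32.pow(t) {
        t + 1
    } else {
        t
    }
}

/// Appends the decimal form of `n` to `out`, most significant digit first,
/// without allocating a temporary String.
pub fn push_decimal(out: &mut String, n: u32) {
    let mut divisor = 10u32.pow(decimal_digits(n) - 1);
    let mut rest = n;
    while divisor > 0 {
        let digit = rest / divisor; // always in 0..=9
        out.push(char::from(b'0' + digit as u8));
        rest %= divisor;
        divisor /= 10;
    }
}
```

Knowing the digit count up front is what allows the digits to be written front to back; the usual alternative is to fill a small stack buffer from the back and copy it out.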




































functional-programming rust imperative-programming






asked 9 hours ago by David Copernicus Bowie (new contributor); edited 8 hours ago by Shepmaster
























2 Answers














Let's review the functional implementation!



Memory Allocations



One of the big issues with the functional solution proposed here is the closure passed to map, which allocates a lot: every group of characters, even a single character, is first mapped to its own String before being collected.

It also uses the format machinery, which is known to be relatively slow.

Sometimes people try way too hard to get a "pure" functional solution. Instead, this:

let mut result = String::new();
for (c, group) in &source.chars().group_by(|&c| c) {
    let count = group.count();
    if count > 1 {
        result.push_str(&count.to_string());
    }

    result.push(c);
}

is about as verbose, yet it only allocates when count > 1, just like your solution does, and it does not use the format machinery either.



I would expect a significant performance win compared to the full functional solution, while at the same time still leveraging group_by for extra readability compared to the full imperative solution. Sometimes, you ought to mix and match!
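
For comparison, the same mix-and-match shape can be sketched with only the standard library. This is an illustration under assumptions, not part of the answer above: encode_peekable is a hypothetical name, and Peekable::next_if_eq (standard library, Rust 1.51 or later) stands in for group_by.

```rust
/// Run-length encodes `source`, e.g. "AABCCC" becomes "2AB3C".
/// Hypothetical std-only variant of the loop above.
pub fn encode_peekable(source: &str) -> String {
    let mut result = String::new();
    let mut chars = source.chars().peekable();

    while let Some(c) = chars.next() {
        // Consume the rest of the run without building a nested iterator.
        let mut count = 1usize;
        while chars.next_if_eq(&c).is_some() {
            count += 1;
        }

        // Only allocate a temporary String when there is a count to print.
        if count > 1 {
            result.push_str(&count.to_string());
        }
        result.push(c);
    }

    result
}
```

Whether this is faster than the group_by loop would have to be measured; it only removes the nested group iterator from the picture.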































  • That certainly gives us a speed boost, but it is still around 3x slower than the imperative version (30s rather than 10s in my tests). In fact, even if I only push a constant letter in that for loop it is still about 14s, so around 50% slower than the imperative version. That leads me to believe that group_by is probably not zero cost for this use case. Answer accepted anyway!

    – David Copernicus Bowie
    7 hours ago












  • @DavidCopernicusBowie For complex questions like this, you may want to wait before accepting an answer. See Shepmaster's one below =)

    – Paul Stenne
    6 hours ago






  • How can I append a formatted string to an existing String?

    – Shepmaster
    6 hours ago
































TL;DR



A functional implementation can be faster than your original procedural implementation, in certain cases.




Why is the functional style so much slower than the imperative style? Is there some problem with the functional implementation which is causing such a huge slowdown?




As Matthieu M. already pointed out, the important thing to note is that the algorithm matters. How that algorithm is expressed (procedural, imperative, object-oriented, functional, declarative) generally doesn't matter.



I see two main issues with the functional code:



  • Allocating numerous strings over and over is inefficient. In the original functional implementation, this is done via to_string and format!.


  • There's the overhead of using group_by, which exists to give a nested iterator, which you don't need just to get the counts.
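
As a small illustration of the first point, formatted output can be appended to an existing String through std::fmt::Write instead of building a temporary String for every run. The helper name append_run is hypothetical and is not taken from the code in this answer.

```rust
use std::fmt::Write;

/// Appends one run to `out`: just the character for a run of one,
/// otherwise the count followed by the character.
pub fn append_run(out: &mut String, c: char, count: usize) {
    match count {
        1 => out.push(c),
        // `write!` formats straight into `out`; no intermediate String is created.
        n => write!(out, "{}{}", n, c).expect("writing to a String cannot fail"),
    }
}
```

This still goes through the formatting machinery for runs longer than one, so it addresses the allocations rather than the cost of formatting itself.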


Using more of itertools (batching, take_while_ref, format_with) brings the two implementations much closer:



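pub fn encode_slim(data: &str) -> String {
    // NOTE: the batching/format_with lines were lost in extraction; they are
    // reconstructed from the itertools adaptors named above.
    data.chars()
        .batching(|it| it.next().map(|v| (v, it.take_while_ref(|&v2| v2 == v).count() + 1)))
        .format_with("", |(c, count), f| match count {
            1 => f(&c),
            n => f(&format_args!("{}{}", n, c)),
        })
        .to_string()
}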


A benchmark of 1KiB of random alphanumeric data:



encode (procedural) time: [4.9922 us 5.0386 us 5.0940 us]
Found 15 outliers among 100 measurements (15.00%)
2 (2.00%) high mild
13 (13.00%) high severe

encode (fast) time: [6.6025 us 6.6636 us 6.7371 us]
Found 10 outliers among 100 measurements (10.00%)
4 (4.00%) high mild
6 (6.00%) high severe


And 4MiB of data, compiled with RUSTFLAGS='-C target-cpu=native':



encode (procedural) time: [21.082 ms 21.620 ms 22.211 ms]

encode (fast) time: [26.457 ms 27.104 ms 27.882 ms]
Found 7 outliers among 100 measurements (7.00%)
4 (4.00%) high mild
3 (3.00%) high severe


If you are interested in creating your own iterator, you can mix-and-match the procedural code with more functional code:



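struct RunLength<I> {
    iter: I,
    saved: Option<char>,
}

impl<I> RunLength<I>
where
    I: Iterator<Item = char>,
{
    fn new(mut iter: I) -> Self {
        let saved = iter.next(); // See footnote 1
        Self { iter, saved }
    }
}

impl<I> Iterator for RunLength<I>
where
    I: Iterator<Item = char>,
{
    type Item = (char, usize);

    fn next(&mut self) -> Option<Self::Item> {
        // NOTE: the text between `or_else(` and the `match` in encode_tiny was lost
        // in extraction; it is reconstructed from the description below.
        let c = self.saved.take().or_else(|| self.iter.next())?;

        let mut count = 1;
        while let Some(n) = self.iter.next() {
            if n == c {
                count += 1
            } else {
                self.saved = Some(n);
                break;
            }
        }

        Some((c, count))
    }
}

pub fn encode_tiny(data: &str) -> String {
    use std::fmt::Write;

    RunLength::new(data.chars()).fold(String::new(), |mut s, (c, count)| {
        match count {
            1 => s.push(c),
            n => write!(&mut s, "{}{}", n, c).unwrap(),
        }
        s
    })
}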


1 — thanks to Stargateur for pointing out that eagerly getting the first value helps branch prediction.



4MiB of data, compiled with RUSTFLAGS='-C target-cpu=native':



encode (procedural) time: [19.888 ms 20.301 ms 20.794 ms]
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe

encode (tiny) time: [19.150 ms 19.262 ms 19.399 ms]
Found 11 outliers among 100 measurements (11.00%)
5 (5.00%) high mild
6 (6.00%) high severe


I believe this more clearly shows the main fundamental difference between the two implementations: an iterator-based solution is resumable. Every time we call next, we need to see if there was a previous character that we've read (self.saved). This adds a branch to the code that isn't there in the procedural code.



On the flip side, the iterator-based solution is more flexible — we can now compose all sorts of transformations on the data, or write directly to a file instead of a String, etc. The custom iterator can be extended to operate on a generic type instead of char as well, making it very flexible.
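
A minimal sketch of that generic extension, assuming only that the items can be compared with ==. The name Runs is hypothetical; the structure mirrors the saved-item approach described above rather than reproducing the answer's exact code.

```rust
/// Run-length iterator over any item type that supports `==`.
pub struct Runs<I: Iterator> {
    iter: I,
    saved: Option<I::Item>,
}

impl<I> Runs<I>
where
    I: Iterator,
    I::Item: PartialEq,
{
    pub fn new(mut iter: I) -> Self {
        // Eagerly fetch the first item, as in the char-only version.
        let saved = iter.next();
        Self { iter, saved }
    }
}

impl<I> Iterator for Runs<I>
where
    I: Iterator,
    I::Item: PartialEq,
{
    type Item = (I::Item, usize);

    fn next(&mut self) -> Option<Self::Item> {
        // Resume from the item that ended the previous run, if there is one.
        let current = self.saved.take().or_else(|| self.iter.next())?;

        let mut count = 1;
        for item in self.iter.by_ref() {
            if item == current {
                count += 1;
            } else {
                // Remember the first item of the next run for the next call.
                self.saved = Some(item);
                break;
            }
        }

        Some((current, count))
    }
}
```

Because it is an ordinary iterator, it can be driven one run at a time, for example Runs::new(bytes.iter().copied()).take(10), which is the resumability the paragraph above refers to.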



See also:



  • How can I add new methods to Iterator?


If I want to write high performance code, should I ever use this functional style?




I would, until benchmarking shows that it's the bottleneck. Then evaluate why it's the bottleneck.
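
For a quick first measurement before reaching for a full benchmark harness, a crude timer along these lines may be enough. It is a sketch under assumptions: time_runs is a hypothetical helper, it measures wall-clock time only, and the numbers mean little unless the code is compiled with optimizations (for example cargo run --release).

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `f` `runs` times and returns the total elapsed time together with
/// the total number of bytes produced.
pub fn time_runs<F>(runs: u32, mut f: F) -> (Duration, usize)
where
    F: FnMut() -> String,
{
    let start = Instant::now();
    let mut bytes = 0;
    for _ in 0..runs {
        // black_box discourages the optimizer from discarding the result.
        bytes += black_box(f()).len();
    }
    (start.elapsed(), bytes)
}
```

Usage would look like time_runs(1000, || encode(&data)) for each implementation in turn. A statistical harness handles warm-up and outliers, which this does not.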



Supporting code



Always got to show your work, right?



benchmark.rs



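use criterion::{criterion_group, criterion_main, Criterion}; // 0.2.11
use rle::*;

// NOTE: the closure bodies were lost in extraction; each benchmark is
// reconstructed on the pattern of the fragments that survived.
fn criterion_benchmark(c: &mut Criterion) {
    let data = rand_data(4 * 1024 * 1024);

    c.bench_function("encode (procedural)", {
        let data = data.clone();
        move |b| b.iter(|| encode_proc(&data))
    });

    c.bench_function("encode (functional)", {
        let data = data.clone();
        move |b| b.iter(|| encode_iter(&data))
    });

    c.bench_function("encode (fast)", {
        let data = data.clone();
        move |b| b.iter(|| encode_slim(&data))
    });

    c.bench_function("encode (tiny)", {
        let data = data.clone();
        move |b| b.iter(|| encode_tiny(&data))
    });
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);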


lib.rs



use itertools::Itertools; // 0.8.0
use rand; // 0.6.5

pub fn rand_data(len: usize) -> String {
    use rand::distributions::{Alphanumeric, Distribution};
    let mut rng = rand::thread_rng();
    Alphanumeric.sample_iter(&mut rng).take(len).collect()
}

pub fn encode_proc(source: &str) -> String {
    let mut retval = String::new();
    let firstchar = source.chars().next();
    let mut currentchar = match firstchar {
        Some(x) => x,
        None => return retval,
    };
    let mut currentcharcount: u32 = 0;
    for c in source.chars() {
        if c == currentchar {
            currentcharcount += 1;
        } else {
            if currentcharcount > 1 {
                retval.push_str(&currentcharcount.to_string());
            }
            retval.push(currentchar);
            currentchar = c;
            currentcharcount = 1;
        }
    }
    if currentcharcount > 1 {
        retval.push_str(&currentcharcount.to_string());
    }
    retval.push(currentchar);
    retval
}

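// NOTE: the closure bodies of encode_iter and encode_slim were lost in
// extraction; they are reconstructed from the prose of the question and answers.
pub fn encode_iter(data: &str) -> String {
    data.chars()
        .group_by(|&c| c)
        .into_iter()
        .map(|(c, group)| match group.count() {
            1 => c.to_string(),
            n => format!("{}{}", n, c),
        })
        .collect()
}

pub fn encode_slim(data: &str) -> String {
    data.chars()
        .batching(|it| it.next().map(|v| (v, it.take_while_ref(|&v2| v2 == v).count() + 1)))
        .format_with("", |(c, count), f| match count {
            1 => f(&c),
            n => f(&format_args!("{}{}", n, c)),
        })
        .to_string()
}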

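struct RunLength<I> {
    iter: I,
    saved: Option<char>,
}

impl<I> RunLength<I>
where
    I: Iterator<Item = char>,
{
    fn new(mut iter: I) -> Self {
        let saved = iter.next();
        Self { iter, saved }
    }
}

impl<I> Iterator for RunLength<I>
where
    I: Iterator<Item = char>,
{
    type Item = (char, usize);

    fn next(&mut self) -> Option<Self::Item> {
        // NOTE: the text between `or_else(` and the `match` in encode_tiny was lost
        // in extraction; it is reconstructed from the answer's description.
        let c = self.saved.take().or_else(|| self.iter.next())?;

        let mut count = 1;
        while let Some(n) = self.iter.next() {
            if n == c {
                count += 1
            } else {
                self.saved = Some(n);
                break;
            }
        }

        Some((c, count))
    }
}

pub fn encode_tiny(data: &str) -> String {
    use std::fmt::Write;

    RunLength::new(data.chars()).fold(String::new(), |mut s, (c, count)| {
        match count {
            1 => s.push(c),
            n => write!(&mut s, "{}{}", n, c).unwrap(),
        }
        s
    })
}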

#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn all_the_same() {
        let data = rand_data(1024);

        let a = encode_proc(&data);
        let b = encode_iter(&data);
        let c = encode_slim(&data);
        let d = encode_tiny(&data);

        assert_eq!(a, b);
        assert_eq!(a, c);
        assert_eq!(a, d);
    }
}






























  • Great iterator!

    – Matthieu M.
    4 hours ago



















edited 7 hours ago by Stargateur

answered 9 hours ago by Matthieu M.












  • That certainly gives us a speed boost, but it is still around 3x slower than the imperative version (30s rather than 10s in my tests). In fact, even if I only push a constant letter in that for loop it is still about 14s, so around 50% slower than the imperative version. That leads me to believe that group_by is probably not zero cost for this use case. Answer accepted anyway!

    – David Copernicus Bowie
    7 hours ago












  • @DavidCopernicusBowie For complex questions like this, you may want to wait before accepting an answer. See Shepmaster's one below =)

    – Paul Stenne
    6 hours ago






  • How can I append a formatted string to an existing String?

    – Shepmaster
    6 hours ago






























9














TL;DR



A functional implementation can be faster than your original procedural implementation, in certain cases.




Why is the functional style so much slower than the imperative style? Is there some problem with the functional implementation which is causing such a huge slowdown?




As Matthieu M. already pointed out, the important thing to note is that the algorithm matters. How that algorithm is expressed (procedural, imperative, object-oriented, functional, declarative) generally doesn't matter.



I see two main issues with the functional code:



  • Allocating numerous strings over and over is inefficient. In the original functional implementation, this is done via to_string and format!.


  • group_by adds overhead: it exists to hand out a nested iterator per group, which you don't need just to get the counts.
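The first point can be illustrated with the standard library alone. The sketch below contrasts building a temporary String for each run with formatting straight into the output buffer through std::fmt::Write. Both helper names are hypothetical, and the actual cost difference is something to measure rather than assume.

```rust
use std::fmt::Write;

/// Appends one run by first building a temporary `String` (one extra
/// heap allocation per call). Hypothetical helper, for illustration only.
pub fn push_run_alloc(out: &mut String, c: char, count: usize) {
    if count > 1 {
        out.push_str(&format!("{}{}", count, c));
    } else {
        out.push_str(&c.to_string());
    }
}

/// Appends one run by formatting directly into `out`, without a
/// temporary `String`. Hypothetical helper, for illustration only.
pub fn push_run_in_place(out: &mut String, c: char, count: usize) {
    if count > 1 {
        // Writing into a `String` does not fail, so `unwrap` is fine here.
        write!(out, "{}{}", count, c).unwrap();
    } else {
        out.push(c);
    }
}
```

Both helpers produce the same text; the second one only touches the heap when the output buffer itself has to grow.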


Using more of itertools (batching, take_while_ref, format_with) brings the two implementations much closer:



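// The closure arguments did not survive extraction; they are reconstructed
// here from the adaptors named above (batching, take_while_ref, format_with).
pub fn encode_slim(data: &str) -> String {
    data.chars()
        .batching(|it| it.next().map(|v| (v, it.take_while_ref(|&v2| v2 == v).count() + 1)))
        .format_with("", |(c, count), f| match count {
            1 => f(&c),
            n => f(&format_args!("{}{}", n, c)),
        })
        .to_string()
}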



A benchmark of 1KiB of random alphanumeric data:



encode (procedural) time: [4.9922 us 5.0386 us 5.0940 us]
Found 15 outliers among 100 measurements (15.00%)
2 (2.00%) high mild
13 (13.00%) high severe

encode (fast) time: [6.6025 us 6.6636 us 6.7371 us]
Found 10 outliers among 100 measurements (10.00%)
4 (4.00%) high mild
6 (6.00%) high severe


And 4MiB of data, compiled with RUSTFLAGS='-C target-cpu=native':



encode (procedural) time: [21.082 ms 21.620 ms 22.211 ms]

encode (fast) time: [26.457 ms 27.104 ms 27.882 ms]
Found 7 outliers among 100 measurements (7.00%)
4 (4.00%) high mild
3 (3.00%) high severe


If you are interested in creating your own iterator, you can mix-and-match the procedural code with more functional code:



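struct RunLength<I> {
    iter: I,
    saved: Option<char>,
}

impl<I> RunLength<I>
where
    I: Iterator<Item = char>,
{
    fn new(mut iter: I) -> Self {
        let saved = iter.next(); // See footnote 1
        Self { iter, saved }
    }
}

impl<I> Iterator for RunLength<I>
where
    I: Iterator<Item = char>,
{
    type Item = (char, usize);

    // The rest of `next` and the start of `encode_tiny` were lost in
    // extraction; they are reconstructed from the surrounding description.
    fn next(&mut self) -> Option<Self::Item> {
        // Resume with the character that ended the previous run, if any.
        let c = self.saved.take().or_else(|| self.iter.next())?;

        let mut count = 1;
        while let Some(n) = self.iter.next() {
            if n == c {
                count += 1
            } else {
                self.saved = Some(n);
                break;
            }
        }

        Some((c, count))
    }
}

pub fn encode_tiny(data: &str) -> String {
    use std::fmt::Write;

    RunLength::new(data.chars()).fold(String::new(), |mut s, (c, count)| {
        match count {
            1 => s.push(c),
            n => write!(&mut s, "{}{}", n, c).unwrap(),
        }
        s
    })
}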



1 — thanks to Stargateur for pointing out that eagerly getting the first value helps branch prediction.



4MiB of data, compiled with RUSTFLAGS='-C target-cpu=native':



encode (procedural) time: [19.888 ms 20.301 ms 20.794 ms]
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe

encode (tiny) time: [19.150 ms 19.262 ms 19.399 ms]
Found 11 outliers among 100 measurements (11.00%)
5 (5.00%) high mild
6 (6.00%) high severe


I believe this shows the fundamental difference between the two implementations more clearly: an iterator-based solution is resumable. Every time we call next, we need to check whether there is a previously read character (self.saved). This adds a branch that isn't there in the procedural code.



On the flip side, the iterator-based solution is more flexible: we can now compose all sorts of transformations on the data, or write directly to a file instead of a String. The custom iterator can also be extended to operate on a generic type instead of char.
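Both kinds of flexibility can be sketched with the standard library. The code below is an assumption-laden illustration rather than the answer's own code: Runs and encode_to are hypothetical names, Runs generalises the run-length iterator to any item type that implements PartialEq, and encode_to streams the encoding into any io::Write sink.

```rust
use std::io::{self, Write};

/// Run-length iterator over any comparable item type.
/// Hypothetical generalisation of `RunLength`; the name is illustrative.
pub struct Runs<I: Iterator> {
    iter: I,
    saved: Option<I::Item>,
}

impl<I: Iterator> Runs<I> {
    pub fn new(mut iter: I) -> Self {
        // Eagerly fetch the first item, as the char-based version does.
        let saved = iter.next();
        Runs { iter, saved }
    }
}

impl<I> Iterator for Runs<I>
where
    I: Iterator,
    I::Item: PartialEq,
{
    type Item = (I::Item, usize);

    fn next(&mut self) -> Option<Self::Item> {
        // Resume from the item that ended the previous run, if any.
        let current = self.saved.take()?;
        let mut count = 1;
        while let Some(item) = self.iter.next() {
            if item == current {
                count += 1;
            } else {
                self.saved = Some(item);
                break;
            }
        }
        Some((current, count))
    }
}

/// Streams the encoding of `data` into any `io::Write` sink, such as a
/// file or a `Vec<u8>`. Hypothetical helper, for illustration only.
pub fn encode_to<W: Write>(data: &str, mut out: W) -> io::Result<()> {
    for (c, count) in Runs::new(data.chars()) {
        match count {
            1 => write!(out, "{}", c)?,
            n => write!(out, "{}{}", n, c)?,
        }
    }
    Ok(())
}
```

Passing a File (ideally wrapped in a BufWriter) as the sink avoids building the whole String in memory, and Runs::new works on an iterator of bytes or integers just as well as on chars.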



See also:



  • How can I add new methods to Iterator?


If I want to write high performance code, should I ever use this functional style?




I would, until benchmarking shows that it's the bottleneck. Then evaluate why it's the bottleneck.
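Before reaching for a full harness, a rough first measurement can be taken with std::time::Instant. The helper below is a hypothetical sketch, not a substitute for criterion: it has no warm-up, no outlier detection, and the optimiser may remove work whose result is unused, so treat its numbers as a hint only.

```rust
use std::time::{Duration, Instant};

/// Runs `f` `iters` times and returns the last result together with the
/// mean wall-clock duration per call. `time_it` is a hypothetical helper.
pub fn time_it<T>(iters: u32, mut f: impl FnMut() -> T) -> (T, Duration) {
    assert!(iters > 0, "need at least one iteration");
    let start = Instant::now();
    let mut last = f();
    for _ in 1..iters {
        last = f();
    }
    // `Duration` supports division by `u32`, giving the mean per call.
    (last, start.elapsed() / iters)
}
```

Timing two candidate implementations on the same input with time_it, in a release build, is usually enough to tell whether a proper benchmark is worth setting up.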



Supporting code



Always got to show your work, right?



benchmark.rs



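use criterion::{criterion_group, criterion_main, Criterion}; // 0.2.11
use rle::*;

// The closure bodies did not survive extraction; they are reconstructed
// from the benchmark names and the functions defined in lib.rs.
fn criterion_benchmark(c: &mut Criterion) {
    let data = rand_data(4 * 1024 * 1024);

    c.bench_function("encode (procedural)", {
        let data = data.clone();
        move |b| b.iter(|| encode_proc(&data))
    });

    c.bench_function("encode (functional)", {
        let data = data.clone();
        move |b| b.iter(|| encode_iter(&data))
    });

    c.bench_function("encode (fast)", {
        let data = data.clone();
        move |b| b.iter(|| encode_slim(&data))
    });

    c.bench_function("encode (tiny)", {
        let data = data.clone();
        move |b| b.iter(|| encode_tiny(&data))
    });
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);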


lib.rs



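use itertools::Itertools; // 0.8.0
use rand; // 0.6.5

pub fn rand_data(len: usize) -> String {
    use rand::distributions::{Alphanumeric, Distribution};
    let mut rng = rand::thread_rng();
    Alphanumeric.sample_iter(&mut rng).take(len).collect()
}

pub fn encode_proc(source: &str) -> String {
    let mut retval = String::new();
    let firstchar = source.chars().next();
    let mut currentchar = match firstchar {
        Some(x) => x,
        None => return retval,
    };
    let mut currentcharcount: u32 = 0;
    for c in source.chars() {
        if c == currentchar {
            currentcharcount += 1;
        } else {
            if currentcharcount > 1 {
                retval.push_str(&currentcharcount.to_string());
            }
            retval.push(currentchar);
            currentchar = c;
            currentcharcount = 1;
        }
    }
    if currentcharcount > 1 {
        retval.push_str(&currentcharcount.to_string());
    }
    retval.push(currentchar);
    retval
}

// Closure bodies in the functions below did not survive extraction; they are
// reconstructed from the descriptions in the answers (to_string and format!
// in encode_iter; batching, take_while_ref and format_with in encode_slim).
pub fn encode_iter(data: &str) -> String {
    data.chars()
        .group_by(|&c| c)
        .into_iter()
        .map(|(c, group)| match group.count() {
            1 => c.to_string(),
            n => format!("{}{}", n, c),
        })
        .collect()
}

pub fn encode_slim(data: &str) -> String {
    data.chars()
        .batching(|it| it.next().map(|v| (v, it.take_while_ref(|&v2| v2 == v).count() + 1)))
        .format_with("", |(c, count), f| match count {
            1 => f(&c),
            n => f(&format_args!("{}{}", n, c)),
        })
        .to_string()
}

struct RunLength<I> {
    iter: I,
    saved: Option<char>,
}

impl<I> RunLength<I>
where
    I: Iterator<Item = char>,
{
    fn new(mut iter: I) -> Self {
        let saved = iter.next();
        Self { iter, saved }
    }
}

impl<I> Iterator for RunLength<I>
where
    I: Iterator<Item = char>,
{
    type Item = (char, usize);

    fn next(&mut self) -> Option<Self::Item> {
        // Resume with the character that ended the previous run, if any.
        let c = self.saved.take().or_else(|| self.iter.next())?;

        let mut count = 1;
        while let Some(n) = self.iter.next() {
            if n == c {
                count += 1
            } else {
                self.saved = Some(n);
                break;
            }
        }

        Some((c, count))
    }
}

pub fn encode_tiny(data: &str) -> String {
    use std::fmt::Write;

    RunLength::new(data.chars()).fold(String::new(), |mut s, (c, count)| {
        match count {
            1 => s.push(c),
            n => write!(&mut s, "{}{}", n, c).unwrap(),
        }
        s
    })
}

#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn all_the_same() {
        let data = rand_data(1024);

        let a = encode_proc(&data);
        let b = encode_iter(&data);
        let c = encode_slim(&data);
        let d = encode_tiny(&data);

        assert_eq!(a, b);
        assert_eq!(a, c);
        assert_eq!(a, d);
    }
}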
































  • Great iterator!

    – Matthieu M.
    4 hours ago























edited 5 hours ago

answered 6 hours ago by Shepmaster





















