epistasis 22 hours ago [-]
> It might seem odd to prefer shell scripting over a full-featured dynamic scripting language, but shell scripts like this have some material advantages over Python:
And thus 99% of bioinformatics pipelines are shell at their heart... You need 10 packages, written in 4 different programming languages, and the common interfaces are files and pipes.
And for that matter, this could use a named pipe rather than a file (assuming `odgi depth` only uses streaming access):
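The command that followed this sentence did not survive in the thread, so here is a minimal sketch of the named-pipe idea rather than the original one-liner. `produce` and `consume` are hypothetical stand-ins: `produce` plays the tool that would write the intermediate graph file, and `consume` plays a reader such as `odgi depth` that expects a file path. It only works if the reader streams its input instead of seeking in it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-ins (not real pipeline tools):
#   produce - writes data to stdout, like the step that emits the graph
#   consume - reads from a *path* argument, like a tool that wants a file
produce() { seq 1 5; }
consume() { awk '{ s += $1 } END { print s }' "$1"; }

workdir=$(mktemp -d)
fifo="$workdir/graph.fifo"
mkfifo "$fifo"

# The writer runs in the background and blocks until a reader opens the FIFO.
produce > "$fifo" &

# The reader is handed what looks like a file path, but the data is streamed
# through the kernel and never lands on disk.
total=$(consume "$fifo")

wait                # reap the background writer
rm -rf "$workdir"   # remove the FIFO and its temporary directory

echo "$total"       # prints 15
```

If the consumer seeks or reads the input twice, this approach fails and a regular file is still needed.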
Which is why bioinformaticians get bad reputations with software engineers. (I still have a fair amount of misplaced pride for adding a shebang to a Makefile once to make a pipeline into a command several decades ago...)
alexpotato 22 hours ago [-]
The "bash is 235x faster than Hadoop" article pops up on HN every so often and this is another great time to link it here:
I added a shebang to a readme once (written in literate style) so the poor engineers on the other side wouldn't have to deal with the multi-step monstrosity within.
aboardRat4 21 hours ago [-]
> It might seem odd to prefer shell scripting over a full-featured dynamic scripting language, but shell scripts like this have some material advantages over Python
Nothing strange. Shell is the most natural dynamic language. It's a shame we don't have better shells.
gianiac 23 hours ago [-]
I really like the IR-based approach. It solves something that's always bothered me about shell pipelines: you're forced to think in terms of serializing bytes, even when both ends of the pipe are the same program and could just share memory. Flash makes that optimization explicit and easy to compose with the rest of the pipeline. One question, though: have you run into any issues with the "opportunistic" binary format substitution (the .flatgfa fallback) when scripts are shared across machines where some files have already been converted and others haven't?
https://adamdrake.com/command-line-tools-can-be-235x-faster-...
And Bash process substitution allows writing it all without an explicitly named pipe, though it may look a bit ugly:
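As a rough illustration of the process-substitution variant mentioned in the thread (the original command was not preserved), the sketch below hands a file-expecting reader a `<(...)` path. `produce` and `consume` are hypothetical stand-ins for the graph-producing step and for a path-taking reader such as `odgi depth`. This is Bash-specific syntax and, as with a named pipe, it assumes the reader only streams.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-ins (not real pipeline tools):
#   produce - writes data to stdout
#   consume - insists on a file path argument
produce() { seq 1 5; }
consume() { awk '{ s += $1 } END { print s }' "$1"; }

# <(produce) expands to a path (e.g. /dev/fd/63) backed by a pipe, so no
# mkfifo call or cleanup step is needed.
total=$(consume <(produce))

echo "$total"   # prints 15
```

Compared with an explicit FIFO, Bash creates and tears down the pipe itself, at the cost of a less readable command line.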